Here's a VB programmer's introduction to developing Internet (TCP/IP) applications. We'll build a trivial web browser to start with. Then we'll look at some of the issues you should consider when building client/server applications. We'll even build a simple one.
By | James Vincent Carnicelli |
Level | Beginner |
User Rating | 5.0 (55 globes from 11 users) |
Compatibility | VB 3.0, VB 4.0 (16-bit), VB 4.0 (32-bit), VB 5.0, VB 6.0, VB Script, ASP (Active Server Pages) |
Category | Internet/ HTML |
World | Visual Basic |
Preface
In less than a decade, TCP/IP - the Internet - has emerged from the cacophony of networking protocols as the
undisputed winner. So many information protocols, from HTTP (web) to IRC (chat), have been developed to offer all
manner of electronic content. With TCP/IP dominance secured, many companies with in-house IT staffs are moving
towards developing their own client/server applications using home-grown or off-the-shelf
Internet protocols. This article can help you leap on board this roaring technology train.
Most Internet programmers developing for Windows use some form or another of the Winsock API. You may already be aware of this API's infamy as a difficult one to master. As a VB programmer, you may also know that VB ships with a Winsock control that wraps the deeply confusing Winsock API in a slightly less confusing package. But it's still confusing to most new programmers, and it's known for being buggy. It also doesn't help that all the functionality for developing clients and servers is lumped into one control, which leaves many programmers with little clue about how and when to use its features.
I recently developed a suite of controls called "Sockets" to build on the virtues of the Winsock control while masking most of its inadequacies. It's easier to use and offers sophisticated features like multi-connection management and message broadcasting. The code samples in this article will be built around the Sockets package.
Note: You can download the Sockets package from Planet Source Code. Search here for the posting's title: "Simple, clean client/server socket controls". Be sure to include the "Sockets" component ("Sockets.OCX") in any projects you create to try out the code samples. You can register the control so it appears in VB's component list from the Start | Run menu item using "regsvr32 <path_to_ocx>\sockets.ocx".
If you're already familiar with client/server and sockets concepts, you can skip right to the Sockets Package section for information specific to the controls used and how to use them.
Client / Server Concepts
Before we begin talking about Internet programming, let's give a brief introduction to the client/server
concept.
The "client/server" concept is a fundamentally simple one. Some automated entity - a program, component, machine, or whatever - is available to process information on behalf of other remote entities. The former is called a "server", the latter a "client". The most popular client/server application today is the World Wide Web. In this case, the servers are all those web servers companies like Yahoo and Microsoft run to serve up web pages. The clients are the web browsers we use to get at their web sites.
There are a number of other terms commonly used in discussing the client/server concept. A "connection" is a completed "pipeline" through which information can flow between a single client and a single server. The client is always the connection requestor and the server is always the one listening for and accepting (or rejecting) such requests. A "session" is a continuous stream of processing between a client and server. Its duration is not necessarily the same as the duration of one connection, nor does a session necessarily involve only one simultaneous connection. "Client interconnection" is what a server does to facilitate information exchange among multiple clients. A chat program is a good example. A "message", in this context, is any single piece of information that's sent one way or the other through a connection. Messages are typically single command requests or server responses. In most cases, a message can't be used until all of it is received. A "remote procedure" is simply a procedure that a client asks a server to execute on its behalf, which usually involves one command message going to the server and one response message coming back from it. Using an FTP client to rename a file on a server is an example. An "event" is the converse of a remote procedure call: the server sends this kind of message to the client, which may or may not respond to it.
As programmers, we generally take for granted that a given function call does not return until it is done executing. Why would we want it to, otherwise? Having the code that calls a function wait until it is done is called "synchronous". The alternative - allowing the calling code to continue on even before the function called is done - is called "asynchronous". Different client/server systems employ each of these kinds of procedure calling modes. Usually, an asynchronous client/server system will involve attaching unique, random numbers to each message and having a response to a given message include that same number, which can be used to differentiate among messages that may arrive out of their expected order. The main benefit to this sort of scheme is that processing can continue on both sides without delays. Such systems are usually a bit complicated to create and make the most of.
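The numbering scheme described above can be sketched in a few lines. This is only an illustration, written in Python rather than VB so it runs without any controls installed. The `PendingCalls` class and its method names are hypothetical, and it uses a simple counter instead of random numbers, since all that matters is that each outstanding request has a unique tag.

```python
import itertools

class PendingCalls:
    """Tracks outstanding requests so replies can arrive in any order."""

    def __init__(self):
        self._next_id = itertools.count(1)   # unique tag per request
        self._callbacks = {}                 # request id -> reply handler

    def send(self, command, on_reply):
        # Tag the outgoing command and remember who wants the answer.
        request_id = next(self._next_id)
        self._callbacks[request_id] = on_reply
        return f"{request_id} {command}"

    def receive(self, line):
        # The reply carries the same tag, so we can find its handler.
        request_id, _, payload = line.partition(" ")
        callback = self._callbacks.pop(int(request_id))
        callback(payload)

calls = PendingCalls()
results = []
first = calls.send("GetValue color", lambda reply: results.append(("color", reply)))
second = calls.send("GetValue size", lambda reply: results.append(("size", reply)))

# The replies come back in the opposite order from the requests.
calls.receive("2 large")
calls.receive("1 blue")
print(results)
```

Neither call blocks while waiting. Each reply is matched to its request by the tag alone, which is why the order of arrival doesn't matter.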
There are plenty of other concepts related to the client/server concept, but this should suffice for starters.
Introduction to Internet Programming
As you might already have guessed, programming for the Internet is quintessentially client/server programming.
Your program can't connect to any other program using the Internet without that other program being an active
server. The feature that distinguishes Internet client/server systems from others is TCP/IP, which stands for
Transmission Control Protocol / Internet Protocol. TCP/IP was developed as a generic communication protocol that
transcends the particular, lower-level network systems it rests on top of, like Ethernet LANs, phone lines, digital
cellular systems, and so on.
The Internet protocol - the IP in TCP/IP - is a packet-switching protocol in which messages sent through
connections are chopped up into "packets" - low-level messages our programs generally never need to directly see -
and sent across any number of physical connections to the other side of the Internet connection. These are
reassembled at the receiving end. Those packets may not arrive in the order they were sent, though, and some may
never arrive at all. Internet phone and streaming video systems are fine with this sort of unreliable delivery,
since it's fast. Those programs use "UDP" (User Datagram Protocol). For this article, we'll be dealing with TCP,
in which these packets are properly assembled back into the original data stream at the receiving end, with a
guarantee that if the packets can get there, they will.
Internet programming is also often called "sockets programming", owing to the Berkeley sockets API, one of the first of its kind. Because programmers of sockets applications on Windows use the "Winsock" API, it's also often called "Winsock programming". Winsock is simply an adaptation of the Berkeley sockets API for Windows.
Most Internet client/server systems use sockets to interface with TCP/IP. A socket is an abstract representation for a program of one end of an Internet connection. There are three basic kinds of sockets: client, server, and listener. A server application will have a listener socket do nothing but wait for incoming connection requests. That application will decide, when one arrives, whether or not to accept this request. If it accepts it, it will actually bind that connection to a server socket. Most servers have many server sockets that can be allocated; at least one for each active connection. The client application only needs a client socket. Either side can disconnect, which simply breaks the connection on both sides.
Once a connection is established, each side can send bytes of data to the other. That data will always arrive at the other side in the same order it was sent. Both sides can be sending data at the same time, too. This is called a "data stream". All data that gets sent between a client and server passes through this stream.
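To make the three socket roles concrete, here is a minimal sketch using Python's standard `socket` module rather than the Sockets controls (whose calls are not shown here). It runs entirely on the local machine: a listener socket waits on a free loopback port, accepting a request yields a per-connection server socket, and a client socket connects to it. The `run_server` helper and the upper-casing reply are purely illustrative.

```python
import socket
import threading

def run_server(listener):
    # accept() blocks until a client connects, then returns a new socket
    # dedicated to that connection: the "server socket".
    server_sock, _address = listener.accept()
    with server_sock:
        chunks = []
        while True:
            chunk = server_sock.recv(1024)
            if not chunk:          # client has finished sending
                break
            chunks.append(chunk)
        server_sock.sendall(b"".join(chunks).upper())

# Listener socket: does nothing but wait for connection requests.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))    # port 0 = let the OS pick a free port
listener.listen()
port = listener.getsockname()[1]

worker = threading.Thread(target=run_server, args=(listener,))
worker.start()

# Client socket: requests the connection, then uses the data stream.
with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
    client.sendall(b"hello")
    client.shutdown(socket.SHUT_WR)   # signal that we're done sending
    reply = b""
    while True:
        chunk = client.recv(1024)
        if not chunk:                 # server closed its side
            break
        reply += chunk

worker.join()
listener.close()
print(reply)
```

Both sides read in a loop because a stream carries bytes, not messages. A single receive call may return only part of what the other side sent.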
Everything else that applies to the client/server concept applies here as well, so we'll dispense with the details and get right into Internet programming with the Sockets controls.
The Sockets Package
The Sockets package, which you can download via the link in the preface, is a collection
of controls that simplify interfacing with the Winsock API and hence the Internet. There are controls for each of
the three types of sockets: client, server, and listener. There is also a control that combines one listener
socket and a bank of server sockets. This control hides the gory details of socket management that most servers
otherwise have to do themselves. A server that uses this control won't need to directly deal with the listener or
server sockets.
We won't get deeply into the details of the Sockets package here. Let me encourage you to refer to "help.html", the help file that came with the Sockets package you downloaded.
Build a Basic Web Browser
The HTTP protocol that drives the World Wide Web is surely the most used TCP/IP application. It's wonderful
that it should also be one of the easiest to master. We'll do this by building a simple web browser. It won't have
all the advanced features like WYSIWYG, scripting, and so on, but it will demonstrate the basic secrets behind HTTP.
Before we get started, you'll need to make sure you have access to the web without the use of a proxy to get through a firewall. If you're inside a corporate intranet, you may at least have access to your own company's web servers. If you're not sure about all this or can't run the program we'll be building, consult your network administrator.
Now, let's start by creating our project and building a form. Our project needs to include the "Sockets" component, which is the "Sockets.ocx" file that came with the Sockets package we downloaded. The form should look a little something like this:
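[Form layout: "Form1", with text boxes "Host" and "Path", a "Go" button, a "Contents" text box for the page text, and a ClientSocket control "CS"]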
"CS" is a ClientSocket control. Be sure to give the button labeled "Go" the name "Go". Now enter the following code in the form module:
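Private Sub Go_Click()
    'Clear the display and connect to the web server on the standard HTTP port
    Contents.Text = ""
    CS.Connect Host.Text, 80
    'Request the page; the blank line ends the request
    CS.Send "GET " & Path.Text & vbCrLf & vbCrLf
    'Append whatever arrives for as long as the connection stays open
    While CS.Connected
        If CS.BytesReceived > 0 Then
            Contents.SelText = CS.Receive
        End If
        DoEvents
    Wend
End Sub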
Hard to believe it could be that easy, but it is. Try running this with Host = "www.planet-source-code.com" and Path = "/vb/". Not surprisingly, this won't look as nice as it does in, say, Internet Explorer, but that's because we're only retrieving what the server has to offer. We're not actually reading what comes back to decide what to make of it. That's much harder. But the network interaction part is at the heart of what your Internet programming effort will most often be about. This code could form the basis of a program to grab information from one of your business partners' web sites to populate your own database: perhaps the latest pricing and availability figures; or perhaps to get a car's blue book value from a search engine.
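For comparison, the same exchange can be sketched with Python's standard `socket` module. This is an illustration under stated assumptions, not a translation of the Sockets control's behavior. The `http_get` and `serve_one_page` names are hypothetical. The request adds an "HTTP/1.0" version and a "Host" header, which the VB sample above does not send. So that it runs without Internet access, the example fetches from a tiny stand-in server on the local machine.

```python
import socket
import threading

def http_get(host, port, path):
    """Send a minimal GET request and return the raw response bytes."""
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:            # server closed: response is complete
                break
            chunks.append(chunk)
    return b"".join(chunks)

def serve_one_page(listener):
    # Stand-in web server: answers a single request with a fixed page.
    conn, _address = listener.accept()
    with conn:
        received = b""
        while b"\r\n\r\n" not in received:   # blank line ends the request
            chunk = conn.recv(1024)
            if not chunk:
                break
            received += chunk
        conn.sendall(b"HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n"
                     b"<html>hi</html>")

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen()
port = listener.getsockname()[1]
worker = threading.Thread(target=serve_one_page, args=(listener,))
worker.start()

response = http_get("127.0.0.1", port, "/vb/")
worker.join()
listener.close()
print(response.decode("ascii"))
```

Pointing `http_get` at a real host on port 80 follows the same pattern. As in the VB version, what comes back is the raw response text (here including the status line and headers), not a rendered page.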
Since this article isn't fundamentally about web browsers, we'll skip these sorts of details. Instead, we'll now build a custom client / server application from scratch.
Build a Complete Client / Server App
The Nature of the Beast
We've talked about the client / server concept and we've built a
web browser to demonstrate a client. Let's now invent an Internet protocol of our own and build client and
server programs to implement it.
Our application's purpose will be simple: to allow a number of different computers to share some data variables in a way that lets all of them not only read and write those variables, but also be aware of any changes other computers make to that data as they happen.
What sort of information protocol do we need to make this happen? Obviously, we'll want the clients interested to be able to connect to a server that maintains the data. We'll keep it simple by not allowing any client to be disconnected during a session. We'll want to require clients to log in at the beginning of the session. The clients will need to be able to send commands to the server ("remote procedures") and get a response for each command invocation. We'll allow communication to be asynchronous, meaning the client won't have to wait for a response to a given command before continuing. We'll also need to have the server be able to trigger events the client can make use of. Here are the messages our clients and server will need to be able to exchange:
- LogIn <user> <password>
- LogInResult <true_or_false>
- GetValue <name>
- GetAllValues
- SetValue <name> <value>
- ValueEquals <name> <value>
- ValueChanged <by_user> <name> <value>
How will we represent a message? A message will begin with a message name (e.g., "GetValue") and will have zero or more parameters. Each message will be followed by <CR><LF>, the standard way Windows programs represent a new line. We'll put a space after the message name and between each parameter. Because we've given special meaning to the new-line character combination and the space character, we can't use them anywhere within the message names or the parameters. What if a parameter contains one of these special character combinations? Our protocol will include "metacharacters", or special combinations of characters that are meant to represent other character combinations. Here are the characters and what we'll be replacing them with:
Backslash (\) => "\b" ("b" for "backslash")
Space => "\s" ("s" for "space")
Carriage return (<CR>) => "\r" ("r" for "carriage return")
Line feed (<LF>) => "\l" ("l" for "line feed")
Note that we're even replacing the backslash (\) character with a metacharacter because we're also giving special meaning to backslash as the start of a metacharacter representation.
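A minimal sketch of this encoding, in Python rather than VB so it can be run standalone. The function names `encode_part`, `decode_part`, and `format_message` are hypothetical and are not the routines in the article's shared module. It walks the text one character at a time, which sidesteps any question about the order in which replacements are applied.

```python
# Character -> metacharacter, exactly as in the table above.
ESCAPES = {"\\": "\\b", " ": "\\s", "\r": "\\r", "\n": "\\l"}
# Letter following a backslash -> original character.
UNESCAPES = {"b": "\\", "s": " ", "r": "\r", "l": "\n"}

def encode_part(text):
    """Replace each unsafe character with its metacharacter."""
    return "".join(ESCAPES.get(ch, ch) for ch in text)

def decode_part(text):
    """Reverse encode_part by reading the letter after each backslash."""
    out = []
    chars = iter(text)
    for ch in chars:
        if ch == "\\":
            code = next(chars, "")
            # Unknown codes are passed through unchanged (a choice of this sketch).
            out.append(UNESCAPES.get(code, "\\" + code))
        else:
            out.append(ch)
    return "".join(out)

def format_message(name, *params):
    """Join the encoded name and parameters with spaces and end with CR LF."""
    return " ".join(encode_part(p) for p in (name, *params)) + "\r\n"

wire = format_message("SetValue", "path", "C:\\My Files")
print(repr(wire))
print(decode_part("C:\\bMy\\sFiles"))
```

If the encoding is done with successive whole-string replacements instead, the backslash has to be replaced first when encoding and last when decoding. Otherwise the backslashes introduced by the other metacharacters get escaped a second time, or a literal backslash followed by "s" gets mistaken for a space.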
The Code
Let's create the project. As before, the project needs to include the "Sockets" component, which is the
"Sockets.ocx" file that came with the Sockets package we downloaded. Create two forms, called "Server" and
"Client". They should look like the following:
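[Form layout: "Server", containing the ServerSocketBank control "SSB"]
[Form layout: "Client", containing the ClientSocket control "CS" and the buttons "Set", "StartServer", and "AnotherClient"]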
"CS" is a ClientSocket control. "SSB" is a ServerSocketBank control. We'll give the button labeled "Set" the name "SetVar". We'll call the other two buttons on the client "StartServer" and "AnotherClient". Here's the code for the server:
Private VariableNames As Collection
Private Variables As Collection
The core of this code is the ProcessMessage subroutine. The message that's passed to it will be an array of strings representing the message name and its parameters. This array is generated by the ParseMessage routine, which we'll get to momentarily.
Now here's the code for the client form's module:
Private VariableNames As Collection
Private Variables As Collection
Private Buffer As String
Private User As String
As with the server, the core of the client's operation is the ProcessMessage subroutine. Since both the client and server use many of the same mechanisms, we'll be putting them into a shared library module we'll call "Shared" (".bas"):
'The port the server listens for connections on
Public Const STANDARD_PORT = 300
Be sure to make "Sub Main" the start-up object in the project's properties.
Process Flow
Now let's analyze what's going on here. First, since the server has to handle multiple sessions, it needs to
maintain session data for each session. This happens as soon as the connection is established in the
SSB_Connected() event handler. The ServerSocket object passed in, called "Socket", has its
ExtraTag value set to a new Collection object, which we'll use to hold session data for this
connection/session. We add three values to it: "LoggedIn", "User", and "Buffer". "LoggedIn" is a boolean value
indicating whether or not the client has properly logged in. We don't want the client to do anything else until
that happens. "User" is the ID of the user that logged in. "Buffer" is where we'll temporarily store all data
received from the client until we detect and parse out a complete message for processing.
The ParseMessage() function in the shared module is called whenever data are received. This routine looks for the first occurrence of <CR><LF>, indicating the end of a complete message. If it finds it, it grabs everything before this new-line, splits it up by space characters, and puts the parts into the Message array. Naturally, it shortens the buffer to discard this message from it. ParseMessage() returns true only if it does detect and parse one complete message. There could be more, but this function only cares about the first one it finds.
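The buffering behavior described above can be sketched as follows. This is a Python illustration, not the article's VB routine. The `parse_message` function is hypothetical and returns the shortened buffer instead of modifying it in place. It also leaves metacharacter decoding out, to keep the focus on finding message boundaries.

```python
def parse_message(buffer):
    """Return (parts, remaining_buffer); parts is None if no full message yet."""
    line, separator, rest = buffer.partition("\r\n")
    if not separator:
        return None, buffer          # no CR LF yet: keep waiting for more data
    return line.split(" "), rest     # first message split on spaces, rest kept

buffer = ""
messages = []
# Data can arrive in arbitrary pieces that ignore message boundaries.
for chunk in ["SetVa", "lue color blue\r\nGetAll", "Values\r\n"]:
    buffer += chunk
    while True:                      # one chunk may complete several messages
        parts, buffer = parse_message(buffer)
        if parts is None:
            break
        messages.append(parts)

print(messages)
```

Because the function only handles the first message it finds, the caller loops until it reports that nothing complete is left in the buffer.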
Once a message is found, ProcessMessage is called, with the array containing the parsed message passed in. This routine will immediately exit if the client has not yet logged in, unless this message is actually the "LogIn" command. Otherwise, the "Select Case Message(0)" block directs control to whatever block of code is associated with Message(0), the message name.
Of course, the server needs to send messages to the client, too. It does this using the SendMessage() subroutine in the shared library, which takes the message parts and encodes them into our message format, being sure to translate "unsafe" characters like spaces into their metacharacter counterparts. It then sends this formatted message to the indicated socket control.
This is really all the server does. Of particular note, however, is what happens when a client sends the "SetValue" command message. Not only does the server update its list of variables; it also uses the .BroadCast() method of the ServerSocketBank control to send all the clients a message indicating that the value has changed.
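The server-side flow (per-session data, the log-in gate, dispatch on the message name, and the broadcast on "SetValue") can be sketched without any networking. This is a Python illustration under stated assumptions: the `Session` and `Server` classes are hypothetical, an `outbox` list stands in for each client's socket, and the password table is invented for the example.

```python
class Session:
    """Per-connection data, like the Collection stored in ExtraTag."""
    def __init__(self):
        self.logged_in = False
        self.user = ""
        self.outbox = []             # stands in for the client's socket

class Server:
    def __init__(self, passwords):
        self.passwords = passwords   # hypothetical user -> password table
        self.variables = {}
        self.sessions = []

    def connect(self):
        session = Session()
        self.sessions.append(session)
        return session

    def broadcast(self, message):
        for session in self.sessions:
            session.outbox.append(message)

    def process_message(self, session, message):
        name, params = message[0], message[1:]
        if name == "LogIn":
            user, password = params
            session.logged_in = self.passwords.get(user) == password
            session.user = user if session.logged_in else ""
            session.outbox.append(["LogInResult", str(session.logged_in)])
        elif not session.logged_in:
            return                   # ignore everything until logged in
        elif name == "GetValue":
            value = self.variables.get(params[0], "")
            session.outbox.append(["ValueEquals", params[0], value])
        elif name == "SetValue":
            self.variables[params[0]] = params[1]
            self.broadcast(["ValueChanged", session.user, params[0], params[1]])

server = Server({"alice": "secret"})
alice, bob = server.connect(), server.connect()
server.process_message(bob, ["SetValue", "color", "red"])    # not logged in
server.process_message(alice, ["LogIn", "alice", "secret"])
server.process_message(alice, ["SetValue", "color", "blue"])
print(alice.outbox)
print(bob.outbox)
```

Bob's early "SetValue" is dropped because he never logged in, yet he still receives the "ValueChanged" broadcast, since this sketch sends it to every connected session.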
Now on to the client. The client form uses the same basic methodology, including the use of ParseMessage(), SendMessage(), and ProcessMessage() (which is different for the client, of course, since it has to deal with different messages).
Where the client really differs from the server is in its initialization sequence. Upon loading, the client immediately tries to connect to the server (with the user providing details of where to find the server and whom to log in as). As soon as it's connected, it sends the "LogIn" message with the provided user information.
When the user clicks on the "Set" button, the client sends a "SetValue" message with the variable's name and value. As was mentioned before, the server responds by broadcasting to all the connected clients the new value and identifying which user changed it.
How Can We Use This?
Taking a step back, it seems rather silly to imagine that anyone would want to actually use our client / server
application the way it is. But it does demonstrate a powerful concept rarely employed in even the most modern
business applications: real-time refresh. What if, for example, a typical data entry form connected to a database
were automatically updated when another user changed some part of the data this user is looking at? This paradigm
is also used in all online chat systems. It can be used for shared blackboards or spreadsheets.
The particularly neat thing about this approach to real-time refreshing is that the client is not expected to occasionally poll the server for the latest stuff - which may be a total refresh of the relevant screen or data. The server actively sends updated data to all the clients as information changes.
If we wanted to be able to pass binary data, like files or images, we could make the ParseMessage() routine a little more sophisticated by buffering bytes instead of string data (using the Sockets controls' .ReceiveBinary() methods). The ProcessMessage routine could then turn the message name into text and the individual message handlers could decide which parameters to translate into plain text and which to use in binary form. (Be aware, though, that the buffers used by the Sockets controls can only store as much as any VB byte array - about 32KB. You may need to send multiple messages to transmit a large chunk of binary data.)
Conclusion
Programming Internet applications opens up a whole new vista of opportunities. This is especially true as
organizations are realizing that they no longer have to commit their budgets to single-platform solutions.
Integrating disparate systems using TCP/IP as a common communication protocol gives unprecedented flexibility. The
Sockets package provides an excellent way to quickly and painlessly build both client and server systems. These can
be the glue that binds together lots of existing systems both inside and outside a corporate intranet. Or they can
be used to develop complete end products from web browsers to database engines.
The use of the Internet protocols will only grow in the coming years. It's not too late to jump on board. And the simple truth is that there is no single Internet protocol - not HTTP, not MessageQ, nor any other - that yet answers the needs of all applications. That's why people keep developing new ones. Starting at the essential foundation - TCP/IP itself - ensures one the greatest flexibility of choices and can even help free one from the dangers of proprietary standards that can lock one in to a single vendor and platform, like Microsoft's DCOM.
Internet programming is power. The Sockets package makes it easy.