http://www-106.ibm.com/developerworks/linux/library/l-rt6/?t=gr,Redhat=Sockets
High-performance programming techniques on Linux and Windows
Dr. Edward G. Bradford (egb@us.ibm.com)
Senior programmer, IBM
01 Nov 2001
In last month's column, Ed covered synchronization primitives and gave a reprise on pipes. This month he takes a first look at communication using sockets. Ed demonstrates some techniques for writing a sockets program and shows how his programming techniques perform in various operating system environments. Share your thoughts on this article with the author and other readers in the discussion forum by clicking Discuss at the top or bottom of the article.
This month my focus is on data communication through sockets. Sockets are a programming interface developed by contributors at the University of California at Berkeley in the 1980s. Sockets represent a complete mechanism for establishing network connections between two end points potentially on different computers. The end points are created and bound to each other using the sockets APIs.
Linux and Windows sockets interoperate seamlessly, and the programming challenges in writing a program that compiles on both systems are not huge.
Sockets come in a number of flavors:
- Stream
- Datagram
- Raw
- Sequenced packet
- Reliably delivered message
For transferring large amounts of data, a virtual circuit is the best choice, and a stream socket is a virtual circuit. This installment looks at stream-type sockets.
Simplistically, a client creates a socket and tries to connect to a known end point. A server creates a socket, binds the socket to an endpoint name (gives it a name), and then awaits a connection. When the client connects and the server receives the connection, data communication begins at each end's discretion. Sockets support bi-directional data transfer.
Windows and Linux sockets
Windows and Linux both support Berkeley-style sockets. Windows also supports "Windows Sockets". Both versions of sockets on Windows require the following initialization code:
|
Here, 2 is a version. Using any non-zero number as the first argument works. (The first argument is an unsigned short.)
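As a minimal sketch of this kind of initialization (not the original listing), the start-up call can be hidden behind a small helper. The name net_start is hypothetical; WSAStartup() and WSADATA are the real Winsock names, and the non-Windows branch assumes no start-up call is needed.

```c
#include <assert.h>
#ifdef _WIN32
#include <winsock2.h>
#endif

/* Hypothetical helper: start the sockets layer. Returns 0 on success. */
int net_start(unsigned short version)
{
    assert(version != 0);      /* the text reports that any non-zero value works */
#ifdef _WIN32
    {
        WSADATA wsa;           /* filled in by WSAStartup with implementation details */
        return WSAStartup(version, &wsa);
    }
#else
    return 0;                  /* Berkeley sockets on Linux need no start-up call */
#endif
}
```

Calling net_start(2) once, before any other socket call, mirrors the version value discussed above.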
The necessity for the Windows sockets interface versus the vanilla Berkeley sockets interface is not clear to me. Windows sockets seems to support other transport protocols. However, with the uniform acceptance of the Internet and its TCP/IP protocols, I don't understand the value of the added complexity of Windows Sockets.
With one exception, I have written the program in this article using the Berkeley-style interfaces. If there are fundamental reasons why the WSA interfaces should be preferred over the Berkeley-style interfaces, I am unaware of them.
Socket creation
Sockets are created with the socket() API, which is supported on both Linux and Windows:
|
af is the address family, and I use AF_INET. type is either SOCK_STREAM or SOCK_DGRAM, and here I focus only on SOCK_STREAM. protocol is a number selected from the /etc/protocols file on Linux and the /winnt/system32/drivers/etc/protocol file on Windows. I'm sticking with 0, which both files list as the IP protocol and which lets the system pick the default protocol for the socket type (TCP for a SOCK_STREAM socket in AF_INET).
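The three arguments can be seen in a minimal POSIX sketch. The helper names make_stream_socket and socket_type are hypothetical; socket() and getsockopt() with SO_TYPE are standard calls. The second helper only confirms that the kernel created a stream socket.

```c
#include <assert.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Hypothetical helper: af = AF_INET, type = SOCK_STREAM, protocol = 0. */
int make_stream_socket(void)
{
    return socket(AF_INET, SOCK_STREAM, 0);   /* -1 on failure */
}

/* Hypothetical helper: ask the kernel what type of socket s is. */
int socket_type(int s)
{
    int type = 0;
    socklen_t len = sizeof type;

    assert(s >= 0);
    if (getsockopt(s, SOL_SOCKET, SO_TYPE, &type, &len) < 0)
        return -1;
    return type;                              /* SOCK_STREAM for the socket above */
}
```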
Windows also has the Microsoft proprietary interface called Windows Sockets with an API described as:
Listing 3. WSASocket interface
|
(For Linux and other non-Windows programmers, a Microsoft DWORD is a 32-bit unsigned integer, declared as an unsigned long on Windows.) The first three parameters are identical to the standard socket() interface. The three new parameters are interesting and provide fertile ground for testing. The first extra parameter is the lpProtocolInfo pointer. The structure referenced by lpProtocolInfo is:
|
The GROUP g parameter is reserved. However, there is no restriction on its values. The two parameters lpProtocolInfo and dwFlags introduce a significant amount of complexity to programming. I did a back-of-the-envelope calculation of the number of test programs that would be required to fully test the WSASocket() API. For instance, the WSAPROTOCOL_INFO structure shows the following conservative estimate of the number of legal values.
Note: The (x n) values below show the possible legal values for each parameter. They are conservative in the sense that I have only attributed a single possible legal value to the szProtocol string, when in fact it can be any string of up to 255 characters. None of my assumptions appears to be in conflict with the documentation. "???" means I didn't understand the documentation on the parameter or what the purpose of the parameter was. I used the February and June 2001 Platform SDK.
dwServiceFlags1 | Bit fields. 19 bits have been defined. 2^19 = 512K legal values. (x 524,288) |
dwServiceFlags2 | Reserved (x 1) |
dwServiceFlags3 | Reserved (x 1) |
dwServiceFlags4 | Reserved (x 1) |
ProviderId | A GUID that disambiguates between multiple providers providing the same protocol. (x 1) |
dwCatalogEntryId | Unique identifier assigned by the WS2_32.DLL for each WSAPROTOCOL_INFO structure. (x 1) |
ProtocolChain | A structure of 7 entries. The structure represents a protocol chain consisting of one or more protocols on top of a base protocol. (x 1) |
iVersion | Protocol Version Identifier. (x 1) |
iAddressFamily | Address family. Probably the same as in the WSASocket interface. (x 1) |
iMaxSockAddr | "Maximum Address Length" ??? (x 1) |
iMinSockAddr | "Minimum Address Length" ??? (x 1) |
iSocketType | socket type. 2 values but parameter already accounted for in socket() API. (x 1) |
iProtocol | same as in socket() API. We'll only consider one (x 1) |
iProtocolMaxOffset | Windows-specific ??? (x 1) |
iNetworkByteOrder | BIGENDIAN or LITTLEENDIAN (x 2) |
iSecurityScheme | Only one is defined. (x 1) |
dwMessageSize | Maximum message size. Three special values are defined plus anything the actual protocol supports. (x 3) |
dwProviderReserved | Reserved. |
szProtocol[WSAPROTOCOL_LEN+1] | Possibly a Unicode array of characters identifying the protocol. (x 1) |
The dwFlags parameter has 5 bit fields defined:
WSA_FLAG_OVERLAPPED
WSA_FLAG_MULTIPOINT_C_ROOT
WSA_FLAG_MULTIPOINT_C_LEAF
WSA_FLAG_MULTIPOINT_D_ROOT
WSA_FLAG_MULTIPOINT_D_LEAF
dwFlags options: 2^5 = 32 (x 32)
The complexity issue is a real one. Programs are written using existing documentation, and programmers expect the documentation to be correct. If a simple API like the socket() API is complicated with additional parameterization and there is little likelihood that the additional parameter space is fully tested, a compelling reason should exist before using the larger parameter space. Programmers are well advised to stick to the main roads. Microsoft and Linux won't be making mistakes in simple socket open/connect/send-recv/close situations. However, the darker corners of a complex API are much less likely to be fully tested. Enormous amounts of time can be wasted trying to get obscure features of an API to perform as documented. In almost all cases, a little more work on the part of the programmer would allow him or her to avoid the untested paths of complex APIs.
With this thought in mind, I computed the size of the additional parameter space of the WSASocket() API. The complexity discovered here is in addition to the still-present parameterizations of the Berkeley sockets interface. From the above possibilities, there appear to be
32 * 3 * 2 * 524,288 = 100,663,296
possible combinations of calls to present to the Windows operating system. This number does not include any parameterization on the first three parameters of the WSASocket() API call. It is hard for me to understand how even a substantial subset of these parameterizations can be tested or verified. I confess that I don't understand the use or even the meaning of some of these parameters. For the purpose of this column, unless a reader can show how a program might be improved by the use of the more obscure interfaces of WSA sockets, I will avoid them.
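The product can be checked with a few lines of C. The function names are hypothetical; the four factors are the (x n) estimates from the text for dwFlags, dwMessageSize, iNetworkByteOrder, and dwServiceFlags1, with every (x 1) entry left out because it does not change the result.

```c
#include <assert.h>
#include <stddef.h>

/* Multiply the per-parameter counts of legal values. */
unsigned long count_combinations(const unsigned long *counts, size_t n)
{
    unsigned long total = 1;
    size_t i;

    assert(counts != NULL);
    for (i = 0; i < n; i++)
        total *= counts[i];
    return total;
}

/* Factors taken from the estimates in the text. */
unsigned long wsasocket_extra_space(void)
{
    static const unsigned long counts[] = {
        32UL,      /* dwFlags: 2^5 */
        3UL,       /* dwMessageSize */
        2UL,       /* iNetworkByteOrder */
        524288UL   /* dwServiceFlags1: 2^19 */
    };
    return count_combinations(counts, sizeof counts / sizeof counts[0]);
}
```

wsasocket_extra_space() returns 100663296, matching the figure above.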
Dodging the admonishment that "He who anticipates disaster, suffers it twice", I used the WSASocket() API, but with a NULL lpProtocolInfo pointer (that is, no WSAPROTOCOL_INFO structure) and with only the WSA_FLAG_OVERLAPPED bit set. WSA_FLAG_OVERLAPPED is documented to be meaningful only when the parameterizations of WSASend, WSASendTo, etc. reflect an overlapped I/O request. I don't use them either. Thus, the parameterizations I use with Windows Sockets (the WSA interfaces) are identical to the ones I use with the standard Berkeley-style interface. Those of you who have a better understanding of the WSA interfaces and overlapped I/O issues might want to see if you can improve the performance numbers presented here.
Here is the code that creates a socket for both Windows and Linux:
Listing 5. WSASocket interface
|
The defines for BADSOCK are included in the code snippet for readability. In the actual program they are bunched in with other platform-specific defines.
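A portable creation routine along these lines might look like the following sketch (not the original listing). The type name sock_t and the helper names are hypothetical, and the values given to BADSOCK (INVALID_SOCKET on Windows, -1 elsewhere) are assumptions about what the program's defines contain. The Windows branch passes a NULL lpProtocolInfo, a group of 0, and only WSA_FLAG_OVERLAPPED, as described above.

```c
#include <assert.h>
#ifdef _WIN32
#include <winsock2.h>
typedef SOCKET sock_t;
#define BADSOCK INVALID_SOCKET      /* assumed Windows value */
#else
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>
typedef int sock_t;
#define BADSOCK (-1)                /* assumed Linux value */
#endif

/* Hypothetical helper: create a TCP stream socket, or return BADSOCK. */
sock_t open_stream_socket(void)
{
#ifdef _WIN32
    return WSASocket(AF_INET, SOCK_STREAM, 0, NULL, 0, WSA_FLAG_OVERLAPPED);
#else
    return socket(AF_INET, SOCK_STREAM, 0);
#endif
}

/* Hypothetical helper: release a socket obtained above. */
void close_stream_socket(sock_t s)
{
    assert(s != BADSOCK);
#ifdef _WIN32
    closesocket(s);
#else
    close(s);
#endif
}
```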
Connecting and accepting connections
Once we get past the preliminaries, socket programming on Windows and Linux is quite similar. Here is the client code that performs the connection to a listener:
|
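A client connection of this kind can be sketched as follows. This is an illustration under stated assumptions, not the original listing: the helper name connect_to and the use of inet_pton() are hypothetical choices, and only the POSIX headers are shown on the assumption that platform-specific includes live with the other platform defines.

```c
#include <assert.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: connect a stream socket to ip:port.
   Returns the connected descriptor, or -1 on failure. */
int connect_to(const char *ip, unsigned short port)
{
    struct sockaddr_in addr;
    int s;

    assert(ip != NULL);
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);                 /* network byte order */
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -1;                               /* not a dotted-quad address */

    s = socket(AF_INET, SOCK_STREAM, 0);
    if (s < 0)
        return -1;
    if (connect(s, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(s);                                /* no listener, or refused */
        return -1;
    }
    return s;
}
```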
There are no conditional definitions needed. The parent creates a socket and awaits a connection with this code.
Listing 7. Accepting a connection
|
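The listening side follows the create, bind, listen, accept sequence. The sketch below is an illustration, not the original listing: the helper names are hypothetical, and binding to 127.0.0.1 with a kernel-chosen port (port 0) is an assumption made so the example runs on one machine without picking a fixed port.

```c
#include <assert.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: listen on 127.0.0.1. If *port is 0 the kernel
   picks a free port; the port actually bound is written back to *port. */
int listen_on_loopback(unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int s;

    assert(port != NULL);
    s = socket(AF_INET, SOCK_STREAM, 0);
    if (s < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(*port);
    if (bind(s, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(s, 1) < 0 ||
        getsockname(s, (struct sockaddr *)&addr, &len) < 0) {
        close(s);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return s;
}

/* Block until one client connects; returns the connected descriptor. */
int accept_one(int listener)
{
    return accept(listener, NULL, NULL);   /* peer address not needed here */
}
```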
Once again, no conditional definitions. Finally, the transmission and reception of data.
Sending and receiving data
Code to send data is:
|
and
Listing 9. Socket recv() operation
|
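Because send() and recv() on a stream socket may transfer fewer bytes than requested, both are usually wrapped in loops. The following is a minimal sketch of such loops, not the original listings; the names send_all and recv_all are hypothetical, and error handling is reduced to a return code.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Send exactly len bytes, looping over short writes.
   Returns 0 on success, -1 on error. */
int send_all(int s, const char *buf, size_t len)
{
    size_t done = 0;

    assert(buf != NULL);
    while (done < len) {
        ssize_t n = send(s, buf + done, len - done, 0);
        if (n <= 0)
            return -1;
        done += (size_t)n;
    }
    return 0;
}

/* Receive up to len bytes, stopping early only if the peer closes.
   Returns the number of bytes received, or -1 on error. */
long recv_all(int s, char *buf, size_t len)
{
    size_t done = 0;

    assert(buf != NULL);
    while (done < len) {
        ssize_t n = recv(s, buf + done, len - done, 0);
        if (n < 0)
            return -1;
        if (n == 0)
            break;                 /* orderly shutdown by the peer */
        done += (size_t)n;
    }
    return (long)done;
}
```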
The program, sockspeedp6
Sockspeedp6 is the sixth version of the sockspeed program. It has acquired the timed test capability (see my previous column) and thus won't take forever to generate performance numbers. Sockspeedp6.cpp creates a child process and sends data to it. Because it is process oriented, the parent and child can be started independently. This feature allows you to start the parent on one machine and the child on another. That is precisely the reason processes were used rather than threads. Its usage message prints the following:
|
Sockspeedp6.cpp compiles cleanly on Red Hat 7.2 and Windows 2000/Visual Studio 6.0.
Compiler | Version |
Windows/Microsoft C/C++ | Version 12.00.8804 for 80x86 |
Linux/GNU C/C++ | 2.96 |
Sockspeedp6 was used to look at the data transfer rates from one process to another process on the same machine. The values generated should give us a good idea of how fast the underlying networking code can potentially transfer data, independent of the media speed. I investigated transfer sizes from 16 bytes to 1 megabyte. I did not investigate the no-delay option (turning off the Nagle algorithm), the non-blocking option, or changing the receive or transmit buffer sizes. Those investigations await another column.
The programming practices represented by sockspeedp6.cpp are straightforward. If there are better programming techniques for Windows or Linux, I would like to hear about them (use the discussion forum) and code them into sockspeedp6.cpp to demonstrate the improvements. Sockspeedp6 gains from past experiences. It changes memory before each send() operation and reads all of the memory acquired by the recv() operation. As mentioned, it also uses the timed test techniques presented in my previous column.
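Touching the buffer on both sides can be sketched as below. The function names are hypothetical, and writing and reading every byte is an assumption; the actual program may touch memory at a different granularity.

```c
#include <assert.h>
#include <stddef.h>

/* Change every byte so each send() transmits freshly written memory. */
void touch_buffer(unsigned char *buf, size_t len, unsigned char stamp)
{
    size_t i;

    assert(buf != NULL);
    for (i = 0; i < len; i++)
        buf[i] = (unsigned char)(stamp + i);   /* wraps modulo 256 */
}

/* Read every received byte. Returning the sum gives the loop a
   visible result so the compiler cannot discard the reads. */
unsigned long read_buffer(const unsigned char *buf, size_t len)
{
    unsigned long sum = 0;
    size_t i;

    assert(buf != NULL);
    for (i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}
```

A sender would call touch_buffer() before each send() with a changing stamp, and a receiver would call read_buffer() on the bytes returned by each recv().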
Results
All tests were run on an IBM ThinkPad 600X with 576 MB of memory and an 18-GB disk. The system boots all three operating systems. Figure 1 shows the results for Windows 2000 Advanced Server, Windows XP Professional, and Red Hat 7.2 (Linux 2.4.2).
During the development and measurements presented here, I noticed that Windows 2000 seemed to perform better when it used the localhost IP address, 127.0.0.1. I re-ran the test for all three platforms using the localhost IP address. Figure 2 shows the results. For the 127.0.0.1 tests, I just plotted skinny lines with markers on top of Figure 1. It appears that Windows 2000 does indeed transfer data faster over the 127.0.0.1 address. However, it also shows that Windows XP seems to have removed this feature.
The results show that Red Hat 7.2 provides a significantly faster socket implementation. Using the coding techniques demonstrated here, Linux achieved a 2.5 times faster transfer rate than either of the Windows platforms. Are these coding techniques optimal for Windows? I don't know the answer to that question. I do know that lots of programs are written with simple send() and recv() loops similar to the ones contained in the sockspeedp6 program. If there are techniques for improving the performance on Windows, the readers and I would like to hear about them in the discussion forum.
We can compare these numbers with the numbers generated for pipes in previous columns. Linux achieved a maximum speed of 400 MB/sec for "touched" memory. On the same graph (Figure 3 in my previous column), Windows achieved a 100 MB/sec speed for "touched" memory for a 4x advantage to Linux. Comparing pipes and sockets, we find that Linux pipes are roughly 5 times faster than Linux sockets, and that Windows pipes are 3 times faster than Windows sockets.
A future column will look into parameterizations of sockets in an effort to find higher transfer rates. A future column will also look into using sockets over a physical medium.
With the knowledge gained from this study, we can anticipate the answer to the following question: If the current system can transfer data at X MB/sec without using the media, how many 100 Mb/sec adapters will be needed to saturate the CPU? A simple guess says 100 Mb/sec translates to roughly 10 megabytes per second (12.5 MB/sec on the wire, less protocol overhead). Divide the no-media maximum transfer rate by 10 and you get the number of adapters required to saturate the CPU. This is a front-end guess. A future column will see if this guess holds water.
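The estimate is a single division, shown here as a small C sketch. The function and macro names are hypothetical, and the 10 MB/sec per adapter figure is the rule of thumb from the text rather than a measured value.

```c
#include <assert.h>

/* Rule of thumb from the text: one 100 Mb/sec adapter carries about 10 MB/sec. */
#define MB_PER_SEC_PER_ADAPTER 10.0

/* Estimated number of adapters needed to match a no-media transfer rate. */
double adapters_to_saturate(double no_media_mb_per_sec)
{
    assert(no_media_mb_per_sec >= 0.0);
    return no_media_mb_per_sec / MB_PER_SEC_PER_ADAPTER;
}
```

For a hypothetical no-media rate of 80 MB/sec, adapters_to_saturate(80.0) gives 8 adapters.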
Conclusion
I wrote one program, sockspeedp6.cpp, and one shell script, sockspeedp6-sh.sh, to demonstrate the usage and to measure the performance of sockets on Windows and Linux without using the actual media. The results show Windows (both versions) to be considerably slower at transferring data over sockets than Linux (Red Hat 7.2).
- Participate in the discussion forum on this article. (You can also click Discuss at the top or bottom of the article to access the forum.)
- View the source for Ed's sockspeedp6.cpp program and sockspeedp6-sh.sh shell script.
- Read Ed's previous RunTime columns on developerWorks:
  - Introductory column
  - Block memory copy
  - Block memory copy, Part 2
  - Pipes in Linux, Windows 2000, and Windows XP
  - Synchronizing processes and threads
About the author
Ed manages the Microsoft Premier Support for IBM Software group and writes a weekly newsletter for Linux and Windows 2000 software developers. Ed can be reached at egb@us.ibm.com.