IBM developerWorks: High-performance programming (repost)

Original post:
http://www-106.ibm.com/developerworks/linux/library/l-rt6/?t=gr,Redhat=Sockets

High-performance programming techniques on Linux and Windows

Dr. Edward G. Bradford (egb@us.ibm.com)
Senior programmer, IBM
01 Nov 2001

In last month's column, Ed covered synchronization primitives and gave a reprise on pipes. This month he takes a first look at communication using sockets. Ed demonstrates some techniques for writing a sockets program and shows how his programming techniques perform in various operating system environments. Share your thoughts on this article with the author and other readers in the discussion forum by clicking Discuss at the top or bottom of the article.

This month my focus is on data communication through sockets. Sockets are a programming interface developed by contributors at the University of California at Berkeley in the 1980s. Sockets represent a complete mechanism for establishing network connections between two end points potentially on different computers. The end points are created and bound to each other using the sockets APIs.

Linux and Windows sockets interoperate seamlessly, and writing a program that compiles on both systems poses only modest programming challenges.

Sockets come in a number of flavors:

Stream
Datagram
Raw
Sequenced packet
Reliably delivered message

For transferring large amounts of data, a virtual circuit is the best choice and a socket stream is a virtual circuit. We will be looking at stream-type sockets in this installment.

Simplistically, a client creates a socket and tries to connect to a known end point. A server creates a socket, binds the socket to an endpoint name (gives it a name), and then awaits a connection. When the client connects and the server receives the connection, data communication begins at each end's discretion. Sockets support bi-directional data transfer.
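
The sequence just described can be sketched in a few lines of C. This is a minimal illustration, assuming a POSIX system and the loopback interface; the function name loopback_roundtrip is hypothetical and the code is not taken from the author's program. Both ends live in one process here only to keep the sketch short.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: pass one byte from a client socket to a server socket
 * over TCP on the loopback interface, all inside one process.
 * Returns the byte the server received, or -1 on any failure. */
int loopback_roundtrip(char token)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    char got = 0;
    int result = -1;
    int client = -1, conn = -1;

    /* Server side: create the socket, bind it (give it a name), listen. */
    int listener = socket(AF_INET, SOCK_STREAM, 0);
    if (listener < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                  /* let the kernel pick a free port */
    if (bind(listener, (struct sockaddr *)&addr, sizeof(addr)) != 0 ||
        listen(listener, 1) != 0 ||
        getsockname(listener, (struct sockaddr *)&addr, &len) != 0)
        goto done;

    /* Client side: create a socket and connect to the known end point. */
    client = socket(AF_INET, SOCK_STREAM, 0);
    if (client < 0 ||
        connect(client, (struct sockaddr *)&addr, sizeof(addr)) != 0)
        goto done;

    /* Server side again: receive the pending connection. */
    conn = accept(listener, NULL, NULL);
    if (conn < 0)
        goto done;

    /* Either end may now send; here the client sends and the server receives. */
    if (send(client, &token, 1, 0) == 1 && recv(conn, &got, 1, 0) == 1)
        result = (unsigned char)got;

done:
    if (conn >= 0)
        close(conn);
    if (client >= 0)
        close(client);
    close(listener);
    return result;
}
```

Binding to port 0 and reading the chosen port back with getsockname() is a convenience of the sketch; a real server would normally bind to a port the client already knows.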

Windows and Linux sockets
Windows and Linux both support Berkeley-style sockets. Windows also supports "Windows Sockets". Both versions of sockets on Windows require the following initialization code:

Listing 1. WSAStartup


 
#ifdef _WIN32
    WSADATA wsadata;

    rc = WSAStartup(2, &wsadata);
    if(rc) {
        printf("WSAStartup FAILED: err=%d\n", rc);
        return 1;
    }
#endif

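Here, 2 is the requested Windows Sockets version: the low-order byte of the argument holds the major version and the high-order byte holds the minor version, so 2 requests version 2.0. Using any non-zero number as the first argument works. (The first argument is an unsigned short.) Note that WSAStartup() returns its error code directly, which is why the code prints rc.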
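
The layout of that version word can be shown without any Windows headers. The sketch below is a portable stand-in for the way the Windows MAKEWORD macro is used with WSAStartup(); the function names are hypothetical and exist only for illustration.

```c
#include <stdint.h>

/* Hypothetical portable stand-in for MAKEWORD(major, minor) as used to
 * build the first argument of WSAStartup(). */
uint16_t make_winsock_version(uint8_t major, uint8_t minor)
{
    /* low-order byte = major version, high-order byte = minor version */
    return (uint16_t)(major | (minor << 8));
}

/* Split a version word back into its two parts. */
uint8_t winsock_major(uint16_t version) { return (uint8_t)(version & 0xFF); }
uint8_t winsock_minor(uint16_t version) { return (uint8_t)(version >> 8); }
```

With this layout, the literal 2 in Listing 1 decodes as major version 2 and minor version 0, while a request for version 2.2 would be the value 0x0202.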

The necessity for the Windows sockets interface versus the vanilla Berkeley sockets interface is not clear to me. Windows sockets seems to support other transport protocols. However, with the uniform acceptance of the Internet and its TCP/IP protocols, I don't understand the value of the added complexity of Windows Sockets.

With one exception, I have written the program in this article using the Berkeley-style interfaces. If there are fundamental reasons why the WSA interfaces should be preferred over the Berkeley-style interfaces, I am unaware of them.

Socket creation
Sockets are created with the socket() API supported on Linux and Windows:

Listing 2. socket() API



 SOCKET socket(
    int af,       
    int type,     
    int protocol  
);

af is the address family, and I use AF_INET. type is either SOCK_STREAM or SOCK_DGRAM, and here I focus only on SOCK_STREAM. protocol is a number selected from the /etc/protocols file on Linux and the /winnt/system32/drivers/etc/protocol file on Windows. I'm sticking with 0, which is listed as the IP protocol on both systems; passing 0 lets the system pick the default protocol for the socket type, which is TCP for SOCK_STREAM with AF_INET.

Windows also has the Microsoft proprietary interface called Windows Sockets with an API described as:

Listing 3. WSASocket interface



SOCKET WSASocket(
    int af,
    int type,
    int protocol,
    LPWSAPROTOCOL_INFO lpProtocolInfo,
    GROUP g,
    DWORD dwFlags
);

(For the Linux and non-Windows programmers, a Microsoft DWORD is just an unsigned long.) The first three parameters are identical to the standard socket() interface. The three new parameters are interesting and provide fertile ground for testing. The first extra parameter is the lpProtocolInfo pointer. The structure referenced by lpProtocolInfo is:

Listing 4. WSAPROTOCOL_INFO structure



typedef struct _WSAPROTOCOL_INFO {
  DWORD                dwServiceFlags1;
  DWORD                dwServiceFlags2;
  DWORD                dwServiceFlags3;
  DWORD                dwServiceFlags4;
  DWORD                dwProviderFlags;
  GUID                 ProviderId;
  DWORD                dwCatalogEntryId;
  WSAPROTOCOLCHAIN     ProtocolChain;
  int                  iVersion;
  int                  iAddressFamily;
  int                  iMaxSockAddr;
  int                  iMinSockAddr;
  int                  iSocketType;
  int                  iProtocol;
  int                  iProtocolMaxOffset;
  int                  iNetworkByteOrder;
  int                  iSecurityScheme;
  DWORD                dwMessageSize;
  DWORD                dwProviderReserved;
  TCHAR                szProtocol[WSAPROTOCOL_LEN+1];
} WSAPROTOCOL_INFO, *LPWSAPROTOCOL_INFO;

The GROUP g is reserved. However, there is no restriction on its values. The two parameters lpProtocolInfo and dwFlags introduce a significant amount of complexity to programming. I did a back-of-the-envelope calculation on the number of test programs that would be required to fully test the WSASocket() API. For instance, the WSAPROTOCOL_INFO structure shows the following conservative estimate of the number of legal values.

Note: The (x n) values below show the possible legal values for each parameter. They are conservative in the sense that I have only attributed a single possible legal value to the szProtocol string, when in fact the string can be any string characters up to a length of 255. None of my assumptions appears to be in conflict with the documentation. "???" means I didn't understand the documentation on the parameter or what the purpose of the parameter was. I used the February and June 2001 Platform SDK.

Table 1. Conservative estimate of the number of legal values for WSAPROTOCOL_INFO

dwServiceFlags1	Bit fields. 19 bits have been defined. 2^19 = 512K legal values. (x 524,288)
dwServiceFlags2	Reserved (x 1)
dwServiceFlags3	Reserved (x 1)
dwServiceFlags4	Reserved (x 1)
ProviderId	A GUID that disambiguates between multiple providers providing the same protocol. (x 1)
dwCatalogEntryId	Unique identifier assigned by the WS2_32.DLL for each WSAPROTOCOL_INFO structure. (x 1)
ProtocolChain	A structure of 7 entries. The structure represents a protocol chain consisting of one or more protocols on top of a base protocol. (x 1)
iVersion	Protocol Version Identifier. (x 1)
iAddressFamily	Address family. Probably the same as in the WSASocket interface. (x 1)
iMaxSockAddr	"Maximum Address Length" ??? (x 1)
iMinSockAddr	"Minimum Address Length" ??? (x 1)
iSocketType	socket type. 2 values but parameter already accounted for in socket() API. (x 1)
iProtocol	same as in socket() API. We'll only consider one (x 1)
iProtocolMaxOffset	Windows-specific ??? (x 1)
iNetworkByteOrder	BIGENDIAN or LITTLEENDIAN (x 2)
iSecurityScheme	Only one is defined. (x 1)
dwMessageSize	Maximum message size. Three special values are defined plus anything the actual protocol supports. (x 3)
dwProviderReserved	Reserved (x 1)
szProtocol[WSAPROTOCOL_LEN+1]	Possibly a Unicode array of characters identifying the protocol. (x 1)

The dwFlags parameter has 5 bit fields defined.

WSA_FLAG_OVERLAPPED
WSA_FLAG_MULTIPOINT_C_ROOT
WSA_FLAG_MULTIPOINT_C_LEAF
WSA_FLAG_MULTIPOINT_D_ROOT
WSA_FLAG_MULTIPOINT_D_LEAF

dwFlags - options 2^5 = (x 32)

The complexity issue is a real one. Programs are written using existing documentation, and programmers expect the documentation to be correct. If a simple API like the socket() API is complicated with additional parameterization and there is little likelihood that the additional parameter space is fully tested, a compelling reason should exist before using the larger parameter space. Programmers are well advised to stick to the main roads. Microsoft and Linux won't be making mistakes in simple socket open/connect/send-recv/close situations. However, the darker corners of a complex API are much less likely to be fully tested. Enormous amounts of time can be wasted trying to get obscure features of an API to perform as documented. In almost all cases a little more work on the part of the programmer would allow him or her to avoid the untested paths of complex APIs.

With this thought in mind, I computed the size of the additional parameter space of the WSASocket() API. The complexity discovered here is in addition to the still present parameterizations of the Berkeley Sockets interface. From the above possibilities, there appears to be

32 * 3 * 2 * 524,288 = 100,663,296

combinations of calls possible to present to the Windows operating system. This number does not include any parameterization on the first three parameters of the WSASocket() API call. It is hard for me to understand how even a substantial subset of these parameterizations can be tested or verified. I confess that I don't understand the use or even meaning of some of these parameters. For the purpose of this column, unless a reader can show how a program might be improved by the use of the more obscure interfaces of WSA sockets, I will avoid them.
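
The arithmetic behind that figure is easy to check mechanically. The sketch below simply multiplies the counts quoted in Table 1 and for dwFlags; the function and variable names are hypothetical.

```c
#include <stdint.h>

/* Multiply the per-parameter counts of legal values taken from Table 1
 * and from the dwFlags bit fields. */
uint64_t wsasocket_extra_combinations(void)
{
    const uint64_t service_flags1 = UINT64_C(1) << 19; /* 19 defined bits    */
    const uint64_t byte_orders    = 2;                 /* iNetworkByteOrder   */
    const uint64_t message_sizes  = 3;                 /* dwMessageSize       */
    const uint64_t flag_settings  = UINT64_C(1) << 5;  /* 5 dwFlags bit fields */

    return flag_settings * message_sizes * byte_orders * service_flags1;
}
```

The product is 32 * 3 * 2 * 524,288 = 100,663,296, matching the number in the text.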

Dodging the admonishment that "He who anticipates disaster, suffers it twice", I used the WSASocket() API but with a NULL lpProtocolInfo pointer and only the WSA_FLAG_OVERLAPPED bit set. The WSA_FLAG_OVERLAPPED flag is documented to be meaningful only when the parameterizations of WSASend, WSASendTo, etc. reflect an overlapped IO request. I don't use them either. Thus, the parameterizations I use with Windows Sockets (the WSA interfaces) are identical to the ones I use with the standard Berkeley-style interface. Those of you who have a better understanding of the WSA interfaces and overlapped IO issues might want to try to see if you can improve the performance numbers presented here.

Here is the code that creates a socket for both Windows and Linux:

Listing 5. Creating a socket on Windows and Linux



#ifdef _WIN32
#   define BADSOCK    INVALID_SOCKET
    sock1 = WSASocket(AF_INET,SOCK_STREAM,0,NULL,0,WSA_FLAG_OVERLAPPED);
#else
#   define BADSOCK    -1
    sock1 = socket(AF_INET, SOCK_STREAM, 0);
#endif
    if(sock1 == BADSOCK) {
        printf("socket FAILED: err=%d\n", errno);
        return 1;
    }

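The defines for BADSOCK are included in the code snippet for readability. In the actual program they are bunched in with other platform-specific defines. Note that on Windows, the socket functions report their error codes through WSAGetLastError() rather than errno, so a portable program needs a platform-specific define for the error value as well.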
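
A block of platform-specific defines of the kind mentioned here might look like the following. This is only a sketch: BADSOCK comes from Listing 5 and SOCKERR appears in the send and receive listings, but their definitions here, along with socket_t, sock_errno, close_socket, and open_stream_socket, are assumptions made for illustration and are not copied from sockspeedp6.cpp.

```c
#include <errno.h>
#include <stdio.h>

#ifdef _WIN32
#   include <winsock2.h>
    typedef SOCKET socket_t;
#   define BADSOCK         INVALID_SOCKET
#   define SOCKERR         SOCKET_ERROR
#   define sock_errno()    WSAGetLastError()  /* Winsock does not set errno */
#   define close_socket(s) closesocket(s)
#else
#   include <netinet/in.h>
#   include <sys/socket.h>
#   include <unistd.h>
    typedef int socket_t;
#   define BADSOCK         (-1)
#   define SOCKERR         (-1)
#   define sock_errno()    errno
#   define close_socket(s) close(s)
#endif

/* Create a TCP stream socket with the Berkeley-style call on both systems.
 * Returns BADSOCK and prints the platform's error code on failure. */
socket_t open_stream_socket(void)
{
    socket_t s = socket(AF_INET, SOCK_STREAM, 0);

    if (s == BADSOCK)
        printf("socket FAILED: err=%d\n", (int)sock_errno());
    return s;
}
```

On Windows, WSAStartup() must have succeeded before open_stream_socket() is called, and the program must be linked against the Winsock library.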

Connecting and accepting connections
Once we get past the preliminaries, socket programming on Windows and Linux is quite similar. Here is the client code that performs the connection to a listener:

Listing 6. Connecting


 
    if(connect(sock2, (struct sockaddr *)&addr1, sizeof(addr1))) {
        printf("connect FAILED: err=%d\n", errno);
        return 1;
    }

There are no conditional definitions needed. The parent creates a socket and awaits a connection with this code:

Listing 7. Accepting a connection



    rc = listen(sock1,1);
    if(rc) {
        printf("Listen FAILED: err=%d\n", errno);
        return 1;
    }
    sock3 = accept(sock1, (struct sockaddr *)&addr2, (socklen_t *)&addr2len);
    if(sock3 == BADSOCK) {
        printf("Accept FAILED: err=%d\n", errno);
        return 1;
    }

Once again, no conditional definitions are needed. Finally, we come to the transmission and reception of data.

Sending and receiving data
Code to send data is:

Listing 8. Socket send() operation


 
        rc = send(sock3, (char *)&wtoken[0], 1, 0);
        if(rc == SOCKERR) {
            printf("send (1) FAILED: err=%d\n", errno);
            return 1;
        }

and

Listing 9. Socket recv() operation



        rc = recv(sock2, (char *)&rtoken[0], 1, 0);
        if(rc == SOCKERR) {
            printf("recv (1) FAILED: err=%d\n", errno);
            return 1;
        }
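
The listings above move a single byte. For larger blocks on a stream socket, send() and recv() may transfer fewer bytes than requested, so a loop is usually wrapped around each call. The sketch below shows one common way to do that on a POSIX system; send_all and recv_all are hypothetical helper names, not functions from sockspeedp6.cpp.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Send exactly len bytes; returns 0 on success, -1 on error. */
int send_all(int sock, const char *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t rc = send(sock, buf + done, len - done, 0);
        if (rc < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        done += (size_t)rc;         /* send() may accept only part of the block */
    }
    return 0;
}

/* Receive exactly len bytes; returns 0 on success, -1 on error or early EOF. */
int recv_all(int sock, char *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t rc = recv(sock, buf + done, len - done, 0);
        if (rc < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (rc == 0)
            return -1;              /* peer closed before the block was complete */
        done += (size_t)rc;
    }
    return 0;
}
```

A throughput test does not strictly need recv_all(), since it only counts bytes, but any protocol with fixed-size messages does.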

The program, sockspeedp6
Sockspeedp6 is the 6th version of the sockspeed program. It has acquired the timed test capability (see my previous column) and thus won't take forever to generate performance numbers. Sockspeedp6.cpp creates a child process and sends data to it. Because it is process oriented, the parent and child can be started independently. This feature allows you to start the parent on one machine and the child on another. That is precisely the reason processes were used rather than threads. Its usage message prints the following:

sockspeedp6 usage message


 
USAGE: sockspeedp6.exe [-sendbufsz N] [-recvbufsz M] [-summary] [-nodelay] \
                       [-nonblocking] hostname:port [nseconds] [bytes]
  where nseconds run time
  where bytes is the block size
   -summary - print only blocksize and transfer rate
  sockspeedp6.exe -parent [options] - Run as Parent only
  sockspeedp6.exe -child [options] - Run as Child only
  sockspeedp6.exe -sendbufsz N - set the socket send buffer size to N
  sockspeedp6.exe -recvbufsz M - set the socket receive buffer size to M

Sockspeedp6.cpp compiles cleanly on Red Hat 7.2 and Windows 2000/Visual Studio 6.0.

Compiler	Version
Windows/Microsoft C/C++	Version 12.00.8804 for 80x86
Linux/GNU C/C++	2.96

Sockspeedp6 was used to look at the data transfer rates from one process to another process on the same machine. The values generated should give us a good idea of how fast the underlying networking code can potentially transfer data, independent of the media speed. I investigated transfer sizes from 16 bytes to 1 megabyte. I did not investigate the no-delay option (turning off the Nagle algorithm), the non-blocking option, or changing the receive or transmit buffer sizes. Those investigations await another column.
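
The no-delay and buffer-size options named in the usage message correspond to standard socket options. A minimal sketch of how they are typically set on a POSIX system follows; the helper names are hypothetical, and how sockspeedp6.cpp itself applies these options is not shown in this column.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Turn the Nagle algorithm off (on != 0) or back on (on == 0). */
int set_nodelay(int sock, int on)
{
    return setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}

/* Request send and receive buffer sizes in bytes.
 * A size of 0 or less leaves the system default untouched. */
int set_buffer_sizes(int sock, int sendbufsz, int recvbufsz)
{
    if (sendbufsz > 0 &&
        setsockopt(sock, SOL_SOCKET, SO_SNDBUF,
                   &sendbufsz, sizeof(sendbufsz)) != 0)
        return -1;
    if (recvbufsz > 0 &&
        setsockopt(sock, SOL_SOCKET, SO_RCVBUF,
                   &recvbufsz, sizeof(recvbufsz)) != 0)
        return -1;
    return 0;
}

/* Read back the send buffer size the kernel actually granted, or -1. */
int get_send_buffer_size(int sock)
{
    int size = 0;
    socklen_t len = sizeof(size);

    if (getsockopt(sock, SOL_SOCKET, SO_SNDBUF, &size, &len) != 0)
        return -1;
    return size;
}
```

The value read back need not equal the value requested; Linux, for example, reports a larger number because it reserves extra space for bookkeeping.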

The programming practices represented by sockspeedp6.cpp are straightforward. If there are better programming techniques for Windows or Linux, I would like to hear about them (use the discussion forum) and code them into sockspeedp6.cpp to demonstrate the improvements. Sockspeedp6 gains from past experiences. It changes memory before each send() operation and reads all of the memory acquired by the recv() operation. As mentioned, it also uses the timed test techniques presented in my previous column.
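
The three practices named above (a timed run, changing the memory before each send, and reading every byte received) can be sketched as follows. This is an illustration under stated assumptions, not the author's code: it uses a POSIX socketpair() between a parent and a forked child instead of a TCP connection, so it shows the shape of the measurement loop rather than reproducing the numbers in this column. All function names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Wall-clock time in seconds. */
static double now_seconds(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* Child side: receive until EOF and read every byte that arrives. */
static unsigned long drain_and_touch(int sock, char *buf, size_t bytes)
{
    volatile unsigned long sum = 0;
    ssize_t n, i;

    while ((n = recv(sock, buf, bytes, 0)) > 0)
        for (i = 0; i < n; i++)
            sum += (unsigned char)buf[i];
    return sum;
}

/* Send blocks of `bytes` bytes to a child process for about `seconds`.
 * Returns MB/sec (1 MB = 1,000,000 bytes), or -1.0 on error. */
double timed_transfer_rate(size_t bytes, double seconds)
{
    int sv[2], stamp = 0;
    pid_t pid;
    char *buf;
    double start, elapsed, total = 0.0;

    if (bytes == 0 || (buf = malloc(bytes)) == NULL)
        return -1.0;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
        free(buf);
        return -1.0;
    }
    pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        free(buf);
        return -1.0;
    }
    if (pid == 0) {                     /* child: the receiver */
        close(sv[0]);
        (void)drain_and_touch(sv[1], buf, bytes);
        _exit(0);
    }
    close(sv[1]);                       /* parent: the sender */
    start = now_seconds();
    while (now_seconds() - start < seconds) {
        size_t sent = 0;
        memset(buf, stamp++ & 0xff, bytes);  /* change memory before each send */
        while (sent < bytes) {
            ssize_t rc = send(sv[0], buf + sent, bytes - sent, 0);
            if (rc <= 0)
                break;
            sent += (size_t)rc;
        }
        if (sent < bytes)
            break;                      /* send failed: stop the run */
        total += (double)bytes;
    }
    elapsed = now_seconds() - start;
    close(sv[0]);                       /* EOF lets the child finish */
    waitpid(pid, NULL, 0);
    free(buf);
    return elapsed > 0.0 ? total / elapsed / 1e6 : -1.0;
}
```

Calling timed_transfer_rate(65536, 2.0), for example, would send 64 KB blocks for about two seconds and return the observed rate; the result depends entirely on the machine it runs on.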

Results
All tests were run on an IBM ThinkPad 600X with 576 MB of memory and an 18-GB disk. The system boots all three operating systems. Figure 1 shows the results for Windows 2000 Advanced Server, Windows XP Professional, and Red Hat 7.2 (Linux 2.4.2).

Figure 1. Socket transfer speeds, single processor

During the development and measurements presented here, I noticed that Windows 2000 seemed to perform better when it used the localhost IP address, 127.0.0.1. I re-ran the test for all three platforms using the localhost IP address. Figure 2 shows the results. For the 127.0.0.1 tests, I just plotted skinny lines with markers on top of Figure 1. It appears that Windows 2000 does indeed transfer data faster over the 127.0.0.1 address. However, it also shows that Windows XP seems to have removed this feature.

Figure 2. Socket transfer speeds

The results show that Red Hat 7.2 provides a significantly faster socket implementation. Using the coding techniques demonstrated here, Linux achieved a transfer rate 2.5 times faster than either of the Windows platforms. Are these coding techniques optimal for Windows? I don't know the answer to that question. I do know that lots of programs are written with simple send() and recv() loops similar to the ones contained in the sockspeedp6 program. If there are techniques for improving the performance on Windows, the readers and I would like to hear about them in the discussion forum.

We can compare these numbers with the numbers generated for pipes in previous columns. Linux achieved a maximum speed of 400 MB/sec for "touched" memory. On the same graph (Figure 3 in my previous column), Windows achieved 100 MB/sec for "touched" memory, a 4x advantage for Linux. Comparing pipes and sockets, we find that Linux pipes are roughly 5 times faster than Linux sockets, and that Windows pipes are 3 times faster than Windows sockets.

A future column will look into parameterizations of sockets in an effort to find higher transfer rates. In a future column we will also look into using sockets over a physical medium.

With the knowledge gained from this study, we can anticipate the answer to the following question: If the current system, without using the media, can transfer data at X MB/sec, how many 100 Mb/sec adapters will be needed to saturate the CPU? A simple guess says 100 Mb/sec translates to roughly 10 megabytes per second (the raw figure is 12.5 MB/sec before any protocol overhead). Divide the no-media maximum transfer rate by 10 and you get the number of adapters required to saturate the CPU. This is a first-order guess. A future column will see if this guess holds water.
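
The guess can be written down as a one-line calculation. The sketch below uses the 10 MB/sec per adapter rule of thumb from the text and rounds up to whole adapters; the function name is hypothetical. As an example input only, the pipe comparison above (400 MB/sec for Linux pipes, roughly 5 times faster than Linux sockets) implies about 80 MB/sec for Linux sockets.

```c
/* Rule of thumb from the text: one 100 Mb/sec adapter carries
 * about 10 MB/sec. */
#define MB_PER_SEC_PER_ADAPTER 10u

/* Number of adapters needed to carry no_media_mb_per_sec, rounded up
 * to a whole adapter. */
unsigned adapters_to_saturate(unsigned no_media_mb_per_sec)
{
    return (no_media_mb_per_sec + MB_PER_SEC_PER_ADAPTER - 1u)
           / MB_PER_SEC_PER_ADAPTER;
}
```

With the example input of 80 MB/sec, adapters_to_saturate(80) gives 8. Whether a real system behaves this way is exactly what the guess leaves open.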

Conclusion
I wrote one program, sockspeedp6.cpp, and one shell script, sockspeedp6-sh.sh, to demonstrate the usage and to measure the performance of sockets on Windows and Linux without using the actual media. The results show Windows (both versions) to be considerably slower at transferring data over sockets than Linux (Red Hat 7.2).

Resources

Participate in the discussion forum on this article. (You can also click Discuss at the top or bottom of the article to access the forum.)
View the source for Ed's sockspeedp6.cpp program and sockspeedp6-sh.sh shell script.
Read Ed's previous RunTime columns on developerWorks:
Read these related articles on developerWorks:
- Operating system flexibility
Browse more Linux resources on developerWorks
Browse more Open source resources on developerWorks.

About the author
Ed manages the Microsoft Premier Support for IBM Software group and writes a weekly newsletter for Linux and Windows 2000 software developers. Ed can be reached at egb@us.ibm.com.