Berkeley Sockets
Network I/O is more complicated than ordinary file I/O. File I/O operations take place on a single system, so a file descriptor is sufficient to identify a file. Network I/O operations must identify both the remote host and the process running on it. Berkeley sockets support the following communication domains:
- Unix domain (on same Unix system)
- Internet domain (TCP/IP)
Socket Addresses
A socket address identifies the address family (the communication domain) and an address within that family. The Unix domain socket address structure sockaddr_un is defined in <sys/un.h>.
struct sockaddr_un{
short sun_family; /*AF_UNIX*/
char sun_path[108]; /* path name */
};
The Internet socket address structure sockaddr_in is defined in <netinet/in.h>.
struct in_addr {
u_long s_addr; /* 32-bit Internet address, network byte order */
};
struct sockaddr_in {
short sin_family; /* AF_INET */
u_short sin_port; /* 16-bit port number */
struct in_addr sin_addr;
char sin_zero[8]; /* unused */
};
socket System Call
The system call socket creates a socket, which is one endpoint of a communication channel.
#include <sys/types.h>
#include <sys/socket.h>
int socket (int family, int type, int protocol) ;
The first parameter family specifies the communication protocol used. It can be one of:
AF_UNIX Unix
AF_INET Internet
The second parameter type specifies the type of socket. We use stream sockets:
SOCK_STREAM
The third parameter protocol is usually set to zero. The socket system call returns a small integer called a socket descriptor. It is similar to a file descriptor and is passed to the other socket system calls. For example,
int sockfd;
sockfd = socket (AF_UNIX, SOCK_STREAM, 0);
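The return value of socket should be checked before the descriptor is used, since the call returns -1 on failure. The following is only a minimal sketch under the assumption of a POSIX system; the helper name open_stream_socket is hypothetical and does not come from these notes.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: create a stream socket in the given family
 * (AF_UNIX or AF_INET). Returns the socket descriptor, or -1 on failure. */
int open_stream_socket(int family)
{
    int sockfd = socket(family, SOCK_STREAM, 0);

    if (sockfd < 0)
        perror("socket");   /* errno says why the call failed */
    return sockfd;
}
```

A caller would write, for instance, sockfd = open_stream_socket(AF_UNIX); and give up if the result is negative.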
bind System Call
The system call bind associates an address with a socket descriptor created by socket.
#include <sys/types.h>
#include <sys/socket.h>
int bind (int sockfd, struct sockaddr *myaddr, int addrlen);
The second parameter myaddr specifies a pointer to a predefined address of the socket. Its structure is a general address structure so that the bind system call can be used by both Unix domain and Internet domain sockets. The structure sockaddr is defined in <sys/socket.h>.
struct sockaddr {
u_short sa_family; /* address family; AF_xxx */
char sa_data[14]; /* protocol specific address */
};
Since Unix domain and Internet domain have different address structures as shown earlier, type casting is needed.
#define SERV_PATH "./serv.path"
struct sockaddr_un serv_addr;
int servlen;
serv_addr.sun_family = AF_UNIX;
strcpy(serv_addr.sun_path, SERV_PATH);
servlen = strlen(serv_addr.sun_path) + sizeof(serv_addr.sun_family);
bind(sockfd, (struct sockaddr *) &serv_addr, servlen);
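Note that the BSD network system calls do not assume that the Unix pathname in sun_path is terminated with a null byte, which is why the length passed to bind counts only the characters of the path. Before filling in the fields, we can clear the address structure with: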
bzero((char *) &serv_addr, sizeof(serv_addr));
After the call to bind, the system knows the name of the socket. In the Unix domain, that name is a path name.
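The Unix domain steps above (clear the structure, fill in the family and path, call bind) can be put together as in the sketch below. It rests on several assumptions that are not in the notes: the helper name bind_unix_socket is hypothetical, a stale socket file is removed with unlink before binding, and the full structure size is passed as the address length, which current systems accept when the structure has been zeroed.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: create a Unix domain stream socket and bind it
 * to the given path. Returns the descriptor, or -1 on failure. */
int bind_unix_socket(const char *path)
{
    struct sockaddr_un addr;
    int sockfd;

    if (strlen(path) >= sizeof(addr.sun_path))
        return -1;                       /* path does not fit in sun_path */

    if ((sockfd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));      /* same effect as bzero */
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);

    unlink(path);                        /* assumption: any old socket file may go */
    if (bind(sockfd, (struct sockaddr *) &addr, sizeof(addr)) < 0) {
        perror("bind");
        close(sockfd);
        return -1;
    }
    return sockfd;
}
```

Because bind creates a file at the given path, the server should unlink that path again when it shuts down.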
In the Internet domain, the address structure has more fields than in the Unix domain. The first field sin_family is AF_INET. The second field sin_port is a 16-bit port number; for our own servers it can be any value greater than 5000 (and at most 65535). Lower port numbers are reserved for specific services. For example, 21 is for FTP (see <netinet/in.h>). The port number must be agreed upon by both the server and its clients. The third field sin_addr holds the 32-bit Internet address. For the server, the constant INADDR_ANY (defined in <netinet/in.h>) tells the system that we will accept a connection on any Internet interface of the system. Since different architectures use different byte orderings (big endian and little endian), the following four functions convert between network (Internet) byte order and host byte order: htonl (host to network long), htons (host to network short), ntohl (network to host long), and ntohs (network to host short). Here is a header file for both the server and the clients:
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#define SERV_PORT 5432
#define SERV_HOST_ADDR "130.113.68.1"
In the server program, we have
struct sockaddr_in serv_addr;
bzero((char *) &serv_addr, sizeof(serv_addr));
serv_addr.sin_family = AF_INET;
serv_addr.sin_port = htons (SERV_PORT);
serv_addr.sin_addr.s_addr = htonl (INADDR_ANY);
bind (sockfd, (struct sockaddr *) &serv_addr, sizeof (serv_addr));
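To make the role of the byte-order functions concrete, the sketch below fills in a wildcard server address and reads the port back out. The helper names make_any_addr and addr_port are hypothetical; the point is only that sin_port is stored in network byte order and must be converted with ntohs before it is used as an ordinary integer.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: build the wildcard (INADDR_ANY) server address
 * for a port number given in host byte order. */
struct sockaddr_in make_any_addr(unsigned short port)
{
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);              /* host -> network short */
    addr.sin_addr.s_addr = htonl(INADDR_ANY); /* host -> network long  */
    return addr;
}

/* Hypothetical helper: return the port of an address in host byte order. */
unsigned short addr_port(const struct sockaddr_in *addr)
{
    return ntohs(addr->sin_port);             /* network -> host short */
}
```

On any architecture, the two bytes of sin_port for port 5432 (hexadecimal 1538) are stored in memory as 0x15 followed by 0x38, because network byte order is big endian.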
listen System Call
The system call listen is used by a connection-oriented server to indicate that it is ready to accept connection requests from clients:
int listen (int sockfd, int backlog);
The second parameter backlog specifies the number of requests that can be queued by the system before the server executes the accept system call. This system call is usually used after bind and before accept. For example,
listen ( sockfd, 5);
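The usual order of calls on the server side, socket, then bind, then listen, might be wrapped up as follows for the Internet domain. This is a sketch rather than a definitive implementation; the helper name listen_on_port is hypothetical.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: create an Internet stream socket, bind it to the
 * given port on any interface, and mark it as a listening socket.
 * Returns the descriptor, or -1 on failure. */
int listen_on_port(unsigned short port, int backlog)
{
    struct sockaddr_in addr;
    int sockfd;

    if ((sockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_ANY);

    if (bind(sockfd, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
        listen(sockfd, backlog) < 0) {
        perror("bind/listen");
        close(sockfd);
        return -1;
    }
    return sockfd;
}
```

With the header file shown earlier, a server would call listen_on_port(SERV_PORT, 5). Passing port 0 instead asks the system to choose a free port, which is convenient for experiments.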
accept System Call
The system call accept is used by a connection-oriented server to set up an actual connection with a client process.
#include <sys/types.h>
#include <sys/socket.h>
int accept (int sockfd, struct sockaddr *cli_addr, int *addrlen ) ;
This system call returns a new socket descriptor. In a concurrent server, after accept and fork, the child process closes the original socket descriptor and uses the new socket descriptor so that the parent process can accept more connections using the original socket descriptor.
for ( ; ; ) {
    newsockfd = accept (sockfd, ...);
    if (fork() == 0) {
        /* child: serve this client on newsockfd */
        close (sockfd);
        <do whatever using newsockfd>;
        exit (0);
    }
    /* parent: the child owns the connection now */
    close (newsockfd);
}
This system call also returns the address of the client process through its second parameter cli_addr and the length of that address through its third parameter addrlen. In the Unix domain, since the address length depends on the length of the path name, the server sets addrlen to the size of the address structure before the call. When accept returns, addrlen holds the actual length of the client's address. For example,
struct sockaddr_un cli_addr;
int clilen;
clilen = sizeof (cli_addr);
newsockfd = accept (sockfd, (struct sockaddr * ) &cli_addr, &clilen);
After accept, the system knows the client process.
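In the Internet domain, the address returned by accept can be turned back into readable form with inet_ntoa and ntohs. The sketch below assumes an already listening Internet socket; the helper name accept_client is hypothetical. Note that current systems declare the third parameter of accept as socklen_t * rather than int *, so the sketch uses socklen_t.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: accept one connection on a listening Internet
 * socket and write the client's address into buf as "a.b.c.d:port".
 * Returns the new socket descriptor, or -1 on error. */
int accept_client(int sockfd, char *buf, size_t buflen)
{
    struct sockaddr_in cli_addr;
    socklen_t clilen = sizeof(cli_addr);  /* in: buffer size, out: actual length */
    int newsockfd;

    memset(&cli_addr, 0, sizeof(cli_addr));
    newsockfd = accept(sockfd, (struct sockaddr *) &cli_addr, &clilen);
    if (newsockfd < 0) {
        perror("accept");
        return -1;
    }
    snprintf(buf, buflen, "%s:%u", inet_ntoa(cli_addr.sin_addr),
             (unsigned) ntohs(cli_addr.sin_port));
    return newsockfd;
}
```

The returned descriptor is the one a concurrent server would hand to its child process, while the original sockfd stays open for further calls to accept.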
connect System Call
The system call connect is used by a client to establish a connection with the server.
#include <sys/types.h>
#include <sys/socket.h>
int connect (int sockfd, struct sockaddr *servaddr, int addrlen);
Its parameters have the same form as those of bind. The client must know the server address pointed to by servaddr and its length addrlen. A client does not have to bind a local address to the socket descriptor before calling connect.
#define SERV_PATH "./serv.path"
struct sockaddr_un serv_addr;
int servlen;
bzero((char *) &serv_addr, sizeof(serv_addr));
serv_addr.sun_family = AF_UNIX;
strcpy(serv_addr.sun_path, SERV_PATH);
servlen = strlen(serv_addr.sun_path) + sizeof(serv_addr.sun_family);
connect(sockfd, (struct sockaddr *) &serv_addr, servlen);
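A client usually wants to know whether the connection succeeded, because connect returns -1 when, for example, no server is listening at the path. The following sketch adds that check; the helper name connect_unix is hypothetical, and passing the full structure size as the length is an assumption that holds on current systems when the structure has been zeroed.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: connect a new Unix domain stream socket to the
 * server bound at path. Returns the descriptor, or -1 on failure. */
int connect_unix(const char *path)
{
    struct sockaddr_un addr;
    int sockfd;

    if (strlen(path) >= sizeof(addr.sun_path))
        return -1;                       /* path does not fit in sun_path */

    if ((sockfd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);

    if (connect(sockfd, (struct sockaddr *) &addr, sizeof(addr)) < 0) {
        perror("connect");               /* e.g. no server at this path */
        close(sockfd);
        return -1;
    }
    return sockfd;
}
```

Once connect_unix succeeds, the client can read from and write to the returned descriptor.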
In the Internet domain, the client includes the header file shown in the bind System Call section and uses
unsigned long inet_addr(char *ptr) ;
bzero((char *) &serv_addr, sizeof(serv_addr));
serv_addr.sin_family = AF_INET;
serv_addr.sin_port = htons(SERV_PORT);
serv_addr.sin_addr.s_addr = inet_addr (SERV_HOST_ADDR);
connect(sockfd, (struct sockaddr *) &serv_addr, sizeof(serv_addr));
Here the function inet_addr converts a string in dotted-decimal notation (e.g. 130.113.68.1) to a 32-bit Internet address in network byte order.
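The result of inet_addr is worth checking, since the function returns INADDR_NONE when the string is not a valid dotted-decimal address. The sketch below wraps that check; the helper name parse_dotted is hypothetical.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: convert dotted-decimal text to an Internet
 * address in network byte order. Returns 0 on success, -1 on failure. */
int parse_dotted(const char *text, struct in_addr *out)
{
    in_addr_t addr = inet_addr(text);

    if (addr == INADDR_NONE)
        return -1;   /* malformed (also the result for 255.255.255.255) */
    out->s_addr = addr;
    return 0;
}
```

For the string 130.113.68.1, the four bytes of s_addr are stored in memory as 130, 113, 68, 1 on any architecture, so the value can be assigned to sin_addr.s_addr without a call to htonl.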
Read and Write on a Stream Socket
Unlike file I/O, a read or write system call on a stream socket might transfer fewer bytes than requested. It is the programmer's responsibility to check the number of bytes actually read or written and to repeat the call until the whole request has been transferred.
#include <unistd.h>

/* read n bytes from a socket descriptor */
int readsock(int sockfd, char *buf, int nbytes)
{
    int nleft, nread;

    nleft = nbytes;
    while (nleft > 0) {
        if ((nread = read(sockfd, buf, nleft)) < 0)
            return (nread);     /* error, nread < 0 */
        else if (nread == 0)
            break;              /* EOF */
        /* nread > 0: update nleft and buf pointer */
        nleft -= nread;
        buf += nread;
    } /* while */
    return (nbytes - nleft);    /* number of bytes actually read */
} /* readsock() */
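The same care applies to output. A companion to readsock might look like the sketch below; the name writesock is hypothetical and not taken from these notes. It repeats write until all nbytes have been sent or an error occurs.

```c
#include <sys/types.h>
#include <unistd.h>

/* write n bytes to a socket descriptor (hypothetical companion to readsock) */
int writesock(int sockfd, const char *buf, int nbytes)
{
    int nleft = nbytes;
    ssize_t nwritten;

    while (nleft > 0) {
        nwritten = write(sockfd, buf, nleft);
        if (nwritten <= 0)
            return (int) nwritten;  /* error */
        /* nwritten > 0: update nleft and buf pointer */
        nleft -= nwritten;
        buf += nwritten;
    }
    return nbytes - nleft;          /* equals nbytes on success */
}
```

Both functions work on any descriptor, so they can be tried out on a pipe before being used on a socket.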