In the previous two essays, we looked at two approaches for creating concurrent logical flows. With the first approach, we use a separate process for each flow. The kernel schedules each process automatically. Each process has its own private address space, which makes it difficult for flows to share data. With the second approach, we create our own logical flows and use I/O multiplexing to explicitly schedule the flows. Because there is only one process, flows share the entire address space. This section introduces a third approach—based on threads—that is a hybrid of these two.
The code for a concurrent echo server based on threads is shown below.
#include "csapp.h"   // CS:APP helper library: open_listenfd, echo, the SA typedef
#include <pthread.h>

void *thread(void *vargp);

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <port>\n", argv[0]);
        exit(0);
    }
    int port = atoi(argv[1]);
    int listenfd = open_listenfd(port);

    struct sockaddr_in clientaddr;
    pthread_t tid;

    while (1) {
        socklen_t clientlen = sizeof(clientaddr);
        // Allocate a fresh block for each descriptor, so each peer
        // thread reads its own copy (see the race discussion below)
        int *connfdp = malloc(sizeof(int));
        *connfdp = accept(listenfd, (SA *) &clientaddr, &clientlen);
        pthread_create(&tid, NULL, thread, connfdp);
    }
}

// Thread routine
void *thread(void *vargp)
{
    int connfd = *((int *)vargp);
    pthread_detach(pthread_self());  // reap this thread's resources automatically
    free(vargp);                     // free the block malloc'd by the main thread
    echo(connfd);
    close(connfd);
    return NULL;
}
The overall structure is similar to the process-based design. The main thread repeatedly waits for a connection request and then creates a peer thread to handle the request. While the code looks simple, there are a couple of general and somewhat subtle issues we need to look at more closely. The first issue is how to pass the connected descriptor to the peer thread when we call pthread_create. The obvious approach is to pass a pointer to the descriptor, as in the following:
connfd = accept(listenfd, (SA *) &clientaddr, &clientlen);
pthread_create(&tid, NULL, thread, &connfd);
Then we have the peer thread dereference the pointer and assign it to a local variable, as follows:
void *thread(void *vargp)
{
int connfd = *((int *)vargp); // Assignment statement
// ...
return NULL;
}
This would be wrong, however, because it introduces a race between the assignment statement in the peer thread and the accept statement in the main thread. If the assignment completes before the next accept, then the local connfd variable in the peer thread gets the correct descriptor value. If, however, the next accept completes before the assignment, then connfd gets the descriptor number of the next connection instead. To avoid this race, our server assigns each connected descriptor returned by accept to its own dynamically allocated memory block.

Another issue is avoiding memory leaks in the thread routine. Since we are not explicitly reaping threads, we must detach each thread so that its memory resources will be reclaimed when it terminates. Further, we must be careful to free the memory block that was allocated by the main thread.

There are several advantages to using threads. First, threads have less run-time overhead than processes, so we would expect a thread-based server to have better throughput (measured in clients serviced per second) than a process-based one. Second, because all threads share the same global variables and heap, it is much easier for threads to share state information.
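A common alternative that sidesteps both the race and the malloc/free bookkeeping is to pass the descriptor by value, smuggled inside the pointer argument itself. This is not the approach our server takes; it is a sketch of a different technique, assuming int fits in intptr_t (true on POSIX systems):

```c
#include <pthread.h>
#include <stdint.h>

/* Peer thread: the descriptor arrives in the pointer value itself,
   so there is nothing to dereference and no race with accept. */
void *thread_byval(void *vargp)
{
    int connfd = (int)(intptr_t)vargp;
    /* ... serve the connection: pthread_detach(pthread_self());
       echo(connfd); close(connfd); ... */
    return (void *)(intptr_t)connfd;  /* returned only so a caller can verify it */
}

/* In the accept loop, the main thread would then write:
 *   connfd = accept(listenfd, (SA *) &clientaddr, &clientlen);
 *   pthread_create(&tid, NULL, thread_byval, (void *)(intptr_t)connfd);
 */
```

The trade-off is that only a single integer-sized value can be passed this way; the malloc-based approach generalizes to passing a struct of arguments.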
The major disadvantage of using threads is that the same memory model that makes it easy to share data structures also makes it easy to share them unintentionally and incorrectly. As we learned, shared data must be protected, functions called from threads must be reentrant, and race conditions must be avoided.
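For instance, if the threaded server kept a shared count of serviced clients, every increment would need to be protected by a mutex; without the lock, concurrent read-modify-write updates can be lost. The counter and lock here are illustrative additions, not part of the server above:

```c
#include <pthread.h>

static long nserviced = 0;                                /* shared state  */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;  /* protects it   */

/* Each peer thread would call this once per serviced connection. */
void count_client(void)
{
    pthread_mutex_lock(&lock);
    nserviced++;                  /* read-modify-write, now atomic w.r.t. peers */
    pthread_mutex_unlock(&lock);
}

/* Simulated peer thread: "services" many clients in a loop. */
static void *worker(void *vargp)
{
    (void)vargp;
    for (int i = 0; i < 100000; i++)
        count_client();
    return NULL;
}
```

With the mutex, four worker threads each counting 100,000 clients always yield exactly 400,000; with the lock removed, the final count would usually come up short.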