TinyHttpd学习笔记

概述

tinyhttpd 是一个由C语言写成的不到 500 行的超轻量型 Http Server,用来学习非常不错,可以帮助我们真正理解服务器程序的本质

项目地址:https://sourceforge.net/projects/tinyhttpd/

系统框架图:
在这里插入图片描述
如何运行这个项目:

  1. 给cgi脚本添加执行权限:chmod a+x *.cgi
  2. ubuntu18.04自带了perl解释器,但是路径可能与源码中不同,需要修改,查看其位置写入脚本#!即可:which perl
  3. make后即可运行,使用浏览器访问

逐函数分析

main

int main(void) {
	int server_sock = -1;
	u_short port = 4000;
	int client_sock = -1;
	struct sockaddr_in client_name;
	socklen_t client_name_len = sizeof(client_name);
	pthread_t newthread;

	server_sock = startup(&port);
	printf("httpd running on port %d\n", port);

	while (1) {
		client_sock = accept(server_sock, (struct sockaddr *)&client_name, &client_name_len);
		if (client_sock == -1) {
			error_die("accept");
		}
		/* accept_request(&client_sock); */
		if (pthread_create(&newthread, NULL, (void *)accept_request, (void *)(intptr_t)client_sock) != 0)
			perror("pthread_create");
	}

	close(server_sock);
	return (0);
}
  1. 通过startup获取服务端的套接字和端口号
  2. accept等候客户端连接,若出错就die掉,否则创建新线程执行accept_request处理客户端请求

调用的api:

int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen);
int pthread_create(pthread_t *thread, const pthread_attr_t *attr, 
					void *(*start_routine) (void *), void *arg);

startup和error_die

/**********************************************************************/
/* Print out an error message with perror() (for system errors; based
 * on value of errno, which indicates system call errors) and exit the
 * program indicating an error. */
/**********************************************************************/
void error_die(const char *sc) {
	perror(sc);
	exit(1);
}

/**********************************************************************/
/* This function starts the process of listening for web connections
 * on a specified port.  If the port is 0, then dynamically allocate a
 * port and modify the original port variable to reflect the actual
 * port.
 * Parameters: pointer to variable containing the port to connect on
 * Returns: the socket */
/**********************************************************************/
int startup(u_short *port) {
	int httpd = socket(PF_INET, SOCK_STREAM, 0);
	if (httpd == -1)
		error_die("socket");
	struct sockaddr_in name;
	memset(&name, 0, sizeof(name));
	name.sin_family = AF_INET;
	name.sin_port = htons(*port);
	name.sin_addr.s_addr = htonl(INADDR_ANY);
	int on = 1;
	if ((setsockopt(httpd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on))) < 0) {
		error_die("setsockopt failed");
	}
	if (bind(httpd, (struct sockaddr *)&name, sizeof(name)) < 0)
		error_die("bind");
	if (*port == 0) {
		socklen_t namelen = sizeof(name);
		if (getsockname(httpd, (struct sockaddr *)&name, &namelen) == -1)
			error_die("getsockname");
		*port = ntohs(name.sin_port);
	}
	if (listen(httpd, 5) < 0)
		error_die("listen");
	return (httpd);
}

  1. 创建监听套接字httpd,PF_INET和AF_INET等价
  2. 初始化服务端地址结构name
  3. 设置地址复用
  4. 绑定监听套接字httpd和地址结构name
  5. 若传入的port==0,则借助name获取httpd指向的套接字的端口号,赋给port传出
  6. 设置listen

调用的api:

int socket(int domain, int type, int protocol);
int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen);
int getsockname(int sockfd, struct sockaddr *addr, socklen_t *addrlen);
int listen(int sockfd, int backlog);

getsockname() returns the current address to which the socket sockfd is bound, in the buffer pointed to by addr. The addrlen argument should be initialized to indicate the amount of space (in bytes) pointed to by addr. On return it contains the actual size of the socket address. The returned address is truncated if the buffer provided is too small; in this case, addrlen will return a value greater than was supplied to the call.

listen() marks the socket referred to by sockfd as a passive socket, that is, as a socket that will be used to accept incoming connection requests using accept(2). The sockfd argument is a file descriptor that refers to a socket of type SOCK_STREAM or SOCK_SEQPACKET. The backlog argument defines the maximum length to which the queue of pending connections for sockfd may grow. If a connection request arrives when the queue is full, the client may receive an error with an indication of ECONNREFUSED or, if the underlying protocol supports retransmission, the request may be ignored so that a later reattempt at connection succeeds.

accept_request

http消息格式:https://www.runoob.com/http/http-messages.html

/**********************************************************************/
/* A request has caused a call to accept() on the server port to
 * return.  Process the request appropriately.
 * Parameters: the socket connected to the client */
/**********************************************************************/
void accept_request(void *arg) {
	int client = (intptr_t)arg;
	char buf[1024];
	char method[255];
	size_t i, j;
	int cgi = 0; /* becomes true if server decides this is a CGI program */
	char *query_string = NULL;

	size_t numchars = get_line(client, buf, sizeof(buf));
	i = 0;
	j = 0;
	//获取请求方法
	while (!ISspace(buf[i]) && (i < sizeof(method) - 1)) {
		method[i] = buf[i];
		i++;
	}
	j = i;
	method[i] = '\0';

	//如果不是GET或POST,告诉客户端未完成
	if (strcasecmp(method, "GET") && strcasecmp(method, "POST")) {
		unimplemented(client);
		return;
	}

	i = 0;
	while (ISspace(buf[j]) && (j < numchars)) {
		j++;
	}
	//获取url
	char url[255];
	while (!ISspace(buf[j]) && (i < sizeof(url) - 1) && (j < numchars)) {
		url[i] = buf[j];
		i++;
		j++;
	}
	url[i] = '\0';

	if (strcasecmp(method, "GET") == 0) {
		query_string = url;
		while ((*query_string != '?') && (*query_string != '\0'))
			query_string++;
		if (*query_string == '?') {	 //参数和 URL 之间用问号?隔开
			cgi = 1;
			*query_string = '\0';
			query_string++;
		}
	} else if (strcasecmp(method, "POST") == 0) {
		cgi = 0;
	}

	char path[512];
	sprintf(path, "htdocs%s", url);
	if (path[strlen(path) - 1] == '/') {
		strcat(path, "index.html");	 //以/结尾,访问index
	}
	struct stat st;
	if (stat(path, &st) == -1) {
		while ((numchars > 0) && strcmp("\n", buf)) {
			/* read & discard headers */
			numchars = get_line(client, buf, sizeof(buf));
		}
		not_found(client);
	} else {
		if ((st.st_mode & S_IFMT) == S_IFDIR) {
			strcat(path, "/index.html");
		}
		//指定所有人都获得执行或搜索的权限
		if ((st.st_mode & S_IXUSR) || (st.st_mode & S_IXGRP) || (st.st_mode & S_IXOTH)) {
			cgi = 1;
		}

		if (!cgi) {
			serve_file(client, path);
		} else {
			execute_cgi(client, path, method, query_string);
		}
	}

	close(client);
}
  1. get_line获取请求行数据到buf
  2. 获取请求方法到method
  3. 获取URL到url
  4. 若为get方法,query_string指向url中?后面的参数。若为post方法,令cgi=0
  5. 若url以/结尾,令其访问index.html
  6. path指明的文件不存在,则读走剩余的首部,然后给客户端返回404页面
  7. path是目录,令其访问index.html
  8. 根据cgi的值,serve_file发送文件或execute_cgi执行cgi脚本

调用的api:

int strcasecmp(const char *s1, const char *s2);
char *strcat(char *dest, const char *src);
int stat(const char *pathname, struct stat *statbuf);

说明:

The strcasecmp() function performs a byte-by-byte comparison of the strings s1 and s2, ignoring the case of the characters. It returns an integer less than, equal to, or greater than zero if s1 is found, respectively, to be less than, to match, or be greater than s2

The strcat() function appends the src string to the dest string, overwriting the terminating null byte (’\0’) at the end of dest, and then adds a terminating null byte. The strings may not overlap, and the dest string must have enough space for the result. If dest is not large enough, program behavior is unpredictable; buffer overruns are a favorite avenue for attacking secure programs.

int stat(const char *pathname, struct stat *statbuf);
int fstat(int fd, struct stat *statbuf);
int lstat(const char *pathname, struct stat *statbuf);

These functions return information about a file, in the buffer pointed to by statbuf. No permissions are required on the file itself, but—in the case of stat(), fstatat(), and lstat()—execute (search) permission is required on all of the directories in pathname that lead to the file.
stat() and fstatat() retrieve information about the file pointed to by pathname; the differences for fstatat() are described below.
lstat() is identical to stat(), except that if pathname is a symbolic link, then it returns information about the link itself, not the file that it refers to.
fstat() is identical to stat(), except that the file about which information is to be retrieved is specified by the file descriptor fd.
The stat structure. All of these system calls return a stat structure, which contains the following fields:

struct stat {
	dev_t st_dev;		  /* ID of device containing file */
	ino_t st_ino;		  /* Inode number */
	mode_t st_mode;		  /* File type and mode */
	nlink_t st_nlink;	  /* Number of hard links */
	uid_t st_uid;		  /* User ID of owner */
	gid_t st_gid;		  /* Group ID of owner */
	dev_t st_rdev;		  /* Device ID (if special file) */
	off_t st_size;		  /* Total size, in bytes */
	blksize_t st_blksize; /* Block size for filesystem I/O */
	blkcnt_t st_blocks;	  /* Number of 512B blocks allocated */

	/* Since Linux 2.6, the kernel supports nanosecond
	   precision for the following timestamp fields.
	   For the details before Linux 2.6, see NOTES. */

	struct timespec st_atim; /* Time of last access */
	struct timespec st_mtim; /* Time of last modification */
	struct timespec st_ctim; /* Time of last status change */

#define st_atime st_atim.tv_sec /* Backward compatibility */
#define st_mtime st_mtim.tv_sec
#define st_ctime st_ctim.tv_sec
};

unimplemented和not_found

/**********************************************************************/
/* Inform the client that the requested web method has not been
 * implemented.
 * Parameter: the client socket */
/**********************************************************************/
void unimplemented(int client) {
	char buf[1024];
	sprintf(buf, "HTTP/1.0 501 Method Not Implemented\r\n");
	send(client, buf, strlen(buf), 0);
	//#define SERVER_STRING "Server: jdbhttpd/0.1.0\r\n"
	sprintf(buf, SERVER_STRING);
	send(client, buf, strlen(buf), 0);
	sprintf(buf, "Content-Type: text/html\r\n");
	send(client, buf, strlen(buf), 0);
	sprintf(buf, "\r\n");
	send(client, buf, strlen(buf), 0);
	sprintf(buf, "<HTML><HEAD><TITLE>Method Not Implemented\r\n");
	send(client, buf, strlen(buf), 0);
	sprintf(buf, "</TITLE></HEAD>\r\n");
	send(client, buf, strlen(buf), 0);
	sprintf(buf, "<BODY><P>HTTP request method not supported.\r\n");
	send(client, buf, strlen(buf), 0);
	sprintf(buf, "</BODY></HTML>\r\n");
	send(client, buf, strlen(buf), 0);
}

/**********************************************************************/
/* Give a client a 404 not found status message. */
/**********************************************************************/
void not_found(int client) {
	char buf[1024];
	sprintf(buf, "HTTP/1.0 404 NOT FOUND\r\n");
	send(client, buf, strlen(buf), 0);
	sprintf(buf, SERVER_STRING);
	send(client, buf, strlen(buf), 0);
	sprintf(buf, "Content-Type: text/html\r\n");
	send(client, buf, strlen(buf), 0);
	sprintf(buf, "\r\n");
	send(client, buf, strlen(buf), 0);
	sprintf(buf, "<HTML><TITLE>Not Found</TITLE>\r\n");
	send(client, buf, strlen(buf), 0);
	sprintf(buf, "<BODY><P>The server could not fulfill\r\n");
	send(client, buf, strlen(buf), 0);
	sprintf(buf, "your request because the resource specified\r\n");
	send(client, buf, strlen(buf), 0);
	sprintf(buf, "is unavailable or nonexistent.\r\n");
	send(client, buf, strlen(buf), 0);
	sprintf(buf, "</BODY></HTML>\r\n");
	send(client, buf, strlen(buf), 0);
}

返回501Not Implemented404 Not Found错误页面

为什么每次只发送少量的数据而不是一次性将html文档组装再好一次性发送呢?

调用的api:

ssize_t send(int sockfd, const void *buf, size_t len, int flags);

serve_file、headers和cat

/**********************************************************************/
/* Send a regular file to the client.  Use headers, and report
 * errors to client if they occur.
 * Parameters: a pointer to a file structure produced from the socket
 *              file descriptor
 *             the name of the file to serve */
/**********************************************************************/
void serve_file(int client, const char *filename) {
	char buf[1024];
	buf[0] = 'A';
	buf[1] = '\0';

	int numchars = 1;
	while ((numchars > 0) && strcmp("\n", buf)) {
		/* read & discard headers */
		numchars = get_line(client, buf, sizeof(buf));
	}

	FILE *resource = fopen(filename, "r");
	if (resource == NULL) {
		not_found(client);
	} else {
		headers(client, filename);
		cat(client, resource);
	}
	fclose(resource);
}

/**********************************************************************/
/* Return the informational HTTP headers about a file. */
/* Parameters: the socket to print the headers on
 *             the name of the file */
/**********************************************************************/
void headers(int client, const char *filename) {
	char buf[1024];
	//这是强转为void类型标记TODO吗?
	(void)filename; /* could use filename to determine file type */

	strcpy(buf, "HTTP/1.0 200 OK\r\n");
	send(client, buf, strlen(buf), 0);
	strcpy(buf, SERVER_STRING);
	send(client, buf, strlen(buf), 0);
	sprintf(buf, "Content-Type: text/html\r\n");
	send(client, buf, strlen(buf), 0);
	strcpy(buf, "\r\n");
	send(client, buf, strlen(buf), 0);
}

/**********************************************************************/
/* Put the entire contents of a file out on a socket.  This function
 * is named after the UNIX "cat" command, because it might have been
 * easier just to do something like pipe, fork, and exec("cat").
 * Parameters: the client socket descriptor
 *             FILE pointer for the file to cat */
/**********************************************************************/
void cat(int client, FILE *resource) {
	char buf[1024];

	fgets(buf, sizeof(buf), resource);
	//如果一次fgets直接读完,while循环会不会导致无法send?
	while (!feof(resource)) {
		send(client, buf, strlen(buf), 0);
		fgets(buf, sizeof(buf), resource);
	}
}

serve_file:若文件不存在返回404,否则调用headerscat发送filename指定的文件

headers:发送http报文首部

cat:将文件内容发送给浏览器

调用的api:

char *fgets(char *s, int size, FILE *stream);

execute_cgi

/**********************************************************************/
/* Execute a CGI script.  Will need to set environment variables as
 * appropriate.
 * Parameters: client socket descriptor
 *             path to the CGI script */
/**********************************************************************/
void execute_cgi(int client, const char *path, const char *method, const char *query_string) {
	char buf[1024];
	int numchars = 1;
	int content_length = -1;

	buf[0] = 'A';
	buf[1] = '\0';
	if (strcasecmp(method, "GET") == 0) {
		while ((numchars > 0) && strcmp("\n", buf)) {
			/* read & discard headers */
			numchars = get_line(client, buf, sizeof(buf));
		}
	} else if (strcasecmp(method, "POST") == 0) {
		numchars = get_line(client, buf, sizeof(buf));
		while ((numchars > 0) && strcmp("\n", buf)) {
			buf[15] = '\0';
			if (strcasecmp(buf, "Content-Length:") == 0) {
				content_length = atoi(&(buf[16]));
			}
			numchars = get_line(client, buf, sizeof(buf));
		}
		if (content_length == -1) {
			bad_request(client);
			return;
		}
	} else {
	}

	int cgi_output[2];
	int cgi_input[2];
	if (pipe(cgi_output) < 0) {
		cannot_execute(client);
		return;
	}
	if (pipe(cgi_input) < 0) {
		cannot_execute(client);
		return;
	}

	pid_t pid = fork();
	if (pid < 0) {
		cannot_execute(client);
		return;
	}
	sprintf(buf, "HTTP/1.0 200 OK\r\n");
	send(client, buf, strlen(buf), 0);
	if (pid == 0) {
		/* child: CGI script */
		char meth_env[255];
		char query_env[255];
		char length_env[255];

		//子进程设置环境变量然后执行cgi脚本
		dup2(cgi_output[1], STDOUT);
		dup2(cgi_input[0], STDIN);
		close(cgi_output[0]);
		close(cgi_input[1]);
		sprintf(meth_env, "REQUEST_METHOD=%s", method);
		putenv(meth_env);
		if (strcasecmp(method, "GET") == 0) {
			sprintf(query_env, "QUERY_STRING=%s", query_string);
			putenv(query_env);
		} else { /* POST */
			sprintf(length_env, "CONTENT_LENGTH=%d", content_length);
			putenv(length_env);
		}
		execl(path, NULL);
		exit(0);
	} else { /* parent */
		close(cgi_output[1]);
		close(cgi_input[0]);
		char c;
		if (strcasecmp(method, "POST") == 0)
			for (int i = 0; i < content_length; i++) {
				recv(client, &c, 1, 0);
				write(cgi_input[1], &c, 1);
			}
		while (read(cgi_output[0], &c, 1) > 0) {
			send(client, &c, 1, 0);
		}

		close(cgi_output[0]);
		close(cgi_input[1]);
		int status;
		waitpid(pid, &status, 0);
	}
}

  1. 若为get请求,跳过剩余首部行;若为post请求,获取Content-Length的值到content_length
  2. 构造两根管道:cig_inputcgi_output
  3. fork子进程
  4. 子进程中重定向IO,设置环境变量,最后执行path指定的cgi脚本
  5. 父进程中判断若为post方法,则从client读取数据然后写入cgi_input管道中;然后从cgi_output中读取数据发送给client;最后回收子进程

管道连接示意图:

在这里插入图片描述

调用的api:

int pipe(int pipefd[2]);
int dup(int oldfd);
int dup2(int oldfd, int newfd);
int putenv(char *string);

pipe() creates a pipe, a unidirectional data channel that can be used for interprocess communication. The array pipefd is used to return two file descriptors referring to the ends of the pipe. pipefd[0] refers to the read end of the pipe. pipefd[1] refers to the write end of the pipe. Data written to the write end of the pipe is buffered by the kernel until it is read from the read end of the pipe.

The dup() system call creates a copy of the file descriptor oldfd, using the lowest-numbered unused file descriptor for the new descriptor.After a successful return, the old and new file descriptors may be used interchangeably. They refer to the same open file description (see open(2)) and thus share file offset and file status flags; for example, if the file offset is modified by using lseek(2) on one of the file descriptors, the offset is also changed for the other. The two file descriptors do not share file descriptor flags (the close-on-exec flag). The close-on-exec flag (FD_CLOEXEC; see fcntl(2)) for the duplicate descriptor is off.
dup2()
The dup2() system call performs the same task as dup(), but instead of using the lowest-numbered unused file descriptor, it uses the file descriptor number specified in newfd. If the file descriptor newfd was previously open, it is silently closed before being reused.

The putenv() function adds or changes the value of environment variables. The argument string is of the form name=value. If name does not already exist in the environment, then string is added to the environment. If name does exist, then the value of name in the environment is changed to value. The string pointed to by string becomes part of the environment, so altering the string changes the environment.

bad_request和cannot_execute

/**********************************************************************/
/* Inform the client that a request it has made has a problem.
 * Parameters: client socket */
/**********************************************************************/
void bad_request(int client) {
	char buf[1024];
	sprintf(buf, "HTTP/1.0 400 BAD REQUEST\r\n");
	send(client, buf, sizeof(buf), 0);
	sprintf(buf, "Content-type: text/html\r\n");
	send(client, buf, sizeof(buf), 0);
	sprintf(buf, "\r\n");
	send(client, buf, sizeof(buf), 0);
	sprintf(buf, "<P>Your browser sent a bad request, ");
	send(client, buf, sizeof(buf), 0);
	sprintf(buf, "such as a POST without a Content-Length.\r\n");
	send(client, buf, sizeof(buf), 0);
}

/**********************************************************************/
/* Inform the client that a CGI script could not be executed.
 * Parameter: the client socket descriptor. */
/**********************************************************************/
void cannot_execute(int client) {
	char buf[1024];
	sprintf(buf, "HTTP/1.0 500 Internal Server Error\r\n");
	send(client, buf, strlen(buf), 0);
	sprintf(buf, "Content-type: text/html\r\n");
	send(client, buf, strlen(buf), 0);
	sprintf(buf, "\r\n");
	send(client, buf, strlen(buf), 0);
	sprintf(buf, "<P>Error prohibited CGI execution.\r\n");
	send(client, buf, strlen(buf), 0);
}

返回400 Bad request500 Internal Server Error错误页面

get_line

/**********************************************************************/
/* Get a line from a socket, whether the line ends in a newline,
 * carriage return, or a CRLF combination.  Terminates the string read
 * with a null character.  If no newline indicator is found before the
 * end of the buffer, the string is terminated with a null.  If any of
 * the above three line terminators is read, the last character of the
 * string will be a linefeed and the string will be terminated with a
 * null character.
 * Parameters: the socket descriptor
 *             the buffer to save the data in
 *             the size of the buffer
 * Returns: the number of bytes stored (excluding null) */
/**********************************************************************/
int get_line(int sock, char *buf, int size) {
	int i = 0;
	char c = '\0';
	int n;

	while ((i < size - 1) && (c != '\n')) {
		n = recv(sock, &c, 1, 0);
		/* DEBUG printf("%02X\n", c); */
		if (n > 0) {
			if (c == '\r') {
				n = recv(sock, &c, 1, MSG_PEEK);
				/* DEBUG printf("%02X\n", c); */
				if ((n > 0) && (c == '\n'))
					recv(sock, &c, 1, 0);
				else
					c = '\n';
			}
			buf[i] = c;
			i++;
		} else
			c = '\n';
	}
	buf[i] = '\0';

	return (i);
}

从socket获取一行数据

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值