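Preface
Before analyzing nginx, the author wants to warm up on something simple: Tinyhttpd. Like the author's earlier u-boot series, this one is written from scratch, without consulting anyone else's walkthroughs, because learning how to work a problem out yourself matters more than the result. Tinyhttpd really is simple; some of the server models in the author's earlier UNP series are more complex than it is. The plan is to finish this series in 3 days, spending 1-2 hours a day.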
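1. The HTTP protocol
UNP does not cover application-layer protocols, but writing a web server requires familiarity with HTTP. The protocol is fairly simple; for a thorough treatment the author recommends the book HTTP: The Definitive Guide. For a short introduction, the blog post 关于HTTP协议,一篇就够了 ("About the HTTP protocol, one article is enough") is more than sufficient for Tinyhttpd, so the details are not repeated here.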
2. Download, build, and run
Before analyzing Tinyhttpd, get it running. First download the source (Tiny HTTPd), then unpack it and look at the files:
[root@localhost tinyhttpd-0.1.0]# ls
htdocs httpd httpd.c Makefile README simpleclient.c
[root@localhost tinyhttpd-0.1.0]#
It is still worth reading the README; there may be an easter egg in it:
I wrote this webserver for an assignment in my networking class in 1999. We were told that at a bare minimum the server had to serve pages, and told that we would get extra credit for doing “extras.” Perl had introduced me to a whole lot of UNIX functionality (I learned sockets and fork from Perl!), and O’Reilly’s lion book on UNIX system calls plus O’Reilly’s books on CGI and writing web clients in Perl got me thinking and I realized I could make my webserver support CGI with little trouble.
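That is the original author's own account: he wrote this in 1999 as a class assignment, and here I am in 2018, 19 years later, analyzing his homework. ^_^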
Modify the Makefile as follows (recipe lines must start with a tab):
all: httpd

httpd: httpd.c
	gcc -lpthread -o httpd httpd.c

clean:
	rm httpd
Build and run:
[root@localhost tinyhttpd-0.1.0]# make
gcc -lpthread -o httpd httpd.c
httpd.c: In function ‘main’:
httpd.c:495:40: warning: passing argument 3 of ‘pthread_create’ from incompatible pointer type [-Wincompatible-pointer-types]
if (pthread_create(&newthread , NULL, accept_request, client_sock) != 0)
^~~~~~~~~~~~~~
In file included from httpd.c:25:0:
/usr/include/pthread.h:234:12: note: expected ‘void * (*)(void *)’ but argument is of type ‘void (*)(int)’
extern int pthread_create (pthread_t *__restrict __newthread,
^~~~~~~~~~~~~~
httpd.c:495:56: warning: passing argument 4 of ‘pthread_create’ makes pointer from integer without a cast [-Wint-conversion]
if (pthread_create(&newthread , NULL, accept_request, client_sock) != 0)
^~~~~~~~~~~
In file included from httpd.c:25:0:
/usr/include/pthread.h:234:12: note: expected ‘void * restrict’ but argument is of type ‘int’
extern int pthread_create (pthread_t *__restrict __newthread,
^~~~~~~~~~~~~~
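// The warnings above can be ignored for this walkthrough: the program builds and runs correctly on this platform, although passing accept_request and client_sock with mismatched types is not strictly portable C.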
[root@localhost tinyhttpd-0.1.0]# ./httpd
httpd running on port 43545
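Then open a browser to check the result (the author's platform is Fedora 27):
If the home page does not open, or the color does not show up, check that the port in the URL matches the port the server is bound to, and that the perl-CGI components are installed.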
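The two pthread_create warnings in the build log come from passing accept_request, declared as void (int), where void *(*)(void *) is expected, and passing the int client_sock where a void * is expected. Below is a minimal sketch of one conventional way to avoid both, using a wrapper with the expected signature. It is not what Tinyhttpd does; client_thread, spawn_client_thread, handle_client, and last_client are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for accept_request(int): it only records the
 * descriptor it was given so the behavior can be observed. */
static int last_client = -1;

static void handle_client(int client)
{
    last_client = client;
}

/* Thread entry point with the signature pthread_create expects. */
static void *client_thread(void *arg)
{
    /* Recover the int that was packed into the pointer argument. */
    int client = (int)(intptr_t)arg;

    handle_client(client);
    return NULL;
}

/* Starts a thread for `client`. Returns 0 on success, like pthread_create. */
static int spawn_client_thread(int client, pthread_t *tid)
{
    assert(tid != NULL);
    return pthread_create(tid, NULL, client_thread,
                          (void *)(intptr_t)client);
}
```

Going through intptr_t makes the integer-to-pointer conversion explicit, and the wrapper keeps the handler's own signature unchanged. Passing a heap-allocated int instead would work equally well; that is a design choice, not something the original source prescribes.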
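3. Capturing the HTTP packets
The server works as expected. Before reading the source, capture the HTTP traffic for the two actions just performed: opening the home page and displaying the red page. The author uses Firefox, whose developer tools (F12) make this very convenient; repeat the steps above and capture the packets:
3.1 Packets for the home page
3.2 Packets for the red page
4. Source code analysis
4.1 Overall structure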
int main(void)
{
    int server_sock = -1;
    u_short port = 0;
    int client_sock = -1;
    struct sockaddr_in client_name;
    int client_name_len = sizeof(client_name);
    pthread_t newthread;

    server_sock = startup(&port);
    printf("httpd running on port %d\n", port);

    while (1)
    {
        client_sock = accept(server_sock, (struct sockaddr *)&client_name, &client_name_len);
        if (client_sock == -1)
            error_die("accept");
        /* accept_request(client_sock); */
        if (pthread_create(&newthread , NULL, accept_request, client_sock) != 0)
            perror("pthread_create");
    }

    close(server_sock);

    return(0);
}

int startup(u_short *port)
{
    int httpd = 0;
    struct sockaddr_in name;

    httpd = socket(PF_INET, SOCK_STREAM, 0);
    if (httpd == -1)
        error_die("socket");
    memset(&name, 0, sizeof(name));
    name.sin_family = AF_INET;
    name.sin_port = htons(*port);
    name.sin_addr.s_addr = htonl(INADDR_ANY);
    if (bind(httpd, (struct sockaddr *)&name, sizeof(name)) < 0)
        error_die("bind");
    if (*port == 0) /* if dynamically allocating a port */
    {
        int namelen = sizeof(name);
        if (getsockname(httpd, (struct sockaddr *)&name, &namelen) == -1)
            error_die("getsockname");
        *port = ntohs(name.sin_port);
    }
    if (listen(httpd, 5) < 0)
        error_die("listen");
    return(httpd);
}
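The "httpd running on port 43545" line in the earlier run comes from the port-0 branch of startup: binding to port 0 lets the kernel choose a free port, and getsockname reads back which one it chose. A minimal standalone sketch of just that step follows. bind_ephemeral is a hypothetical name, and the sketch assumes the loopback address rather than the INADDR_ANY used in startup, so it does not expose a port on other interfaces.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Binds a TCP socket to port 0 on the loopback address and stores the
 * port the kernel picked in *port. Returns the socket, or -1 on failure. */
static int bind_ephemeral(unsigned short *port)
{
    struct sockaddr_in name;
    socklen_t namelen = sizeof(name);   /* getsockname expects socklen_t */
    int fd;

    assert(port != NULL);

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;

    memset(&name, 0, sizeof(name));
    name.sin_family = AF_INET;
    name.sin_port = htons(0);                       /* 0 = let the kernel choose */
    name.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  /* assumption: loopback only */

    if (bind(fd, (struct sockaddr *)&name, sizeof(name)) < 0 ||
        getsockname(fd, (struct sockaddr *)&name, &namelen) == -1)
    {
        close(fd);
        return -1;
    }

    *port = ntohs(name.sin_port);   /* convert back to host byte order */
    return fd;
}
```

The port differs from run to run, which is why the URL typed into the browser has to be taken from the server's startup message each time.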
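The main and startup code above is the same as the server in the author's earlier network programming Threads experiment: a very simple one-thread-per-connection server model, so it is not discussed further here. Next, look at accept_request, which is where the HTTP protocol has to be implemented: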
void accept_request(int client)
{
    char buf[1024];
    int numchars;
    char method[255];
    char url[255];
    char path[512];
    size_t i, j;
    struct stat st;
    int cgi = 0;      /* becomes true if server decides this is a CGI
                       * program */
    char *query_string = NULL;

    numchars = get_line(client, buf, sizeof(buf));
    i = 0; j = 0;
    while (!ISspace(buf[j]) && (i < sizeof(method) - 1))
    {
        method[i] = buf[j];
        i++; j++;
    }
    method[i] = '\0';

    if (strcasecmp(method, "GET") && strcasecmp(method, "POST"))
    {
        unimplemented(client);
        return;
    }

    if (strcasecmp(method, "POST") == 0)
        cgi = 1;

    i = 0;
    while (ISspace(buf[j]) && (j < sizeof(buf)))
        j++;
    while (!ISspace(buf[j]) && (i < sizeof(url) - 1) && (j < sizeof(buf)))
    {
        url[i] = buf[j];
        i++; j++;
    }
    url[i] = '\0';

    if (strcasecmp(method, "GET") == 0)
    {
        query_string = url;
        while ((*query_string != '?') && (*query_string != '\0'))
            query_string++;
        if (*query_string == '?')
        {
            cgi = 1;
            *query_string = '\0';
            query_string++;
        }
    }

    sprintf(path, "htdocs%s", url);
    if (path[strlen(path) - 1] == '/')
        strcat(path, "index.html");
    if (stat(path, &st) == -1) {
        while ((numchars > 0) && strcmp("\n", buf))  /* read & discard headers */
            numchars = get_line(client, buf, sizeof(buf));
        not_found(client);
    }
    else
    {
        if ((st.st_mode & S_IFMT) == S_IFDIR)
            strcat(path, "/index.html");
        if ((st.st_mode & S_IXUSR) ||
            (st.st_mode & S_IXGRP) ||
            (st.st_mode & S_IXOTH)    )
            cgi = 1;
        if (!cgi)
            serve_file(client, path);
        else
            execute_cgi(client, path, method, query_string);
    }

    close(client);
}
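The first half of accept_request splits the request line into a method and a URL, then cuts the URL at the first '?' to obtain the query string. The sketch below isolates that parsing so it can be tried on its own. It is a simplified restatement, not the original code: struct request_line, is_end, next_token, and parse_request_line are hypothetical names, and the request line used in the usage note is an illustrative example rather than captured traffic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the pieces accept_request extracts. */
struct request_line {
    char method[16];
    char url[256];
    const char *query;   /* points into url, or NULL when there is no '?' */
};

/* True for the characters that end a token on the request line. */
static int is_end(char c)
{
    return c == '\0' || c == ' ' || c == '\r' || c == '\n';
}

/* Copies the next space-delimited token at *p into dst, truncating it to
 * cap - 1 bytes, and advances *p past the whole token. */
static void next_token(const char **p, char *dst, size_t cap)
{
    size_t i = 0;

    while (**p == ' ')
        (*p)++;
    while (!is_end(**p)) {
        if (i < cap - 1)
            dst[i++] = **p;
        (*p)++;
    }
    dst[i] = '\0';
}

/* Splits "METHOD URL ..." and separates the query string from the URL.
 * Returns 0 on success, -1 if the method or the URL is missing. */
static int parse_request_line(const char *line, struct request_line *out)
{
    char *q;

    assert(line != NULL && out != NULL);

    next_token(&line, out->method, sizeof(out->method));
    next_token(&line, out->url, sizeof(out->url));

    out->query = NULL;
    q = strchr(out->url, '?');
    if (q != NULL) {
        *q = '\0';            /* terminate the path part, as accept_request does */
        out->query = q + 1;
    }
    return (out->method[0] != '\0' && out->url[0] != '\0') ? 0 : -1;
}
```

For a line such as "GET /color.cgi?color=red HTTP/1.1", this yields the method "GET", the URL "/color.cgi", and the query "color=red". In accept_request the presence of a query string is one of the conditions that sets cgi = 1.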
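4.2 Details of accept_request
Before reading get_line, it helps to know what the server receives from the client when, for example, the home page is opened; the format is shown in the figure below (taken from 关于HTTP协议,一篇就够了). The request line and each request header are text lines terminated by CRLF, and each client is handled on its own connection, so reading and interpreting the data one line at a time per client is enough to parse the request correctly and act on it: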
/**********************************************************************/
/* Get a line from a socket, whether the line ends in a newline,
* carriage return, or a CRLF combination. Terminates the string read
* with a null character. If no newline indicator is found before the
* end of the buffer, the string is terminated with a null. If any of
* the above three line terminators is read, the last character of the
* string will be a linefeed and the string will be terminated with a
* null character.
* Parameters: the socket descriptor
* the buffer to save the data in
* the size of the buffer
* Returns: the number of bytes stored (excluding null) */
/**********************************************************************/
int get_line(int sock, char *buf, int size)
{
    int i = 0;
    char c = '\0';
    int n;

    while ((i < size - 1) && (c != '\n'))
    {
        n = recv(sock, &c, 1, 0);
        /* DEBUG printf("%02X\n", c); */
        if (n > 0)
        {
            // For each character read, check whether it is a carriage return
            if (c == '\r')
            {
                n = recv(sock, &c, 1, MSG_PEEK);
                /* DEBUG printf("%02X\n", c); */
                // MSG_PEEK does not remove the byte from the receive buffer,
                // so the recv below reads that same byte again, i.e. the \n
                if ((n > 0) && (c == '\n'))
                    recv(sock, &c, 1, 0);
                else
                    c = '\n';
            }
            buf[i] = c;
            i++;
        }
        else
            // If the client has shut down, recv returns 0 once the data still
            // buffered has been read; setting c = '\n' then ends the loop
            c = '\n';
    }
    buf[i] = '\0';

    return(i);
}
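The original author's comment above is already fairly clear. One point to emphasize: get_line reads one line per call, and a line ending in \r, \n, or \r\n is always stored with a single trailing \n (the \r itself is not kept). This corresponds to the request line and the request headers of the message. For the request body, each call returns at most size - 1 bytes, terminated with \0. Note: the request body may therefore not be read in units of lines.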
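The line-ending behavior described above can be checked without a browser. The sketch below reproduces the get_line logic and feeds it bytes through a socketpair. open_fed_socket is a hypothetical helper, the request text in the usage note is an illustrative example rather than captured traffic, and AF_UNIX is assumed only because it needs no network setup.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Same logic as the get_line shown above, repeated so the sketch compiles alone. */
static int get_line(int sock, char *buf, int size)
{
    int i = 0;
    char c = '\0';
    int n;

    assert(buf != NULL && size > 0);
    while ((i < size - 1) && (c != '\n')) {
        n = recv(sock, &c, 1, 0);
        if (n > 0) {
            if (c == '\r') {
                /* Peek at the next byte; consume it only if it is '\n'. */
                n = recv(sock, &c, 1, MSG_PEEK);
                if ((n > 0) && (c == '\n'))
                    recv(sock, &c, 1, 0);
                else
                    c = '\n';
            }
            buf[i++] = c;
        } else {
            c = '\n';   /* EOF or error: stop without storing anything */
        }
    }
    buf[i] = '\0';
    return i;
}

/* Hypothetical helper: writes `data` into one end of a socket pair, closes
 * that end, and returns the other end for reading (or -1 on failure). */
static int open_fed_socket(const char *data)
{
    int sv[2];
    size_t len = strlen(data);

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    if (write(sv[0], data, len) != (ssize_t)len) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    close(sv[0]);   /* the reader sees EOF after the buffered bytes */
    return sv[1];
}
```

Feeding it "GET / HTTP/1.1\r\nHost: x\r\n\r\nbody" and calling get_line four times returns "GET / HTTP/1.1\n", "Host: x\n", "\n", and then "body" with no trailing newline, because the peer closed before any line terminator arrived. The lone "\n" is the blank line that separates the headers from the body.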
That was an hour and a half for today, so I will stop here and continue tomorrow.