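When a client request reaches a server, it is worth asking how a server should process it, and how nginx in particular does.
After a request arrives at nginx, a worker process accepts the connection and starts processing. It first parses the request line, then the request headers, then passes the request through the HTTP modules, each of which applies its own handling, and finally returns the result to the client.
I: How do the modules do their work? nginx drives them with a phase state machine, in which every phase is controlled by a checker function and handler functions. Here are nginx's 11 phases, to give a first impression.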
typedef enum {
    NGX_HTTP_POST_READ_PHASE = 0,
    NGX_HTTP_SERVER_REWRITE_PHASE,
    NGX_HTTP_FIND_CONFIG_PHASE,
    NGX_HTTP_REWRITE_PHASE,
    NGX_HTTP_POST_REWRITE_PHASE,
    NGX_HTTP_PREACCESS_PHASE,
    NGX_HTTP_ACCESS_PHASE,
    NGX_HTTP_POST_ACCESS_PHASE,
    NGX_HTTP_TRY_FILES_PHASE,
    NGX_HTTP_CONTENT_PHASE,
    NGX_HTTP_LOG_PHASE
} ngx_http_phases;
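These are all of nginx's phases. Each phase has exactly one checker, which steers the request: it decides whether to continue with the next handler in the same phase, move on to the handlers of the next phase, or jump directly to a specific handler in some other phase. The checker of POST_REWRITE_PHASE, ngx_http_core_post_rewrite_phase, is examined in detail later.
Which modules' handlers are attached to each phase?
Once an nginx process has finished handling the request line and the request headers, it reaches the entry point of the phase state machine: ngx_http_core_run_phases
This function drives the whole state machine.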
void
ngx_http_core_run_phases(ngx_http_request_t *r)
{
    ngx_int_t                   rc;
    ngx_http_phase_handler_t   *ph;
    ngx_http_core_main_conf_t  *cmcf;

    cmcf = ngx_http_get_module_main_conf(r, ngx_http_core_module);

    /*
     * ph holds the checker and handler of every slot of every phase.
     * The while loop runs the checkers in sequence. The checker is the
     * helmsman: it picks the next step from the handler's result or
     * from the relevant configuration.
     */
    ph = cmcf->phase_engine.handlers;

    while (ph[r->phase_handler].checker) {

        rc = ph[r->phase_handler].checker(r, &ph[r->phase_handler]);

        if (rc == NGX_OK) {
            return;
        }
    }
}
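To make the loop above concrete, here is a minimal, self-contained sketch of the same idea outside nginx. Everything prefixed with toy_ is hypothetical and exists only for illustration: the return codes have arbitrary values, and toy_generic_checker is a simplified model of how a generic-phase checker reacts to a handler's return code, not nginx's actual implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Toy return codes: the names echo nginx, the values are arbitrary. */
enum { TOY_OK = 0, TOY_AGAIN = -2, TOY_DONE = -4, TOY_DECLINED = -5 };

typedef struct toy_request_s        toy_request_t;
typedef struct toy_phase_handler_s  toy_phase_handler_t;

typedef int (*toy_handler_pt)(toy_request_t *r);
typedef int (*toy_checker_pt)(toy_request_t *r, toy_phase_handler_t *ph);

struct toy_phase_handler_s {
    toy_checker_pt  checker;  /* decides where the request goes next */
    toy_handler_pt  handler;  /* module callback */
    size_t          next;     /* index of the first slot of the next phase */
};

struct toy_request_s {
    size_t  phase_handler;    /* current index into the engine array */
    int     visited;          /* number of handlers that have run */
    int     finished;         /* set by the handler that ends the request */
};

/* OK skips to the next phase, DECLINED moves to the next handler of the
   same phase, anything else leaves the loop. */
static int toy_generic_checker(toy_request_t *r, toy_phase_handler_t *ph)
{
    int rc = ph->handler(r);

    if (rc == TOY_OK) {
        r->phase_handler = ph->next;
        return TOY_AGAIN;
    }
    if (rc == TOY_DECLINED) {
        r->phase_handler++;
        return TOY_AGAIN;
    }
    return TOY_OK;
}

static int toy_decline(toy_request_t *r) { r->visited++; return TOY_DECLINED; }
static int toy_accept(toy_request_t *r)  { r->visited++; return TOY_OK; }
static int toy_content(toy_request_t *r)
{
    r->visited++;
    r->finished = 1;
    return TOY_DONE;
}

/* Same shape as the loop in ngx_http_core_run_phases. */
static void toy_run_phases(toy_request_t *r, toy_phase_handler_t *ph)
{
    while (ph[r->phase_handler].checker) {
        if (ph[r->phase_handler].checker(r, &ph[r->phase_handler]) == TOY_OK) {
            return;
        }
    }
}

/* Builds a three-phase engine and returns how many handlers ran. */
int toy_demo(void)
{
    toy_phase_handler_t engine[] = {
        { toy_generic_checker, toy_decline, 1 },  /* phase A, slot 0 */
        { toy_generic_checker, toy_accept,  3 },  /* phase B, slot 1 */
        { toy_generic_checker, toy_decline, 3 },  /* phase B, slot 2 */
        { toy_generic_checker, toy_content, 4 },  /* phase C, slot 3 */
        { NULL, NULL, 0 }                         /* terminator */
    };
    toy_request_t r = { 0, 0, 0 };

    toy_run_phases(&r, engine);
    assert(r.finished);
    return r.visited;
}
```

In this sketch slot 2 never runs: the handler in slot 1 returns TOY_OK, so the checker jumps to next = 3 and skips the rest of phase B. Calling toy_demo() therefore reports three handlers executed.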
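When debugging nginx with gdb, print the ph variable inside ngx_http_core_run_phases to see which handlers the current build has registered in the phases. The handler names show which module each handler belongs to. Below are the ph values printed from the author's nginx.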
ph[0]= {checker = 0x43b8e1 <ngx_http_core_rewrite_phase>, handler = 0x48c0c0 <ngx_http_rewrite_handler>, next = 1}
ph[1]= {checker = 0x43b997 <ngx_http_core_find_config_phase>, handler = 0, next = 0}
ph[2]= {checker = 0x43b8e1 <ngx_http_core_rewrite_phase>, handler = 0x4a76a2 <ngx_http_xxx_xxx_handler>, next = 4}// a third-party extension module written by the author
ph[3]= {checker = 0x43b8e1 <ngx_http_core_rewrite_phase>, handler = 0x48c0c0 <ngx_http_rewrite_handler>, next = 4}
ph[4]= {checker = 0x43bdb5 <ngx_http_core_post_rewrite_phase>, handler = 0, next = 1}
ph[5]= {checker = 0x43b7fe <ngx_http_core_generic_phase>, handler = 0x484b98 <ngx_http_limit_req_handler>, next = 7}
ph[6]= {checker = 0x43b7fe <ngx_http_core_generic_phase>, handler = 0x4837f8 <ngx_http_limit_conn_handler>, next = 7}
ph[7]= {checker = 0x43bf99 <ngx_http_core_access_phase>, handler = 0x483334 <ngx_http_access_handler>, next = 10}
ph[8]= {checker = 0x43bf99 <ngx_http_core_access_phase>, handler = 0x48272c <ngx_http_auth_basic_handler>, next = 10}
ph[9]= {checker = 0x43c16c <ngx_http_core_post_access_phase>, handler = 0, next = 10}
ph[10]={checker = 0x43cae6 <ngx_http_core_content_phase>, handler = 0x467eb8 <ngx_http_index_handler>, next = 13}
ph[11]={checker = 0x43cae6 <ngx_http_core_content_phase>, handler = 0x480d5c <ngx_http_autoindex_handler>, next = 13}
ph[12]={checker = 0x43cae6 <ngx_http_core_content_phase>, handler = 0x467600 <ngx_http_static_handler>, next = 13}
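So the phase state machine in its final form is an array of ngx_http_phase_handler_t structures, reached through the pointer cmcf->phase_engine.handlers.
How the array behind cmcf->phase_engine.handlers gets filled is therefore the key question.
II: How are these handlers added to the array that cmcf->phase_engine.handlers points to?
Start with cmcf, whose type is ngx_http_core_main_conf_t.
ngx_http_core_main_conf_t has two phase-related members: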
typedef struct {
    ngx_http_phase_engine_t    phase_engine;
    ngx_http_phase_t           phases[NGX_HTTP_LOG_PHASE + 1];
} ngx_http_core_main_conf_t;
The ngx_http_block function calls the postconfiguration function of each module in turn, and the trick lies in those postconfiguration functions.
static char *
ngx_http_block(ngx_conf_t *cf, ngx_command_t *cmd, void *conf)
{
    ......
    for (m = 0; ngx_modules[m]; m++) {
        if (ngx_modules[m]->type != NGX_HTTP_MODULE) {
            continue;
        }

        module = ngx_modules[m]->ctx;

        if (module->postconfiguration) {
            if (module->postconfiguration(cf) != NGX_OK) {
                return NGX_CONF_ERROR;
            }
        }
    }

    if (ngx_http_variables_init_vars(cf) != NGX_OK) {
        return NGX_CONF_ERROR;
    }

    *cf = pcf;

    if (ngx_http_init_phase_handlers(cf, cmcf) != NGX_OK) {
        return NGX_CONF_ERROR;
    }
    ......
}
For example, look at the postconfiguration function of the access control module, ngx_http_access_module:
static ngx_http_module_t  ngx_http_access_module_ctx = {
    NULL,                                  /* preconfiguration */
    ngx_http_access_init,                  /* postconfiguration */

    NULL,                                  /* create main configuration */
    NULL,                                  /* init main configuration */

    NULL,                                  /* create server configuration */
    NULL,                                  /* merge server configuration */

    ngx_http_access_create_loc_conf,       /* create location configuration */
    ngx_http_access_merge_loc_conf         /* merge location configuration */
};
The postconfiguration function of ngx_http_access_module is ngx_http_access_init. It performs the first step of adding the access module's handler:
static ngx_int_t
ngx_http_access_init(ngx_conf_t *cf)
{
    ngx_http_handler_pt        *h;
    ngx_http_core_main_conf_t  *cmcf;

    cmcf = ngx_http_conf_get_module_main_conf(cf, ngx_http_core_module);

    /*
     * Put this module's handler, ngx_http_access_handler, into
     * NGX_HTTP_ACCESS_PHASE. It is stored in the array
     * cmcf->phases[NGX_HTTP_ACCESS_PHASE].handlers.
     */
    h = ngx_array_push(&cmcf->phases[NGX_HTTP_ACCESS_PHASE].handlers);
    if (h == NULL) {
        return NGX_ERROR;
    }

    *h = ngx_http_access_handler;

    return NGX_OK;
}
In other words, calling the access control module's postconfiguration function appends the module's handler to the NGX_HTTP_ACCESS_PHASE array in cmcf->phases.
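The registration step can be imitated without nginx. The sketch below is an illustration only: toy_array_t and toy_array_push are hypothetical stand-ins that borrow the "return a pointer to a fresh slot" convention of ngx_array_push, and the two toy_*_init functions play the role of postconfiguration callbacks.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef int (*toy_handler_pt)(void);

enum { TOY_PREACCESS_PHASE, TOY_ACCESS_PHASE, TOY_PHASE_COUNT };

typedef struct {
    toy_handler_pt  *elts;    /* handler slots */
    size_t           nelts;   /* slots in use */
    size_t           nalloc;  /* slots allocated */
} toy_array_t;

typedef struct {
    toy_array_t  phases[TOY_PHASE_COUNT];  /* one handler array per phase */
} toy_main_conf_t;

/* Returns the address of a fresh slot at the end of the array, or NULL. */
static toy_handler_pt *toy_array_push(toy_array_t *a)
{
    if (a->nelts == a->nalloc) {
        size_t           n = a->nalloc ? a->nalloc * 2 : 4;
        toy_handler_pt  *p = realloc(a->elts, n * sizeof(*p));

        if (p == NULL) {
            return NULL;
        }
        a->elts = p;
        a->nalloc = n;
    }
    return &a->elts[a->nelts++];
}

static int toy_access_handler(void) { return 1; }
static int toy_auth_handler(void)   { return 2; }

/* Hypothetical postconfiguration callback of an "access" module. */
static int toy_access_init(toy_main_conf_t *cmcf)
{
    toy_handler_pt *h = toy_array_push(&cmcf->phases[TOY_ACCESS_PHASE]);

    if (h == NULL) {
        return -1;
    }
    *h = toy_access_handler;
    return 0;
}

/* Hypothetical postconfiguration callback of an "auth" module. */
static int toy_auth_init(toy_main_conf_t *cmcf)
{
    toy_handler_pt *h = toy_array_push(&cmcf->phases[TOY_ACCESS_PHASE]);

    if (h == NULL) {
        return -1;
    }
    *h = toy_auth_handler;
    return 0;
}

/* Runs both callbacks; returns the number of handlers in the access phase. */
size_t toy_register_all(void)
{
    toy_main_conf_t  cmcf;
    size_t           n;

    memset(&cmcf, 0, sizeof(cmcf));

    if (toy_access_init(&cmcf) != 0 || toy_auth_init(&cmcf) != 0) {
        free(cmcf.phases[TOY_ACCESS_PHASE].elts);
        return 0;
    }

    n = cmcf.phases[TOY_ACCESS_PHASE].nelts;

    /* Handlers sit in registration order; untouched phases stay empty. */
    assert(cmcf.phases[TOY_ACCESS_PHASE].elts[0] == toy_access_handler);
    assert(cmcf.phases[TOY_ACCESS_PHASE].elts[1] == toy_auth_handler);
    assert(cmcf.phases[TOY_PREACCESS_PHASE].nelts == 0);

    free(cmcf.phases[TOY_ACCESS_PHASE].elts);
    return n;
}
```

The point of the sketch is that each module only appends to the array of the phase it cares about; no module needs to know about the others or about the final engine array.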
As the postconfiguration functions of the modules are called one after another, every HTTP module ends up with its handler registered in the array of the corresponding phase in cmcf->phases. The list below shows the HTTP modules each phase usually contains, with the phase name to the left of the ":" and the modules to the right.
POST_READ_PHASE: realip
SERVER_REWRITE_PHASE: rewrite
FIND_CONFIG_PHASE: NULL
REWRITE_PHASE: rewrite
POST_REWRITE_PHASE: NULL
PREACCESS_PHASE: limit_zone, limit_req, realip
ACCESS_PHASE: auth_basic, access
POST_ACCESS_PHASE: NULL
TRY_FILES_PHASE: NULL
CONTENT_PHASE: index, autoindex, static, dav, gzip_static, random_index
LOG_PHASE: log
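Two points deserve mention:
1: Why does realip appear in two phases?
In POST_READ_PHASE its scope is the server, so realip directives applied there can affect rewrite.
In PREACCESS_PHASE its scope is the location.
2: How is the execution order of handlers from different modules controlled within one phase?
Handlers run in the reverse of the order in which the modules registered them. The reversal is done when phase_engine.handlers is initialized, as described later.
After ngx_http_block has called the postconfiguration functions of all modules, the handlers of the 11 phases are registered. How are they then mapped onto phase_engine?
ngx_http_block next calls the function ngx_http_init_phase_handlers: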
static ngx_int_t
ngx_http_init_phase_handlers(ngx_conf_t *cf, ngx_http_core_main_conf_t *cmcf)
{
    ngx_int_t                   j;
    ngx_uint_t                  i, n;
    ngx_uint_t                  find_config_index, use_rewrite, use_access;
    ngx_http_handler_pt        *h;
    ngx_http_phase_handler_t   *ph;
    ngx_http_phase_handler_pt   checker;

    /*
     * These three indexes (server_rewrite_index, location_rewrite_index
     * and find_config_index) record the next step after a rewrite, which
     * depends on how the rewrite directive ends, e.g. break or last.
     */
    cmcf->phase_engine.server_rewrite_index = (ngx_uint_t) -1;
    cmcf->phase_engine.location_rewrite_index = (ngx_uint_t) -1;
    find_config_index = 0;

    /*
     * use_rewrite marks that for POST_REWRITE_PHASE the handlers in
     * phases[NGX_HTTP_POST_REWRITE_PHASE] are not copied into
     * phase_engine; the phase gets special treatment instead.
     * use_access marks that for POST_ACCESS_PHASE the handlers in
     * phases[NGX_HTTP_POST_ACCESS_PHASE] are not copied into
     * phase_engine; the phase gets special treatment instead.
     * The same holds for the try_files and find_config phases.
     * In the switch below, the cases that end with continue are exactly
     * these four specially treated phases.
     */
    use_rewrite = cmcf->phases[NGX_HTTP_REWRITE_PHASE].handlers.nelts ? 1 : 0;
    use_access = cmcf->phases[NGX_HTTP_ACCESS_PHASE].handlers.nelts ? 1 : 0;

    n = use_rewrite + use_access + cmcf->try_files + 1 /* find config phase */;

    for (i = 0; i < NGX_HTTP_LOG_PHASE; i++) {
        n += cmcf->phases[i].handlers.nelts;
    }

    ph = ngx_pcalloc(cf->pool,
                     n * sizeof(ngx_http_phase_handler_t) + sizeof(void *));
    if (ph == NULL) {
        return NGX_ERROR;
    }

    cmcf->phase_engine.handlers = ph;
    n = 0;

    /*
     * The for loop copies the handlers of phases into the phase_engine
     * array.
     */
    for (i = 0; i < NGX_HTTP_LOG_PHASE; i++) {
        h = cmcf->phases[i].handlers.elts;

        switch (i) {

        case NGX_HTTP_SERVER_REWRITE_PHASE:
            if (cmcf->phase_engine.server_rewrite_index == (ngx_uint_t) -1) {
                cmcf->phase_engine.server_rewrite_index = n;
            }
            checker = ngx_http_core_rewrite_phase;

            break;

        case NGX_HTTP_FIND_CONFIG_PHASE:
            find_config_index = n;

            ph->checker = ngx_http_core_find_config_phase;
            n++;
            ph++;

            continue;

        case NGX_HTTP_REWRITE_PHASE:
            if (cmcf->phase_engine.location_rewrite_index == (ngx_uint_t) -1) {
                cmcf->phase_engine.location_rewrite_index = n;
            }
            checker = ngx_http_core_rewrite_phase;

            break;

        case NGX_HTTP_POST_REWRITE_PHASE:
            if (use_rewrite) {
                ph->checker = ngx_http_core_post_rewrite_phase;
                ph->next = find_config_index;
                n++;
                ph++;
            }

            continue;

        case NGX_HTTP_ACCESS_PHASE:
            checker = ngx_http_core_access_phase;
            n++;
            break;

        case NGX_HTTP_POST_ACCESS_PHASE:
            if (use_access) {
                ph->checker = ngx_http_core_post_access_phase;
                ph->next = n;
                ph++;
            }

            continue;

        case NGX_HTTP_TRY_FILES_PHASE:
            if (cmcf->try_files) {
                ph->checker = ngx_http_core_try_files_phase;
                n++;
                ph++;
            }

            continue;

        case NGX_HTTP_CONTENT_PHASE:
            checker = ngx_http_core_content_phase;
            break;

        default:
            checker = ngx_http_core_generic_phase;
        }

        n += cmcf->phases[i].handlers.nelts;

        /*
         * Handlers are copied in reverse order here: within one phase,
         * the handler of the module registered first ends up last.
         */
        for (j = cmcf->phases[i].handlers.nelts - 1; j >=0; j--) {
            ph->checker = checker;
            ph->handler = h[j];
            ph->next = n;
            ph++;
        }
    }

    return NGX_OK;
}
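The code shows that a third-party module can register its handler in whichever phase suits its needs. It is also possible to change the nginx source code to support a new phase.
Four phases, however, do not accept handlers from HTTP modules: FIND_CONFIG_PHASE, POST_REWRITE_PHASE, POST_ACCESS_PHASE and TRY_FILES_PHASE.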
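The core of ngx_http_init_phase_handlers, flattening the per-phase arrays into one engine array while reversing each phase and filling in next, can be reduced to a few lines. The sketch below is a simplified model under stated assumptions: integer ids stand in for handler function pointers, there are only three phases, and none of them is a special phase without handlers. All toy_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PHASES  3
#define TOY_MAX     8

typedef struct {
    int     id;    /* stands in for the handler function pointer */
    size_t  next;  /* index of the first slot of the following phase */
} toy_slot_t;

/*
 * ids[i] lists the handlers of phase i in registration order and
 * counts[i] says how many there are. Writes the flattened engine into
 * out[] and returns the number of slots written.
 */
size_t toy_flatten(const int ids[TOY_PHASES][TOY_MAX],
                   const size_t counts[TOY_PHASES], toy_slot_t *out)
{
    size_t       i, j, n = 0;
    toy_slot_t  *ph = out;

    for (i = 0; i < TOY_PHASES; i++) {
        assert(counts[i] <= TOY_MAX);

        n += counts[i];                 /* n is now where the next phase starts */

        for (j = counts[i]; j-- > 0; ) {  /* walk registration order backwards */
            ph->id = ids[i][j];
            ph->next = n;
            ph++;
        }
    }
    return (size_t) (ph - out);
}

/* Flattens a fixed example: phase 1 registers 20, 21, 22 in that order. */
size_t toy_flatten_demo(toy_slot_t *out)
{
    static const int ids[TOY_PHASES][TOY_MAX] = {
        { 10 },
        { 20, 21, 22 },
        { 30, 31 }
    };
    static const size_t counts[TOY_PHASES] = { 1, 3, 2 };

    return toy_flatten(ids, counts, out);
}
```

For the fixed example the flattened order is 10, 22, 21, 20, 31, 30: the handler registered last in a phase runs first, and every slot of phase 1 carries next = 4, the index where phase 2 begins.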
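III: How does a checker work? Take a representative one for discussion: the checker of NGX_HTTP_POST_REWRITE_PHASE, ngx_http_core_post_rewrite_phase.
The ngx_http_request_t structure has a member phase_handler that records which element of the cmcf->phase_engine.handlers array the request has reached, that is, which step of the state machine it is at.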
    ph = cmcf->phase_engine.handlers;

    while (ph[r->phase_handler].checker) {

        rc = ph[r->phase_handler].checker(r, &ph[r->phase_handler]);

        if (rc == NGX_OK) {
            return;
        }
    }
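Whatever step of the state machine the request has reached, the corresponding checker is called. Suppose it has reached the fifth step of the state machine listed above, r->phase_handler = 4: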
ph[4]= {checker = 0x43bdb5 <ngx_http_core_post_rewrite_phase>, handler = 0, next = 1}
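Then the checker ngx_http_core_post_rewrite_phase is executed.
Note that every phase_handler entry also has a member next. Here next = 1, which means that in some cases the request jumps from the current state straight to the find_config phase: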
ph[1]= {checker = 0x43b997 <ngx_http_core_find_config_phase>, handler = 0, next = 0}
ngx_int_t
ngx_http_core_post_rewrite_phase(ngx_http_request_t *r,
    ngx_http_phase_handler_t *ph)
{
    ngx_http_core_srv_conf_t  *cscf;

    ngx_log_debug1(NGX_LOG_DEBUG_HTTP, r->connection->log, 0,
                   "post rewrite phase: %ui", r->phase_handler);

    /*
     * Both the server_rewrite phase and the rewrite phase can rewrite
     * the URI. With last, uri_changed = 1; with break, uri_changed = 0.
     * So with break the request simply moves on to the next step of the
     * state machine.
     */
    if (!r->uri_changed) {
        r->phase_handler++;
        return NGX_AGAIN;
    }

    ngx_log_debug1(NGX_LOG_DEBUG_HTTP, r->connection->log, 0,
                   "uri changes: %d", r->uri_changes);

    /*
     * With last, uri_changes is decremented by 1.
     * uri_changes is initialized to NGX_HTTP_MAX_URI_CHANGES + 1 = 11,
     * so a request can go back to find_config at most 10 times; the
     * 11th URI change ends the request with a 500 error.
     */
    r->uri_changes--;

    if (r->uri_changes == 0) {
        ngx_log_error(NGX_LOG_ERR, r->connection->log, 0,
                      "rewrite or internal redirection cycle "
                      "while processing \"%V\"", &r->uri);

        ngx_http_finalize_request(r, NGX_HTTP_INTERNAL_SERVER_ERROR);
        return NGX_OK;
    }

    /*
     * With last, the current state of the state machine is changed so
     * that it goes back to the find_config phase; ph->next points at the
     * find_config slot here.
     */
    r->phase_handler = ph->next;

    cscf = ngx_http_get_module_srv_conf(r, ngx_http_core_module);
    r->loc_conf = cscf->ctx->loc_conf;

    return NGX_AGAIN;
}
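The counting in this checker is easy to get wrong by one, so here is a small simulation of it. This is a sketch under assumptions: TOY_MAX_URI_CHANGES mirrors NGX_HTTP_MAX_URI_CHANGES with an assumed value of 10, the toy request keeps only the two fields the checker looks at plus a status, and the loop models a location whose rewrite changes the URI on every pass.

```c
#include <assert.h>

#define TOY_MAX_URI_CHANGES  10  /* assumed to match NGX_HTTP_MAX_URI_CHANGES */

typedef struct {
    int  uri_changed;  /* set when a rewrite asks for a new location lookup */
    int  uri_changes;  /* remaining budget of URI changes */
    int  status;       /* 0 while running, 500 once the budget is used up */
} toy_request_t;

/* Returns 1 if the request may jump back to find_config, 0 otherwise. */
int toy_post_rewrite(toy_request_t *r)
{
    if (!r->uri_changed) {
        return 0;                /* no rewrite: go on to the next phase */
    }

    r->uri_changes--;

    if (r->uri_changes == 0) {
        r->status = 500;         /* rewrite cycle: finalize with an error */
        return 0;
    }

    r->uri_changed = 0;          /* cleared before the next lookup */
    return 1;
}

/*
 * Simulates a rewrite that changes the URI on every pass and returns
 * how many jumps back to find_config succeed before the request fails.
 */
int toy_count_redirects(void)
{
    toy_request_t  r = { 0, TOY_MAX_URI_CHANGES + 1, 0 };
    int            jumps = 0;

    for ( ;; ) {
        r.uri_changed = 1;       /* the rewrite fires again */

        if (!toy_post_rewrite(&r)) {
            break;
        }
        jumps++;
    }

    assert(r.status == 500);
    return jumps;
}
```

Starting from a budget of 11, ten jumps succeed and the eleventh URI change drives the counter to zero, which is where the 500 response comes from.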
--------The End