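Web Page Cache:
Program execution exhibits locality:
Temporal locality
Spatial locality
cache: a request answered from the cache is a hit
Hot spots: a result of locality;
Freshness:
Cache space exhausted: LRU
Expired: cache cleanup
Cache hit rate: hit/(hit+miss)
Range: between 0 and 1
Page hit rate: measured by the number of pages
Byte hit rate: measured by the size of the pages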
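A worked example shows why the two rates can differ. The numbers are made up for illustration: 1000 requests of which 800 are hits, and 500 MB transferred in total of which 100 MB came from the cache.

```latex
\[
\text{page hit rate} = \frac{800}{800 + 200} = 0.8,
\qquad
\text{byte hit rate} = \frac{100\ \text{MB}}{500\ \text{MB}} = 0.2
\]
```

In this made-up case most hits are small objects and the few misses are large ones, so the page hit rate looks good while most bytes still come from the backend.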
Cacheable or not:
Private data: private, private cache;
Public data: public, public or private cache;
Cache-related Header Fields
The most important caching header fields are:
Expires: expiration time;
Expires: Thu, 22 Oct 2026 06:34:30 GMT
Cache-Control
Etag
Last-Modified
If-Modified-Since
If-None-Match
Vary
Age
Cache validity mechanisms:
Expiration time: Expires
HTTP/1.0
Expires
HTTP/1.1
Cache-Control: max-age=
Cache-Control: s-maxage=
Conditional requests:
Last-Modified/If-Modified-Since
Etag/If-None-Match
Expires: Thu, 13 Aug 2026 02:05:12 GMT
Cache-Control: max-age=315360000
ETag: "1ec5-502264e2ae4c0"
Last-Modified: Wed, 03 Sep 2014 10:00:27 GMT
cache-request-directive =
"no-cache" # a cached copy may exist, but it must be revalidated before it is used to respond
"no-store" # do not store the response in any cache
"max-age" "=" delta-seconds
"max-stale" [ "=" delta-seconds ]
"min-fresh" "=" delta-seconds
"no-transform"
"only-if-cached"
cache-extension
cache-response-directive =
"public"
"private" [ "=" <"> 1#field-name <"> ] # private caches only
"no-cache" [ "=" <"> 1#field-name <"> ]
"no-store"
"no-transform"
"must-revalidate"
"proxy-revalidate"
"max-age" "=" delta-seconds
"s-maxage" "=" delta-seconds # shared (public) caches only
cache-extension
Program architecture:
Manager process
Cacher process, which contains several types of threads:
accept, worker, expiry, ...
shared memory log:
Statistics: counters;
Log area: log records;
varnishlog, varnishncsa, varnishstat...
Configuration interface: VCL
Varnish Configuration Language,
vcl compiler --> c compiler --> shared object
Varnish program environment:
/etc/varnish/varnish.params: configures how the varnish service process works, e.g. the listening address and port, and the cache mechanism;
/etc/varnish/default.vcl: configures the working properties of the Child/Cache threads;
Main program:
/usr/sbin/varnishd
CLI interface:
/usr/bin/varnishadm
Tools for working with the Shared Memory Log:
/usr/bin/varnishhist
/usr/bin/varnishlog
/usr/bin/varnishncsa
/usr/bin/varnishstat
/usr/bin/varnishtop
Test tool:
/usr/bin/varnishtest
VCL configuration reload program:
/usr/sbin/varnish_reload_vcl
Systemd Unit File:
/usr/lib/systemd/system/varnish.service
the varnish service
/usr/lib/systemd/system/varnishlog.service
/usr/lib/systemd/system/varnishncsa.service
services that persist the logs;
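Varnish cache storage mechanisms (Storage Types):
· malloc[,size]
In-memory storage; [,size] defines the amount of space; all cached objects are lost on restart;
· file[,path[,size[,granularity]]]
File storage, a black box; all cached objects are lost on restart;
· persistent,path,size
File storage, a black box; cached objects survive a restart; experimental;
man varnishd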
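Options of the varnish program:
Program options: the /etc/varnish/varnish.params file
-a address[:port][,address[:port][...]]: defaults to port 6081;
-T address[:port]: defaults to port 6082;
-s [name=]type[,options]: defines the cache storage mechanism;
-u user
-g group
-f config: the VCL configuration file;
-F: run in the foreground; used when debugging;
...
Runtime parameters: the /etc/varnish/varnish.params file, DAEMON_OPTS
DAEMON_OPTS="-p thread_pool_min=5 -p thread_pool_max=500 -p thread_pool_timeout=300"
-p param=value: sets a runtime parameter and its value; may be given several times;
-r param[,param...]: makes the listed parameters read-only;
vim /usr/lib/systemd/system/varnish.service
vim varnish.params
VARNISH_LISTEN_PORT=6081 # local listening port; usually set to 80
VARNISH_ADMIN_LISTEN_ADDRESS=127.0.0.1 # management address
VARNISH_ADMIN_LISTEN_PORT=6082 # management port
VARNISH_SECRET_FILE=/etc/varnish/secret # shared secret file of the management service
VARNISH_STORAGE="malloc,1G" # use memory for the cache; cannot be changed at runtime; memory fragments over time, so the cache is usually put in a file on an SSD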
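vim default.vcl
After writing the backend host IP, do not restart the service; reload the configuration and the site can be accessed right away
varnish_reload_vcl
Reload the VCL configuration file:
varnish_reload_vcl
varnishadm -S /etc/varnish/secret -T [ADDRESS:]PORT
varnishadm -S /etc/varnish/secret -T 127.0.0.1:6082 # the IP may be omitted, but :6082 may not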
help [<command>]
ping [<timestamp>]
auth <response>
quit
banner
status
start
stop
vcl.load <configname> <filename>
vcl.inline <configname> <quoted_VCLstring>
vcl.use <configname> # activate one of the configurations in the list
vcl.discard <configname> # delete a configuration
vcl.list # list the configurations that compiled successfully
param.show [-l] [<param>]
param.set <param> <value>
panic.show
panic.clear
storage.list
vcl.show [-v] <configname>
backend.list [<backend_expression>]
backend.set_health <backend_expression> <state>
ban <field> <operator> <arg> [&& <field> <oper> <arg>]...
ban.list
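Configuration-related commands:
vcl.list
vcl.load: load and compile;
vcl.use: activate;
vcl.discard: delete;
vcl.show [-v] <configname>: show the details of the given configuration;
Runtime parameters:
param.show -l: show the list;
param.show <PARAM>
param.set <PARAM> <VALUE>
Cache storage:
storage.list
Backend servers:
backend.list
Using several backends together requires the load-balancing module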
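VCL:
A domain-specific configuration language;
state engine;
VCL has several state engines; the states are related to each other but isolated from one another; each state engine uses return(x) to name the engine that comes next;
vcl_hash --> return(hit) --> vcl_hit
Request processing flow:
(1) Receive the request: vcl_recv; decide whether it is cacheable;
(a) Cacheable: vcl_hash
(i) Hit: vcl_hit
(ii) Miss: vcl_miss --> fetch from the backend (vcl_fetch in Varnish 3; vcl_backend_fetch and vcl_backend_response in Varnish 4)
(b) Not cacheable: fetch from the backend directly
(2) Respond: vcl_deliver
state engine: how the engines hand over to each other
request: vcl_recv
response: vcl_deliver
(1) vcl_hash -(hit)-> vcl_hit --> vcl_deliver
(2) vcl_hash -(miss)-> vcl_miss --> vcl_backend_fetch --> vcl_backend_response --> vcl_deliver
(3) vcl_hash -(purge)-> vcl_purge --> vcl_synth # cache purge
(4) vcl_hash -(pipe)-> vcl_pipe
Two special engines:
vcl_init: VCL code executed before any request is processed; mainly used to initialize VMODs;
vcl_fini: called when the VCL configuration is discarded, after all requests have finished; mainly used to clean up VMODs;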
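The hit and miss paths listed above can be sketched as a minimal VCL 4.0 file. The backend address 127.0.0.1:8080 and the X-Cache header name are assumptions made for illustration; this is a sketch, not a recommended production configuration.

```vcl
vcl 4.0;

# Hypothetical backend, for illustration only.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Anything other than GET or HEAD bypasses the cache.
    if (req.method != "GET" && req.method != "HEAD") {
        return (pass);
    }
    # No return here: the built-in vcl_recv is appended below this code
    # and decides whether to continue to vcl_hash.
}

sub vcl_backend_response {
    # Runs after the backend has answered a miss or a pass.
    return (deliver);
}

sub vcl_deliver {
    # Last state before the response goes back to the client.
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
    } else {
        set resp.http.X-Cache = "MISS";
    }
    return (deliver);
}
```

Requesting the same URL twice should show the X-Cache header change from MISS to HIT, provided the backend response is cacheable.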
VCL syntax:
(1) VCL files start with vcl 4.0; # since version 4.0
(2) //, # and /* foo */ for comments;
(3) Subroutines are declared with the sub keyword, e.g. sub vcl_recv { ...};
(4) No loops, state-limited variables (built-in variables are limited to particular engines); # variables and conditionals are supported
(5) Terminating statements with a keyword for next action as argument of the return() function, i.e.: return(action);
(6) Domain-specific; # configuration written for one domain applies only to that domain
The VCL Finite State Machine
(1) Each request is processed separately;
(2) Each request is independent from others at any given time;
(3) States are related, but isolated;
(4) return(action); exits one state and instructs Varnish to proceed to the next state;
(5) Built-in VCL code is always present and appended below your own VCL;
Three main kinds of syntax:
sub subroutine {
...
}
if CONDITION {
...
} else {
...
}
return(), hash_data()
VCL Built-in Functions and Keywords
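Functions:
regsub(str, regex, sub) # replaces only the first match
regsuball(str, regex, sub) # replaces all matches
ban(boolean expression) # invalidates matching objects in the cache
hash_data(input) # adds input to the hash computation
synthetic(str)
Keywords:
call subroutine, return(action),new,set,unset
Operators:
==, !=, ~, >, >=, <, <=
Logical operators: &&, ||, !
Assignment: =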
Example: obj.hits
sub vcl_deliver {
    if (obj.hits>0) {
        set resp.http.X-Cache = "HIT via " + server.ip;
    } else {
        set resp.http.X-Cache = "MISS via " + server.ip;
    }
}
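The difference between regsub and regsuball from the function list above can be sketched as follows. The backend address and the two rewrite rules are assumptions chosen for illustration.

```vcl
vcl 4.0;

# Hypothetical backend, for illustration only.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    if (req.http.Host) {
        # regsub replaces only the first match: drop a leading "www."
        set req.http.Host = regsub(req.http.Host, "^www\.", "");
    }
    # regsuball replaces every match: collapse runs of slashes into one
    set req.url = regsuball(req.url, "/{2,}", "/");
}
```

Normalizing the Host header and the URL this way means that variants of the same address map to a single cache object.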
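Variable types:
Built-in variables:
req.*: request; relates to the request message sent by the client;
req.http.*
req.http.User-Agent, req.http.Referer, ...
bereq.*: relates to the HTTP request that varnish sends to the backend (BE) host;
bereq.http.*
beresp.*: relates to the response message that the BE host returns to varnish;
beresp.http.*
resp.*: relates to the response that varnish sends to the client;
obj.*: attributes of the object stored in the cache; read-only;
Commonly used variables:
bereq.*:
bereq.http.HEADERS: a given header of the request; custom headers may be defined
bereq.method: the request method (named bereq.request in Varnish 3);
bereq.url: the requested url;
bereq.proto: the protocol version of the request;
bereq.backend: the backend host to use;
req.*:
req.http.Cookie: the value of the Cookie header in the client's request;
req.http.User-Agent ~ "chrome" # matches on the browser's User-Agent value
beresp.*, resp.*:
beresp.http.HEADERS
beresp.status: the response status code;
beresp.proto: the protocol version;
beresp.backend.name: the name of the BE host;
beresp.ttl: the remaining time for which the BE host's response may be cached;
obj.*
obj.hits: the number of times this object has been served from the cache;
obj.ttl: the object's ttl value
server.*
server.ip
server.hostname
client.*
client.ip
User-defined:
set
unset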
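A short sketch of set and unset acting on the req.* variables listed above. The backend address and the X-Browser header name are hypothetical and serve only as illustration.

```vcl
vcl 4.0;

# Hypothetical backend, for illustration only.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Requests for static files rarely need cookies; removing the header
    # lets the built-in vcl_recv treat them as cacheable.
    if (req.url ~ "(?i)\.(jpg|jpeg|png|gif|css|js)$") {
        unset req.http.Cookie;
    }

    # Record a simplified browser class in a custom request header.
    if (req.http.User-Agent ~ "(?i)chrome") {
        set req.http.X-Browser = "chrome";
    } else {
        set req.http.X-Browser = "other";
    }
}
```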
Example 1: force requests for certain resources to bypass the cache:
sub vcl_recv {
    if (req.url ~ "(?i)^/(login|admin)") {
        return(pass);
    }
}
[Two screenshots omitted: before loading the configuration, and after loading it.]
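Example 2: for certain resource types, such as public images, remove the private marker and force the length of time varnish may cache them;
sub vcl_backend_response {
    if (beresp.http.cache-control !~ "s-maxage") {
        if (bereq.url ~ "(?i)\.(jpg|jpeg|png|gif|css|js)$") {
            unset beresp.http.Set-Cookie;
            set beresp.ttl = 3600s;
        }
    }
}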
Trimming cached objects: purge, ban
(1) Make the purge operation available
sub vcl_purge {
return (synth(200,"Purged"));
}
(2) When to perform the purge operation
sub vcl_recv {
if (req.method == "PURGE") {
return(purge);
}
...
}
Add access control rules for such requests:
acl purgers {
"127.0.0.0"/8;
"10.1.0.0"/16;
}
sub vcl_recv {
if (req.method == "PURGE") {
if (!client.ip ~ purgers) {
return(synth(405,"Purging not allowed for " + client.ip));
}
return(purge);
}
...
}
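The examples above cover purge only. A ban can be wired up in a similar way; the sketch below assumes a custom HTTP method named BAN, a hypothetical acl called banners, and a hypothetical backend address.

```vcl
vcl 4.0;

# Hypothetical backend, for illustration only.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

# Hypothetical list of clients allowed to issue bans.
acl banners {
    "127.0.0.0"/8;
}

sub vcl_recv {
    if (req.method == "BAN") {
        if (!client.ip ~ banners) {
            return (synth(405, "Banning not allowed for " + client.ip));
        }
        # Invalidate every cached object for this host whose URL
        # matches the request URL, which is read as a regex.
        ban("req.http.host == " + req.http.host + " && req.url ~ " + req.url);
        return (synth(200, "Ban added"));
    }
}
```

Unlike purge, which removes one object, a ban adds an entry to the ban list (visible with ban.list) and can invalidate many objects at once.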
How to configure multiple backend hosts:
backend default {
.host = "172.16.100.6";
.port = "80";
}
backend appsrv {
.host = "172.16.100.7";
.port = "80";
}
sub vcl_recv {
if (req.url ~ "(?i)\.php$") {
set req.backend_hint = appsrv;
} else {
set req.backend_hint = default;
}
...
}
Director:
varnish module;
Must be imported before use:
import directors;
Example:
import directors; # load the directors
backend server1 {
.host =
.port =
}
backend server2 {
.host =
.port =
}
sub vcl_init {
new GROUP_NAME = directors.round_robin();
GROUP_NAME.add_backend(server1);
GROUP_NAME.add_backend(server2);
}
sub vcl_recv {
# send all traffic to the GROUP_NAME director:
set req.backend_hint = GROUP_NAME.backend();
}
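Besides round_robin, the directors module also provides a weighted random director. The sketch below assumes two hypothetical backends at 192.0.2.11 and 192.0.2.12 and a director named pool; the 3:1 weights are illustrative.

```vcl
vcl 4.0;
import directors;

# Hypothetical backends, for illustration only.
backend server1 {
    .host = "192.0.2.11";
    .port = "80";
}
backend server2 {
    .host = "192.0.2.12";
    .port = "80";
}

sub vcl_init {
    # server1 should receive roughly three times the traffic of server2.
    new pool = directors.random();
    pool.add_backend(server1, 3.0);
    pool.add_backend(server2, 1.0);
}

sub vcl_recv {
    set req.backend_hint = pool.backend();
}
```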
BE Health Check:
backend BE_NAME {
.host =
.port =
.probe = {
.url=
.timeout=
.interval=
.window=
.threshold=
}
}
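.probe: defines the health check method;
.url: the URL requested by the check; defaults to "/";
.request: the exact request to send;
.request =
"GET /.healthtest.html HTTP/1.1"
"Host: www.magedu.com"
"Connection: close"
.window: how many of the most recent checks are used to judge health;
.threshold: at least this many of the last .window checks must have succeeded;
.interval: how often the check runs;
.timeout: timeout of each check;
.expected_response: the expected status code; defaults to 200;
Ways to configure health checks:
(1) probe PB_NAME { }
backend NAME {
.probe = PB_NAME;
...
}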
(2) backend NAME {
.probe = {
...
}
}
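The inline form (2) can be filled in as follows, here using .request instead of .url. The backend name app, its address, the Host value and the numeric settings are all assumptions made for illustration.

```vcl
vcl 4.0;

# Hypothetical backend with an inline probe.
backend app {
    .host = "192.0.2.21";
    .port = "80";
    .probe = {
        # The full request to send; consecutive strings become separate lines.
        .request =
            "GET /.healthtest.html HTTP/1.1"
            "Host: www.example.com"
            "Connection: close";
        .expected_response = 200;
        .window = 5;       # judge health on the last 5 probes
        .threshold = 3;    # at least 3 of those 5 must succeed
        .interval = 5s;
        .timeout = 2s;
    }
}
```

With these values the backend counts as healthy as long as no more than two of the last five probes failed.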
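Example:
probe check {
.url="/.healthcheck"; # the page used for the health check
.window = 8; # how many of the most recent probes are used to judge the backend's health
.threshold = 8; # how many of the probes in .window must succeed for the backend to count as healthy
.interval = 2s; # how often probe requests are sent; the default is 5 seconds
.timeout = 1s; # timeout of each probe request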
}
backend default {
.host = "192.168.153.129";
.port = "80";
.probe=check;
}
backend two {
.host = "192.168.153.130";
.port = "80";
.probe=check;
}
Varnish runtime parameters:
Thread model:
cache-worker
cache-main
ban lurker
acceptor:
epoll/kqueue:
...
Thread-related parameters:
Within a thread pool, each request is handled by one thread; the maximum number of worker threads determines how many requests varnish can serve concurrently;
thread_pools: Number of worker thread pools. Preferably less than or equal to the number of CPU cores;
thread_pool_max: The maximum number of worker threads in each pool.
thread_pool_min: The minimum number of worker threads in each pool. It also acts as the maximum number of idle threads;
Maximum concurrent connections = thread_pools * thread_pool_max
thread_pool_timeout: Thread idle threshold. Threads in excess of thread_pool_min, which have been idle for at least this long, will be destroyed.
thread_pool_add_delay: Wait at least this long after creating a thread.
thread_pool_destroy_delay: Wait this long after destroying a thread.
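As a worked instance of the concurrency formula above, assume thread_pools is 2 (an assumed value, not taken from these notes) and thread_pool_max is 500, as in the DAEMON_OPTS line shown earlier:

```latex
\[
\text{maximum concurrent connections} = \text{thread\_pools} \times \text{thread\_pool\_max} = 2 \times 500 = 1000
\]
```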
How to set them:
param.set
To make the setting permanent:
varnish.params
DAEMON_OPTS="-p PARAM1=VALUE -p PARAM2=VALUE"
varnishstat -1 -f MAIN.threads # show the given field
Varnish log area:
shared memory log
Counters
Log messages
1. varnishstat - Varnish Cache statistics
-1
-1 -f FIELD_NAME
-l: list the field names that can be given to the -f option;
MAIN.cache_hit
MAIN.cache_miss
varnishstat -1 -f MAIN.cache_hit -f MAIN.cache_miss
2. varnishtop - Varnish log entry ranking
-1 Instead of a continuously updated display, print the statistics once and exit.
-i taglist: several -i options may be given, or one option followed by several tags;
-I <[taglist:]regex>
-x taglist: exclusion list
-X <[taglist:]regex>
3. varnishlog - Display Varnish logs
4. varnishncsa - Display Varnish logs in Apache / NCSA combined log format
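Built-in functions:
hash_data(): specifies the data fed into the hash; reducing variation in that data raises the hit rate;
regsub(str,regex,sub): replaces the first match of regex in str with sub; mainly used for URL rewriting
regsuball(str,regex,sub): replaces every match of regex in str with sub;
return():
ban(expression)
ban_url(regex): bans every cached object whose URL matches regex (a Varnish 3 function; in Varnish 4 the same is written with ban() and a req.url expression);
synth(status,"STRING"): builds a synthetic response, e.g. to answer a purge request;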
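How hash_data() can reduce variation in the cache key is sketched below. It follows the shape of the built-in vcl_hash but lowercases the Host header first, using std.tolower from the std VMOD; the backend address is hypothetical.

```vcl
vcl 4.0;
import std;

# Hypothetical backend, for illustration only.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_hash {
    hash_data(req.url);
    if (req.http.host) {
        # "Example.com" and "example.com" now share one cache object.
        hash_data(std.tolower(req.http.host));
    } else {
        hash_data(server.ip);
    }
    # Returning here skips the built-in vcl_hash, so nothing is hashed twice.
    return (lookup);
}
```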
To raise the hit rate, and to keep it from dropping when one of the servers fails, use nginx's
hash $request_uri consistent; # use the consistent hashing algorithm