Article link: https://blog.csdn.net/qq_44564366/article/details/97971964

Varnish

Caching
Introduction to Varnish
Varnish architecture
Varnish workflow
Using Varnish
Configuring VCL

Caching

Introduction to Varnish

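Varnish is a high-performance open-source HTTP accelerator. Verdens Gang, Norway's largest online newspaper, replaced 12 Squid servers with 3 Varnish servers and got better performance than before.
Poul-Henning Kamp, the author of Varnish, is one of the FreeBSD kernel developers. He argues that today's computers are far more complex than those of 1975. Back then there were only two kinds of storage: memory and disk. Today, besides main memory, a system also has L1 and L2 caches inside the CPU, sometimes even an L3 cache, and disks carry their own caches. Squid Cache manages object replacement by itself, so it cannot know about these layers and cannot optimize for them, whereas the operating system can. That part of the work should therefore be left to the operating system, and this is the idea behind the design of Varnish Cache.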

Varnish architecture

[Figure: Varnish architecture diagram]

Management

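The Management process is mainly responsible for managing the child process, loading configuration files, providing the command interface, and compiling VCL. It continuously monitors the child's heartbeat; if the heartbeat of a child process can no longer be detected, Management restarts that process.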

CLI interface: text-based command-line interface
web interface: graphical management interface
telnet interface: command-line interface over telnet

child/cache

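The child (cache) process handles CLI commands, storage, hash computation, log management, accepting client requests, managing backend servers, managing the worker thread pools, and clearing expired cache content.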

log file

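Handles logging. Varnish writes its log to shared memory as a circular buffer that is overwritten as it fills, typically about 90 MB in size. The log has two parts: the first holds counters, the second holds data about client requests.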

VCL

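VCL (Varnish Configuration Language) is a domain-specific language that Varnish provides for configuring caching policy. It supports built-in functions, operators, built-in variables, and so on. Once a VCL policy is written, the configuration is first translated into C source code, which a C compiler then turns into a binary that the worker threads inside the child process use.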

Varnish workflow

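[Figure: Varnish request-processing flowchart]
The figure above is a simplified flowchart. It contains several engines that are isolated from one another; a return statement specifies which engine control jumps to next. Each engine has its own configuration block, namely a subroutine.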

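vcl_recv: the entry point for request messages and the first engine to run. Cacheable requests go on from here to a lookup in the local cache.
vcl_hash: computes the hash key for the request. The lookup that follows, together with the VCL policy, decides whether the request can be answered from the cache; control then moves on to hit, miss, pass, and so on.
Caching policy engines
- vcl_hit: cache hit; the response is served to the client directly from the cache.
- vcl_miss: cache miss; the request is handed to fetch.
- vcl_purge: cache purge; if the object is cached, it is removed and no longer used to answer requests.
- vcl_pipe: the backend server handles the request directly, and Varnish passes the response back to the client untouched.
- waiting: a state in the flowchart rather than a VCL subroutine; requests for an object that is already being fetched are put on a waiting list.
- vcl_pass: skips the cache lookup and hands the request to fetch.
- vcl_backend_fetch: the request is sent to the backend server, which produces the response.
  - vcl_backend_response: handles the backend's response and decides, based on the configuration, whether to store it in the cache.
  - vcl_backend_error: handles a failed backend fetch (error response).
- vcl_deliver: delivers the response message to the client.
- vcl_synth: generates a response inside Varnish itself, usually a page carrying an error message; it can also redirect to another page.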
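
To make the transitions between engines concrete, here is a minimal sketch of how return hands a request from one engine to the next. The vcl_hash body follows the built-in behavior of Varnish 4 (hash on the URL plus the Host header, falling back to the server IP). The backend address and the 120-second TTL are arbitrary assumptions chosen for illustration, not values from this article.

```vcl
vcl 4.0;

# Hypothetical backend, present only so the sketch compiles on its own.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Entry point: cacheable methods continue to vcl_hash.
    if (req.method == "GET" || req.method == "HEAD") {
        return (hash);
    }
    # Everything else skips the cache lookup and goes to vcl_pass.
    return (pass);
}

sub vcl_hash {
    # Build the cache key, then look the object up in the cache.
    hash_data(req.url);
    if (req.http.host) {
        hash_data(req.http.host);
    } else {
        hash_data(server.ip);
    }
    return (lookup);
}

sub vcl_backend_response {
    # Reached after a miss or pass; assumed TTL of 120 seconds.
    set beresp.ttl = 120s;
    return (deliver);
}
```

Any engine without an explicit return falls through to the built-in VCL for that engine, which is why the default.vcl shipped with the package can leave its subroutines empty.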

Using Varnish

First install Varnish; it can be installed directly from the EPEL repository.

[root@keepalived2 ~]# yum info varnish
Repo        : installed

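Varnish has two configuration files, default.vcl and varnish.params. The VCL file configures the caching policy, while the params file configures how the Varnish service process runs, for example which port it listens on.

Let's start with varnish.params.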

[root@keepalived2 varnish]# vim varnish.params 

# Varnish environment configuration description. This was derived from
# the old style sysconfig/defaults settings

# Set this to 1 to make systemd reload try to switch VCL without restart.
RELOAD_VCL=1    # 1 = a reload recompiles and switches the VCL without restarting Varnish

# Main configuration file. You probably want to change it.
VARNISH_VCL_CONF=/etc/varnish/default.vcl    # cache policy file loaded by default

# Default address and port to bind to. Blank address means all IPv4
# and IPv6 interfaces, otherwise specify a host name, an IPv4 dotted
# quad, or an IPv6 address in brackets.
# VARNISH_LISTEN_ADDRESS=192.168.1.5      # listen address
VARNISH_LISTEN_PORT=6081       # listen port

# Admin interface listen address and port
VARNISH_ADMIN_LISTEN_ADDRESS=127.0.0.1    # listen address of the admin interface
VARNISH_ADMIN_LISTEN_PORT=6082   # listen port of the admin interface

# Shared secret file for admin interface
VARNISH_SECRET_FILE=/etc/varnish/secret     # secret file

# Backend storage specification, see Storage Types in the varnishd(5)
# man page for details.
VARNISH_STORAGE="malloc,256M"      # cache storage type
# There are three storage types:
#   malloc: in-memory storage; the cache is lost on restart
#   file: disk file storage; the cache is also lost on restart. You must create the directory yourself and set its owner and group
#   persistent: file storage, but still experimental

# User and group for the varnishd worker processes
VARNISH_USER=varnish       # user and group that Varnish runs as
VARNISH_GROUP=varnish

# Other options, see the man page varnishd(1)
#DAEMON_OPTS="-p thread_pool_min=5 -p thread_pool_max=500 -p thread_pool_timeout=300"     # runtime parameters

The default.vcl configuration file

[root@keepalived2 varnish]# vim default.vcl 

#
# This is an example VCL file for Varnish.
#
# It does not do anything by default, delegating control to the
# builtin VCL. The builtin VCL is called when there is no explicit
# return statement.
#
# See the VCL chapters in the Users Guide at https://www.varnish-cache.org/docs/
# and http://varnish-cache.org/trac/wiki/VCLExamples for more examples.

# Marker to tell the VCL compiler that this VCL has been adapted to the
# new 4.0 format.
vcl 4.0;

# Default backend definition. Set this to point to your content server.
backend default {   # address of the backend server
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Happens before we check if we have this in cache already.
    #
    # Typically you clean up the request here, removing cookies you don't need,
    # rewriting the request, etc.
}

sub vcl_backend_response {
    # Happens after we have read the response headers from the backend.
    #
    # Here you clean the response headers, removing silly Set-Cookie headers
    # and other mistakes your backend does.
}

sub vcl_deliver {
    # Happens when we have all the pieces we need, and are about to send the
    # response to the client.
    #
    # You can do accounting or modifying the final object here.
}

Example:

# A simple reverse proxy: change the following two lines in varnish.params so that Varnish itself accepts client requests
  VARNISH_LISTEN_ADDRESS=192.168.199.145
  VARNISH_LISTEN_PORT=80

# In default.vcl, fill in the address and port of the backend server inside backend default.
backend default {
    .host = "192.168.179.129";
    .port = "80";
 }

# Requests to 192.168.199.145 now return the content served by 192.168.179.129.

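Configuring VCL

The above is only a simple configuration; next come built-in variables, functions, and so on. To work with VCL configurations we use varnishadm, which provides an interactive interface for loading VCL files, listing them, and so on.

Using varnishadm: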

     # enter interactive mode
        [root@keepalived2 varnish]# varnishadm -S /etc/varnish/secret -T 127.0.0.1:6082
        help
			    200        # response code
			    help [<command>]   
			    ping [<timestamp>]  # check that the Varnish process is alive and responding
			    auth <response>   # authenticate
			    quit   # exit
			    banner  # show the welcome banner
			    status
			    start
			    stop
			    vcl.load <configname> <filename>   # compile and load a VCL file
			    vcl.inline <configname> <quoted_VCLstring>   # compile and load VCL given inline as a quoted string
			    vcl.use <configname>   # switch to the named configuration
			    vcl.discard <configname>   # manually discard a loaded version
			    vcl.list    # list all loaded VCL configurations
			    param.show [-l] [<param>]   # show runtime parameters
			    param.set <param> <value>  # set a runtime parameter
			    panic.show
			    panic.clear
			    storage.list  # list the storage backends
			    vcl.show [-v] <configname>
			    backend.list [<backend_expression>]  # list the backend servers
			    backend.set_health <backend_expression> <state>
			    ban <field> <operator> <arg> [&& <field> <oper> <arg>]...
			    ban.list

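When Varnish starts for the first time it has a set of built-in default settings, which can be inspected from the interactive command line.

# the following two commands show them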
vcl.list
200        
active          0 boot

vcl.show -v boot
200        

# Only one subroutine is listed here, with brief notes
sub vcl_recv {
    if (req.method == "PRI") {      # if the method is PRI, return a synthetic 405 response
	/* We do not support SPDY or HTTP/2.0 */
	return (synth(405));
    }
    if (req.method != "GET" &&
      req.method != "HEAD" &&
      req.method != "PUT" &&
      req.method != "POST" &&
      req.method != "TRACE" &&
      req.method != "OPTIONS" &&
      req.method != "DELETE") {
        /* Non-RFC2616 or CONNECT which is weird. */    # if the method is none of the above, pipe the request straight to the backend server
        return (pipe);
    }

    if (req.method != "GET" && req.method != "HEAD") {      # if the method is neither GET nor HEAD, pass without looking in the cache
        /* We only deal with GET and HEAD by default */
        return (pass);
    }
    if (req.http.Authorization || req.http.Cookie) {    # requests carrying authorization data or cookies are also passed, without a cache lookup
        /* Not cacheable by default */
        return (pass);
    }
    return (hash);    # continue to the hash engine
}

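Built-in variables

Built-in variable	Meaning
req.*	the request message sent by the client
bereq.*	the HTTP request sent by Varnish to the backend (BE) host
beresp.*	the response message sent by the backend host to Varnish
resp.*	the response message sent by Varnish to the client
obj.*	attributes of an object stored in the cache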

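Commonly used variables:

req.*, bereq.*
- bereq(req).http.HEADERS: a header of the request message
- bereq(req).method: the request method
- bereq(req).url: the requested URL
- bereq(req).proto: the protocol version of the request
- req.backend_hint / bereq.backend: the backend host to use
- req.http.Cookie: the value of the Cookie header in the client's request
beresp.*, resp.*
- beresp(resp).http.HEADERS: a header of the response message
- beresp(resp).status: the response status code
- beresp(resp).proto: the protocol version
- beresp.backend.name: the name of the backend host
- beresp.ttl: the remaining cacheable lifetime of the content returned by the backend host
obj.*
- obj.hits: the number of times this object has been served from the cache
- obj.ttl: the object's TTL, i.e. how long it stays cached
server.*
- server.ip: the server's IP address
- server.hostname: the server's host name
client.*
- client.ip: the client's IP address
  Note that the available variables differ from engine to engine: each engine can only use a specific set of variables.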
  
  Example 1:

[root@keepalived2 varnish]# cat default.vcl 
sub vcl_deliver {
    # Happens when we have all the pieces we need, and are about to send the
    # response to the client.
    #
    if (obj.hits > 0) {
        set resp.http.X-Cache = "Hit via " + server.ip;
    } else {
        set resp.http.X-Cache = "Miss from " + server.ip;
    }
    # You can do accounting or modifying the final object here.
}
# Adds a header to the response that tells whether it was served from the cache

Example 2:
Bypass the cache lookup when specific resources are requested

sub vcl_recv {
    # Happens before we check if we have this in cache already.
    #
    # Typically you clean up the request here, removing cookies you don't need,
    # rewriting the request, etc.

    # Case-insensitive match on URLs that start with /login or /admin
    if (req.url ~ "(?i)^/(login|admin)") {
        return (pass);
    }
}
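
As a further illustration of the built-in variables, the fragment below sketches how req.*, bereq.* and beresp.* could be used together so that static files become cacheable. It is meant to be merged into an existing default.vcl (if the file already defines these subroutines, Varnish runs the bodies in order). The file extensions and the one-hour TTL are assumptions made for the example, not recommendations from this article.

```vcl
sub vcl_recv {
    # The built-in vcl_recv passes any request that carries a Cookie
    # header, so drop cookies for static files to allow a cache lookup.
    if (req.url ~ "(?i)\.(png|jpg|gif|css|js)$") {
        unset req.http.Cookie;
    }
}

sub vcl_backend_response {
    # bereq.* describes the request Varnish sent to the backend,
    # beresp.* the response that came back.
    if (bereq.url ~ "(?i)\.(png|jpg|gif|css|js)$") {
        # A Set-Cookie header would otherwise make the object uncacheable.
        unset beresp.http.Set-Cookie;
        # Assumed lifetime of one hour for static objects.
        set beresp.ttl = 1h;
    }
}
```

The split matters because req.* is only available on the client side of Varnish, while bereq.* and beresp.* exist only in the backend engines.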

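purge, ban

purge means trimming the cache: if a matching object is in the cache, you can remove it so that the response is no longer served from the cache.
Example: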

# Add the following block to default.vcl; it defines the purge engine
sub vcl_purge {
		return(synth(200,"purged"));
}

# In vcl_recv, purge the cache when the request method is PURGE, i.e. remove the cached object
sub vcl_recv {
	if (req.method == "PURGE") {
		return(purge);
	}
	# Happens before we check if we have this in cache already.
    #
    # Typically you clean up the request here, removing cookies you don't need,
    # rewriting the request, etc.
}

Of course, not everyone should be allowed to purge, so access control is needed.

acl purgers {   # define the ACL at the top level, outside any subroutine
    "127.0.0.1";
    "192.168.0.0"/24;
}



sub vcl_recv {
    if (req.url ~ "(?i)^/login") {
		return(pass);
	}
	
	if (req.method == "PURGE") {    
			if (!client.ip ~ purgers){     # check the client IP; if it is not in the ACL, return a 405 response
				return(synth(405,"Purging not allowed for " + client.ip));
			}	
		return(purge);   
	}
	# Happens before we check if we have this in cache already.
    #
    # Typically you clean up the request here, removing cookies you don't need,
    # rewriting the request, etc.
}

ban invalidates the cached objects that match an expression; it can be used directly from the interactive command line.

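ban req.url ~ ^/admin
# Invalidates the cached resources under /admin. The effect is one-off: objects cached before the ban are no longer served, and later requests are fetched and cached again.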
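
A ban can also be issued from VCL with the ban() function, which makes it possible to trigger it with an HTTP request instead of the admin CLI. The fragment below is a minimal sketch to merge into default.vcl. The BAN method name is only a convention, and the ACL name banners and its address are hypothetical values chosen for the example.

```vcl
# Hypothetical ACL listing the clients allowed to issue bans.
acl banners {
    "127.0.0.1";
}

sub vcl_recv {
    if (req.method == "BAN") {
        if (!client.ip ~ banners) {
            return (synth(405, "Banning not allowed for " + client.ip));
        }
        # Invalidate every cached object on this host whose URL
        # starts with /admin.
        ban("req.http.host == " + req.http.host + " && req.url ~ ^/admin");
        return (synth(200, "Ban added"));
    }
}
```

Unlike purge, which removes the single object selected by the request's hash key, a ban is an expression that cached objects are checked against before they are served, so one ban can cover many URLs.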

Director

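Like an LVS load balancer, a director can distribute requests across multiple backend servers, and various scheduling methods can be defined. The directors module must be imported at the top of the VCL file.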

import directors;

backend webserver1 {  # define two backend servers
    .host = "192.168.179.129";
    .port = "80";
}

backend webserver2 {
	.host = "192.168.179.130";
	.port = "80";
}

sub vcl_init {
    new webser = directors.round_robin();
    webser.add_backend(webserver1);
    webser.add_backend(webserver2);

}


sub vcl_recv {
		set req.backend_hint = webser.backend();

}
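
Round-robin is only one of the scheduling methods in the directors module. The sketch below shows two others, random (weighted) and fallback, using the same two backends as above. The 3:1 weights and the idea of sending /admin requests through the fallback director are assumptions made purely for illustration.

```vcl
vcl 4.0;

import directors;

backend webserver1 {
    .host = "192.168.179.129";
    .port = "80";
}

backend webserver2 {
    .host = "192.168.179.130";
    .port = "80";
}

sub vcl_init {
    # Weighted random: webserver1 receives roughly three times as
    # many requests as webserver2 (assumed weights).
    new weighted = directors.random();
    weighted.add_backend(webserver1, 3.0);
    weighted.add_backend(webserver2, 1.0);

    # Fallback: always use the first healthy backend in the order added.
    new standby = directors.fallback();
    standby.add_backend(webserver1);
    standby.add_backend(webserver2);
}

sub vcl_recv {
    if (req.url ~ "^/admin") {
        set req.backend_hint = standby.backend();
    } else {
        set req.backend_hint = weighted.backend();
    }
}
```

The fallback director only becomes useful once the backends have health probes, because it needs to know which backend is currently healthy.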

Backend health checks

Example:

probe check{
   .url = "/index.html";
   .window = 5; 
   .threshold = 4;
   .interval = 2s;
   .timeout = 2s;
}

backend web1 {
    .host = "172.17.0.2";
    .port = "80";
    .probe = check;
}

backend web2 {
   .host = "172.17.0.3";
   .port = "80";
   .probe = check;
}

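url: the URL to probe; the default is /
window: how many of the most recent probes are considered when judging health
threshold: at least this many of the last window probes must have succeeded for the backend to be considered healthy
interval: how often a probe is sent
timeout: how long to wait for a probe before it counts as failed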
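
Besides .url, a probe accepts a few more attributes. The sketch below uses .request to send a hand-written HEAD request instead of a plain GET, .expected_response to state which status code counts as success, and .initial to set how many probes are assumed good at startup. The Host header value www.example.com and the probe name check_head are placeholders chosen for the example; the backend address is the one used earlier in this section.

```vcl
vcl 4.0;

probe check_head {
    # .request replaces .url; each string becomes one request line.
    .request =
        "HEAD /index.html HTTP/1.1"
        "Host: www.example.com"
        "Connection: close";
    .expected_response = 200;   # status code that counts as healthy
    .initial = 3;               # probes assumed good when Varnish starts
    .window = 5;
    .threshold = 4;
    .interval = 2s;
    .timeout = 2s;
}

backend web1 {
    .host = "172.17.0.2";
    .port = "80";
    .probe = check_head;
}
```

With .initial below .threshold, the backend starts out sick and is only marked healthy after enough real probes have succeeded.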

Use backend.list to check the health of the backend servers.

backend.list
200        
Backend name                   Refs   Admin      Probe
default(127.0.0.1,,8080)       1      probe      Healthy (no probe)
default(172.17.0.2,,80)        4      probe      Healthy (no probe)
web1(172.17.0.2,,80)           3      probe      Healthy 5/5
web2(172.17.0.3,,80)           3      probe      Healthy 5/5   # this is the healthy state



backend.list
200        
Backend name                   Refs   Admin      Probe
default(127.0.0.1,,8080)       1      probe      Healthy (no probe)
default(172.17.0.2,,80)        4      probe      Healthy (no probe)
web1(172.17.0.2,,80)           3      probe      Healthy 5/5
web2(172.17.0.3,,80)           3      probe      Sick 0/5 # once probes of the URL fail, the backend is marked as unavailable

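Performance tuning parameters

Inside a thread pool, each request is handled by one thread, so the number of worker threads determines how many requests Varnish can serve concurrently.

thread_pools: the number of thread pools; should be less than or equal to the number of CPU cores
thread_pool_max: the maximum number of threads in each pool
thread_pool_min: the minimum number of threads in each pool; this many threads are kept alive even when idle
thread_pool_timeout: how long a thread may sit idle before it is destroyed. For example, if up to 2000 idle threads are kept and there are currently 2500 threads, the 500 surplus threads are destroyed once this timeout expires.
thread_pool_add_delay: the delay between creating threads
thread_pool_destroy_delay: the delay between destroying threads
send_timeout: the timeout for sending the response message to the client
timeout_idle: how long an idle client connection is kept open
timeout_req: the timeout for receiving the client's request
cli_timeout: how long the management process waits for the child process to answer a CLI request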

Parameters can be set from the command line, but such changes are only temporary.
Example:

param.set thread_pools 4
200        

param.show thread_pools
200        
thread_pools
        Value is: 4 [pools]  # 4 after the change
        Default is: 2        # default is 2
        Minimum is: 1        # minimum is 1

To make the settings permanent, add the following to varnish.params.

DAEMON_OPTS="-p thread_pools=4 -p thread_pool_min=200 -p thread_pool_max=2000 -p thread_pool_timeout=300 -p thread_pool_add_delay=300 -p thread_pool_destroy_delay=300 -p timeout_idle=60"