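I recently decided to try out Consul-based service discovery myself, which is how this post came about.
My environment is Google Cloud running CentOS 7.5, and it is a production environment.
First, install Docker and docker-compose. Both have installation guides on their official sites, so I won't go into detail here.
Next, install Consul and Registrator. Since the two are related, I put them in a single docker-compose.yml file: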

    version: "3"
    services:
      consul:
        image: consul:latest
        container_name: consul
        network_mode: "host"
        volumes:
          - ${PWD}/cert/server-key.pem:/etc/pki/tls/private/server-key.pem
          - ${PWD}/cert/server.pem:/etc/pki/tls/certs/server.pem
          - ${PWD}/cert/consul-ca.pem:/etc/pki/tls/certs/consul-ca.pem
          - ${PWD}/config.json:/consul/config/config.json
        command: agent
        # see https://www.consul.io/docs/agent/options.html for more information
        # -server: server mode
        # -ui: enable ui on :8500/ui/
        # -bind: used for communication in a cluster
        # -advertise-wan: used for WAN communication
        # -node: node name
        # -datacenter: datacenter
        # -bootstrap-expect: number of servers expected to be connected when bootstrapping

      # listen on the local docker sock to register containers with public ports to the consul service
      registrator:
        image: gliderlabs/registrator:master
        container_name: registrator
        network_mode: "host"
        depends_on:
          - "consul"
        volumes:
          - "/var/run/docker.sock:/tmp/docker.sock"
        command: -tags registrator -retry-attempts 10 -retry-interval 5000 consul://localhost:8500

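- Consul: four files are shared into the Consul container: this server's certificate and key, the self-signed CA's certificate, and the Consul node's configuration file. For how to sign the certificates, see the official documentation: https://www.consul.io/docs/agent/encryption.html
- Registrator: -tags registrator adds this tag to every service registered through Registrator, mainly to make them easy to identify. The -retry options are there because, even with depends_on set to consul, Registrator starts at almost the same time as Consul (depends_on only orders container startup; it does not wait for Consul to be ready), so the first connection attempt may fail.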
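
The retry flags exist because start order is not the same as readiness. As a rough illustration of that idea (not Registrator's actual implementation), here is a minimal sketch that retries a probe a fixed number of times with a fixed pause. The function name wait_until_ready and the injectable probe and sleep parameters are hypothetical, chosen so the logic can be exercised without a running Consul agent; the defaults simply mirror the 10 and 5000 from the command line above, and I am assuming the count covers every try including the first.

```python
import time
from typing import Callable


def wait_until_ready(
    probe: Callable[[], bool],
    attempts: int = 10,
    interval_ms: int = 5000,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call probe() until it returns True and return how many calls were made.

    Raises RuntimeError if every one of the `attempts` tries fails.
    """
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        # Pause between tries, but not after the final failed one.
        if attempt < attempts:
            sleep(interval_ms / 1000)
    raise RuntimeError(f"backend not ready after {attempts} attempts")
```

In practice the probe could be something like an HTTP GET against the agent's /v1/status/leader endpoint, returning True once it answers.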
My config.json looks like this:

    {
      "datacenter": "google",
      "node_name": "gce-node1",
      "server": true,
      "bootstrap": true,
      "ui": true,
      "ports": {
        "https": 8501
      },
      "bind_addr": "lan ip",
      "advertise_addr_wan": "wan ip",
      "client_addr": "",
      "encrypt": "=================",
      "key_file": "/etc/pki/tls/private/server-key.pem",
      "cert_file": "/etc/pki/tls/certs/server.pem",
      "ca_file": "/etc/pki/tls/certs/consul-ca.pem",
      "verify_incoming": true,
      "verify_outgoing": true,
      "verify_server_hostname": true
    }

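- datacenter and node_name need no explanation.
- server puts this agent in server mode, and bootstrap puts it in bootstrap mode; this is the first Consul node, so it has to bootstrap the cluster.
- ui enables the web UI on the HTTP (HTTPS) port. Current Consul versions bundle the static assets, so there is no need to download them manually and point ui-dir at them.
- encrypt is the key used to encrypt communication within the cluster (the gossip traffic); it is covered on the official page (https://www.consul.io/docs/agent/encryption.html).
- key_file, cert_file, ca_file, verify_incoming, verify_outgoing and verify_server_hostname are also described in detail on the official Encryption page, so I won't repeat that here.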
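
Since the TLS options depend on one another, a quick consistency check before starting the agent can save a restart cycle. The sketch below is my own sanity check, not Consul's validation logic: the function name check_tls_settings and the rules it applies are assumptions (for example, it ignores ca_path as an alternative to ca_file), and the sample values are the placeholders from the config above.

```python
import json


def check_tls_settings(config: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    # Serving HTTPS needs a certificate and its private key.
    if "https" in config.get("ports", {}):
        for key in ("cert_file", "key_file"):
            if not config.get(key):
                problems.append(f"ports.https is set but {key} is missing")
    # Every verify_* option needs a CA to verify against.
    for flag in ("verify_incoming", "verify_outgoing", "verify_server_hostname"):
        if config.get(flag) and not config.get("ca_file"):
            problems.append(f"{flag} is enabled but ca_file is missing")
    # Without a gossip key, cluster traffic is not encrypted.
    if not config.get("encrypt"):
        problems.append("encrypt is missing")
    return problems


# Placeholder values copied from the config.json in this post.
SAMPLE_CONFIG = json.loads("""
{
  "ports": {"https": 8501},
  "encrypt": "=================",
  "key_file": "/etc/pki/tls/private/server-key.pem",
  "cert_file": "/etc/pki/tls/certs/server.pem",
  "ca_file": "/etc/pki/tls/certs/consul-ca.pem",
  "verify_incoming": true,
  "verify_outgoing": true,
  "verify_server_hostname": true
}
""")
```

Calling check_tls_settings(SAMPLE_CONFIG) returns an empty list; dropping ca_file from the dict makes it report one problem per verify_* flag.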
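Run docker-compose up to check that everything starts correctly; if something goes wrong, consult the official documentation. Then move on to the last step: consul-template.
Next, write the consul-template configuration file: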

    # See https://github.com/hashicorp/consul-template
    consul {
      # This is the address of the Consul agent. By default, this is
      # 127.0.0.1:8500, which is the default bind and port for a local Consul
      # agent. It is not recommended that you communicate directly with a Consul
      # server, and instead communicate with the local Consul agent. There are many
      # reasons for this, most importantly the Consul agent is able to multiplex
      # connections to the Consul server and reduce the number of open HTTP
      # connections. Additionally, it provides a "well-known" IP address for which
      # clients can connect.
      address = ""

      # This is the ACL token to use when connecting to Consul. If you did not
      # enable ACLs on your Consul cluster, you do not need to set this option.
      # This option is also available via the environment variable CONSUL_TOKEN.
      # token = "abcd1234"

      # This controls the retry behavior when an error is returned from Consul.
      # Consul Template is highly fault tolerant, meaning it does not exit in the
      # face of failure. Instead, it uses exponential back-off and retry functions
      # to wait for the cluster to become available, as is customary in distributed
      # systems.
      retry {
        # This enables retries. Retries are enabled by default, so this is
        # redundant.
        enabled = true

        # This specifies the number of attempts to make before giving up. Each
        # attempt adds the exponential backoff sleep time. Setting this to
        # zero will implement an unlimited number of retries.
        attempts = 12

        # This is the base amount of time to sleep between retry attempts. Each
        # retry sleeps for an exponent of 2 longer than this base. For 5 retries,
        # the sleep times would be: 250ms, 500ms, 1s, 2s, then 4s.
        backoff = "250ms"

        # This is the maximum amount of time to sleep between retry attempts.
        # When max_backoff is set to zero, there is no upper limit to the
        # exponential sleep between retry attempts.
        # If max_backoff is set to 10s and backoff is set to 1s, sleep times
        # would be: 1s, 2s, 4s, 8s, 10s, 10s, ...
        max_backoff = "1m"
      }

      ssl {
        # This enables SSL. Specifying any option for SSL will also enable it.
        enabled = true

        # This enables SSL peer verification. The default value is "true", which
        # will check the global CA chain to make sure the given certificates are
        # valid. If you are using a self-signed certificate that you have not added
        # to the CA chain, you may want to disable SSL verification. However, please
        # understand this is a potential security vulnerability.
        verify = false

        # This is the path to the certificate to use to authenticate. If just a
        # certificate is provided, it is assumed to contain both the certificate and
        # the key to convert to an X509 certificate. If both the certificate and
        # key are specified, Consul Template will automatically combine them into an
        # X509 certificate for you.
        cert = "/root/service_discovery/cert/client.pem"
        key = "/root/service_discovery/cert/client-key.pem"

        # This is the path to the certificate authority to use as a CA. This is
        # useful for self-signed certificates or for organizations using their own
        # internal certificate authority.
        ca_cert = "/root/service_discovery/cert/consul-ca.pem"

        # This sets the SNI server name to use for validation.
        server_name = "server.google-tw.consul"
      }
    }

    # This block defines the configuration for connecting to a syslog server for
    # logging.
    syslog {
      # This enables syslog logging. Specifying any other option also enables
      # syslog logging.
      #enabled = true

      # This is the name of the syslog facility to log to.
      #facility = "LOCAL5"
    }

    template {
      source = "/root/service_discovery/consul-template/upstream-everyclass-server.conf"
      destination = "/www/server/panel/vhost/nginx/upstream-server.conf"
      command = "nginx -s reload"

      # This option backs up the previously rendered template at the destination
      # path before writing a new one. It keeps exactly one backup. This option is
      # useful for preventing accidental changes to the data without having a
      # rollback strategy.
      backup = true

      wait {
        min = "2s"
        max = "10s"
      }
    }

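
The retry comments in the configuration above describe exponential backoff with a cap. To see what the chosen values add up to, here is a small worked sketch that follows those comments literally; the function name backoff_schedule is hypothetical, and this models the documented behaviour rather than reproducing consul-template's code.

```python
def backoff_schedule(base_ms: int, max_ms: int, attempts: int) -> list:
    """Return the sleep time in milliseconds before each retry.

    The delay doubles on every attempt, starting from base_ms.
    A max_ms of zero means the delay is not capped.
    """
    delays = []
    for attempt in range(attempts):
        delay = base_ms * 2 ** attempt
        if max_ms > 0:
            delay = min(delay, max_ms)
        delays.append(delay)
    return delays


# The two examples from the comments in the configuration file.
readme_example = backoff_schedule(250, 0, 5)       # 250ms, 500ms, 1s, 2s, 4s
capped_example = backoff_schedule(1000, 10000, 6)  # 1s, 2s, 4s, 8s, 10s, 10s

# The settings used in this post: backoff 250ms, max_backoff 1m, 12 attempts.
post_schedule = backoff_schedule(250, 60_000, 12)
total_wait_ms = sum(post_schedule)
```

Under this model the delay hits the one-minute cap from the ninth retry onwards, and the twelve attempts together wait 303750 ms, a little over five minutes, before giving up.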
The official README (https://github.com/hashicorp/consul-template) documents every option in great detail and is a good reference. The template file referenced by source is:

    upstream everyclass-server {
        {{ range service "everyclass-server" }}
        server {{ .Address }}:{{ .Port }};
        {{ end }}
    }

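
To make the template's effect concrete, the sketch below mimics roughly what ends up in the rendered file. It is an illustration in Python, not consul-template itself: render_upstream is a hypothetical helper, the sample entries are made-up data shaped like the response of Consul's /v1/health/service/<name> endpoint, and the fallback from an empty service address to the node address is my understanding of how .Address is filled in.

```python
def render_upstream(name: str, entries: list) -> str:
    """Build an nginx upstream block from Consul health-service entries."""
    lines = [f"upstream {name} {{"]
    for entry in entries:
        service = entry["Service"]
        # Assumption: an empty service address falls back to the node address.
        address = service.get("Address") or entry["Node"]["Address"]
        lines.append(f"    server {address}:{service['Port']};")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical instances; addresses and ports are invented for illustration.
SAMPLE_ENTRIES = [
    {"Node": {"Address": "10.0.0.11"}, "Service": {"Address": "", "Port": 32768}},
    {"Node": {"Address": "10.0.0.12"}, "Service": {"Address": "10.0.0.12", "Port": 32770}},
]

rendered = render_upstream("everyclass-server", SAMPLE_ENTRIES)
```

One thing this makes visible: with no registered instances the block comes out empty, and nginx refuses to load an upstream that contains no servers, so it is worth making sure at least one instance is registered before the first reload.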
In nginx's server block:

    location / {
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_pass http://everyclass-server;
    }

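
These headers only help if the backend actually reads them. As a minimal sketch of the receiving side, assuming nginx is the single trusted proxy in front of the service, the hypothetical helper client_ip below prefers X-Real-IP and otherwise takes the last hop of X-Forwarded-For, which is the address nginx itself appended; earlier hops can be supplied by the client and should not be trusted.

```python
def client_ip(headers: dict, peer_addr: str) -> str:
    """Best-effort client address for a request that came through one proxy."""
    real_ip = headers.get("X-Real-IP", "").strip()
    if real_ip:
        return real_ip
    # $proxy_add_x_forwarded_for appends $remote_addr to any incoming value,
    # so the last comma-separated entry is the one written by nginx.
    hops = [h.strip() for h in headers.get("X-Forwarded-For", "").split(",") if h.strip()]
    if hops:
        return hops[-1]
    # No proxy headers at all: fall back to the socket peer address.
    return peer_addr
```

For example, client_ip({"X-Forwarded-For": "198.51.100.9, 203.0.113.7"}, "127.0.0.1") yields "203.0.113.7".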
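I split the upstream and server parts into two files mainly because the server part may be edited through the server's web control panel; if it lived inside the consul-template template, it could only be changed by hand on the machine, which is inconvenient.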
Finally, start consul-template with this configuration:

    consul-template -config "/root/service_discovery/consul-template/config.hcl"