Background: cache03 serves as the traffic-distribution server; cache02 and cache01 serve as application servers.
nginx log directory: /usr/servers/nginx/logs
1. Reporting access traffic from nginx+lua into Kafka
At the nginx application-server layer, each incoming request is reported to Kafka as soon as it is received.
Storm can then consume the real-time access log from Kafka and compute statistics on hot cache data.
The technical approach is very simple: create a Kafka producer directly from the Lua script and send the data to Kafka.
Download the lua-resty-kafka package:
wget https://github.com/doujiang24/lua-resty-kafka/archive/master.zip
yum install -y unzip
unzip master.zip
cp -rf /usr/local/lua-resty-kafka-master/lib/resty /usr/hello/lualib
Reload nginx:
/usr/servers/nginx/sbin/nginx -s reload
Edit product.lua and add the Kafka reporting code:
vi /usr/hello/lua/product.lua
local cjson = require("cjson")
local producer = require("resty.kafka.producer")

-- the three Kafka brokers (must match their advertised addresses)
local broker_list = {
    { host = "139.199.10.125", port = 9092 },
    { host = "111.230.234.30", port = 9092 },
    { host = "139.199.6.253", port = 9092 }
}

-- collect the request into a JSON log entry
local log_json = {}
log_json["headers"] = ngx.req.get_headers()
log_json["uri_args"] = ngx.req.get_uri_args()
log_json["http_version"] = ngx.req.http_version()
log_json["method"] = ngx.req.get_method()
log_json["raw_reader"] = ngx.req.raw_header()
-- read_body() returns nothing; it must be called before get_body_data()
ngx.req.read_body()
log_json["body_data"] = ngx.req.get_body_data()

local message = cjson.encode(log_json)

-- use productId as the message key, so logs for one product land in one partition
local productId = ngx.req.get_uri_args()["productId"]

local async_producer = producer:new(broker_list, { producer_type = "async" })
local ok, err = async_producer:send("access-log", productId, message)
if not ok then
    ngx.log(ngx.ERR, "kafka send err:", err)
    return
end
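For context, a minimal sketch of how product.lua might be wired into nginx via OpenResty's content_by_lua_file; the /product location name is inferred from the test URL below, and the paths follow the directories used in the earlier steps (adjust to your actual layout):

```nginx
# http block: make /usr/hello/lualib (where resty/kafka was copied) loadable
lua_package_path "/usr/hello/lualib/?.lua;;";

server {
    listen 80;

    # hypothetical location; product.lua both serves the page and reports to Kafka
    location /product {
        default_type 'text/html';
        content_by_lua_file /usr/hello/lua/product.lua;
    }
}
```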
In nginx.conf on the nginx application servers, add `resolver 8.8.8.8;` to the http block, then reload nginx:
vi /usr/servers/nginx/conf/nginx.conf
/usr/servers/nginx/sbin/nginx -s reload
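The resolver change above amounts to this fragment of nginx.conf (only the resolver line is new; everything else stands for the existing configuration):

```nginx
http {
    # lets the Lua Kafka client resolve hostnames at runtime
    resolver 8.8.8.8;

    # ... existing http-level configuration ...
}
```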
On each Kafka broker, set advertised.host.name to the local IP address in server.properties (on newer Kafka versions, use advertised.listeners as shown below) and restart all three Kafka processes:
vi /usr/local/kafka/config/server.properties
on cache01: advertised.listeners=PLAINTEXT://139.199.10.125:9092
on cache02: advertised.listeners=PLAINTEXT://111.230.234.30:9092
on cache03: advertised.listeners=PLAINTEXT://139.199.6.253:9092
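For example, server.properties on cache01 would contain something like the following; the broker.id and listeners values are illustrative (each broker needs a unique id, which your existing config already defines):

```properties
# cache01 (139.199.10.125) -- advertise the public IP, not the hostname
broker.id=0
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://139.199.10.125:9092
```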
Note: if the hostnames cache01/cache02/cache03 are written here instead of IPs, they may fail to resolve, and an error like the following is reported:
[lua] producer.lua:258: buffered messages send to kafka err: no resolver defined to resolve "cache02",
retryable: true, topic: access-log, partition_id: 0, length: 1, context: ngx.timer, client: 139.199.6.253,
server: 0.0.0.0:80
Reload nginx on both nginx application servers:
/usr/servers/nginx/sbin/nginx -s reload
Access: http://139.199.6.253/product?requestPath=product&productId=2&shopId=2
The consumer then shows the corresponding message, for example:
{"method":"GET","http_version":1.1,"raw_reader":"GET \/product?productId=1&shopId=1 HTTP\/1.1\r\nHost: 111.230.234.30\r\nUser-Agent: lua-resty-http\/0.11 (Lua) ngx_lua\/9014\r\n\r\n","uri_args":{"productId":"1","shopId":"1"},"headers":{"host":"111.230.234.30","user-agent":"lua-resty-http\/0.11 (Lua) ngx_lua\/9014"}}
This completes the nginx+lua+kafka real-time reporting feature.
---------------------------------------------------------
Do all of the above on both nginx application servers so that traffic from both is reported to Kafka.
Create the topic and start a console consumer to verify:
bin/kafka-topics.sh --zookeeper cache01:2181,cache02:2181,cache03:2181 --topic access-log --replication-factor 1 --partitions 1 --create
bin/kafka-console-consumer.sh --zookeeper cache01:2181,cache02:2181,cache03:2181 --topic access-log --from-beginning
(1) The Kafka node on 187 died, possibly due to a VM issue; kill the process and restart it:
nohup bin/kafka-server-start.sh config/server.properties &
(2) Add `resolver 8.8.8.8;` to the http block of nginx.conf on the nginx application servers, then reload nginx:
vi /usr/servers/nginx/conf/nginx.conf
/usr/servers/nginx/sbin/nginx -s reload
(3) Set advertised.host.name to the local IP address in each Kafka broker's server.properties (on newer Kafka versions, use advertised.listeners as shown below) and restart all three Kafka processes:
vi /usr/local/kafka/config/server.properties
on cache01: advertised.listeners=PLAINTEXT://139.199.10.125:9092
on cache02: advertised.listeners=PLAINTEXT://111.230.234.30:9092
on cache03: advertised.listeners=PLAINTEXT://139.199.6.253:9092
Note: if the hostnames cache01/cache02/cache03 are written here instead of IPs, they may fail to resolve, and an error like the following is reported:
[lua] producer.lua:258: buffered messages send to kafka err: no resolver defined to resolve "cache02",
retryable: true, topic: access-log, partition_id: 0, length: 1, context: ngx.timer, client: 139.199.6.253,
server: 0.0.0.0:80
(4) Start the eshop-cache service, since nginx's local cache may no longer be populated.