How to bulk-import nginx logs into MongoDB

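I recently needed to compute some statistics from nginx logs. Previously I had used Filebeat + Logstash + Elasticsearch to load the logs into ES for this purpose, but I have since deleted that environment, and rebuilding a full ELK stack would take a fair amount of time. So I decided to import the nginx log into MongoDB instead and run some simple statistics there.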

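  1. Take the first 100 lines of a large log file to test the text processing. The goal is to convert the nginx log into CSV format so that it can be imported into MongoDB.
# head -n 100 access.2019.09.26.log > access.test.log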

Below is the nginx log format on my server, followed by a few sample lines.

log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for" "$http_host"';
115.223.196.221 - - [26/Sep/2019:00:00:22 +0800] "GET /static/js/jquery-1.6.2.min.js?ver=1.0 HTTP/1.1" 200 91556 "https://yhh5.sipaphoto.com/h5/fdc8bd32-9f7c-a94a-9984-cecbae942f9b.html?__noHeadPic__=0&__adcf__=&__docSource__=91110105MA01GJ8G99&t_oaid=ls_m3%2BCoXIKqR0Un2Rd7424SQ%3D%3D" "Mozilla/5.0 (Linux; Android 5.1; OPPO A37m Build/LMY47I; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/43.0.2357.121 Mobile Safari/537.36" "115.223.196.221" "yhh5.sipaphoto.com"
123.139.85.100 - - [26/Sep/2019:00:00:22 +0800] "GET /static/js/jquery-1.6.2.min.js?ver=1.0 HTTP/1.1" 200 91556 "-" "Mozilla/5.0 (Linux; Android 7.1.1; OPPO R11 Build/NMF26X; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/55.0.2883.91 Mobile Safari/537.36" "123.139.85.100" "yhh5.sipaphoto.com"
140.143.221.200 - - [26/Sep/2019:00:00:22 +0800] "GET / HTTP/1.1" 200 27 "-" "-" "-" "yhh5.sipaphoto.com"
49.119.241.255 - - [26/Sep/2019:00:00:22 +0800] "GET /static/js/jquery-1.6.2.min.js?ver=1.0 HTTP/1.1" 200 91556 "-" "Mozilla/5.0 (Linux; Android 7.1.1; OPPO R11 Plus Build/NMF26X; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/55.0.2883.91 Mobile Safari/537.36" "49.119.241.255" "yhh5.sipaphoto.com"

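2. Target format
A CSV file has the field names on the first line and the data from the second line onward; each value is wrapped in double quotes and values are separated by commas.
For example: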

ip,date,request,status,body_bytes_sent,http_user_agent,http_x_forwarded_for,http_host
"49.119.241.255","26/Sep/2019:00:00:22 +0800","GET /static/js/jquery-1.6.2.min.js?ver=1.0 HTTP/1.1","200","91556","Mozilla/5.0 (Linux; Android 7.1.1; OPPO R11 Plus Build/NMF26X; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/55.0.2883.91 Mobile Safari/537.36","49.119.241.255","yhh5.sipaphoto.com"
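
One possible way to produce exactly this eight-column layout is to let awk split each line on the double quotes that the log_format already writes. This is only a sketch and not the method used in the processing steps of this post: the function name nginx_to_csv8 is made up for illustration, and it assumes every line follows the log_format shown above and that no quoted field contains an embedded double quote.

```shell
# Hypothetical helper: read nginx "main" format lines on stdin,
# write the 8-column CSV (header + rows) on stdout.
nginx_to_csv8() {
  awk -F'"' '
    BEGIN {
      print "ip,date,request,status,body_bytes_sent,http_user_agent,http_x_forwarded_for,http_host"
    }
    {
      # $1 holds: ip, remote user placeholders, and the bracketed timestamp
      split($1, head, " ")
      ip   = head[1]
      day  = substr(head[4], 2)                    # drop the leading [
      zone = substr(head[5], 1, length(head[5]) - 1)  # drop the trailing ]

      # $3 holds the status code and body_bytes_sent, space separated
      split($3, nums, " ")

      # $2 = request, $6 = user agent, $8 = x_forwarded_for, $10 = host
      # ($4, the referer, is not part of the target layout)
      printf "\"%s\",\"%s %s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\"\n",
             ip, day, zone, $2, nums[1], nums[2], $6, $8, $10
    }'
}
```

It could be used as `nginx_to_csv8 < access.test.log > access.test.csv`. Because the User-Agent stays in a single column, every row has the same number of fields as the header.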

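3. Processing steps

3.1. Delete the special characters (double quotes, square brackets, parentheses, semicolons, commas), then turn the remaining spaces into CSV separators. In the commands below, the leading # is the shell prompt and the text after // is a note, not part of the command.
# perl -pi -e 's|"||g' access.test.log  // delete double quotes
# perl -pi -e 's/\[//g' access.test.log  // delete opening square brackets
# perl -pi -e 's/\]//g' access.test.log  // delete closing square brackets
# perl -pi -e 's/\(//g' access.test.log  // delete opening parentheses
# perl -pi -e 's/\)//g' access.test.log  // delete closing parentheses
# perl -pi -e 's/;//g' access.test.log  // delete semicolons
# perl -pi -e 's/,//g' access.test.log  // delete commas
# perl -pi -e 's/uuid\///g' access.test.log  // delete the string uuid/
# sed -i -e '/static/d' access.test.log  // delete lines containing the string static
# sed -i -e '/140\.143\.221\.200/d' access.test.log  // delete lines from IP 140.143.221.200; these are empty requests from the load balancer and have no statistical value
# perl -pi -e 's| |","|g' access.test.log  // replace every space with ","
# sed -i 's/^/"/' access.test.log  // add a double quote at the start of each line
# sed -i 's/$/"/' access.test.log  // add a double quote at the end of each line
# perl -pi -e 's/"-",//g' access.test.log  // delete every "-", placeholder field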
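
The same sequence can also be written as a single sed pipeline that reads the log and writes the result elsewhere, so the test file is not modified in place. The following is a sketch under that assumption; the function name clean_nginx_log is hypothetical, and the expressions are meant to mirror the commands above in the same order.

```shell
# Hypothetical helper: apply the cleanup steps above to stdin, write to stdout.
clean_nginx_log() {
  sed -e 's/"//g' \
      -e 's/[][(),;]//g' \
      -e 's|uuid/||g' \
      -e '/static/d' \
      -e '/140\.143\.221\.200/d' \
      -e 's/ /","/g' \
      -e 's/^/"/' \
      -e 's/$/"/' \
      -e 's/"-",//g'
}
# Notes on the expressions, in order:
#   1. delete double quotes
#   2. delete [ ] ( ) , ;  (a leading ] inside a bracket expression is literal)
#   3. delete the string uuid/
#   4. drop lines that mention static
#   5. drop the load balancer's requests
#   6. turn every space into ","
#   7-8. wrap each line in double quotes
#   9. delete every "-", placeholder field
```

A possible invocation is `clean_nginx_log < access.test.log > access.clean.log`, where the output file name is only an example.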

After processing, the file looks like this:

"113.16.250.180","26/Sep/2019:00:00:06","+0800","GET","/h5/fdc8bd32-9f7c-a94a-9984-cecbae942f9b.html?__noHeadPic__=0&__adcf__=&__docSource__=91110105MA01GJ8G99&t_oaid=ls_o48XbnwI7W8RzbZZDDmfpw%3D%3D","HTTP/1.1","200","13553","Mozilla/5.0","Linux","Android","6.0.1","OPPO","R9s","Build/MMB29M","wv","AppleWebKit/537.36","KHTML","like","Gecko","Version/4.0","Chrome/55.0.2883.91","Mobile","Safari/537.36","113.16.250.180","yhh5.sipaphoto.com"
"106.124.36.115","26/Sep/2019:00:00:11","+0800","GET","/h5/fdc8bd32-9f7c-a94a-9984-cecbae942f9b.html?__noHeadPic__=0&__adcf__=&__docSource__=91110105MA01GJ8G99&t_oaid=ls_PBvTCu1u1ijnBgWfn1liNw%3D%3D","HTTP/1.1","200","13553","Mozilla/5.0","Linux","Android","5.1","OPPO","A59s","Build/LMY47I","wv","AppleWebKit/537.36","KHTML","like","Gecko","Version/4.0","Chrome/43.0.2357.121","Mobile","Safari/537.36","106.124.36.115","yhh5.sipaphoto.com"
"106.127.214.37","26/Sep/2019:00:00:14","+0800","GET","/h5/45af4e52-6043-6725-69b5-a43d45306488.html?__noHeadPic__=0&__adcf__=0_21_1&__docSource__=91110105MA01GJ8G99","HTTP/1.1","200","11890","Mozilla/5.0","Linux","Android","8.1.0","PBBM00","Build/O11019","wv","AppleWebKit/537.36","KHTML","like","Gecko","Version/4.0","Chrome/62.0.3202.84","Mobile","Safari/537.36","pictorial_version/5.6.1","4ae2c231da7b717c4b5d18ff11f497cb","channel/OPPO","language/zh-CN","106.127.214.37","yhh5.sipaphoto.com"
"117.136.39.203","26/Sep/2019:00:00:16","+0800","GET","/h5/fdc8bd32-9f7c-a94a-9984-cecbae942f9b.html?__noHeadPic__=0&__adcf__=&__docSource__=91110105MA01GJ8G99&t_oaid=ls_Ist8GUiaQVIhZQX48WxyvQ%3D%3D","HTTP/1.1","200","13553","Mozilla/5.0","Linux","Android","6.0.1","OPPO","A57t","Build/MMB29M","wv","AppleWebKit/537.36","KHTML","like","Gecko","Version/4.0","Chrome/55.0.2883.91","Mobile","Safari/537.36","117.136.39.203","yhh5.sipaphoto.com"
"223.104.64.8","26/Sep/2019:00:00:18","+0800","GET","/h5/817bd461-731b-4bcf-9da2-f511796914aa.html?__noHeadPic__=0&__adcf__=0_21_1&__docSource__=91110105MA01GJ8G99","HTTP/1.1","200","15699","Mozilla/5.0","Linux","Android","7.1.1","OPPO","A83","Build/N6F26Q","wv","AppleWebKit/537.36","KHTML","like","Gecko","Version/4.0","Chrome/62.0.3202.84","Mobile","Safari/537.36","pictorial_version/5.7.1","337f6ac48692146db8f5123eb14fa82e","channel/OPPO","language/zh-CN","223.104.64.8","yhh5.sipaphoto.com"
"223.104.1.199","26/Sep/2019:00:00:21","+0800","GET","/h5/e748af6e-ad70-2b49-729f-bb7e105733ee.html?__noHeadPic__=0&__adcf__=0_21_1&__docSource__=91110105MA01GJ8G99","HTTP/1.1","200","11375","Mozilla/5.0","Linux","Android","8.1.0","PBBT00","Build/O11019","wv","AppleWebKit/537.36","KHTML","like","Gecko","Version/4.0","Chrome/62.0.3202.84","Mobile","Safari/537.36","pictorial_version/5.7.1","4126a51d25e779b9c31fbcef7e8902b0","channel/OPPO","language/zh-CN","223.104.1.199","yhh5.sipaphoto.com"
Add the corresponding field names as the first line:
# sed -i "1i ip,date,timezone,request_mode,url,server_protocol,status,body_bytes_sent,agent_0,agent_1,agent_2,agent_3,agent_4,agent_5,agent_6,agent_7,agent_8,agent_9,agent_10,agent_11,agent_12,agent_13,agent_14,agent_15,uuid,mobile_mode,language,host_ip,host_name" access.test.log
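
The header lists 29 names, while the sample rows above contain 26, 29, or 30 fields depending on how many words the User-Agent splits into, so it may be worth checking which rows actually line up with the header before importing. A minimal sketch of such a check follows; the function name check_columns is hypothetical, and splitting on commas is only safe here because the earlier steps deleted every comma from the data.

```shell
# Hypothetical helper: read the CSV (header on line 1) from stdin and
# report every data row whose column count differs from the header.
check_columns() {
  awk -F',' '
    NR == 1 { want = NF; next }   # remember the header width
    NF != want { printf "line %d: %d fields, header has %d\n", NR, NF, want }
  '
}
```

Running `check_columns < access.test.log` prints nothing when every row has as many fields as the header.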

Field mapping

"223.104.1.199"=>ip
"26/Sep/2019:00:00:21"=>date
"+0800"=>timezone
"GET"=>request_mode
"/h5/e748af6e-ad70-2b49-729f-bb7e105733ee.html?__noHeadPic__=0&__adcf__=0_21_1&__docSource__=91110105MA01GJ8G99"=>url
"HTTP/1.1"=>server_protocol
"200"=>status
"11375"=>body_bytes_sent
"Mozilla/5.0"=>agent_0
"Linux"=>agent_1
"Android"=>agent_2
"8.1.0"=>agent_3
"PBBT00"=>agent_4
"Build/O11019"=>agent_5
"wv"=>agent_6
"AppleWebKit/537.36"=>agent_7
"KHTML"=>agent_8
"like"=>agent_9
"Gecko"=>agent_10
"Version/4.0"=>agent_11
"Chrome/62.0.3202.84"=>agent_12
"Mobile"=>agent_13
"Safari/537.36"=>agent_14
"pictorial_version/5.7.1"=>agent_15
"4126a51d25e779b9c31fbcef7e8902b0"=>uuid
"channel/OPPO"=>mobile_mode
"language/zh-CN"=>language
"223.104.1.199"=>host_ip
"yhh5.sipaphoto.com"=>host_name
Importing the CSV-formatted nginx log into MongoDB

1. Change the extension of the formatted log file to .csv
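
As a rough sketch of this step and the import that would follow it, the file can be given a .csv name and handed to mongoimport, which reads the field names from the first line when --headerline is set. The function name import_nginx_csv and the database and collection names are placeholders not taken from this post, and the sketch assumes a MongoDB server reachable on the default local address. It copies the file rather than renaming it so that the cleaned log is kept.

```shell
# Hypothetical helper: copy the cleaned log to a .csv file and load it with mongoimport.
# Arguments: <cleaned log file> <database> <collection>
import_nginx_csv() {
  src=$1
  csv="${src%.log}.csv"          # access.test.log -> access.test.csv
  cp "$src" "$csv" &&
  mongoimport --db "$2" --collection "$3" --type csv --headerline --file "$csv"
}

# Example call (database and collection names are placeholders):
# import_nginx_csv access.test.log nginxlog access
```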
