grep: searching for data on Linux

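The three classic Linux text-processing tools are grep, awk, and sed. Roughly speaking, they cover searching, splitting text into fields, and editing, respectively.

This post focuses on searching.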

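What the name means

grep is short for `global regular expression print`. It searches its input for lines that match a regular expression and prints them, which makes it the usual tool for finding and locating data.

Using grep therefore starts with knowing regular expressions. They are not covered here, since plenty of material is available online.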

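grep commands that come up often at work, and what they do:

Basic operations:
grep pattern file    print the lines that match pattern
grep -i pattern file    ignore case
grep -v pattern file    invert the match: print only the lines that do not match
grep -o pattern file    print only the matched text, one match per line (very handy)
grep -E pattern file    use extended regular expressions
# Note: grep 'a[0-9]\{10\}' is equivalent to grep -E 'a[0-9]{10}'
grep -e pattern1 -e pattern2 -e pattern3 file    match any of several patterns
grep -f pattern_file file    read the patterns from pattern_file, one per line, when there are too many for the command line
grep pattern -r dir/    search a directory recursively
grep -w pattern file    match whole words only; for example, the pattern aa does not match aaab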
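
The basic options above can be tried on a throwaway file. The following is a minimal sketch, assuming a grep that supports `-o` (GNU and BSD grep both do); the file contents and variable names are made up for the demo.

```shell
#!/bin/sh
# Hypothetical scratch file, created only for this demo.
demo=$(mktemp)
printf '%s\n' 'aa bb' 'AA cc' 'aaab dd' 'id a0123456789' > "$demo"

# -i: ignore case, so "aa bb", "AA cc" and "aaab dd" all match.
ignore_case=$(grep -i 'aa' "$demo")

# -v: invert the match, keeping only lines WITHOUT "aa".
inverted=$(grep -v 'aa' "$demo")

# -w: whole words only, so "aaab" is not a match for "aa".
whole_word=$(grep -w 'aa' "$demo")

# -o with a basic regex and with -E: same match, different escaping.
bre=$(grep -o 'a[0-9]\{10\}' "$demo")
ere=$(grep -o -E 'a[0-9]{10}' "$demo")

# -e twice: a line matches if either pattern matches.
either=$(grep -e 'bb' -e 'cc' "$demo")

printf '%s\n' "-i:" "$ignore_case" "-v:" "$inverted" "-w:" "$whole_word"
printf '%s\n' "BRE: $bre" "ERE: $ere" "-e:" "$either"

rm -f "$demo"
```

Keeping each result in a variable is only a convenience for printing them side by side; on the command line the same options are normally used directly.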

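Additional operations:
grep -2 pattern file    also print 2 lines of context before and after each match (any number works; equivalent to -C 2)
grep -m1 pattern file    stop after the first matching line
grep -n pattern file    prefix each output line with its line number
grep -l pattern file1 file2    print only the names of the files that contain a match, for example to see which files mention aaa without caring about the context
grep -c pattern file    print the number of matching lines instead of the matches themselves; a line that contains aaa three times still counts as 1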
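
Because `-c` counts lines rather than occurrences, the difference is easy to miss. The sketch below contrasts it with `grep -o ... | wc -l`, which counts every occurrence, and also exercises `-n`, `-m1`, and `-l`. The directory and file names are hypothetical and exist only for the demo.

```shell
#!/bin/sh
# Hypothetical scratch directory with two small files.
dir=$(mktemp -d)
printf '%s\n' 'aaa aaa' 'bbb' 'aaa' > "$dir/one.txt"
printf '%s\n' 'ccc' > "$dir/two.txt"

# -c counts matching LINES: lines 1 and 3 contain "aaa".
matching_lines=$(grep -c 'aaa' "$dir/one.txt")

# -o prints each occurrence on its own line, so wc -l counts occurrences.
occurrences=$(grep -o 'aaa' "$dir/one.txt" | wc -l | tr -d ' ')

# -n prefixes the line number; -m1 stops after the first matching line.
numbered=$(grep -n 'bbb' "$dir/one.txt")
first_only=$(grep -m1 'aaa' "$dir/one.txt")

# -l lists only the files that contain a match.
files=$(cd "$dir" && grep -l 'aaa' one.txt two.txt)

echo "matching lines: $matching_lines"
echo "occurrences:    $occurrences"
echo "numbered:       $numbered"
echo "first only:     $first_only"
echo "files:          $files"

rm -rf "$dir"
```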

Example 1

Find every string of the form m_array_size = number in the data below and print each one on its own line.

Sample data

$16 = { m_id = {
    category = 0 '\000', app = 0, iID = 0, sID = ""}, m_ref = {category = 0 '\000', app = 0, iID = 0, sID = ""}, m_meta = std::map with 28 elements = {[0] = {m_type = 1 '\001', m_array_size = 20, m_value = {i8 = -48 '\320',
        p_i8 = 0x55fdc5ac3cd0 "\n\002\001\004\006\354\362$\202\200\003\aweeeeee\202", i16 = 15568, p_i16 = 0x55fdc5ac3cd0, i32 = -978567984, p_i32 = 0x55fdc5ac3cd0, i64 = 94548431486160, p_i64 = 0x55fdc5ac3cd0}}, [1] = {
      m_type = 1 '\001', m_array_size = 100, m_value = {i8 = 32 ' ', p_i8 = 0x55fdc5ac1020 "dd\202\200\005'\002\001$0E417346-9621-C7A4-A832-7C5EE4FDE1B3\202\227h", i16 = 4128, p_i16 = 0x55fdc5ac1020, i32 = -978579424,
        p_i32 = 0x55fdc5ac1020, i64 = 94548431474720, p_i64 = 0x55fdc5ac1020}}, [2] = {m_type = 1 '\001', m_array_size = 39, m_value = {i8 = -32 '\340', p_i8 = 0x55fdc5ac40e0 "\002\001$5F4652E6-255D-B499-FF4F-4A00DE54FA97",
        i16 = 16608, p_i16 = 0x55fdc5ac40e0, i32 = -978566944, p_i32 = 0x55fdc5ac40e0, i64 = 94548431487200, p_i64 = 0x55fdc5ac40e0}}, [5] = {m_type = 3 '\003', m_array_size = 0, m_value = {i8 = 2 '\002',
        p_i8 = 0x55fd00000002 <Address 0x55fd00000002 out of bounds>, i16 = 2, p_i16 = 0x55fd00000002, i32 = 2, p_i32 = 0x55fd00000002, i64 = 94545115086850, p_i64 = 0x55fd00000002}}, [6] = {m_type = 1 '\001', m_array_size = 49,
      m_value = {i8 = 48 '0', p_i8 = 0x55fdc5ac1430 "666677646440", i16 = 5168, p_i16 = 0x55fdc5ac1430, i32 = -978578384, p_i32 = 0x55fdc5ac1430, i64 = 94548431475760, p_i64 = 0x55fdc5ac1430}}, [22] = {m_type = 1 '\001',
      m_array_size = 52, m_value = {i8 = 64 '@', p_i8 = 0x55fdc5ac1840 "479-B77872C4EBB6\202\227h", i16 = 6208, p_i16 = 0x55fdc5ac1840, i32 = -978577344, p_i32 = 0x55fdc5ac1840, i64 = 94548431476800, p_i64 = 0x55fdc5ac1840}},
    [25] = {m_type = 1 '\001', m_array_size = 49, m_value = {i8 = -96 '\240', p_i8 = 0x55fdc5ac30a0 "D3-36BC-D5FB-7134A7B12984\202\227h", i16 = 12448, p_i16 = 0x55fdc5ac30a0, i32 = -978571104, p_i32 = 0x55fdc5ac30a0,
        i64 = 94548431483040, p_i64 = 0x55fdc5ac30a0}}, [27] = {m_type = 1 '\001', m_array_size = 55, m_value = {i8 = -64 '\300', p_i8 = 0x55fdc5ac38c0 "9B5\202\227h", i16 = 14528, p_i16 = 0x55fdc5ac38c0, i32 = -978569024,
        p_i32 = 0x55fdc5ac38c0, i64 = 94548431485120, p_i64 = 0x55fdc5ac38c0}}, [28] = {m_type = 3 '\003', m_array_size = 0, m_value = {i8 = 68 'D', p_i8 = 0x20746e7500000044 <Address 0x20746e7500000044 out of bounds>, i16 = 68,
        p_i16 = 0x20746e7500000044, i32 = 68, p_i32 = 0x20746e7500000044, i64 = 2338615555302359108, p_i64 = 0x20746e7500000044}}, [32] = {m_type = 1 '\001', m_array_size = 116, m_value = {i8 = -80 '\260',
        p_i8 = 0x55fdc5ac34b0 "gvjxE3bQ", i16 = 13488, p_i16 = 0x55fdc5ac34b0, i32 = -978570064, p_i32 = 0x55fdc5ac34b0, i64 = 94548431484080, p_i64 = 0x55fdc5ac34b0}}, [33] = {m_type = 1 '\001', m_array_size = 48, m_value = {
        i8 = 96 '`', p_i8 = 0x55fdc5ac2060 "E-6E1D-47E4E70082BE\202\227h", i16 = 8288, p_i16 = 0x55fdc5ac2060, i32 = -978575264, p_i32 = 0x55fdc5ac2060, i64 = 94548431478880, p_i64 = 0x55fdc5ac2060}}, [34] = {m_type = 1 '\001',
      m_array_size = 57, m_value = {i8 = -128 '\200', p_i8 = 0x55fdc5ac2880 "84-03AD5F1C2697\202\227h", i16 = 10368, p_i16 = 0x55fdc5ac2880, i32 = -978573184, p_i32 = 0x55fdc5ac2880, i64 = 94548431480960, p_i64 = 0x55fdc5ac2880}},
    [37] = {m_type = 1 '\001', m_array_size = 103, m_value = {i8 = 112 'p', p_i8 = 0x55fdc5ac2470 "daQ", i16 = 9328, p_i16 = 0x55fdc5ac2470, i32 = -978574224, p_i32 = 0x55fdc5ac2470, i64 = 94548431479920, p_i64 = 0x55fdc5ac2470}},
    [47] = {m_type = 1 '\001', m_array_size = 52, m_value = {i8 = 0 '\000', p_i8 = 0x55fdc5ac0800 "gaER22MsixMbjMOuQ", i16 = 2048, p_i16 = 0x55fdc5ac0800, i32 = -978581504, p_i32 = 0x55fdc5ac0800, i64 = 94548431472640,
        p_i64 = 0x55fdc5ac0800}}, [49] = {m_type = 3 '\003', m_array_size = 0, m_value = {i8 = 86 'V', p_i8 = 0x56 <Address 0x56 out of bounds>, i16 = 86, p_i16 = 0x56, i32 = 86, p_i32 = 0x56, i64 = 86, p_i64 = 0x56}}, [51] = {
      m_type = 1 '\001', m_array_size = 108, m_value = {i8 = -112 '\220', p_i8 = 0x55fdc5ac2c90 "oria\034\001p", i16 = 11408, p_i16 = 0x55fdc5ac2c90, i32 = -978572144, p_i32 = 0x55fdc5ac2c90, i64 = 94548431482000,
        p_i64 = 0x55fdc5ac2c90}}, [55] = {m_type = 3 '\003', m_array_size = 0, m_value = {i8 = 27 '\033', p_i8 = 0x3d7265730000001b <Address 0x3d7265730000001b out of bounds>, i16 = 27, p_i16 = 0x3d7265730000001b, i32 = 27,
        p_i32 = 0x3d7265730000001b, i64 = 4427712928254263323, p_i64 = 0x3d7265730000001b}}, [57] = {m_type = 1 '\001', m_array_size = 101, m_value = {i8 = 80 'P', p_i8 = 0x55fdc5ac1c50 "anxxx wang\033", i16 = 7248,
        p_i16 = 0x55fdc5ac1c50, i32 = -978576304, p_i32 = 0x55fdc5ac1c50, i64 = 94548431477840, p_i64 = 0x55fdc5ac1c50}}, [1532] = {m_type = 1 '\001', m_array_size = 13, m_value = {i8 = 16 '\020',
        p_i8 = 0x55fdc5ac0c10 "1666678039768", i16 = 3088, p_i16 = 0x55fdc5ac0c10, i32 = -978580464, p_i32 = 0x55fdc5ac0c10, i64 = 94548431473680, p_i64 = 0x55fdc5ac0c10}}, [2560] = {m_type = 3 '\003', m_array_size = 0, m_value = {
        i8 = -20 '\354', p_i8 = 0x6131333400008bec <Address 0x6131333400008bec out of bounds>, i16 = -29716, p_i16 = 0x6131333400008bec, i32 = 35820, p_i32 = 0x6131333400008bec, i64 = 7003435193969183724,
        p_i64 = 0x6131333400008bec}}, [16385] = {m_type = 1 '\001', m_array_size = 5, m_value = {i8 = 0 '\000', p_i8 = 0x55fdc5ac4900 "Hello", i16 = 18688, p_i16 = 0x55fdc5ac4900, i32 = -978564864, p_i32 = 0x55fdc5ac4900,
        i64 = 94548431489280, p_i64 = 0x55fdc5ac4900}}, [16386] = {m_type = 1 '\001', m_array_size = 39, m_value = {i8 = 16 '\020', p_i8 = 0x55fdc5ac4d10 "\002\001$5141B4B7-C1B5-FFBE-4ED3-4EB2D324CCA3", i16 = 19728,
        p_i16 = 0x55fdc5ac4d10, i32 = -978563824, p_i32 = 0x55fdc5ac4d10, i64 = 94548431490320, p_i64 = 0x55fdc5ac4d10}}, [17908] = {m_type = 3 '\003', m_array_size = 0, m_value = {i8 = 0 '\000',
        p_i8 = 0x303d6c6100000000 <Address 0x303d6c6100000000 out of bounds>, i16 = 0, p_i16 = 0x303d6c6100000000, i32 = 0, p_i32 = 0x303d6c6100000000, i64 = 3476053651267518464, p_i64 = 0x303d6c6100000000}}, [17909] = {
      m_type = 1 '\001', m_array_size = 174, m_value = {i8 = -16 '\360', p_i8 = 0x55fdc5ac44f0 "", i16 = 17648, p_i16 = 0x55fdc5ac44f0, i32 = -978565904, p_i32 = 0x55fdc5ac44f0, i64 = 94548431488240, p_i64 = 0x55fdc5ac44f0}},
    [17910] = {m_type = 3 '\003', m_array_size = 0, m_value = {i8 = 0 '\000', p_i8 = 0x3030303000000000 <Address 0x3030303000000000 out of bounds>, i16 = 0, p_i16 = 0x3030303000000000, i32 = 0, p_i32 = 0x3030303000000000,
        i64 = 3472328295419215872, p_i64 = 0x3030303000000000}}, [17914] = {m_type = 1 '\001', m_array_size = 4, m_value = {i8 = 64 '@',
        p_i8 = 0x55fdc5ac5940 "\202\200\003\002q2\202\200\005'\002\001$7F4436AC-B402-5A13-B930-CA6AD9B7D48D\202\227h", i16 = 22848, p_i16 = 0x55fdc5ac5940, i32 = -978560704, p_i32 = 0x55fdc5ac5940, i64 = 94548431493440,
        p_i64 = 0x55fdc5ac5940}}, [17916] = {m_type = 1 '\001', m_array_size = 13, m_value = {i8 = 32 ' ', p_i8 = 0x55fdc5ac5120 "1666678085434\177", i16 = 20768, p_i16 = 0x55fdc5ac5120, i32 = -978562784, p_i32 = 0x55fdc5ac5120,
        i64 = 94548431491360, p_i64 = 0x55fdc5ac5120}}, [672555] = {m_type = 0 '\000', m_array_size = 0, m_value = {i8 = 3 '\003', p_i8 = 0x8003 <Address 0x8003 out of bounds>, i16 = -32765, p_i16 = 0x8003, i32 = 32771,
        p_i32 = 0x8003, i64 = 32771, p_i64 = 0x8003}}}, m_followers = empty std::list, m_spRef = {m_trustor = 0x0}, m_mbs = {m_imp = 0x0, m_ref = {lpref = 0x0}}}

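Looking at the data, the target always has the form m_array_size = number, so the matching regular expression is `m_array_size = [0-9]*`. The sample data is saved in a file named a.

Command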

grep -o "m_array_size = [0-9]*"  a

Result (the listing below stops after the first 11 matches)

$ grep -o "m_array_size = [0-9]*"  a
m_array_size = 20
m_array_size = 100
m_array_size = 39
m_array_size = 0
m_array_size = 49
m_array_size = 52
m_array_size = 49
m_array_size = 55
m_array_size = 0
m_array_size = 116
m_array_size = 48
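
One detail of the pattern: `[0-9]*` also accepts zero digits, so it would match a bare `m_array_size = ` as well. A slightly stricter variant is sketched below, assuming only the numbers themselves are wanted. It runs on a three-line excerpt of the sample data rather than the full dump, and the awk step is an addition for illustration, not part of the original command.

```shell
#!/bin/sh
# Short excerpt of the sample data above, written to a scratch file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
[0] = {m_type = 1 '\001', m_array_size = 20, m_value = {i8 = -48 '\320',
[1] = {m_type = 1 '\001', m_array_size = 100, m_value = {i8 = 32 ' ',
[5] = {m_type = 3 '\003', m_array_size = 0, m_value = {i8 = 2 '\002',
EOF

# [0-9][0-9]* requires at least one digit after "m_array_size = ".
# awk then keeps the third whitespace-separated field, the number.
sizes=$(grep -o 'm_array_size = [0-9][0-9]*' "$sample" | awk '{print $3}')

printf '%s\n' "$sizes"

rm -f "$sample"
```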

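Example 2

Find every string of the form m_array_size = number or m_type = number in the data and print each one on its own line.

Sample data: same as above

Looking at the data, `m_type = 1 '\001'` is always followed by a comma, so the regular expression can be written as `m_type = [^,]*`. As mentioned earlier, several patterns can be matched at once with `grep -e pattern1 -e pattern2` or `grep -f pattern_file`.

Command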

grep  -e "m_array_size = [0-9]*" -e "m_type = [^,]*" a -o

Result (only the first matches are listed)

$ grep  -e "m_array_size = [0-9]*" -e "m_type = [^,]*" a -o
m_type = 1 '\001'
m_array_size = 20
m_type = 1 '\001'
m_array_size = 100
m_type = 1 '\001'
m_array_size = 39
m_type = 3 '\003'
m_array_size = 0
m_type = 1 '\001'
m_array_size = 49
m_type = 1 '\001'
m_array_size = 52
m_type = 1 '\001'
m_array_size = 49
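
The `grep -f pattern_file` form mentioned above should give the same result as repeating `-e`. Here is a minimal sketch on a two-line excerpt of the sample data; the file name `patterns` is a hypothetical choice for the demo.

```shell
#!/bin/sh
# Scratch directory holding a short excerpt of the data and a pattern file.
work=$(mktemp -d)
cat > "$work/a" <<'EOF'
[0] = {m_type = 1 '\001', m_array_size = 20, m_value = {i8 = -48 '\320',
[5] = {m_type = 3 '\003', m_array_size = 0, m_value = {i8 = 2 '\002',
EOF

# One pattern per line, exactly as they would follow -e.
printf '%s\n' 'm_array_size = [0-9]*' 'm_type = [^,]*' > "$work/patterns"

from_file=$(grep -o -f "$work/patterns" "$work/a")
from_flags=$(grep -o -e 'm_array_size = [0-9]*' -e 'm_type = [^,]*' "$work/a")

printf '%s\n' "$from_file"

rm -rf "$work"
```

A pattern file is mostly worthwhile when the list is long or shared between scripts; for two patterns, `-e` is shorter.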

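Example 3

Print, one per line, every URL starting with http on the home page of testerhome, the largest software testing community site in China.

Getting the sample data

curl -s https://testerhome.com 

Looking at the data, a URL that starts with http normally ends at a double quote, so matching everything up to the next double quote captures the whole URL. By the same reasoning, the text matched in Example 1 ends at a comma, so that command could also be written as `grep -o 'm_arr[^,]*' a`.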

          <a href="https://testerhome.com/wiki/about">关于</a> /
          <a href="https://testerhome.com/users">活跃用户</a> /
          <a href="https://www.bagevent.com/event/7689076?bag_track=sqznpt"  target="_blank">中国移动互联网测试技术大会</a> /
          <a href="/topics/node13">反馈</a> /
          <a href="https://github.com/testerhome">Github</a> /
          <a href="https://testerhome.com//api-doc/">API</a> /
          <a href="/wiki/spreadtesterhome">帮助推广</a>
          <a style="" href="http://wetest.qq.com/?utm_sour

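So the regular expression is `http[^"]*`. The command below leaves the pattern unquoted, so the double quote is escaped with a backslash (`\"`) to keep the shell from treating it as the start of a quoted string. Wrapping the whole pattern in single quotes, `'http[^"]*'`, has the same effect and also protects the brackets and `*` from shell globbing.

Command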

curl -s https://testerhome.com | grep  href | grep -o http[^\"]*

Result

$ curl -s https://testerhome.com | grep  href | grep -o http[^\"]*
https://testerhome.com/topics/feed
https://testerhome.com/users/third_app_login/dragontesting
https://www.fucegaoshou.com
http://www.qacon.net/
https://testerhome.com/system/letter_avatars/h.png
http://www.qacon.net/
https://testerhome.com/articles/34499
https://testerhome.com/articles/34500
https://testerhome.com/topics/34482
https://testerhome.com/topics/34526
https://testerhome.com/topics/34556
https://testerhome.com/articles/33998
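
The same extraction can be checked offline with the single-quoted form of the pattern. The sketch below runs on three links trimmed from the HTML excerpt above, with the link text translated to keep the file ASCII-only. It is an illustration on a local file, not a replacement for the live `curl` pipeline.

```shell
#!/bin/sh
# Three links trimmed from the HTML excerpt above (link text translated).
page=$(mktemp)
cat > "$page" <<'EOF'
<a href="https://testerhome.com/wiki/about">About</a> /
<a href="/topics/node13">Feedback</a> /
<a href="https://github.com/testerhome">Github</a> /
EOF

# Single quotes pass the pattern to grep untouched: no backslash is
# needed before the double quote, and the shell cannot glob-expand it.
urls=$(grep href "$page" | grep -o 'http[^"]*')

printf '%s\n' "$urls"

rm -f "$page"
```

The relative link `/topics/node13` is skipped because it does not start with http, which matches the stated goal of the example.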
