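The awk command
Description: a powerful text-processing tool
Basic syntax: awk [options] 'condition {action}' source-file
Common options:
-F: specifies the field separator
Common awk usage
Test data: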
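To make the 'condition {action}' form concrete, here is a minimal sketch that combines -F with a condition. It assumes the sample URLs used throughout this article, written to a temporary file created with mktemp; the variable names log and spring_paths are only for illustration.

```shell
# Recreate the sample data in a temporary file (path chosen by mktemp).
log=$(mktemp)
cat > "$log" <<'EOF'
https://www.baidu.com/index
https://www.baidu.com/home
https://www.baidu.com/123
https://mp.csdn.net/postlist
https://mp.csdn.net/postedit
https://spring.io/projects
http://dubbo.apache.org
https://spring.io/guides
EOF

# Condition: the third "/"-separated field equals spring.io.
# Action: print the last field of the matching line.
spring_paths=$(awk -F '/' '$3 == "spring.io" {print $NF}' "$log")
echo "$spring_paths"

rm -f "$log"
```

Lines that do not satisfy the condition are skipped, so only the two spring.io paths are printed.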
[root@bogon awk]# cat log
https://www.baidu.com/index
https://www.baidu.com/home
https://www.baidu.com/123
https://mp.csdn.net/postlist
https://mp.csdn.net/postedit
https://spring.io/projects
http://dubbo.apache.org
https://spring.io/guides
Numbering every line can be done with cat -n, and awk can do the same:
[root@bogon awk]# cat -n log
1 https://www.baidu.com/index
2 https://www.baidu.com/home
3 https://www.baidu.com/123
4 https://mp.csdn.net/postlist
5 https://mp.csdn.net/postedit
6 https://spring.io/projects
7 http://dubbo.apache.org
8 https://spring.io/guides
[root@bogon awk]# awk '{print NR,$0}' log
1 https://www.baidu.com/index
2 https://www.baidu.com/home
3 https://www.baidu.com/123
4 https://mp.csdn.net/postlist
5 https://mp.csdn.net/postedit
6 https://spring.io/projects
7 http://dubbo.apache.org
8 https://spring.io/guides
print is a built-in awk statement, NR is the current line (record) number, and $0 is the entire line.
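If the output should be aligned the way cat -n aligns it, printf can be used instead of print. The sketch below assumes a right-aligned number six characters wide followed by a tab, which is how GNU cat formats it; other implementations may differ. The variable name numbered is hypothetical.

```shell
# printf gives control over width and separator; %6d right-aligns NR in 6 columns.
numbered=$(printf '%s\n' 'https://www.baidu.com/index' 'https://www.baidu.com/home' |
  awk '{printf "%6d\t%s\n", NR, $0}')
echo "$numbered"
```

Unlike print, printf does not append a newline on its own, so the format string ends with \n.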
Display lines 5 through 6:
[root@bogon awk]# awk 'NR==5,NR==6 {print NR,$0}' log
5 https://mp.csdn.net/postedit
6 https://spring.io/projects
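The form NR==5,NR==6 is a range pattern: it starts matching at the line where the first condition is true and stops after the line where the second is true. As a sketch of an alternative, the same lines can be selected with plain comparisons, and exit can stop awk from reading the rest of a large file. The temporary file and variable names are assumptions for illustration.

```shell
# Recreate the sample data in a temporary file (path chosen by mktemp).
log=$(mktemp)
cat > "$log" <<'EOF'
https://www.baidu.com/index
https://www.baidu.com/home
https://www.baidu.com/123
https://mp.csdn.net/postlist
https://mp.csdn.net/postedit
https://spring.io/projects
http://dubbo.apache.org
https://spring.io/guides
EOF

# Range pattern, as used in the article.
range_out=$(awk 'NR==5,NR==6 {print NR,$0}' "$log")

# Comparison form: stop reading once past line 6, print from line 5 onward.
compare_out=$(awk 'NR>6 {exit} NR>=5 {print NR,$0}' "$log")

echo "$compare_out"
rm -f "$log"
```

Both commands print the same two lines; the second one simply avoids scanning the remainder of the input.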
Use the -F option to split each line on "/" and print the first, third, and last fields:
[root@bogon awk]# awk -F "/" '{print $1,$3,$NF}' log
https: www.baidu.com index
https: www.baidu.com home
https: www.baidu.com 123
https: mp.csdn.net postlist
https: mp.csdn.net postedit
https: spring.io projects
http: dubbo.apache.org dubbo.apache.org
https: spring.io guides
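$1 is the first field, $3 is the third field, $NF is the last field, and $0 is the entire line. The line http://dubbo.apache.org has no path, so it has only three fields and $3 and $NF are the same field, which is why the domain is printed twice.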
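NF itself holds the number of fields on the current line, which helps explain the output above. The sketch below prints NF and the (empty) second field that sits between the two slashes of "//", then uses NF to print a placeholder when a URL has no path. The "-" placeholder and the variable names are choices made for this illustration only.

```shell
urls=$(printf '%s\n' 'https://www.baidu.com/index' 'http://dubbo.apache.org')

# NF is the field count; $2 is empty because "//" contains nothing between the slashes.
field_counts=$(printf '%s\n' "$urls" | awk -F '/' '{print NF, "[" $2 "]"}')
echo "$field_counts"

# Print the path only when a fourth field exists; otherwise print "-".
host_path=$(printf '%s\n' "$urls" | awk -F '/' '{print $3, (NF > 3 ? $NF : "-")}')
echo "$host_path"
```

The first command prints 4 [] and 3 []; the second prints www.baidu.com index and dubbo.apache.org -.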
Use the gsub function to replace text, changing https to http in each line (the result is printed; the file itself is not modified):
[root@bogon awk]# awk '{gsub("https","http",$0);print $0}' log
http://www.baidu.com/index
http://www.baidu.com/home
http://www.baidu.com/123
http://mp.csdn.net/postlist
http://mp.csdn.net/postedit
http://spring.io/projects
http://dubbo.apache.org
http://spring.io/guides
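Note that gsub replaces every occurrence on the line, not only the scheme. If a URL happened to contain "https" again in its path, that would be rewritten too. A minimal sketch of a stricter variant uses sub with a regular expression anchored to the start of the line; the URL https://example.com/https-guide is a made-up example, not part of the test data.

```shell
url='https://example.com/https-guide'   # hypothetical URL with "https" in the path

# gsub replaces all matches, including the one inside the path.
all_replaced=$(printf '%s\n' "$url" | awk '{gsub("https","http",$0); print $0}')

# sub with ^ replaces only the leading scheme, and at most once.
scheme_only=$(printf '%s\n' "$url" | awk '{sub(/^https:/, "http:"); print}')

echo "$all_replaced"
echo "$scheme_only"
```

The first line printed is http://example.com/http-guide and the second is http://example.com/https-guide. For the sample log in this article both approaches give the same result.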
Interview question: extract the domain names from the following file and count how many times each one occurs
https://www.baidu.com/index
https://www.baidu.com/home
https://www.baidu.com/123
https://mp.csdn.net/postlist
https://mp.csdn.net/postedit
https://spring.io/projects
http://dubbo.apache.org
https://spring.io/guides
Approach:
1. Extract the domain names
2. Sort them so that identical lines are adjacent
3. Deduplicate and count
[root@bogon awk]# awk -F '/' '{print $3}' log | sort | uniq -c
1 dubbo.apache.org
2 mp.csdn.net
2 spring.io
3 www.baidu.com
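As an alternative sketch, the counting can also be done inside awk with an associative array, so the input does not need to be sorted first. The array name count, the temporary file, and the final sort on the domain column are choices made here for illustration; the order of a for-in loop in awk is unspecified, which is why the result is sorted afterwards.

```shell
# Recreate the sample data in a temporary file (path chosen by mktemp).
log=$(mktemp)
cat > "$log" <<'EOF'
https://www.baidu.com/index
https://www.baidu.com/home
https://www.baidu.com/123
https://mp.csdn.net/postlist
https://mp.csdn.net/postedit
https://spring.io/projects
http://dubbo.apache.org
https://spring.io/guides
EOF

# count[$3]++ tallies each domain; the END block runs once after the last line.
counts=$(awk -F '/' '{count[$3]++} END {for (d in count) print count[d], d}' "$log" |
  LC_ALL=C sort -k2)
echo "$counts"

rm -f "$log"
```

The counts match the sort | uniq -c pipeline, although the numbers are not padded with leading spaces the way uniq -c prints them.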