The file link.txt contains the following:
http://www.sohu.com/index.html
http://www.sohu.com/1.html
http://www.sohu.com/2.html
http://post.sohu.com/index.html
http://mp3.sohu.com/index.html
http://www.sohu.com/3.html
http://post.sohu.com/2.html
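How can you use the three classic Linux text tools (grep, sed, awk) to produce the following result, where each line shows a domain's occurrence count followed by the domain?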
4 www.sohu.com
2 post.sohu.com
1 mp3.sohu.com
# File contents
[root@centos7 ~]# cat link.txt
http://www.sohu.com/index.html
http://www.sohu.com/1.html
http://www.sohu.com/2.html
http://post.sohu.com/index.html
http://mp3.sohu.com/index.html
http://www.sohu.com/3.html
http://post.sohu.com/2.html
# grep implementation (dots escaped so they match a literal ".")
[root@centos7 ~]# grep -o -e "[[:alnum:]]\+\.sohu\.com" link.txt | sort | uniq -c | sort -nr
4 www.sohu.com
2 post.sohu.com
1 mp3.sohu.com
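
In a grep pattern an unescaped `.` matches any single character, so a domain pattern is safer with its dots escaped. The sketch below shows the difference on a made-up input line (the host `fooXsohuYcom` is invented purely for illustration) and uses `grep -E` so that `+` needs no backslash.

```shell
# Made-up input line, used only to illustrate the difference.
sample='http://www.sohu.com/a.html http://fooXsohuYcom/b.html'

# Unescaped dots: "." matches any character, so both hosts are reported.
loose=$(printf '%s\n' "$sample" | grep -oE '[[:alnum:]]+.sohu.com')

# Escaped dots: only the literal "www.sohu.com" is reported.
strict=$(printf '%s\n' "$sample" | grep -oE '[[:alnum:]]+\.sohu\.com')

printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"
```

On link.txt both patterns give the same counts, because every line there contains a real dotted host name.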
# sed implementation
[root@centos7 ~]# sed -rn 's/.*\/\/([[:alnum:]]+\.sohu\.com).*/\1/p' link.txt | sort | uniq -c | sort -nr
4 www.sohu.com
2 post.sohu.com
1 mp3.sohu.com
# awk implementation
[root@centos7 ~]# awk -F "/" '{link[$3]++}END{for(i in link){print link[i],i}}' link.txt | sort -nr
4 www.sohu.com
2 post.sohu.com
1 mp3.sohu.com
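
The awk version works because splitting a URL on "/" puts the host in the third field: the first field is the scheme with its colon, and the second is the empty string between the two slashes. A minimal sketch that prints each field with its index (`show_fields` is a hypothetical helper name, not part of the original):

```shell
# Hypothetical helper: print every "/"-separated field of a URL with its index.
show_fields() {
    printf '%s\n' "$1" |
        awk -F/ '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }'
}

show_fields 'http://www.sohu.com/index.html'
# 1=[http:]
# 2=[]
# 3=[www.sohu.com]
# 4=[index.html]
```

The iteration order of `for (i in link)` in awk is unspecified, which is why the result is piped through `sort -nr`.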
# cut implementation
[root@centos7 ~]# cut -d"/" -f 3 link.txt | sort | uniq -c | sort -nr
4 www.sohu.com
2 post.sohu.com
1 mp3.sohu.com
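
The approaches above can be wrapped into one reusable function. This is only a sketch under a few assumptions: every line is a URL of the form scheme://host/path, and `count_domains` is a hypothetical name introduced here. It counts in awk rather than with `uniq -c`, since GNU `uniq -c` left-pads its counts with spaces, and it breaks ties by host name so the order is deterministic.

```shell
# Hypothetical helper: print "count host" for each host in a file of URLs,
# highest count first, ties ordered by host name.
count_domains() {
    # With "/" as separator, field 3 is the host: "http:" / "" / host / path
    awk -F/ 'NF >= 3 && $3 != "" { count[$3]++ }
             END { for (host in count) print count[host], host }' "$1" |
        sort -k1,1nr -k2,2
}

# Usage (assumes link.txt is in the current directory):
#   count_domains link.txt
```

Because it does not hard-code sohu.com, the same function should work on a list of URLs from any set of hosts.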