Computing the Intersection and Difference of Two Files in the Linux Shell

Latest recommended article published on 2022-06-07 22:53:46

JIESA


Category: Linux    Tags: shell


Let two files, FILE1 and FILE2, represent the sets A and B. FILE1 (a.txt in the commands below) contains:

 a  
 b  
 c  
 e  
 d  
 a  

FILE2 (b.txt in the commands below) contains:

 c  
 d  
 a  
 c  

There are essentially two approaches: the comm command and the grep command. Each is described below.

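The comm command, per its manual: "Compare sorted files FILE1 and FILE2 line by line. With no options, produce three-column output. Column one contains lines unique to FILE1, column two contains lines unique to FILE2, and column three contains lines common to both files." Note that comm itself only requires both inputs to be sorted; for the result to behave like a set operation, each input should also be free of duplicate lines (sorted and unique). By default the output has three columns: the first is A-B, the second is B-A, and the third is the intersection of A and B.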

Running comm directly on the unsorted files gives a result that is not meaningful:

 $ comm a.txt b.txt  
 a  
 b  
                 c  
         d  
         a  
         c  
 e  
 d  
 a  

After sorting only (duplicate lines remain):

 $ comm <(sort a.txt ) <(sort b.txt )  
                 a  
 a  
 b  
                 c  
         c  
                 d  
 e  

After sorting and removing duplicates:

 $ comm <(sort a.txt|uniq ) <(sort b.txt|uniq )  
                 a  
 b  
                 c  
                 d  
 e  
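The `sort a.txt | uniq` pipeline used above can usually be shortened to `sort -u a.txt`. The following is a minimal sketch that recreates the sample FILE1 in a scratch directory (the `workdir` variable and temp-file layout are illustrative assumptions, not part of the original article) and checks that both forms give the same lines:

```shell
# Recreate the article's a.txt in a scratch directory (illustrative setup).
workdir=$(mktemp -d)
printf '%s\n' a b c e d a > "$workdir/a.txt"

# Two-step form used in the article: sort, then drop adjacent duplicates.
piped=$(sort "$workdir/a.txt" | uniq)
# One-step form: sort -u sorts and de-duplicates in a single command.
single=$(sort -u "$workdir/a.txt")

printf 'sort | uniq:\n%s\n' "$piped"
printf 'sort -u:\n%s\n' "$single"
rm -rf "$workdir"
```

Either form is suitable as input to comm; `sort -u` simply saves one process.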

If only the intersection is wanted, suppress columns 1 and 2 with -12:

 $ comm -12 <(sort a.txt|uniq ) <(sort b.txt|uniq )  
 a  
 c  
 d  

The difference is left as an exercise for the reader.
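One possible answer to that exercise, as a sketch: comm's column-suppression flags can be combined the same way as for the intersection, so `-23` leaves only column 1 (A-B) and `-13` leaves only column 2 (B-A). The version below writes sorted copies to temporary files instead of using process substitution so that it also runs under a plain POSIX sh; the `workdir` directory and the `.sorted` file names are assumptions made for this example.

```shell
# Recreate the article's sample files in a scratch directory (illustrative setup).
workdir=$(mktemp -d)
printf '%s\n' a b c e d a > "$workdir/a.txt"
printf '%s\n' c d a c > "$workdir/b.txt"

# comm needs sorted input; sort -u also removes duplicate lines.
sort -u "$workdir/a.txt" > "$workdir/a.sorted"
sort -u "$workdir/b.txt" > "$workdir/b.sorted"

# -23 suppresses columns 2 and 3, leaving lines found only in a.txt (A-B).
a_minus_b=$(comm -23 "$workdir/a.sorted" "$workdir/b.sorted")
# -13 suppresses columns 1 and 3, leaving lines found only in b.txt (B-A).
b_minus_a=$(comm -13 "$workdir/a.sorted" "$workdir/b.sorted")

printf 'A-B:\n%s\n' "$a_minus_b"
printf 'B-A:\n%s\n' "$b_minus_a"
rm -rf "$workdir"
```

With the sample data, A-B should contain b and e, and B-A should be empty because every line of b.txt also appears in a.txt.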

The grep command is commonly used to search text. To find the intersection:

 $ grep -F -f a.txt b.txt  
 c  
 d  
 a  
 c  

grep does not require sorted input, but because these are set operations the result must be unique (otherwise it would not be a set). So:

 $ grep -F -f a.txt b.txt | sort | uniq  
 a  
 c  
 d  
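One caveat worth illustrating: `grep -F -f` treats each line of the pattern file as a fixed string that may match anywhere in a line, so it tests for substrings rather than whole-line equality. The article's single-character lines hide this. The sketch below uses different, made-up sample data (the lines `ab` and `cab` are assumptions chosen to show the effect) and adds `-x`, which requires a pattern to match the entire line:

```shell
# Made-up sample data where substring matching and line equality differ.
workdir=$(mktemp -d)
printf '%s\n' a b > "$workdir/a.txt"
printf '%s\n' ab a cab > "$workdir/b.txt"

# Without -x, any line of b.txt containing "a" or "b" anywhere is selected.
substring_hits=$(grep -F -f "$workdir/a.txt" "$workdir/b.txt" | sort -u)
# With -x, a pattern must match the whole line, which is true set membership.
whole_line_hits=$(grep -F -x -f "$workdir/a.txt" "$workdir/b.txt" | sort -u)

printf 'without -x:\n%s\n' "$substring_hits"
printf 'with -x:\n%s\n' "$whole_line_hits"
rm -rf "$workdir"
```

For real data with multi-character lines, adding `-x` is probably the safer default when grep is used for set operations.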

And the difference?

 $ grep -F -v -f a.txt b.txt | sort | uniq  
 $ grep -F -v -f b.txt a.txt | sort | uniq  
 b  
 e  

The first command yields B-A, which is empty here; the second yields A-B. Note that the order of the two files matters!
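Because the argument order is easy to get wrong, the grep commands could be wrapped in small helper functions. This is only a sketch: the names `set_intersection` and `set_difference` are hypothetical, and `-x` is added on the assumption that whole-line matching is wanted.

```shell
# Hypothetical helpers; each takes two file paths and prints sorted, unique lines.
# Lines present in both files.
set_intersection() { grep -F -x -f "$1" "$2" | sort -u; }
# Lines of the first file that are absent from the second (first minus second).
set_difference() { grep -F -x -v -f "$2" "$1" | sort -u; }

# Recreate the article's sample files in a scratch directory (illustrative setup).
workdir=$(mktemp -d)
printf '%s\n' a b c e d a > "$workdir/a.txt"
printf '%s\n' c d a c > "$workdir/b.txt"

both=$(set_intersection "$workdir/a.txt" "$workdir/b.txt")
only_a=$(set_difference "$workdir/a.txt" "$workdir/b.txt")
only_b=$(set_difference "$workdir/b.txt" "$workdir/a.txt")

printf 'intersection:\n%s\n' "$both"
printf 'A-B:\n%s\n' "$only_a"
printf 'B-A:\n%s\n' "$only_b"
rm -rf "$workdir"
```

Note that `set_difference` puts the file to subtract from first, which is the reverse of the order grep itself expects for the pattern file.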

http://blog.csdn.net/autofei/article/details/6579320
