【Question】
I have the following sample tab delimited file:
.CvR  Col_1  Col_2  Col_3  Col_4  Col_5
S1    1      0      1      0      1
S2    1      1      1      0      1
S3    1      1      1      1      1
S4    1      0      1      1      1
S5    1      0      1      1      1
I am trying to come up with a simple way to print the first column and all columns with just “1” values in them.
My desired output file should look like this:
.CvR  Col_1  Col_3  Col_5
S1    1      1      1
S2    1      1      1
S3    1      1      1
S4    1      1      1
S5    1      1      1
My actual input file will be much bigger. I would like to do this in UNIX where possible. Can anybody help? Thanks.
#!/bin/bash
clear
value='\([01]\)'
cp file file2
for i in 1 2 3 4 5 6; do
    sed -i "s/ ${value}/ val${i}_\1/" file2
done
rowcount=$(wc -l <file2)
for i in 1 2 3 4 5 6; do
    if [ $(grep -c val${i}_1 file2) -eq ${rowcount} ]; then
        sed -i "s/val${i}_./1/" file2
    else
        sed -i "s/Col_${i}//" file2
        sed -i "s/val${i}_.//" file2
    fi
done
cat file2
【Answer】
Problems that involve dynamic columns can also be handled in SPL, where the code is somewhat simpler:
  | A |
1 | =file("/user/data.txt").import@t() |
2 | =A1.fname().to(2,).select(A1.field(~).count(~==1)==A1.count()) |
3 | =A1.new(${"'.CvR',"+A2.concat@c()}) |
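esProc (集算器) can be run from the Windows or Unix command line and can also be integrated with Java; see 【Application Schemes for Text Processing with esProc】 (集算器实现文本处理的应用方案).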
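For readers who would rather stay with standard UNIX tools, the same selection can probably be sketched with a two-pass awk script. This is only a minimal sketch, not part of the SPL answer: the function name keep_all_ones_columns is hypothetical, and it assumes the file is tab-delimited, has exactly one header row, and that a column qualifies only when every data row holds the literal string "1".

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption, not from the original post):
# print column 1 plus every column whose data rows are all "1".
keep_all_ones_columns() {
    # The same file is passed to awk twice:
    # pass 1 marks the columns to drop, pass 2 prints the remaining ones.
    awk -F'\t' -v OFS='\t' '
        NR == FNR {                       # first pass over the file
            if (FNR > 1)                  # skip the header row
                for (i = 2; i <= NF; i++)
                    if ($i != "1") drop[i] = 1
            next
        }
        {                                 # second pass: emit the kept columns
            line = $1
            for (i = 2; i <= NF; i++)
                if (!(i in drop)) line = line OFS $i
            print line
        }
    ' "$1" "$1"
}
```

It could be called as keep_all_ones_columns file > file2. Because the column positions are discovered from the data rather than hard-coded, the number of columns does not need to be known in advance, at the cost of reading the file twice.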