Using a Python script to delete unwanted lines


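At work I ran into some text that needed processing. As shown below, the 4 lines belonging to each session with Policy name: per_id_4 have to be deleted. Since the file has well over 100,000 lines, the only practical way to do this is with a tool.

A Python script is a pretty good way to do it. To keep the example short, the 100,000+ line file is cut down to a small sample.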


Here is how:


jameszhou@JamesNet1:~/python/work$ cat sorce_session.log

Session ID: 160723322, Policy name: per_id_4/310, State: Active, Timeout: 102, Valid
  In: 10.251.111.28/10050 --> 10.252.42.57/44109;tcp, If: reth1.101, Pkts: 5, Bytes: 281
  Out: 10.252.42.57/44109 --> 10.251.111.28/10050;tcp, If: reth1.101, Pkts: 0, Bytes: 0

Session ID: 160724631, Policy name: id_620779/2153, State: Active, Timeout: 3506, Valid
  In: 10.251.111.145/54262 --> 10.252.16.53/5500;udp, If: reth1.101, Pkts: 2, Bytes: 1072
  Out: 10.252.16.53/5500 --> 10.251.111.145/54262;udp, If: reth2.953, Pkts: 1, Bytes: 536

Session ID: 160725260, Policy name: per_id_4/310, State: Active, Timeout: 822, Valid
  In: 10.251.111.28/10050 --> 10.252.42.57/47473;tcp, If: reth1.101, Pkts: 5, Bytes: 281
  Out: 10.252.42.57/47473 --> 10.251.111.28/10050;tcp, If: reth1.101, Pkts: 0, Bytes: 0

Session ID: 160725485, Policy name: per_id_4/310, State: Active, Timeout: 42, Valid
  In: 10.251.111.28/10050 --> 10.252.42.57/43843;tcp, If: reth1.101, Pkts: 5, Bytes: 281
  Out: 10.252.42.57/43843 --> 10.251.111.28/10050;tcp, If: reth1.101, Pkts: 0, Bytes: 0

Session ID: 160727289, Policy name: per_id_4/310, State: Active, Timeout: 1652, Valid
  In: 10.251.111.28/10050 --> 10.252.42.57/51335;tcp, If: reth1.101, Pkts: 5, Bytes: 297
  Out: 10.252.42.57/51335 --> 10.251.111.28/10050;tcp, If: reth1.101, Pkts: 0, Bytes: 0

Session ID: 160727763, Policy name: id_620779/2153, State: Active, Timeout: 1318, Valid
  In: 10.251.111.146/41091 --> 10.252.16.54/5500;udp, If: reth1.101, Pkts: 1, Bytes: 536
  Out: 10.252.16.54/5500 --> 10.251.111.146/41091;udp, If: reth2.953, Pkts: 1, Bytes: 536

Session ID: 160728597, Policy name: id_620779/2153, State: Active, Timeout: 3074, Valid
  In: 10.251.111.7/34445 --> 10.252.16.53/5500;udp, If: reth1.101, Pkts: 2, Bytes: 1072
  Out: 10.252.16.53/5500 --> 10.251.111.7/34445;udp, If: reth2.953, Pkts: 1, Bytes: 536

Session ID: 160728718, Policy name: id_620779/2153, State: Active, Timeout: 3482, Valid
  In: 10.251.111.8/39406 --> 10.252.16.53/5500;udp, If: reth1.101, Pkts: 3, Bytes: 1608
  Out: 10.252.16.53/5500 --> 10.251.111.8/39406;udp, If: reth2.953, Pkts: 2, Bytes: 1072

Session ID: 160728893, Policy name: id_620779/2153, State: Active, Timeout: 2690, Valid
  In: 10.251.111.9/57551 --> 10.252.16.54/5500;udp, If: reth1.101, Pkts: 1, Bytes: 536
  Out: 10.252.16.54/5500 --> 10.251.111.9/57551;udp, If: reth2.953, Pkts: 1, Bytes: 536

Session ID: 160728933, Policy name: id_620779/2153, State: Active, Timeout: 2010, Valid
  In: 10.251.111.147/60098 --> 10.252.16.53/5500;udp, If: reth1.101, Pkts: 1, Bytes: 536
  Out: 10.252.16.53/5500 --> 10.251.111.147/60098;udp, If: reth2.953, Pkts: 1, Bytes: 536

Session ID: 160729073, Policy name: per_id_4/310, State: Active, Timeout: 1084, Valid
  In: 10.251.111.28/10050 --> 10.252.42.57/48730;tcp, If: reth1.101, Pkts: 5, Bytes: 281
  Out: 10.252.42.57/48730 --> 10.251.111.28/10050;tcp, If: reth1.101, Pkts: 0, Bytes: 0


jameszhou@JamesNet1:~/python/work$ vi rm_per_id_4_session.py

Enter the following code:

-------------------------

# Copy sorce_session.log to destination_session.log, leaving out every
# session whose header line mentions per_id_4.
with open('sorce_session.log', 'r') as sorce_log, \
     open('destination_session.log', 'w') as destination_log:
    for line in sorce_log:
        if 'per_id_4' in line:
            # Drop this header and the 3 lines after it (4 lines in total).
            for _ in range(3):
                next(sorce_log, None)
        else:
            destination_log.write(line)

# The with statement has already closed both files at this point.
print("Game over")

-------------------------
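
The script above depends on every session taking up exactly 4 lines. As a hedged alternative (not part of the original script), here is a minimal sketch that treats each line starting with "Session ID:" as the beginning of a new session and drops the whole session when that header contains the keyword, however many lines follow. The function name drop_sessions and its header_prefix parameter are my own names for illustration, and the sample list just reuses the first two sessions from the log above.

```python
def drop_sessions(lines, keyword, header_prefix="Session ID:"):
    """Yield lines, leaving out every session whose header contains keyword.

    A session starts at a line beginning with header_prefix and runs until
    the next such line, so the number of lines per session does not matter.
    """
    skipping = False
    for line in lines:
        if line.startswith(header_prefix):
            # A new session begins: decide once whether to drop it.
            skipping = keyword in line
        if not skipping:
            yield line


# First two sessions from the sample log, one string per line.
sample = [
    "Session ID: 160723322, Policy name: per_id_4/310, State: Active, Timeout: 102, Valid\n",
    "  In: 10.251.111.28/10050 --> 10.252.42.57/44109;tcp, If: reth1.101, Pkts: 5, Bytes: 281\n",
    "  Out: 10.252.42.57/44109 --> 10.251.111.28/10050;tcp, If: reth1.101, Pkts: 0, Bytes: 0\n",
    "\n",
    "Session ID: 160724631, Policy name: id_620779/2153, State: Active, Timeout: 3506, Valid\n",
    "  In: 10.251.111.145/54262 --> 10.252.16.53/5500;udp, If: reth1.101, Pkts: 2, Bytes: 1072\n",
    "  Out: 10.252.16.53/5500 --> 10.251.111.145/54262;udp, If: reth2.953, Pkts: 1, Bytes: 536\n",
    "\n",
]

kept = list(drop_sessions(sample, "per_id_4"))
for line in kept:
    print(line.rstrip("\n"))
```

An open file object can be passed in place of the list, since iterating over a file also yields one line at a time. Note that this is still a plain substring match, just like the original script, so searching for "per_id_4/" would be stricter if other policy names happened to start with the same characters.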


jameszhou@JamesNet1:~/python/work$ 

jameszhou@JamesNet1:~/python/work$ python rm_per_id_4_session.py

Game over

jameszhou@JamesNet1:~/python/work$ cat destination_session.log

Session ID: 160724631, Policy name: id_620779/2153, State: Active, Timeout: 3506, Valid
  In: 10.251.111.145/54262 --> 10.252.16.53/5500;udp, If: reth1.101, Pkts: 2, Bytes: 1072
  Out: 10.252.16.53/5500 --> 10.251.111.145/54262;udp, If: reth2.953, Pkts: 1, Bytes: 536

Session ID: 160727763, Policy name: id_620779/2153, State: Active, Timeout: 1318, Valid
  In: 10.251.111.146/41091 --> 10.252.16.54/5500;udp, If: reth1.101, Pkts: 1, Bytes: 536
  Out: 10.252.16.54/5500 --> 10.251.111.146/41091;udp, If: reth2.953, Pkts: 1, Bytes: 536

Session ID: 160728597, Policy name: id_620779/2153, State: Active, Timeout: 3074, Valid
  In: 10.251.111.7/34445 --> 10.252.16.53/5500;udp, If: reth1.101, Pkts: 2, Bytes: 1072
  Out: 10.252.16.53/5500 --> 10.251.111.7/34445;udp, If: reth2.953, Pkts: 1, Bytes: 536

Session ID: 160728718, Policy name: id_620779/2153, State: Active, Timeout: 3482, Valid
  In: 10.251.111.8/39406 --> 10.252.16.53/5500;udp, If: reth1.101, Pkts: 3, Bytes: 1608
  Out: 10.252.16.53/5500 --> 10.251.111.8/39406;udp, If: reth2.953, Pkts: 2, Bytes: 1072

Session ID: 160728893, Policy name: id_620779/2153, State: Active, Timeout: 2690, Valid
  In: 10.251.111.9/57551 --> 10.252.16.54/5500;udp, If: reth1.101, Pkts: 1, Bytes: 536
  Out: 10.252.16.54/5500 --> 10.251.111.9/57551;udp, If: reth2.953, Pkts: 1, Bytes: 536

Session ID: 160728933, Policy name: id_620779/2153, State: Active, Timeout: 2010, Valid
  In: 10.251.111.147/60098 --> 10.252.16.53/5500;udp, If: reth1.101, Pkts: 1, Bytes: 536
  Out: 10.252.16.53/5500 --> 10.251.111.147/60098;udp, If: reth2.953, Pkts: 1, Bytes: 536


jameszhou@JamesNet1:~/python/work$ 
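
With 100,000+ lines, checking the output by eye is not realistic, so a quick count of the lines that still mention per_id_4 is a handy sanity check. The sketch below is my own addition rather than part of the original workflow. The helper name count_matching is hypothetical, and the two header lines are taken from the sample log so that the example runs without any files.

```python
def count_matching(lines, keyword):
    """Return the number of lines that contain keyword."""
    return sum(1 for line in lines if keyword in line)


# Two header lines from the sample log, standing in for the real files.
before = [
    "Session ID: 160723322, Policy name: per_id_4/310, State: Active, Timeout: 102, Valid\n",
    "Session ID: 160724631, Policy name: id_620779/2153, State: Active, Timeout: 3506, Valid\n",
]
after = before[1:]  # what should remain once per_id_4 sessions are removed

print(count_matching(before, "per_id_4"))  # 1
print(count_matching(after, "per_id_4"))   # 0

# With the real files, pass an open file object instead of a list, e.g.:
# with open("destination_session.log") as f:
#     print(count_matching(f, "per_id_4"))
```

A result of 0 for destination_session.log means no header line with per_id_4 was left behind.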

Hehe, everything related to per_id_4 has been deleted...

OK!!!