The text content is as follows:
192.168.100.125 UNKNOWN w0100441 [03/Jun/2015:16:16:26 +0800] 226 119096 0.014
..........................
63.54.78.89 UNKNOWN w0500465 [03/Sep/2015:16:16:26 +0800] 275 119098 0.019
..........................
61.30.89.6 UNKNOWN kobe [15/Oct/2015:09:15:35 +0800] 426 1109168 3.215
..........................
192.168.100.126 UNKNOWN dingjunhui [15/Mar/2016:09:15:35 +0800] 426 1109168 3.215
..........................
220.86.123.140 UNKNOWN idenbo [03/May/2016:09:41:27 +0800] 226 7068129 12.102
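The file above is an OS login log. The first column is the IP address of the client that logged in to the server, and the third column is the account. Accounts starting with w are internal company accounts;
all others are external accounts. IPs in 192.168.100 are internal; all others are external. I want to find, within the last three months,
the lines where an internal account logged in from an external IP (for example, 63.54.78.89 UNKNOWN w0500465),
and the lines where an external account logged in from an internal IP (for example, 192.168.100.126 UNKNOWN dingjunhui).
How can I do this in Python? And if the time window is a variable (within four months, five months, and so on), how should the Python code be written?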
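The mismatch rule in the question can be sketched on its own, before any date filtering. This is a minimal sketch assuming Python 3 and assuming that "192.168.100" means the 192.168.100.0/24 network; the names INTERNAL_NET, is_internal_ip, is_internal_account and is_mismatch are illustrative only.

```python
import ipaddress

# Assumption: "192.168.100" in the question means this /24 network.
INTERNAL_NET = ipaddress.ip_network("192.168.100.0/24")


def is_internal_ip(ip):
    """True if the client IP belongs to the internal network."""
    return ipaddress.ip_address(ip) in INTERNAL_NET


def is_internal_account(account):
    """True if the account starts with 'w', as stated in the question."""
    return account.startswith("w")


def is_mismatch(line):
    """True if exactly one of (IP, account) is internal."""
    fields = line.split()
    return is_internal_ip(fields[0]) != is_internal_account(fields[2])
```

Comparing the two booleans with != covers both cases at once: internal account from an external IP, and external account from an internal IP. Note that an external account that happens to start with w would be treated as internal under this rule.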
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys
import getopt
from datetime import datetime
from dateutil.relativedelta import relativedelta

__author__ = 'shengwei ma'
__author_email__ = 'shengweima@icloud.com'

USAGE = "python filter_by_month.py --input your_input_file -m 3 --output your_out_file"

input_file = ""
output_path = ""
months_back = 3  # look-back window in months; 3 is used when -m is not given

try:
    opts, args = getopt.getopt(sys.argv[1:], "hm:", ["input=", "output="])
except getopt.GetoptError as err:
    print(str(err))
    print(USAGE)
    sys.exit(2)

for op, value in opts:
    if op == "--input":
        input_file = value
    elif op == "-m":
        months_back = int(value)
    elif op == "--output":
        output_path = value
    elif op == "-h":
        print(USAGE)
        sys.exit()

if not input_file or not output_path:
    print(USAGE)
    sys.exit(2)

# Earliest local time that still counts as "within the last N months".
cutoff = datetime.now() + relativedelta(months=-months_back)
print('Lines logged after %s will be kept.' % cutoff.strftime("%d/%b/%Y %H:%M:%S"))

with open(input_file, 'r') as f, open(output_path, 'w') as out:
    for num, line in enumerate(f):
        # Log entries sit on every other line; the dotted separator lines are skipped.
        if num % 2 != 0:
            continue
        fields = line.split()
        if len(fields) < 4:
            continue
        # fields[3] looks like "[03/Jun/2015:16:16:26"; the "+0800]" offset is a separate field.
        logged_at = datetime.strptime(fields[3].lstrip('['), "%d/%b/%Y:%H:%M:%S")
        if logged_at <= cutoff:
            continue
        internal_ip = fields[0].startswith('192.168.100.')
        internal_account = fields[2].startswith('w')
        # Keep only mismatches: internal account from an external IP, or the reverse.
        if internal_ip != internal_account:
            out.write(line)
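
If installing dateutil is not an option, the variable "N months back" cutoff could also be computed with the standard library alone. The following is only a sketch, assuming Python 3 and an English locale for the month abbreviations; months_ago, parse_log_time and is_recent are hypothetical helper names. Unlike the script above, it also parses the +0800 offset, so the comparison is timezone-aware.

```python
import calendar
from datetime import datetime, timedelta, timezone


def months_ago(moment, months):
    """Return `moment` shifted back by `months` calendar months.

    The day is clamped to the last day of the target month, so
    31 May minus 3 months gives 29 Feb in a leap year.
    """
    index = moment.year * 12 + (moment.month - 1) - months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return moment.replace(year=year, month=month, day=min(moment.day, last_day))


def parse_log_time(line):
    """Parse the bracketed timestamp of a log line, including its UTC offset."""
    fields = line.split()
    # fields[3] is "[03/Jun/2015:16:16:26" and fields[4] is "+0800]".
    stamp = (fields[3] + ' ' + fields[4]).strip('[]')
    return datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")


def is_recent(line, months, now):
    """True if the line was logged less than `months` months before `now`.

    `now` must be timezone-aware, because the parsed log time is.
    """
    return parse_log_time(line) > months_ago(now, months)
```

A call such as is_recent(line, 4, datetime.now(timezone.utc)) would then answer the "within four months" case, and changing the second argument covers five months and beyond.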