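Interview question:
1. Implement grep -A and -B in Python, so that matches at multiple positions in the text are printed with their context lines.
Approach:
1. grep -A prints each match together with the N lines after it. To implement this, iterate over every line and set a recording flag when a match is found, so that the following linenum lines are recorded. Those linenum lines may themselves contain matches, so the condition has to test both "the line does not match" and "the flag is set". Once linenum+1 lines have been recorded, print them and reset the temporary list and the flag. The first line after a reset may be a non-match, so the flag must be checked before anything is appended. Repeat in this way to the end of the file.
2. grep -B prints each match together with the N lines before it, which is considerably simpler. Iterate over every line and append it to a temporary list; once the list holds more than linenum entries, pop from the head. On a match, check the list length and print.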
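The two ideas above can also be sketched as small generators. This is only an illustrative variant, not the implementation from this post: it swaps the flag-and-list bookkeeping for a countdown counter (for -A) and a bounded `collections.deque` (for -B). The names `grep_after`, `grep_before` and `sample` are hypothetical. Unlike the flag approach, a new match inside the trailing window restarts the countdown, and a match near the top of the input is emitted even when fewer than n lines precede it. The `--` group separator that GNU grep prints is not reproduced.

```python
import re
from collections import deque


def grep_after(lines, pattern, n):
    """Yield each matching line plus the n lines after it (like grep -A n)."""
    regex = re.compile(pattern)
    remaining = 0  # trailing context lines still owed after the last match
    for line in lines:
        if regex.search(line):
            remaining = n  # a new match restarts the countdown
            yield line
        elif remaining > 0:
            remaining -= 1
            yield line


def grep_before(lines, pattern, n):
    """Yield each matching line plus up to n lines before it (like grep -B n)."""
    regex = re.compile(pattern)
    window = deque(maxlen=n)  # keeps only the last n lines not yet emitted
    for line in lines:
        if regex.search(line):
            # Emit the buffered context first, then the match itself
            for ctx in window:
                yield ctx
            window.clear()
            yield line
        else:
            window.append(line)


# Hypothetical in-memory copy of the test data: 1..5 repeated three times
sample = ['1', '2', '3', '4', '5'] * 3
print(list(grep_after(sample, r'1', 2)))
print(list(grep_before(sample, r'1', 2)))
```

Because both functions take any iterable of lines, an open file object can be passed in place of `sample`.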
Test data:
1
2
3
4
5
1
2
3
4
5
1
2
3
4
5
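The test data is simply the digits 1 to 5 repeated three times, one per line. A minimal sketch for generating such a file, assuming the name data.cfg and a temporary directory chosen only to keep the example self-contained (`data_path` and `lines` are hypothetical names):

```python
import os
import tempfile

# The temporary directory is an assumption of this sketch;
# write to the working directory instead if the script should find the file there.
data_path = os.path.join(tempfile.mkdtemp(), 'data.cfg')

with open(data_path, 'w') as fd:
    for _ in range(3):             # three repetitions
        for digit in range(1, 6):  # digits 1 through 5
            fd.write('{0}\n'.format(digit))

# Read the file back to confirm its shape
with open(data_path) as fd:
    lines = fd.read().splitlines()

print(len(lines))
```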
Implementation:
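#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Date : 2017-02-09 22:17:46
# @Author : 李满满 (xmdevops@vip.qq.com)
# @Link : http://xmdevops.blog.51cto.com/
# @Version : $Id$
from __future__ import absolute_import, print_function
# Note: import standard modules
import re
import sys
# Note: import other modules


# Simulate Linux grep filtering
class Grep(object):
    def __init__(self, file, pattern):
        self.file = file
        self.pattern = pattern
        self.__init()

    def __init(self):
        # Compile the pattern once; exit if it is not a valid regex
        try:
            self.pattern = re.compile(self.pattern)
        except Exception as e:
            print('errors: grep pattern with error {0}'.format(e))
            sys.exit(1)

    def match_regex(self, strs):
        match = self.pattern.search(strs)
        return match

    # Simulate grep -A n: each match plus the N lines after it
    def after_lines(self, linenum):
        results = []
        addflag = False
        # Text mode, so each line is a str and can be searched with the str pattern
        with open(self.file, 'r') as fd_p:
            for cur_line in fd_p:
                # Either a non-match, or any line while a group is being collected
                if not self.match_regex(cur_line) or addflag:
                    # Keep non-matching lines outside a group from being added
                    if addflag:
                        # Keep appending as long as the flag is set
                        results.append(cur_line)
                else:
                    # First match of a group: set the flag
                    addflag = True
                    results.append(cur_line)
                # Once linenum+1 lines are collected, print them and reset.
                # Checked after both branches so that linenum == 0 also works.
                if len(results) == linenum + 1:
                    print(''.join(results))
                    results = []
                    addflag = False
        # Lines of a final group that reached end of file before filling up
        return ''.join(results)

    # Simulate grep -B n: each match plus the N lines before it
    def before_lines(self, linenum):
        results = []
        with open(self.file, 'r') as fd_p:
            for cur_line in fd_p:
                # Once the list exceeds the limit, pop from the head
                if len(results) > linenum:
                    results.pop(0)
                # Always append the current line
                results.append(cur_line)
                # On a match, print the window
                if self.match_regex(cur_line):
                    # Length check: a match with fewer than linenum lines
                    # before it is skipped (real grep would still print it)
                    if len(results) == linenum + 1:
                        print(''.join(results))


if __name__ == '__main__':
    grep = Grep('data.cfg', r'1')
    # after_lines returns any trailing group cut short by end of file
    tail = grep.after_lines(7)
    if tail:
        print(tail)
    grep.before_lines(2)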
Reposted from: https://blog.51cto.com/xmdevops/1896799