If you are on Linux, I'd suggest simply using grep:
$ ls mydir
a.js b.js c.js
$ grep JSQ mydir/*.js
mydir/a.js:abcdefg JSQ abcdefg
mydir/a.js:JSQ abcdefg abcdefg
mydir/a.js:abcdefg abcdefg JSQ
mydir/c.js:abcdefg JSQ abcdefg
mydir/c.js:JSQ abcdefg abcdefg
mydir/c.js:abcdefg abcdefg JSQ
(If the command in the example above renders incorrectly, it should read: grep JSQ mydir/*.js)
You can also redirect the output to a file:
$ grep JSQ mydir/* > results.txt
Then you can organize and tally the data from results.txt.
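As a rough illustration of that tallying step, here is a minimal sketch that counts matching lines per file. It assumes each line of results.txt has the 'path:text' form shown above (grep prints the path prefix when it searches more than one file) and that the paths themselves contain no colon. The name count_matches is my own, purely for illustration.

```python
from collections import Counter

def count_matches(lines):
    """Count matching lines per file from grep output of the form 'path:text'."""
    counts = Counter()
    for line in lines:
        # Split on the first colon only; the matched text may contain colons.
        path, sep, _ = line.partition(':')
        if sep:  # skip lines that have no 'path:' prefix
            counts[path] += 1
    return counts

# Sample lines in the same shape as the grep output above.
sample = [
    "mydir/a.js:abcdefg JSQ abcdefg",
    "mydir/a.js:JSQ abcdefg abcdefg",
    "mydir/c.js:abcdefg JSQ abcdefg",
]
for path, n in sorted(count_matches(sample).items()):
    print(path, n)
```

To run it on the real file, pass the open file object instead of the sample list: with open('results.txt') as f: counts = count_matches(f).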
If you insist on using Python, here is some code I wrote that should be reasonably efficient; you can use it as a reference:
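import os
import glob

def search(root, key, ftype='', logname=None):
    # Build the glob pattern: '*.<ftype>' if an extension is given, otherwise '*'.
    ftype = '*.'+ftype if ftype else '*'
    # With no logname, send the log output to the null device.
    logname = logname or os.devnull
    symbol = os.path.join(root, ftype)
    # Keep regular files only; a bare '*' pattern can also match directories.
    fnames = [f for f in glob.glob(symbol) if os.path.isfile(f)]
    vc = len(fnames)  # number of files visited
    fc = 0            # number of files that contain key
    with open(logname, 'w') as writer:
        for fname in fnames:
            found = False
            with open(fname) as reader:
                for idx, line in enumerate(reader):
                    line = line.strip()
                    # Match key as a whole whitespace-separated token.
                    if key in line.split():
                        line = line.replace(key, '**'+key+'**')
                        found = True
                        print('{} -- {}: {}'.format(fname, idx, line), file=writer)
            if found:
                fc = fc + 1
                print('{} has {}'.format(fname, key))
    return vc, fc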
search(root, key, ftype='', logname=None)
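looks under the path root
for files whose extension is ftype (if none is given, every file is accepted)
and checks whether each one contains the keyword key.
If logname is given, it writes a log file containing every line that includes the keyword, with the keyword highlighted by '**' on both sides.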
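One thing worth noting: the check key in line.split() only matches key when it is a whole whitespace-separated token, whereas plain grep matches substrings. A small sketch of the difference follows; the helper name line_matches and the sample lines are mine, for illustration only.

```python
def line_matches(line, key):
    """Same test as in search(): key must equal one whitespace-separated token."""
    return key in line.split()

# Whole-token match: 'JSQ' stands alone between spaces.
print(line_matches("abcdefg JSQ abcdefg", "JSQ"))

# No match: the token here is 'JSQ.init();', not 'JSQ'.
print(line_matches("var x = JSQ.init();", "JSQ"))

# A plain substring test, closer to what grep does by default, does match.
print("JSQ" in "var x = JSQ.init();")
```

If you want grep-like substring behaviour in search(), replacing the condition with key in line would be one way to get it.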
In practice you can use it like this (search.py):
if __name__ == '__main__':
    root = 'mydir'
    key = input("type key: ")
    vc, fc = search(root, key, 'js', logname='results')
    print('Found in {} files, visited {}'.format(fc, vc))
Run it:
$ python3 search.py
type key: JSQ
mydir/c.js has JSQ
mydir/a.js has JSQ
Found in 2 files, visited 3
Contents of the log file results:
mydir/c.js -- 0: abcdefg **JSQ** abcdefg
mydir/c.js -- 1: **JSQ** abcdefg abcdefg
mydir/c.js -- 2: abcdefg abcdefg **JSQ**
mydir/a.js -- 0: abcdefg **JSQ** abcdefg
mydir/a.js -- 1: **JSQ** abcdefg abcdefg
mydir/a.js -- 2: abcdefg abcdefg **JSQ**