[Part 1] Using os.walk

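os.walk(top) takes a directory path as a string. For each directory in the tree rooted at that path, it yields three values: root (the path of the directory currently being visited), dirs (a list of the names of its subdirectories), and files (a list of the names of the files in it).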

Code 1-1.

import os

for root,dirs,files in os.walk('e://HIMYM//HIMYM-S5'):
    print root,dirs,files,'\n'
Output:

e://HIMYM//HIMYM-S5 [] ['How I Met Your Mother S05E01 Definitions 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E02 Double Date 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E03 Robin 101 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E04 The Sexless Innkeeper 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E05 Duel Citizenship 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E06 Bagpipes 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E07 The Rough Patch 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E08 The Playbook 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E10 The Window 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E11 The Last Cigarette Ever 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E12 Girls VS. Suits 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E13 Jenkins 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E14 The Perfect Week 720p WEB-DL DD5.1 H264-PeeWee.mkv', 'How I Met Your Mother S05E15 Rabbit or Duck 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E16 Hooked 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E17 Of Course 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E18 Say Cheese 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E19 Zoo or False 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E20 Home Wreckers 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E21 Twin Beds 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E22 Robots vs. Wrestlers 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E23 The Wedding Bride 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E24 Doppelgangers 720p WEB-DL DD5.1-PeeWee.mkv'] 

In Code 1-1, the argument 'e://HIMYM//HIMYM-S5' is a directory that contains only files and no subdirectories, so dirs is [].
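To see what os.walk yields when subdirectories are present, here is a minimal sketch in Python 3 syntax (the listings in this post use Python 2). It builds a small throwaway tree with tempfile. The function name walk_summary and the names season1, a.txt and b.txt are made up for illustration.

```python
import os
import tempfile

def walk_summary(top):
    """Return a list of (root relative to top, sorted dirs, sorted files)."""
    summary = []
    for root, dirs, files in os.walk(top):
        dirs.sort()  # sorting in place also fixes the order of descent
        summary.append((os.path.relpath(root, top), sorted(dirs), sorted(files)))
    return summary

with tempfile.TemporaryDirectory() as top:
    # Hypothetical layout: one file at the top, one file in a subdirectory.
    os.mkdir(os.path.join(top, "season1"))
    open(os.path.join(top, "a.txt"), "w").close()
    open(os.path.join(top, "season1", "b.txt"), "w").close()
    result = walk_summary(top)

for entry in result:
    print(entry)
```

The first tuple, for the top directory, lists season1 in dirs. The second tuple, for season1 itself, has an empty dirs list, the same situation as in Code 1-1.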


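A common problem with os.walk in Python 2 is the display of Chinese names: when a directory or file name contains Chinese characters, printing the lists shows byte escapes instead of readable text, as below: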

Code 1-2.

import os

for root,dirs,files in os.walk('E:\\WORK_FILE\\Python\\Python2'):
    print root,dirs,files,'\n'

Output:

>>> 

E:\WORK_FILE\Python\Python2 [] ['bkjw.py', 'calculator.py', 'cdclog.txt', 'cdctools.py', 'cdctools.pyc', 'class_login.py', 'class_test01.py', 'eight_queen.py', 'hehe.py', 'pycdc-v0.5.py', 'pyre_ebb9ce1c-e5e8-4219-a8ae-7ee620d5f9f1.png', 'renren.html', 'renren.py', 're_match.py', 're_test.py', 'szhxy\xd0\xde\xb8\xc4\xb0\xe6.py', 'szhxy\xd4\xad\xb0\xe6.py', 'table.html', 'test (2).py', 'test.py', 'test0.py', 'test1.py', 'YaYa', 'YaYa.html', 'YaYa.txt', 'yy1.py', 'yy2.py', '\xd5\xbb.py', '\xc0\xe0\xb5\xc4\xbc\xcc\xb3\xd0.py', '\xb1\xe0\xc2\xeb\xce\xca\xcc\xe2.py', '\xbc\xc7\xca\xc2\xb1\xbe.py'] 

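Fix: printing dirs and files directly, as in the code above, prints each list, and Python 2 shows every element of a list as its repr, so non-ASCII bytes come out as \x escapes. Iterating over dirs and files and printing each name on its own avoids this, because the console then receives the raw bytes and displays them as characters: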

Code 1-3:

import os

for root,dirs,files in os.walk('E:\\WORK_FILE\\Python\\Python2'):
    print 'root:' , root , '\n'
    print 'directory:\n'
    for directory in dirs:
        print directory , '\n'
    print 'file:\n'
    for f in files:
        print f , '\n'
Partial output:

>>> 

root: E:\WORK_FILE\Python\Python2 

directory:

file:

....

szhxy修改版.py 

szhxy原版.py 

...

栈.py 

类的继承.py 

编码问题.py 

记事本.py 
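The difference between Code 1-2 and Code 1-3 comes down to repr versus direct output. A small sketch of this in Python 3 syntax, assuming the names were stored as GBK bytes (which matches the escapes shown in the output of Code 1-2). The variable names are illustrative only.

```python
# One file name as it appears in the output of Code 1-2: GBK-encoded bytes.
raw_name = b'\xd5\xbb.py'

# Printing a list shows the repr of each element, so the bytes stay escaped.
as_list = repr([raw_name])

# Decoding a single name with the matching codec gives readable text.
decoded = raw_name.decode('gbk')

print(as_list)
print(decoded)
```

The first line printed still contains the \x escapes, while the second is the readable name that Code 1-3 displays.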

Practical code 1-3:

# -*- coding: utf-8 -*-
import os
import chardet
import re
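#
# @param file_list  a list consisting entirely of strings
# Purpose: reformat every string in the list and return one formatted string
# (names are separated by '|#|', with a line break after every 5 names)
#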
def list2str(file_list):
    if file_list==[]:
        return 'null'
    tmp_file=''
    i=0
    for name in file_list:
        if i%5==0 and i!=0:
            tmp_file+='\n'
        tmp_file+=(name+'|#|')
        i+=1
    return tmp_file
#
# @param directory  the directory to walk
#        save_file  the file in which the results of the walk are saved
#
def fileWalker(directory,save_file):
    fp=open(save_file,'w')
    for root,dirs,files in os.walk(directory):
        dirs=list2str(dirs)
        files=list2str(files)
        tmp='rootdir:'+root+'\n'+'dirs----'+dirs+'\n'+'files----'+files+'\n'
        fp.write(tmp)
        fp.write('+'*20+'\n'+'+'*20+'\n')
    fp.close()
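#
# @param directory  the directory to search
#        keyword    the keyword to look for
# Returns every directory and file under directory (at any depth) whose name
# contains keyword, one per line, prefixed with 'd:' or 'f:'
#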
def Grep(directory,keyword):
    tmp_dir=''
    for root,dirs,files in os.walk(directory):
        '''dirs=list2str(dirs)
        files=list2str(files)
        re_find=re.compile(keyword)    
        re_find.findAll(dirs)'''
        if chardet.detect(keyword)['encoding']!='ascii':
            for dir_name in dirs:
                if chardet.detect(dir_name)['encoding']=='GB2312':
                    if keyword.decode('utf8') in dir_name.decode('GB2312'):
                        tmp_dir+=('d:'+root+'\\'+dir_name+'\n')
            for file_name in files:
                code=chardet.detect(file_name)['encoding']
                try:
                    if keyword.decode('utf8') in file_name.decode(code):
                        tmp_dir+=('f:'+root+'\\'+file_name+'\n')
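                except (UnicodeDecodeError, TypeError, LookupError):
                    # skip names whose encoding could not be detected or decoded
                    pass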
                    
        else:
            for dir_name in dirs:
                if keyword in dir_name:
                    tmp_dir+=('d:'+root+'\\'+dir_name+'\n')
            for file_name in files:
                if keyword in file_name:
                    tmp_dir+=('f:'+root+'\\'+file_name+'\n')
    return tmp_dir                
if __name__=="__main__":
    '''directory="E:\\BaiduYunDownload"
    fileWalker(directory,"E:\\WORK_FILE\\Python\\Python2\\cdclog.txt")'''
    dirs=Grep('E:\\BaiduYunDownload','韩寒')
    print dirs
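
For comparison, here is a sketch of the same search under Python 3, where os.walk returns names as str when given a str path, so the chardet detection step is not needed. This is a rewrite under stated assumptions and not the code above: grep_names is a hypothetical name, it returns a list of pairs instead of one joined string, and it uses os.path.join rather than a hard-coded '\\'.

```python
import os
import tempfile

def grep_names(directory, keyword):
    """Return ('d' or 'f', path) pairs for every name containing keyword."""
    matches = []
    for root, dirs, files in os.walk(directory):
        for dir_name in dirs:
            if keyword in dir_name:
                matches.append(('d', os.path.join(root, dir_name)))
        for file_name in files:
            if keyword in file_name:
                matches.append(('f', os.path.join(root, file_name)))
    return matches

# Demo on a throwaway tree; the names below are made up for illustration.
with tempfile.TemporaryDirectory() as top:
    os.mkdir(os.path.join(top, "notes_2013"))
    open(os.path.join(top, "notes_2013", "notes.txt"), "w").close()
    open(os.path.join(top, "other.txt"), "w").close()
    found = grep_names(top, "notes")
    kinds = sorted(kind for kind, _ in found)
    names = sorted(os.path.basename(path) for _, path in found)

print(kinds, names)
```

Returning pairs keeps the 'd'/'f' distinction of the original while leaving the output format to the caller.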




                