os.walk() in Detail

The os.walk() function takes four parameters: top, topdown=True, onerror=None, followlinks=False

top

The top parameter is the root path from which walk starts recursing. It can be an absolute path or a relative path.
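As a minimal sketch of how top relates to the paths walk yields: the helper name list_files below is hypothetical and only for illustration. Every dirpath begins with whatever was passed as top, and the names in filenames are bare names that have to be joined with dirpath.

```python
import os


def list_files(top):
    """Return the path of every non-directory file found under `top`."""
    paths = []
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            # filenames holds bare names; join with dirpath to get a usable path.
            paths.append(os.path.join(dirpath, name))
    return paths
```

Calling list_files(".") returns paths that start with ".", while passing an absolute path returns absolute paths.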

topdown

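topdown defaults to True. It is an interesting parameter, and one practical benefit of it is efficiency. Specifically: with topdown=True, each directory is yielded before its subdirectories, and you can modify the dirnames list in place (for example with del or slice assignment); walk then recurses only into the directories that remain in dirnames.
With topdown=False, each directory is yielded after its subdirectories, so whatever you do to dirnames has no effect: the subdirectories have already been walked by the time you see the list.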
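The difference can be checked with a small experiment. The sketch below uses a hypothetical helper, visited_dirs, and a throwaway directory tree; it applies the same in-place pruning in both modes and records which directories walk actually visits.

```python
import os
import tempfile


def visited_dirs(top, skip, topdown):
    """Walk `top`, pruning names in `skip`, and return the visited paths relative to `top`."""
    seen = []
    for dirpath, dirnames, _filenames in os.walk(top, topdown=topdown):
        seen.append(os.path.relpath(dirpath, top))
        # In-place edit: honored when topdown=True, ineffective when topdown=False.
        dirnames[:] = [d for d in dirnames if d not in skip]
    return seen


with tempfile.TemporaryDirectory() as demo_root:
    os.makedirs(os.path.join(demo_root, "keep"))
    os.makedirs(os.path.join(demo_root, "skip", "deep"))
    # Top-down: "skip" and everything below it is never entered.
    print(visited_dirs(demo_root, {"skip"}, topdown=True))
    # Bottom-up: all four directories are still visited, children first.
    print(visited_dirs(demo_root, {"skip"}, topdown=False))
```

With topdown=True only "." and "keep" are visited; with topdown=False the pruning comes too late and "skip" and its "deep" subdirectory are visited as well.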

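How should we understand this?

For example, my D: drive recently filled up, and I wanted to find the ten largest files on it. The drive holds two SVN directories. Their trees are deep and contain many files, and even if large files turned up there I could not delete them, so what I need is to filter those directories out of the walk.

How can that be done?

First define a list of directories that should not be entered, then check each one as you go: if a directory should not be walked, remove it from dirnames. The code is as follows: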

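import os

path = "D:\\"  # top-level directory to scan (the D: drive in this example)

# Directories that walk should not descend into
exclusive_dir = ["iSeeRobotAdvisor", "", "VM INSTALL", "VIPSTU"]

for dirpath, dirnames, filenames in os.walk(top=path, topdown=True):
    for each in exclusive_dir:
        if each in dirnames:
            dirnames.remove(each)  # prune in place so walk skips this directory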
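The loop above only prunes directories; it does not yet collect file sizes. One possible way to finish the "ten largest files" task is sketched below. The function name largest_files and its parameters are assumptions made for illustration, not part of the original code, and heapq.nlargest is just one convenient way to pick the top entries.

```python
import heapq
import os


def largest_files(top, excluded_dirs=(), count=10):
    """Return up to `count` (size_in_bytes, path) pairs for the largest files under `top`."""
    excluded = set(excluded_dirs)
    sizes = []
    for dirpath, dirnames, filenames in os.walk(top, topdown=True):
        # Slice assignment edits the list in place, so walk skips excluded directories.
        dirnames[:] = [d for d in dirnames if d not in excluded]
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            try:
                sizes.append((os.path.getsize(full_path), full_path))
            except OSError:
                # The file vanished or cannot be read; leave it out of the ranking.
                continue
    return heapq.nlargest(count, sizes)
```

A call such as largest_files("D:\\", excluded_dirs=["VM INSTALL", "VIPSTU"]) would return the pairs sorted from largest to smallest, without ever entering the excluded directories.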

onerror

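onerror defaults to None, in which case errors raised while listing a directory are ignored. To handle them, pass a function that takes one argument, an OSError instance. The function can report the error and let the walk continue, or it can raise the exception to abort the walk.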
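Both styles of handler can be sketched briefly. The names walk_collecting_errors, remember, and raise_error below are hypothetical; the only library behavior relied on is that walk calls onerror with an OSError instance whose filename attribute names the path that failed.

```python
import os


def walk_collecting_errors(top):
    """Walk `top` and return (directories visited, errors reported by walk)."""
    errors = []

    def remember(error):
        # `error` is an OSError; error.filename is the directory that could not be listed.
        errors.append(error)

    visited = [dirpath for dirpath, _dirs, _files in os.walk(top, onerror=remember)]
    return visited, errors


def raise_error(error):
    """Pass this as onerror to abort the walk at the first failure."""
    raise error
```

With remember the walk carries on past unreadable directories and the errors can be inspected afterwards; with os.walk(top, onerror=raise_error) the first failure propagates to the caller.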

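followlinks

followlinks defaults to False. It controls what happens when walk encounters a symbolic link to a directory.
If followlinks=False, walk does not recurse into the linked directory;
if followlinks=True, walk recurses into the linked directory.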
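One caveat with followlinks=True is that a link pointing back to one of its own parent directories makes the walk recurse without end, because walk does not remember where it has already been. The sketch below shows one possible guard; the generator name walk_following_links is hypothetical, and resolving paths with os.path.realpath is just one way to detect a repeat visit.

```python
import os


def walk_following_links(top):
    """Yield os.walk triples with followlinks=True, visiting each real directory once."""
    seen = set()
    for dirpath, dirnames, filenames in os.walk(top, followlinks=True):
        real = os.path.realpath(dirpath)
        if real in seen:
            # Already reached through another path: stop descending to break the cycle.
            dirnames[:] = []
            continue
        seen.add(real)
        yield dirpath, dirnames, filenames
```

The guard relies on the default topdown=True, since emptying dirnames only prevents recursion in that mode.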

Parameter description from the original docstring

"""Directory tree generator.

For each directory in the directory tree rooted at top (including top
itself, but excluding '.' and '..'), yields a 3-tuple

    dirpath, dirnames, filenames

dirpath is a string, the path to the directory.  dirnames is a list of
the names of the subdirectories in dirpath (excluding '.' and '..').
filenames is a list of the names of the non-directory files in dirpath.
Note that the names in the lists are just names, with no path components.
To get a full path (which begins with top) to a file or directory in
dirpath, do os.path.join(dirpath, name).

If optional arg 'topdown' is true or not specified, the triple for a
directory is generated before the triples for any of its subdirectories
(directories are generated top down).  If topdown is false, the triple
for a directory is generated after the triples for all of its
subdirectories (directories are generated bottom up).

When topdown is true, the caller can modify the dirnames list in-place
(e.g., via del or slice assignment), and walk will only recurse into the
subdirectories whose names remain in dirnames; this can be used to prune the
search, or to impose a specific order of visiting.  Modifying dirnames when
topdown is false is ineffective, since the directories in dirnames have
already been generated by the time dirnames itself is generated. No matter
the value of topdown, the list of subdirectories is retrieved before the
tuples for the directory and its subdirectories are generated.

By default errors from the os.listdir() call are ignored.  If
optional arg 'onerror' is specified, it should be a function; it
will be called with one argument, an OSError instance.  It can
report the error to continue with the walk, or raise the exception
to abort the walk.  Note that the filename is available as the
filename attribute of the exception object.

By default, os.walk does not follow symbolic links to subdirectories on
systems that support them.  In order to get this functionality, set the
optional argument 'followlinks' to true.

Caution:  if you pass a relative pathname for top, don't change the
current working directory between resumptions of walk.  walk never
changes the current directory, and assumes that the client doesn't
either.

Example:

import os
from os.path import join, getsize
for root, dirs, files in os.walk('python/Lib/email'):
    print(root, "consumes", end="")
    print(sum([getsize(join(root, name)) for name in files]), end="")
    print("bytes in", len(files), "non-directory files")
    if 'CVS' in dirs:
        dirs.remove('CVS')  # don't visit CVS directories

"""