Python file operations: the os.walk() method

Latest recommended article published on 2023-10-31 17:07:47

Dontla

Category: Python Reptile

Article link: https://blog.csdn.net/Dontla/article/details/102699625

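import os

# os.walk() returns a lazy generator; nothing is scanned until it is iterated.
walker = os.walk(source_txt_path)

# dirpath: path of the directory currently being visited
#          (source_txt_path itself, then each nested subdirectory in turn)
# dirnames: names of the subdirectories directly inside dirpath (not recursive)
# filenames: names of the non-directory files directly inside dirpath
for dirpath, dirnames, filenames in walker:
    print(dirpath, dirnames, filenames)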
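
To make the three values concrete, here is a minimal sketch that builds a small throwaway tree and collects the full path of every file in it. The helper name `list_files` and the names `sub`, `a.txt`, and `b.txt` are made up for illustration and do not come from the original snippet.

```python
import os
import tempfile


def list_files(top):
    """Return the full paths of all non-directory files under top, sorted."""
    found = []
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            # filenames holds bare names, so join each one with dirpath
            found.append(os.path.join(dirpath, name))
    return sorted(found)


# Build a hypothetical tree: top/a.txt and top/sub/b.txt
with tempfile.TemporaryDirectory() as top:
    os.mkdir(os.path.join(top, "sub"))
    for rel in ("a.txt", os.path.join("sub", "b.txt")):
        with open(os.path.join(top, rel), "w") as f:
            f.write("demo")
    # Strip the temporary prefix so the result is stable between runs
    relative = [os.path.relpath(p, top) for p in list_files(top)]

print(relative)
```

The paths are made relative only so that the printed result does not depend on the random temporary directory name.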

The topdown, onerror, and followlinks parameters are rarely needed in everyday use, so they are set aside for now.
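
Even so, topdown is worth a brief sketch, because it controls both the visiting order and whether subdirectories can be pruned by editing dirnames in place. The example below assumes a hypothetical tree containing `keep` and `skip` directories; the helper `visit_order` is illustrative only.

```python
import os
import tempfile


def visit_order(top, topdown=True, prune=()):
    """Return directory paths relative to top, in the order walk yields them."""
    order = []
    for dirpath, dirnames, filenames in os.walk(top, topdown=topdown):
        if topdown:
            # In-place slice assignment: walk only descends into what remains.
            # Sorting also pins down the visiting order.
            dirnames[:] = sorted(d for d in dirnames if d not in prune)
        order.append(os.path.relpath(dirpath, top))
    return order


with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "keep", "inner"))
    os.makedirs(os.path.join(top, "skip", "inner"))

    pruned = visit_order(top, topdown=True, prune={"skip"})
    bottom_up = visit_order(top, topdown=False)

print(pruned)     # "skip" and everything below it is never visited
print(bottom_up)  # every directory appears after its subdirectories
```

With topdown=False the root directory is yielded last, and editing dirnames at that point has no effect because the subdirectories have already been walked.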

Docstring and source of os.walk:

def walk(top, topdown=True, onerror=None, followlinks=False):
    """Directory tree generator.

    For each directory in the directory tree rooted at top (including top
    itself, but excluding '.' and '..'), yields a 3-tuple

        dirpath, dirnames, filenames

    dirpath is a string, the path to the directory.  dirnames is a list of
    the names of the subdirectories in dirpath (excluding '.' and '..').
    
    filenames is a list of the names of the non-directory files in dirpath.
    Note that the names in the lists are just names, with no path components.
    To get a full path (which begins with top) to a file or directory in
    dirpath, do os.path.join(dirpath, name).
    
    If optional arg 'topdown' is true or not specified, the triple for a
    directory is generated before the triples for any of its subdirectories
    (directories are generated top down).  If topdown is false, the triple
    for a directory is generated after the triples for all of its
    subdirectories (directories are generated bottom up).

    When topdown is true, the caller can modify the dirnames list in-place
    (e.g., via del or slice assignment), and walk will only recurse into the
    subdirectories whose names remain in dirnames; this can be used to prune the
    search, or to impose a specific order of visiting.  Modifying dirnames when
    topdown is false is ineffective, since the directories in dirnames have
    already been generated by the time dirnames itself is generated. No matter
    the value of topdown, the list of subdirectories is retrieved before the
    tuples for the directory and its subdirectories are generated.
	
    By default errors from the os.scandir() call are ignored.  If
    optional arg 'onerror' is specified, it should be a function; it
    will be called with one argument, an OSError instance.  It can
    report the error to continue with the walk, or raise the exception
    to abort the walk.  Note that the filename is available as the
    filename attribute of the exception object.

    By default, os.walk does not follow symbolic links to subdirectories on
    systems that support them.  In order to get this functionality, set the
    optional argument 'followlinks' to true.
	
    Caution:  if you pass a relative pathname for top, don't change the
    current working directory between resumptions of walk.  walk never
    changes the current directory, and assumes that the client doesn't
    either.
	
    Example:

    import os
    from os.path import join, getsize
    for root, dirs, files in os.walk('python/Lib/email'):
        print(root, "consumes", end="")
        print(sum([getsize(join(root, name)) for name in files]), end="")
        print("bytes in", len(files), "non-directory files")
        if 'CVS' in dirs:
            dirs.remove('CVS')  # don't visit CVS directories

    """
    top = fspath(top)
    dirs = []
    nondirs = []
    walk_dirs = []

    # We may not have read permission for top, in which case we can't
    # get a list of the files the directory contains.  os.walk
    # always suppressed the exception then, rather than blow up for a
    # minor reason when (say) a thousand readable directories are still
    # left to visit.  That logic is copied here.

    try:
        # Note that scandir is global in this module due
        # to earlier import-*.
	
        scandir_it = scandir(top)
    except OSError as error:
        if onerror is not None:
            onerror(error)
        return

    with scandir_it:
        while True:
            try:
                try:
                    entry = next(scandir_it)
                except StopIteration:
                    break
            except OSError as error:
                if onerror is not None:
                    onerror(error)
                return

            try:
                is_dir = entry.is_dir()
            except OSError:
                # If is_dir() raises an OSError, consider that the entry is not
                # a directory, same behaviour than os.path.isdir().

                is_dir = False

            if is_dir:
                dirs.append(entry.name)
            else:
                nondirs.append(entry.name)

            if not topdown and is_dir:
                # Bottom-up: recurse into sub-directory, but exclude symlinks to
                # directories if followlinks is False

                if followlinks:
                    walk_into = True
                else:
                    try:
                        is_symlink = entry.is_symlink()
                    except OSError:
                        # If is_symlink() raises an OSError, consider that the
                        # entry is not a symbolic link, same behaviour than
                        # os.path.islink().

                        is_symlink = False
                    walk_into = not is_symlink

                if walk_into:
                    walk_dirs.append(entry.path)

    # Yield before recursion if going top down
    if topdown:
        yield top, dirs, nondirs

        # Recurse into sub-directories
        islink, join = path.islink, path.join
        for dirname in dirs:
            new_path = join(top, dirname)
            # Issue #23605: os.path.islink() is used instead of caching
            # entry.is_symlink() result during the loop on os.scandir() because
            # the caller can replace the directory entry during the "yield"
            # above.
            if followlinks or not islink(new_path):
                yield from walk(new_path, topdown, onerror, followlinks)
    else:
        # Recurse into sub-directories
        for new_path in walk_dirs:
            yield from walk(new_path, topdown, onerror, followlinks)
        # Yield after recursion if going bottom up
        yield top, dirs, nondirs
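
As the source above shows, a failed scandir() call is silently swallowed unless onerror is supplied, in which case the callback receives the OSError. The following is a minimal sketch of recording those errors instead of losing them; the function name `walk_collecting_errors` and the missing path are hypothetical.

```python
import os
import tempfile


def walk_collecting_errors(top):
    """Walk top and return (directories visited, OSErrors reported by walk)."""
    errors = []
    visited = []
    # onerror is called with one argument, an OSError instance. Appending it
    # records the problem and lets the walk go on; raising it would abort.
    for dirpath, dirnames, filenames in os.walk(top, onerror=errors.append):
        visited.append(dirpath)
    return visited, errors


with tempfile.TemporaryDirectory() as base:
    missing = os.path.join(base, "does_not_exist")  # hypothetical path
    visited, errors = walk_collecting_errors(missing)

print(visited)
print([type(e).__name__ for e in errors])
print(errors[0].filename == missing)
```

Without the callback the same call would simply yield nothing, which is easy to mistake for an empty directory.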