Python Week 15: Files, Directories, and Streams

colin3516

Category: python. Tags: python, syntax

Article link: https://blog.csdn.net/colin3516/article/details/46424373

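Chapter 7: File Handling

7.1 Basic File Operations

Files are typically used to store data or an application's configuration parameters.

7.1.1 Opening or Creating a File

A file can be opened or created with the built-in function file(). (file() exists only in Python 2; the built-in open() does the same job and is the only choice in Python 3.)

Signature: file(name[, mode[, buffering]]) -> file object

The name parameter is the name of the file to open. In the write and append modes, file() creates the file if it does not exist and then opens it; in read mode, a missing file raises IOError.

The mode parameter is the mode in which the file is opened. (The original figure listing the modes is not reproduced here.) Common values are 'r' (read, the default), 'w' (write, truncating any existing content), and 'a' (append); adding '+' opens the file for both reading and writing, and adding 'b' selects binary mode.

The buffering parameter sets the buffering mode: 0 means unbuffered, 1 means line-buffered, and a value greater than 1 gives the approximate buffer size in bytes.

file() returns a file object, which provides the methods used to work with the file.

Note: to handle Chinese (or other non-ASCII) text in a file, use the codecs module:
import codecs
f = codecs.open(filename, mode, encoding)
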
import codecs
f=codecs.open(u'文件.txt','w','utf-8')

f.write(u'用python做些事\n')
f.write(u'黑板客\n')
f.write(u'网易云课堂\n')
f.close()
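
For comparison, here is a minimal sketch of the same idea in Python 3, where the built-in open() accepts an encoding argument directly and codecs.open() is usually not needed. The helper names write_lines_utf8 and read_lines_utf8 are hypothetical, chosen only for this illustration.

```python
def write_lines_utf8(path, lines):
    """Write each string in `lines` to `path` as UTF-8, one per line."""
    with open(path, "w", encoding="utf-8") as f:
        for line in lines:
            f.write(line + "\n")


def read_lines_utf8(path):
    """Read `path` as UTF-8 and return its lines without trailing newlines."""
    with open(path, "r", encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]
```

Calling write_lines_utf8(path, [u'用python做些事', u'黑板客']) and then read_lines_utf8(path) should return the same two strings.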

Opening or creating a file at a given path:
<pre name="code" class="python">path = '/home/colin/colin'
f = open(path + '/1.txt', 'w')    # same result as '%s' % path + '/1.txt'

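File handling generally follows three steps:

1. Create or open the file; file() returns a file object.

2. Call methods of the file object such as read() and write() to work with the file.

3. Call close() to close the file and release the resources held by the file object.

# Create a file
context = '''hello world
hello China
'''
f = file('hello.txt', 'w')   # open the file; open() can be used instead of file()
f.write(context)             # write the string to the file
f.close()                    # close the file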
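
The three steps can also be written with a with statement, which calls close() automatically even if an error occurs partway through. This is a sketch in Python 3 syntax (open() instead of file()); the function name write_then_read is hypothetical.

```python
def write_then_read(path, text):
    """Create `path`, write `text` to it, then reopen it and return the contents."""
    # Steps 1 and 2: open the file and write to it.
    # Step 3 (close) happens automatically when the with block ends.
    with open(path, "w") as f:
        f.write(text)
    # The same three steps again, this time for reading.
    with open(path, "r") as f:
        return f.read()
```

The with form is generally preferred because the file cannot be left open by accident.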

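7.1.2 Reading a File

1. Line-by-line reading with readline(): each call reads one line from the file, so an always-true loop is needed to read the whole file.

f = open('hello.txt')
while True:
    line = f.readline()    # readline(n) would instead return at most n bytes of the line
    if line:
        print line,        # the trailing comma suppresses print's own newline
    else:
        break
f.close()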

2. Multi-line reading with readlines(): it returns a list of lines, whose elements are accessed in a loop.

f = open('hello.txt')
lines = f.readlines()
for line in lines:
    print line,
f.close()
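
A file object can also be iterated over directly, which reads one line at a time without building the whole list in memory first. The following is a small sketch in Python 3 syntax; numbered_lines is a hypothetical helper name.

```python
def numbered_lines(path):
    """Return a list of (line_number, line_text) pairs for the file at `path`."""
    result = []
    with open(path) as f:
        # Iterating the file object yields one line per step, newline included.
        for number, line in enumerate(f, start=1):
            result.append((number, line.rstrip("\n")))
    return result
```

For large files this is usually a better fit than readlines(), since only the current line is held in memory while reading.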

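3. Reading everything at once with read(): called without an argument, it reads the entire file into a single string; read(n) reads at most n bytes.

f = open('hello.txt')
contxt = f.read(5)       # read the first 5 bytes; the file pointer is now at offset 5
print contxt
print f.tell()           # current position of the file pointer, in bytes
contxt = f.read(5)       # continue from offset 5, i.e. read bytes 6 through 10
print contxt
print f.tell()
f.close()

Note: the file object keeps track of the file pointer's position so that the next operation continues from there. The position is kept until close() is called on the file object.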
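
The file pointer can also be moved explicitly with seek(). The sketch below (Python 3 syntax, hypothetical function name read_twice) reads the first n bytes, rewinds to the start, and reads them again. The file is opened in binary mode so that tell() reports plain byte offsets.

```python
def read_twice(path, n):
    """Read the first n bytes, rewind with seek(0), and read them again."""
    with open(path, "rb") as f:
        first = f.read(n)             # the pointer moves forward by n bytes
        position = f.tell()           # offset after the first read
        f.seek(0)                     # move the pointer back to the start
        second = f.read(n)            # reads the same bytes again
    return first, position, second
```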

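7.1.3 Writing to a File

Writing a string to a file with write() was already shown in the example in 7.1.1.

writelines() writes the strings stored in a list to a file.

# Write a file with writelines()
f = file("hello.txt", "w+")                  # w+ is read/write mode; existing content is truncated
li = ["hello world\n", "hello China\n"]      # \n ends each line; writelines() adds no newlines itself
f.writelines(li)
f.close()

# Append new content to the file
f = file("hello.txt", "a+")         # a+ is read/write mode; writes go to the end of the file
new_context = "goodbye"
f.write(new_context)
f.close()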
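
Because writelines() does not add line endings, a small wrapper can supply them. This is only a sketch in Python 3 syntax; append_lines is a hypothetical name.

```python
def append_lines(path, lines):
    """Append each string in `lines` to `path`, adding a newline after each one."""
    with open(path, "a") as f:
        # writelines() accepts any iterable of strings and writes them as-is,
        # so the newline is attached here.
        f.writelines(line + "\n" for line in lines)
```

Opening with "a" leaves the existing content untouched; opening with "w" instead would discard it.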

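7.1.4 Deleting a File

Deleting a file involves the os module and the os.path module.

Checking whether a file exists uses os.path, which handles file and directory paths.

import os

file("hello.txt", "w").close()           # create an empty file, closing it right away
if os.path.exists("hello.txt"):
    os.remove("hello.txt")

Deletion is a function of the os module, hence os.remove(). Reading, writing, and closing, by contrast, are methods of the file object, hence f.close().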
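
Checking os.path.exists() before os.remove() leaves a short window in which another process could delete the file first. An alternative, sketched here in Python 3 syntax (FileNotFoundError does not exist in Python 2), is to attempt the removal and handle the failure. The name remove_if_exists is hypothetical.

```python
import os


def remove_if_exists(path):
    """Delete `path`. Return True if it was deleted, False if it did not exist."""
    try:
        os.remove(path)
        return True
    except FileNotFoundError:
        return False
```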

7.1.5 Copying a File

The file class offers no method for copying a file directly, but a copy can be made with read() and write().

Method 1:

# Copy with read() and write()
# Create the file hello.txt
src = file("hello.txt", "w")
li = ["hello world\n", "hello China\n"]
src.writelines(li)
src.close()
# Copy hello.txt to hello2.txt
src = file("hello.txt", "r")
dst = file("hello2.txt", "w")
dst.write(src.read())             # src must be reopened for reading before src.read() can be called
src.close()
dst.close()
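
src.read() loads the entire file into memory, which can be a problem for large files. A variation, sketched in Python 3 syntax, copies the file in fixed-size chunks. The function name copy_in_chunks and the default chunk size are assumptions made for this example.

```python
def copy_in_chunks(src_path, dst_path, chunk_size=64 * 1024):
    """Copy src_path to dst_path, reading at most chunk_size bytes at a time."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:          # an empty result means end of file
                break
            dst.write(chunk)
```

Binary mode is used so that the bytes are copied unchanged, whatever the file contains.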

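Method 2:

The shutil module provides functions for copying files and directories. copyfile(src, dst) copies a file; src is the path of the source file and dst the path of the destination file, both strings. move() moves (cuts and pastes) a file.

# Copy a file with the shutil module
import shutil

shutil.copyfile("hello.txt", "hello2.txt")
shutil.move("hello.txt", "../")            # move to the parent directory; "." is the current directory, ".." the parent
shutil.move("hello2.txt", "hello3.txt")    # moving within the same directory amounts to a rename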

7.1.6 Renaming a File

rename() in the os module renames a file or directory.

import os
li = os.listdir('.')
print li
if 'hello.txt' in li:
    os.rename('hello.txt','hi.txt')
elif 'hi.txt'in li:
    os.rename('hi.txt','hello.txt')

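# Change the file extension
# Method 1:
import os
li = os.listdir('.')
for liname in li:
    ms = liname.rfind('.')                     # position of the last dot, or -1 if there is none
    if ms != -1 and liname[ms+1:] == 'txt':
        newname = liname[:ms+1] + 'html'
        os.rename(liname, newname)

Method 2: os.path.splitext() returns a tuple whose first element is the file name without the extension and whose second element is the extension (e.g. ".txt"). Remember that the extension includes the leading dot.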

import os
lk = os.listdir('.')
print lk
for oldname in lk:
    lm = os.path.splitext(oldname)
    if lm[1] == '.html':
        newsname = lm[0] + '.txt'
        os.rename(oldname,newsname)
print os.listdir('.')
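
The splitext() approach can be wrapped in a reusable function that works on any directory rather than only the current one. This is a sketch in Python 3 syntax; change_extension and its parameters are hypothetical names.

```python
import os


def change_extension(directory, old_ext, new_ext):
    """Rename files in `directory` from old_ext to new_ext (both include the dot).

    Returns the sorted list of new file names.
    """
    renamed = []
    for name in os.listdir(directory):
        stem, ext = os.path.splitext(name)
        old_path = os.path.join(directory, name)
        # Only regular files with exactly the requested extension are renamed.
        if ext == old_ext and os.path.isfile(old_path):
            new_name = stem + new_ext
            os.rename(old_path, os.path.join(directory, new_name))
            renamed.append(new_name)
    return sorted(renamed)
```

Joining each name with the directory matters: os.listdir() returns bare names, so os.rename() would otherwise look for them in the current working directory.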

7.1.7 Searching and Replacing File Contents

1. Searching file contents

import re
# Search the file
f1 = file("hello.txt", "r")
count = 0
for s in f1.readlines():
    li = re.findall("hello", s)
    if len(li) > 0:
        count = count + li.count("hello")
print "Found " + str(count) + " occurrences of hello"
f1.close()

2. Replacing file contents

# Replace text while copying the file
f1 = file("hello.txt", "r")
f2 = file("hello3.txt", "w")
for s in f1.readlines():
    f2.write(s.replace("hello", "hi"))
f1.close()
f2.close()
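
Searching and replacing can be combined into a single pass over the file. The sketch below is in Python 3 syntax and uses str.count() in place of re.findall(), which is enough for a fixed, non-empty search string. The function name replace_in_file is hypothetical.

```python
def replace_in_file(src_path, dst_path, old, new):
    """Copy src_path to dst_path, replacing `old` with `new`.

    Returns the number of replacements made. `old` must be a non-empty string.
    """
    count = 0
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            count += line.count(old)          # non-overlapping occurrences
            dst.write(line.replace(old, new))
    return count
```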

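7.1.8 Comparing Files

Python provides the difflib module for comparing sequences and files. To list the differences between two files, use the SequenceMatcher class of the difflib module.

Step 1: create a SequenceMatcher object. Its declaration is class SequenceMatcher([isjunk[, a[, b]]]), where isjunk is a function that returns True for elements to be ignored during the comparison, and a and b are the two sequences to compare.

Step 2: call get_opcodes() to obtain the result of the comparison. It returns a list of tuples (tag, i1, i2, j1, j2), where tag describes how the two slices compare ('replace', 'delete', 'insert', or 'equal'), i1 and i2 are indices into sequence a, and j1 and j2 are indices into sequence b.

import difflib

f1 = file("hello.txt", "r")
f2 = file("hi.txt", "r")
src = f1.read()
dst = f2.read()
f1.close()
f2.close()
print src
print dst
s = difflib.SequenceMatcher(lambda x: x == "\n", src, dst)   # treat newline characters as junk
for tag, i1, i2, j1, j2 in s.get_opcodes():                  # get the comparison result
    print ("%s src[%d:%d]=%s dst[%d:%d]=%s" % \
    (tag, i1, i2, src[i1:i2], j1, j2, dst[j1:j2]))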
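
SequenceMatcher on two strings compares character by character. For text files, a line-level comparison is often easier to read, and difflib.unified_diff() produces one. The following is a sketch in Python 3 syntax; diff_lines and the labels "a" and "b" are hypothetical choices for this example.

```python
import difflib


def diff_lines(a_text, b_text):
    """Return unified-diff lines describing how to turn a_text into b_text."""
    a_lines = a_text.splitlines(keepends=True)
    b_lines = b_text.splitlines(keepends=True)
    return list(difflib.unified_diff(a_lines, b_lines, fromfile="a", tofile="b"))
```

Lines that exist only in the first text are prefixed with "-", and lines that exist only in the second with "+". Identical inputs give an empty list.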

7.1.9 Accessing Configuration Files

Applications commonly use configuration files to define parameters.

1. Reading a configuration file

The ConfigParser module in the Python standard library parses configuration files. Its ConfigParser class can read the contents of ini files. (In Python 3 the module is named configparser.)

# Read a configuration file
import ConfigParser

config = ConfigParser.ConfigParser()             # create a ConfigParser object named config
config.read("ODBC.ini")
sections = config.sections()                     # return all sections
print "Sections:", sections
o = config.options("ODBC 32 bit Data Sources")   # options() returns the option names in a section
print "Options:", o
v = config.items("ODBC 32 bit Data Sources")     # items() returns the (name, value) pairs in a section
print "Items:", v

</pre><pre name="code" class="python"><pre name="code" class="python"># Return a value given its section and option by calling get()
access = config.get("ODBC 32 bit Data Sources", "MS Access Database")
print access
excel = config.get("ODBC 32 bit Data Sources", "Excel Files")
print excel
dBASE = config.get("ODBC 32 bit Data Sources", "dBASE Files")
print dBASE

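2. Writing configuration options to a configuration file

First call add_section() to add a new section, then call set() to set an option, and finally write the configuration to ODBC.ini.

# Write a configuration file
import ConfigParser

config = ConfigParser.ConfigParser()
config.add_section("ODBC Driver Count")             # add a new section
config.set("ODBC Driver Count", "count", 2)         # add a new option
f = open("ODBC.ini", "a+")                          # append, so existing sections in the file are kept
config.write(f)
f.close()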

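3. Modifying a configuration file

First read ODBC.ini, then call set() to change the value of an option in a given section, and finally write the configuration back to ODBC.ini.

# Modify a configuration file
import ConfigParser
config = ConfigParser.ConfigParser()
config.read("ODBC.ini")
config.set("ODBC Driver Count", "count", 3)
f = open("ODBC.ini", "w")                           # "w" truncates first; "r+" could leave stale bytes at the end
config.write(f)
f.close()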

4. Deleting sections and options

# Delete from a configuration file
import ConfigParser
config = ConfigParser.ConfigParser()
config.read("ODBC.ini")
config.remove_option("ODBC Driver Count", "count")  # delete an option with remove_option()
config.remove_section("ODBC Driver Count")          # delete a section with remove_section()
f = open("ODBC.ini", "w+")
config.write(f)
f.close()
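
The read, set, and write cycle used above can be collected into two small helpers. This sketch assumes Python 3, where the module is named configparser and option values must be strings; set_option and get_option are hypothetical names.

```python
import configparser


def set_option(ini_path, section, option, value):
    """Set section/option to value in ini_path, creating the file and section if needed."""
    config = configparser.ConfigParser()
    config.read(ini_path)                      # a missing file is silently skipped
    if not config.has_section(section):
        config.add_section(section)
    config.set(section, option, str(value))    # Python 3 accepts only string values
    with open(ini_path, "w") as f:             # "w" rewrites the whole file
        config.write(f)


def get_option(ini_path, section, option):
    """Return the value stored under section/option in ini_path, as a string."""
    config = configparser.ConfigParser()
    config.read(ini_path)
    return config.get(section, option)
```

Rewriting the whole file on every change keeps the file consistent with the parser's in-memory state, at the cost of dropping any comments the original file contained.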

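7.2 Basic Directory Operations

7.2.1 Creating and Deleting Directories

The os module provides functions for working with directories. (The original figure listing them is not reproduced here.)

import os
os.mkdir('hello')                  # fails if the hello directory already exists
os.rmdir('hello')                  # remove an empty directory
os.makedirs('hello/world')         # create nested directories: world inside hello
os.removedirs('hello/world')       # remove world, then each parent directory that is left empty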
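
The behavior of makedirs() and removedirs() can be checked safely inside a scratch directory. The sketch below assumes Python 3, where makedirs() accepts exist_ok; the function name demo_nested_dirs and the marker file keep.txt are hypothetical.

```python
import os


def demo_nested_dirs(base):
    """Create base/hello/world, then remove it together with its empty parent.

    Returns (nested_dir_existed, hello_still_exists).
    """
    # A marker file keeps `base` non-empty, so removedirs() stops there.
    open(os.path.join(base, "keep.txt"), "w").close()
    nested = os.path.join(base, "hello", "world")
    os.makedirs(nested, exist_ok=True)
    os.makedirs(nested, exist_ok=True)     # no error when it already exists
    existed = os.path.isdir(nested)
    os.removedirs(nested)                  # removes world, then the now-empty hello
    return existed, os.path.exists(os.path.join(base, "hello"))
```

Note that removedirs() keeps climbing until it reaches a directory that is not empty, so it should be used with care on real directory trees.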

7.2.2 Traversing Directories

There are three ways to traverse a directory:

Method 1: a recursive function

# Traverse a directory recursively
import os
def VisitDir(path):                  # the path is passed in as the argument
    li = os.listdir(path)            # names of all directories and files under path
    for p in li:
        pathname = os.path.join(path, p)     # join() builds the full path
        if not os.path.isfile(pathname):
            VisitDir(pathname)
        else:
            print pathname

if __name__ == "__main__":          # __name__ is "__main__" only when this module is run directly as the program's entry point
    path = r"F:\python"
    VisitDir(path)

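Method 2: os.path.walk()

In walk(top, func, arg), top is the path of the directory tree to traverse, func is a callback function, and arg is a value passed through to func. (os.path.walk() exists only in Python 2.)

A callback function is a function passed as an argument to another function, which calls it when some event occurs. Here walk() calls func once per directory with three arguments: the first is walk()'s arg, the second is the path of the directory being visited, and the third is the list of names (files and subdirectories) in that directory. arg supplies extra data to the callback and may be an empty tuple.

import os, os.path

def visitdir(arg, dirname, names):
    for filepath in names:
        print os.path.join(dirname, filepath)

path = r"F:\cx"                     # the r prefix makes a raw string, so backslashes are not treated as escapes
os.path.walk(path, visitdir, ())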

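Method 3: os.walk()

os.walk() is more efficient than os.path.walk() and needs no callback function.

walk(top, topdown=True, onerror=None)

The top parameter is the path of the directory tree to traverse.

The topdown parameter defaults to True, meaning each directory is reported before its subdirectories. With topdown set to False, subdirectories are reported first and the root directory last.

The onerror parameter defaults to None, meaning errors raised while listing directories are ignored. Otherwise it should be a function that receives the error; it can report the error and let the walk continue, or raise an exception to stop it.

os.walk() yields a 3-tuple for each directory visited: the directory's path, the list of its subdirectories, and the list of its files.

import os
def visitdir(path):
    for root, dirs, files in os.walk(path):
        for filepath in files:
            print os.path.join(root, filepath)

path = r"G:\KuGou"
visitdir(path)

Note: the os.path.walk() example prints directory paths as well as file paths, whereas the os.walk() example prints only file paths.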
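
os.walk() can report directories as well, since each step also provides the list of subdirectories. Here is a minimal sketch in Python 3 syntax that collects both; list_tree is a hypothetical name.

```python
import os


def list_tree(top):
    """Return (directories, files) under `top` as sorted lists of paths relative to `top`."""
    dirs_found = []
    files_found = []
    for root, dirs, files in os.walk(top):
        for name in dirs:
            dirs_found.append(os.path.relpath(os.path.join(root, name), top))
        for name in files:
            files_found.append(os.path.relpath(os.path.join(root, name), top))
    return sorted(dirs_found), sorted(files_found)
```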

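7.3 Files and Streams

To describe reading and writing uniformly, data transfers involving files, peripherals, network connections, and so on are abstracted as "streams". Data moves like flowing water from one container to another, and the same holds for data transfer inside a program.

7.3.1 Python's Stream Objects

The sys module provides three basic stream objects: stdin (standard input), stdout (standard output), and stderr (error output). Stream objects have the attributes and methods of the file class and are handled the same way as files.

1. stdin (standard input)

The example simply replaces sys.stdin with a file object, so the stream is used exactly like a file.

import sys
sys.stdin = open('hello3.txt', 'r')
for lines in sys.stdin.readlines():            # readlines() is a method, so the parentheses are required; attributes take none
    print lines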

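2. stdout (standard output)

import sys
sys.stdout = open('hello3.txt', 'a')       # 'a' opens the file in append mode
print 'goodbye'                            # writes the string "goodbye" to hello3.txt
sys.stdout.close()
sys.stdout = sys.__stdout__                # restore the original stream so later prints work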
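
Reassigning sys.stdout by hand makes it easy to forget to restore it. In Python 3, contextlib.redirect_stdout() handles the restoring automatically. The sketch below captures printed output in an in-memory buffer rather than a file; capture_print is a hypothetical name.

```python
import contextlib
import io


def capture_print(func):
    """Call func() and return everything it printed to standard output."""
    buffer = io.StringIO()
    # sys.stdout points at `buffer` only inside the with block.
    with contextlib.redirect_stdout(buffer):
        func()
    return buffer.getvalue()
```

Passing an open file object instead of io.StringIO() would send the output to that file, matching the example above.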

3. stderr (error output)

import sys, time
sys.stderr = open('record.log', 'a')
f = open('./hi.txt', 'r')
t = time.strftime('%Y-%m-%d %X', time.localtime())
context = f.read()
f.close()
if context:
    sys.stderr.write(t + " " + context)
else:
    raise Exception, t + ' exception information'

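7.4 File Handling Example: A File Attribute Browser

Design a program that browses file attributes: a function showFileProperties(path) that shows the attributes of every file under the path directory. It breaks down into three steps:

1. Traverse the directory given by path and get the path of each subdirectory (with os.walk()).

2. Go through all the files in each subdirectory and get each file's attribute list (with the os module's stat() function).

3. Unpack the attribute list and print its values in a formatted way (using indices into the list to get each attribute).

#!/usr/bin/python
# -*- coding: UTF-8 -*-

def showFileProperties(path):
    '''Show each file's size, creation time, access time, and modification time'''
    import os, time
    for root, dirs, files in os.walk(path, True):
        print "Location: " + root
        for filename in files:
            state = os.stat(os.path.join(root, filename))
            info = "File name: " + filename + " "
            info = info + 'Size: ' + ("%d" % state[-4]) + " "
            t = time.strftime('%Y-%m-%d %X', time.localtime(state[-1]))   # st_ctime
            info = info + 'Created: ' + t + " "
            t = time.strftime('%Y-%m-%d %X', time.localtime(state[-2]))   # st_mtime
            info = info + 'Last modified: ' + t + " "
            t = time.strftime('%Y-%m-%d %X', time.localtime(state[-3]))   # st_atime
            info = info + 'Last accessed: ' + t + " "
            print info
if __name__ == "__main__":
    path = r"F:\FoxitPDFReader"
    showFileProperties(path)

Note: os.stat() needs a path that actually leads to the file, and os.walk() returns bare file names. That is why os.path.join(root, filename) is called first to join the directory path and the file name.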
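
The result of os.stat() can also be read through named attributes, which is easier to follow than negative indices. This is a small sketch in Python 3 syntax; file_summary is a hypothetical name.

```python
import os
import time


def file_summary(path):
    """Return (size_in_bytes, formatted_modification_time) for the file at `path`."""
    st = os.stat(path)
    # st.st_size is the same value as st[-4]; st.st_mtime corresponds to st[-2].
    modified = time.strftime("%Y-%m-%d %X", time.localtime(st.st_mtime))
    return st.st_size, modified
```

One caveat about the "creation time" in the example above: st_ctime is the creation time on Windows, but on Unix systems it is the time of the last metadata change.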
