Automate the Boring Stuff with Python (8): Reading and Writing Files (Importing a Custom Module)

Article link: https://blog.csdn.net/qq_33195791/article/details/94967789

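Files are an important way to pass information across platforms, hosts, and networks, and one of the most important means of storing data. This chapter covers how to use Python to create, read, and save files on the hard drive.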

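1. Files and file paths

Differences between Windows and Linux:

<1> Folder style. On Windows, a path starts with a drive letter and uses backslashes:

>>> os.getcwd()
'C:\\Learning\\PYTHON\\python-auto\\100-days'

On Linux, a path starts at the root folder / and uses forward slashes, for example (an illustrative path):

/home/user/python-auto/100-days

<2> File and folder names are not case-sensitive on Windows and OS X, but they are case-sensitive on Linux.

On Windows, paths are written with backslashes as the separator between folders; on OS X and Linux, the forward slash is the path separator. If a program should run on every operating system, the Python script has to handle both cases.

os.path.join(param1, param2, ...) solves this problem.

Pass it the string names of the individual folders and the file in a path, and os.path.join() returns a path string that uses the correct separator for the current system.

For example: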

>>> os.path.join('usr','bin','cash')
'usr\\bin\\cash'

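On Windows, the returned string is displayed with doubled backslashes because each backslash has to be escaped inside a string literal; the path itself contains single backslashes. On OS X or Linux the same call returns 'usr/bin/cash'.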
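As a minimal sketch of the point above, the snippet below builds the same path once with os.path.join() and once by joining the parts with os.sep by hand. The folder names are the ones from the example; what gets printed depends on the operating system the code runs on.

```python
import os

# Folder names taken from the example above.
parts = ['usr', 'bin', 'cash']

# os.path.join() inserts the separator of the current operating system.
joined = os.path.join(*parts)

# os.sep holds that separator: '\\' on Windows, '/' on Linux and OS X.
manual = os.sep.join(parts)

print(joined)
print(joined == manual)
```

Because the separator is never typed by hand, the same script can be moved between systems without editing its paths.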

2. The current working directory

os.getcwd() returns the current working directory as a string, and os.chdir() changes it.

>>> os.getcwd()
'C:\\Learning\\PYTHON\\python-auto\\100-days'
>>> os.chdir('../')
>>> os.getcwd()
'C:\\Learning\\PYTHON\\python-auto'

>>> os.chdir('./100-days')
>>> os.getcwd()
'C:\\Learning\\PYTHON\\python-auto\\100-days'

If the directory you try to change to does not exist, Python displays an error:

>>> os.chdir('../c++')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
FileNotFoundError: [WinError 2] The system cannot find the file specified: '../c++'

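3. Absolute and relative paths

There are two ways to specify a file path.

1) An "absolute path", which always begins with the root folder

2) A "relative path", which is relative to the program's current working directory

There are also the dot (.) and dot-dot (..) names. They are not real folders but special names that can be used in a path. A single period (.) used as a folder name is shorthand for "this directory", and two periods (..) mean the parent folder; they can be chained (for example ../../) to go up several levels.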
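To see how the special names behave, here is a small sketch that uses os.path.normpath(), a standard-library function not mentioned in the text above, to collapse . and .. in a path. The names mkdir and notes.txt are only illustrative, and nothing is created on disk.

```python
import os

# A relative path that uses the special names '.' and '..'.
relative = os.path.join('.', 'mkdir', '..', 'notes.txt')

# normpath() collapses '.' and '..' purely as text, without touching the disk.
collapsed = os.path.normpath(relative)

# Joining with the current working directory turns it into an absolute path.
from_cwd = os.path.normpath(os.path.join(os.getcwd(), relative))

print(collapsed)
print(os.path.isabs(relative), os.path.isabs(from_cwd))
```

Going into mkdir and straight back out with .. cancels out, so the collapsed path is just the file name.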

4. Creating new folders with os.makedirs()

os.makedirs() creates new directories. Try it in the interactive shell:

>>> os.makedirs('.\\mkdir\\test\\python')

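This creates not only the python folder but also the mkdir and mkdir/test folders under the current directory. In other words, os.makedirs() creates all the necessary intermediate folders.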
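One detail worth knowing is what happens when the folder already exists. The sketch below, which works inside a temporary folder so that it leaves nothing behind, shows that a second os.makedirs() call raises FileExistsError unless exist_ok=True is passed. The variable names are illustrative.

```python
import os
import tempfile

# Work inside a throwaway folder so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as base:
    target = os.path.join(base, 'mkdir', 'test', 'python')

    os.makedirs(target)                  # creates mkdir, mkdir/test and mkdir/test/python
    created = os.path.isdir(target)

    try:
        os.makedirs(target)              # the folder already exists now
        second_call = 'ok'
    except FileExistsError:
        second_call = 'FileExistsError'

    os.makedirs(target, exist_ok=True)   # exist_ok=True suppresses that error

print(created, second_call)
```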

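5. The os.path module

The os.path module contains many functions related to file names and file paths; for example, os.path.join() builds paths that are valid on every operating system. Because os.path is a module inside the os module, running import os is enough to import it.

6. Handling absolute and relative paths

The os.path module provides functions that return the absolute path of a relative path and that check whether a given path is absolute.

6.1 Getting absolute and relative paths

<1> os.path.abspath(path) returns a string with the absolute path of the argument. This is an easy way to convert a relative path into an absolute one.

<2> os.path.isabs(path) returns True if the argument is an absolute path and False if it is a relative path.

<3> os.path.relpath(path, start) returns a string with the relative path from start to path. If start is not given, the current working directory is used as the starting point.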

>>> os.path.abspath('.')
'C:\\Learning\\PYTHON\\python-auto\\100-days'
>>> os.path.abspath('.\\mkdir')
'C:\\Learning\\PYTHON\\python-auto\\100-days\\mkdir'
>>> os.path.isabs('.')
False
>>> os.path.isabs(os.path.abspath('.'))
True

An os.path.relpath() example (the current working directory is printed last for reference):

>>> os.path.relpath('C:\\Windows','C:\\')
'Windows'
>>> os.path.relpath('C:\\Windows','.\\')
'..\\..\\..\\..\\Windows'
>>> os.getcwd()
'C:\\Learning\\PYTHON\\python-auto\\100-days'

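<4> os.path.dirname(path) returns a string containing everything before the last slash in the path argument. os.path.basename(path) returns a string containing everything after the last slash.

A path consists of a directory name and a base name, for example: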
C:\Windows\System32\calc.exe

>>> path = 'C:\\Windows\\System32\\calc.exe'
>>> os.path.basename(path)
'calc.exe'
>>> os.path.dirname(path)
'C:\\Windows\\System32'

If you need both the directory name and the base name of a path, call os.path.split() to get a tuple of the two strings. os.path.split() separates the file name from the rest of the path.

>>> path = 'C:\\Windows\\System32\\calc.exe'
>>> os.path.split(path)
('C:\\Windows\\System32', 'calc.exe')

Note that you could get the same tuple by calling os.path.dirname(path) and os.path.basename(path) and placing their return values in a tuple.

>>> path = 'C:\\Windows\\System32\\calc.exe'
>>> os.path.dirname(path),os.path.basename(path)
('C:\\Windows\\System32', 'calc.exe')

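But when you need both values, os.path.split() is a handy shortcut.

Also note that os.path.split() does not take a file path and return a list of strings for each folder. If that is what you need, use the split() string method and split on the string in os.path.sep.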

>>> path = 'C:\\Windows\\System32\\calc.exe'
>>> path.split(os.path.sep)
['C:', 'Windows', 'System32', 'calc.exe']

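The split() string method returns a list with every part of the path. Passing it os.path.sep makes it work on every operating system; for example, the following code runs on both Windows and Linux (on Linux, an absolute path yields an empty string as the first element, because the path starts with the separator).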

>>> cwd = os.getcwd()
>>> cwd.split(os.path.sep)
['C:', 'Learning', 'PYTHON', 'python-auto', '100-days']
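To compare both platforms on a single machine, one option is to call the Windows and POSIX flavours of os.path directly: the standard library exposes them as ntpath and posixpath, and both can be imported on any system. This is only a sketch; the Linux path below is hypothetical.

```python
import ntpath      # the Windows flavour of os.path, importable on any system
import posixpath   # the Linux / OS X flavour of os.path

win_path = 'C:\\Windows\\System32\\calc.exe'
nix_path = '/usr/bin/python3'   # hypothetical Linux path, for illustration only

# split() separates the directory name from the base name.
win_split = ntpath.split(win_path)
nix_split = posixpath.split(nix_path)

# The string method split() breaks the path into all of its parts.
win_parts = win_path.split(ntpath.sep)
nix_parts = nix_path.split(posixpath.sep)

print(win_split, win_parts)
print(nix_split, nix_parts)
```

The empty string at the front of nix_parts comes from the leading / of the absolute Linux path.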

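7. Listing files and checking file sizes

<1> os.path.getsize(path) returns the size in bytes of the file in the path argument.

<2> os.listdir(path) returns a list of name strings, one for each file and folder in the path argument.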

>>> os.listdir(os.getcwd())
['100-days.zip', 'CHONG_DIR.txt', 'datatype.py', 'day6', 'day9_object_duotai.py', 'day9_object_pro.py', 'dirregex.py', 'fileproc.py', 'gui1.py', 'mkdir', 'Narcissistic.cpp', 'Narcissistic.py', 'pthread.cpp', 'regex.py', 'tar.txt', 'TAR_DIR.txt', 'UltramanMonsters.py']
>>> os.path.getsize(os.getcwd())
4096
>>> os.path.getsize(str(os.getcwd())+'\\Narcissistic.cpp')
1788

#! /usr/bin/python3

import os

totalsize1 = 0
for filename in os.listdir('C:\\Learning\\PYTHON\\python-auto\\100-days'):
    totalsize1 += os.path.getsize(os.path.join('C:\\Learning\\PYTHON\\python-auto\\100-days',filename))

print('totalsize1 = %d' %(totalsize1))

totalsize2 = 0
for filename in os.listdir(os.getcwd()):
    totalsize2 += os.path.getsize(os.path.join(os.getcwd(),filename))

print('totalsize2 = %d' %(totalsize2))


totalsize3 = 0
for filename in os.listdir(os.getcwd()):
    totalsize3 += os.path.getsize(filename)

print('totalsize3 = %d' %(totalsize3))

Result:

totalsize1 = 59598
totalsize2 = 59598
totalsize3 = 59598

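Passing the bare file name to os.path.getsize() only works here because the listed folder is also the current working directory, so it is still better to build the full path with os.path.join().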
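The loop can be wrapped in a reusable function. The version below is a sketch with an illustrative name, total_file_size; unlike the script above, it skips sub-folders, which is a design choice rather than something the original does. The demo runs against a temporary folder so that the result is predictable.

```python
import os
import tempfile

def total_file_size(folder):
    """Return the summed size in bytes of the regular files directly inside folder."""
    total = 0
    for name in os.listdir(folder):
        full_path = os.path.join(folder, name)   # works whatever the current directory is
        if os.path.isfile(full_path):            # skip sub-folders
            total += os.path.getsize(full_path)
    return total

# Demo: two small files and one sub-folder in a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    with open(os.path.join(folder, 'a.txt'), 'wb') as f:
        f.write(b'x' * 10)
    with open(os.path.join(folder, 'b.txt'), 'wb') as f:
        f.write(b'y' * 5)
    os.makedirs(os.path.join(folder, 'sub'))
    demo_total = total_file_size(folder)

print(demo_total)   # 15
```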

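8. Checking path validity

1> os.path.exists(path) returns True if the file or folder that the path argument refers to exists, and False otherwise.

2> os.path.isfile(path) returns True if the path argument exists and is a file, and False otherwise.

3> os.path.isdir(path) returns True if the path argument exists and is a folder, and False otherwise.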

>>> import os
>>> os.path.isdir(os.getcwd())
True
>>> os.path.exists(os.path.join(os.getcwd(),'lily'))
False
>>> os.path.exists(os.path.join(os.getcwd(),'Narcissistic.py'))
True
>>> os.path.isfile(os.path.join(os.getcwd(),'Narcissistic.py'))
True
>>> os.path.isdir(os.path.join(os.getcwd(),'Narcissistic.py'))
False
>>> os.getcwd()
'C:\\Learning\\PYTHON\\python-auto\\100-days'
>>> os.listdir(os.getcwd())
['100-days.zip', 'CHONG_DIR.txt', 'datatype.py', 'day6', 'day9_object_duotai.py', 'day9_object_pro.py', 'dirregex.py', 'fileproc.py', 'gui1.py', 'mkdir', 'Narcissistic.cpp', 'Narcissistic.py', 'pthread.cpp', 'regex.py', 'tar.txt', 'TAR_DIR.txt', 'UltramanMonsters.py']
>>>
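The three checks combine naturally into one helper. The function name describe_path and the labels it returns are made up for this sketch; the demo uses a temporary folder so that it does not depend on the files listed above.

```python
import os
import tempfile

def describe_path(path):
    """Classify a path as 'missing', 'folder', 'file' or 'other'."""
    if not os.path.exists(path):
        return 'missing'
    if os.path.isdir(path):
        return 'folder'
    if os.path.isfile(path):
        return 'file'
    return 'other'

with tempfile.TemporaryDirectory() as folder:
    file_path = os.path.join(folder, 'Narcissistic.py')
    with open(file_path, 'w') as f:
        f.write('# placeholder\n')

    results = [
        describe_path(folder),
        describe_path(file_path),
        describe_path(os.path.join(folder, 'lily')),
    ]

print(results)   # ['folder', 'file', 'missing']
```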

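9. Reading and writing files

9.1 Opening files with the open() function

Python's open() function opens a file and returns a file object. Every step of file processing starts with this function. If the file cannot be opened, OSError is raised.

Note: when you use open(), make sure the file object gets closed, that is, call its close() method.

The most common form of open() takes two arguments: the file name (file) and the mode (mode).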

open(file, mode='r')

The full signature is:

open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)

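Parameters:

file: required; the file path (relative or absolute).
mode: optional; the mode in which the file is opened.
buffering: sets the buffering policy.
encoding: the text encoding; utf8 is the usual choice.
errors: how encoding and decoding errors are handled.
newline: controls how line endings are handled.
closefd: only matters when file is a file descriptor rather than a path; if False, the descriptor stays open after the file object is closed.
opener: an optional custom opener, a callable that takes (file, flags) and returns an open file descriptor.

The mode argument accepts:

Mode	Description
t	Text mode (default).
x	Exclusive creation: create a new file for writing and raise an error if the file already exists.
b	Binary mode.
+	Open a file for updating (reading and writing).
U	Universal newlines mode (deprecated; removed in Python 3.11).
r	Open a file for reading only. The file pointer is placed at the beginning of the file. This is the default mode.
rb	Open a file in binary format for reading only. The file pointer is placed at the beginning of the file. Typically used for non-text files such as images.
r+	Open a file for reading and writing. The file pointer is placed at the beginning of the file.
rb+	Open a file in binary format for reading and writing. The file pointer is placed at the beginning of the file. Typically used for non-text files such as images.
w	Open a file for writing only. If the file already exists, it is opened and its existing content is deleted. If the file does not exist, a new file is created.
wb	Open a file in binary format for writing only. If the file already exists, it is opened and its existing content is deleted. If the file does not exist, a new file is created. Typically used for non-text files such as images.
w+	Open a file for reading and writing. If the file already exists, it is opened and its existing content is deleted. If the file does not exist, a new file is created.
wb+	Open a file in binary format for reading and writing. If the file already exists, it is opened and its existing content is deleted. If the file does not exist, a new file is created. Typically used for non-text files such as images.
a	Open a file for appending. If the file already exists, the file pointer is placed at the end of the file, so new content is written after the existing content. If the file does not exist, a new file is created for writing.
ab	Open a file in binary format for appending. If the file already exists, the file pointer is placed at the end of the file, so new content is written after the existing content. If the file does not exist, a new file is created for writing.
a+	Open a file for reading and appending. If the file already exists, the file pointer is placed at the end of the file and writes are appended. If the file does not exist, a new file is created for reading and writing.
ab+	Open a file in binary format for reading and appending. If the file already exists, the file pointer is placed at the end of the file. If the file does not exist, a new file is created for reading and writing.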
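The difference between the w, a and x modes is easiest to see by running them against the same file. The following is a minimal sketch: the file name modes.txt is made up, and a temporary folder keeps the demo self-contained.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'modes.txt')

    with open(path, 'w') as f:      # 'w' creates the file (or empties an existing one)
        f.write('first\n')
    with open(path, 'a') as f:      # 'a' keeps the content and writes at the end
        f.write('second\n')
    with open(path) as f:           # no mode given: 'r', read-only text mode
        after_append = f.read()

    try:
        with open(path, 'x') as f:  # 'x' refuses to touch an existing file
            f.write('never written\n')
        x_result = 'created'
    except FileExistsError:
        x_result = 'FileExistsError'

    with open(path, 'w') as f:      # opening with 'w' again discards the old content
        f.write('third\n')
    with open(path) as f:
        after_rewrite = f.read()

print(repr(after_append), x_result, repr(after_rewrite))
```

The with statement used here closes each file automatically when its block ends, which is one way to make sure close() is never forgotten.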

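9.2 Reading file contents

file.read([size])

Reads up to size characters from the file (bytes in binary mode). If size is omitted or negative, the whole file is read.

file.readline([size])

Reads one whole line, including the "\n" character.

file.readlines([sizeint])

Reads all lines and returns them as a list. If sizeint > 0 is given, it returns lines totalling approximately sizeint bytes; the total may exceed sizeint, because reading only stops at the end of a line.

Create a file named sonnet.txt, enter the following content, and save it.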

When,in disgrace with fortune and men's eyes,
I all alone bewee my outcast state,
And trouble deaf heaven with my bootless cries,
And look upon myself and curse my face.

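Read sonnet.txt with the following statements:

sonnetFile = open('sonnet.txt')
sonnetContent = sonnetFile.readlines()
sonnetFile.close()  # close the file once its content has been read
print(sonnetContent)

Output: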

["When,in disgrace with fortune and men's eyes,\n", 'I all alone bewee my outcast state,\n', 'And trouble deaf heaven with my bootless cries,\n', 'And look upon myself and curse my face.']

As you can see, readlines() returns a list of strings with one element per line, which is usually more convenient to work with.
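The reading methods can also be mixed, and a file object can be looped over line by line. The sketch below first writes the first two lines of the sonnet to a temporary file (so it runs without an existing sonnet.txt) and then reads them back in two ways.

```python
import os
import tempfile

sonnet = (
    "When,in disgrace with fortune and men's eyes,\n"
    "I all alone bewee my outcast state,\n"
)

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'sonnet.txt')
    with open(path, 'w') as f:
        f.write(sonnet)

    with open(path) as f:
        first_line = f.readline()    # one line, trailing newline included
        rest = f.readlines()         # the remaining lines as a list

    with open(path) as f:
        # Iterating over the file yields one line at a time.
        stripped = [line.rstrip('\n') for line in f]

print(repr(first_line))
print(rest)
print(stripped)
```

Stripping the trailing newline while iterating is a common way to get clean strings for further processing.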

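9.3 Writing to files

file.write(str)

Writes the string to the file and returns the number of characters written.

file.writelines(sequence)

Writes a sequence of strings (for example a list) to the file. No line breaks are added, so each string has to include its own newline character if one is needed.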

def wirteBaconFile():
    baconFile = open('bacon.txt','w')
    size = baconFile.write('Hello world\n')
    if 0 < size:
        print('write %d characters to  baconFile success' %(size))
    baconFile.close()

    baconFile = open('bacon.txt','a')
    size = baconFile.write('Bacon is no a vegetable.\n')
    if 0 < size:
        print('write %d characters to  baconFile success' %(size))
    baconFile.close()

    baconFile = open('bacon.txt')
    content = baconFile.read()
    baconFile.close()
    print(content)

Output:

write 12 characters to baconFile success
write 25 characters to baconFile success
Hello world
Bacon is no a vegetable.

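Note that, unlike the print() function, the write() method does not add a newline to the end of the string; you have to add the newline character yourself.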
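writelines() is not used in the function above, so here is a short sketch of it next to write(). The file lives in a temporary folder and the strings are illustrative; note how the newline is appended to each item by hand.

```python
import os
import tempfile

lines = ['Hello world', 'Bacon is not a vegetable.']

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'bacon.txt')

    with open(path, 'w') as f:
        # writelines() adds nothing between items, so append '\n' to each one.
        f.writelines(line + '\n' for line in lines)
        # write() returns the number of characters it wrote.
        written = f.write('The end\n')

    with open(path) as f:
        content = f.read()

print(written)
print(content, end='')
```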

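9.4 Saving variables with the shelve module

The shelve module lets you save variables from a Python program to binary shelf files, so that the program can restore their data from the hard drive later. In effect, shelve lets you add "save" and "open" features to your program.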

import os
import shelve

def shelveModle():
    # Create the shelve folder on first use, then work inside it
    if not os.path.exists(os.path.join(os.getcwd(), 'shelve')):
        os.makedirs('shelve')
    if os.path.exists(os.path.join(os.getcwd(), 'shelve')):
        os.chdir(os.path.join(os.getcwd(), 'shelve'))
        shelfFile = shelve.open('mydata')
        cats = ['Zophie', 'Pooka', 'Simon']
        shelfFile['cats'] = cats
        shelfFile.close()
        print('save cats to shelve success')
    else:
        print('no such file or path .\\shelve')
        return None

    # Reopen the shelf and read the saved value back
    shelfFile = shelve.open('mydata')
    print('type of shelfFile is', type(shelfFile))
    cats = list(shelfFile['cats'])
    shelfFile.close()
    print('type of cats is', type(cats))
    print('recovered from shelf, cats =', cats)

Output:

save cats to shelve success
type of shelfFile is <class 'shelve.DbfilenameShelf'>
type of cats is <class 'list'>
recovered from shelf, cats = ['Zophie', 'Pooka', 'Simon']

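Here we reopen the shelf file, read back the data that was saved, and then call close().

Just like dictionaries, shelf values have keys() and values() methods that return list-like values of the keys and values in the shelf. Because these are not true lists, pass them to the list() function to get them in list form.

In Python 3, a variable assigned from shelfFile['cats'] is in fact already a list. Still, wrapping it in list() is recommended to be sure of getting a list.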
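The keys() and values() methods mentioned above can be tried with a few lines. This is a sketch that stores the shelf in a temporary folder; it also uses the shelf as a context manager (supported since Python 3.4) so that it is closed automatically.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as folder:
    data_path = os.path.join(folder, 'mydata')

    # Save a variable; the shelf is closed when the with block ends.
    with shelve.open(data_path) as shelf:
        shelf['cats'] = ['Zophie', 'Pooka', 'Simon']

    # Reopen the shelf and read everything back.
    with shelve.open(data_path) as shelf:
        keys = list(shelf.keys())       # list() turns the view into a real list
        values = list(shelf.values())
        cats = shelf['cats']

print(keys)
print(values)
print(type(cats))
```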

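9.5 Saving variables with the pprint.pformat() function

The pprint.pprint() function prints the contents of a list or dictionary to the screen in a clean, readable layout. pprint.pformat() returns the same text as a string instead of printing it. That string is not only easy to read but also syntactically correct Python code. Suppose you have a dictionary stored in a variable and want to save the variable and its contents for later use. pprint.pformat() gives you a string that you can write to a .py file; that file becomes your own module, which you can import whenever you need the variable's value.

Example: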

import importlib
import os
import pprint
import sys

workpath = os.getcwd()

def pformatSave():
    # Create the pformat folder if needed and write the variable to myDogs.py
    folder = os.path.join(workpath, 'pformat')
    os.makedirs(folder, exist_ok=True)
    dogs = [{'name': 'Zopie', 'desc': 'chubby'}, {'name': 'Pooka', 'desc': 'fluffy'}, {'name': 'zs', 'desc': 'faker'}]
    dogstr = pprint.pformat(dogs)
    with open(os.path.join(folder, 'myDogs.py'), 'w') as filePy:
        # The name "dogs" written here becomes the variable name inside the myDogs module
        size = filePy.write('dogs = ' + dogstr + '\n')
    if 0 < size:
        print('%d characters written to myDogs.py' % size)

# Create an empty __init__.py inside the module folder
def ImportSubdir():
    os.chdir(os.path.join(workpath, 'pformat'))
    print('current dir is', os.getcwd())
    tmfile = open('__init__.py', 'w')
    tmfile.close()

# Importing a custom module:
# 1> import sys
# 2> import os
# 3> sys.path.append(path) adds the folder that contains the pformat package to the search path
# 4> create an empty file named __init__.py in the module folder
def pformatImport():
    # Import here, not at the top of the script: myDogs.py does not exist until pformatSave() has run
    sys.path.append(workpath)
    importlib.invalidate_caches()  # make sure the freshly written files are seen
    from pformat import myDogs
    print('reading myDogs saved with pprint.pformat')
    print('myDogs.dogs =', myDogs.dogs)
    print('dogs[2] =', myDogs.dogs[2])
    print("myDogs.dogs[1]['desc']", myDogs.dogs[1]['desc'])
    print("myDogs.dogs[1]['name']", myDogs.dogs[1]['name'])

if __name__ == '__main__':
    #totalFileSize()
    #readSonnet()
    #wirteBaconFile()
    #shelveModle()
    pformatSave()
    ImportSubdir()
    pformatImport()

Output:

117 characters written to myDogs.py
current dir is C:\Learning\PYTHON\python-auto\100-days\pformat
reading myDogs saved with pprint.pformat
myDogs.dogs = [{'desc': 'chubby', 'name': 'Zopie'}, {'desc': 'fluffy', 'name': 'Pooka'}, {'desc': 'faker', 'name': 'zs'}]
dogs[2] = {'desc': 'faker', 'name': 'zs'}
myDogs.dogs[1]['desc'] fluffy
myDogs.dogs[1]['name'] Pooka

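This example relies on importing a custom module. The steps are:

1> import sys

2> import os

3> sys.path.append(path) to add the directory needed to find the module to the search path

4> create an empty file named __init__.py in the module's folder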
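The same save-then-import round trip can be tried in isolation. The sketch below is an alternative to the script above, not the author's method: it writes a module with the made-up name my_dogs_demo into a temporary folder, puts that folder on sys.path, and loads the module with importlib.import_module().

```python
import importlib
import os
import pprint
import sys
import tempfile

dogs = [{'name': 'Zopie', 'desc': 'chubby'}, {'name': 'Pooka', 'desc': 'fluffy'}]

with tempfile.TemporaryDirectory() as folder:
    # Write the variable as valid Python source into a .py file.
    module_path = os.path.join(folder, 'my_dogs_demo.py')
    with open(module_path, 'w') as f:
        f.write('dogs = ' + pprint.pformat(dogs) + '\n')

    # Make the folder importable, then import the freshly written module.
    sys.path.insert(0, folder)
    importlib.invalidate_caches()   # the file was created after the interpreter started
    try:
        my_dogs_demo = importlib.import_module('my_dogs_demo')
    finally:
        sys.path.remove(folder)     # leave sys.path as it was

restored = my_dogs_demo.dogs
print(restored == dogs)
print(restored[1]['name'])
```

Because the module sits directly in a folder on sys.path, no __init__.py is needed here; that file only matters when the module is imported through a package name, as in from pformat import myDogs.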