Python Beginner Basics Summary Notes (8): IO Programming

Article link: https://blog.csdn.net/qq_34810865/article/details/107562649

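These notes follow Liao Xuefeng's Python tutorial, which is excellent and detailed; I learned a great deal from it and am grateful. Original tutorial: https://www.liaoxuefeng.com/wiki/1016959663602400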

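In computing, IO stands for Input/Output.

Programs and their runtime data reside in memory and are executed by the CPU, an extremely fast computing core. Wherever data has to be exchanged with something else, usually a disk or the network, an IO interface is needed.

For example, when you open a browser and visit the Sina home page, the browser has to fetch the page through network IO. The browser first sends data to the Sina server to say it wants the HTML of the home page; sending data outward is called Output. The Sina server then sends the page back; receiving data from outside is called Input.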

File reading and writing
StringIO and BytesIO
Working with files and directories
Serialization and deserialization

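1. File reading and writing

Reading and writing files on disk is a service provided by the operating system. To read or write a file, a program asks the operating system to open a file object (usually called a file descriptor), and then uses the interface the operating system provides either to read data from that file object (reading a file) or to write data to it (writing a file).

1.1 Reading files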

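There are several ways to write a Windows file path as a Python string; the first is the most convenient:

# Prefix the string with r (raw string) so that backslashes keep their literal value
path = r'C:\Users\Estelle\Desktop\c.txt'

# Or escape each backslash by doubling it
path = 'C:\\Users\\Estelle\\Desktop\\c.txt'

# Or use forward slashes instead
path = 'C:/Users/Estelle/Desktop/c.txt'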

Reading a file:

path = r'C:\Users\Estelle\Desktop\c.txt'
file = open(path, 'r')
a = file.read()   # read() reads the entire file at once
print(a)
file.close()      # always close a file after opening it

# Output: the contents of the file
Love look not with the eyes,
but with the mind.

The with statement calls close() for us automatically:

with open('/path/to/file', 'r') as f:
    print(f.read())    # no explicit close() is needed

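If the file is small, read() is the most convenient way to load it in one go; if you cannot be sure how large the file is, calling read(size) repeatedly is safer; for a configuration file, readlines() is the most convenient:

for line in f.readlines():   # f is an open file object
    print(line.strip())      # strip the trailing '\n'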
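The read(size) approach mentioned above can be sketched as follows. This is only a minimal illustration: the temporary file, the name demo.txt, and the chunk size of 16 characters are assumptions made so the example runs on its own.

```python
import os
import tempfile

sample = 'Love look not with the eyes,\nbut with the mind.\n'

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, 'demo.txt')   # hypothetical file name

    # Create a small file to read back.
    with open(demo_path, 'w', encoding='utf-8') as f:
        f.write(sample)

    chunks = []
    with open(demo_path, 'r', encoding='utf-8') as f:
        while True:
            chunk = f.read(16)     # read at most 16 characters per call
            if chunk == '':        # an empty string signals end of file
                break
            chunks.append(chunk)

text = ''.join(chunks)
print(text)
```

Because each call returns at most size characters, memory use stays bounded no matter how large the file is.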

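1.2 Writing files

Writing a file:

path = r'C:\Users\Estelle\Desktop\c.txt'
f = open(path, 'w')

# f.write('Hello,World')         # use write() for a single str
L = ['Love look not with the eyes,\n', 'but with the mind.\n']
f.writelines(L)   # use writelines() for a list of str
f.close()         # close the file so the data is flushed to disk before reading it back

f = open(path, 'r')
print(f.read())
f.close()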

As with reading, the with statement saves you from writing close():

with open('/Users/michael/test.txt', 'w') as f:
    f.write('Hello, world!')
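As a small sketch of how the mode argument affects writing: 'w' creates the file or truncates an existing one, while 'a' appends to the end. The temporary directory and the file name test.txt below are assumptions used only to keep the example self-contained.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'test.txt')    # hypothetical file name

    with open(path, 'w', encoding='utf-8') as f:
        f.write('Hello, world!\n')              # 'w' creates or truncates the file

    with open(path, 'a', encoding='utf-8') as f:
        f.write('Second line\n')                # 'a' appends to the end of the file

    with open(path, 'r', encoding='utf-8') as f:
        content = f.read()

print(content)
```

Passing encoding explicitly avoids depending on the platform's default text encoding.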

Exercise
Read a local text file into a str and print it:

path = 'C:/Users/Estelle/Desktop/c.txt'

with open(path, 'r') as f:
    l = []
    for line in f.readlines():
        c = line.strip()    # strip the trailing '\n'
        print(c)
        if c != '':
            l.append(c)     # collect the non-empty lines
    print(l)
    print(''.join(l))       # join the lines into a single str

# Output
Love look not with the eyes,
but with the mind.
['Love look not with the eyes,', 'but with the mind.']
Love look not with the eyes,but with the mind.

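The last line uses the join() method:

Syntax:  'sep'.join(seq)

Parameters
sep: the separator; it may be an empty string
seq: the iterable of strings to join, such as a list, tuple, or dict (for a dict, the keys are joined); a str is joined character by character
In other words, join() concatenates all elements of seq into one new string, placing sep between adjacent elements. Every element must itself be a str.

Return value: a new string made of the elements joined by sep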
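A few illustrative calls to join() are sketched below; the variable names are made up for the example.

```python
words = ['Love', 'look', 'not']

spaced = ' '.join(words)         # separator is a single space
packed = ''.join(words)          # an empty separator concatenates directly
print(spaced)                    # Love look not
print(packed)                    # Lovelooknot

# Elements must be str, so convert other types first.
numbers = [1, 2, 3]
csv_line = ','.join(str(n) for n in numbers)
print(csv_line)                  # 1,2,3

# Iterating a dict yields its keys, so join() joins the keys.
keys = '-'.join({'a': 1, 'b': 2})
print(keys)                      # a-b
```

Passing a list that contains a non-str element, such as ','.join([1, 2]), raises TypeError.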

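2. StringIO and BytesIO

2.1 StringIO (reading and writing str in memory)

Data is not always read from or written to a file; it can also be read and written in memory.

As the name suggests, StringIO reads and writes str in memory.

To write a str into a StringIO, first create a StringIO object, then write to it just like a file: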

>>> from io import StringIO
>>> f = StringIO()     # create a StringIO
>>> f.write('hello')
5
>>> f.write(' ')
1
>>> f.write('world!')    # you can write as many times as you like
6
>>> print(f.getvalue())    # getvalue() returns the str written so far
hello world!

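To read from a StringIO, initialize it with a str and then read it just like a file:

>>> from io import StringIO
>>> f = StringIO('Hello!\nHi!\nGoodbye!')
>>> while True:
...     s = f.readline()   # read one line at a time
...     if s == '':        # an empty string means there is nothing left to read
...         break
...     print(s.strip())   # strip leading and trailing whitespace, including the newline
...
Hello!
Hi!
Goodbye!

If the loop called print(s) instead of print(s.strip()), each line would keep its trailing '\n' and the output would contain blank lines:

Hello!

Hi!

Goodbye!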

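The Python strip() method

strip() removes the given characters from the beginning and end of a string (whitespace, including spaces and newlines, by default).
Note: it only removes characters at the start or end of the string, never in the middle.

Syntax:
str.strip([chars])

s1 = "00000003210Runoob01230000000"
print(s1.strip('0'))    # remove the character 0 from both ends

s2 = "   Runoob      "
print(s2.strip())       # remove whitespace from both ends

s3 = "123abcrunoob321"
print(s3.strip('12'))   # chars is treated as a set: any leading or trailing 1 or 2 is removed
# Output
3210Runoob0123
Runoob
3abcrunoob3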
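To make the behaviour of the chars argument concrete, here is a short sketch that also shows the one-sided variants lstrip() and rstrip(); the sample strings are invented for illustration.

```python
s = 'xxhello worldxx'

both = s.strip('x')      # removes x from both ends
left = s.lstrip('x')     # removes x from the start only
right = s.rstrip('x')    # removes x from the end only
print(both)              # hello world
print(left)              # hello worldxx
print(right)             # xxhello world

# chars is a set of characters, not a prefix or suffix to match exactly.
mixed = 'abcHELLOcba'.strip('abc')
print(mixed)             # HELLO

# Characters in the middle are never removed.
middle = 'a-b-a'.strip('a')
print(middle)            # -b-
```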

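2.2 BytesIO (reading and writing binary data)

BytesIO reads and writes bytes in memory. We create a BytesIO and write some bytes to it:

>>> from io import BytesIO
>>> f = BytesIO()
>>> f.write('中文'.encode('utf-8'))   # what is written is not a str but UTF-8 encoded bytes
6
>>> print(f.getvalue())
b'\xe4\xb8\xad\xe6\x96\x87'

As with StringIO, you can initialize a BytesIO with bytes and then read it just like a file: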

>>> from io import BytesIO
>>> f = BytesIO(b'\xe4\xb8\xad\xe6\x96\x87')
>>> f.read()
b'\xe4\xb8\xad\xe6\x96\x87'
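One detail worth sketching: after writing to a BytesIO, the stream position sits at the end, so read() returns nothing until you seek back to the start. The variable names below are illustrative.

```python
from io import BytesIO

f = BytesIO()
f.write('中文'.encode('utf-8'))   # position is now at the end of the buffer

at_end = f.read()                 # nothing is left to read from the current position
print(at_end)                     # b''

f.seek(0)                         # move back to the beginning
data = f.read()
text = data.decode('utf-8')       # decode the bytes back into a str
print(text)                       # 中文
```

getvalue() is not affected by the position and always returns the whole buffer.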

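3. Working with files and directories

Python's built-in os module can call the interface functions provided by the operating system directly.

Open the Python interactive shell and try the basic features of the os module:

>>> import os
>>> os.name # operating system type
'nt'

If the value is posix, the system is Linux, Unix, or Mac OS X; if it is nt, the system is Windows.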

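Exercise
Write a program that searches the current directory and all of its subdirectories for files whose names contain a given string, and prints their relative paths.

Three similar solutions are shown here; they use the following functions:

Function	Purpose
os.walk()	Walk the target directory tree
os.getcwd()	Get the current working directory
print(os.path.join(root,name))	Join the directory being visited with the matching file name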

import os
from os import path

def searchFile(folder, keystr):
    all_files = os.walk(folder)  # walk the tree rooted at folder; all_files is a generator
    for root, dirs, files in all_files:  # root: the directory being visited, dirs: its subdirectory names, files: its file names
        for file_name in files:  # check every file name; iterate over dirs instead to match directory names
            if keystr.lower() in file_name.lower():  # case-insensitive match
                print(path.join(root, file_name))  # join the directory with the matching file name

if __name__ == "__main__":  # Test
    searchFile('.', 'lib')  # '.' is the current directory; you could also pass 'D:/' or 'py'

# Output
.\venv\Lib\site-packages\pip-19.0.3-py3.8.egg\pip\_internal\utils\glibc.py

import os

def searchfile(keyword):
    for root, dirs, files in os.walk('.', topdown=False):
        for name in files:
            if name.find(keyword) != -1:  # str.find() returns the index of the first match, or -1 if the substring is absent
                print(os.path.join(root, name))  # join the directory with the matching file name

if __name__ == '__main__':
    searchfile('test')  # search for file names containing 'test'

# Output
.\venv\Lib\site-packages\pip-19.0.3-py3.8.egg\pip\_vendor\pytoml\test.py
.\venv\Lib\site-packages\pip-19.0.3-py3.8.egg\pip\_vendor\webencodings\tests.py

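import os

def search(s):
    for root, dirs, files in os.walk(os.getcwd()):  # os.getcwd() returns the current working directory
        for name in files:       # files whose names contain s
            if s in name:
                print(os.path.join(root, name))
        for name in dirs:        # directories whose names contain s
            if s in name:
                print(os.path.join(root, name))

if __name__ == '__main__':   # test
    search('py')             # search for names containing 'py'

# Output
(the paths of every file and directory under the current directory D:\Pythonhello\venv\Lib whose name contains 'py'; they are absolute paths, because os.walk starts from os.getcwd())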
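The exercise asks for relative paths, whereas walking from os.getcwd() produces absolute ones. One possible variant, sketched below, converts each match with os.path.relpath and returns a list instead of printing. The function name find_files and the temporary directory tree are assumptions made for this illustration.

```python
import os
import tempfile

def find_files(keyword, start='.'):
    """Return paths, relative to start, of files whose names contain keyword."""
    matches = []
    for root, dirs, files in os.walk(start):
        for name in files:
            if keyword in name:
                full_path = os.path.join(root, name)
                matches.append(os.path.relpath(full_path, start))
    return sorted(matches)

# Build a throwaway directory tree so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp_dir:
    os.makedirs(os.path.join(tmp_dir, 'sub'))
    for rel in ('test_a.txt', os.path.join('sub', 'my_test.py'), 'other.txt'):
        with open(os.path.join(tmp_dir, rel), 'w') as f:
            f.write('x')

    found = find_files('test', tmp_dir)

print(found)
```

Returning a list rather than printing makes the function easier to reuse and to test.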

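4. Serialization and deserialization

Serialization: converting an in-memory object into a byte sequence (or a string)
Deserialization: converting a byte sequence back into an in-memory object

4.1 The pickle module

Writing to a file: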

>>> import pickle
>>> d = dict(name='Bob', age=20, score=88)
>>> pickle.dumps(d)    # dumps() serializes the object to bytes
b'\x80\x03}q\x00(X\x03\x00\x00\x00ageq\x01K\x14X\x05\x00\x00\x00scoreq\x02KXX\x04\x00\x00\x00nameq\x03X\x03\x00\x00\x00Bobq\x04u.'

>>> f = open('dump.txt', 'wb')  # open a file in binary write mode to hold the pickled bytes
>>> pickle.dump(d, f)  # dump (no s) writes to a file object
>>> f.close()

Reading from a file:

>>> f = open('dump.txt', 'rb')
>>> d = pickle.load(f)   # load (no s) reads from a file object
>>> f.close()
>>> d
{'age': 20, 'score': 88, 'name': 'Bob'}
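The same round trip can be sketched with the with statement, alongside the in-memory pair dumps()/loads(). The temporary directory and the file name dump.bin are assumptions used to keep the example self-contained.

```python
import os
import pickle
import tempfile

d = dict(name='Bob', age=20, score=88)

# In memory: object -> bytes -> object
data = pickle.dumps(d)
restored_from_bytes = pickle.loads(data)

# Through a file: the with statement closes the file for us
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'dump.bin')   # hypothetical file name
    with open(path, 'wb') as f:
        pickle.dump(d, f)
    with open(path, 'rb') as f:
        restored_from_file = pickle.load(f)

print(restored_from_bytes)
print(restored_from_file)
```

Only unpickle data from sources you trust, since loading a pickle can execute arbitrary code.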

4.2 The json module

1. dumps(): serialization

import json
dic = {'a': '1', "b": '2'}
print(type(dic), dic)
str_dic = json.dumps(dic)
print(type(str_dic), str_dic)

# Output:
<class 'dict'> {'a': '1', 'b': '2'}
<class 'str'> {"a": "1", "b": "2"}

2. loads(): deserialization

dic_d = json.loads(str_dic)
print(type(dic_d), dic_d)
# Output:
<class 'dict'> {'a': '1', 'b': '2'}

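3. dump and load without the s (file operations)

import json
dic = {'a': '1', "b": '2'}
f = open('fff', 'w', encoding='utf-8')   # the file name is up to you
json.dump(dic, f)    # dump (no s) writes to a file object
f.close()

f = open('fff', encoding='utf-8')
res = json.load(f)   # load (no s) reads from a file object
f.close()
print(type(res), res)
# Output: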
<class 'dict'> {'a': '1', 'b': '2'}

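4. When the content contains Chinese
When the data contains Chinese characters, pass ensure_ascii=False so that the output shows the characters themselves instead of escape sequences:

import json
dic = {'成绩': '优秀'}
print(type(dic), dic)

str_dic1 = json.dumps(dic)    # non-ASCII characters are escaped as \uXXXX
print(type(str_dic1), str_dic1)

str_dic2 = json.dumps(dic, ensure_ascii=False)  # Chinese characters are kept as they are
print(type(str_dic2), str_dic2)

# Output: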
<class 'dict'> {'成绩': '优秀'}    
<class 'str'> {"\u6210\u7ee9": "\u4f18\u79c0"}
<class 'str'> {"成绩": "优秀"}
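The same option applies when writing to a file with json.dump(). A minimal sketch, assuming a temporary directory and a made-up file name score.json:

```python
import json
import os
import tempfile

dic = {'成绩': '优秀'}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'score.json')   # hypothetical file name

    # Open with an explicit UTF-8 encoding so the Chinese text is stored portably.
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(dic, f, ensure_ascii=False)

    with open(path, 'r', encoding='utf-8') as f:
        raw_text = f.read()        # the text as stored in the file

    with open(path, 'r', encoding='utf-8') as f:
        loaded = json.load(f)      # parsed back into a dict

print(raw_text)
print(loaded)
```

Whether or not ensure_ascii is set, json.load() restores the same dict; the option only changes how the text looks on disk.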