Reading files in Python: open and pandas

milliechu

Reading a file with open

.read()
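- open is built in, so reading a file this way requires no extra package.
- 'r' stands for read.
- If the file is in the same folder as the notebook, the file name alone is enough; no full path is needed.
- file.read() reads the entire contents of the file.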

file = open('data.txt','r')
print(file.read())             # read the entire contents of the file

,A,B,C,D
0,foo,one,small,1
1,foo,one,large,2
2,foo,one,large,8
3,foo,two,small,3
4,foo,two,small,3
5,bar,one,large,4
6,bar,one,small,5
7,bar,two,small,6
8,bar,two,large,7

file = open('data.txt','r')
print(file.read(5))           # read the first five characters of the file

,A,B,
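
The two calls above can also be sketched with a with block, which closes the file automatically once the block ends. The sample text is written to a temporary file only so that the snippet runs on its own; the names sample_text and sample_path are illustrative assumptions, not part of the original notebook.

```python
import os
import tempfile

# Illustrative sample: the first three lines of the data.txt shown above.
sample_text = ",A,B,C,D\n0,foo,one,small,1\n1,foo,one,large,2\n"

# Write the sample to a temporary file so the example is self-contained.
sample_path = os.path.join(tempfile.mkdtemp(), "data.txt")
with open(sample_path, "w") as f:
    f.write(sample_text)

# read() with no argument returns the whole file as a single string.
with open(sample_path, "r") as f:
    everything = f.read()

# read(5) returns only the first five characters.
with open(sample_path, "r") as f:
    first_five = f.read(5)

print(everything)
print(first_five)
```

Opening the file a second time matters here: after read() the file position is at the end, so a further read on the same object would return an empty string.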

.readlines()
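- Replacing .read with .readlines gives a different result:
- Everything is returned in a list whose items are the lines of the original file, and \n marks a line break. The whole content is read, formatting included. Because the result is a list, a specific line can be picked out by index, which .read cannot do directly.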

file = open('data.txt','r')
print(file.readlines())          # read the whole file as a list of lines

[',A,B,C,D\n', '0,foo,one,small,1\n', '1,foo,one,large,2\n', '2,foo,one,large,8\n', '3,foo,two,small,3\n', '4,foo,two,small,3\n', '5,bar,one,large,4\n', '6,bar,one,small,5\n', '7,bar,two,small,6\n', '8,bar,two,large,7\n']

file = open('data.txt','r')
print(file.readlines()[5])         # read the line at index 5, i.e. the sixth line (indexing starts at 0)

4,foo,two,small,3
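
As a rough sketch of how the indexing works, the snippet below writes the first six lines of the sample data to a temporary file, reads them back with readlines(), and picks out one line. The temporary file and the variable names are assumptions made for the example.

```python
import os
import tempfile

# First six lines of the sample data (header plus rows 0 to 4).
rows = [
    ",A,B,C,D",
    "0,foo,one,small,1",
    "1,foo,one,large,2",
    "2,foo,one,large,8",
    "3,foo,two,small,3",
    "4,foo,two,small,3",
]

sample_path = os.path.join(tempfile.mkdtemp(), "data.txt")
with open(sample_path, "w") as f:
    f.write("\n".join(rows) + "\n")

with open(sample_path, "r") as f:
    lines = f.readlines()  # list of strings, each ending with '\n'

# Index 5 is the sixth line, because list indexes start at 0.
sixth_line = lines[5].rstrip("\n")  # drop the trailing newline
print(len(lines))
print(sixth_line)
```

Since the header occupies index 0, the list index happens to line up with the row label plus one: index 5 holds the row labelled 4.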

Processing the contents while reading (using a for loop)

file = open('data.txt','r') 
i =1
for line in file:
    print('read line',i) 
    i = i+1 
    print(line)              # each line already ends with \n, so print leaves a blank line after it

read line 1
,A,B,C,D

read line 2
0,foo,one,small,1

read line 3
1,foo,one,large,2

read line 4
2,foo,one,large,8

read line 5
3,foo,two,small,3

read line 6
4,foo,two,small,3

read line 7
5,bar,one,large,4

read line 8
6,bar,one,small,5

read line 9
7,bar,two,small,6

read line 10
8,bar,two,large,7
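
A possible variation on the loop above uses enumerate in place of the manual counter and strips the trailing newline so that no blank lines appear in the output. Here io.StringIO stands in for the object returned by open('data.txt','r'); that substitution and the list numbered are assumptions made to keep the example self-contained.

```python
import io

# io.StringIO behaves like an open text file; it replaces open('data.txt', 'r') here.
file = io.StringIO(",A,B,C,D\n0,foo,one,small,1\n1,foo,one,large,2\n")

numbered = []
# enumerate(..., start=1) supplies the line counter, so no manual i = i + 1 is needed.
for i, line in enumerate(file, start=1):
    text = line.rstrip("\n")  # remove the newline that caused the blank lines
    numbered.append((i, text))
    print("read line", i, text)
```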

Reading other kinds of files
- csv

file = open('data.csv','r') 
i =1
for line in file:
    print('read line',i) 
    i = i+1 
    print(line)

read line 1
,A,B,C,D

read line 2
0,foo,one,small,1

read line 3
1,foo,one,large,2

read line 4
2,foo,one,large,8

read line 5
3,foo,two,small,3

read line 6
4,foo,two,small,3

read line 7
5,bar,one,large,4

read line 8
6,bar,one,small,5

read line 9
7,bar,two,small,6

read line 10
8,bar,two,large,7
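
Reading a csv file line by line yields plain strings. If the individual fields are needed, one option not used in the original is the csv module from the standard library. The sketch below assumes the same sample data and again uses io.StringIO in place of a real file.

```python
import csv
import io

# Stand-in for open('data.csv', 'r'); the content is the start of the sample data.
csv_file = io.StringIO(",A,B,C,D\n0,foo,one,small,1\n1,foo,one,large,2\n")

reader = csv.reader(csv_file)  # splits every line on commas
header = next(reader)          # the first row holds the column names
records = list(reader)         # the remaining rows, each a list of strings

print(header)
print(records)
```

Every field comes back as a string, so numeric columns such as D still have to be converted with int() or float() before doing arithmetic.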

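Writing to a file

'w' stands for write.
Always call file.close() when the writing is done; this makes sure the data is actually saved to the file.
\n stands for a line break.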

file = open('hello.txt','w')
file.write('Hello World!')
file.close()
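
A minimal sketch of the same write using a with block, which calls close() automatically, and \n to separate two lines. The temporary directory, the second line of text, and the variable names are assumptions made for the example.

```python
import os
import tempfile

# Write into a temporary directory so no existing file is overwritten.
out_path = os.path.join(tempfile.mkdtemp(), "hello.txt")

# 'w' creates the file, or empties it first if it already exists.
with open(out_path, "w") as f:
    f.write("Hello World!\n")  # write() adds no newline by itself
    f.write("Second line\n")

# Read the file back to check what was written.
with open(out_path, "r") as f:
    written = f.read()

print(written)
```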

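Reading files with pandas

open is the most convenient way to open a txt file; for large amounts of data, pandas is generally used instead.
pandas can reproduce the table layout of the data stored in a csv file.
An extra column, Unnamed: 0, appears because pandas adds its own index while the table already has one. To get rid of it, pass index_col=0.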

import pandas as pd
df = pd.read_csv('data.csv')
df

	Unnamed: 0	A	B	C	D
0	0	foo	one	small	1
1	1	foo	one	large	2
2	2	foo	one	large	8
3	3	foo	two	small	3
4	4	foo	two	small	3
5	5	bar	one	large	4
6	6	bar	one	small	5
7	7	bar	two	small	6
8	8	bar	two	large	7

import pandas as pd
df = pd.read_csv('data.csv',index_col=0)
df

	A	B	C	D
0	foo	one	small	1
1	foo	one	large	2
2	foo	one	large	8
3	foo	two	small	3
4	foo	two	small	3
5	bar	one	large	4
6	bar	one	small	5
7	bar	two	small	6
8	bar	two	large	7
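
The effect of index_col=0 can be checked without a file on disk by passing an in-memory buffer to pd.read_csv. This is only a sketch under that assumption; csv_text holds the first lines of the sample data and the variable names are illustrative.

```python
import io

import pandas as pd

csv_text = ",A,B,C,D\n0,foo,one,small,1\n1,foo,one,large,2\n"

# Without index_col, the unnamed first column is kept as a regular column.
df_default = pd.read_csv(io.StringIO(csv_text))

# With index_col=0, the first column becomes the row index instead.
df_indexed = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(df_default.columns))
print(list(df_indexed.columns))
```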

Reading an Excel file

import pandas as pd
df = pd.read_excel('data.xlsx')
df

	A	B	C	D
0	foo	one	small	1
1	foo	one	large	2
2	foo	one	large	8
3	foo	two	small	3
4	foo	two	small	3
5	bar	one	large	4
6	bar	one	small	5
7	bar	two	small	6
8	bar	two	large	7

Reading a txt file

import pandas as pd
df = pd.read_table('data.txt')
df

	,A,B,C,D
0	0,foo,one,small,1
1	1,foo,one,large,2
2	2,foo,one,large,8
3	3,foo,two,small,3
4	4,foo,two,small,3
5	5,bar,one,large,4
6	6,bar,one,small,5
7	7,bar,two,small,6
8	8,bar,two,large,7

import pandas as pd
df = pd.read_table('data.txt',index_col=0)          # the default separator is a tab, so each whole line is one column, and that column becomes the index
df


,A,B,C,D
0,foo,one,small,1
1,foo,one,large,2
2,foo,one,large,8
3,foo,two,small,3
4,foo,two,small,3
5,bar,one,large,4
6,bar,one,small,5
7,bar,two,small,6
8,bar,two,large,7

import pandas as pd
df = pd.read_table('data.txt',sep = ',',index_col=0)          # split on commas and use the first column as the index
df

	A	B	C	D
0	foo	one	small	1
1	foo	one	large	2
2	foo	one	large	8
3	foo	two	small	3
4	foo	two	small	3
5	bar	one	large	4
6	bar	one	small	5
7	bar	two	small	6
8	bar	two	large	7
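
The difference between the calls above comes from the separator: pd.read_table splits on tabs by default, so comma-separated text ends up in a single column until sep=',' is given. A small sketch using an in-memory buffer (an assumption, in place of data.txt) makes this visible through the shape of each result.

```python
import io

import pandas as pd

text = ",A,B,C,D\n0,foo,one,small,1\n1,foo,one,large,2\n"

# Default separator is a tab: no tabs in the text, so each line is one field.
df_tab = pd.read_table(io.StringIO(text))

# With sep=',' the fields are split, and index_col=0 uses the first one as the index.
df_comma = pd.read_table(io.StringIO(text), sep=",", index_col=0)

print(df_tab.shape)
print(df_comma.shape)
```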

Saving files

df.to_excel('dat.xlsx')
df.to_csv('dat.csv')
df.to_csv('dat.txt')
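
By default to_csv writes the index as the first column, which is where an Unnamed: 0 column comes from when the file is read back without index_col=0. The round trip below is a sketch with a small made-up DataFrame and a temporary path, both of which are assumptions for illustration.

```python
import os
import tempfile

import pandas as pd

# A small illustrative DataFrame, not the full data set from the article.
df = pd.DataFrame({"A": ["foo", "bar"], "D": [1, 4]})

path = os.path.join(tempfile.mkdtemp(), "dat.csv")
df.to_csv(path)  # the index is written as an unnamed first column

# Reading back with index_col=0 restores the original layout.
restored = pd.read_csv(path, index_col=0)

# index=False leaves the index out of the output altogether.
text_without_index = df.to_csv(index=False)

print(restored)
print(text_without_index)
```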

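Reading a txt file with a more complex layout

skiprows lists the rows to skip while reading, counted from 0.
header=None means the column names are not part of the data; the columns are then numbered starting from 0.
names supplies your own column names.
nrows=5 means only the first five rows of data are read.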

df = pd.read_table('data1.txt',sep = ',')
df

				# real data
#num1	num2	num3	num4	message
# good data	NaN	NaN	NaN	NaN
# csv file	NaN	NaN	NaN	NaN
1	2	3	4	hello
5	6	7	8	world
9	10	11	12	hello
1	2	3	4	hello
5	6	7	8	good
9	10	11	12	fine

df = pd.read_table('data1.txt',sep = ',',skiprows = [0,1,2,3],header = None,names = ['n1','n2','n3','n4','message'], index_col = ['message'],nrows = 5)
df

	n1	n2	n3	n4
message
hello	1	2	3	4
world	5	6	7	8
hello	9	10	11	12
hello	1	2	3	4
good	5	6	7	8

df = pd.read_table('data1.txt',sep = ',',skiprows = [0,2,3], index_col = ['message'],nrows = 3)         # keep row 1 as the header and read only the first three data rows
df

	#num1	num2	num3	num4
message
hello	1	2	3	4
world	5	6	7	8
hello	9	10	11	12
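
The options above can be tried on an in-memory copy of a file shaped like data1.txt. The exact contents of data1.txt are not shown in the article, so the text in raw is an assumed reconstruction: four comment-like lines followed by data rows.

```python
import io

import pandas as pd

# Assumed layout: four non-data lines, then comma-separated rows.
raw = (
    "# real data\n"
    "#num1,num2,num3,num4,message\n"
    "# good data\n"
    "# csv file\n"
    "1,2,3,4,hello\n"
    "5,6,7,8,world\n"
    "9,10,11,12,hello\n"
    "1,2,3,4,hello\n"
)

df = pd.read_table(
    io.StringIO(raw),
    sep=",",
    skiprows=[0, 1, 2, 3],   # skip the four non-data lines
    header=None,             # no header row is left after skipping
    names=["n1", "n2", "n3", "n4", "message"],  # supply the column names
    nrows=3,                 # read only the first three data rows
)

print(df)
```

Because nrows=3 stops after three rows, the fourth data row in raw is never read.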

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_table.html

This page documents pandas.read_table. Whenever you run into a function you do not know how to use, look it up in its reference documentation.
