Notes
Opening files
## Opening, reading, and writing
fobj = open("mbox.txt")
for line in fobj:
    print(line)
fobj.close()  # the parentheses are required; fobj.close alone does not close the file
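Pay attention to the type of the object you get back!
A file opened directly is a text file object: iterating over it yields lines (long strings), and iterating over one line (with in) yields single characters.
.read() returns one string; iterating over it yields single characters.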
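A small sketch of the difference described above. It writes its own throwaway file (the name demo.txt is made up for this example) so it does not depend on mbox.txt.

```python
import os
import tempfile

# Create a small throwaway file so the sketch runs anywhere.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w") as f:
    f.write("ab\ncd\n")

with open(path) as f:
    lines = [line for line in f]      # iterating the file object yields lines

chars_of_first_line = list(lines[0])  # iterating one line yields characters

with open(path) as f:
    text = f.read()                   # .read() returns a single str
chars_of_text = list(text)            # iterating that str yields characters

print(lines)                # ['ab\n', 'cd\n']
print(chars_of_first_line)  # ['a', 'b', '\n']
print(type(text).__name__)  # str
```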
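File open modes
"r" read
"a" append (add to the end)
"w" write (erases existing content)
"x" create (fails if the file already exists)
File types
"t" - text - text file, read as characters (str) through a text encoding such as ASCII or UTF-8
"b" - binary - file stored as raw bytes, e.g. "rb"
# The defaults are text file and read mode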
a=open('mbox.txt','rt')
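The modes listed above can be tried out on a scratch file. This is only a sketch; the file name modes.txt is an assumption made for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes.txt")

with open(path, "x") as f:      # "x": create; the file must not exist yet
    f.write("one\n")

try:
    open(path, "x")             # a second "x" on the same path fails
    second_x_failed = False
except FileExistsError:
    second_x_failed = True

with open(path, "a") as f:      # "a": append to the end
    f.write("two\n")

with open(path, "r") as f:      # "r" is the same as "rt", the default
    after_append = f.read()

with open(path, "w") as f:      # "w": erase the old content, then write
    f.write("three\n")

with open(path, "rb") as f:     # "rb": read raw bytes instead of str
    raw = f.read()

print(second_x_failed)
print(repr(after_append))
print(type(raw).__name__)
```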
Using .read() and .readlines(): .readlines() returns the lines as a list, which prints on a single row
poem=open('Mei_flower.txt').read()
print(poem)
# Print one line of the text: .read() turns the file into a string, so a slice selects specific characters
print(poem[9:17])
>冰雪林中著此身,
不同桃李混芳尘。
忽然一夜清香发,
散作乾坤万里春。
不同桃李混芳尘。
poem=open('Mei_flower.txt').readlines()
print(poem)
>['冰雪林中著此身,\n', '不同桃李混芳尘。\n', '忽然一夜清香发,\n', '散作乾坤万里春。']
Finding lines that contain a given string (at the start of the line: startswith / anywhere in the line: in)
# method 1
a = open('mbox-short.txt')
for b in a:
    b = b.rstrip()
    if b.startswith('From:'):
        print(b)
a.close()
# method 2
a = open('mbox-short.txt')
for b in a:
    b = b.rstrip()
    if not b.startswith('From:'):
        continue
    print(b)
a.close()
>>>From: stephen.marquard@uct.ac.za
From: louis@media.berkeley.edu
From: zqian@umich.edu
From: rjlowe@iupui.edu
From: zqian@umich.edu
...
# anywhere in the line
a = open('mbox-short.txt')
for b in a:
    b = b.rstrip()
    if '@uct.ac.za' not in b:
        continue
    print(b)
a.close()
>>>From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008
X-Authentication-Warning: nakamura.uits.iupui.edu: apache set sender to stephen.marquard@uct.ac.za using -f
From: stephen.marquard@uct.ac.za
Author: stephen.marquard@uct.ac.za
Counting how often a word occurs in a file
8.41
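The notes only record a timestamp here, so the following is a guess at what such a count could look like, not the lecture's code. The helper count_word is a made-up name, and it assumes a "word" means a whitespace-separated token that matches exactly (case-sensitive).

```python
import os
import tempfile

def count_word(path, word):
    """Count whitespace-separated tokens in the file that equal `word` exactly."""
    total = 0
    with open(path) as fd:
        for line in fd:
            total += line.split().count(word)  # list.count, one line at a time
    return total

# Throwaway sample file so the sketch runs without mbox.txt.
path = os.path.join(tempfile.mkdtemp(), "words.txt")
with open(path, "w") as f:
    f.write("From a to b\nFrom c From d\nfrom e\n")

print(count_word(path, "From"))  # 3
print(count_word(path, "from"))  # 1
```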
Writing files
Use with so the file is closed automatically
8.45
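A minimal sketch of writing inside a with block. The file name out.txt is made up; the point is that the file is already closed once the block ends, with no explicit close() call.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "out.txt")

with open(path, "w") as out:
    out.write("first line\n")
    out.write("second line\n")

closed_after_block = out.closed   # True: with closed the file for us

with open(path) as f:
    content = f.read()

print(closed_after_block)
print(content)
```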
w+ writes from scratch; if the file already exists, its content is erased
Reading and writing the same file
8.49
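A sketch of "w+" on a scratch file (rw.txt is a made-up name). It shows two things: the old content is gone as soon as the file is opened, and you have to seek back to the start before you can read what you just wrote.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "rw.txt")

with open(path, "w") as f:
    f.write("old content\n")

with open(path, "w+") as f:
    size_on_open = os.path.getsize(path)  # 0: opening with "w+" erased the file
    f.write("new content\n")
    f.seek(0)                             # go back to the start before reading
    after = f.read()

print(size_on_open)
print(repr(after))
```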
Resetting the file position
seek: jump to the position with index n 8.54
tell: report the current position, i.e. how far into the file you are 8.58
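A short sketch of seek and tell. It opens the file in binary mode on purpose, so that positions are plain byte offsets; in text mode the number returned by tell() should only be passed back to seek(), not used for arithmetic.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "pos.bin")
with open(path, "wb") as f:
    f.write(b"0123456789")

with open(path, "rb") as f:
    start = f.tell()        # position right after opening
    first = f.read(3)       # reads b"012" and advances the position
    after_read = f.tell()
    f.seek(7)               # jump to byte offset 7
    tail = f.read()         # reads from offset 7 to the end
    end = f.tell()

print(start, after_read, end)  # 0 3 10
print(first, tail)             # b'012' b'789'
```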
Example with Western letters and Chinese characters 8.59
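The notes do not say what the example was. One plausible reading, given that it sits next to seek and tell, is that a Western letter and a Chinese character take up different numbers of bytes in a file. The sketch below assumes UTF-8; other encodings give other byte counts.

```python
import os
import tempfile

western = "abc"
chinese = "梅花"

# Characters versus bytes, assuming UTF-8.
western_bytes = len(western.encode("utf-8"))  # 1 byte per ASCII letter
chinese_bytes = len(chinese.encode("utf-8"))  # 3 bytes per character here

path = os.path.join(tempfile.mkdtemp(), "mixed.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write(western + chinese)

with open(path, encoding="utf-8") as f:
    n_chars = len(f.read())       # counted in characters
n_bytes = os.path.getsize(path)   # counted in bytes on disk

print(len(western), western_bytes)  # 3 3
print(len(chinese), chinese_bytes)  # 2 6
print(n_chars, n_bytes)             # 5 9
```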
Rewriting a file, and writing to a newly created file 9.00, 9.01
pickling
9.08,9.10
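Only timestamps are recorded for pickling, so this is a generic sketch using the standard pickle module, not the lecture's example. The dictionary and the file name data.pkl are made up. Pickle files must be opened in binary mode ("wb" / "rb"), and pickle.load should only be used on data you trust.

```python
import os
import pickle
import tempfile

data = {"name": "mbox", "lines": [1, 2, 3]}

path = os.path.join(tempfile.mkdtemp(), "data.pkl")

with open(path, "wb") as f:      # binary mode for writing the pickle
    pickle.dump(data, f)

with open(path, "rb") as f:      # binary mode for reading it back
    restored = pickle.load(f)

print(restored == data)   # True: same content
print(restored is data)   # False: a new object was built
```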
Exercises
Count the lines in a file
Write a Python function to return the number of lines in a text file of given name.
def linesInFile(m):
    count = 0
    # with closes the file; the old a.close after return never ran
    with open(m) as a:
        for i in a:
            count += 1
    return count
linesInFile('mbox-short.txt')
Count all the characters in a file
Write a Python function to return the number of characters in a text file of given name
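# Without .read() you have a file object and iterate line by line; with .read() you have a string and iterate character by character
def charsInFile(m):
    a = open(m)
    text = a.read()
    a.close()  # close the file object, not the string
    return len(text)  # no loop needed; this also works for an empty file
charsInFile('mbox-short.txt')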
# The same thing, opening the file with with
def charsInFile(f):
    with open(f) as fd:
        return len(fd.read())
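Count the occurrences of a given string in a file
## This approach is flawed! A line counts only once, even if the string appears in it twice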
def strsInFile(a, b):
    c = open(a)
    # print(type(c))  # c is a text file object
    e = 0
    for d in c:
        if b in d:
            e += 1
    c.close()
    return e
strsInFile('mbox.txt', 'from')
# Use the string count method directly
def strsInFile(f, s):
    with open(f) as fd:
        return fd.read().count(s)
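To see the difference between the two approaches, here is a side-by-side sketch on a tiny file. The function names lines_containing and occurrences are made up for the comparison. Note that str.count counts non-overlapping occurrences.

```python
import os
import tempfile

def lines_containing(path, s):
    """Count lines that contain s at least once (the per-line approach)."""
    with open(path) as fd:
        return sum(1 for line in fd if s in line)

def occurrences(path, s):
    """Count non-overlapping occurrences of s in the whole text."""
    with open(path) as fd:
        return fd.read().count(s)

path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as f:
    f.write("from a, from b\nfrom c\nnothing here\n")

print(lines_containing(path, "from"))  # 2: the first line is counted once
print(occurrences(path, "from"))       # 3: every occurrence is counted
```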
Remove all blank lines from a file
def removeBlankLine(f):
    with open('testFile.txt', 'w') as file:
        with open(f) as fd:
            for line in fd:
                if line != '\n':
                    file.write(line)
# Check if there exists a file named testFile.txt. It should not be there.
# If there is a file named testFile.txt and you want to keep it,
# please move it to another place so that it will not be overwritten.
import os
os.path.isfile('testFile.txt')
# Run your function and create a new file named testFile.txt
removeBlankLine('mbox-short.txt')
# Now that file should exist
os.path.isfile('testFile.txt')
linesInFile('mbox-short.txt')
linesInFile('testFile.txt')
removeBlankLine('mbox.txt')
print(linesInFile('mbox.txt')-linesInFile('testFile.txt'), 'blank lines removed.')
# delete the file testFile.txt
# Run the following command if your OS is Windows
!del testFile.txt
# Run the following command if your OS is OSX or Linux
#!rm testFile.txt