Python Text Processing: Sorting Lines and Reformatting Paragraphs

Sorting lines

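We often need to sort the contents of a file for analysis. For example, we may want the sentences written by different students arranged alphabetically by name. This means comparing not just the first character of each line but all characters, starting from the left. In the programs below, we first read the lines from a file and print them, then sort them with the list sort method, which is built into Python.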
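To illustrate what "all characters from the left" means, here is a minimal sketch that compares two lines sharing the same first character. The variable names `a` and `b` are only for illustration.

```python
# Two lines that share the same first character.
a = "Sky is bright."
b = "Summer is here."

# Strings are compared character by character from the left:
# 'S' == 'S', then 'k' < 'u', so a sorts before b.
print(a < b)

# sorted() applies the same comparison to a whole list.
print(sorted([b, a]))
```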

Printing the file

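file_name = "D:/path/poem.txt"
# file() exists only in Python 2; open() works in Python 3
with open(file_name) as f:
    data = f.readlines()
# Each line keeps its trailing newline, so print() leaves a blank line after it
for line in data:
    print(line)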

Running the program above produces the following output:

Summer is here.

Sky is bright.

Birds are gone.

Nests are empty.

Where is Rain?

Sorting the lines in the file

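file_name = r"D:\pathto\poem.txt"  # raw string keeps the backslashes literal
with open(file_name) as f:
    data = f.readlines()
data.sort()  # sorts the list in place, comparing lines character by character
for line in data:
    print(line)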

Running the program above produces the following output:

Birds are gone.

Nests are empty.

Sky is bright.

Summer is here.

Where is Rain?
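One thing to keep in mind is that the default sort compares code points, so uppercase letters come before lowercase ones. The following is a minimal sketch, assuming a hypothetical in-memory list of lines with mixed case rather than the poem file, that shows how `key=str.casefold` gives a case-insensitive order.

```python
# Hypothetical sample lines; in the article these would come from poem.txt.
lines = ["summer is here.", "Sky is bright.", "birds are gone.", "Nests are empty."]

# Default ordering compares code points, so uppercase letters sort first.
default_order = sorted(lines)

# key=str.casefold compares the lines without regard to case.
caseless_order = sorted(lines, key=str.casefold)

for line in caseless_order:
    print(line)
```

Unlike `data.sort()`, `sorted()` returns a new list and leaves the original untouched, which is convenient when both orders are needed.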

Reformatting paragraphs

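When we process large amounts of text and need to present it in a readable form, we have to format paragraphs. We may want to print every line at a specific width, or adjust the indentation of each line when printing a poem. In this chapter, we use the textwrap3 module, a third-party backport of the standard library's textwrap module, to format paragraphs as needed.
First, install the required package as follows: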

pip install textwrap3

Wrapping to a fixed width

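In this example, we specify a width of 30 characters for each line of the paragraph. We do this by passing a value for the width parameter of the wrap function.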

from textwrap3 import wrap

text = 'In late summer 1945, guests are gathered for the wedding reception of Don Vito Corleones daughter Connie (Talia Shire) and Carlo Rizzi (Gianni Russo). Vito (Marlon Brando), the head of the Corleone Mafia family, is known to friends and associates as Godfather. He and Tom Hagen (Robert Duvall), the Corleone family lawyer, are hearing requests for favors because, according to Italian tradition, no Sicilian can refuse a request on his daughters wedding day.'

# wrap() returns a list of lines, each at most 30 characters wide
for line in wrap(text, width=30):
    print(line)

Running the program above produces the following output:

In late summer 1945, guests
are gathered for the wedding
reception of Don Vito
Corleones daughter Connie
(Talia Shire) and Carlo Rizzi
(Gianni Russo). Vito (Marlon
Brando), the head of the
Corleone Mafia family, is
known to friends and
associates as Godfather. He
and Tom Hagen (Robert Duvall),
the Corleone family lawyer,
are hearing requests for
favors because, according to
Italian tradition, no Sicilian
can refuse a request on his
daughters wedding day.
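When a single string is more convenient than a list of lines, `fill()` can be used in place of `wrap()`. The sketch below is an assumption-laden variant of the example above: it uses the standard library `textwrap` module rather than textwrap3, a shortened version of the sample text, and indent strings chosen only for illustration.

```python
import textwrap

text = ("In late summer 1945, guests are gathered for the wedding "
        "reception of Don Vito Corleones daughter Connie.")

# fill() wraps like wrap() but returns one string joined with newlines.
# The indent strings count toward the 30-character width.
paragraph = textwrap.fill(
    text,
    width=30,
    initial_indent="    ",     # prefix for the first line only
    subsequent_indent="  ",    # prefix for every following line
)
print(paragraph)
```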

Variable indentation

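In this example, each line of the poem starts with a different amount of indentation, and we remove it so that every line begins at the left margin.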

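import textwrap3

file_name = r"path\poem.txt"  # raw string keeps the backslash literal

# Read the file once; each line keeps its leading whitespace and trailing newline
with open(file_name) as f:
    data = f.readlines()

print("**Before Formatting**")
print()
for line in data:
    print(line.rstrip("\n"))  # drop the newline so print() does not double it

print()
print("**After Formatting**")
print()
for line in data:
    # dedent() removes the line's leading whitespace; strip() drops the trailing newline
    dedented_text = textwrap3.dedent(line).strip()
    print(dedented_text)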

Running the program above produces the following output:

**Before Formatting**

 Summer is here.
  Sky is bright.
    Birds are gone.
     Nests are empty.
      Where is Rain?

**After Formatting**

Summer is here.
Sky is bright.
Birds are gone.
Nests are empty.
Where is Rain?
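The program above calls dedent on one line at a time because dedent only removes whitespace that is common to every line it is given. The sketch below, which assumes the standard library `textwrap` module and a three-line excerpt of the poem held in a string, shows that difference and then uses `indent()` to add a uniform indentation back.

```python
import textwrap

poem = " Summer is here.\n  Sky is bright.\n    Birds are gone.\n"

# On the whole block, dedent() strips only the shared margin (one space here).
block_dedent = textwrap.dedent(poem)

# Stripping each line separately removes all leading whitespace.
flat = "\n".join(line.strip() for line in poem.splitlines())

# indent() prefixes every non-blank line with the given string.
indented = textwrap.indent(flat, "    ")
print(indented)
```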