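Beginning Python Notes - Project 1 - Instant Markup - Day 2

Yesterday I mainly got to know generators. Today I start by testing lines with a document.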

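import sys

def lines(file):
    # Yield every line of the file, then one extra '\n' so the
    # last block is always followed by a blank line.
    for line in file: yield line
    yield '\n'

for i in lines(sys.stdin):
    if i:
        print i
        print '---'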
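To see what lines produces without going through stdin, here is a minimal sketch that feeds it a plain list instead of a file. It assumes Python 3 (so `print` is a function), and the two-element input list is made up for illustration.

```python
def lines(file):
    """Yield each line of an iterable, then one extra newline."""
    for line in file:
        yield line
    yield '\n'

gen = lines(['hello\n', 'fine'])

# A generator is lazy: next() pulls exactly one value from it.
first = next(gen)

# list() drains whatever is left, including the extra '\n' at the end.
rest = list(gen)

print(repr(first))   # 'hello\n'
print(rest)          # ['fine', '\n']
```

Note that even an empty input still yields the single trailing '\n'.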

Test document, test_input.txt:

hello

how are you
how do you do

fine

Run:



Output, test_output.txt:
hello

---


---
how are you

---
how do you do

---


---
fine
---


---

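Note that hello is followed by a blank line before the '---'. I believe the reason is as follows.

First, iterating over test_input.txt line by line is equivalent to iterating over this list:

test_input = ['hello\n','\n','how are you\n','how do you do\n','\n','fine']

Second, print adds a newline after whatever it prints:

print '1'
print '2'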

The output is:

1
2

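That is, it prints 1, then a newline, then 2, then another newline, and then the program ends.

The same thing happens in test_output.txt:

First 'hello\n' is printed, followed by print's own newline, then '---' is printed. Next '\n' is printed, followed by another newline, then '---' again, and so on.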
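The extra newline can be made visible by capturing what print writes and looking at its repr. This is only a sketch, assuming Python 3; `printed` is a hypothetical helper introduced here, not part of the project.

```python
import io

def printed(text):
    """Return exactly what print(text) writes, as a string."""
    buf = io.StringIO()
    print(text, file=buf)   # write into the buffer instead of stdout
    return buf.getvalue()

# print always appends one '\n' to whatever it is given.
for line in ['hello\n', '\n', 'fine']:
    print(repr(line), '->', repr(printed(line)))
```

A line that already ends in '\n' therefore shows up with a blank line after it, while 'fine' (no trailing newline in the file) does not.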

 

Next, let's look at the blocks generator:

def blocks(file):
    block = []
    for line in lines(file):
        if line.strip():
            # non-blank line: collect it into the current block
            block.append(line)
        elif block:
            # blank line after a non-empty block: emit the block
            yield ''.join(block).strip()
            block = []

test_input = ['hello\n','\n','how are you\n','how do you do\n','\n','fine']
for i in blocks(test_input):
    if i:
        print i
        print '---'

Result:

hello
---
how are you
how do you do
---
fine
---

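strip() removes whitespace characters such as '\n' from a string and returns the result. It only removes them from the beginning and end of the string, never from the middle: '\nhello\nhello'.strip() returns 'hello\nhello'. append adds an element to the end of a list. ''.join(block) concatenates the elements of block, using the empty string '' as the separator, and returns the joined string.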
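The three operations can be tried in isolation. The snippet below is a small sketch using sample strings taken from the test input above; the variable names are only for illustration.

```python
# strip() removes whitespace at both ends only, and returns a new string.
s = '\nhello\nhello\n'
stripped = s.strip()

# A line that contains only whitespace strips down to '', which is falsy.
# This is what the `if line.strip():` test in blocks relies on.
blank_is_falsy = not '\n'.strip()

# append() adds one element to the end of the list.
block = []
block.append('how are you\n')
block.append('how do you do\n')

# ''.join() glues the elements together with nothing in between.
joined = ''.join(block)
result = joined.strip()

print(repr(stripped))   # 'hello\nhello'
print(repr(joined))     # 'how are you\nhow do you do\n'
print(repr(result))     # 'how are you\nhow do you do'
```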

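Execution flow:

First, line = 'hello\n'. line.strip() is truthy, so the if branch runs and block becomes ['hello\n']. Next, line = '\n'. Here line.strip() returns '', which is falsy, so execution moves to the elif; block is ['hello\n'], so 'hello' is yielded and block is reset to an empty list. Then line = 'how are you\n': the if test is true and block becomes ['how are you\n']. Then line = 'how do you do\n': the if test is true again and block is now ['how are you\n', 'how do you do\n']. Then line = '\n': the if test is false, so the elif runs. ''.join(block) returns 'how are you\nhow do you do\n', and strip() turns that into 'how are you\nhow do you do' (again, strip() only removes whitespace at the two ends of the string, not in the middle). Finally, line = 'fine' is appended to block, and the extra '\n' that lines yields at the very end causes 'fine' to be yielded as the last block.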

In summary,

test_input = ['hello\n','\n','how are you\n','how do you do\n','\n','fine']   after passing through the blocks generator should produce:

['hello', 'how are you\nhow do you do', 'fine']

In other words, the input text is returned one block at a time.
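The summary above can be checked directly by collecting the generator's output with list(). The sketch below assumes Python 3 and also includes `blocks_no_sentinel`, a hypothetical variant written only for comparison: it skips lines and iterates over the input directly, which shows why the extra '\n' at the end matters.

```python
def lines(file):
    for line in file:
        yield line
    yield '\n'   # sentinel blank line that flushes the last block

def blocks(file):
    block = []
    for line in lines(file):
        if line.strip():
            block.append(line)
        elif block:
            yield ''.join(block).strip()
            block = []

def blocks_no_sentinel(file):
    # Hypothetical variant for comparison only: same logic as blocks,
    # but without the trailing '\n' that lines adds.
    block = []
    for line in file:
        if line.strip():
            block.append(line)
        elif block:
            yield ''.join(block).strip()
            block = []

test_input = ['hello\n', '\n', 'how are you\n', 'how do you do\n', '\n', 'fine']

result = list(blocks(test_input))
truncated = list(blocks_no_sentinel(test_input))

print(result)      # ['hello', 'how are you\nhow do you do', 'fine']
print(truncated)   # ['hello', 'how are you\nhow do you do']
```

Without the sentinel, 'fine' is collected into block but never yielded, because no blank line follows it.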


With the blocks generator we can already do some simple work:

util.py:

import sys, re

def lines(file):
    for line in file: yield line
    yield '\n'

def blocks(file):
    block = []
    for line in lines(file):
        if line.strip():
            block.append(line)
        elif block:
            yield ''.join(block).strip()
            block = []
			

print '<html><head><title>...</title><body>'
title = True  # the first block is treated as the page heading
for block in blocks(sys.stdin):
    # turn *text* into <em>text</em>
    block = re.sub(r'\*(.+?)\*', r'<em>\1</em>', block)
    if title:
        print '<h1>'
        print block
        print '</h1>'
        title = False
    else:
        # every later block becomes a paragraph
        print '<p>'
        print block
        print '</p>'

print '</body></html>'

Here re.sub performs a regular-expression substitution, replacing *XXX* with <em>XXX</em>.

For an explanation of regular expressions, see: http://www.cnblogs.com/huxi/archive/2010/07/04/1771073.html
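The `?` in `(.+?)` makes the match non-greedy, which matters as soon as a block contains more than one emphasized phrase. The following sketch compares the pattern used above with a greedy one; the sample sentence and the `emphasize` helper are made up for illustration.

```python
import re

# Same pattern as in the script above: non-greedy, so each *...* pair
# is matched on its own.
EMPHASIS = re.compile(r'\*(.+?)\*')

def emphasize(text):
    """Replace every *word* with <em>word</em>."""
    return EMPHASIS.sub(r'<em>\1</em>', text)

text = 'We sell *canned* meat and *bulk* email.'

non_greedy = emphasize(text)

# Greedy version for comparison: .+ runs from the first * to the last *.
greedy = re.sub(r'\*(.+)\*', r'<em>\1</em>', text)

print(non_greedy)  # We sell <em>canned</em> meat and <em>bulk</em> email.
print(greedy)      # We sell <em>canned* meat and *bulk</em> email.
```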

Run:


test_input.txt:



Welcome to World Wide Spam, Inc.


These are the corporate web pages of *World Wide Spam*, Inc. We hope
you find your stay enjoyable, and that you will sample many of our
products.

A short history of the company

World Wide Spam was started in the summer of 2000. The business
concept was to ride the dot-com wave and to make money both through
bulk email and by selling canned meat online.

After receiving several complaints from customers who weren't
satisfied by their bulk email, World Wide Spam altered their profile,
and focused 100% on canned goods. Today, they rank as the world's
13,892nd online supplier of SPAM.

Destinations

From this page you may visit several of our interesting web pages:

  - What is SPAM? (http://wwspam.fu/whatisspam)

  - How do they make it? (http://wwspam.fu/howtomakeit)

  - Why should I eat it? (http://wwspam.fu/whyeatit)

How to get in touch with us

You can get in touch with us in *many* ways: By phone (555-1234), by
email (wwspam@wwspam.fu) or by visiting our customer feedback page
(http://wwspam.fu/feedback).

Result, out.html:

<html><head><title>...</title><body>
<h1>
Welcome to World Wide Spam, Inc.
</h1>
<p>
These are the corporate web pages of <em>World Wide Spam</em>, Inc. We hope
you find your stay enjoyable, and that you will sample many of our
products.
</p>
<p>
A short history of the company
</p>
<p>
World Wide Spam was started in the summer of 2000. The business
concept was to ride the dot-com wave and to make money both through
bulk email and by selling canned meat online.
</p>
<p>
After receiving several complaints from customers who weren't
satisfied by their bulk email, World Wide Spam altered their profile,
and focused 100% on canned goods. Today, they rank as the world's
13,892nd online supplier of SPAM.
</p>
<p>
Destinations
</p>
<p>
From this page you may visit several of our interesting web pages:
</p>
<p>
- What is SPAM? (http://wwspam.fu/whatisspam)
</p>
<p>
- How do they make it? (http://wwspam.fu/howtomakeit)
</p>
<p>
- Why should I eat it? (http://wwspam.fu/whyeatit)
</p>
<p>
How to get in touch with us
</p>
<p>
You can get in touch with us in <em>many</em> ways: By phone (555-1234), by
email (wwspam@wwspam.fu) or by visiting our customer feedback page
(http://wwspam.fu/feedback).
</p>
</body></html>


