[Python] Reading notes: Beginning Python, Project 1: Instant Markup

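Goal: add HTML markup to a plain-text file so that the resulting document can be displayed in a browser and used as a web page.

Requirements:

  1. The input should not contain artificial codes or tags;
  2. It should handle different kinds of blocks, such as headings, paragraphs, and list items, as well as inline text (such as emphasized text and URLs);
  3. It should be extensible.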

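Topics involved (see Beginning Python, 2nd edition):

  1. Reading from and writing to files (Chapter 11), or at least reading from standard input (sys.stdin) and writing the result with print
  2. Iterating over all the lines of the input (Chapter 11)
  3. A few string methods (Chapter 3)
  4. One or two generators (Chapter 9)
  5. Possibly the re module (Chapter 10)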
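As a small illustration of the string-method and re topics listed above, here is a minimal sketch written for Python 3. The emphasis pattern is the one used by the markup program later in these notes. The helper name `mark_inline` and the URL pattern are assumptions made for this example only, and the URL pattern is deliberately simplistic.

```python
import re

def mark_inline(text):
    """Apply inline markup to one piece of text (hypothetical helper)."""
    # *word* -> <em>word</em>; the non-greedy .+? stops at the nearest closing '*'.
    text = re.sub(r'\*(.+?)\*', r'<em>\1</em>', text)
    # Assumed, simplistic URL pattern: http:// followed by letters, dots, slashes.
    text = re.sub(r'(http://[.a-zA-Z/]+)', r'<a href="\1">\1</a>', text)
    return text

# str.strip() removes the surrounding whitespace before the text is marked up.
line = "  Visit *our* page (http://wwspam.fu/feedback).  "
print(mark_inline(line.strip()))
```

Handling each kind of inline text with its own substitution keeps the rules independent, which is one simple way to leave room for the extensibility requirement.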

Test document: [text_input.txt]
Welcome to World Wide Spam, Inc.

These are the corporate web pages of *World Wide Spam*, Inc. We hope
you find your stay enjoyable, and that you will sample many of our
products.

A short history of the company

World Wide Spam was started in the summer of 2000. The business
concept was to ride the dot-com wave and to make money both through
bulk email and by selling canned meat online.

After receiving several complaints from customers who weren't
satisfied by their bulk email, World Wide Spam altered their profile,
and focused 100% on canned goods. Today, they rank as the world's
13,892nd online supplier of SPAM.

Destinations

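From this page you may visit several of our interesting web pages:

  - What is SPAM? (http://wwspam.fu/whatisspam)

  - How do they make it? (http://wwspam.fu/howtomakeit)

  - Why should I eat it? (http://wwspam.fu/whyeatit)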

How to get in touch with us

You can get in touch with us in *many* ways: By phone (555-1234), by
email (wwspam@wwspam.fu) or by visiting our customer feedback page
(http://wwspam.fu/feedback).

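Text block generator: [util.py]

def lines(file):
    # Yield every line of the file, then one extra newline so that
    # the final block is always terminated by a blank line.
    for line in file:
        yield line
    yield '\n'

def blocks(file):
    # Collect consecutive non-blank lines and yield them as a single
    # stripped string each time a blank line ends the block.
    block = []
    for line in lines(file):
        if line.strip():
            block.append(line)
        elif block:
            yield ''.join(block).strip()
            block = []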
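To see what the block generator produces, here is a minimal sketch for Python 3 that feeds it an in-memory file. The two generators are repeated so the example runs on its own. The sample text and the use of `io.StringIO` in place of a real file are assumptions made for illustration.

```python
import io

def lines(file):
    for line in file:
        yield line
    yield '\n'  # sentinel blank line so the last block is flushed

def blocks(file):
    block = []
    for line in lines(file):
        if line.strip():
            block.append(line)
        elif block:
            yield ''.join(block).strip()
            block = []

# Hypothetical sample input: a title, a two-line paragraph, and a list item.
sample = io.StringIO("Title\n\nfirst line\nsecond line\n\n\n  - item\n")
result = list(blocks(sample))
print(result)
# → ['Title', 'first line\nsecond line', '- item']
```

Runs of several blank lines do not create empty blocks, because a block is only yielded when it is non-empty, and the leading indentation of the list item is removed by the final strip().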

Markup program: [simple_markup.py]

import sys
import re
from util import *

# print(...) with a single argument behaves the same in Python 2 and 3.
print('<html><head><title>...</title><body>')

title = True
for block in blocks(sys.stdin):
    # Turn *emphasized* text into <em>emphasized</em>.
    block = re.sub(r'\*(.+?)\*', r'<em>\1</em>', block)
    if title:
        # Only the first block is treated as the heading.
        print('<h1>')
        print(block)
        print('</h1>')
        title = False
    else:
        print('<p>')
        print(block)
        print('</p>')

print('</body></html>')

Run:

python simple_markup.py < text_input.txt > text_output.html

Output: [text_output.html]

<html><head><title>...</title><body>
<h1>
Welcome to World Wide Spam, Inc.
</h1>
<p>
These are the corporate web pages of <em>World Wide Spam</em>, Inc. We hope
you find your stay enjoyable, and that you will sample many of our
products.
</p>
<p>
A short history of the company
</p>
<p>
World Wide Spam was started in the summer of 2000. The business
concept was to ride the dot-com wave and to make money both through
bulk email and by selling canned meat online.
</p>
<p>
After receiving several complaints from customers who weren't
satisfied by their bulk email, World Wide Spam altered their profile,
and focused 100% on canned goods. Today, they rank as the world's
13,892nd online supplier of SPAM.
</p>
<p>
Destinations
</p>
<p>
From this page you may visit several of our interesting web pages:
</p>
<p>
- What is SPAM? (http://wwspam.fu/whatisspam)
</p>
<p>
- How do they make it? (http://wwspam.fu/howtomakeit)
</p>
<p>
- Why should I eat it? (http://wwspam.fu/whyeatit)
</p>
<p>
How to get in touch with us
</p>
<p>
You can get in touch with us in <em>many</em> ways: By phone (555-1234), by
email (wwspam@wwspam.fu) or by visiting our customer feedback page
(http://wwspam.fu/feedback).
</p>
</body></html>
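The script reads from standard input, so it is awkward to check without a shell. As a minimal sketch for Python 3, the same steps can be wrapped in a function that returns the HTML as a string. The function name `markup` and the tiny sample input are assumptions made for this example. The sketch treats only the first block as the heading and wraps every later block in a paragraph.

```python
import io
import re

def lines(file):
    for line in file:
        yield line
    yield '\n'

def blocks(file):
    block = []
    for line in lines(file):
        if line.strip():
            block.append(line)
        elif block:
            yield ''.join(block).strip()
            block = []

def markup(file):
    """Return the marked-up HTML for a file-like object (hypothetical wrapper)."""
    out = ['<html><head><title>...</title><body>']
    title = True
    for block in blocks(file):
        block = re.sub(r'\*(.+?)\*', r'<em>\1</em>', block)
        tag = 'h1' if title else 'p'   # first block is the heading
        title = False                  # every later block is a paragraph
        out.extend(['<%s>' % tag, block, '</%s>' % tag])
    out.append('</body></html>')
    return '\n'.join(out)

html = markup(io.StringIO("Welcome\n\nThis is *important*.\n"))
print(html)
```

Returning a string instead of printing directly makes the result easy to compare against an expected page, and `markup(sys.stdin)` would still cover the command-line case.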