html2text not found: html2text

html2text

html2text is a Python script that converts a page of HTML into clean, easy-to-read plain ASCII text. Better yet, that ASCII also happens to be valid Markdown (a text-to-HTML format).
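To make the idea of HTML-to-Markdown conversion concrete, here is a minimal sketch built only on the standard library's `html.parser.HTMLParser`. It is not html2text's implementation: the class `TinyMarkdownConverter` and the function `tiny_html_to_markdown` are hypothetical names invented for illustration, and only bold, emphasis, links, and paragraphs are handled.

```python
from html.parser import HTMLParser


class TinyMarkdownConverter(HTMLParser):
    """Hypothetical, minimal converter: <strong>/<b>, <em>/<i>, <a>, <p> only."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._parts = []   # output fragments, joined at the end
        self._hrefs = []   # stack of href values for currently open <a> tags

    def handle_starttag(self, tag, attrs):
        if tag in ("strong", "b"):
            self._parts.append("**")
        elif tag in ("em", "i"):
            self._parts.append("_")
        elif tag == "a":
            href = dict(attrs).get("href")
            self._hrefs.append(href)
            if href:
                self._parts.append("[")

    def handle_endtag(self, tag):
        if tag in ("strong", "b"):
            self._parts.append("**")
        elif tag in ("em", "i"):
            self._parts.append("_")
        elif tag == "a" and self._hrefs:
            href = self._hrefs.pop()
            if href:
                self._parts.append(f"]({href})")
        elif tag == "p":
            # Separate paragraphs with a blank line.
            self._parts.append("\n\n")

    def handle_data(self, data):
        self._parts.append(data)

    def result(self):
        return "".join(self._parts).strip()


def tiny_html_to_markdown(html):
    """Convert a small HTML fragment to Markdown-flavoured text."""
    parser = TinyMarkdownConverter()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return parser.result()


if __name__ == "__main__":
    print(tiny_html_to_markdown(
        "<p><strong>Zed's</strong> dead baby, <em>Zed's</em> dead.</p>"
    ))
```

The real package covers far more (lists, tables, escaping, line wrapping), so treat this only as a picture of the tag-by-tag approach.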

Usage: html2text [filename [encoding]]

| Option | Description |
| --- | --- |
| --version | Show program's version number and exit |
| -h, --help | Show this help message and exit |
| --ignore-links | Don't include any formatting for links |
| --escape-all | Escape all special characters. Output is less readable, but avoids corner case formatting issues. |
| --reference-links | Use reference links instead of inline links to create markdown |
| --mark-code | Mark preformatted and code blocks with [code]...[/code] |

For a complete list of options see the docs

Or you can use it from within Python:

>>> import html2text
>>>
>>> print(html2text.html2text("<p><strong>Zed's</strong> dead baby, <em>Zed's</em> dead.</p>"))
**Zed's** dead baby, _Zed's_ dead.

Or with some configuration options:

>>> import html2text
>>>
>>> h = html2text.HTML2Text()
>>> # Ignore converting links from HTML
>>> h.ignore_links = True
>>> print(h.handle("<p>Hello, <a href='https://www.google.com/earth/'>world</a>!"))
Hello, world!

>>> # Don't ignore links anymore, I like links
>>> h.ignore_links = False
>>> print(h.handle("<p>Hello, <a href='https://www.google.com/earth/'>world</a>!"))
Hello, [world](https://www.google.com/earth/)!
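The command-line options listed earlier can also be set as attributes on an `HTML2Text` instance. The sketch below assumes the following flag-to-attribute mapping: `--ignore-links` to `ignore_links`, `--escape-all` to `escape_snob`, `--reference-links` to `inline_links = False`, and `--mark-code` to `mark_code`. Check these names against the documentation for your installed version. `CLI_TO_ATTRIBUTE` and `convert_with_flags` are hypothetical helpers, not part of the package.

```python
# Assumed mapping from CLI flags to HTML2Text attributes.
# Each value is (attribute name, value the flag is assumed to set).
CLI_TO_ATTRIBUTE = {
    "--ignore-links": ("ignore_links", True),
    "--escape-all": ("escape_snob", True),
    "--reference-links": ("inline_links", False),
    "--mark-code": ("mark_code", True),
}


def convert_with_flags(html, *flags):
    """Convert html to Markdown as if the given CLI flags had been passed."""
    # Imported lazily so the mapping above is usable without the package.
    import html2text

    converter = html2text.HTML2Text()
    for flag in flags:
        attribute, value = CLI_TO_ATTRIBUTE[flag]
        setattr(converter, attribute, value)
    return converter.handle(html)


if __name__ == "__main__":
    sample = "<p>Hello, <a href='https://www.google.com/earth/'>world</a>!</p>"
    try:
        print(convert_with_flags(sample, "--ignore-links"))
        print(convert_with_flags(sample, "--reference-links"))
    except ImportError:
        print("html2text is not installed; run: pip install html2text")
```

Keeping the mapping in one dictionary makes it easy to correct a name if a given release spells an attribute differently.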

Originally written by Aaron Swartz. This code is distributed under the GPLv3.

How to install

$ pip install html2text
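If Python reports that html2text cannot be found after installation, a common cause is that `pip` installed it for a different interpreter than the one running your script. The following check is a small sketch using only the standard library; `html2text_available` and `install_hint` are hypothetical helper names.

```python
import importlib.util
import sys


def html2text_available():
    """Return True if this interpreter can import the html2text package."""
    return importlib.util.find_spec("html2text") is not None


def install_hint():
    """Build an install command bound to the interpreter that is running now."""
    return f"{sys.executable} -m pip install html2text"


if __name__ == "__main__":
    if html2text_available():
        print("html2text is importable")
    else:
        print("html2text not found; try:", install_hint())
```

Running pip as `python -m pip` ties the installation to that specific interpreter, which avoids the mismatch.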

How to run unit tests

tox

To see the coverage results:

coverage html

then open the ./htmlcov/index.html file in your browser.

Documentation

Documentation lives here
