I'm learning BeautifulSoup and have found many "html2text" solutions, but the one I'm looking for should mimic the formatting:
<ul>
<li>One</li>
<li>Two</li>
</ul>
Would become
* One
* Two
and
Some text
<blockquote>
More magnificent text here
</blockquote>
Final text
to
Some text
    More magnificent text here
Final text
I'm reading the docs, but I'm not seeing anything straightforward. Any help? I'm open to using something other than BeautifulSoup.
Solution
Take a look at Aaron Swartz's html2text script (can be installed with pip install html2text). Note that the output is valid Markdown. If for some reason that doesn't fully suit you, some rather trivial tweaks should get you the exact output in your question:
In [1]: import html2text
In [2]: h1 = """<ul>
   ...: <li>One</li>
   ...: <li>Two</li>
   ...: </ul>"""
In [3]: print(html2text.html2text(h1))
* One
* Two
In [4]: h2 = """<p>Some text
   ...: <blockquote>
   ...: More magnificent text here
   ...: </blockquote>
   ...: Final text</p>"""
In [5]: print(html2text.html2text(h2))
Some text
> More magnificent text here
Final text
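The "trivial tweaks" could look something like the following. This is only a minimal sketch, assuming the desired result is bullets starting at column 0 and quoted text indented by four spaces instead of prefixed with ">". The helper name tweak_markdown is hypothetical and not part of html2text; it works on the Markdown string that html2text.html2text() returns, and it does not try to handle nested blockquotes.

```python
def tweak_markdown(md: str) -> str:
    """Post-process Markdown text: indent blockquotes, left-align bullets.

    Hypothetical helper for illustration; not part of html2text.
    """
    out = []
    for line in md.splitlines():
        if line.startswith(">"):
            # "> quoted text" -> "    quoted text" (single quote level only)
            out.append(("    " + line[1:].strip()).rstrip())
        else:
            # List items may arrive indented; move bullets to column 0
            # and leave every other line untouched.
            stripped = line.lstrip()
            out.append(stripped if stripped.startswith("* ") else line)
    return "\n".join(out)


# In real use the input would come from the library, e.g.:
#   import html2text                 # pip install html2text
#   md = html2text.html2text(html)
# Here a literal string stands in for that output.
sample = "Some text\n\n> More magnificent text here\n\nFinal text"
print(tweak_markdown(sample))
```

Keeping the tweak as a separate step after html2text means the HTML parsing stays in the library, and only the two formatting differences are handled by hand.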