BeautifulSoup innerHTML?

TL;DR

With BeautifulSoup 4, use element.encode_contents() if you want a UTF-8 encoded bytestring, or element.decode_contents() if you want a Python Unicode string. For example, an equivalent of the DOM's innerHTML might look something like this:

    def innerHTML(element):
        """Returns the inner HTML of an element as a UTF-8 encoded bytestring"""
        return element.encode_contents()
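To make the difference between the two calls concrete, here is a minimal sketch. It assumes the bs4 package is installed and uses the standard library's html.parser backend; the sample markup and variable names are invented for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, parsed with the stdlib-backed parser.
html = "<div id='box'><p>Hello <b>world</b></p></div>"
soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", id="box")

# Inner HTML as bytes (UTF-8 unless another encoding is passed).
inner_bytes = div.encode_contents()

# Inner HTML as a Python str.
inner_text = div.decode_contents()

print(inner_bytes)  # b'<p>Hello <b>world</b></p>'
print(inner_text)   # <p>Hello <b>world</b></p>
```

Neither call includes the outer div tag itself; for the outer HTML, str(div) is the usual choice.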

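These functions are not currently in the online documentation, so I'll quote the current function definitions and docstrings from the code.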

encode_contents - since 4.0.4

    def encode_contents(
        self, indent_level=None, encoding=DEFAULT_OUTPUT_ENCODING,
        formatter="minimal"):
        """Renders the contents of this tag as a bytestring.

        :param indent_level: Each line of the rendering will be
           indented this many spaces.

        :param encoding: The bytestring will be in this encoding.

        :param formatter: The output formatter responsible for converting
           entities to Unicode characters.
        """

See also the documentation on formatters; you'll most likely use either formatter="minimal" (the default) or formatter="html" (for HTML entities), unless you want to process the text manually in some way.
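The effect of the two formatter values can be seen in a small sketch like the one below. It assumes bs4 with the html.parser backend; the sample paragraph is made up for the example.

```python
from bs4 import BeautifulSoup

# Hypothetical markup containing a non-ASCII character and an ampersand.
soup = BeautifulSoup("<p>café &amp; bar</p>", "html.parser")
p = soup.p

# "minimal" escapes only what is needed for valid markup (&, <, >).
minimal_out = p.decode_contents(formatter="minimal")

# "html" additionally converts characters to named HTML entities where it can.
html_out = p.decode_contents(formatter="html")

print(minimal_out)  # café &amp; bar
print(html_out)     # caf&eacute; &amp; bar
```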

encode_contents returns an encoded bytestring. If you want a Python Unicode string, use decode_contents instead.
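A short sketch of how the two results relate, assuming bs4 with the html.parser backend and invented sample markup: the bytes returned by encode_contents should match the decode_contents string encoded with the same codec.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with a non-ASCII character.
soup = BeautifulSoup("<p>naïve <i>text</i></p>", "html.parser")
p = soup.p

as_text = p.decode_contents()                       # Python str
as_utf8 = p.encode_contents()                       # bytes, UTF-8 by default
as_latin1 = p.encode_contents(encoding="latin-1")   # bytes in another encoding

print(as_text)                                # naïve <i>text</i>
print(as_utf8 == as_text.encode("utf-8"))     # True
print(as_latin1 == as_text.encode("latin-1")) # True
```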

decode_contents - since 4.0.1

decode_contents does the same thing as encode_contents, but returns a Python Unicode string rather than an encoded bytestring.

    def decode_contents(self, indent_level=None,
                        eventual_encoding=DEFAULT_OUTPUT_ENCODING,
                        formatter="minimal"):
        """Renders the contents of this tag as a Unicode string.

        :param indent_level: Each line of the rendering will be
           indented this many spaces.

        :param eventual_encoding: The tag is destined to be
           encoded into this encoding. This method is _not_
           responsible for performing that encoding. This information
           is passed in so that it can be substituted in if the
           document contains a tag that mentions the document's
           encoding.

        :param formatter: The output formatter responsible for converting
           entities to Unicode characters.
        """

BeautifulSoup 3

BeautifulSoup 3 does not have the functions above; instead it has renderContents:

    def renderContents(self, encoding=DEFAULT_OUTPUT_ENCODING,
                       prettyPrint=False, indentLevel=0):
        """Renders the contents of this tag as a string in the given
        encoding. If encoding is None, returns a Unicode string."""

For compatibility with BS3, renderContents was added back to BeautifulSoup 4 (in 4.0.4).
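As a rough check of that compatibility shim, the sketch below compares the two calls under BeautifulSoup 4. It assumes bs4 with the html.parser backend and made-up markup; newer bs4 releases may emit a DeprecationWarning for renderContents, so encode_contents is the safer choice in new code.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup.
soup = BeautifulSoup("<ul><li>one</li><li>two</li></ul>", "html.parser")
ul = soup.ul

# BS3-style call, kept in BS4 for backwards compatibility.
legacy = ul.renderContents()

# BS4-native equivalent.
modern = ul.encode_contents()

print(legacy == modern)  # True
print(modern)            # b'<li>one</li><li>two</li>'
```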
