Counting the tags in a web page with BeautifulSoup

nbu04william

Categories: python, bs4, web pages

Article link: https://blog.csdn.net/nbu2004/article/details/108829080

#!/usr/bin/env python3

import bs4

def _count(soup):
    # Count the tags under the BeautifulSoup Tag `soup`, the tag itself included.
    c = {soup.name: 1}
    for a in soup.contents:
        # Only Tag children are counted; text nodes and comments are skipped.
        if isinstance(a, bs4.element.Tag):
            for k, v in _count(a).items():
                c[k] = c.get(k, 0) + v
    return c


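def count(s):
    """Count the tags in an HTML string.

    Arguments:
    s: str -- the HTML code

    Returns a dict mapping each tag name to its number of occurrences
    under <body> (the <body> tag itself included).
    """
    soup = bs4.BeautifulSoup(s, 'lxml')
    body = soup.body
    if body is None:
        # Nothing to count, e.g. when s is empty.
        return {}
    return _count(body)


if __name__ == '__main__':
    # Read an HTML file, or fetch the page with `requests` instead.
    with open('zzjc.htm') as fo:
        s = fo.read()
    print(count(s))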
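
The recursive walk above can probably also be written without recursion. The following is only a sketch of one alternative, not the author's method: it relies on `find_all(True)`, which returns every descendant tag, and on `collections.Counter` for the tally. The function name `count_tags` and its `parser` parameter are hypothetical, and `html.parser` is used here only so the sketch runs without `lxml` installed.

```python
from collections import Counter

from bs4 import BeautifulSoup


def count_tags(html, parser="html.parser"):
    """Return a Counter mapping tag name -> occurrences under <body>.

    `count_tags` and `parser` are hypothetical names used only for this
    sketch; they are not part of the original script.
    """
    soup = BeautifulSoup(html, parser)
    body = soup.body
    if body is None:
        # No <body> element was found (e.g. a bare fragment with html.parser).
        return Counter()
    # find_all(True) yields every descendant Tag; text nodes are not included.
    counts = Counter(tag.name for tag in body.find_all(True))
    # Count the <body> tag itself so the result matches the recursive version.
    counts[body.name] += 1
    return counts


if __name__ == "__main__":
    sample = "<html><body><div><p>a</p><p>b</p></div><br></body></html>"
    print(count_tags(sample).most_common())
```

A `Counter` behaves like a dict, and `most_common()` lists the most frequent tags first, which may be handy when a page contains many different tags.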
