HTML page traversal: how do I walk an entire HTML page with BeautifulSoup to add a class to every &lt;td&gt;?

I am using BeautifulSoup to make changes to table elements. More specifically, I add a class to the tbody and td elements. This works fine, but only on the first matching element. I can't figure out how to iterate over the rest of the matching elements on the page.

soup = BeautifulSoup(combine_html, "html.parser")

soup.find('tbody')['class'] = 'list'

soup.find('td')['class'] = 'fuzzy'

soup

This makes the change I want, but only to the first tbody and the first td.
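For reference, a minimal sketch of the iteration being asked about: `find()` returns only the first match, while `find_all()` returns every match, so looping over `find_all('td')` tags each cell. (The sample table here is my own stand-in, not the page from the post.)

```python
from bs4 import BeautifulSoup

html = """
<table>
  <tbody>
    <tr><td>a</td><td>b</td></tr>
    <tr><td>c</td><td>d</td></tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# find() touches only the first <td>; find_all() returns all of them
for td in soup.find_all('td'):
    td['class'] = 'fuzzy'

print(soup)
```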

~~~ UPDATE ~~~

I haven't gotten any input, so perhaps my question wasn't tagged correctly, or the answer is so simple that nobody bothered to post it.

I can get it to work - but it is really ugly. See the following code:

import csv

import pandas as pd

# import numpy as np

from bs4 import BeautifulSoup, Tag, NavigableString

# Select columns from csv file

csv_columns = ['Email', 'Recipient Name', 'Department', 'Clicked Link?']

# Set input csv file to read from and specify columns using csv_columns variable

df = pd.read_csv('camp1_beneficiary_fullcsv.csv', skipinitialspace=True, usecols=csv_columns)

# Set the HTML header

# Set Bootstrap CSS

# Set CSS location for list.min.js Javascript - mainly the list class

# Set div id for list.min.js

html_header="""

Sort by name

"""

# Set HTML 'footer'

# Specify list.min.js external javascript file and code

html_footer ="""

var options = {

valueNames: [ 'fuzzy' ]

};

var userList = new List('users', options);

"""

# Generate HTML body using df.to_html from Pandas

html_body = df.to_html(classes=["table-bordered", "table-striped", "table-hover"])

# Combine html header, body, and footer into variable

combine_html = (html_header + html_body + html_footer)

# Find elements in HTML and add classes to support javascript classes for filtering

soup = BeautifulSoup(combine_html, "html.parser")

soup.find('tbody')['class'] = 'list'

soup

f = open('test.html','w')

f.write(str(soup))

f.close()

f = open('test.html', 'r')

filedata = f.read()

f.close()

newdata = filedata.replace("\n", "")

f = open('final.html', 'w')

f.write(newdata)

f.close()
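A tidier variant of the same pipeline: looping over `find_all('td')` removes the need for the write/read/replace round-trip, since `str(soup)` can be written out once. This is a sketch with stand-in data and placeholder header/footer strings (the post's real ones read a CSV and carry the List.js markup).

```python
import pandas as pd
from bs4 import BeautifulSoup

# Stand-in data; the original post reads camp1_beneficiary_fullcsv.csv instead
df = pd.DataFrame({
    'Email': ['a@example.com', 'b@example.com'],
    'Recipient Name': ['Alice', 'Bob'],
    'Department': ['IT', 'HR'],
    'Clicked Link?': ['Yes', 'No'],
})

# Placeholder header/footer; the post's real ones hold the List.js <div> and <script>
html_header = '<div id="users">'
html_footer = '</div>'

html_body = df.to_html(classes=["table-bordered", "table-striped", "table-hover"])
soup = BeautifulSoup(html_header + html_body + html_footer, "html.parser")

soup.find('tbody')['class'] = 'list'
for td in soup.find_all('td'):   # every cell, not just the first match
    td['class'] = 'fuzzy'

with open('final.html', 'w') as f:
    f.write(str(soup))
```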
