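I am trying to delete a div from an HTML page using BeautifulSoup and Python. I also need to add some attributes to a specific tag in the same HTML page.
My code looks like this:
Original HTML: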
Parental div which I want to delete, that contains two other divs, one of which containing a table too.
Desired final HTML:
^{pr2}$
My Python code:
from bs4 import BeautifulSoup

def replace_all(text, dic):
    # Apply every old -> new substitution in dic to text.
    for i, j in dic.items():
        text = text.replace(i, j)
    return text

html_data = open("index.html").read()
old_wanted_div = '''
new_wanted_div = '''
soup = BeautifulSoup(html_data)
old_unwanted_div = soup.find("div", attrs={"id": "to_delete"})
old_unwanted_div_str = '''{}'''.format(str(old_unwanted_div))
new_unwanted_div = ''' '''
reps = {old_wanted_div:new_wanted_div, old_unwanted_div_str:new_unwanted_div}
new_html = replace_all(html_data, reps)
f = open("index.html", "w")
f.write(new_html)
f.close()
Right now this code adds the attribute, but it does not delete the unwanted div, and I can't figure out where the mistake is.
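
One likely cause, though it is hard to be sure without the elided HTML, is that `str(old_unwanted_div)` is BeautifulSoup's own serialization of the tag, which need not match the raw text of `index.html` character for character (quoting and whitespace can differ), so `text.replace` finds nothing to replace. The sketch below shows that mismatch on a small sample and then edits the parsed tree directly with `decompose()` and attribute assignment instead of string replacement. The sample markup, the `id="wanted"` tag, and the `data-role` attribute are assumptions made up for illustration; only `id="to_delete"` comes from the code above.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for index.html; the real markup is not shown in the post.
html_data = """<html><body>
<div id='to_delete'>
  <div>first child</div>
  <div><table><tr><td>cell</td></tr></table></div>
</div>
<div id="wanted">keep me</div>
</body></html>"""

soup = BeautifulSoup(html_data, "html.parser")

# Why str.replace can miss: str(tag) is BeautifulSoup's own serialization,
# which need not be identical to the source text (here the quote style differs).
unwanted = soup.find("div", attrs={"id": "to_delete"})
serialization_matches_source = str(unwanted) in html_data

# Remove the div and everything nested inside it directly from the tree.
unwanted.decompose()

# Add an attribute on the tag object itself
# ("wanted" and "data-role" are hypothetical names).
wanted = soup.find("div", attrs={"id": "wanted"})
wanted["data-role"] = "main"

# Serialize the modified tree; this is what would be written back to the file.
new_html = str(soup)
print(serialization_matches_source)
print(new_html)
```

Working on the tree avoids having to reproduce the original markup exactly, at the cost that the whole document is re-serialized by BeautifulSoup when written back.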