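I'm trying to use BeautifulSoup to scrape data from a web page and (eventually) output it to a CSV. As a first step, I tried to get the text of the relevant table. I managed to do this, but when I re-ran the code it no longer gave me the same output: when I run the for loop, instead of returning all 12372 records, it only saves the last one.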
An abbreviated version of my code is:

from bs4 import BeautifulSoup
BirthsSoup = BeautifulSoup(browser.page_source, features="html.parser")
print(BirthsSoup.prettify())
# this confirms that the soup has captured the page as I want it to
birthsTable = BirthsSoup.select('#t2 td')
# selects all the elements in the table I want
birthsLen = len(birthsTable)
# birthsLen: 12372
for i in range(birthsLen):
    print(birthsTable[i].prettify())
# this confirms that the beautifulsoup tag object correctly captured all of the table
for i in range(birthsLen):
    birthsText = birthsTable[i].getText()
# this was supposed to compile the text for every element in the table
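
But the for loop only saves the text of the last (i.e. 12372nd) element in the table. Do I need to do something else so that it saves every element as it loops? I think my earlier (desired) output contained the text of each element on a new line.

This is my first time using Python, so apologies if I've made an obvious mistake.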
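
For reference, here is a minimal sketch of what is probably happening, under some assumptions: assigning to a single variable inside the loop overwrites it on every pass, so only the last cell's text survives, whereas collecting into a list keeps all of them. The inline HTML and the names `html`, `cells`, `last_text` and `births_text` are made up for illustration and stand in for the real page and `browser.page_source`.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for browser.page_source (not the real page).
html = """
<table id="t2">
  <tr><td>Alice</td><td>1901</td></tr>
  <tr><td>Bob</td><td>1902</td></tr>
</table>
"""

soup = BeautifulSoup(html, features="html.parser")
cells = soup.select("#t2 td")

# Same pattern as in the question: the variable is reassigned on each
# iteration, so after the loop it only holds the last cell's text.
for cell in cells:
    last_text = cell.get_text()

# Collecting into a list keeps the text of every cell instead.
births_text = [cell.get_text(strip=True) for cell in cells]

# One element per line, like the earlier (desired) output.
print("\n".join(births_text))
```

If that is the cause, the earlier run may have looked right only because the text was being printed inside the loop rather than stored.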