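Looking for some expert guidance. I have only just started with Python web scraping and have run into a problem. Many thanks in advance!
I want to scrape some English news headlines and save them to a CSV file.
My code is as follows: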
import csv, requests, re
from bs4 import BeautifulSoup

# Archive pages 1-9
urls = ['https://www.defense.gov/News/Archive/?Page={}'.format(str(i)) for i in range(1,10)]

def get_titles(urls, data=None):
    # Fetch one archive page and parse it
    html = requests.get(urls).text
    soup = BeautifulSoup(html, 'html.parser')
    articles = []
    for article in soup.find_all(class_='info'):
        Label = 'Archive'
        News = article.find(class_='title').get_text()
        articles.append([Label, News])
        # Append a header row plus every row collected so far to the CSV
        with open(r'1.csv', 'a', newline='') as f:
            writer = csv.writer(f)
            writer.writerow(['Label', 'News'])
            for row in articles:
                writer.writerow(row)

for titles in urls:
    get_titles(titles)
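The idea is to scrape the news headlines from pages 1-9 this way, but the result comes out like this:
every time a new headline is added, all of the previous headlines are written to the CSV again.
Any guidance would be much appreciated!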
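
For reference, here is a minimal sketch of one possible fix. It assumes the duplication comes from the file-writing step running inside the per-article loop with the file opened in append mode (`'a'`), so the header and all rows collected so far get written again on every pass. The names `save_titles` and `pages` are hypothetical and not part of the code above. The sketch opens the file once, writes the header once, and writes each headline exactly once:

```python
import csv


def save_titles(pages, path, label='Archive'):
    """Write every headline exactly once, with a single header row.

    pages: an iterable of lists of headline strings, one list per page
           (hypothetical shape, chosen for this sketch).
    path:  the CSV file to create; 'w' mode overwrites any old file,
           so re-running the script does not pile up duplicates.
    """
    with open(path, 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        writer.writerow(['Label', 'News'])  # header written once
        for titles in pages:
            for title in titles:
                # strip() drops whitespace that get_text() may leave around
                writer.writerow([label, title.strip()])
```

To use it with the scraping code, `get_titles` would only need to return a list of headline strings for one page instead of writing the file itself, for example `save_titles((get_titles(u) for u in urls), '1.csv')`. Keeping the fetching and the writing separate also makes the writing part easy to check without any network access.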