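If you have read my earlier posts, you know I have recently taken an interest in stocks. The problem is that I know next to nothing about them. Then I came across this site: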
https://baike.baidu.com/wikitag/taglist?tagId=62991
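The page looks great: it covers most of the relevant concepts in one place. After scrolling down for a while, though, it became clear that there are simply too many entries. Clicking through them one by one would take forever, so why not write a crawler and scrape them all?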
Scraping text and web pages is pretty basic, so let's go straight to the code:
import scrapy
import time


class StockTermsSpider(scrapy.Spider):
    name = 'stock_term_spider'

    def start_requests(self):
        urls = [
            'https://baike.baidu.com/wikitag/taglist?tagId=62991'
        ]
        headers = {
            'Referer': 'http://quote.eastmoney.com',
            'Cookie': 'BAIDUID=DF49F703A7888AE4A13F4B2D8876DFC1:FG=1; PSTM=1532004266; BIDUPSID=8DCBE11F4D26DD0AD222967F8EE139E1; BDUSS=HI5TnV6Q1d4cHB0V2F5bXhaflFaU1NRUnN0S1Z0QTMyYnVVcndTWX5CSlhicE5iQVFBQUFBJCQAAAAAAAAAAAEAAABm0OcAMDk2MDUAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAFfha1tX4WtbR; BDORZ=B490B5EBF6F3CD402E515D22BCDA1598; BDSFRCVID=YyDOJeC62AMTf_QwK_QLUwtTNgg3AIbTH6aoDtn3juxV0-WB7ubtEG0Pqx8g0KubKq9vogKK0eOTHkCF_2uxOjjg8UtVJeC6EG0P3J; H_BDCLCKID_SF=tJAO_KK-tIL3fP36qR70h-4shgT22-usy65R2hcH0KLKEq62blnG5UPm0tvP-b3nBDvMohjYKfb1MRjvQbO_DPI02bDjelbGJ5TaLp5TtUJaJKnTDMRh-xPLeqjyKMnitIT9-pno0hQrh459XP68bTkA5bjZKxtq3mkjbIOFfJOKHICRe5KBDM5; H_PS_PSSID=1456_21126_29238_28519_29099_28839_29220_26350; Hm_lvt_55b574651fcae74b0a9f1cf9c8d7c93a=1561451549,1562393926,1562587230,1562588395; Hm_lpvt_55b574651fcae74b0a9f1cf9c8d7c93a=1562588395; delPer=0; PSINO=1',
            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36',
            'Host': 'baike.baidu.com'
        }
        for url in urls:
            print("url%s" % url)
            # meta: type:0 first request 1:next request