from selenium import webdriver

browser = webdriver.Firefox()  # Start a local Firefox session
browser.get("http://www.baidu.com")  # Load the page; the URL must include the scheme
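In an ordinary static web page, the information we want to scrape is written directly in the page source, so we can easily pull it out with a regular expression. For example: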
rr.firstInit({"data":[{"author":"袁理,翟堃","change":"首次","companyCode":"80116848","datetime":"2016-01-28T08:13:29","infoCode":"APPH2FEzZ2tFASearchReport","insCode":"80000031","insName":"东吴证券","insStar":"3","jlrs":["206000000","259000000","352000000","",""],"rate":"增持","secuFullCode":"002322.SZ","secuName":"理工监测","sratingName":"增
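
To make the regular-expression approach concrete, here is a minimal sketch of how fields might be pulled out of source text like the fragment above. The `page_source` string is a shortened stand-in assembled from a few of the fields shown, and `extract_field` is a hypothetical helper name, not something from the original page or any library. It assumes the values are plain quoted strings with no escaped quotes inside them.

```python
import re

# Shortened stand-in for a static page's source. The field names and values
# are copied from the fragment above; the closing brackets are an assumption.
page_source = (
    'rr.firstInit({"data":[{"insName":"东吴证券","rate":"增持",'
    '"secuFullCode":"002322.SZ","secuName":"理工监测"}]})'
)


def extract_field(source, key):
    """Return every string value stored under `key` in `source` (hypothetical helper)."""
    # Matches "key":"value" and captures the value.
    # Escaped quotes inside a value are not handled.
    pattern = re.compile(r'"%s":"([^"]*)"' % re.escape(key))
    return pattern.findall(source)


codes = extract_field(page_source, "secuFullCode")
ratings = extract_field(page_source, "rate")
print(codes, ratings)
```

When the embedded data is complete, well-formed JSON, cutting out the part between the parentheses and passing it to `json.loads` is usually more robust than matching field by field; the regex version is handy when only a fragment of the text is available.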