CS109 Lecture 7: Data Scraping

Sources
- From a web site
- With an API

Copyrights and permission
- Be careful and polite
- Give credit
- Care about media law
- Don’t be evil

Useful tags
    <h1></h1>
    <p></p>
    <br>
    <a href = 'url'>Link</a>

Useful Libraries for Scraping
- urllib
- beautifulsoup
- pattern
- LXML

Get Data From Website
    import urllib2
    import bs4

    url = 'url'
    source = urllib2.urlopen(url).read()       # download the raw HTML
    soup = bs4.BeautifulSoup(source)           # parse it into a tree
    soup.findAll('a')                          # find all <a></a> tags
    tag = soup.find('a')                       # first <a> tag only
    tag.get('href')                            # the link target of that tag
    C = soup.findAll('p', {'class': 'Event'})  # all <p class="Event"> tags
    t = C[0]
    t.findNextSiblings()                       # tags after t at the same level

Get Data With An API
    import json      # JavaScript Object Notation
    import requests
    import urllib2

    api_key = 'mykey'
    url = 'url' + api_key
    source = urllib2.urlopen(url).read()       # raw JSON text from the API
    #---simple example--------
    a = {'a': 1, 'b': 2}
    s = json.dumps(a)                          # dict -> JSON string
    a2 = json.loads(s)                         # JSON string -> dict
    #-------------------------
    dataDict = json.loads(source)
    dataDict.keys()
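
Offline sketch of both workflows. This is a minimal illustration, not the lecture's own code. It assumes Python 3 and uses only the standard library, so `html.parser` stands in for BeautifulSoup here. The HTML snippet, the `example.com` URLs, the JSON response body, and the names `LinkCollector` and `extract_links` are all hypothetical, inlined so nothing is fetched from the network.

```python
import json
from html.parser import HTMLParser

# Hypothetical page content, inlined so the sketch runs without a network call.
SAMPLE_HTML = """
<h1>Events</h1>
<p class="Event">Talk <a href="http://example.com/talk">details</a></p>
<p class="Event">Lab <a href="http://example.com/lab">details</a></p>
<p>Contact</p>
"""


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag.

    Roughly mirrors soup.findAll('a') followed by tag.get('href').
    """

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)


def extract_links(html):
    """Return all link targets found in an HTML string, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


links = extract_links(SAMPLE_HTML)

# JSON round trip: dumps() turns a dict into a string, loads() reverses it.
original = {"a": 1, "b": 2}
text = json.dumps(original)
restored = json.loads(text)

# Hypothetical API response body; a real one would come from the HTTP request.
response_body = '{"events": [{"name": "Talk"}, {"name": "Lab"}], "count": 2}'
data_dict = json.loads(response_body)
keys = sorted(data_dict.keys())
```

With real pages, BeautifulSoup is usually the more convenient choice because it tolerates messy markup and supports queries by tag and attribute. The stdlib parser is used here only to keep the sketch dependency-free.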