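I have recently been working through Peter Harrington's "Machine Learning in Action".
The Craigslist personal-ad links used in the book can no longer be found, so I use the site's RSS feeds for events (eve) and politics (pol) instead: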
https://newyork.craigslist.org/search/eve?format=rss&sale_date=2018-06-11
https://losangeles.craigslist.org/search/eve?format=rss&sale_date=2018-06-11
https://newyork.craigslist.org/search/pol?format=rss
https://sfbay.craigslist.org/search/pol?format=rss
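As a rough sketch of how these feeds can be pulled in with feedparser: the helper names `build_feed_url` and `fetch_summaries` below are my own and do not come from the book. The URL helper only covers the plain `?format=rss` form of the politics links above, not the `sale_date` parameter on the event links. I am also assuming that Craigslist still serves these feeds; if it does not, the list simply comes back empty.

```python
def build_feed_url(city, category):
    """Return the Craigslist RSS search URL for a city subdomain and category code.

    Hypothetical helper: it only reproduces the plain '?format=rss' URL pattern.
    """
    return "https://{}.craigslist.org/search/{}?format=rss".format(city, category)


def fetch_summaries(url):
    """Download an RSS feed and return the summary text of each entry.

    Hypothetical helper built on feedparser.parse, which returns a dict-like
    object whose 'entries' list holds one item per post.
    """
    # Imported here so build_feed_url still works without feedparser installed.
    import feedparser

    feed = feedparser.parse(url)
    # Entries without a summary are skipped; an unreachable feed yields [].
    return [entry["summary"] for entry in feed["entries"] if "summary" in entry]
```

For example, `fetch_summaries(build_feed_url("newyork", "pol"))` should give one text string per post in the New York politics feed, which can then be tokenized like the sample posts in the code that follows.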
from numpy import *
import feedparser
import operator
def loadDataset():
    # Six toy posts, each already split into word tokens.
    postingList = [['my', 'dog', 'has', 'flea', 'problems', 'help', 'please'],
                   ['maybe', 'not', 'take', 'him', 'to', 'dog', 'park', 'stupid'],
                   ['my', 'dalmation', 'is', 'so', 'cute', 'I', 'love', 'him'],
                   ['stop', 'posting', 'stupid', 'worthless', 'garbage'],
                   ['mr', 'licks', 'ate', 'my', 'steak', 'how', 'to', 'stop', 'him'],
                   ['quit', 'buying', 'worthless', 'dog', 'food', 'stupid']]
    # Class labels, one per post: 1 = abusive, 0 = not abusive.
    classVec = [0, 1, 0, 1, 0, 1]
    return postingList, classVec