1. Building the dictionary: two approaches
Method 1:
import json

file = sc.textFile(add_keyWordWithFeature)
Dict = {}
def wordSplitAndBuilDict(x):
    # Split the parsed record x into words and add them to Dict (details omitted).
    return Dict
Action = file.map(lambda line: wordSplitAndBuilDict(json.loads(line)))
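This approach yields a collection of many dicts (an RDD with one dict per input line); a subsequent reduce then combines them into one large dictionary.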
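The map-then-reduce pattern of Method 1 can be sketched in plain Python, with functools.reduce standing in for the RDD reduce. This is only an illustration under assumptions: the record format (a JSON object with a "words" list), the word-count values, and the names build_line_dict and merge_dicts are all hypothetical, since the original does not show what the dictionary stores.

```python
import json
from functools import reduce

def build_line_dict(record):
    """Build a word -> count dict from one parsed record (hypothetical format)."""
    counts = {}
    for word in record["words"]:
        counts[word] = counts.get(word, 0) + 1
    return counts

def merge_dicts(left, right):
    """Merge two count dicts into a new dict, summing counts for shared keys."""
    merged = dict(left)
    for word, count in right.items():
        merged[word] = merged.get(word, 0) + count
    return merged

# Stand-in for the lines of the text file (assumed format).
lines = [
    '{"words": ["spark", "dict"]}',
    '{"words": ["spark", "reduce"]}',
]

# Step 1: one small dict per line. Step 2: reduce them into one large dict.
per_line = map(lambda line: build_line_dict(json.loads(line)), lines)
big_dict = reduce(merge_dicts, per_line)
```

With an RDD the same two steps would have the shape file.map(...).reduce(merge_dicts). Returning a fresh dict per line, rather than a shared global, keeps the merge step independent of how the lines are distributed.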
Method 2:
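import json

file = sc.textFile(add_keyWordWithFeature)
def wordSplitAndBuilDict(x, y):
    # x: dictionary built so far; y: one parsed record.
    # Split y into words and add them to x (details omitted).
    return x
def mergeDict(x, y):
    # Merge dictionary y into dictionary x (details omitted).
    return x
def BuilDict(a, b):
    # a and b are each either a partial dictionary or a parsed record that is not yet a dict.
    if not isinstance(a, dict):
        a = wordSplitAndBuilDict({}, a)
    if isinstance(b, dict):
        return mergeDict(a, b)
    return wordSplitAndBuilDict(a, b)
# Parse each line once up front: inside reduce, a and b may already be dictionaries,
# so calling json.loads there would fail. This assumes a parsed record is not itself a dict.
Dict = file.map(json.loads).reduce(BuilDict)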
This approach essentially merges the two steps of the first method into a single reduce.
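A minimal plain-Python sketch of the single-reduce idea follows. It rests on several assumptions: each record is a JSON list of words (so a raw record is never a dict), the dictionary stores word counts, and the names add_record, merge_dicts and build_dict are hypothetical. Nested lists stand in for RDD partitions to show why the reducer must accept both raw records and partial dictionaries on either side.

```python
import json
from functools import reduce

def add_record(acc, record):
    """Return a copy of acc with the words of one raw record (a list) counted in."""
    out = dict(acc)
    for word in record:
        out[word] = out.get(word, 0) + 1
    return out

def merge_dicts(left, right):
    """Merge two count dicts into a new dict, summing counts for shared keys."""
    merged = dict(left)
    for word, count in right.items():
        merged[word] = merged.get(word, 0) + count
    return merged

def build_dict(a, b):
    """Reducer that accepts a raw record or a partial dict on either side."""
    if not isinstance(a, dict):
        a = add_record({}, a)
    if isinstance(b, dict):
        return merge_dicts(a, b)
    return add_record(a, b)

# Two simulated partitions; the second holds a single record, so its
# partial result is still a raw list when the partials are combined.
partitions = [
    ['["spark", "dict"]', '["spark", "reduce"]'],
    ['["dict"]'],
]
partials = [reduce(build_dict, map(json.loads, part)) for part in partitions]
big_dict = reduce(build_dict, partials)
```

The type check is what lets one function play both roles: folding a raw record into a dictionary and merging two partial dictionaries.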