Assignment link
Answers (corrections welcome)
Question 1.1: Implement distinct_words [code] (2 points)
# ------------------
# Write your implementation here.
temp = []
for i in corpus:
    for k in i:
        temp.append(k)
corpus_words = sorted(set(temp))
num_corpus_words = len(corpus_words)
# ------------------
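For readers who want to try the snippet outside the notebook, here is a minimal self-contained sketch of the same idea. It assumes `corpus` is a list of sentences, each a list of string tokens. The `toy_corpus` below is made up for illustration and is not the assignment's data.

```python
def distinct_words(corpus):
    """Return the sorted distinct words of a corpus and how many there are."""
    # Flatten the list of sentences into a set of unique tokens.
    unique_words = {word for sentence in corpus for word in sentence}
    corpus_words = sorted(unique_words)
    return corpus_words, len(corpus_words)


# Hypothetical toy corpus, only for illustration.
toy_corpus = [
    "START All that glitters isn't gold END".split(" "),
    "START All's well that ends well END".split(" "),
]
words, n_words = distinct_words(toy_corpus)
print(words, n_words)
```

Python's default string sort is by code point, so capitalized tokens such as `START` come before lowercase ones.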
Question 1.2: Implement compute_co_occurrence_matrix [code] (3 points)
With window_size = 1, as in the original problem, the context of a center word is just its nearest neighbor on each side.
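To make the window concrete, here is a small sketch that lists the context words of each position in a toy sentence. The helper `context_words` and the sample sentence are hypothetical and not part of the assignment. This version simply clips the window at the sentence edges.

```python
def context_words(sentence, center_index, window_size=1):
    """Return the words within window_size positions of the center word."""
    # Clip the left edge at 0; slicing already clips the right edge.
    left = max(0, center_index - window_size)
    right = center_index + window_size + 1
    return sentence[left:center_index] + sentence[center_index + 1:right]


# Hypothetical toy sentence, only for illustration.
sentence = ["START", "all", "that", "glitters", "END"]
for i, word in enumerate(sentence):
    print(word, context_words(sentence, i))
```

Words at the start or end of a sentence have only one neighbor when window_size = 1, which is why boundary handling matters when building the co-occurrence matrix.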
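My solution below is equivalent to padding each sentence with window_size placeholder words (None) at both the beginning and the end (see lines 10-13 of the code).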