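These are just my own notes, so I'll keep it short. The function below takes a list of segmented words and yields every pair of words that falls within a given window size.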
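def word_combine(words, window=2):
    """Yield pairs of words that lie 1 .. window - 1 positions apart."""
    # A window smaller than 2 cannot contain a pair, so fall back to 2.
    if window < 2:
        window = 2
    for x in range(1, window):
        # Stop once the offset reaches the list length: no pairs are left.
        if x >= len(words):
            break
        print("\n span:", x)
        words2 = words[x:]  # the list shifted by x positions
        res = zip(words, words2)
        for r in res:
            yield r


# Test
res = word_combine(["我", "爱", "中国", "人民", "和", "共产党"], 3)
for i in res:
    print(i)

The output is:
 span: 1
('我', '爱')
('爱', '中国')
('中国', '人民')
('人民', '和')
('和', '共产党')
 span: 2
('我', '中国')
('爱', '人民')
('中国', '和')
('人民', '共产党')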
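You can picture it like this: for each offset from 1 to window - 1, shift the whole list by that many positions, then zip the original list with the shifted copy.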
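The shift-and-zip idea can also be sketched without the print calls, so that the pairs come back grouped by offset. This is only an illustrative sketch: the helper name `pairs_by_offset` and the short English test list are assumptions, not part of the original notes.

```python
def pairs_by_offset(words, window=2):
    """Return {offset: [(w_i, w_{i+offset}), ...]} for offsets 1 .. window - 1.

    Hypothetical helper, for illustration only.
    """
    window = max(window, 2)
    result = {}
    for offset in range(1, window):
        if offset >= len(words):
            break
        # zip stops at the shorter sequence, so the shifted copy sets the length
        result[offset] = list(zip(words, words[offset:]))
    return result


pairs = pairs_by_offset(["a", "b", "c", "d"], window=3)
print(pairs[1])  # [('a', 'b'), ('b', 'c'), ('c', 'd')]
print(pairs[2])  # [('a', 'c'), ('b', 'd')]
```

Each offset produces one fewer pair than the previous one, because the shifted copy is one element shorter.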
===========================================================
Computing and iterating the transition matrix and the importance scores
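import numpy as np

# 3 x 3 transition matrix
matrix = np.array([[0.1, 0, 0.7],
                   [0.4, 0.4, 0.2],
                   [0.01, 0.8, 0.0]])
# Initial importance scores as a column vector
g = np.array([[0.33], [0.33], [0.33]])
iters = 5
alpha = 0.3
for i in range(iters):
    # Mix the constant term alpha with the scores propagated through the matrix
    g = alpha + (1 - alpha) * np.dot(matrix, g)
    # Normalize so the scores sum to 1
    g = g / g.sum()
    # Print only the last 9 iterations; with iters = 5 that is every iteration
    if i > iters - 10:
        print(g)
        print()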
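The result after the iterations is shown below. There are only 5 iterations here, but the values can already be treated as converged: between the last two iterations every entry changes by less than 1e-4.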
[[0.32257421]
[0.35331457]
[0.32411122]]
[[0.31751476]
[0.35262363]
[0.32986161]]
[[0.31887964]
[0.3518058 ]
[0.32931455]]
[[0.31888035]
[0.35198255]
[0.3291371 ]]
[[0.31881529]
[0.35199036]
[0.32919434]]
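Instead of fixing the number of iterations, the same update can be repeated until the scores stop changing. The sketch below is only one possible way to do that: the helper name `iterate_importance`, the uniform starting vector, and the `tol` and `max_iters` defaults are assumptions, not part of the original notes.

```python
import numpy as np


def iterate_importance(matrix, alpha=0.3, tol=1e-8, max_iters=1000):
    """Repeat g <- normalize(alpha + (1 - alpha) * matrix @ g) until g settles.

    Hypothetical helper; tol and max_iters are illustrative defaults.
    Returns the final scores and the number of steps taken.
    """
    n = matrix.shape[0]
    g = np.full((n, 1), 1.0 / n)  # assumed uniform starting scores
    for step in range(1, max_iters + 1):
        new_g = alpha + (1 - alpha) * (matrix @ g)
        new_g = new_g / new_g.sum()
        # Stop when no entry moved by more than tol
        if np.abs(new_g - g).max() < tol:
            return new_g, step
        g = new_g
    return g, max_iters


matrix = np.array([[0.1, 0, 0.7],
                   [0.4, 0.4, 0.2],
                   [0.01, 0.8, 0.0]])
g, steps = iterate_importance(matrix)
print(steps)
print(g.ravel())
```

The stopping test uses the largest absolute change in any entry, which is the same check used informally above when comparing the last two printed vectors.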