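- Given a term-document matrix (rows are terms, columns are documents)
Use the LSI algorithm to compute a 2-dimensional representation of the terms and of the documents.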
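import numpy as np
from numpy import linalg as la

# Term-document matrix: 5 terms (rows) x 6 documents (columns)
A = np.array([[1,0,1,0,0,0],[0,1,0,0,0,0],[1,1,0,0,0,0],[1,0,0,1,1,0],[0,0,0,1,0,1]])
U, S, T = la.svd(A)  # singular value decomposition; T holds V transposed
# print(S)
print(U)
# print(T)
u = U[:, 0:2]  # first two columns of U: 2-D representation of the terms
print(u)
t = T[0:2, :]  # first two rows of T: 2-D representation of the documents
print(t)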
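
The code above keeps the raw singular vectors. A common variant, sketched below as an assumption and not as the method this note prescribes, scales each kept dimension by its singular value and checks the quality of the rank-2 approximation. The names `k`, `term_coords`, `doc_coords`, and `A_k` are illustrative only.

```python
import numpy as np

# Same term-document matrix as above (rows = terms, columns = documents).
A = np.array([
    [1, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 0, 1],
], dtype=float)

k = 2  # number of latent dimensions to keep
U, S, Vt = np.linalg.svd(A, full_matrices=False)

# Assumption: scale coordinates by the singular values (one common LSI convention).
term_coords = U[:, :k] * S[:k]            # one row per term, shape (5, 2)
doc_coords = (S[:k, None] * Vt[:k, :]).T  # one row per document, shape (6, 2)

# Rank-k approximation of A built from the kept components.
A_k = U[:, :k] @ np.diag(S[:k]) @ Vt[:k, :]

# The squared Frobenius error equals the sum of the discarded squared singular values.
error = np.linalg.norm(A - A_k) ** 2
print(term_coords)
print(doc_coords)
print(error, np.sum(S[k:] ** 2))
```

Whether to scale by the singular values depends on how the coordinates will be compared afterwards; the unscaled vectors used above are an equally valid choice for plotting.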