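... column, you can use scipy.sparse.hstack to stack the two horizontally (column-wise). We only need to convert the list into a column vector (as a sparse matrix), or into a 2D array with a single column: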
scipy.sparse.hstack((tweets, csr_matrix(lex).T))
scipy.sparse.hstack((tweets, np.asarray(lex)[:,None]))
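Why the transpose or the `[:, None]` is needed can be seen from the shapes alone. This is a minimal sketch; the values in `lex` are made-up placeholders, and the names `row`, `col_sparse` and `col_dense` are only for illustration:

```python
import numpy as np
from scipy.sparse import csr_matrix

lex = [9, 8, 7, 6, 5]  # placeholder values, one per row of the matrix

# A flat list is interpreted as a single row, not a column.
row = csr_matrix(lex)

# Transposing turns it into a one-column sparse matrix.
col_sparse = row.T

# Adding a trailing axis gives the dense single-column equivalent.
col_dense = np.asarray(lex)[:, None]

print(row.shape, col_sparse.shape, col_dense.shape)
```

Either column form has as many rows as `lex` has entries, which is what horizontal stacking requires.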
Sample run:
In [189]: from scipy.sparse import csr_matrix
In [194]: import scipy as sp
In [190]: a = np.random.randint(0,4,(5,10))
In [192]: a
Out[192]:
array([[2, 1, 1, 1, 0, 3, 1, 3, 2, 1],
       [0, 2, 1, 2, 3, 0, 1, 1, 2, 3],
       [0, 1, 1, 1, 2, 3, 0, 1, 0, 1],
       [0, 0, 3, 0, 3, 0, 1, 0, 3, 1],
       [1, 0, 2, 3, 3, 3, 2, 2, 0, 1]])
In [193]: b = [9,8,7,6,5] # equivalent to lex
In [191]: A = csr_matrix(a) # equivalent to tweets
In [195]: sp.sparse.hstack((A, csr_matrix(b).T))
Out[195]:
<5x11 sparse matrix of type ''
with 42 stored elements in COOrdinate format>
In [197]: _.toarray() # verify values by converting to dense array
Out[197]:
array([[2, 1, 1, 1, 0, 3, 1, 3, 2, 1, 9],
       [0, 2, 1, 2, 3, 0, 1, 1, 2, 3, 8],
       [0, 1, 1, 1, 2, 3, 0, 1, 0, 1, 7],
       [0, 0, 3, 0, 3, 0, 1, 0, 3, 1, 6],
       [1, 0, 2, 3, 3, 3, 2, 2, 0, 1, 5]])
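The result in the session above is in COO format. If later steps need row slicing or fast arithmetic, it may be more convenient to ask hstack for CSR directly through its `format` argument. The following is a sketch under assumptions: the random matrix stands in for `tweets`, the values in `lex` are placeholders, and the comparison against a dense reference is only there as a sanity check:

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack

rng = np.random.default_rng(0)
dense = rng.integers(0, 4, size=(5, 10))  # stand-in for the tweets features
tweets = csr_matrix(dense)
lex = [9, 8, 7, 6, 5]                     # placeholder values, one per row

# Stack the extra column on the right and request CSR output.
combined = hstack((tweets, csr_matrix(lex).T), format="csr")

# Dense reference built with NumPy, used only to verify the values.
expected = np.column_stack((dense, lex))

print(combined.shape, combined.format)
print(np.array_equal(combined.toarray(), expected))
```

Converting to a dense array for the check is fine at this size, but it is best avoided on a real feature matrix, where it would defeat the purpose of keeping the data sparse.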