I want to print a vector (about 148K elements long) in Python (Jupyter), but only 8-10 values are shown, like this: [0 0 ... 0 0 0]
I would like to see the full result.
import re
import nltk
import numpy
import pandas
from nltk.corpus import stopwords
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

vocab = numpy.array(pandas.read_excel('E:\\for_bow.xlsx'))

def word_extraction(sentence):
    ignore = set(stopwords.words('english'))
    # Replace every non-word character with a space, then split into tokens
    words = re.sub(r"[^\w]", " ", sentence).split()
    cleaned_text = [w.lower() for w in words
                    if w not in ignore]
    return cleaned_text

def generate_bow(allsentences):
    for sentence in allsentences:
        words = word_extraction(sentence)
        bag_vector = numpy.zeros(len(vocab), int)
        for w in words:
            for i, word in enumerate(vocab):
                if word == w:
                    bag_vector[i] += 1
        print("{0}\n{1}\n".format(sentence, numpy.array(bag_vector)))

input_text = ["text1", "text2", "text3"]
generate_bow(input_text)
I read in another post that I should use the following code:
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
I tried it, but it did not work.
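For context, a likely reason the snippet above has no effect: `ast_node_interactivity` only controls which expression results a Jupyter cell displays, while the `...` comes from NumPy itself, which by default summarizes any array with more than 1000 elements when converting it to text. The sketch below illustrates this with NumPy's `threshold` print option. The array here is a hypothetical 2000-element stand-in for the real 148K bag-of-words vector, not the data from the question.

```python
import sys
import numpy

# Hypothetical stand-in for the 148K-element bag-of-words vector.
bag_vector = numpy.zeros(2000, dtype=int)
bag_vector[1234] = 3

# Default behaviour: arrays above the threshold (1000 elements) are
# summarized, so the text contains "..." in place of the middle values.
summarized = numpy.array2string(bag_vector)

# Raising the threshold for this one call renders every element.
full = numpy.array2string(bag_vector, threshold=sys.maxsize)

# Alternative: change the setting globally, which also affects print():
# numpy.set_printoptions(threshold=sys.maxsize)

print("..." in summarized, "..." in full)
```

With the global `numpy.set_printoptions` call placed before `generate_bow(input_text)`, the existing `print` should show the whole vector, although 148K values per sentence will produce very long output in a notebook cell.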