Principal component analysis is PCA; in scikit-learn it is provided as sklearn.decomposition.PCA.
Using PCA in scikit-learn: https://blog.csdn.net/u012162613/article/details/42192293
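
As a quick reminder of what the PCA call looks like, here is a minimal sketch. The random data, the array shape, and the choice of 2 components are assumptions made up for illustration; they do not come from the linked post.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data: 100 samples with 5 features each (illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))

# Keep the 2 directions of largest variance.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)  # fit the model, then project X onto it

# Fraction of the total variance captured by each kept component.
ratios = pca.explained_variance_ratio_
print(X_reduced.shape, ratios)
```

fit_transform fits and projects in one step; for new data, call pca.transform on the already fitted model instead of fitting again.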
Chinese text processing with jieba: https://blog.csdn.net/ch1209498273/article/details/78401637
Using defaultdict from the collections module: https://blog.csdn.net/brucewong0516/article/details/810
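
A small sketch of the two most common defaultdict patterns, counting and grouping. The word list is a made-up example, not taken from the linked post.

```python
from collections import defaultdict

# Hypothetical token list (illustration only).
words = ["pca", "jieba", "gensim", "pca", "gensim", "pca"]

# defaultdict(int): missing keys start at 0, so no "if key in dict" check is needed.
counts = defaultdict(int)
for w in words:
    counts[w] += 1

# defaultdict(list): missing keys start as an empty list, handy for grouping.
by_length = defaultdict(list)
for w in words:
    by_length[len(w)].append(w)

print(dict(counts))
print(dict(by_length))
```

Note that merely reading a missing key (for example counts["nltk"]) inserts it with the default value, which can be surprising when the dictionary is iterated over later.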
A walkthrough similar to the Python course, "python+gensim | jieba word segmentation, bag-of-words doc2bow, TF-IDF text mining":
https://blog.csdn.net/sinat_26917383/article/details/71436563
How to use gensim, with examples (computing text similarity): https://blog.csdn.net/u014595019/article/details/52218249
Doc2Vec parameters: class gensim.models.doc2vec.Doc2Vec(documents=None, dm_mean=None, dm=1, dbow_words=0, dm_concat=0, dm_tag_count=1, docvecs=None, docvecs_mapfile=None, comment=None, trim_rule=None, **kwargs) :
https://blog.csdn.net/HHTNAN/article/details/78750270