py-BI: Word Segmentation and Clustering
vicky428
xiaodu_julei.py-20180830
# -*- coding: utf-8 -*- """Created on Thu Aug 30 11:46:33 2018 @author: wenyun.wxw""" import time import re import os import sys import codecs import shutil import numpy as ...
Original · 2019-07-24 12:44:36 · 106 views · 0 comments
advcase1.py-20180704
#!/usr/bin/env python3 # -*- coding: utf-8 -*- """Created on Tue Jul 3 17:43:53 2018 @author: vicky""" # Import third-party packages import pandas as pd import numpy as np import statsmodels.formula.api as smf from ...
Original · 2019-07-23 12:49:55 · 80 views · 0 comments
julei.py-20180721
# -*- coding: utf-8 -*- """Created on Tue Jul 17 21:00:19 2018 @author: wenyun.wxw""" # Feature extraction # - Tf-idf # Term-frequency matrix: element a[i][j] is the frequency of word j in text class i from sklearn.feature_extraction.text import CountVectorize...
Original · 2019-07-23 12:50:15 · 240 views · 0 comments
julei2.py-20180721
# -*- coding: utf-8 -*- """Created on Wed Jul 18 09:35:01 2018 @author: wenyun.wxw""" import jieba from sklearn.feature_extraction.text import TfidfVectorizer from sklearn.cluster import KMeans...
Original · 2019-07-23 12:50:25 · 121 views · 0 comments
fenci.py-20180722
# -*- coding: utf-8 -*- """Created on Fri Jul 13 20:00:33 2018 @author: wenyun.wxw""" import jieba import re ratecontent2=ratecontent for i in range(len(ratecontent)): if ratecontent[i]=='此...
Original · 2019-07-23 12:50:35 · 143 views · 0 comments
pinglun.py-20180722
# -*- coding: utf-8 -*- """Created on Thu Jul 12 15:08:33 2018 @author: wenyun.wxw""" import requests import re #https://rate.tmall.com/list_detail_rate.htm? #itemId=567925396518 # product ID #&spu...
Original · 2019-07-23 12:50:42 · 211 views · 0 comments
importdata.py-20180729
# -*- coding: utf-8 -*- """Created on Thu Jul 26 10:51:30 2018 @author: wenyun.wxw""" # Count the number of lines count = -1 for count,line in enumerate(open('data.txt','r',encoding='utf-8')): pass count=count+1...
Original · 2019-07-23 12:50:49 · 80 views · 0 comments
fenceng.py-20180730
# -*- coding: utf-8 -*- """Created on Fri Jul 27 11:42:16 2018 @author: wenyun.wxw""" import numpy as np import pandas as pd import matplotlib.pyplot as plt from sklearn.cluster import KMeans i...
Original · 2019-07-23 12:51:00 · 97 views · 0 comments
fenceng2.py-20180816
#!/usr/bin/env python3 # -*- coding: utf-8 -*- """Created on Sun Jul 29 22:37:01 2018 @author: vicky""" import numpy as np import pandas as pd import matplotlib.pyplot as plt from sklearn.clust...
Original · 2019-07-23 12:51:10 · 99 views · 0 comments
julei3.py-20180824
# -*- coding: utf-8 -*- """Created on Wed Jul 18 11:01:41 2018 @author: wenyun.wxw""" # coding=utf-8 """ Created on 2016-01-06 @author: Eastmount """ import time import re ...
Original · 2019-07-23 12:51:38 · 409 views · 0 comments
fenceng3.py-20180827
#!/usr/bin/env python3 # -*- coding: utf-8 -*- """Created on Fri Aug 3 16:46:12 2018 @author: vicky""" import numpy as np import pandas as pd import matplotlib.pyplot as plt from sklearn.clust...
Original · 2019-07-24 12:43:39 · 112 views · 0 comments
xiaodu.py-20180830
# -*- coding: utf-8 -*- """Created on Mon Aug 27 14:42:09 2018 @author: wenyun.wxw""" import requests import re # Build the list of page URLs urls = [] # Substitute the page number with i; fetch the first 100 pages of reviews for i in list(range(1,99)): urls.append('h...
Original · 2019-07-24 12:43:46 · 197 views · 0 comments
xiaodu_jd.py-20180830
# -*- coding: utf-8 -*- """Created on Tue Aug 28 11:02:24 2018 @author: wenyun.wxw""" import requests import re def xiaodu(score): # score: 0 = all, 1 = negative, 2 = neutral, 3 = positive, 4 = reviews with images urls = [] # Substitute the page number with i,...
Original · 2019-07-24 12:43:56 · 194 views · 0 comments
cipin.py-20180830
# -*- coding: utf-8 -*- """Created on Tue Aug 28 15:09:32 2018 @author: wenyun.wxw""" data=ratecontent_less+ratecontent_jd_less # Merge the Tmall and JD reviews num=len(data) for i in range(num-1): if data[i]==...
Original · 2019-07-24 12:44:06 · 320 views · 1 comment
xiaodu_fenci.py-20190830
# -*- coding: utf-8 -*- """Created on Tue Aug 28 14:38:16 2018 @author: wenyun.wxw""" import jieba import re from wordcloud import WordCloud import matplotlib.pyplot as plt data=ratecontent_less...
Original · 2019-07-24 12:44:24 · 189 views · 0 comments
advcase.py-20180704
#!/usr/bin/env python3 # -*- coding: utf-8 -*- """Created on Tue Jul 3 17:43:53 2018 @author: vicky""" # Import third-party packages import pandas as pd import numpy as np import statsmodels.formula.api as smf from ...
Original · 2019-07-22 19:46:22 · 130 views · 0 comments
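
The julei.py excerpt in the list above defines a term-frequency matrix whose element a[i][j] is the frequency of word j in text i. The excerpt is cut off before any implementation appears, so the following is only a minimal sketch of that definition using the Python standard library rather than the scikit-learn import shown in the post. The function name term_frequency_matrix and the toy corpus are hypothetical, and the texts are assumed to be already segmented into whitespace-separated tokens (for Chinese reviews that step would typically be done with a segmenter such as jieba).

```python
from collections import Counter


def term_frequency_matrix(docs):
    """Return (vocabulary, matrix) where matrix[i][j] counts word j in docs[i].

    Assumption: each document is already tokenized and its tokens are
    separated by whitespace.
    """
    # One Counter per document: token -> number of occurrences.
    counts = [Counter(doc.split()) for doc in docs]
    # Sorted vocabulary gives every word a stable column index j.
    vocabulary = sorted(set().union(*counts))
    # Row i holds the counts of every vocabulary word in document i.
    matrix = [[c[word] for word in vocabulary] for c in counts]
    return vocabulary, matrix


# Hypothetical toy corpus, not taken from the original posts.
docs = ["good sound good price", "bad sound", "good price"]
vocabulary, a = term_frequency_matrix(docs)

for word_index, word in enumerate(vocabulary):
    print(word, [row[word_index] for row in a])
```

Each printed line shows one column of the matrix, that is, how often a single word occurs in each text. A matrix of this shape is the usual input for the Tf-idf weighting mentioned in the same excerpt.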