- Blog posts (12)
Original  Introduction to multithreaded crawling of Qiushibaike
import urllib.request
import urllib.error
import re
import threading
headers = ("User-Agent","Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537....
2018-05-31 19:32:58 254
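Original  Introduction to multithreading with urllib: code implementation
import threading  # import the threading module
class A(threading.Thread):  # define a thread class A
    def __init__(self):  # one of the two methods to define: initialise the thread
        threading.Thread.__init__(self)
    def run(self):  # one of the two methods to define: the thread's run method
        ...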
2018-05-31 19:25:44 428
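The preview above is cut off before the body of `run`, so the following is only a minimal sketch of the same pattern (subclass `threading.Thread`, call the parent initialiser, put the work in `run`). The `Worker` class, the `run_workers` helper, and the squaring task are hypothetical stand-ins, not the post's actual code. Note that only `run` strictly has to be overridden; `__init__` is needed only when the thread takes its own arguments.

```python
import threading


class Worker(threading.Thread):
    """Hypothetical worker: squares its numbers and stores the results."""

    def __init__(self, numbers, results, lock):
        super().__init__()          # always initialise the parent Thread
        self.numbers = numbers
        self.results = results      # list shared by all workers
        self.lock = lock            # protects the shared list

    def run(self):
        # run() is what executes in the new thread after start() is called.
        for n in self.numbers:
            with self.lock:
                self.results.append(n * n)


def run_workers(chunks):
    """Start one Worker per chunk, wait for all of them, return sorted results."""
    results = []
    lock = threading.Lock()
    workers = [Worker(chunk, results, lock) for chunk in chunks]
    for w in workers:
        w.start()
    for w in workers:
        w.join()                    # block until each thread has finished
    return sorted(results)
```

Calling `run_workers([[1, 2], [3, 4]])` starts two threads. The results are sorted before returning because the order in which threads append is not guaranteed.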
Original  A basic crawler (Qiushibaike)
import urllib.request
import urllib.error
import re
headers = ("User-Agent","Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36")
opener = url...
2018-05-31 19:23:49 160
Original  WeChat crawler: scraping page content (using a proxy and a simulated browser)
#http://weixin.sogou.com/
import re
import urllib.request
import time
import urllib.error
import scipy
# Custom function: fetch a URL through a proxy server
def use_proxy(proxy_addr,url):
    # Set up exception handling
    tr...
2018-05-31 18:45:00 5654
Original  Automatic login to Douban (when no CAPTCHA appears)
import urllib.request
import xlsxwriter
import re
# Simulate a POST request
import urllib.parse, urllib.request, http.cookiejar, re
cookie = http.cookiejar.CookieJar()
cookieProc = urllib.request.HTTPCookieProcesso...
2018-05-30 22:43:51 1015
Original  Working with Excel in Python using xlrd and xlsxwriter
import xlrd,xlwt
# Open the Excel file and get all sheets
workbook = xlrd.open_workbook(r'D:\1.xlsx')
sheetlist=workbook.sheet_names()
# print (sheetlist)
coldata=[]
for ele in sheetlist:
    sheet=workbook.sheet_by_name...
2018-05-30 13:25:17 1391 1
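Original  Python regular expressions
import re
#
# ^ matches the start of the string
# $ matches the end of a line
# . matches any single character except a newline
# [......] matches any single character inside the brackets
# [^......] matches any single character not listed in the brackets
# * matches zero or more repetitions of the preceding expression
# + matches one or more repetitions of the preceding expression
# ? matches zero or one occurrence of the preceding expression
# {n} matches exactly n repetitions of the preceding expression
# {n,m} matches at...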
2018-05-30 10:31:56 115
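To make the metacharacter list concrete, here is a small sketch using the standard `re` module. The sample string `text` and the `demo` helper are made up for illustration and do not come from the post.

```python
import re

# Hypothetical sample text: two lines, a date, and two spellings of one word.
text = "cat\ncot 2018-05-30 colour color"


def demo(pattern, flags=0):
    """Return every non-overlapping match of pattern in the sample text."""
    return re.findall(pattern, text, flags)


any_char = demo(r"c.t")                    # . is any character except newline
line_start = demo(r"^c.t", re.MULTILINE)   # ^ anchors at each line start here
line_end = demo(r"t$", re.MULTILINE)       # $ anchors at each line end here
optional = demo(r"colou?r")                # ? makes the preceding 'u' optional
char_class = demo(r"[0-9]+")               # + repeats the class one or more times
exactly_n = demo(r"\d{4}")                 # {n} requires exactly four digits
n_to_m = demo(r"\d{1,2}")                  # {n,m} takes one or two digits, greedily
```

Without `re.MULTILINE`, `^` and `$` anchor only at the start and end of the whole string (with `$` also matching just before a trailing newline), which is why the flag is passed in the two anchor examples.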
Original  English word tokenization and dictionary sorting in Python
speak='''Chief Justice Roberts, President Carter, President Clinton, President Bush, President Obama, fellow Americans and people of the world, thank you.
We, the citizens of America, are now joined in...
2018-05-28 14:40:31 1979
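The preview shows only the input text, so the post's own tokenization and sorting steps are not visible. One plausible approach, offered purely as a sketch, is to lowercase the text, pull out words with a regular expression, count them in a dictionary, and sort the entries by frequency. The `word_frequencies` function and its tie-breaking rule are assumptions, not the author's method.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Split English text into lowercase words and sort them by frequency.

    Returns (word, count) pairs, most frequent first; ties are broken
    alphabetically so the order is deterministic.
    """
    words = re.findall(r"[a-z']+", text.lower())   # letters and apostrophes only
    counts = Counter(words)                        # dict subclass: word -> count
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
```

For example, `word_frequencies("We, the people. The people!")` counts `the` and `people` twice each and `we` once, with punctuation discarded by the pattern.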
Original  Cluster analysis
import pandas as pda
import numpy as np
import missingno
import matplotlib.pyplot as plt
import seaborn as sns
# Load the data
data=pda.read_csv("114_congress.csv")
# Show the first few rows
print(data.head())
# Check for missing values
missingno....
2018-05-02 16:14:39 319
Original  Customer churn prediction (KNN, SVC, RF)
import pandas as pda
import numpy as np
import missingno
import matplotlib.pyplot as plt
userData=pda.read_csv("churn.csv")
print(userData.shape)
# print(userData.describe())
# print(userData.colum...
2018-05-02 16:04:20 861
Original  Credit card anomaly detection (oversampling, undersampling, logistic regression, confusion matrix)
import pandas as pda
import numpy as np
import matplotlib.pyplot as plt
import itertools
import missingno
data=pda.read_csv("creditcard.csv")
# print(data.head())
count_class=pda.value_counts(data.C...
2018-05-02 15:52:46 1272
Original  PCA dimensionality reduction
import numpy as np
import pandas as pda
from sklearn.datasets import load_iris
import matplotlib.pyplot as plt
# Load the data
iris=load_iris()
# print(iris)
data=iris["data"]
labels=iris["target"]
# print(da...
2018-05-02 09:55:19 220