December 2017_大蛇王

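[Original] Python: calling an airline's API to fetch flight ticket data, a simple API example

First, a brief word on API testing. The two kinds of interface in common use today are HTTP APIs and RPC interfaces; this post covers the former. An HTTP API runs over HTTP and uses the path to select the method being called. Request messages are key-value pairs, and the response is usually a JSON string. Interface protocols: HTTP, web service, RPC, and so on. Request methods: GET and POST. Request parameter formats: a. GET requests pass everything in the URL, as url?param=xx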

2017-12-29 17:24:48 7822 1

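[Repost] The difference between response.text and response.content in Python

response.content returns the binary response body, whereas response.text decodes it using an encoding guessed from the response. If the server does not specify one, the default encoding is "ISO-8859-1" (when I first read this I wondered why the default was not UTF-8, then found out that this is how the HTTP protocol defines it, so...). That is why response.text can come back garbled. You can use response.enc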

2017-12-29 11:53:34 12670 2
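The garbling described in the entry above can be reproduced without any network call. The sketch below uses only the standard library, not Requests itself: `raw` stands in for `response.content`, and decoding it as ISO-8859-1 mimics what `response.text` would give when the server declares no charset. The sample string is made up for illustration.

```python
# Stand-in for response.content: the raw bytes of a UTF-8 encoded page.
raw = "机票".encode("utf-8")

# Decoding as ISO-8859-1 (the fallback described above) never fails,
# but it produces mojibake for non-Latin text.
garbled = raw.decode("iso-8859-1")

# Decoding with the encoding the page really uses recovers the text.
correct = raw.decode("utf-8")

print(repr(garbled))
print(correct)
```

With Requests, the usual remedy is to set `response.encoding` before reading `response.text`, or to decode `response.content` yourself.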

[Original] Python: computing the MD5 hash of a string

Environment: Python 3. Module used: hashlib.
    import hashlib
    def md5(str):
        m = hashlib.md5()
        m.update(str.encode("utf8"))
        print(m.hexdigest())
        return m.hexdigest()
    def md5GBK(str1):
        m = hashlib.md

2017-12-29 09:45:56 21083

[Original] MySQL CREATE TABLE statements with complete examples

1. The simplest form:
    CREATE TABLE t1( id int not null, name char(20));
2. With a primary key:
a:
    CREATE TABLE t1( id int not null primary key, name char(20));
b: composite primary key
    CREATE TABLE t1( id int

2017-12-28 17:35:10 166988 4

[Original] Comparing dates and time ranges in MySQL queries, with multi-condition lookups

MySQL date comparison statements:
    select * from student where '2012-02-27 00:00:00' created_date
    select * from student where UNIX_TIMESTAMP('2012-02-27 00:00:00') UNIX_TIMESTAMP(created_date);
www.2cto.com

2017-12-27 15:21:36 17523 1

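[Original] A very simple way to update data in MySQL: using REPLACE INTO, plus the usual insert, delete, select, and update

While coding today I learned how to use REPLACE INTO. It is really handy: an enhanced version of INSERT INTO. When inserting data into a table we often run into this situation: 1. first check whether the row exists; 2. if it does not, insert it; 3. if it does, update it. In SQL Server this can be handled as follows:
    if not exists (select 1 from t where id = 1)
    insert into t(id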

2017-12-27 10:00:32 7371
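The insert-or-overwrite behaviour described in the entry above can be tried locally. This is a minimal sketch that assumes SQLite as a stand-in for MySQL (SQLite accepts the same REPLACE INTO statement); the `name` column and its values are hypothetical, since the original excerpt is cut off before the column list ends.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for a MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")

# First call: no row with id = 1 exists, so this behaves like INSERT.
conn.execute("REPLACE INTO t (id, name) VALUES (?, ?)", (1, "old"))
# Second call: id = 1 exists, so the old row is removed and the new one inserted.
conn.execute("REPLACE INTO t (id, name) VALUES (?, ?)", (1, "new"))

rows = conn.execute("SELECT id, name FROM t").fetchall()
print(rows)
conn.close()
```

Note that REPLACE deletes the conflicting row and inserts a fresh one, so any column not listed in the statement falls back to its default rather than keeping its old value.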

[Original] Python: batch-downloading CAPTCHA images, a simple example

    # coding:utf8
    import requests
    def downimage(i):
        # Build a session
        sess = requests.Session()
        # Set up the request headers
        headers={"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHT

2017-12-26 10:45:35 5135 1

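[Original] Python: anti-scraping techniques aimed at selenium+PhantomJS and other simulated-browser crawlers

Using selenium+PhantomJS to scrape an airline website as the example. 1. Richness of accessed elements: an ordinary user opening a page generates a fairly rich set of requests, whereas an automated crawler usually visits only a few fixed pages, such as the airline's promotions page, fare class price pages, and route updates. (Figure: monitoring console of 岂安科技's risk-control product.) 2. Continuity of the access path: a user browsing pages usually follows a plausible path, for example from the home page to flight search, but a crawler collecting data automatically tends to go at page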

2017-12-25 17:30:43 2931

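[Repost] Redis: detailed usage and extensions

[Learn to start Redis] Starting Redis is very simple: running ./redis-server starts the server. You can also specify the configuration file to load, as follows:
    ./redis-server ../redis.conf
By default, redis-server does not run as a daemon, and the default server port is 6379. As for why the author chose 6379 as the default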

2017-12-25 11:21:18 2304 1

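[Repost] Redis: installation and usage

Installing Redis. Installing on Windows. Download: https://github.com/MSOpenTech/redis/releases. Redis supports both 32-bit and 64-bit; choose according to your platform. Here we download the Redis-x64-xxx.zip archive to drive C, extract it, and rename the folder to redis. Open a cmd window and use the cd command to switch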

2017-12-25 10:15:34 1027

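[Translation] Advanced usage of Python Requests (part 2)

Session objects. A session object lets you persist certain parameters across requests. It also persists cookies across all requests made from the same Session instance, and uses urllib3's connection pooling. So if you send several requests to the same host, the underlying TCP connection is reused, which can bring a significant performance gain (see HTTP persistent connection). A session object has the main Request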

2017-12-25 09:55:26 1567

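[Translation] Advanced usage of Python Requests

Making a request. Sending a network request with Requests is very simple. Begin by importing the Requests module:
    >>> import requests
Then try fetching a web page. In this example, we fetch GitHub's public timeline:
    >>> r = requests.get('https://github.com/timeline.json')
Now we have an object called r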

2017-12-25 09:51:26 1537

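[Repost] Converting images to binary format with Python+OpenCV

While learning TensorFlow I ran into a problem: during training, TensorFlow reads a binary image database file rather than image files, so the image files must be converted to binary format before training and testing. Below is the code I used on Ubuntu to read images with Python+OpenCV and convert them to a binary-format file.
    #coding=utf-8
    '''
    Created on March 24, 2016
    Uses Ope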

2017-12-25 09:37:21 10805

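[Repost] MySQL 5.7: adding users, deleting users, and granting privileges

    mysql -uroot -proot
In MySQL 5.7 the mysql.user table has no password field; it was changed to authentication_string.
I. Creating a user:
Command: CREATE USER 'username'@'host' IDENTIFIED BY 'password';
Example: CREATE USER 'dog'@'localhost' IDENT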

2017-12-22 13:24:39 745

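[Original] MySQL: basic operations and statements after first installation, a beginner's guide

First, cd into the bin directory under the installation directory (this is my installation path). Open cmd as administrator (to avoid insufficient privileges), then cd:
    E:\>cd E:\mysql\mysql-5.5.40-winx64\bin
On first installation, run mysqld.exe -install
Start MySQL:
    E:\mysql\mysql-5.5.40-winx64\bin>net start mysql
    The MySQL service is starting .
    The MySQL service has started...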

2017-12-22 13:21:44 15068 1

[Original] A summary of commonly used XPath methods in Python

Here is the content of a test.html file: first item, second item, third item, fourth item, fifth item. The XPath usage follows:
    #coding:utf-8
    import lxml
    import lxml.etree
    html=lxml.etree.parse("test.html")
    print type(html)
    res=h

2017-12-21 10:55:31 2515

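[Original] Python scraping with webdriver.Chrome and a simple matplotlib data visualization example

This project searches 智联 for the number of job postings in several Python specializations and displays the result as a chart.
    #coding:utf-8
    from selenium import webdriver
    import re  # regular expressions
    import matplotlib.pyplot as plt  # data visualization
    import matplotlib
    def getworknumbersbyname(searchn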

2017-12-21 10:46:31 1528

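[Original] Python: simulating login to 联合航空 (China United Airlines) and handling the CAPTCHA

The CAPTCHA itself is entered manually.
    # coding:utf8
    import requests
    # Recognize the CAPTCHA and convert the data
    def captcha(captcha_data):
        with open("chunqiu.jpg","wb") as f:
            f.write(captcha_data)
        text=raw_input("Enter the CAPTCHA: ")
        return text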

2017-12-21 10:32:20 524

[Original] Two small recursion examples in Python

    # coding:utf8
    # When using recursive functions, take care to avoid stack overflow
    def fact(n):
        if n==1:
            return 1
        return n*fact(n-1)
    a=fact(5)
    print(a)
    def fact(n):
        if n==1:
            return 1
        return n+fact(n-1)
    print(fact(3)

2017-12-21 10:19:27 995

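[Original] urllib in Python 2 vs Python 3: version differences and usage, scraping basics

First, the differences between urllib and urllib2 in Python 2: 1. urllib2 can accept an instance of the Request class to set the headers of a URL request, while urllib accepts only a URL. This means you cannot use the urllib module to spoof your User-Agent string (that is, to impersonate a browser). 2. urllib provides the urlencode method for generating GET query strings, while urllib2 does not. This is why urllib is often used together with urllib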

2017-12-21 10:04:46 4534
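For comparison with the Python 2 split described in the entry above, here is a minimal Python 3 sketch in which both capabilities live in one package: `urllib.parse.urlencode` builds the query string and `urllib.request.Request` carries custom headers. The URL, parameter names, and User-Agent string are placeholders, and no request is actually sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Build a GET query string (the job of urllib.urlencode in Python 2).
query = urlencode({"wd": "python", "page": 1})
url = "http://example.com/s?" + query

# Attach a custom User-Agent (the job of urllib2.Request in Python 2).
headers = {"User-Agent": "Mozilla/5.0 (illustrative UA string)"}
request = Request(url, headers=headers)

print(request.full_url)
# Request stores header names capitalized, hence "User-agent" here.
print(request.get_header("User-agent"))
```

Passing `request` to `urllib.request.urlopen` would send it; that step is left out so the sketch runs offline.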

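[Original] Reading and writing files in Python, and how to save while writing with flush()

First, the most commonly used ways to open a file in Python are roughly as follows:
1. Write
    # This mode deletes the existing text and writes from scratch
    File = open("test.txt",'w')
2. Read
    File = open("test.txt",'a+')
3. Append
    # This mode keeps the existing file and continues writing after it
    File = open("test.txt",'a')
...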

2017-12-20 16:30:19 20582 2
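The modes listed in the entry above, together with the flush() call from its title, can be seen working in a few lines. This is a sketch written for Python 3 under my own assumptions: it writes to a temporary directory rather than the post's test.txt, and it reads with the plain 'r' mode.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test.txt")

# 'w' truncates the file (or creates it) before writing.
with open(path, "w") as f:
    f.write("first\n")

# 'a' keeps the existing content and appends to the end.
f = open(path, "a")
f.write("second\n")
f.flush()  # hand buffered data to the OS so other readers can see it now

# 'r' reads; the appended line is visible even though f is still open.
with open(path, "r") as reader:
    content = reader.read()
f.close()

print(content)
```

Without the flush() call, the appended line may sit in Python's buffer until the file is closed, which is the "write but not yet saved" effect the title refers to.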

[Original] Python thread mutex lock: a simple example with threading.Lock()

    # encoding: UTF-8
    import threading
    import time
    # # Create the lock
    # lock=threading.Lock()
    # # Acquire
    # lock.acquire()
    # # Release
    # lock.release()
    def test_xc(num):
        f = open("test.txt", "a")
        f.write(str(num) +

2017-12-19 13:12:55 1375
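The acquire/release pattern outlined in the entry above is commonly written with a `with` block, which releases the lock even if the body raises. The sketch below is an assumption-laden variant of the post's idea: it appends to an in-memory list instead of test.txt, and `append_number` is a hypothetical name.

```python
import threading

lock = threading.Lock()
results = []

def append_number(num):
    # Only one thread at a time may run the body of this with-block.
    with lock:
        results.append(num)

threads = [threading.Thread(target=append_number, args=(i,)) for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))
```

The order in which threads reach the lock is not fixed, which is why the result is sorted before printing.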

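[Original] Python: how to start a crawler job on a schedule every day

Runs under Python 2.7; install the required modules first. To start the job at a fixed time every day, it is best to run the program on a Linux server, since a Linux machine need not be shut down, so the scheduled task stays alive.
    #coding:utf8
    import datetime
    import time
    def doSth():
        # Put the crawler code in this function
        print(u'This program is about to start running like crazy')
    # Most websites update at 1:00...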

2017-12-18 14:15:45 29804 11
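One way to implement the daily trigger described in the entry above is to compute how long to sleep until the next run time. The sketch below is written for Python 3 and is not the post's own code: `seconds_until` is a hypothetical helper, and the 1:00 target follows the comment in the excerpt.

```python
import datetime

def seconds_until(hour, minute, now):
    """Return the number of seconds from `now` until the next hour:minute."""
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        # Today's slot has already passed, so aim for tomorrow.
        target += datetime.timedelta(days=1)
    return (target - now).total_seconds()

# Example: at 23:30, the next 01:00 is 90 minutes away.
sample_now = datetime.datetime(2017, 12, 18, 23, 30, 0)
print(seconds_until(1, 0, sample_now))
```

A long-running script could loop over `time.sleep(seconds_until(1, 0, datetime.datetime.now()))` followed by a call to the crawler function; on a Linux server, cron is a common alternative.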

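[Original] Python: logging in to 人人网 (Renren) with username, password, and cookies to fetch a page

    #coding:utf-8
    import urllib
    import urllib2
    import cookielib
    # Build a CookieJar object with the CookieJar() class to store cookie values
    cookie=cookielib.CookieJar()
    # Build a handler object with HTTPCookieProcessor() to process cookies
    # The argument is the constructed Cook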

2017-12-18 09:47:14 2292

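[Original] Python selenium+PhantomJS: detailed usage and a simple example

Environment: Python 2.7. The required modules must be downloaded and installed separately. Put the PhantomJS executable in a directory that is on the PATH environment variable. (Notes: 1. PhantomJS is a headless browser, so screenshots are a convenient way to see what it is doing. 2. selenium is similar to 按键精灵, a macro tool: it replaces manual clicking on web pages.) Simulating a visit to Baidu and taking a screenshot:
    #coding:utf8
    # Import the package
    from selenium import webdriver
    # Use the plugin p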

2017-12-18 09:32:14 455

[Repost] Python: downloading and saving images with urllib.urlretrieve(), basic usage

Environment: Python 2.7
    #coding=utf-8
    import urllib
    import re
    def getHtml(url):
        page = urllib.urlopen(url)
        html = page.read()
        return html
    def getImg(html):
        reg = r'src="(.+?\.jpg)" pic_ext'

2017-12-18 09:25:36 3989

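[Repost] Python: simulating login to CSDN

Simulating a user login and submitting user information comes down to locating the relevant elements and filling them with the user's information; webdriver+PhantomJS (a headless browser) handles this well.
    from selenium import webdriver
    from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
    import reque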

2017-12-12 14:50:14 504

[Original] Python: scraping 斗鱼 (Douyu) with Ajax-loaded JS pagination, using the PhantomJS headless browser

Python 2.7
    #coding:utf8
    import unittest
    from selenium import webdriver
    from bs4 import BeautifulSoup as bs
    class douyu(unittest.TestCase):
        # Initialization method; it must be named setUp()
        def setUp(self):
            self.d

2017-12-08 13:56:33 1464

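[Original] MySQL download, installation, and configuration in detail, plus a Baidu Netdisk copy

Download the zip from the official site: https://www.mysql.com/downloads/ Baidu Netdisk zip: https://pan.baidu.com/s/1kVORuWR extraction code: vee0. The download is an archive; I extracted it to F:\mysql-5.7.20-winx64 and then added F:\mysql-5.7.20-winx64\bin to the PATH environment variable (if you do not know how to set environment variables, search Baidu). In any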

2017-12-07 15:14:46 1141

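[Original] vcredist_x64: safe download from Baidu Netdisk

https://pan.baidu.com/s/1gfB2rN9 extraction code: n5tg. Mainly for cases where some DLL components are missing; a one-click install fixes it.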

2017-12-07 14:39:12 6343

[Original] Python: how to use proxy IPs in a PhantomJS crawler project

    from selenium import webdriver
    from random import choice
    # Add a list of IPs to switch between at random
    ips=['61.135.217.7:80', '153.99.16.84:8118', '101.68.73.54:53281', '219.138.58.86:3128', '101.69.23.183:88

2017-12-06 14:07:48 2358

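[Original] Python: multithreaded scraping of 糗事百科 (Qiushibaike), saving the results as JSON

    # coding:utf-8
    # Thread library
    import threading
    # Queue
    from Queue import Queue
    # Parsing library
    from lxml import etree
    # Request handling
    import requests
    # JSON handling
    import json
    import time
    class ThreadCrawl(threading.Thread):
        de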

2017-12-06 13:50:25 864

[Original] Python scraping with selenium+PhantomJS: simulating clicks and input to get the loaded page source and ticket prices from 东航 (China Eastern Airlines)

    #coding:utf8
    from selenium import webdriver
    import time
    driver = webdriver.PhantomJS()
    driver.get('http://www.ceair.com/flight2014/pvg-nay-171201_CNY.html')
    time.sleep(1)
    driver.save_screenshot('5.

2017-12-06 13:42:30 4306 3

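[Original] Parsing JSON in a Python crawler: parsing and extracting from JSON files, and using jsonpath

This is the JSON file captured with a packet capture tool. Pasting its content into an online JSON parser gives the content below (in the right-hand pane). URL of the JSON file: url="http://www.lagou.com/lbs/getAllCitySearchLabels.json" The Python code to parse it and extract the city names is as follows:
    #coding:utf8
    import urlli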

2017-12-06 10:43:40 56786 1
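The extraction step described in the entry above, pulling every city name out of a nested JSON document, can be sketched with the standard library alone. Everything specific here is an assumption: the payload is a made-up sample rather than the real lagou.com response, and `find_all` is a hypothetical helper that behaves roughly like the JSONPath expression `$..name`.

```python
import json

# Hypothetical payload; the real response structure may differ.
raw = '{"content": {"data": [{"name": "Beijing"}, {"name": "Shanghai", "sub": [{"name": "Pudong"}]}]}}'

def find_all(node, key):
    """Collect every value stored under `key`, at any depth."""
    found = []
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                found.append(v)
            found.extend(find_all(v, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_all(item, key))
    return found

names = find_all(json.loads(raw), "name")
print(names)
```

A JSONPath library expresses the same recursive descent as a one-line query, which is convenient when the nesting depth is not known in advance.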

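[Original] Python: generating random integers, floats, and characters, and sorting, simple examples

    #coding:utf8
    import random
    # Random float between 0 and 1
    a=random.random()
    print(a)
    # Random float within a fixed range
    b=random.uniform(1, 10)
    print(b)
    # Pick one random character from a string
    c=random.choice('abcdefg&#%^*f')
    print(c)
    # Pick several random characters from a string, producing a list
    d=ra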

2017-12-06 09:38:16 11424

[Original] Python urllib2: crawling with browser-like requests

    #coding:utf-8
    import urllib2
    ua_headers={
        "User-Agent":"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:34.0) Gecko/20100101 Firefox/34.0"
    }
    request=urllib2.Request("http://baidu.com/",headers=ua_header

2017-12-04 14:49:03 1941

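[Original] Python: joining a list of strings, splitting strings, and removing spaces

    #coding:utf8
    list=[' sfsfsf','ADFFDS','adas dasd']
    # Join the strings in the list
    str="".join(list)
    print str
    # Strip leading (and trailing) whitespace
    str=str.strip()
    print str
    # Remove the spaces in the middle
    str=str.replace(" ","")
    print str
    # Join the strings in the list, with the character '+' at each joint
    str2="+".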

2017-12-04 14:46:14 6599

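[Original] Python: formatting time output and incrementing a date by one day

    import datetime
    # Current time
    now=datetime.datetime.now()
    # Increment
    delta=datetime.timedelta(days=1)
    # Time six days from now
    endnow=now+datetime.timedelta(days=6)
    # Convert the time six days from now to a string
    endnow=str(endnow.strftime('%Y-%m-%d'))
    of...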

2017-12-04 14:30:23 14055

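[Original] Python: fetching the Baidu home page through a free proxy, a simple example

    #coding:utf8
    import urllib2
    url="http://www.baidu.com/"
    # Proxy switch: whether to use the proxy
    # 西刺 (Xici) proxy site: http://www.xicidaili.com/
    proxyswitch=True
    # Build a Handler object; the argument is a dict containing the proxy type and the proxy server IP+PORT
    httpproxy_handler=urllib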

2017-12-04 14:19:27 3021

[Original] Python: scraping free proxy IPs from 西刺 (Xici) and checking them with telnetlib.Telnet

Last run: 2017.12.01, ran normally. Environment: Python 2.7
    #coding:utf8
    from bs4 import BeautifulSoup
    import urllib2
    import sys
    reload(sys)
    import telnetlib
    def getProxyList(targeturl="http://www.xicidaili.com/nn/

2017-12-01 17:28:02 2026
