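Original: Simple Java console multiple-choice quiz, adapted from code in a book
Reads questions stored in a txt file in a fixed format, using a 2017 ruankao (软考, software qualification exam) paper as the example. The post shows the resulting output.
StandardExam.java
package com.company.com.fancy;
import java.io.*;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
// Simple...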
2018-04-21 22:27:54 6105 1
Original: Using a selenium middleware with scrapy to crawl images from 半次元 (bcy)
spider.py
# -*- coding: utf-8 -*-
import scrapy
import logging,time
from sec_bcy.items import BcyItem
class BcyspiderSpider(scrapy.Spider):
    name = 'bcySpider'
    page_index = 2
    url='https:/...
2018-04-01 22:22:29 906
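Original: Using scrapy to crawl movie information from each category of Douban's "选电影" (Select Movies) page
The detail links of the movies in each category can be found in the JSON returned by the AJAX request.
spider.py:
# -*- coding: utf-8 -*-
import scrapy,json
from urllib.parse import quote
from sec_douban.items import SecDoubanItem
class SpidermovieSpider(scrapy.Spider):
    na...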
2018-03-25 20:49:46 1043
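The entry above notes that the detail links sit in the JSON returned by the AJAX request. A minimal sketch of pulling such links out of a response body follows. The payload shape is an assumption made for illustration: a top-level `data` list whose items carry a `url` field. The excerpt does not show the real field names, and `extract_detail_links` is a hypothetical helper, not part of the post.

```python
import json


def extract_detail_links(body, list_key="data", link_key="url"):
    """Return the detail-page links found in a JSON response body.

    list_key and link_key are assumed names for illustration only;
    the excerpt does not show the real response schema.
    """
    payload = json.loads(body)
    items = payload.get(list_key, [])
    # Skip entries that lack a link instead of raising KeyError.
    return [item[link_key] for item in items if link_key in item]


if __name__ == "__main__":
    sample = (
        '{"data": ['
        '{"title": "A", "url": "https://example.com/a"}, '
        '{"title": "B"}'
        ']}'
    )
    print(extract_detail_links(sample))
```

In a scrapy callback the same function could be fed `response.text`, with the two key names adjusted to whatever the actual response contains.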
Original: A rough Scrapy crawl of Douban film and TV information
douban.py
# -*- coding: utf-8 -*-
import scrapy
from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import CrawlSpider, Rule
from crawl_douban.items import CrawlDoubanItem
class Douba...
2018-03-24 17:00:15 316
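Original: PY27 RE matching: enter an account and password to fetch grades from the Yancheng Institute of Technology (盐城工学院) academic affairs system
# coding:utf-8
# Yancheng Institute of Technology academic affairs system
import urllib2,urllib
import cookielib,re
# Temporarily stores the student's cookie
stuCookie=''
# Declare a CookieJar instance to hold the cookie
cookie = cookielib.CookieJar()
# Use urllib2's HTTPCookieProcessor to create the cookie handler...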
2018-03-24 16:48:06 1483 1
Repost: A basic example of Python multithreading with threading
# coding:utf-8
import threading,requests,json
from Queue import Queue  # empty put get
from lxml import etree
CRAW_EXIT=False
PARSE_EXIT=False
total=1
class ThreadCrawl(threading.Thread):
    def __i...
2018-03-24 16:43:52 406
Original: Using Scrapy's default spider to crawl streamer avatars from 熊猫星颜 (Panda TV Xingyan)
pandaSpider.py
# -*- coding: utf-8 -*-
import scrapy,json
from crawl_pandatv.items import PandatvItem
import logging
class PandaspiderSpider(scrapy.Spider):
    name = 'panda'
    allowed_domains = ['...
2018-03-24 16:39:42 379
Repost: Scrapy CrawlSpider demo
dongguan.py
# -*- coding: utf-8 -*-
import scrapy
from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import CrawlSpider, Rule
from lx_dongguan.items import LxDongguanItem
class Donggu...
2018-03-24 16:35:05 202
Original: python3 selenium xpath: downloading avatars of Douyu (斗鱼) 颜值 streamers, a beginner demo
#coding:utf-8
# Download Douyu streamer images
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from lxml import etree
import requests
import time
# Configure Chrome to run headless
chrome_options = O...
2018-03-19 20:17:08 456