Python
兵工厂三剑客
Fear not that the road ahead holds no friends; who under heaven does not know you?
Scraping the Maoyan movie rankings with scrapy + scrapy-splash
1. Create the project: scrapy startproject mtmovies
2. Generate the spider: cd mtmovies, then scrapy genspider mt https://maoyan.com/
3. Write items.py: # Define here the models for your scraped items # # See documentation in: # https://docs.scrapy.org/en/latest/topics/items.html i...
Original · 2020-07-09 18:12:54 · 359 views · 0 comments
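The preview stops before any scrapy-splash configuration appears. As a rough sketch of what a scrapy-splash project usually adds to settings.py, following the scrapy-splash README: the Splash address below is an assumption (a Splash instance listening on localhost:8050), not something taken from the article.

```python
# settings.py fragment for scrapy-splash (sketch, not the article's own file).

# Assumed address of a running Splash instance; adjust to your setup.
SPLASH_URL = "http://localhost:8050"

# Downloader middlewares that route requests through Splash.
DOWNLOADER_MIDDLEWARES = {
    "scrapy_splash.SplashCookiesMiddleware": 723,
    "scrapy_splash.SplashMiddleware": 725,
    "scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware": 810,
}

# Spider middleware that avoids storing duplicate Splash arguments.
SPIDER_MIDDLEWARES = {
    "scrapy_splash.SplashDeduplicateArgsMiddleware": 100,
}

# Splash-aware duplicate filter and HTTP cache storage.
DUPEFILTER_CLASS = "scrapy_splash.SplashAwareDupeFilter"
HTTPCACHE_STORAGE = "scrapy_splash.SplashAwareFSCacheStorage"
```

With these in place, a spider would typically yield scrapy_splash.SplashRequest objects instead of plain scrapy.Request objects so that pages are rendered before parsing.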
Working with HDFS from Python
Install the hdfs package: pip3 install hdfs
List the HDFS directory tree:
[root@hadoop hadoop]# hdfs dfs -ls -R /
drwxr-xr-x - root supergroup 0 2017-05-18 23:57 /Demo
-rw-r--r-- 1 root supergroup 3494 2017-05-18 23:57 /Demo/hadoop-.
Original · 2020-07-09 13:22:45 · 645 views · 0 comments
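For comparison with the hdfs dfs -ls -R / listing, here is a minimal sketch of the same recursive walk from Python using the hdfs package's WebHDFS client. The NameNode URL, port, and user in make_client are assumptions for illustration (50070 is the usual WebHDFS port on Hadoop 2.x), and list_recursive and make_client are hypothetical helper names, not taken from the article.

```python
import posixpath


def list_recursive(client, root="/"):
    """Return every directory and file path under root, similar to `hdfs dfs -ls -R`."""
    paths = []
    # client.walk yields (directory path, subdirectory names, file names).
    for dirpath, dirnames, filenames in client.walk(root):
        for name in list(dirnames) + list(filenames):
            paths.append(posixpath.join(dirpath, name))
    return paths


def make_client(url="http://hadoop:50070", user="root"):
    """Build a WebHDFS client; the URL and user defaults are assumptions."""
    # Imported here so the helper above can be used without the package installed.
    from hdfs import InsecureClient  # requires `pip3 install hdfs`
    return InsecureClient(url, user=user)
```

Calling list_recursive(make_client()) against a reachable NameNode would return the paths as a list of strings; unlike the shell command, this sketch does not include permissions, owners, or sizes.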
Scraping JavaScript-rendered pages with scrapy and saving the data to MySQL
Versions: scrapy-2.2.0, Python-2.7.5
1. Install pymysql. Run: pip3 install pymysql
2. Create the project. Run: scrapy startproject jdproject
3. Generate the spider: cd jdproject, then scrapy genspider jd www.jd.com
4. Edit items.py: # Define here the models for your scraped items # # See docum...
Original · 2020-07-07 14:40:53 · 464 views · 0 comments
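The preview is cut off before the MySQL step. One common way to store scraped items with pymysql is a Scrapy item pipeline, sketched below. The table name products, the title and price fields, and the connection parameters are all hypothetical placeholders, since the article's actual schema is not visible here.

```python
class MySQLPipeline:
    """Scrapy item pipeline that inserts each item into a MySQL table (sketch)."""

    # Hypothetical table and columns; replace with the real schema.
    INSERT_SQL = "INSERT INTO products (title, price) VALUES (%s, %s)"

    def __init__(self, connect=None):
        # `connect` is an optional zero-argument factory returning a DB connection.
        self._connect = connect
        self.conn = None

    def open_spider(self, spider):
        """Open the database connection once when the spider starts."""
        if self._connect is None:
            import pymysql  # requires `pip3 install pymysql`

            # Assumed local credentials and database name.
            self._connect = lambda: pymysql.connect(
                host="localhost", user="root", password="",
                database="jd", charset="utf8mb4",
            )
        self.conn = self._connect()

    def process_item(self, item, spider):
        """Insert one item with a parameterized query, then commit."""
        with self.conn.cursor() as cursor:
            cursor.execute(self.INSERT_SQL, (item["title"], item["price"]))
        self.conn.commit()
        return item

    def close_spider(self, spider):
        """Close the connection when the spider finishes."""
        self.conn.close()
```

To enable a pipeline like this, it would be registered under ITEM_PIPELINES in settings.py, for example as "jdproject.pipelines.MySQLPipeline" with a priority such as 300 (the module path is an assumption based on the project name above). Passing values as query parameters rather than formatting them into the SQL string lets the driver handle escaping.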