Python
兵工厂三剑客
Do not worry that the road ahead holds no friend; who in the world does not know you?
Scraping the Maoyan movie rankings with scrapy + scrapy-splash
1. Create the project: scrapy startproject mtmovies
2. Generate the spider: cd mtmovies, then scrapy genspider mt https://maoyan.com/
3. Write items.py: # Define here the models for your scraped items # # See documentation in: # https://docs.scrapy.org/en/latest/topics/items.html i...
Original · 2020-07-09 18:12:54 · 368 views · 0 comments
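The excerpt stops before any scrapy-splash configuration, so the following is only a sketch of the settings.py entries that the scrapy-splash README documents for enabling the plugin. It assumes a Splash service is already listening at http://localhost:8050; that address is an assumption, not something taken from the post.

```python
# settings.py fragment (sketch): enable scrapy-splash in a scrapy project.
# Assumption: a Splash instance is reachable at this address.
SPLASH_URL = "http://localhost:8050"

# Downloader middlewares that route requests through Splash.
DOWNLOADER_MIDDLEWARES = {
    "scrapy_splash.SplashCookiesMiddleware": 723,
    "scrapy_splash.SplashMiddleware": 725,
    "scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware": 810,
}

# Spider middleware that deduplicates repeated Splash arguments.
SPIDER_MIDDLEWARES = {
    "scrapy_splash.SplashDeduplicateArgsMiddleware": 100,
}

# Splash-aware duplicate filter and HTTP cache storage.
DUPEFILTER_CLASS = "scrapy_splash.SplashAwareDupeFilter"
HTTPCACHE_STORAGE = "scrapy_splash.SplashAwareFSCacheStorage"
```

With these in place, a spider would typically yield scrapy_splash.SplashRequest objects instead of plain scrapy.Request objects so that pages are rendered before parsing.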
Working with HDFS from Python
Install the hdfs package: pip3 install hdfs
List the HDFS directory tree:
[root@hadoop hadoop]# hdfs dfs -ls -R /
drwxr-xr-x - root supergroup 0 2017-05-18 23:57 /Demo
-rw-r--r-- 1 root supergroup 3494 2017-05-18 23:57 /Demo/hadoop-.
Original · 2020-07-09 13:22:45 · 672 views · 0 comments
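A rough Python counterpart to the recursive listing could look like the sketch below. It relies on the hdfs package's Client.list(path, status=True), which returns (name, status) pairs. The helper names list_recursive and make_client, the WebHDFS URL http://hadoop:50070, and the user root are all assumptions made for illustration.

```python
def list_recursive(client, path="/"):
    """Return (full_path, type, length) for every entry below `path`."""
    entries = []
    # With status=True, client.list yields (name, status_dict) pairs.
    for name, status in client.list(path, status=True):
        full_path = path.rstrip("/") + "/" + name
        entries.append((full_path, status["type"], status["length"]))
        if status["type"] == "DIRECTORY":
            # Descend into subdirectories, like `hdfs dfs -ls -R`.
            entries.extend(list_recursive(client, full_path))
    return entries


def make_client(url="http://hadoop:50070", user="root"):
    """Build a WebHDFS client; the URL and user are assumed values."""
    from hdfs import InsecureClient  # imported lazily; needs `pip3 install hdfs`
    return InsecureClient(url, user=user)


# Usage, assuming the NameNode's WebHDFS endpoint is reachable:
#   for full_path, kind, length in list_recursive(make_client()):
#       print(kind, length, full_path)
```

Passing the client in as a parameter keeps the traversal logic independent of a live cluster, so it can be exercised with a stub object.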
Scraping JavaScript-rendered pages with scrapy and saving the data to MySQL
Versions: scrapy-2.2.0, Python-2.7.5
1. Install pymysql: pip3 install pymysql
2. Create the project: scrapy startproject jdproject
3. Generate the spider: cd jdproject, then scrapy genspider jd www.jd.com
4. Edit items.py: # Define here the models for your scraped items # # See docum...
Original · 2020-07-07 14:40:53 · 474 views · 0 comments
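The excerpt ends before the storage step, so the following is only a sketch of how a scrapy item pipeline might write items to MySQL through pymysql. The class name MySQLPipeline, the helper build_insert, the table name jd_items, and the connection parameters are all hypothetical and not taken from the post.

```python
def build_insert(table, row):
    """Build a parameterized INSERT statement for one row (a plain dict).

    Table and column names are interpolated directly, so they must come
    from trusted code (for example item field names), never from scraped data.
    """
    columns = sorted(row)
    column_list = ", ".join("`%s`" % c for c in columns)
    placeholders = ", ".join(["%s"] * len(columns))
    sql = "INSERT INTO `%s` (%s) VALUES (%s)" % (table, column_list, placeholders)
    return sql, [row[c] for c in columns]


class MySQLPipeline:
    """Hypothetical item pipeline that stores every item in one table."""

    def __init__(self, table="jd_items"):  # assumed table name
        self.table = table
        self.conn = None

    def open_spider(self, spider):
        import pymysql  # imported lazily; needs `pip3 install pymysql`
        # Assumed connection parameters; replace with real credentials.
        self.conn = pymysql.connect(host="localhost", user="root",
                                    password="secret", database="jd",
                                    charset="utf8mb4")

    def process_item(self, item, spider):
        sql, params = build_insert(self.table, dict(item))
        with self.conn.cursor() as cursor:
            cursor.execute(sql, params)  # values are escaped by the driver
        self.conn.commit()
        return item

    def close_spider(self, spider):
        if self.conn is not None:
            self.conn.close()
```

To activate such a pipeline, it would be registered under ITEM_PIPELINES in the project's settings.py, for example as jdproject.pipelines.MySQLPipeline if it lives in pipelines.py.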