Java web crawler with jsoup: code for scraping the complete set of China's province, city, county, town, and village data
Code download: scofield7419/ChinesePCCTVLocationExtraction
ChinesePCCTVLocationExtraction
This is a cool try, buddy.
Description
Complete Chinese location information in the format province_city_county_town_village.
A caution about the method calling order:
On first use of this program, you have to run the “readAllProv();” method first, with the “getAllMaps();” call commented out, in order to generate all the property files. Then comment out “readAllProv();” and call “getAllMaps();”.
Just like this:
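A minimal sketch of that calling order follows. The class `CallOrderSketch` and the method bodies are hypothetical stand-ins; only the method names `readAllProv()` and `getAllMaps()` come from the text above, and what each one does internally is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the repository's crawler class.
// Only the two method names are taken from the README text.
class CallOrderSketch {
    // Records which phase ran, so the sketch has something observable.
    static final List<String> calls = new ArrayList<>();

    // Phase 1 (assumed behaviour): crawl each province and write its property file.
    static void readAllProv() {
        calls.add("readAllProv");
    }

    // Phase 2 (assumed behaviour): build the full maps from the saved property files.
    static void getAllMaps() {
        calls.add("getAllMaps");
    }

    public static void main(String[] args) {
        // First use: only the property-file phase is enabled.
        readAllProv();
        // getAllMaps();

        // Later runs: comment out readAllProv() above and enable getAllMaps() instead.
    }
}
```

Once the property files exist, swapping which of the two lines is commented out switches the program to the second phase.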
The project was built according to the following guidelines:
- The data was crawled from “中华人民共和国国家统计局2015数据” (2015 data from the National Bureau of Statistics of the People's Republic of China).
- The third-party library used is jsoup.
- Because of the traffic limits on the target server “中华人民共和国国家统计局2015数据”, I couldn't fetch all the data in a single program run. So I designed an approach that uses property files, which solved the problem.
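The property-file approach in the last guideline can be sketched roughly as a per-province checkpoint: if a province's file already exists it is loaded from disk, otherwise the data is fetched once and saved. This is an illustration under assumptions, not the repository's code. The class `PropertyCheckpointSketch`, the method `loadOrFetch`, and the `fetcher` parameter (a stand-in for the real jsoup crawl) are all hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;
import java.util.function.Function;

// Hypothetical helper: caches one province's crawl result in <dir>/<province>.properties.
class PropertyCheckpointSketch {

    // Loads the saved file when it exists; otherwise calls the fetcher once and saves the result.
    static Properties loadOrFetch(Path dir, String province,
                                  Function<String, Properties> fetcher) throws IOException {
        Files.createDirectories(dir);
        Path file = dir.resolve(province + ".properties");

        if (Files.exists(file)) {
            // Already crawled in an earlier run: read from disk, no network request.
            Properties cached = new Properties();
            try (Reader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
                cached.load(in);
            }
            return cached;
        }

        // Not crawled yet: fetch (assumed to wrap the real crawl) and persist the result.
        Properties fresh = fetcher.apply(province);
        try (Writer out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            fresh.store(out, "crawled data for " + province);
        }
        return fresh;
    }
}
```

With a helper like this, a run that gets cut off by the server can simply be restarted: provinces whose files were already written are skipped, and only the missing ones are requested again.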
Here is the properties folder:
properties/北京市.properties
Here is the outputs folder:
outputs/province_city_county_town_village.txt
and the output file is written like this:
Other files in assets:
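The underscore-joined line format can be sketched as follows. This is a minimal illustration rather than the repository's code: the class `LocationLineSketch`, its two methods, and the placeholder names in `main` are all assumptions, and the sketch assumes that no place name contains an underscore.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for the province_city_county_town_village line format.
class LocationLineSketch {

    // Joins the administrative levels, top level first, with underscores.
    static String toLine(List<String> levels) {
        return String.join("_", levels);
    }

    // Splits one output line back into its levels (assumes names contain no underscore).
    static List<String> fromLine(String line) {
        return Arrays.asList(line.split("_"));
    }

    public static void main(String[] args) {
        // Placeholder names only, not real data from the output file.
        List<String> levels = Arrays.asList("ProvinceA", "CityB", "CountyC", "TownD", "VillageE");
        System.out.println(toLine(levels));
    }
}
```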
assets/2015年全国城市省市县区行政级别对照表.xls
assets/province_city_county.txt
P.S. This file is in PCC (province_city_county) format,
and it looks like this:
Scofield.Phil
Email: feish7419@163.com
move fast, break things.