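Scraping JD.com Province, City, and District/County Data with Java

Most systems need regional data. If you had to enter every region in the country by hand, you would go crazy! The program below may help: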

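import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AreaUtils {

	// JD province IDs mapped to province names; these IDs are the parentId values for level 1 requests
	private final static Map<Integer,String> provinces=new HashMap<Integer,String>();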
	
	static{
		provinces.put(1, "北京");
		provinces.put(2, "上海");
		provinces.put(3, "天津");
		provinces.put(4, "重庆");
		provinces.put(5, "河北");
		provinces.put(6, "山西");
		provinces.put(7, "河南");
		provinces.put(8, "辽宁");
		provinces.put(9, "吉林");
		provinces.put(10, "黑龙江");
		provinces.put(11, "内蒙古");
		provinces.put(12, "江苏");
		provinces.put(13, "山东");
		provinces.put(14, "安徽");
		provinces.put(15, "浙江");
		provinces.put(16, "福建");
		provinces.put(17, "湖北");
		provinces.put(18, "湖南");
		provinces.put(19, "广东");
		provinces.put(20, "广西");
		provinces.put(21, "江西");
		provinces.put(22, "四川");
		provinces.put(23, "海南");
		provinces.put(24, "贵州");
		provinces.put(25, "云南");
		provinces.put(26, "西藏");
		provinces.put(27, "陕西");
		provinces.put(28, "甘肃");
		provinces.put(29, "青海");
		provinces.put(30, "宁夏");
		provinces.put(31, "新疆");
		provinces.put(32, "台湾");
		provinces.put(42, "香港");
		provinces.put(43, "澳门");
		provinces.put(84, "钓鱼岛");
	}
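	// Non-greedy match for the JSON array embedded in the response: from the first "[" to the first "]"
	private static final String area_pattern="\\[.+?\\]";
	// URL template; [level] and [parentId] are placeholders filled in by getAreas
	public static String areaUrl="http://passport.jd.com/emReg/AjaxService.aspx?action=GetAreas&level=[level]&parentId=[parentId]";
	/**
	 * Fetches the child areas of the given parent from JD.
	 *
	 * @author YLPan
	 * @date 2013-5-15
	 * @param level 1 to fetch cities, 2 to fetch districts/counties
	 * @param parentId ID of the parent area (a province ID for level 1, a city ID for level 2)
	 * @return the list of areas, each a map with "Id" and "Name"; null if the response contains no array
	 * @throws Exception if the request or the JSON parsing fails
	 */
	public static List<Map<String,Object>> getAreas(Integer level,Integer parentId) throws Exception{
		// String.replace substitutes literally, so the brackets need no regex escaping
		String cityUrl=areaUrl.replace("[level]",String.valueOf(level)).replace("[parentId]", String.valueOf(parentId));
		System.out.println("cityUrl:"+cityUrl);
		// Read the response body as GBK text
		String cityJson=NetTool.getTextContent(cityUrl, "gbk");
		Pattern pattern = Pattern.compile(area_pattern);
		Matcher matcher = pattern.matcher(cityJson);
		if(matcher.find()){
			// Keep only the array part, e.g. [{"Id":72,"Name":"朝阳区"},...]
			cityJson=matcher.group();
			List<Map<String,Object>> cityList=JsonUtils.readJson2ListMap(cityJson);
			return cityList;
		}
		return null;
	}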
	/**
	 * Walks every province, then its cities, then each city's districts/counties, printing the names.
	 */
	public static void areaInit() throws Exception{
		for(Entry<Integer,String> entry : provinces.entrySet()){
			System.out.println("province:"+entry.getValue());
			// Level 1: cities under this province
			List<Map<String,Object>> cityList=getAreas(1,entry.getKey());
			if(cityList==null)continue;
			for(Map<String,Object> citymap : cityList){
				Integer cityId=(Integer)citymap.get("Id");
				String cityName=(String)citymap.get("Name");
				System.out.println("--cityName:"+cityName);
				// Level 2: districts/counties under this city
				List<Map<String,Object>> countyList=getAreas(2,cityId);
				if(countyList==null)continue;
				for(Map<String,Object> countyMap : countyList){
					Integer countyId=(Integer)countyMap.get("Id");
					String countyName=(String)countyMap.get("Name");
					System.out.println("----countyName:"+countyName);
				}
			}
		}
	}
	public static void main(String[] args) {
		try {
			areaInit();
		} catch (Exception e) {
			e.printStackTrace();
		}
	}
}

Output:

province:北京
cityUrl:http://passport.jd.com/emReg/AjaxService.aspx?action=GetAreas&level=1&parentId=1
--cityName:朝阳区
cityUrl:http://passport.jd.com/emReg/AjaxService.aspx?action=GetAreas&level=2&parentId=72
----countyName:三环以内
----countyName:三环到四环之间
----countyName:四环到五环之间
----countyName:五环到六环之间
----countyName:管庄
----countyName:北苑
----countyName:定福庄
--cityName:海淀区
cityUrl:http://passport.jd.com/emReg/AjaxService.aspx?action=GetAreas&level=2&parentId=2800
----countyName:三环以内
----countyName:三环到四环之间
----countyName:四环到五环之间
----countyName:五环到六环之间
----countyName:六环以外
----countyName:上地
----countyName:西三旗
----countyName:清河
----countyName:圆明园西路
----countyName:农业大学西校区
----countyName:西二旗
........................................
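
The program above only prints the names. If the goal is to load them into a system's region table, one option is to collect flat rows instead of printing. The following is a minimal sketch under assumptions: `AreaRowCollector`, `AreaRow`, and the `fetcher` parameter are hypothetical names that do not appear in the original code, and the fetcher is assumed to behave like `getAreas(level, parentId)`, returning null when nothing is found.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

public class AreaRowCollector {

    /** One flat record; level is 0 for province, 1 for city, 2 for district/county. */
    public static final class AreaRow {
        public final int id;
        public final int parentId;
        public final int level;
        public final String name;

        AreaRow(int id, int parentId, int level, String name) {
            this.id = id;
            this.parentId = parentId;
            this.level = level;
            this.name = name;
        }
    }

    /**
     * Collects provinces, cities, and districts/counties into one list.
     * The fetcher takes (level, parentId) and returns the child areas, or null.
     */
    public static List<AreaRow> collect(
            Map<Integer, String> provinces,
            BiFunction<Integer, Integer, List<Map<String, Object>>> fetcher) {
        List<AreaRow> rows = new ArrayList<>();
        for (Map.Entry<Integer, String> province : provinces.entrySet()) {
            // Provinces have no parent; 0 is used here as a placeholder parent ID.
            rows.add(new AreaRow(province.getKey(), 0, 0, province.getValue()));
            addChildren(rows, fetcher, 1, province.getKey());
        }
        return rows;
    }

    private static void addChildren(
            List<AreaRow> rows,
            BiFunction<Integer, Integer, List<Map<String, Object>>> fetcher,
            int level, int parentId) {
        List<Map<String, Object>> children = fetcher.apply(level, parentId);
        if (children == null) {
            return;
        }
        for (Map<String, Object> child : children) {
            // Accept any numeric type, since JSON libraries differ in what they return.
            int id = ((Number) child.get("Id")).intValue();
            rows.add(new AreaRow(id, parentId, level, (String) child.get("Name")));
            if (level < 2) {
                addChildren(rows, fetcher, level + 1, id);
            }
        }
    }
}
```

Because `getAreas` declares `throws Exception`, it cannot be passed directly as a `BiFunction`; a lambda that catches the exception and returns null (or rethrows it unchecked) would be needed to plug it in.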

If you open http://passport.jd.com/emReg/AjaxService.aspx?action=GetAreas&level=1&parentId=1 in a browser, the returned data has the following format:

({"Areas":[{"Id":72,"Name":"朝阳区"},{"Id":2800,"Name":"海淀区"},{"Id":2801,"Name":"西城区"},{"Id":2802,"Name":"东城区"},{"Id":2803,"Name":"崇文区"},{"Id":2804,"Name":"宣武区"},{"Id":2805,"Name":"丰台区"},{"Id":2806,"Name":"石景山区"},{"Id":2807,"Name":"门头沟"},{"Id":2808,"Name":"房山区"},{"Id":2809,"Name":"通州区"},{"Id":2810,"Name":"大兴区"},{"Id":2812,"Name":"顺义区"},{"Id":2814,"Name":"怀柔区"},{"Id":2816,"Name":"密云区"},{"Id":2901,"Name":"昌平区"},{"Id":2953,"Name":"平谷区"},{"Id":3065,"Name":"延庆县"}]})
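
To make the extraction step concrete, here is a small sketch that applies the same `\[.+?\]` pattern to a response of this shape and reads out the `Id` and `Name` pairs. The class `AreaResponseParser` and its regex-based item parsing are assumptions for illustration only; the original relies on `JsonUtils.readJson2ListMap`, whose implementation is not shown, and a real JSON library would be the more robust choice.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AreaResponseParser {

    // Same non-greedy pattern as area_pattern: from the first "[" to the first "]".
    private static final Pattern ARRAY = Pattern.compile("\\[.+?\\]");
    // Matches one flat object of the form {"Id":72,"Name":"..."}.
    private static final Pattern ITEM =
            Pattern.compile("\\{\"Id\":(\\d+),\"Name\":\"([^\"]*)\"\\}");

    /** Returns the bracketed array text, or null when the response has none. */
    public static String extractArray(String response) {
        Matcher m = ARRAY.matcher(response);
        return m.find() ? m.group() : null;
    }

    /** Turns the response into a list of maps with "Id" (Integer) and "Name" (String). */
    public static List<Map<String, Object>> parseAreas(String response) {
        List<Map<String, Object>> areas = new ArrayList<>();
        String array = extractArray(response);
        if (array == null) {
            return areas;
        }
        Matcher m = ITEM.matcher(array);
        while (m.find()) {
            Map<String, Object> area = new LinkedHashMap<>();
            area.put("Id", Integer.valueOf(m.group(1)));
            area.put("Name", m.group(2));
            areas.add(area);
        }
        return areas;
    }

    public static void main(String[] args) {
        String sample =
                "({\"Areas\":[{\"Id\":72,\"Name\":\"朝阳区\"},{\"Id\":2800,\"Name\":\"海淀区\"}]})";
        for (Map<String, Object> area : parseAreas(sample)) {
            System.out.println(area.get("Id") + " " + area.get("Name"));
        }
    }
}
```

Note that the non-greedy pattern stops at the first `]`, so it would cut the array short if an area name ever contained that character.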

Note: NetTool and JsonUtils are pre-packaged utility classes, which have been uploaded.
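
For readers without the uploaded classes, the following is a rough guess at what `NetTool.getTextContent(url, charset)` might do, written with the standard `HttpURLConnection` API. `NetToolSketch`, `readAll`, and the 5-second timeouts are assumptions for illustration; the actual NetTool implementation is not shown in this post and may differ.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.Charset;

public class NetToolSketch {

    /** Reads the whole stream and decodes the bytes with the given charset. */
    public static String readAll(InputStream in, Charset charset) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            // Decode once at the end so multi-byte characters are never split across chunks.
            return new String(buffer.toByteArray(), charset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Sends a GET request and returns the response body as text. */
    public static String getTextContent(String url, String charsetName) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setConnectTimeout(5000); // assumed timeout values, in milliseconds
        conn.setReadTimeout(5000);
        try (InputStream in = conn.getInputStream()) {
            return readAll(in, Charset.forName(charsetName));
        } finally {
            conn.disconnect();
        }
    }
}
```

With this sketch, the call in `getAreas` would read `NetToolSketch.getTextContent(cityUrl, "gbk")`.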
