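I crawled this data myself. It is complete and organized by province, city, and county, so it should be self-explanatory.
By the way, I don't know how to attach a file to a post. If anyone knows how, please tell me. Thanks.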
http://files.cnblogs.com/xiaoxiongbuwawa/XML3.XML_Format.xml
Source: http://blog.163.com/yuanzhf_2012/blog/static/2112011482012929454663/
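I spent the last couple of days tidying up the code that crawls this data. Originally I used loops nested inside loops, working through one level at a time. That version ran to several hundred lines, almost all of it repeated code. Today I reworked it to use recursion instead. I'm sharing it here; if anything is unclear, feel free to ask me.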
package my.android.weather;

import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.io.XMLWriter;

public class Weather
{
    public static void main(String[] args) throws Exception
    {
        // Root element of the output tree; the first level is added beneath it.
        Element rootElement = DocumentHelper.createElement("中国");
        Document document = DocumentHelper.createDocument(rootElement);

        // Level 0 with an empty ID requests the top-level list.
        new Weather().f(0, "", rootElement);

        XMLWriter xmlWriter = new XMLWriter(new FileOutputStream(new File("d:/XML3.XML")));
        try
        {
            xmlWriter.write(document);
        }
        finally
        {
            xmlWriter.close();
        }
    }

    // flag is the current depth, id is the parent ID to query,
    // and e is the element that receives the children.
    public void f(int flag, String id, Element e) throws Exception
    {
        InputStream is = connection("http://m.weather.com.cn/data5/city" + id + ".xml");

        // Read all bytes first and decode once, so a multi-byte character
        // split across two reads is not corrupted.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try
        {
            byte[] b = new byte[1024];
            int length;
            while ((length = is.read(b)) != -1)
            {
                bytes.write(b, 0, length);
            }
        }
        finally
        {
            is.close();
        }
        // Assumption: the response is UTF-8. The original code relied on the
        // platform default charset instead.
        StringBuilder sb = new StringBuilder(bytes.toString("UTF-8"));

        while (sb.length() > 0)
        {
            // Entries are separated by ","; the last one has no trailing comma.
            int index = sb.indexOf(",");
            if (index < 0)
                index = sb.length();

            // Within an entry, "|" separates the ID from the value.
            int i = sb.indexOf("|");

            String cID = sb.substring(0, i);
            String cName = sb.substring(i + 1, index);

            if (flag == 3)
            {
                // Deepest level: store the value as the ID attribute.
                e.addAttribute("ID", cName);
            }
            else
            {
                // Add a child element and recurse one level deeper.
                Element cElement = e.addElement(cName);

                f(flag + 1, cID, cElement);
            }
            sb.delete(0, index + 1);
        }
    }

    // Opens a GET connection and returns the response body stream.
    public InputStream connection(String url) throws Exception
    {
        URL target = new URL(url);
        HttpURLConnection httpConn = (HttpURLConnection) target.openConnection();
        httpConn.setRequestMethod("GET");
        httpConn.setReadTimeout(5000);
        return httpConn.getInputStream();
    }
}
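To try the recursion without touching the network, here is a minimal offline sketch of the same walk. The `FAKE` map stands in for the HTTP responses, and every name and code in it is invented sample data, not real output from the server. It also writes indented text rather than building a dom4j tree, so treat it only as an illustration of the idea.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Offline sketch of the recursive walk. FAKE replaces the HTTP calls;
// all IDs, names, and codes below are hypothetical sample data.
class RecursionSketch {
    static final Map<String, String> FAKE = new LinkedHashMap<>();
    static {
        FAKE.put("", "01|ProvinceA,02|ProvinceB");
        FAKE.put("01", "0101|CityA");
        FAKE.put("02", "0201|CityB");
        FAKE.put("0101", "010101|CountyA,010102|CountyB");
        FAKE.put("0201", "020101|CountyC");
        FAKE.put("010101", "010101|900000001");
        FAKE.put("010102", "010102|900000002");
        FAKE.put("020101", "020101|900000003");
    }

    // Appends one indented line per entry. Depth 3 holds the final code,
    // mirroring the flag == 3 branch in the listing above.
    static void walk(int depth, String id, StringBuilder out) {
        String response = FAKE.get(id);
        if (response == null) {
            return; // nothing recorded for this ID
        }
        for (String entry : response.split(",")) {
            int bar = entry.indexOf('|');
            String childId = entry.substring(0, bar);
            String value = entry.substring(bar + 1);

            for (int k = 0; k < depth; k++) {
                out.append("  ");
            }
            if (depth == 3) {
                out.append("ID=").append(value).append('\n');
            } else {
                out.append(value).append('\n');
                walk(depth + 1, childId, out); // same method, one level deeper
            }
        }
    }

    public static void main(String[] args) {
        StringBuilder out = new StringBuilder();
        walk(0, "", out);
        System.out.print(out);
    }
}
```

The single `walk` method handles every level, which is what removes the repeated code that the nested-loop version needed for each layer.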
For the XML, I used dom4j, so you will need to add its jar to your project, or rewrite that part to use whatever XML generation approach you prefer.
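If you would rather not add the dom4j jar, one possible replacement is the DOM and Transformer APIs that ship with the JDK. The sketch below is only an illustration of that option: the class name `JdkXmlSketch` and the province, city, county, and code values are made up. Roughly, dom4j's `addElement` corresponds to `createElement` plus `appendChild`, and `addAttribute` corresponds to `setAttribute`.

```java
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Sketch: build a tiny province/city/county tree with JDK classes only.
// Element names and the ID value are hypothetical sample data.
class JdkXmlSketch {
    static String build() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();

        Element root = doc.createElement("中国");
        doc.appendChild(root);

        Element province = doc.createElement("ProvinceA");
        root.appendChild(province);

        Element city = doc.createElement("CityA");
        province.appendChild(city);

        Element county = doc.createElement("CountyA");
        county.setAttribute("ID", "900000001");
        city.appendChild(county);

        // Serialize the tree to a string; a file-backed StreamResult
        // could be used instead to write to disk.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");

        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(build());
    }
}
```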