Reading an HTML File with XPath

没有能与不能只有想与不想

Category: XML

Article link: https://blog.csdn.net/yanghui07216/article/details/53664818

Goal: read all of the contact information stored in an HTML file.

HTML file: personList.html

<html>
	<head>
		<title>传智播客 January 18 Class Contact List</title>
		<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
	</head>
	<body>
		<center><h1>December 16 Employment Class Contact List</h1></center>
		<table border="1" align="center" id="contactForm">
			<thead>	
				<tr><th>ID</th><th>Name</th><th>Gender</th><th>Age</th><th>Address</th><th>Phone</th></tr>
			</thead>
			<tbody>
				<tr>
				<td>001</td>
				<td>Zhang San</td>
				<td>Male</td>
				<td>18</td>
				<td>Tianhe District, Guangzhou</td>
				<td>134000000000</td>
				</tr>
				<tr>
				<td>002</td>
				<td>Li Si</td>
				<td>Female</td>
				<td>20</td>
				<td>Yuexiu District, Guangzhou</td>
				<td>13888888888</td>
				</tr>
				<tr>
				<td>003</td>
				<td>Guo Jing</td>
				<td>Male</td>
				<td>30</td>
				<td>Panyu District, Guangzhou</td>
				<td>1342214321</td>
				</tr>
			</tbody>
		</table>
	</body>
</html>
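
Note that the program in this post parses the file with dom4j's SAXReader, which is an XML parser. The file above works because it happens to be well-formed XML (for example, the meta tag is self-closed). The following is a minimal sketch, not part of the original post, for checking that property with the JDK's built-in parser; the class name WellFormedCheck and the method isWellFormed are hypothetical names chosen for illustration.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not from the original post): reports whether markup
// can be parsed as XML, which an XML-based reader needs before XPath can run.
class WellFormedCheck {
	static boolean isWellFormed(String markup) {
		try {
			DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
			// DefaultHandler keeps the parser from printing its own error report to stderr.
			builder.setErrorHandler(new DefaultHandler());
			builder.parse(new InputSource(new StringReader(markup)));
			return true;
		} catch (SAXException e) {
			// The parser rejected the markup, e.g. because of an unclosed tag.
			return false;
		} catch (Exception e) {
			// I/O or parser configuration problems are not well-formedness answers.
			throw new IllegalStateException(e);
		}
	}

	public static void main(String[] args) {
		String closed = "<html><head><meta charset=\"UTF-8\" /></head><body></body></html>";
		String unclosed = "<html><head><meta charset=\"UTF-8\"></head><body></body></html>";
		System.out.println(isWellFormed(closed));
		System.out.println(isWellFormed(unclosed));
	}
}
```

HTML that browsers accept but that leaves tags such as meta or br unclosed would fail this check, and an XML parser would likely fail on it in the same way.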

Main program: Demo_xPath_html.java

package xPath;

import java.io.File;
import java.util.List;

import org.dom4j.Document;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

public class Demo_xPath_html {
	public static void main(String[] args) throws Exception{
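		// SAXReader is an XML parser, so this only works because personList.html is well-formed XML.
		// dom4j's XPath methods (selectSingleNode, selectNodes) need the jaxen library on the classpath.
		Document doc = new SAXReader().read(new File("./src/personList.html"));
		System.out.println(doc);
		// Read the title element
		Element titleElem = (Element) doc.selectSingleNode("//title");
		String title = titleElem.getText();
		System.out.println(title);
		
		/*
		 * Exercise: read all the contact information
		 * and print it in the following format:
		 * 		ID: 001 Name: ...
		 *  	...
		 */
		// 1. Select every tr element inside tbody.
		//    List<?> is used so the assignment compiles whether the dom4j version in use
		//    declares selectNodes as returning a raw List or a List<Node>.
		List<?> rows = doc.selectNodes("//tbody/tr");
		// 2. Iterate over the rows
		for (Object row : rows) {
			Element elem = (Element) row;
			// ID
			//String id = ((Element)elem.elements().get(0)).getText();// Option 1: child index (0-based)
			String id = ((Element)elem.selectSingleNode("td[1]")).getText();// Option 2: relative XPath (1-based)
			String name = ((Element)elem.elements().get(1)).getText();
			String gender = ((Element)elem.elements().get(2)).getText();
			String age = ((Element)elem.elements().get(3)).getText();
			String address = ((Element)elem.elements().get(4)).getText();
			String phone = ((Element)elem.elements().get(5)).getText();
			// The age is included here; the original output line read it but never printed it.
			System.out.println("ID: "+id+"\tName: "+name+"\tGender: "+gender+"\tAge: "+age+"\tAddress: "+address+"\tPhone: "+phone);
		}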
	}
}
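
For comparison, the same row-by-row XPath approach can be sketched with only the JDK's built-in javax.xml.xpath API, which is a different library from the dom4j used above. This is an illustrative sketch under stated assumptions, not the author's code: the class name JdkXPathContacts, the method readContacts, and the trimmed inline SAMPLE table (three columns instead of six) are all hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Sketch using the JDK's javax.xml.xpath instead of dom4j (hypothetical class).
class JdkXPathContacts {
	// Hypothetical inline sample: a trimmed, well-formed stand-in for personList.html.
	static final String SAMPLE =
		"<html><body><table id=\"contactForm\">"
		+ "<thead><tr><th>ID</th><th>Name</th><th>Age</th></tr></thead>"
		+ "<tbody>"
		+ "<tr><td>001</td><td>Zhang San</td><td>18</td></tr>"
		+ "<tr><td>002</td><td>Li Si</td><td>20</td></tr>"
		+ "<tr><td>003</td><td>Guo Jing</td><td>30</td></tr>"
		+ "</tbody></table></body></html>";

	// Selects every tbody/tr, then reads each cell with an XPath relative to that row.
	static List<String> readContacts(String xml) {
		try {
			XPath xpath = XPathFactory.newInstance().newXPath();
			NodeList rows = (NodeList) xpath.evaluate(
					"//tbody/tr", new InputSource(new StringReader(xml)), XPathConstants.NODESET);
			List<String> lines = new ArrayList<>();
			for (int i = 0; i < rows.getLength(); i++) {
				Node row = rows.item(i);
				// XPath positions are 1-based: td[1] is the first cell of the row.
				String id = xpath.evaluate("td[1]", row);
				String name = xpath.evaluate("td[2]", row);
				String age = xpath.evaluate("td[3]", row);
				lines.add("ID: " + id + "\tName: " + name + "\tAge: " + age);
			}
			return lines;
		} catch (XPathExpressionException e) {
			throw new IllegalArgumentException("Could not evaluate XPath against the input", e);
		}
	}

	public static void main(String[] args) {
		for (String line : readContacts(SAMPLE)) {
			System.out.println(line);
		}
	}
}
```

The design is the same as in the dom4j program: one absolute expression finds the rows, and short relative expressions such as td[1] pick out the cells, so the per-column logic does not depend on where the table sits in the page.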
