xpath java html: Querying an HTML page with XPath in Java

Can anyone recommend a Java library that lets me run an XPath query over an HTML page?

I tried using JAXP but it keeps giving me a strange error that I cannot seem to fix (thread "main" java.io.IOException: Server returned HTTP response code: 503 for URL: http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd).

Thank you very much.

EDIT

I found this:

// Create a new SAX parser factory
SAXParserFactory saxFactory = SAXParserFactory.newInstance();
// Turn on validation
saxFactory.setValidating(true);
// Create a validating SAX parser instance
SAXParser parser = saxFactory.newSAXParser();

// Create a new DOM document builder factory
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
// Turn on validation
domFactory.setValidating(true);
// Create a validating DOM parser
DocumentBuilder builder = domFactory.newDocumentBuilder();

from http://www.ibm.com/developerworks/xml/library/x-jaxpval.html but changing the setValidating argument to false did not change anything.

Solution

Setting the parser to "non-validating" only turns off validation; it does not stop the parser from fetching DTDs. As far as I recall, the DTD is needed not just for validation but also for entity expansion.

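If you want to suppress fetching of DTDs, you need to register an EntityResolver on the DocumentBuilder (via setEntityResolver; the DocumentBuilderFactory itself has no such setter). Implement the EntityResolver's resolveEntity method to always return an InputSource that wraps an empty string, so the parser reads an empty DTD instead of downloading the real one.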
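A minimal sketch of that approach, assuming the page is well-formed XHTML and does not rely on entities declared in the DTD (such as &nbsp;), since those become undefined once the DTD is replaced by an empty one. The class name XhtmlXPathDemo, the helper evaluate and the sample page are made up for illustration; only the JAXP calls are standard.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class XhtmlXPathDemo {

    // Hypothetical helper: parses XHTML text without fetching any external
    // DTD, then evaluates an XPath expression and returns its string value.
    static String evaluate(String xhtml, String expression) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(false);      // turns off validation only
        factory.setNamespaceAware(false);  // so unprefixed names like //title match

        DocumentBuilder builder = factory.newDocumentBuilder();
        // Answer every external entity request (including the DOCTYPE's DTD)
        // with an empty stream, so the parser never connects to www.w3.org.
        builder.setEntityResolver(
                (publicId, systemId) -> new InputSource(new StringReader("")));

        Document doc = builder.parse(new InputSource(new StringReader(xhtml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, doc);
    }

    public static void main(String[] args) throws Exception {
        // Sample page (an assumption for this sketch) with the same DOCTYPE
        // that triggered the 503 error in the question.
        String page =
            "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" "
          + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">"
          + "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
          + "<head><title>Hello</title></head><body><p>Hi</p></body></html>";
        System.out.println(evaluate(page, "//title"));
    }
}
```

The resolver is set on the DocumentBuilder rather than the factory, so each builder created from the same factory needs its own call to setEntityResolver. Pages that are not well-formed XML will still fail to parse with this setup and would need an HTML-tolerant parser instead.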
