Java SAX, parsing any XML: using SAX (Java) to parse multiple XML messages from a single TCP stream

I'm using Java to connect to a TCP port that streams XML documents to me one after another, each starting with its own <?xml ...?> declaration. An example which demonstrates the format:

Fred Bloggs

Peter Jones

I'm using the org.xml.sax.* API. The SAX parsing works perfectly for the first document but throws an exception when it reaches the start of the second document:

Exception in thread "main" org.xml.sax.SAXParseException: The processing instruction target matching "[xX][mM][lL]" is not allowed.
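
The error can be reproduced without the socket. The following is a minimal sketch, assuming the JDK's built-in parser; the class name, the helper `tryParse`, and the `message`/`name` element names are made up for illustration and are not taken from the feed. It hands one parse call two concatenated documents and reports the resulting exception.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class SecondDeclarationRepro {

    /** Returns the parse error raised for the given text, or null if it parses cleanly. */
    static SAXParseException tryParse(String xml) throws Exception {
        XMLReader xr = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        DefaultHandler handler = new DefaultHandler();
        xr.setContentHandler(handler);
        xr.setErrorHandler(handler); // DefaultHandler rethrows fatal errors
        try {
            xr.parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXParseException e) {
            return e;
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical element names; only the repeated declaration matters here.
        String one = "<?xml version=\"1.0\"?>\n<message><name>Fred Bloggs</name></message>\n";
        String two = one + "<?xml version=\"1.0\"?>\n<message><name>Peter Jones</name></message>\n";

        System.out.println("single document: " + tryParse(one));
        System.out.println("two documents:   " + tryParse(two));
    }
}
```

An XML declaration is only allowed at the very start of a document, so the parser treats the second one as an illegal processing instruction.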

The following skeleton class demonstrates the setup I'm using:

import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLReaderFactory;
import java.net.Socket;

public class XMLTest extends DefaultHandler {

    public XMLTest() {
        super();
    }

    public static void main(String[] args) throws Exception {
        XMLReader xr = XMLReaderFactory.createXMLReader();
        XMLTest handler = new XMLTest();
        xr.setContentHandler(handler);
        xr.setErrorHandler(handler);
        // Hands the whole socket stream to a single parse call.
        xr.parse(new InputSource(new Socket("127.0.0.1", 4555).getInputStream()));
    }
}

I have no control over the format of the XML (it's a financial data feed), but I need to be able to parse it efficiently, and parse all the documents. I've spent the afternoon and evening trying different things, but none have yielded results. Any help would be greatly appreciated.

Solution

A SAX parser expects exactly one document per input, so the second declaration is an error. You need to split the stream on every <?xml version="1.0"?> and parse each document separately. A BufferedReader may be helpful for this. Kickoff example:

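// Needs java.io.BufferedReader, java.io.InputStreamReader and java.io.StringReader.
BufferedReader reader = new BufferedReader(new InputStreamReader(input, "UTF-8"));
StringBuilder builder = null;
for (String line; (line = reader.readLine()) != null;) {
    if (line.startsWith("<?xml ")) {
        if (builder != null) {
            // A new declaration ends the previous document, so parse it now.
            // InputSource(String) would treat the text as a system ID, hence the StringReader.
            xr.parse(new InputSource(new StringReader(builder.toString())));
        }
        builder = new StringBuilder();
    }
    if (builder != null) {
        builder.append(line).append('\n'); // readLine() strips the line terminator
    }
}
if (builder != null) {
    // Parse the last document once the stream has ended.
    xr.parse(new InputSource(new StringReader(builder.toString())));
}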
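
The splitting approach above can be tried end to end without a socket. This is only a sketch under stated assumptions: the class name `MultiDocSaxDemo`, the method `parseAll`, and the `message`/`name` element names are hypothetical, and a `ByteArrayInputStream` stands in for the TCP stream. It collects the text of every `name` element across all documents.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class MultiDocSaxDemo {

    /** Splits the stream on lines starting with "<?xml " and parses each document on its own. */
    static List<String> parseAll(InputStream input) throws Exception {
        final List<String> names = new ArrayList<>();
        final StringBuilder text = new StringBuilder();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                text.setLength(0); // start collecting text for this element
            }
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length); // may arrive in several chunks
            }
            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("name".equals(qName)) { // hypothetical element name
                    names.add(text.toString().trim());
                }
            }
        };

        XMLReader xr = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        xr.setContentHandler(handler);
        xr.setErrorHandler(handler);

        BufferedReader reader =
                new BufferedReader(new InputStreamReader(input, StandardCharsets.UTF_8));
        StringBuilder doc = null;
        for (String line; (line = reader.readLine()) != null;) {
            if (line.startsWith("<?xml ")) {
                if (doc != null) {
                    xr.parse(new InputSource(new StringReader(doc.toString())));
                }
                doc = new StringBuilder();
            }
            if (doc != null) {
                doc.append(line).append('\n');
            }
        }
        if (doc != null) {
            // The final document has no following declaration to trigger its parse.
            xr.parse(new InputSource(new StringReader(doc.toString())));
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        String feed =
                "<?xml version=\"1.0\"?>\n<message>\n<name>Fred Bloggs</name>\n</message>\n"
              + "<?xml version=\"1.0\"?>\n<message>\n<name>Peter Jones</name>\n</message>\n";
        InputStream in = new ByteArrayInputStream(feed.getBytes(StandardCharsets.UTF_8));
        System.out.println(parseAll(in));
    }
}
```

Two limitations are worth keeping in mind. The split is line-based, so it only works if every declaration begins a new line. On a live socket, a document is parsed only when the next declaration arrives or the stream closes, so the most recent message is always held back until then.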
