1. First, there are two ways to count elements: one uses Java's built-in XPath support, the other uses StAX.
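2. Counting with XPath
1) Unfortunately, this approach can run out of memory when the XML file is large, since the JDK's XPath implementation evaluates the expression against a full in-memory document tree.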
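For comparison, the XPath approach named in this section's heading can be sketched as follows. This is a minimal sketch, not the original author's code: the class `XPathCountSketch` and the method `countElements` are hypothetical names, and the input is a small in-memory document rather than a file. It parses the whole document into a DOM tree before evaluating `count(//name)`, which is why memory use grows with the size of the file.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

// Hypothetical helper class for illustration; not part of the original post.
class XPathCountSketch {

    // Counts elements with the given name using the XPath count() function.
    static int countElements(InputStream in, String elementName) throws Exception {
        // The entire document is materialized as a DOM tree here,
        // so memory use is proportional to the document size.
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(in);

        XPath xpath = XPathFactory.newInstance().newXPath();
        // elementName is concatenated into the expression as-is,
        // so it is assumed to be a trusted, valid XML name.
        Double n = (Double) xpath.evaluate("count(//" + elementName + ")", doc, XPathConstants.NUMBER);
        return n.intValue();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><jiedian1/><other><jiedian1/></other></root>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println("count: " + countElements(in, "jiedian1")); // prints "count: 2"
    }
}
```

The `//` axis matches the element at any depth, so nested occurrences are counted as well.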
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

/**
 * Counts the occurrences of one element in an XML file.
 * Note: this listing reads the file with the StAX streaming API (javax.xml.stream),
 * not with javax.xml.xpath.
 * @create: 2019/5/3 10:25
 **/
public class testXml {
    public static void main(String[] args) throws IOException, XMLStreamException {
        // Parse the XML to get the data
        XMLInputFactory inputFactory = XMLInputFactory.newInstance();
        inputFactory.setProperty("http://www.oracle.com/xml/jaxp/properties/getEntityCountInfo", "yes");
        // Raise the total entity size limit; otherwise the parser reports error JAXP00010004
        inputFactory.setProperty("http://www.oracle.com/xml/jaxp/properties/totalEntitySizeLimit", Integer.MAX_VALUE);
        File file = new File("D:\\workplace\\xml.xml");
        int count = 0;
        // try-with-resources closes the file stream even if parsing fails
        try (InputStream isS = new FileInputStream(file)) {
            XMLStreamReader streamReader = inputFactory.createXMLStreamReader(isS);
            while (streamReader.hasNext()) {
                // next() advances the reader and returns the type of the event just read
                if (streamReader.next() == XMLStreamReader.START_ELEMENT) {
                    // System.out.println(streamReader.getLocalName());
                    if (streamReader.getLocalName().equals("jiedian1")) {
                        count++;
                    }
                }
            }
            streamReader.close();
        }
        System.out.println("count: " + count);
    }
}
For other matching patterns, see: https://blog.csdn.net/luoww1/article/details/49736559
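3. Counting with StAX
1) First, a method that counts a single element. Because StAX reads the file as a stream of events, it also works on large files.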
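/**
 * Counts how many times the given element occurs in the XML file at the given path.
 * @param xpath path of the XML file (a file path, not an XPath expression)
 * @param elementName element name
 * @return number of matching elements
 */
public int xpathForCountByEN(String xpath, String elementName) throws IOException, XMLStreamException {
    // inputFactory is not declared in this snippet; it is assumed to be an XMLInputFactory field
    // of the enclosing class, configured like the one in the testXml listing above.
    // Besides the imports of that listing, this method needs java.io.IOException.
    File file = new File(xpath);
    int count = 0;
    // try-with-resources closes the file stream even if parsing fails
    try (InputStream is = new FileInputStream(file)) {
        XMLStreamReader streamReader = inputFactory.createXMLStreamReader(is);
        while (streamReader.hasNext()) {
            // next() advances the reader and returns the type of the event just read
            if (streamReader.next() == XMLStreamReader.START_ELEMENT) {
                // System.out.println(streamReader.getLocalName());
                if (streamReader.getLocalName().equals(elementName)) {
                    count++;
                }
            }
        }
        streamReader.close();
    }
    System.out.printf("---> %s count: %d %n", elementName, count);
    return count;
}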
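To try the streaming count without a file on disk, the same loop can be run against an in-memory document. The following is a minimal sketch under stated assumptions: `StaxCountSketch` and `countElements` are hypothetical names, the method takes an `InputStream` instead of a path, and the factory uses default settings (no entity size limit is changed).

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class for illustration; not part of the original post.
class StaxCountSketch {

    // Counts START_ELEMENT events whose local name equals elementName.
    static int countElements(InputStream in, String elementName) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(in);
        int count = 0;
        try {
            while (reader.hasNext()) {
                // Only one event is held at a time, so memory use stays
                // roughly constant regardless of document size.
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && reader.getLocalName().equals(elementName)) {
                    count++;
                }
            }
        } finally {
            // Releases parser resources; it does not close the underlying stream.
            reader.close();
        }
        return count;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<root><jiedian1>a</jiedian1><other><jiedian1/></other><jiedian2/></root>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println("count: " + countElements(in, "jiedian1")); // prints "count: 2"
    }
}
```

Accepting an `InputStream` keeps the counting logic independent of where the XML comes from; a caller with a file can pass a `FileInputStream`.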
2) A method that counts several elements in one pass
// Besides the imports of the testXml listing, this method needs java.io.IOException,
// java.util.HashMap, java.util.List and java.util.Map.
// inputFactory is assumed to be an XMLInputFactory field of the enclosing class.
public Map<String, Integer> xpathForCountByEN(String xpath, List<String> elementNames) throws IOException, XMLStreamException {
    File file = new File(xpath);
    Map<String, Integer> result = new HashMap<>();
    // try-with-resources closes the file stream even if parsing fails
    try (InputStream is = new FileInputStream(file)) {
        XMLStreamReader streamReader = inputFactory.createXMLStreamReader(is);
        while (streamReader.hasNext()) {
            if (streamReader.next() == XMLStreamReader.START_ELEMENT) {
                String name = streamReader.getLocalName();
                if (elementNames.contains(name)) {
                    // Insert 1 on first sight, otherwise add 1 to the existing count
                    result.merge(name, 1, Integer::sum);
                }
            }
        }
        streamReader.close();
    }
    System.out.printf("---> %s result: %s %n", file.getName(), result);
    return result;
}
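One variation on the multi-element method, sketched here under assumptions that are not in the original: names are looked up in a `HashSet`, and every requested name is pre-seeded with 0 so that elements absent from the document still appear in the result. `StaxMultiCountSketch` and `countElements` are hypothetical names, and the input is an in-memory document.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class for illustration; not part of the original post.
class StaxMultiCountSketch {

    // Counts each requested element name in a single pass over the stream.
    static Map<String, Integer> countElements(InputStream in, List<String> elementNames)
            throws XMLStreamException {
        Set<String> wanted = new HashSet<>(elementNames);
        // LinkedHashMap keeps the order of the requested names; each starts at 0.
        Map<String, Integer> result = new LinkedHashMap<>();
        for (String name : elementNames) {
            result.put(name, 0);
        }
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if (wanted.contains(name)) {
                        result.merge(name, 1, Integer::sum);
                    }
                }
            }
        } finally {
            reader.close();
        }
        return result;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<root><a/><b/><a/><c/></root>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(countElements(in, Arrays.asList("a", "b", "d"))); // prints "{a=2, b=1, d=0}"
    }
}
```

Whether missing names should be reported as 0 or left out of the map depends on the caller; the method in the post leaves them out.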