
Java XPath Tutorial: How To Parse XML File Using XPath In Java

By Viral Patel on May 13, 2013

XPath is a language for finding information in an XML file. You can say that XPath is (sort of) SQL for XML files. XPath is used to navigate through elements and attributes in an XML document. You can also use XPath to traverse through an XML file in Java.

XPath comes with powerful expressions that can be used to parse an xml document and retrieve relevant information.

For the demo, let us consider an XML file that holds information about employees.

 
    
    <?xml version="1.0"?>
    <Employees>
        <Employee emplid="1111" type="admin">
            <firstname>John</firstname>
            <lastname>Watson</lastname>
            <age>30</age>
            <email>johnwatson@sh.com</email>
        </Employee>
        <Employee emplid="2222" type="admin">
            <firstname>Sherlock</firstname>
            <lastname>Homes</lastname>
            <age>32</age>
            <email>sherlock@sh.com</email>
        </Employee>
        <Employee emplid="3333" type="user">
            <firstname>Jim</firstname>
            <lastname>Moriarty</lastname>
            <age>52</age>
            <email>jim@sh.com</email>
        </Employee>
        <Employee emplid="4444" type="user">
            <firstname>Mycroft</firstname>
            <lastname>Holmes</lastname>
            <age>41</age>
            <email>mycroft@sh.com</email>
        </Employee>
    </Employees>
        
 
  

I have saved this file at path C:\employees.xml. We will use this XML file in our demo and try to fetch useful information from it using XPath. Before we start, let's check a few facts about the XML file above.

There are 4 employees in our xml file
Each employee has a unique employee id defined by attribute emplid
Each employee also has an attribute type which defines whether an employee is admin or user.
Each employee has four child nodes: firstname, lastname, age and email
Age is a number

Let’s get started…

1. Learning Java DOM Parsing API

To understand XPath, we first need to understand the basics of DOM parsing in Java. Java provides a DOM parser implementation through the API shown below.

1.1 Creating a Java DOM XML Parser

First, we need to create a document builder using the DocumentBuilderFactory class. Just follow the code; it's pretty much self-explanatory.

 
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.parsers.ParserConfigurationException;

    //...

    DocumentBuilderFactory builderFactory =
            DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = null;
    try {
        builder = builderFactory.newDocumentBuilder();
    } catch (ParserConfigurationException e) {
        e.printStackTrace();
    }

1.2 Parsing XML with a Java DOM Parser

Once we have a document builder object, we use it to parse the XML file and create a document object.

 
    import java.io.FileInputStream;
    import java.io.IOException;
    import org.w3c.dom.Document;
    import org.xml.sax.SAXException;

    //...

    // Declared outside the try block so that later snippets can use it
    Document xmlDocument = null;
    try {
        xmlDocument = builder.parse(
                new FileInputStream("c:\\employees.xml"));
    } catch (SAXException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }

In the code above, we parse an XML file from the filesystem. Sometimes you might want to parse XML held in a String instead of reading it from a file. The code below comes in handy for that.

 
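    import java.io.ByteArrayInputStream;
    import java.nio.charset.StandardCharsets;
    //...
    String xml = ...;
    // Encode explicitly; getBytes() alone uses the platform default charset
    Document xmlDocument = builder.parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));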
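As an alternative to converting the String to bytes, the parser can read characters directly. The sketch below wraps the text in a StringReader and an InputSource, which DocumentBuilder.parse also accepts. The class name StringXmlParser, the helper parse and the tiny inline XML are assumptions made for illustration; they are not part of the tutorial's code.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class StringXmlParser {

    // Hypothetical helper: parses XML text held in memory into a DOM Document.
    static Document parse(String xml) {
        try {
            // InputSource wraps a character stream, so no byte encoding is involved.
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Rethrown unchecked only to keep the sketch short.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Employees><Employee emplid=\"1111\">"
                + "<firstname>John</firstname></Employee></Employees>";
        Document document = parse(xml);
        // Prints the name of the root element: Employees
        System.out.println(document.getDocumentElement().getNodeName());
    }
}
```

Reading characters rather than bytes sidesteps the question of which charset the String should be encoded with.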

1.3 Creating an XPath object

Once we have a document object, we are ready to use XPath. Just create an XPath object using XPathFactory.

 
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathFactory;

    //...

    XPath xPath = XPathFactory.newInstance().newXPath();

1.4 Using XPath to parse the XML

Use the XPath object to compile an XPath expression and evaluate it on the document. In the code below, we read the email address of the employee with employee id 3333. The code also shows the APIs for reading a single XML node and a node list.

 
    String expression = "/Employees/Employee[@emplid='3333']/email";

    //read a string value
    String email = xPath.compile(expression).evaluate(xmlDocument);

    //read an xml node using xpath
    Node node = (Node) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODE);

    //read a nodelist using xpath
    NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
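Besides NODE and NODESET, the XPathConstants class also defines NUMBER, BOOLEAN and STRING result types, which are convenient with XPath functions such as count() and sum(). The following is a minimal sketch of that idea, not part of the original tutorial: the class XPathResultTypes, the helper evaluate and the trimmed-down inline copy of the sample data are assumptions made so the example runs on its own.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XPathResultTypes {

    // Trimmed-down copy of the sample file, inlined so the sketch is self-contained.
    static final String XML =
            "<Employees>"
            + "<Employee emplid='1111' type='admin'><firstname>John</firstname><age>30</age></Employee>"
            + "<Employee emplid='2222' type='admin'><firstname>Sherlock</firstname><age>32</age></Employee>"
            + "<Employee emplid='3333' type='user'><firstname>Jim</firstname><age>52</age></Employee>"
            + "<Employee emplid='4444' type='user'><firstname>Mycroft</firstname><age>41</age></Employee>"
            + "</Employees>";

    // Hypothetical helper: evaluates an expression and returns the result
    // as the Java type that corresponds to the requested return type.
    static Object evaluate(String expression, QName returnType) {
        try {
            Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            return xPath.evaluate(expression, document, returnType);
        } catch (Exception e) {
            // Rethrown unchecked only to keep the sketch short.
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        // NUMBER results come back as java.lang.Double
        Double count = (Double) evaluate("count(/Employees/Employee)", XPathConstants.NUMBER);
        Double totalAge = (Double) evaluate("sum(/Employees/Employee/age)", XPathConstants.NUMBER);
        // BOOLEAN results come back as java.lang.Boolean
        Boolean exists = (Boolean) evaluate(
                "boolean(/Employees/Employee[@emplid='3333'])", XPathConstants.BOOLEAN);
        // STRING results come back as java.lang.String
        String name = (String) evaluate(
                "/Employees/Employee[@emplid='3333']/firstname", XPathConstants.STRING);

        System.out.println(count);    // 4.0
        System.out.println(totalAge); // 155.0
        System.out.println(exists);   // true
        System.out.println(name);     // Jim
    }
}
```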

2. Learning XPath Expressions

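As mentioned above, XPath uses a path expression to select a node or a list of nodes from an XML document. Here's a list of useful paths and expressions that can be used to select any node or node list from an XML document. Note that XPath names are case-sensitive: the tables below use generic lower-case names such as employee, while the sample file above uses Employees and Employee.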

Expression	Description
`nodename`	Selects all nodes with the name “nodename”
`/`	Selects from the root node
`//`	Selects nodes in the document from the current node that match the selection no matter where they are
`.`	Selects the current node
`..`	Selects the parent of the current node
`@`	Selects attributes
`employee`	Selects all nodes with the name “employee”
`employees/employee`	Selects all employee elements that are children of employees
`//employee`	Selects all employee elements no matter where they are in the document
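To see how `.`, `..`, `@` and `//` behave relative to a current node, an expression can be evaluated against a node other than the document. The sketch below is an illustration under assumptions: the class ContextNodeDemo, the helper evaluateFrom and the two-employee inline XML are not from the tutorial. It first selects a context node and then evaluates a second expression from there.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class ContextNodeDemo {

    // Two employees from the sample file, inlined so the sketch is self-contained.
    static final String XML =
            "<Employees>"
            + "<Employee emplid='1111' type='admin'><firstname>John</firstname></Employee>"
            + "<Employee emplid='2222' type='admin'><firstname>Sherlock</firstname></Employee>"
            + "</Employees>";

    // Hypothetical helper: selects a context node with the first expression,
    // then evaluates the second expression relative to that node as a string.
    static String evaluateFrom(String contextExpression, String expression) {
        try {
            Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            Node context = (Node) xPath.evaluate(contextExpression, document, XPathConstants.NODE);
            return xPath.evaluate(expression, context);
        } catch (Exception e) {
            // Rethrown unchecked only to keep the sketch short.
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        String employee = "/Employees/Employee[@emplid='2222']";
        String firstname = employee + "/firstname";

        System.out.println(evaluateFrom(employee, "firstname"));    // Sherlock (child by name)
        System.out.println(evaluateFrom(employee, "@type"));        // admin    (attribute)
        System.out.println(evaluateFrom(firstname, "."));           // Sherlock (current node)
        System.out.println(evaluateFrom(firstname, "../@emplid"));  // 2222     (attribute of parent)
        // A leading // searches the whole document, whatever the context node is;
        // the string result is the first match in document order.
        System.out.println(evaluateFrom(employee, "//firstname"));  // John
        // .// restricts the search to descendants of the context node.
        System.out.println(evaluateFrom(employee, ".//firstname")); // Sherlock
    }
}
```

The last two lines show a distinction worth keeping in mind: `//firstname` always starts at the document root, while `.//firstname` starts at the current node.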

The expressions listed below use predicates. Predicates are written in square brackets [ ... ]. They are used to find a specific node or a node that contains a specific value.

Path Expression	Result
`/employees/employee[1]`	Selects the first employee element that is the child of the employees element.
`/employees/employee[last()]`	Selects the last employee element that is the child of the employees element
`/employees/employee[last()-1]`	Selects the last but one employee element that is the child of the employees element
`//employee[@type='admin']`	Selects all the employee elements that have an attribute named type with a value of ‘admin’
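The predicates in this table can be tried against the sample data as in the sketch below. It is only an illustration: the class PredicateDemo, the helper select and the shortened inline XML are assumptions, and the element names are written as Employees and Employee because XPath matches names case-sensitively.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class PredicateDemo {

    // Shortened copy of the sample file, inlined so the sketch is self-contained.
    static final String XML =
            "<Employees>"
            + "<Employee emplid='1111' type='admin'><firstname>John</firstname></Employee>"
            + "<Employee emplid='2222' type='admin'><firstname>Sherlock</firstname></Employee>"
            + "<Employee emplid='3333' type='user'><firstname>Jim</firstname></Employee>"
            + "<Employee emplid='4444' type='user'><firstname>Mycroft</firstname></Employee>"
            + "</Employees>";

    // Hypothetical helper: returns the text of every node the expression selects.
    static List<String> select(String expression) {
        try {
            Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xPath.evaluate(expression, document, XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (Exception e) {
            // Rethrown unchecked only to keep the sketch short.
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select("/Employees/Employee[1]/firstname"));        // [John]
        System.out.println(select("/Employees/Employee[last()]/firstname"));   // [Mycroft]
        System.out.println(select("/Employees/Employee[last()-1]/firstname")); // [Jim]
        System.out.println(select("//Employee[@type='admin']/firstname"));     // [John, Sherlock]
        // Lower-case names match nothing in this document.
        System.out.println(select("//employee"));                              // []
    }
}
```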

There are other useful expressions that you can use to query the data.

See this W3Schools page for more details: http://www.w3schools.com/xpath/xpath_syntax.asp

3. Examples: Query XML document using XPath

Below are a few examples that use different XPath expressions to fetch information from the XML document.

3.1 Read firstname of all employees

Below expression will read firstname of all the employees.

 
    String expression = "/Employees/Employee/firstname";
    System.out.println(expression);
    NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
    for (int i = 0; i < nodeList.getLength(); i++) {
        System.out.println(nodeList.item(i).getFirstChild().getNodeValue());
    }

Output:

 
    John
    Sherlock
    Jim
    Mycroft

3.2 Read a specific employee using employee id

The expression below reads the information for the employee with emplid = 2222. Note how we use the API to retrieve the node and then traverse its children to print each XML tag and its value.

 
    String expression = "/Employees/Employee[@emplid='2222']";
    System.out.println(expression);
    Node node = (Node) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODE);
    if (null != node) {
        NodeList nodeList = node.getChildNodes();
        for (int i = 0; null != nodeList && i < nodeList.getLength(); i++) {
            Node nod = nodeList.item(i);
            // Skip the whitespace text nodes between the child elements
            if (nod.getNodeType() == Node.ELEMENT_NODE)
                System.out.println(nodeList.item(i).getNodeName() + " : " + nod.getFirstChild().getNodeValue());
        }
    }

Output:

 
    firstname : Sherlock
    lastname : Homes
    age : 32
    email : sherlock@sh.com
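When the values are needed in code rather than on the console, the same traversal can collect them into a map. The following is a sketch under assumptions, not the tutorial's code: the class EmployeeReader, the helper readEmployee and the single-employee inline XML (copied as-is from the sample file, including its whitespace) are hypothetical. It uses getTextContent() in place of getFirstChild().getNodeValue(), which also copes with empty elements.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class EmployeeReader {

    // One employee from the sample file, with line breaks kept so that
    // whitespace text nodes appear between the child elements.
    static final String XML =
            "<Employees>\n"
            + "  <Employee emplid='2222' type='admin'>\n"
            + "    <firstname>Sherlock</firstname>\n"
            + "    <lastname>Homes</lastname>\n"
            + "    <age>32</age>\n"
            + "    <email>sherlock@sh.com</email>\n"
            + "  </Employee>\n"
            + "</Employees>";

    // Hypothetical helper: maps each child element name to its text, in document order.
    // The id is concatenated into the expression, so pass trusted values only.
    static Map<String, String> readEmployee(String emplid) {
        try {
            Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            Node employee = (Node) xPath.evaluate(
                    "/Employees/Employee[@emplid='" + emplid + "']", document, XPathConstants.NODE);
            Map<String, String> fields = new LinkedHashMap<>();
            if (employee == null) {
                return fields; // no employee with that id
            }
            NodeList children = employee.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Skip the whitespace text nodes between the elements
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    fields.put(child.getNodeName(), child.getTextContent());
                }
            }
            return fields;
        } catch (Exception e) {
            // Rethrown unchecked only to keep the sketch short.
            throw new IllegalStateException("Could not read employee " + emplid, e);
        }
    }

    public static void main(String[] args) {
        // {firstname=Sherlock, lastname=Homes, age=32, email=sherlock@sh.com}
        System.out.println(readEmployee("2222"));
        // {}
        System.out.println(readEmployee("9999"));
    }
}
```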

3.3 Read firstname of all employees who are admin

This is another predicate example; it reads the firstname of all employees who are admins (defined by type=admin).

 
    String expression = "/Employees/Employee[@type='admin']/firstname";
    System.out.println(expression);
    NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
    for (int i = 0; i < nodeList.getLength(); i++) {
        System.out.println(nodeList.item(i).getFirstChild().getNodeValue());
    }

Output:

 
    John
    Sherlock

3.4 Read firstname of all employees who are older than 40 years

See how we use a predicate to filter employees whose age is greater than 40.

 
    String expression = "/Employees/Employee[age>40]/firstname";
    NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
    for (int i = 0; i < nodeList.getLength(); i++) {
        System.out.println(nodeList.item(i).getFirstChild().getNodeValue());
    }

Output:

 
    Jim
    Mycroft

3.5 Read firstname of first two employees (defined in xml file)

Within predicates, you can use position() to identify the position of an XML element. Here we select the first two employees using position().

 
    String expression = "/Employees/Employee[position() <= 2]/firstname";
    NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
    for (int i = 0; i < nodeList.getLength(); i++) {
        System.out.println(nodeList.item(i).getFirstChild().getNodeValue());
    }

Output:

 
    John
    Sherlock

4. Complete Java source code

To run this source, create a basic Java project in your IDE, or save the code below as Main.java and execute it. It needs the employees.xml file as input: copy the employee XML defined at the start of this tutorial to C:\employees.xml.

 
    package net.viralpatel.java;

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileNotFoundException;
    import java.io.IOException;

    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.parsers.ParserConfigurationException;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathConstants;
    import javax.xml.xpath.XPathExpressionException;
    import javax.xml.xpath.XPathFactory;

    import org.w3c.dom.Document;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;
    import org.xml.sax.SAXException;

    public class Main {

        public static void main(String[] args) {
            try {
                FileInputStream file = new FileInputStream(new File("c:/employees.xml"));

                DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
                DocumentBuilder builder = builderFactory.newDocumentBuilder();
                Document xmlDocument = builder.parse(file);

                XPath xPath = XPathFactory.newInstance().newXPath();

                System.out.println("*************************");
                String expression = "/Employees/Employee[@emplid='3333']/email";
                System.out.println(expression);
                String email = xPath.compile(expression).evaluate(xmlDocument);
                System.out.println(email);

                System.out.println("*************************");
                expression = "/Employees/Employee/firstname";
                System.out.println(expression);
                NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
                for (int i = 0; i < nodeList.getLength(); i++) {
                    System.out.println(nodeList.item(i).getFirstChild().getNodeValue());
                }

                System.out.println("*************************");
                expression = "/Employees/Employee[@type='admin']/firstname";
                System.out.println(expression);
                nodeList = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
                for (int i = 0; i < nodeList.getLength(); i++) {
                    System.out.println(nodeList.item(i).getFirstChild().getNodeValue());
                }

                System.out.println("*************************");
                expression = "/Employees/Employee[@emplid='2222']";
                System.out.println(expression);
                Node node = (Node) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODE);
                if (null != node) {
                    nodeList = node.getChildNodes();
                    for (int i = 0; null != nodeList && i < nodeList.getLength(); i++) {
                        Node nod = nodeList.item(i);
                        if (nod.getNodeType() == Node.ELEMENT_NODE)
                            System.out.println(nodeList.item(i).getNodeName() + " : " + nod.getFirstChild().getNodeValue());
                    }
                }

                System.out.println("*************************");
                expression = "/Employees/Employee[age>40]/firstname";
                nodeList = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
                System.out.println(expression);
                for (int i = 0; i < nodeList.getLength(); i++) {
                    System.out.println(nodeList.item(i).getFirstChild().getNodeValue());
                }

                System.out.println("*************************");
                expression = "/Employees/Employee[1]/firstname";
                System.out.println(expression);
                nodeList = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
                for (int i = 0; i < nodeList.getLength(); i++) {
                    System.out.println(nodeList.item(i).getFirstChild().getNodeValue());
                }

                System.out.println("*************************");
                expression = "/Employees/Employee[position() <= 2]/firstname";
                System.out.println(expression);
                nodeList = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
                for (int i = 0; i < nodeList.getLength(); i++) {
                    System.out.println(nodeList.item(i).getFirstChild().getNodeValue());
                }

                System.out.println("*************************");
                expression = "/Employees/Employee[last()]/firstname";
                System.out.println(expression);
                nodeList = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
                for (int i = 0; i < nodeList.getLength(); i++) {
                    System.out.println(nodeList.item(i).getFirstChild().getNodeValue());
                }

                System.out.println("*************************");

            } catch (FileNotFoundException e) {
                e.printStackTrace();
            } catch (SAXException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            } catch (ParserConfigurationException e) {
                e.printStackTrace();
            } catch (XPathExpressionException e) {
                e.printStackTrace();
            }
        }
    }

Reference from: http://viralpatel.net/blogs/java-xml-xpath-tutorial-parse-xml/