Converting XML Files to RTF and PDF

Original post, October 21, 2004, 11:02:00



Author: Holyfair
E-Mail: Holyfair@sina.com



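1.    Preface

     In some applications, we often convert text and configuration information into XML files so that it can be transmitted, modified, and stored. XML is especially convenient for managing documents that follow a template. Many Java APIs are available for working with XML, such as DOM, JDOM, Castor, SAX, XMLReader, XPath, and XSLT; their usage is not covered here. After the XML has been processed through these interfaces, however, some documents must ultimately be presented to users in the familiar Word and PDF formats. This article walks through an implementation that converts an XML file to RTF and PDF.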



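2.    From XML to PDF

    For an XML file that follows a template, we can use the FOP API to convert it to PDF.

   FOP requires fop.jar. It can be obtained from http://xml.apache.org/fop/, which also documents its usage.

   Take a moderately complex XML file as an example.

   The XML document to convert, test.xml, is as follows: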
  

<FeatureSRS title="SRS">
 <introduction>
  <objective>objective here</objective>
  <scope>scope here</scope>
  <responsibilities>responsibilities here</responsibilities>
  <references>reference here</references>
  <DAA>
    <term>
      term here
   </term>
   <definition>
       definition here
   </definition>
  </DAA>
 </introduction>
 <generalDescription>
  <featureName>
   <summary>summary here</summary>
   <breakdown>breakdown here</breakdown>
  </featureName>
  <requirement>
   <content>
        content here.
   </content>
  </requirement>
  <requirement>
   <content>
      content2 here.
   </content>
  </requirement>
 <featureInteractions>featureInteractions here</featureInteractions>
 </generalDescription>
 <strResources>
  <strResource>
   <estring>
    estring here
   </estring>
   <resourceid>
      resourceid here
   </resourceid>
   <rqmt>
     rqmt here.
   </rqmt>
  </strResource>
  </strResources>
</FeatureSRS>

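     To convert an XML document like this to PDF, we must create an XSL-FO stylesheet that defines how each element and its value are formatted.

     We create the XSL-FO stylesheet test.xsl as follows: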
 

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.1" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format" exclude-result-prefixes="fo">
 <xsl:output method="xml" version="1.0" omit-xml-declaration="no" indent="yes"/>
 <!-- ========================= -->
 <!-- root element: FeatureSRS  -->
 <!-- ========================= -->
 <xsl:template match="FeatureSRS">
  <fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
   <fo:layout-master-set>
    <fo:simple-page-master master-name="simpleA4" page-height="29.7cm" page-width="21cm" margin-top="2cm" margin-bottom="2cm" margin-left="2cm" margin-right="2cm">
     <fo:region-body/>
    </fo:simple-page-master>
   </fo:layout-master-set>
   <fo:page-sequence master-reference="simpleA4">
    <fo:flow flow-name="xsl-region-body">
     <fo:block font-size="20pt" font-weight="bold" space-after="5mm" text-align="center">Cardiac Feature SRS
          </fo:block>
     <fo:block font-size="10pt">
      <xsl:apply-templates/>
     </fo:block>
    </fo:flow>
   </fo:page-sequence>
  </fo:root>
 </xsl:template>
 <!-- ========================= -->
 <!-- child element templates   -->
 <!-- ========================= -->
 <xsl:template name="introduction" match="introduction">
  <fo:block font-size="18pt" font-weight="bold" space-after="5mm">1.  Introduction</fo:block>
  <fo:block font-size="14pt" font-weight="bold" space-after="5mm" margin-left="5mm">1.1 Objective</fo:block>
  <fo:block font-size="10pt" font-weight="normal" space-after="5mm" margin-left="7mm">
   <xsl:value-of select="objective"/>
  </fo:block>
  <fo:block font-size="14pt" font-weight="bold" space-after="5mm" margin-left="5mm">1.2 Scope</fo:block>
  <fo:block font-size="10pt" font-weight="normal" space-after="5mm" margin-left="7mm">
   <xsl:value-of select="scope"/>
  </fo:block>
  <fo:block font-size="14pt" font-weight="bold" space-after="5mm" margin-left="5mm">1.3. Responsibilities</fo:block>
  <fo:block font-size="10pt" font-weight="normal" space-after="5mm" margin-left="7mm">
   <xsl:value-of select="responsibilities"/>
  </fo:block>
  <fo:block font-size="14pt" font-weight="bold" space-after="5mm" margin-left="5mm">1.4. References</fo:block>
  <fo:block font-size="10pt" font-weight="normal" space-after="5mm" margin-left="7mm">
   <xsl:value-of select="references"/>
  </fo:block>
  <fo:block font-size="14pt" font-weight="bold" space-after="5mm" margin-left="5mm">1.5. Definitions, Acronyms, and Abbreviations</fo:block>
  <fo:block font-size="10pt" font-weight="bold" space-after="5mm" margin-left="5mm">
   <fo:table table-layout="fixed" border="2cm" background-color="#fff2d9" >
    <fo:table-column column-width="4cm"/>
    <fo:table-column column-width="6cm"/>
    <fo:table-body>
     <fo:table-row border="2">
      <fo:table-cell>
       <fo:block>
        <xsl:text>Term</xsl:text>
       </fo:block>
      </fo:table-cell>
      <fo:table-cell>
       <fo:block>
        <xsl:text>Definition</xsl:text>
       </fo:block>
      </fo:table-cell>
     </fo:table-row>
     <xsl:for-each select="DAA">
      <fo:table-row border="2">
       <fo:table-cell>
        <fo:block>
         <xsl:value-of select="term"/>
        </fo:block>
       </fo:table-cell>
       <fo:table-cell>
        <fo:block>
         <xsl:value-of select="definition"/>
        </fo:block>
       </fo:table-cell>
      </fo:table-row>
     </xsl:for-each>
    </fo:table-body>
   </fo:table>
  </fo:block>
 </xsl:template>
 <xsl:template name="generalDescription" match="generalDescription">
  <fo:block font-size="18pt" font-weight="bold" space-after="5mm">2. General Description</fo:block>
  <fo:block font-size="14pt" font-weight="bold" space-after="5mm" margin-left="5mm">2.1. Feature Name</fo:block>
  <fo:block font-size="14pt" font-weight="bold" space-after="5mm" margin-left="7mm">2.1.1. Feature Summary</fo:block>
  <fo:block font-size="10pt" font-weight="normal" space-after="5mm" margin-left="9mm">
   <xsl:value-of select="featureName/summary"/>
  </fo:block>
  <fo:block font-size="14pt" font-weight="bold" space-after="5mm" margin-left="7mm">2.1.2. Feature Breakdown</fo:block>
  <fo:block font-size="10pt" font-weight="normal" space-after="5mm" margin-left="9mm">
   <xsl:value-of select="featureName/breakdown"/>
  </fo:block>
  <fo:block font-size="14pt" font-weight="bold" space-after="5mm" margin-left="5mm">2.2. Feature Requirements</fo:block>
  <fo:block font-size="10pt" font-weight="normal" space-after="5mm" margin-left="7mm">
   <xsl:for-each select="requirement">
    <xsl:value-of select="content"/>
   </xsl:for-each>
  </fo:block>
  <fo:block font-size="14pt" font-weight="bold" space-after="5mm" margin-left="5mm">2.3. Feature Interactions</fo:block>
  <fo:block font-size="10pt" font-weight="normal" space-after="5mm" margin-left="7mm">
   <xsl:value-of select="featureInteractions"/>
  </fo:block>
 </xsl:template>
 <xsl:template name="strResources" match="strResources">
  <fo:block font-size="18pt" font-weight="bold" space-after="5mm">3. String Resources </fo:block>
  <fo:block font-size="10pt" font-weight="bold" space-after="5mm" margin-left="5mm">
   <fo:table table-layout="fixed" border="2cm" background-color="#fff2d9" >
    <fo:table-column column-width="4cm"/>
    <fo:table-column column-width="10cm"/>
    <fo:table-column column-width="4cm"/>
    <fo:table-body>
     <fo:table-row border="2">
      <fo:table-cell>
       <fo:block>
        <xsl:text>English String</xsl:text>
       </fo:block>
      </fo:table-cell>
      <fo:table-cell>
       <fo:block>
        <xsl:text>Resource ID</xsl:text>
       </fo:block>
      </fo:table-cell>
      <fo:table-cell>
       <fo:block>
        <xsl:text>Rqmt</xsl:text>
       </fo:block>
      </fo:table-cell>
     </fo:table-row>
     <xsl:for-each select="strResource">
      <fo:table-row border="2">
       <fo:table-cell>
        <fo:block>
         <xsl:value-of select="estring"/>
        </fo:block>
       </fo:table-cell>
       <fo:table-cell>
        <fo:block>
         <xsl:value-of select="resourceid"/>
        </fo:block>
       </fo:table-cell>
       <fo:table-cell>
        <fo:block>
         <xsl:value-of select="rqmt"/>
        </fo:block>
       </fo:table-cell>
      </fo:table-row>
     </xsl:for-each>
    </fo:table-body>
   </fo:table>
  </fo:block>
 </xsl:template>
</xsl:stylesheet>


For the detailed syntax of XSL-FO, consult other references.
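
The select attributes in the stylesheet above are XPath expressions evaluated relative to the matched element. One way to check what such an expression picks up is to evaluate it directly with the standard javax.xml.xpath API. The following is only a minimal sketch: the class name SelectCheckSketch, the helper select, and the cut-down inline document are made up for illustration, and the absolute path used in main is assumed to be equivalent to the nested requirement / content selection in the generalDescription template.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class SelectCheckSketch {

    // Hypothetical, cut-down version of test.xml: only the requirement entries.
    static final String SAMPLE_XML =
        "<FeatureSRS title=\"SRS\"><generalDescription>"
        + "<requirement><content> content here. </content></requirement>"
        + "<requirement><content> content2 here. </content></requirement>"
        + "</generalDescription></FeatureSRS>";

    // Evaluates an XPath expression and returns the trimmed text of each selected node.
    static List<String> select(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent().trim());
            }
            return values;
        } catch (XPathExpressionException e) {
            // Wrapped only to keep the sketch short.
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Absolute form of the path walked by the generalDescription template.
        System.out.println(
            select(SAMPLE_XML, "/FeatureSRS/generalDescription/requirement/content"));
    }
}
```

If an expression returns an empty list here, the corresponding block in the generated PDF will be empty as well, which is usually easier to diagnose at this level than in the rendered output.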

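Once this stylesheet is in place, we can use the interfaces FOP provides to perform the conversion.

FOP provides conversion interfaces for XML->FO, XML->PDF, FO->PDF, OBJ->FO, and OBJ->PDF.

Here we take XML->PDF as the example; for the others, see the corresponding demos in the FOP distribution.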

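//Java
import java.io.File;
import java.io.IOException;
import java.io.OutputStream;

//JAXP
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.stream.StreamSource;

//Avalon
import org.apache.avalon.framework.ExceptionUtil;
import org.apache.avalon.framework.logger.ConsoleLogger;
import org.apache.avalon.framework.logger.Logger;

//FOP (this example targets the Driver API of the FOP 0.20.x releases)
import org.apache.fop.apps.Driver;
import org.apache.fop.apps.FOPException;
import org.apache.fop.messaging.MessageHandler;

public class ExampleXML2PDF {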

    public void convertXML2PDF(File xml, File xslt, File pdf)
                throws IOException, FOPException, TransformerException {
        //Construct the driver and set up logging
        Driver driver = new Driver();
        Logger logger = new ConsoleLogger(ConsoleLogger.LEVEL_INFO);
        driver.setLogger(logger);
        MessageHandler.setScreenLogger(logger);

        //Setup Renderer (output format)       
        driver.setRenderer(Driver.RENDER_PDF);
       
        //Setup output
        OutputStream out = new java.io.FileOutputStream(pdf);
        try {
            driver.setOutputStream(out);

            //Setup XSLT
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer = factory.newTransformer(new StreamSource(xslt));
       
            //Setup input for XSLT transformation
            Source src = new StreamSource(xml);
       
            //Resulting SAX events (the generated FO) must be piped through to FOP
            Result res = new SAXResult(driver.getContentHandler());

            //Start XSLT transformation and FOP processing
            transformer.transform(src, res);
        } finally {
            out.close();
        }
    }


    public static void main(String[] args) {
        try {
            System.out.println("FOP ExampleXML2PDF\n");
            System.out.println("Preparing...");

            //Setup directories
            File baseDir = new File(".");
            File outDir = new File(baseDir, "out");
            outDir.mkdirs();

            //Setup input and output files           
            File xmlfile = new File(baseDir, "test.xml");
            File xsltfile = new File(baseDir, "test.xsl");
            File pdffile = new File(outDir, "test.pdf");

            System.out.println("Input: XML (" + xmlfile + ")");
            System.out.println("Stylesheet: " + xsltfile);
            System.out.println("Output: PDF (" + pdffile + ")");
            System.out.println();
            System.out.println("Transforming...");
           
            ExampleXML2PDF app = new ExampleXML2PDF();
            app.convertXML2PDF(xmlfile, xsltfile, pdffile);
           
            System.out.println("Success!");
        } catch (Exception e) {
            System.err.println(ExceptionUtil.printStackTrace(e));
            System.exit(-1);
        }
    }
}


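       With that, we have converted the XML document to a PDF document.

     If this is used in a servlet of a web application, and we want to generate the PDF directly from the XML file and send it to the browser without writing a PDF file to disk, it can be implemented as follows: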

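import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.PrintWriter;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.fop.apps.Driver;
import org.apache.fop.apps.XSLTInputHandler;

public class FOPServlet extends HttpServlet {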
    public static final String FO_REQUEST_PARAM = "fo";
    public static final String XML_REQUEST_PARAM = "xml";
    public static final String XSL_REQUEST_PARAM = "xsl";

    public void doGet(HttpServletRequest request,
                      HttpServletResponse response) throws ServletException {
        try {
            String xmlParam =getServletContext().getRealPath("WEB-INF/doc/xml/test.xml");
            String xslParam =getServletContext().getRealPath("WEB-INF/doc/xsl/test.xsl");

            if ((xmlParam != null) && (xslParam != null)) {
                XSLTInputHandler input =
                  new XSLTInputHandler(new File(xmlParam),
                                       new File(xslParam));
                renderXML(input, response);
            } else {
                PrintWriter out = response.getWriter();
                out.println("<html><head><title>Error</title></head>\n"+
                            "<body><h1>FopServlet Error</h1><h3>XML or XSL file "+
                            "could not be located.</h3></body></html>");
            }
        } catch (ServletException ex) {
            throw ex;
        }
        catch (Exception ex) {
            throw new ServletException(ex);
        }
    }
    public void renderXML(XSLTInputHandler input,
                          HttpServletResponse response) throws ServletException {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();

            response.setContentType("application/pdf");

            Driver driver = new Driver();
            driver.setRenderer(Driver.RENDER_PDF);
            driver.setOutputStream(out);
            driver.render(input.getParser(), input.getInputSource());

            byte[] content = out.toByteArray();
            response.setContentLength(content.length);
            response.getOutputStream().write(content);
            response.getOutputStream().flush();
        } catch (Exception ex) {
            throw new ServletException(ex);
        }
    }

}
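
The servlet above renders into a ByteArrayOutputStream first so that the content length is known before any byte goes to the client. The same pattern can be sketched without the servlet or FOP classes. Everything in this sketch is hypothetical: the Renderer interface stands in for the FOP rendering step, and a plain OutputStream stands in for response.getOutputStream().

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class BufferedSendSketch {

    /** Hypothetical stand-in for the step that renders the document. */
    interface Renderer {
        void renderTo(OutputStream out) throws IOException;
    }

    /** Renders into memory, then copies the bytes to the destination; returns the byte count. */
    static int send(Renderer renderer, OutputStream destination) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            renderer.renderTo(buffer);
            byte[] content = buffer.toByteArray();
            // In a servlet, response.setContentLength(content.length) would go here,
            // before the first byte is written to the response.
            destination.write(content);
            destination.flush();
            return content.length;
        } catch (IOException e) {
            // Wrapped only to keep the sketch short.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ByteArrayOutputStream fakeResponse = new ByteArrayOutputStream();
        int length = send(
            out -> out.write("%PDF-1.3 placeholder".getBytes(StandardCharsets.US_ASCII)),
            fakeResponse);
        System.out.println("bytes sent: " + length);
    }
}
```

The trade-off is memory: the whole document is held in the buffer, which is acceptable for reports of this size but may not be for very large output.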





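3.    XML to RTF

    Converting XML to RTF is slightly more involved. There is no API that goes directly from XML to RTF, and the JFor API we will use has not yet been integrated into FOP. JFor converts FO files to RTF files, and it also provides a console interface.

   Information about JFor is available at www.jfor.org.

    The conversion from an XML file to an RTF file takes two steps:

         1.    Use FOP to convert the XML to FO

         2.    Use JFor to convert the FO to RTF

    3.1    Use FOP to convert XML to FO

           This step reuses the approach described above, with the same XML file and XSL-FO stylesheet, to convert the XML to FO. Note that the code below calls only the JAXP transformer API; no FOP class is involved in this step.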

            // Run the XSLT transformation and write the resulting FO to fofile
            OutputStream foOut = new FileOutputStream(fofile);
            try {
                TransformerFactory factory = TransformerFactory.newInstance();
                Transformer transformer = factory.newTransformer(new StreamSource(
                        xsltfile));
                Source src = new StreamSource(xmlfile);
                Result res = new StreamResult(foOut);
                transformer.transform(src, res);
            } finally {
                // Close the file even if the transformation fails
                foOut.close();
            }
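
To try this step without any files on disk, the same JAXP calls can be run against in-memory strings. The following is a minimal sketch under stated assumptions: the class name XmlToFoSketch, the helper transform, and the two inline documents are made up for illustration and are heavily cut-down stand-ins for test.xml and test.xsl.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XmlToFoSketch {

    // Hypothetical, cut-down input: only the <objective> element of test.xml.
    static final String SAMPLE_XML =
        "<FeatureSRS title=\"SRS\"><introduction>"
        + "<objective>objective here</objective>"
        + "</introduction></FeatureSRS>";

    // Hypothetical, cut-down stylesheet: one page master and a single block.
    static final String SAMPLE_XSL =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
        + " xmlns:fo=\"http://www.w3.org/1999/XSL/Format\">"
        + "<xsl:template match=\"FeatureSRS\">"
        + "<fo:root>"
        + "<fo:layout-master-set>"
        + "<fo:simple-page-master master-name=\"simpleA4\""
        + " page-height=\"29.7cm\" page-width=\"21cm\">"
        + "<fo:region-body/>"
        + "</fo:simple-page-master>"
        + "</fo:layout-master-set>"
        + "<fo:page-sequence master-reference=\"simpleA4\">"
        + "<fo:flow flow-name=\"xsl-region-body\">"
        + "<fo:block><xsl:value-of select=\"introduction/objective\"/></fo:block>"
        + "</fo:flow>"
        + "</fo:page-sequence>"
        + "</fo:root>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet to the XML and returns the generated FO as a string.
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrapped only to keep the sketch short.
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(SAMPLE_XML, SAMPLE_XSL));
    }
}
```

Printing the FO this way is a convenient check that the stylesheet produces the expected fo:block elements before the result is handed to a renderer.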

      3.2 Use JFor to convert FO to RTF

      Taking only the servlet case as an example:

            // Read the FO file produced in step 3.1
            InputStream foInput = new FileInputStream(fofile);
            InputSource inputSource = new InputSource(foInput);

            // Buffer the RTF in memory so the content length is known
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            Writer output = new OutputStreamWriter(out);

            response.setContentType("application/msword");

            // Constructing the JFor Converter performs the FO -> RTF conversion
            new Converter(inputSource, output, Converter.createConverterOption());
            output.flush();

            byte[] content = out.toByteArray();

            // Debug output: dumps the generated RTF to the console
            System.out.println(out.toString());

            response.setContentLength(content.length);
            response.getOutputStream().write(content);
            response.getOutputStream().flush();

            foInput.close();
            output.close();
            out.close();


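With that, we have converted the XML to an RTF file.

This article only outlines the overall process; for details, refer to the documentation of each technology involved.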
