Improving h2p: merging the h2p files and rendering an h2p file with XSL

weixin_30556161

Original link: http://www.cnblogs.com/javaei/archive/2009/08/31/1557221.html

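Since the javaei site launched h2p, it has received support and encouragement from many people, along with many valuable suggestions. Once again, my sincere thanks.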

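The feedback centers on three questions. First: why generate PDF at all? Second: an h2p document is split into two files, which is cumbersome, so why not merge them into one? Third: h2p-tool is implemented in C# and Java and is not simple enough to use, so it needs further work. These are all excellent points, and in fact I thought them through carefully before designing h2p.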

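Why convert to PDF? The question is both easy and hard to answer, so I will keep it brief. First, PDF has become a widely supported and highly expressive document format. Second, thanks to its powerful bookmarks and small file size, PDF is the most convenient format for aggregating and archiving articles from the web. There are other reasons that I will not list here. Ultimately, h2p is a complete solution for collecting web articles into PDF. Some websites offer their own e-book features, but h2p is a more general solution.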

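As for the h2p files, describing a book in two files is indeed inconvenient. Looking back, splitting the description in two was overthinking, and merging it into a single file has become urgent. That merge is the main subject of this post.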

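The merged h2p file uses the extension .h2p.xml. It mainly describes the URLs and their hierarchy, and h2p-tool generates a bookmarked PDF document from it. After the merge, working with h2p files becomes simpler, the URL hierarchy can be displayed directly through XSL, and it becomes easier for partner sites to support h2p.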

After careful consideration and design, an example h2p file looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="http://www.javaei.com/content/h2p/h2p.xsl"?>
<!DOCTYPE book PUBLIC "-//JavaEI/JavaEI h2p Configuration DTD//CN" "http://www.javaei.com/dtd/javaei-h2p.dtd">
<book name="我的PDF书">
    <chapter name="163">
        <chapter name="163新闻">
            <href id="11111"><![CDATA[http://news.163.com]]></href>
        </chapter>
        <chapter name="163体育">
            <href id="2222"><![CDATA[http://sports.163.com]]></href>
        </chapter>
    </chapter>
    <chapter name="sohu">
        <href id="333"><![CDATA[http://www.sohu.com]]></href>
        <chapter name="sohu新闻">
            <href id="444"><![CDATA[http://news.sohu.com]]></href>
        </chapter>
    </chapter>
</book>
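
To make the structure concrete, here is a minimal Python sketch of how a tool might turn such a file into a flat bookmark outline. It is not h2p-tool's actual code: the names SAMPLE, outline, and h2p_outline are hypothetical, and the embedded sample omits the XML declaration, stylesheet instruction, and DOCTYPE so that the snippet stays self-contained.

```python
import xml.etree.ElementTree as ET

# The sample h2p tree from above, without the prolog (an assumption made
# only to keep this sketch self-contained).
SAMPLE = """<book name="我的PDF书">
    <chapter name="163">
        <chapter name="163新闻">
            <href id="11111"><![CDATA[http://news.163.com]]></href>
        </chapter>
        <chapter name="163体育">
            <href id="2222"><![CDATA[http://sports.163.com]]></href>
        </chapter>
    </chapter>
    <chapter name="sohu">
        <href id="333"><![CDATA[http://www.sohu.com]]></href>
        <chapter name="sohu新闻">
            <href id="444"><![CDATA[http://news.sohu.com]]></href>
        </chapter>
    </chapter>
</book>"""


def outline(node, depth=0):
    """Yield (depth, name, url) for every chapter below node, in document order."""
    for chapter in node.findall("chapter"):
        href = chapter.find("href")
        # A chapter without an href is a pure grouping node (url is None).
        url = href.text.strip() if href is not None and href.text else None
        yield depth, chapter.get("name"), url
        yield from outline(chapter, depth + 1)


def h2p_outline(xml_text):
    """Parse an h2p document and return its chapters as a flat outline list."""
    return list(outline(ET.fromstring(xml_text)))


if __name__ == "__main__":
    for depth, name, url in h2p_outline(SAMPLE):
        print("  " * depth + name, url or "")
```

Each tuple carries the nesting depth, which is the information a PDF bookmark tree needs; chapters with a URL would become pages, and chapters without one would be bookmark folders only.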

The corresponding DTD is:

<!ELEMENT book (chapter+)>
<!ATTLIST book name CDATA #REQUIRED>
<!ELEMENT chapter (href?,chapter*)>
<!ATTLIST chapter name CDATA #REQUIRED>
<!ELEMENT href (#PCDATA)>
<!ATTLIST href id CDATA #REQUIRED>
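
The content model is small enough to check by hand. The sketch below restates the DTD rules as a Python function, which may help a partner site sanity-check a generated file before publishing it. It is an illustration only, not a replacement for a validating parser; check_h2p and _check_chapter are hypothetical names.

```python
import xml.etree.ElementTree as ET


def check_h2p(xml_text):
    """Return a list of violations of the DTD's content model (empty if none)."""
    book = ET.fromstring(xml_text)
    if book.tag != "book":
        return ["root element must be <book>"]
    errors = []
    if "name" not in book.attrib:
        errors.append("<book> requires a name attribute")
    children = list(book)
    # book (chapter+): at least one child, and chapters only.
    if not children or any(c.tag != "chapter" for c in children):
        errors.append("<book> must contain one or more <chapter> elements only")
    for child in children:
        if child.tag == "chapter":
            _check_chapter(child, errors)
    return errors


def _check_chapter(chapter, errors):
    label = chapter.get("name", "?")
    if "name" not in chapter.attrib:
        errors.append("<chapter> requires a name attribute")
    tags = [c.tag for c in chapter]
    # chapter (href?,chapter*): an optional href first, then only chapters.
    rest = tags[1:] if tags[:1] == ["href"] else tags
    if any(t != "chapter" for t in rest):
        errors.append(f"<chapter name={label!r}> must match (href?,chapter*)")
    for child in chapter:
        if child.tag == "href":
            if "id" not in child.attrib:
                errors.append(f"<href> in chapter {label!r} requires an id attribute")
            if len(child):
                errors.append(f"<href> in chapter {label!r} must contain text only")
        elif child.tag == "chapter":
            _check_chapter(child, errors)
```

An empty list means the file satisfies the rules above. A chapter whose href comes after a nested chapter, or that has two href elements, is reported as a violation.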

Rendering the sample file with XSL gives the result shown in the figure:

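There is no limit on the depth of the hierarchy. The h2p XSL parses the XML and builds the tree nodes; if you are interested in this kind of problem, feel free to use my XSL as a reference. Partner sites can also supply their own XSL. The core of the XSL that turns the XML into a tree is: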

<xsl:template match="//chapter">
    <xsl:for-each select=".">
        <li>
            <img class="nodeimg">
                <xsl:choose>
                    <xsl:when test="./chapter">
                        <xsl:attribute name="src">http://www.javaei.com/res/images/closed.gif</xsl:attribute>
                        <xsl:attribute name="onclick">clicknode(this)</xsl:attribute>
                    </xsl:when>
                    <xsl:otherwise>
                        <xsl:attribute name="src">http://www.javaei.com/res/images/leaf.gif</xsl:attribute>
                    </xsl:otherwise>
                </xsl:choose>
            </img>
            <xsl:choose>
                <xsl:when test="./href">
                    <a>
                        <xsl:attribute name="href"><xsl:value-of select="./href" /></xsl:attribute>
                        <xsl:attribute name="target">right</xsl:attribute>
                        <xsl:attribute name="class">h2pnodestyle</xsl:attribute>
                        <xsl:value-of select="text()" /><xsl:value-of select="@name" />
                    </a>
                </xsl:when>
                <xsl:otherwise>
                    <xsl:value-of select="text()" /><xsl:value-of select="@name" />
                </xsl:otherwise>
            </xsl:choose>
            <ul class="collapsed">
                <xsl:apply-templates select="./chapter" />
            </ul>
        </li>
    </xsl:for-each>
</xsl:template>

The key is these two lines:

<xsl:template match="//chapter">

<xsl:apply-templates select="./chapter" />

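Together they form a recursive call: the template matches a chapter, and apply-templates then applies the same template to each of that chapter's child chapter elements.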
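
For readers less familiar with XSLT, the same recursion can be sketched in Python. This is only an analogy to the template above, not part of h2p: render_chapter and render_book are hypothetical names, the output wrapper is an assumption, and the node icons are left out to keep the sketch short.

```python
import xml.etree.ElementTree as ET
from html import escape


def render_chapter(chapter):
    """Emit one <li> for a chapter, then recurse into its child chapters."""
    name = escape(chapter.get("name", ""))
    href = chapter.find("href")
    if href is not None:
        # Chapters with an href become links, as in the xsl:when branch.
        url = escape((href.text or "").strip(), quote=True)
        label = f'<a href="{url}" target="right" class="h2pnodestyle">{name}</a>'
    else:
        label = name
    # Counterpart of <xsl:apply-templates select="./chapter" />:
    # the same function is applied one level down.
    children = "".join(render_chapter(c) for c in chapter.findall("chapter"))
    return f'<li>{label}<ul class="collapsed">{children}</ul></li>'


def render_book(xml_text):
    """Render all top-level chapters of an h2p document as a nested list."""
    book = ET.fromstring(xml_text)
    items = "".join(render_chapter(c) for c in book.findall("chapter"))
    return f"<ul>{items}</ul>"
```

The recursion ends naturally at leaf chapters, where findall returns an empty list, which is why the depth of the hierarchy does not need to be known in advance.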

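As for h2p-tool, I have to admit that this is a hard problem. Requiring the PDF generated from a URL to look the same as the page does in a browser is tantamount to building a browser, so the difficulty is easy to imagine. In the Java world I have not yet found a usable solution, which is why I had to rely on someone else's C# component. Improving h2p-tool will be the main focus going forward; if you have good ideas, please share them.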
h2p detailed introduction
h2p file example
