Some thoughts on the data structures that represent a PDF file in PDFBox


Official documentation, version: 2.0.27

Class hierarchy of the basic objects

COSBase (org.apache.pdfbox.cos)
    COSDictionary (org.apache.pdfbox.cos)
        UnmodifiableCOSDictionary (org.apache.pdfbox.cos)
        COSStream (org.apache.pdfbox.cos)
    COSName (org.apache.pdfbox.cos)
    COSDocument (org.apache.pdfbox.cos)
    COSArray (org.apache.pdfbox.cos)
    COSNumber (org.apache.pdfbox.cos)
        COSInteger (org.apache.pdfbox.cos)
        COSFloat (org.apache.pdfbox.cos)
    COSNull (org.apache.pdfbox.cos)
    COSObject (org.apache.pdfbox.cos)
    COSBoolean (org.apache.pdfbox.cos)
    COSString (org.apache.pdfbox.cos)

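Document level

The class of the document object

org.apache.pdfbox.pdmodel.PDDocument

For other document-level classes, see the official documentation.

Pages are managed and organized through a tree.
(I have not studied this part much; I will add more when I have time.)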

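Page level

The class of the page object

org.apache.pdfbox.pdmodel.PDPage

Page-level operations are mainly carried out through

org.apache.pdfbox.pdmodel.PDPageContentStream

which can draw text, images, and so on.

For other page-level classes, see the official documentation.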

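Key points

The PDPage class has a field named page, of type COSDictionary, which holds almost all of the information about a page. In other words, the main structure of a page is a dictionary. Its entries are of type <COSName, COSBase>; see the source for details.

This dictionary has an entry whose key is COSName.CONTENTS. The actual type of that entry's value is either

org.apache.pdfbox.cos.COSStream

or

org.apache.pdfbox.cos.COSArray

When it is a COSArray, its elements are COSStreams as well. (Roughly, think of each COSStream as a layer in Photoshop and the role of the COSArray becomes clear.)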
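
To make the array case concrete, here is a minimal sketch in plain Java (no PDFBox calls). Per the PDF specification, the streams of a Contents array are processed in order, as if they were concatenated into a single stream. The class ContentsArraySketch and the method effectiveContent are hypothetical names used only for illustration, and the stream bodies are assumed to be already decoded text.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration, not PDFBox code. */
final class ContentsArraySketch {

    /** Joins decoded content stream bodies in array order, one newline after each. */
    static String effectiveContent(List<String> streamBodies) throws IOException {
        List<InputStream> parts = new ArrayList<>();
        for (String body : streamBodies) {
            parts.add(new ByteArrayInputStream(body.getBytes(StandardCharsets.US_ASCII)));
            // Whitespace separator so the last token of one stream
            // cannot merge with the first token of the next one.
            parts.add(new ByteArrayInputStream(new byte[] {'\n'}));
        }
        try (InputStream in = new SequenceInputStream(Collections.enumeration(parts));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buf = new byte[256];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.US_ASCII);
        }
    }

    public static void main(String[] args) throws IOException {
        // The first stream sets the line width; the second draws and strokes a line.
        List<String> bodies = Arrays.asList("5 w", "0 0 m 100 0 l S");
        System.out.print(effectiveContent(bodies));
    }
}
```

Because the streams behave like one long stream, the line width set in the first stream still applies to the stroke in the second. This is where the layer analogy is only approximate: later streams paint over earlier ones, but the graphics state carries over between them.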

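At the page level, however, the class you work with is

org.apache.pdfbox.pdmodel.common.PDStream

It wraps a COSStream, but what the page dictionary actually stores is the COSStream itself, so the two have to be converted into each other at the right moments (PDStream#getCOSObject returns the wrapped COSStream, and new PDStream(COSStream) wraps one).

To draw (render) text, images, and other objects on a page, almost every operation goes through the COSStream (or its wrapper class PDStream) stored as the value of that entry.

In other words, when the COSStreams of two pages are swapped, the content of the two pages is swapped, provided the resources those streams refer to are available on both pages. (PDPage has a setContents method that takes a PDStream, or a list of them, and replaces the page's content stream.)

The createOutputStream method of PDStream creates an output stream (it is actually created by the wrapped COSStream) that accepts a sequence of operands and operators. This is how the operators and operands end up stored in a PDStream (that is, in its inner COSStream).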

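For the operators, see the source of the following classes or their subclasses

org.apache.pdfbox.contentstream.operator.OperatorName
org.apache.pdfbox.contentstream.operator.OperatorProcessor

The operands can be certain basic objects; see the source of the following class, among others

org.apache.pdfbox.pdmodel.PDPageContentStream

So a PDStream (COSStream) object is comparable to a source file.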
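
To show what such a "source file" looks like, here is a small sketch in plain Java (no PDFBox calls) that builds the text of a content stream. Operands are written first and the operator last (postfix notation). The class ContentStreamTextSketch and its methods are hypothetical names for illustration; w, m, l and S are the standard PDF operators for line width, move-to, line-to and stroke.

```java
import java.io.ByteArrayOutputStream;
import java.math.BigDecimal;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper, not part of PDFBox: builds the postfix text a content stream holds. */
final class ContentStreamTextSketch {

    private final ByteArrayOutputStream out = new ByteArrayOutputStream();

    /** Writes one numeric operand followed by a space. */
    ContentStreamTextSketch operand(float value) {
        if (Float.isNaN(value) || Float.isInfinite(value)) {
            throw new IllegalArgumentException(value + " is not a finite number");
        }
        // Whole numbers are written without a fraction, everything else in
        // plain decimal notation (PDF numbers have no exponent form).
        String text = (value == Math.rint(value))
                ? new BigDecimal(Float.toString(value)).toBigInteger().toString()
                : new BigDecimal(Float.toString(value)).toPlainString();
        write(text + " ");
        return this;
    }

    /** Writes an operator name followed by a newline. */
    ContentStreamTextSketch operator(String name) {
        write(name + "\n");
        return this;
    }

    private void write(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.US_ASCII);
        out.write(bytes, 0, bytes.length);
    }

    String text() {
        return new String(out.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String program = new ContentStreamTextSketch()
                .operand(2).operator("w")                  // set line width
                .operand(10).operand(10).operator("m")     // move to (10, 10)
                .operand(110).operand(60.5f).operator("l") // line to (110, 60.5)
                .operator("S")                             // stroke the path
                .text();
        System.out.print(program);
    }
}
```

The resulting text is what would be written to the output stream obtained from createOutputStream; the sketch only illustrates the byte layout and does not touch a real COSStream.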

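So when are these "source files" compiled? See the source of the following method

org.apache.pdfbox.pdmodel.PDDocument#save

(Strictly speaking, save serializes the streams into the file; the operators themselves are interpreted later, when a viewer renders the page.)

By now, rather than using PDPageContentStream directly, you should be able to define your own PDPageContentStream.

Simplified flowchart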

[Flowchart: PostScript output -> use COSArray? -> (yes) COSStream -> COSArray, or (no) COSStream -> COSDictionary -> PDPage -> PDDocument]

Simple example

package org.example;

import org.apache.pdfbox.contentstream.operator.OperatorName;
import org.apache.pdfbox.cos.COSBase;
import org.apache.pdfbox.cos.COSStream;
import org.apache.pdfbox.io.ScratchFile;
import org.apache.pdfbox.pdmodel.common.COSObjectable;
import org.apache.pdfbox.pdmodel.common.PDStream;
import org.apache.pdfbox.util.Charsets;
import org.apache.pdfbox.util.NumberFormatUtil;

import java.awt.geom.PathIterator;
import java.io.Closeable;
import java.io.File;
import java.io.IOException;
import java.io.OutputStream;
import java.text.NumberFormat;
import java.util.Locale;

public final class PDSimpleGraphicsStreamEngine implements Closeable, COSObjectable {
    private COSStream cosStream;
    private OutputStream output;

    private static final NumberFormat formatDecimal = NumberFormat.getNumberInstance(Locale.US);
    private static final int digits = formatDecimal.getMaximumFractionDigits();
    private static final byte[] formatBuffer = new byte[32];
    private static final float[] pointSet = new float[6];

    /**
     * Closes the output stream and wraps the underlying COSStream.
     *
     * @return a PDStream that can be passed to {@link org.apache.pdfbox.pdmodel.PDPage#setContents}
     * @throws IOException If the output stream could not be closed.
     */
    public PDStream getPdStream() throws IOException {
        close();
        return new PDStream(cosStream);
    }

    /**
     * Constructor
     *
     * @param scratchFileDir A folder to store a page of data,see {@link ScratchFile#ScratchFile(File scratchFileDirectory)}
     * @param filters        filter,see {@link COSStream#createOutputStream(COSBase filters)}
     * @throws IOException
     */
    public PDSimpleGraphicsStreamEngine(String scratchFileDir, COSBase filters) throws IOException {
        if (scratchFileDir != null && scratchFileDir.length() > 0) {
            File dir = new File(scratchFileDir);
            if (!dir.exists()) {
                dir.mkdirs();
            }
            ScratchFile scratchFile = new ScratchFile(dir);
            cosStream = new COSStream(scratchFile);
        } else {
            cosStream = new COSStream();
        }
        output = cosStream.createOutputStream(filters);
    }

    public PDSimpleGraphicsStreamEngine() throws IOException {
        this(null, null);
    }

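    private static int formatFloat(final float value) {
        return NumberFormatUtil.formatFloatFast(value, digits, formatBuffer);
    }

    private void writeOperand(final float value) throws IOException {
        if (Float.isNaN(value) || Float.isInfinite(value)) {
            throw new IllegalArgumentException(value + " is not a finite number");
        }
        int n = formatFloat(value);
        if (n == -1) {
            // formatFloatFast returns -1 when it cannot handle the value (e.g. it is too large);
            // fall back to plain decimal notation, since PDF numbers have no exponent form.
            String text = new java.math.BigDecimal(Float.toString(value)).toPlainString();
            output.write(text.getBytes(Charsets.US_ASCII));
        } else {
            output.write(formatBuffer, 0, n);
        }
        output.write(' ');
    }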

    private void writeOperator(final String operatorName) throws IOException {
        output.write(operatorName.getBytes(Charsets.US_ASCII));
        output.write(' ');
    }

    /**
     * @param x x-coordinate of the target point.
     * @param y y-coordinate of the target point.
     * @throws IOException
     */
    public void moveTo(float x, float y) throws IOException {
        writeOperand(x);
        writeOperand(y);
        writeOperator(OperatorName.MOVE_TO);
    }

    /**
     * @param x x-coordinate of the end point of the line.
     * @param y y-coordinate of the end point of the line.
     * @throws IOException
     */
    public void lineTo(float x, float y) throws IOException {
        writeOperand(x);
        writeOperand(y);
        writeOperator(OperatorName.LINE_TO);
    }

    /**
     * Set line width to the given value.
     *
     * @param lineWidth The width which is used for drawing.
     * @throws IOException If the content stream could not be written
     */
    public void setLineWidth(float lineWidth) throws IOException {
        writeOperand(lineWidth);
        writeOperator(OperatorName.SET_LINE_WIDTH);
    }

    /**
     * @param x1 x-coordinate of the first control point.
     * @param y1 y-coordinate of the first control point.
     * @param x2 x-coordinate of the second control point.
     * @param y2 y-coordinate of the second control point.
     * @param x3 x-coordinate of the end point of the curve.
     * @param y3 y-coordinate of the end point of the curve.
     * @throws IOException
     */
    public void curveTo(float x1, float y1, float x2, float y2, float x3, float y3) throws IOException {
        writeOperand(x1);
        writeOperand(y1);
        writeOperand(x2);
        writeOperand(y2);
        writeOperand(x3);
        writeOperand(y3);
        writeOperator(OperatorName.CURVE_TO);
    }

    /**
     * Append a cubic Bézier curve to the current path. The curve extends from the current point to
     * the point (x3, y3), using (x1, y1) and (x3, y3) as the Bézier control points.
     *
     * @param x1 x coordinate of the point 1
     * @param y1 y coordinate of the point 1
     * @param x3 x coordinate of the point 3
     * @param y3 y coordinate of the point 3
     * @throws IOException           If the content stream could not be written.
     * @throws IllegalStateException If the method was called within a text block.
     */
    public void curveTo1(float x1, float y1, float x3, float y3) throws IOException {
        writeOperand(x1);
        writeOperand(y1);
        writeOperand(x3);
        writeOperand(y3);
        writeOperator(OperatorName.CURVE_TO_REPLICATE_FINAL_POINT);
    }

    /**
     * Append a cubic Bézier curve to the current path. The curve extends from the current point to
     * the point (x3, y3), using the current point and (x2, y2) as the Bézier control points.
     *
     * @param x2 x coordinate of the point 2
     * @param y2 y coordinate of the point 2
     * @param x3 x coordinate of the point 3
     * @param y3 y coordinate of the point 3
     * @throws IllegalStateException If the method was called within a text block.
     * @throws IOException           If the content stream could not be written.
     */
    public void curveTo2(float x2, float y2, float x3, float y3) throws IOException {
        writeOperand(x2);
        writeOperand(y2);
        writeOperand(x3);
        writeOperand(y3);
        writeOperator(OperatorName.CURVE_TO_REPLICATE_INITIAL_POINT);
    }

    /**
     * @throws IOException
     */
    public void closePath() throws IOException {
        writeOperator(OperatorName.CLOSE_PATH);
    }


    public void strokePath() throws IOException {
        writeOperator(OperatorName.STROKE_PATH);
    }

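    /**
     * Appends every segment of the given path iterator to the current path.
     *
     * @param it see {@link PathIterator}
     * @throws IOException If the content stream could not be written.
     */
    public void drawComplexCurve(PathIterator it) throws IOException {
        // Track the current point and the subpath start; both are needed for quadratic segments.
        float curX = 0, curY = 0, startX = 0, startY = 0;
        while (!it.isDone()) {
            switch (it.currentSegment(pointSet)) {
                case PathIterator.SEG_MOVETO:
                    moveTo(pointSet[0], pointSet[1]);
                    curX = startX = pointSet[0];
                    curY = startY = pointSet[1];
                    break;
                case PathIterator.SEG_LINETO:
                    lineTo(pointSet[0], pointSet[1]);
                    curX = pointSet[0];
                    curY = pointSet[1];
                    break;
                case PathIterator.SEG_QUADTO:
                    // PDF has no quadratic operator, so elevate the segment to an exact cubic:
                    // C1 = P0 + 2/3 (Q - P0), C2 = P2 + 2/3 (Q - P2).
                    curveTo(curX + 2f / 3f * (pointSet[0] - curX), curY + 2f / 3f * (pointSet[1] - curY),
                            pointSet[2] + 2f / 3f * (pointSet[0] - pointSet[2]),
                            pointSet[3] + 2f / 3f * (pointSet[1] - pointSet[3]),
                            pointSet[2], pointSet[3]);
                    curX = pointSet[2];
                    curY = pointSet[3];
                    break;
                case PathIterator.SEG_CUBICTO:
                    curveTo(pointSet[0], pointSet[1], pointSet[2], pointSet[3], pointSet[4], pointSet[5]);
                    curX = pointSet[4];
                    curY = pointSet[5];
                    break;
                case PathIterator.SEG_CLOSE:
                    closePath();
                    // Closing a subpath moves the current point back to its start.
                    curX = startX;
                    curY = startY;
                    break;
                default:
                    break;
            }
            it.next();
        }
    }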

    @Override
    public COSStream getCOSObject() {
        return cosStream;
    }

    @Override
    public void close() throws IOException {
        if (output != null) {
            output.close();
            output = null;
        }
    }
}

// Test.java: a small driver for the class above
package org.example;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;

import java.awt.geom.AffineTransform;
import java.awt.geom.Ellipse2D;

/**
 * a test class.
 */
public class Test {
    public static void main(String[] args) throws Exception {
        Ellipse2D.Float ellipse = new Ellipse2D.Float(0, 0, 200, 100);

        PDSimpleGraphicsStreamEngine engine = new PDSimpleGraphicsStreamEngine();
        engine.drawComplexCurve(ellipse.getPathIterator(new AffineTransform()));
        engine.strokePath();
        engine.close();

        PDDocument doc = new PDDocument();
        PDPage page = new PDPage();
        page.setContents(engine.getPdStream());
        doc.addPage(page);

        doc.save("aaa.pdf");
        doc.close();
    }
}
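
One detail of drawComplexCurve deserves a closer look: a PathIterator may return SEG_QUADTO segments, while the PDF curve operators (c, v, y) all describe cubic Bézier curves. A quadratic segment can be converted exactly by degree elevation. The sketch below, in plain Java with only java.awt.geom, checks that conversion numerically; QuadToCubicSketch and its methods are hypothetical names, not PDFBox API.

```java
import java.awt.geom.CubicCurve2D;
import java.awt.geom.Point2D;
import java.awt.geom.QuadCurve2D;

/** Hypothetical illustration, not PDFBox code. */
final class QuadToCubicSketch {

    /** Exact degree elevation: C1 = P0 + 2/3 (Q - P0), C2 = P2 + 2/3 (Q - P2). */
    static CubicCurve2D.Double elevate(QuadCurve2D q) {
        double c1x = q.getX1() + 2.0 / 3.0 * (q.getCtrlX() - q.getX1());
        double c1y = q.getY1() + 2.0 / 3.0 * (q.getCtrlY() - q.getY1());
        double c2x = q.getX2() + 2.0 / 3.0 * (q.getCtrlX() - q.getX2());
        double c2y = q.getY2() + 2.0 / 3.0 * (q.getCtrlY() - q.getY2());
        return new CubicCurve2D.Double(q.getX1(), q.getY1(), c1x, c1y, c2x, c2y, q.getX2(), q.getY2());
    }

    /** Evaluates the quadratic curve at parameter t in [0, 1]. */
    static Point2D.Double quadAt(QuadCurve2D q, double t) {
        double u = 1 - t;
        return new Point2D.Double(
                u * u * q.getX1() + 2 * u * t * q.getCtrlX() + t * t * q.getX2(),
                u * u * q.getY1() + 2 * u * t * q.getCtrlY() + t * t * q.getY2());
    }

    /** Evaluates the cubic curve at parameter t in [0, 1]. */
    static Point2D.Double cubicAt(CubicCurve2D c, double t) {
        double u = 1 - t;
        return new Point2D.Double(
                u * u * u * c.getX1() + 3 * u * u * t * c.getCtrlX1()
                        + 3 * u * t * t * c.getCtrlX2() + t * t * t * c.getX2(),
                u * u * u * c.getY1() + 3 * u * u * t * c.getCtrlY1()
                        + 3 * u * t * t * c.getCtrlY2() + t * t * t * c.getY2());
    }

    public static void main(String[] args) {
        QuadCurve2D quad = new QuadCurve2D.Double(0, 0, 50, 100, 100, 0);
        CubicCurve2D cubic = elevate(quad);
        for (double t = 0; t <= 1.0; t += 0.25) {
            // Both curves should give the same point for every t.
            System.out.println(t + ": " + quadAt(quad, t) + " vs " + cubicAt(cubic, t));
        }
    }
}
```

Passing the quadratic control point and the end point straight to the y operator (curveTo1) would give a different curve, because y reuses the end point as the second cubic control point. The Ellipse2D in the example above only produces cubic segments, so it is not affected either way.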

Visitor pattern

PDFBox is a good real-world example of the visitor pattern; see the source of the following class and its related classes

org.apache.pdfbox.pdfwriter.COSWriter
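
As a rough illustration of the idea, here is a toy model in plain Java. In PDFBox 2.0.x, as far as I can tell from the source, COSBase declares accept(ICOSVisitor) and COSWriter implements ICOSVisitor; the sketch below only imitates that shape. All names in it (VisitorSketch, Node, Visitor, Writer and so on) are hypothetical and much simpler than the real classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model, not PDFBox code: the names loosely mirror COSBase and ICOSVisitor. */
final class VisitorSketch {

    /** One visit method per node type, like the visitFrom... methods of a COS visitor. */
    interface Visitor<R> {
        R visitInt(IntNode node);
        R visitName(NameNode node);
        R visitDict(DictNode node);
    }

    /** Every node dispatches to the visit method that matches its own type. */
    interface Node {
        <R> R accept(Visitor<R> visitor);
    }

    static final class IntNode implements Node {
        final long value;
        IntNode(long value) { this.value = value; }
        @Override public <R> R accept(Visitor<R> visitor) { return visitor.visitInt(this); }
    }

    static final class NameNode implements Node {
        final String name;
        NameNode(String name) { this.name = name; }
        @Override public <R> R accept(Visitor<R> visitor) { return visitor.visitName(this); }
    }

    static final class DictNode implements Node {
        final Map<String, Node> entries = new LinkedHashMap<>();
        DictNode put(String key, Node value) { entries.put(key, value); return this; }
        @Override public <R> R accept(Visitor<R> visitor) { return visitor.visitDict(this); }
    }

    /** A visitor that serializes nodes to PDF-like syntax. */
    static final class Writer implements Visitor<String> {
        @Override public String visitInt(IntNode node) { return Long.toString(node.value); }
        @Override public String visitName(NameNode node) { return "/" + node.name; }
        @Override public String visitDict(DictNode node) {
            StringBuilder sb = new StringBuilder("<<");
            for (Map.Entry<String, Node> e : node.entries.entrySet()) {
                // Values are written by recursing through accept, not by type checks.
                sb.append(" /").append(e.getKey()).append(' ').append(e.getValue().accept(this));
            }
            return sb.append(" >>").toString();
        }
    }

    public static void main(String[] args) {
        DictNode page = new DictNode()
                .put("Type", new NameNode("Page"))
                .put("Rotate", new IntNode(90));
        System.out.println(page.accept(new Writer())); // prints "<< /Type /Page /Rotate 90 >>"
    }
}
```

The point of the pattern is that the node classes stay unaware of the output format: a different visitor could count objects or collect names without any change to the nodes.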
