C#: Reading Text Coordinates from a PDF with iTextSharp

趁早码

Tags: c#

Article link: https://blog.csdn.net/feel0521/article/details/120528186

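This article shows how to use C# with the iTextSharp library to read the text of a PDF document together with its coordinates, built around a custom extraction strategy in the PdfHelper namespace.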

using iTextSharp.text.pdf;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace TestIText
{
    class Program
    {
        static void Main(string[] args)
        {
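            // Open the source PDF (the file name is elided in the original post).
            PdfReader readerTemp = new PdfReader(@"D:\.pdf");

            // Custom extraction strategy, defined in the PdfHelper namespace below.
            PdfHelper.LocationTextExtractionStrategyEx pz = new PdfHelper.LocationTextExtractionStrategyEx();

            // Feed the content of page 1 through the strategy (page numbers are 1-based).
            iTextSharp.text.pdf.parser.PdfReaderContentParser p = new iTextSharp.text.pdf.parser.PdfReaderContentParser(readerTemp);
            p.ProcessContent<PdfHelper.LocationTextExtractionStrategyEx>(1, pz);

            // Print the extracted text and wait for Enter before exiting.
            Console.WriteLine(pz.GetResultantText());
            Console.ReadLine();

            // Release the file handle held by the reader.
            readerTemp.Close();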
        }
    }
}
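
The program above prints only the extracted text. To inspect coordinates directly, one option is a small `IRenderListener` that prints the baseline of every text run as the page is parsed. The following is a minimal sketch assuming iTextSharp 5.x; `CoordinateListener` and `CoordinateDemo` are hypothetical names that do not appear in the original post, and taking the PDF path from the first command-line argument is likewise an assumption.

```csharp
using System;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

// Hypothetical listener (not from the original post):
// prints each text run together with its baseline coordinates.
class CoordinateListener : IRenderListener
{
    public void BeginTextBlock() { }
    public void EndTextBlock() { }
    public void RenderImage(ImageRenderInfo renderInfo) { }

    public void RenderText(TextRenderInfo renderInfo)
    {
        // Baseline start and end points, in PDF user-space units
        // (typically points, with the origin at the bottom-left of the page).
        LineSegment baseline = renderInfo.GetBaseline();
        Vector start = baseline.GetStartPoint();
        Vector end = baseline.GetEndPoint();

        Console.WriteLine("\"{0}\" x={1} y={2} to x={3}",
            renderInfo.GetText(),
            start[Vector.I1], start[Vector.I2], end[Vector.I1]);
    }
}

class CoordinateDemo
{
    static void Main(string[] args)
    {
        // Assumption: the PDF path is passed as the first argument.
        PdfReader reader = new PdfReader(args[0]);
        try
        {
            PdfReaderContentParser parser = new PdfReaderContentParser(reader);
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                parser.ProcessContent(page, new CoordinateListener());
            }
        }
        finally
        {
            reader.Close();
        }
    }
}
```

A text run here is whatever a single text-showing operator in the PDF draws, so one visual line may arrive as several runs, possibly out of reading order. Sorting and merging those runs is the job of a location-based strategy such as the one in PdfHelper.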

PdfHelper

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

using iTextSharp.text.pdf.parser;

namespace PdfHelper
{
    /// <summary>
    /// Taken from http://www.java-frameworks.com/java/itext/com/itextpdf/text/pdf/parser/LocationTextExtractionStrategy.java.html
    /// </summary>
    class LocationTextExtractionStrategyEx : LocationTextExtractionStrategy
    {
        private List<TextChunk> m_locationResult = new List<TextChunk>();
        private List<TextInfo> m_TextLocationInfo = new List<TextInfo>();
        public List<TextChunk> LocationResult
        {
            get { return m_locationResult; }
        }
        public List<TextInfo> TextLocationInfo
        {
            get { return m_TextLocationInfo; }
        }

        /// <summary>
        /// Creates a new LocationTextExtractionStrategyEx
        /// </summary>
        public LocationTextExtractionStrategyEx()
        {
        }

        /// <summary>
        /// Returns the result so far
        /// </summary>
        /// <returns>a String with the resulting text</returns>
        public override String GetResultantText()
        {
            m_locationResult.Sort();

            StringBuilder sb = new StringBuilder();
            TextChunk lastChunk = null;
            TextInfo lastTextInfo = null;
            foreach (TextChunk chunk in m_locationResult)
            {
                if (lastChunk == null)
                {
                    sb.Append(chunk.Text);
                    lastTextInfo = n
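
The listing breaks off at this point. The visible part of `GetResultantText` sorts the collected chunks and then walks them while remembering the previous one. A common way to finish such a loop is to join chunks that sit on the same line and start a new line otherwise. The sketch below illustrates that idea in plain C# with no iTextSharp dependency. `Chunk`, `SameLine`, and `LineJoiner` are hypothetical names for illustration only, not iTextSharp types and not the author's code, and the one-unit "same line" tolerance is an assumed value.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for a positioned piece of text
// (not iTextSharp's TextChunk).
public class Chunk
{
    public string Text;
    public float X;   // left edge of the chunk
    public float Y;   // baseline height; larger Y is higher on the page

    public Chunk(string text, float x, float y)
    {
        Text = text;
        X = x;
        Y = y;
    }

    // Assumed tolerance: baselines within 1 unit count as the same line.
    public bool SameLine(Chunk other)
    {
        return Math.Abs(Y - other.Y) < 1f;
    }
}

public static class LineJoiner
{
    // Sorts chunks into reading order and joins them into lines of text.
    public static string Join(List<Chunk> chunks)
    {
        // Reading order: top to bottom (Y descending), then left to right.
        chunks.Sort(delegate (Chunk a, Chunk b)
        {
            int byLine = b.Y.CompareTo(a.Y);
            return byLine != 0 ? byLine : a.X.CompareTo(b.X);
        });

        StringBuilder sb = new StringBuilder();
        Chunk last = null;
        foreach (Chunk chunk in chunks)
        {
            if (last != null)
            {
                // Same line: separate with a space; otherwise start a new line.
                sb.Append(chunk.SameLine(last) ? " " : "\n");
            }
            sb.Append(chunk.Text);
            last = chunk;
        }
        return sb.ToString();
    }
}
```

For example, three chunks "world" at (50, 700), "second" at (10, 680), and "Hello" at (10, 700) are joined into two lines, "Hello world" followed by "second". A real strategy would also use the width of a space character to decide whether adjacent chunks need a separator at all, which this sketch leaves out.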
