Aspose.PDF PDF Processing Component Demo: Find and Replace Text in PDF Files Using C#

Latest recommended article published on 2025-04-27 10:43:27

慧都小妮子


Article link: https://blog.csdn.net/m0_67129275/article/details/134419909


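A find-and-replace operation updates every occurrence of a given piece of text in a document in one pass, so you do not have to locate and edit each occurrence by hand. This article shows how to automate find and replace in PDF documents. Specifically, you will learn how to use C# to find and replace text across an entire PDF, on a specific page, or within a region of a page.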

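Find and replace text in a PDF using C#
Find and replace text on a specific page
Define a page region to find and replace text
Find and replace text in a PDF using regular expressions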

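Aspose.PDF for .NET is a C# class library that provides basic and advanced PDF manipulation features for .NET applications. The API also lets you find and replace text in PDF documents in several ways.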

Find and replace text in a PDF using C#

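The steps for finding and replacing text in a PDF document are as follows.

Load the PDF document from its path using the Document class.
Create an instance of the TextFragmentAbsorber class and pass the search phrase to its constructor.
Accept the text absorber for all pages of the PDF using Document.Pages.Accept(TextFragmentAbsorber).
Get the extracted text fragments as a TextFragmentCollection object.
Loop through the TextFragmentCollection and replace the text in each fragment.
Save the updated PDF document using the Document.Save(String) method.

The following code sample shows how to find and replace text in a PDF using C#.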

using Aspose.Pdf;
using Aspose.Pdf.Text;

// Open document
Document pdfDocument = new Document("Document.pdf");

// Create a TextFragmentAbsorber object to find all instances of the input search phrase
TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber("text");

// Accept the absorber for all the pages
pdfDocument.Pages.Accept(textFragmentAbsorber);

// Get the extracted text fragments
TextFragmentCollection textFragmentCollection = textFragmentAbsorber.TextFragments;

// Loop through the fragments
foreach (TextFragment textFragment in textFragmentCollection)
{
    // Update text and other properties
    textFragment.Text = "TEXT";
    textFragment.TextState.Font = FontRepository.FindFont("Verdana");
    textFragment.TextState.FontSize = 22;
    textFragment.TextState.ForegroundColor = Aspose.Pdf.Color.FromRgb(System.Drawing.Color.Blue);
    textFragment.TextState.BackgroundColor = Aspose.Pdf.Color.FromRgb(System.Drawing.Color.Green);
}
            
// Save resulting PDF document.
pdfDocument.Save("updated-document.pdf");
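
The steps above can also be wrapped in a small reusable method. The following is a minimal sketch, not part of Aspose.PDF itself: the class name PdfTextReplacer and the method ReplaceAll are hypothetical names introduced here for illustration. It uses only the calls shown in the sample above and additionally returns how many fragments were changed, which makes it easy to tell whether the search phrase was found at all.

```csharp
using Aspose.Pdf;
using Aspose.Pdf.Text;

public static class PdfTextReplacer
{
    // Hypothetical helper (not an Aspose.PDF API): replaces every occurrence of
    // searchPhrase in the whole document and returns the number of fragments changed.
    public static int ReplaceAll(string inputPath, string outputPath,
                                 string searchPhrase, string replacement)
    {
        using (Document pdfDocument = new Document(inputPath))
        {
            // Search all pages for the phrase
            TextFragmentAbsorber absorber = new TextFragmentAbsorber(searchPhrase);
            pdfDocument.Pages.Accept(absorber);

            // Replace the text of each fragment and count the replacements
            int replaced = 0;
            foreach (TextFragment fragment in absorber.TextFragments)
            {
                fragment.Text = replacement;
                replaced++;
            }

            pdfDocument.Save(outputPath);
            return replaced;
        }
    }
}
```

A call such as PdfTextReplacer.ReplaceAll("Document.pdf", "updated-document.pdf", "text", "TEXT") would then perform the same replacement as the sample above, without the font and color changes.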

Find and replace text on a specific page using C#

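The steps for finding and replacing text on a specific page of a PDF document are as follows.

Load the PDF document from its path using the Document class.
Create an instance of the TextFragmentAbsorber class and pass the search phrase to its constructor.
Accept the text absorber for the desired page using Document.Pages[1].Accept(TextFragmentAbsorber). Page numbering starts at 1.
Loop through the TextFragmentAbsorber.TextFragments collection and replace the text in each fragment.
Save the updated PDF document using the Document.Save(String) method.

The following code sample shows how to find and replace text on a specific page of a PDF using C#.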

using Aspose.Pdf;
using Aspose.Pdf.Text;

// Open document
Document pdfDocument = new Document("Document.pdf");

// Create a TextFragmentAbsorber object to find all instances of the input search phrase
TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber("text");

// Accept the absorber for the desired page (page numbering starts at 1)
pdfDocument.Pages[1].Accept(textFragmentAbsorber);

// Get the extracted text fragments
TextFragmentCollection textFragmentCollection = textFragmentAbsorber.TextFragments;

// Loop through the fragments
foreach (TextFragment textFragment in textFragmentCollection)
{
    // Update text and other properties
    textFragment.Text = "TEXT";
    textFragment.TextState.Font = FontRepository.FindFont("Verdana");
    textFragment.TextState.FontSize = 22;
    textFragment.TextState.ForegroundColor = Aspose.Pdf.Color.FromRgb(System.Drawing.Color.Blue);
    textFragment.TextState.BackgroundColor = Aspose.Pdf.Color.FromRgb(System.Drawing.Color.Green);
}

// Save resulting PDF document.
pdfDocument.Save("updated-document.pdf");
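
Because the page collection is indexed from 1, it can help to validate the page number before accepting the absorber. The following is a minimal sketch under that assumption. PdfPageTextReplacer, IsValidPageNumber, and ReplaceOnPage are hypothetical names introduced for illustration and are not part of Aspose.PDF.

```csharp
using System;
using Aspose.Pdf;
using Aspose.Pdf.Text;

public static class PdfPageTextReplacer
{
    // Hypothetical helper: true when pageNumber is a valid 1-based page number.
    public static bool IsValidPageNumber(int pageNumber, int pageCount)
    {
        return pageNumber >= 1 && pageNumber <= pageCount;
    }

    // Hypothetical helper: replaces searchPhrase on one page of an already loaded
    // document and returns the number of fragments changed. The caller saves the document.
    public static int ReplaceOnPage(Document pdfDocument, int pageNumber,
                                    string searchPhrase, string replacement)
    {
        if (!IsValidPageNumber(pageNumber, pdfDocument.Pages.Count))
        {
            throw new ArgumentOutOfRangeException(nameof(pageNumber),
                "Page numbers start at 1 and must not exceed the page count.");
        }

        // Search only the requested page
        TextFragmentAbsorber absorber = new TextFragmentAbsorber(searchPhrase);
        pdfDocument.Pages[pageNumber].Accept(absorber);

        int replaced = 0;
        foreach (TextFragment fragment in absorber.TextFragments)
        {
            fragment.Text = replacement;
            replaced++;
        }
        return replaced;
    }
}
```

Keeping the save step with the caller allows several pages to be processed before the document is written once with Document.Save(String).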

Define a page region to find and replace text

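You can also find and replace text within a specific region of a page in a PDF document. The following steps show how to define a region and then replace the text inside it.

Load the PDF document from its path using the Document class.
Create an instance of the TextFragmentAbsorber class. The sample uses the parameterless constructor, so all text is absorbed rather than a single search phrase.
Set TextSearchOptions.LimitToPageBounds to true to restrict the search to the page bounds.
Define the page region with the Rectangle class and assign it to TextSearchOptions.Rectangle.
Accept the text absorber for the desired page using Document.Pages[1].Accept(TextFragmentAbsorber).
Loop through the TextFragmentAbsorber.TextFragments collection and replace the text in each fragment.
Save the updated PDF document using the Document.Save(String) method.

The following code sample shows how to find and replace text in a specific page region of a PDF using C#.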

using Aspose.Pdf;
using Aspose.Pdf.Text;

// Load the PDF file
Document pdf = new Document("Document.pdf");

// Create a TextFragmentAbsorber without a search phrase so that it absorbs all text
TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber();

// Restrict the search to the page bounds
textFragmentAbsorber.TextSearchOptions.LimitToPageBounds = true;

// Specify the page region as (lower-left x, lower-left y, upper-right x, upper-right y)
textFragmentAbsorber.TextSearchOptions.Rectangle = new Rectangle(100, 100, 200, 200);

// Search the first page of the PDF file (page numbering starts at 1)
pdf.Pages[1].Accept(textFragmentAbsorber);

// Iterate through the text fragments found in the region
foreach (TextFragment tf in textFragmentAbsorber.TextFragments)
{
    // Replace the text with an empty string, which removes it
    tf.Text = "";
}

// Save the updated PDF file
pdf.Save("output.pdf");
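
The four Rectangle arguments are the lower-left and upper-right corners of the region, measured from the bottom-left corner of the page. If a region is known as a box measured from the top-left corner instead, it has to be converted first. The following is a minimal sketch of that arithmetic, assuming the default PDF coordinate system and an unrotated page. PageRegion and FromTopLeft are hypothetical names introduced for illustration, and the page height of 842 points (roughly A4) in the usage note is an assumption as well.

```csharp
public static class PageRegion
{
    // Hypothetical helper: converts a box measured from the top-left corner of the page
    // into (llx, lly, urx, ury) coordinates measured from the bottom-left corner.
    public static (double Llx, double Lly, double Urx, double Ury) FromTopLeft(
        double left, double top, double width, double height, double pageHeight)
    {
        double llx = left;               // left edge is unchanged
        double urx = left + width;       // right edge
        double ury = pageHeight - top;   // top edge, measured from the bottom of the page
        double lly = ury - height;       // bottom edge
        return (llx, lly, urx, ury);
    }
}
```

For a page 842 points high, PageRegion.FromTopLeft(100, 642, 100, 100, 842) yields (100, 100, 200, 200), the same region as the Rectangle used in the sample above.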

Find and replace text in a PDF using regular expressions

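You can also use a regular expression to find and replace text that matches a pattern. To do so, pass a regular expression instead of a plain search phrase and enable regular-expression matching through TextSearchOptions. The steps are as follows.

Load the PDF document from its path using the Document class.
Create an instance of the TextFragmentAbsorber class and pass the regular expression to its constructor.
Create an instance of the TextSearchOptions class and pass true to its constructor to enable regular-expression search.
Assign the TextSearchOptions object to the TextFragmentAbsorber.TextSearchOptions property.
Accept the text absorber for the desired page using Document.Pages[1].Accept(TextFragmentAbsorber).
Loop through the TextFragmentAbsorber.TextFragments collection and replace the text in each fragment.
Save the updated PDF document using the Document.Save(String) method.

The following code sample shows how to find and replace text in a PDF with a regular expression using C#.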

using Aspose.Pdf;
using Aspose.Pdf.Text;

// Open document
Document pdfDocument = new Document("Document.pdf");

// Create a TextFragmentAbsorber object to find all phrases matching the regular expression
TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber("\\d{4}-\\d{4}"); // Like 1999-2000

// Set text search option to specify regular expression usage
TextSearchOptions textSearchOptions = new TextSearchOptions(true);
textFragmentAbsorber.TextSearchOptions = textSearchOptions;

// Accept the absorber for a single page
pdfDocument.Pages[1].Accept(textFragmentAbsorber);

// Get the extracted text fragments
TextFragmentCollection textFragmentCollection = textFragmentAbsorber.TextFragments;

// Loop through the fragments
foreach (TextFragment textFragment in textFragmentCollection)
{
    // Update text and other properties
    textFragment.Text = "New Phrase";
    // Assign a Font instance looked up by name
    textFragment.TextState.Font = FontRepository.FindFont("Verdana");
    textFragment.TextState.FontSize = 22;
    textFragment.TextState.ForegroundColor = Aspose.Pdf.Color.FromRgb(System.Drawing.Color.Blue);
    textFragment.TextState.BackgroundColor = Aspose.Pdf.Color.FromRgb(System.Drawing.Color.Green);
}

// Save PDF
pdfDocument.Save("output.pdf");
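
Before running a pattern against a PDF, it can be worth checking it against plain strings. The following is a minimal sketch that uses the standard System.Text.RegularExpressions.Regex class rather than Aspose.PDF, on the assumption that the absorber interprets the pattern the same way. PatternCheck and FindYearRanges are hypothetical names introduced for illustration. The verbatim string @"\d{4}-\d{4}" is the same pattern as "\\d{4}-\\d{4}" in the sample above.

```csharp
using System.Text.RegularExpressions;

public static class PatternCheck
{
    // Hypothetical helper: returns every substring that matches the year-range
    // pattern used in the article (four digits, a hyphen, four digits).
    public static string[] FindYearRanges(string text)
    {
        MatchCollection matches = Regex.Matches(text, @"\d{4}-\d{4}");
        string[] results = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            results[i] = matches[i].Value;
        }
        return results;
    }
}
```

For the input "Copyright 1999-2000 and 2001-2002, tel 555-1234" the method returns "1999-2000" and "2001-2002". The phone number is skipped because only three digits precede its hyphen. Note that the pattern has no word boundaries, so it would also match inside a longer digit run such as "12345-67890".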