How do I change the hyperlinks in pdf using python? I am currently using a pyPDF2 to open up and loop through the pages. How do I actually scan for hyperlinks and then proceed to change the hyperlinks?
解决方案
So I couldn't get what you want using the pyPDF2 library.
I did however get something working with another library: pdfrw. This installed fine for me using pip in Python 3.6:
pip install pdfrw
Note: for the following I have been using this example pdf I found online which contains multiple links. Your mileage may vary with this.
import pdfrw
pdf = pdfrw.PdfReader("pdf.pdf") #Load the pdf
new_pdf = pdfrw.PdfWriter() #Create an empty pdf
for page in pdf.pages: #Go through the pages
for annot in page.Annots or []: #Links are in Annots, but some pages
#don't have links so Annots returns None
old_url = annot.A.URI
#>Here you put logic for replacing the URLs<
#Use the PdfString object to do the encoding for us.
# Note the brackets around the URL here.
new_url = pdfrw.objects.pdfstring.PdfString("(http://www.google.com)")
#Override the URL with ours.
annot.A.URI = new_url
new_pdf.addpage(page)
new_pdf.write("new.pdf")