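When translating an English paper from a PDF with Google Translate, the text that Python extracts from the PDF comes out badly formatted, so it has to be pasted in section by section. This script uses a browser-automation crawler to cut out the manual Ctrl+C / Ctrl+V steps from the PDF into the web page and from the web page into Word, and it also removes the line breaks from the PDF text.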
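The line-break removal described above can be sketched as a small standalone helper. This is a minimal sketch using only the standard library. The name `unwrap_pdf_text` and the `join_hyphenated` flag are hypothetical, introduced here for illustration. It assumes the pasted text is a single paragraph whose line breaks are all artifacts of the PDF layout.

```python
import re


def unwrap_pdf_text(raw: str, join_hyphenated: bool = False) -> str:
    """Join hard-wrapped lines copied from a PDF into one paragraph."""
    text = raw
    if join_hyphenated:
        # Optional and off by default: rejoin words split by a hyphen at a
        # line end ("implemen-\ntation"). Assumption: such hyphens are layout
        # artifacts; real compounds broken at a line end would be merged too.
        text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Collapse every run of whitespace, including newlines, to a single space
    # so that the words on either side of a line break stay separated.
    return " ".join(text.split())


print(unwrap_pdf_text("the predicted\nheatmaps into"))  # the predicted heatmaps into
```

Joining with a space rather than an empty string matters for English text, because the last word of one line and the first word of the next would otherwise run together.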
The code is as follows:
from selenium import webdriver
import time
import docx
from docx.shared import Pt
from docx.oxml.ns import qn
# Text to translate (pasted from the PDF, line breaks included)
str = """
While being the de facto standard coordinate representation in human pose estimation, heatmap is never systematically investigated in the literature, to our best knowledge.
This work fills this gap by studying the coordinate representation with a particular focus on the heatmap. Interestingly, we found that the process of decoding the predicted
heatmaps into the final joint coordinates in the original image space is surprisingly significant for human pose estimation performance, which nevertheless was not recognised before. In light of the discovered importance, we further probe
the design limitations of the standard coordinate decoding
method widely used by existing methods, and propose a
more principled distribution-aware decoding method. Meanwhile, we improve the standard coordinate encoding process (i.e. transforming ground-truth coordinates to heatmaps)
by generating accurate heatmap distributions for unbiased
model training. Taking the two together, we formulate a
novel Distribution-Aware coordinate Representation of Keypoint (DARK) method. Serving as a model-agnostic plugin, DARK significantly improves the performance of a variety of state-of-the-art human pose estimation models. Extensive experiments show that DARK yields the best results on two common benchmarks, MPII and COCO, consistently validating the usefulness and effectiveness of our
novel coordinate representation idea. The project page is at
https://ilovepose.github.io/coco/
"""
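# Remove the line breaks. Each break is replaced by a single space, because
# replacing "\n" with "" would glue together the words on either side of it.
str = " ".join(str.split())
print(str)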
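# Path to the Edge WebDriver executable (adjust for your machine)
path = "C://Users//Wu//edgedriver_win64//msedgedriver.exe"
# Passing the path positionally works on Selenium 3; Selenium 4.10+ expects a Service object instead.
driver = webdriver.Edge(path)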
# Open Google Translate
web = "https://translate.google.cn"
driver.get(web)
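# Locate the source-text input box
# (find_element("xpath", ...) works on Selenium 3 and 4; find_element_by_xpath was removed in Selenium 4.3)
elem = driver.find_element("xpath", '//*[@id="yDmH0d"]/c-wiz/div/div[2]/c-wiz/div[2]/c-wiz/div[1]/div[2]/div[2]/c-wiz[1]/span/span/div/textarea')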
elem.send_keys(str)
time.sleep(5)
# Locate the translation output box
text_translation = driver.find_element("xpath", '//*[@id="yDmH0d"]/c-wiz/div/div[2]/c-wiz/div[2]/c-wiz/div[1]/div[2]/div[2]/c-wiz[2]/div[5]/div/div[3]')
string_translation = text_translation.get_attribute("data-text")
print(string_translation)
# quit() closes the browser and also stops the msedgedriver process
driver.quit()
# Write the translation to Word
# File name and directory
file_name = "Distribution-Aware Coordinate Representation for Human Pose Estimation.docx"
file_path = "F:/人体姿态/"
# Open the Word document as an in-memory object (the file must already exist at this path)
file=docx.Document(file_path+file_name)
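# Append the translation as a new paragraph
paragraph = file.add_paragraph()  # create an empty paragraph
text = paragraph.add_run(string_translation)  # add the translated text as a run
text.font.size = Pt(12)  # font size
text.font.name = u'宋体'
text._element.rPr.rFonts.set(qn('w:eastAsia'), u'宋体')  # font used for Chinese (East Asian) characters
# Save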
file.save(file_path+file_name)
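The script pauses for a fixed five seconds before reading the translation, which may be too short on a slow connection and longer than needed on a fast one. One alternative is to poll until the output is available. The helper below is a minimal sketch using only the standard library. The names `wait_for`, `timeout` and `interval` are hypothetical and not part of Selenium, which ships its own `WebDriverWait` class for the same purpose.

```python
import time


def wait_for(fetch, timeout=10.0, interval=0.5):
    """Call fetch() until it returns a truthy value or the timeout expires.

    Exceptions raised by fetch() (for example, an element that is not on the
    page yet) are treated as "not ready" and the call is retried.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = fetch()
            if result:
                return result
        except Exception:
            pass  # not ready yet; retry after the interval
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no result after {timeout} seconds")
        time.sleep(interval)


# Hypothetical use in place of time.sleep(5), where OUTPUT_XPATH stands for
# the XPath of the translation output box:
# string_translation = wait_for(
#     lambda: driver.find_element("xpath", OUTPUT_XPATH).get_attribute("data-text")
# )
```

Polling returns as soon as the translated text appears and raises `TimeoutError` if it never does, so a failed page load shows up as an error rather than as an empty paragraph in the Word file.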