Displaying Microsoft Word Documents in a Web Page as Images Programmatically

This article explains how to programmatically display the first page of your Microsoft Word documents (.doc, .docx, etc.) as images in a web page. I scoured the web for a way to do this without success. The goal is to produce something similar to this list of resume templates:

http://office.microsoft.com/en-us/templates/CT010144894.aspx

A functional example of this article can be found here: http://www.patsmitty.com/gview/word_image.php

The ENTIRE source code is attached as a .zip below.

Article Disclaimer: This article is somewhat advanced and DOES NOT cover or explain any of the HTML, CSS, PHP functions, or jQuery resources I have used here. It DOES explain in detail how to get a URL for the "src" attribute of an image of your msword docs. If you have questions, please comment here as I'd love to help answer them.

Take note that this solution is a hack and relies on Google docs. So if Google docs changes or goes away... so does this solution! Also this solution is fairly complex, so allow about an hour to digest it. All of my source files will be attached down at the bottom.

This solution starts off with the Google docs application. Let's say you have a msword document at http://www.myserver.com/test.docx. If you navigate to http://docs.google.com/gview?url=http://www.myserver.com/test.docx, you will be able to view that document in the 'bulky' Google docs viewer. I choose the word 'bulky' because all I want is an image of the document; I'm not interested in zooming in or out, looking at all the pages, and so on. If we look closer at the Google docs application and right-click anywhere inside the document, we see this context menu:

[Image: Standard Image Context Menu]

From this we can see that Google docs actually generates an image of each page of the msword document. This is exactly what I'm after! When we select "view image" from that context menu we see the image with the image's URL in the address bar. Let's look at the parameters in the URL:

url - this is the actual URL of the msword document

docid - this is some generated id of the image

a - I don't know what this is, but it always has the same value (as far as I've tested...)

pagenumber - the page number of the msword document (in this tutorial, it'll always be 1...)

w - the width of the image in pixels

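
As a rough sketch of how the parameters listed above fit together, the snippet below assembles an image URL from them. The function name buildPageImageUrl and the default width of 125 are assumptions made for illustration; the article's own code concatenates the string by hand and leaves the document URL unencoded, whereas this sketch swaps in http_build_query, which percent-encodes each value. Whether the viewer treats the encoded form identically has not been tested here.

```php
<?php
// Hypothetical helper (not part of the article's source): assembles the
// page-image URL from the parameters listed above.
function buildPageImageUrl(string $docUrl, string $docId, int $page = 1, int $width = 125): string
{
    $params = [
        'url'        => $docUrl,  // address of the Word document
        'docid'      => $docId,   // id generated by the viewer
        'a'          => 'bi',     // value seen in the image URLs in this article
        'pagenumber' => $page,    // always 1 in this tutorial
        'w'          => $width,   // image width in pixels
    ];
    // http_build_query percent-encodes every value, so the nested document
    // URL cannot break the outer query string.
    return 'http://docs.google.com/gview?' . http_build_query($params);
}

echo buildPageImageUrl('http://www.myserver.com/test.docx', '93c3e45e33f8c096913511ef8fe32a92'), "\n";
```

The docid still has to come from the viewer page itself, so this helper only covers the final assembly step.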

For this article's purposes I have multiple msword documents in a directory on my website located at http://www.patsmitty.com/gview/word_documents/. It doesn't matter what you call your documents, as we will obtain them programmatically via PHP.

One more thing before we start: my example contains a js prototype progress bar and a frame for the images, so there are "additional" source files and "extraneous" code as well.

1. Scrape docs.google.com for the docid parameter

Download the PHP Simple HTML DOM Parser here: http://sourceforge.net/projects/simplehtmldom/files/simplehtmldom/1.5/simplehtmldom_1_5.zip/download

Create a blank php file and name it "scrapeIt.php".

Include() or require() the DOM Parser

Create a function called getImageUrl that takes 2 parameters: $file_url, $thumb_width

This function will contain 2 lines of code that return the URL of the image we need.

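function getImageURL($file_url, $thumb_width) {
	// Fetch the viewer page for the document (file_get_html comes from the Simple HTML DOM Parser).
	$html = file_get_html('http://docs.google.com/gview?url=' . $file_url);
	// Cut the raw svUrl value out of the page, then turn it into a usable image URL.
	return doUrl(html_to_URL($html, "{svUrl:'", "46chan"), $thumb_width);
}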

Create the function that the second line in the snippet above refers to: the html_to_URL function, which takes $string, $start, and $end as parameters. The $string parameter is the page HTML returned by Google docs. It contains the long URL we need, surrounded by extraneous stuff that we trim away using the positions of $start and $end, which are "{svUrl:'" and "46chan". These 2 strings appear in a js function called by Google's gview app. Below is Google's js function code with the URL for my document:

<script type="text/javascript">
            
            function finalizeApp() {
              if (!gviewApp) {
                return;
              }
  
              
              gviewApp.setDisplayData(
                {svUrl:'?url\75http://www.patsmitty.com/gview/word_documents/test.docx\46docid\7593c3e45e33f8c096913511ef8fe32a92\46chan\75DwAAAM2xz0nJvnNLiNHc/RtPoug%3D\46a\75sv',biUrl:'?url\75http://www.patsmitty.com/gview/word_documents/test.docx\46docid\7593c3e45e33f8c096913511ef8fe32a92\46chan\75DwAAAM2xz0nJvnNLiNHc/RtPoug%3D\46a\75bi',chanId:'DwAAAM2xz0nJvnNLiNHc/RtPoug\075',gpUrl:'http://doc-0k-8g-docsviewer.googleusercontent.com/viewer/securedownload/dsn1aovipa7l846lsfcf94nedj8q2p4u/vgceh33q6a9930abfmgliebnltb0nljm/1312587900000/dXJs/AGZ5hq8BgbJY1gwaOYx83cPOdNw6/aHR0cDovL3d3dy5wYXRzbWl0dHkuY29tL2d2aWV3L3dvcmRfZG9jdW1lbnRzL3BhdF9zbWl0aC5kb2N4?a\75gp\46filename\75test.docx\46chan\75DwAAAM2xz0nJvnNLiNHc/RtPoug%3D\46docid\7593c3e45e33f8c096913511ef8fe32a92\46sec\75AHSqidYnfLc11kcuxtjsvPOT1apoyI52utATnDA0dbZG7oiQ3GYFpmaw454_bppEvls9ZMLaqb-V',docId:'93c3e45e33f8c096913511ef8fe32a92',numPages:1,gtUrl:'?url\75http://www.patsmitty.com/gview/word_documents/test.docx\46docid\7593c3e45e33f8c096913511ef8fe32a92\46chan\75DwAAAM2xz0nJvnNLiNHc/RtPoug%3D\46a\75gt',thWidth:138,dlUrl:'http://www.patsmitty.com/gview/word_documents/test.docx',thHeight:179});
              gviewApp.finalizeApp();

              
              gviewApp.loadLateDeps();
            }
            gviewApp.setProgress(90);
            finalizeApp();
            
              window.jstiming.load.tick('prt');
            
          </script>

Between those two markers we get this raw string:
?url\75http://www.patsmitty.com/gview/word_documents/test.docx\46docid\7593c3e45e33f8c096913511ef8fe32a92
Here is the html_to_URL function that extracts it:

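function html_to_URL($string, $start, $end) {
	// Prepend a space so a match at the very start lands on position 1, not 0 (this also casts $string to a string).
	$string = " ".$string;
	$ini = strpos($string,$start);
	// strpos returns false when $start is missing; false == 0, so we bail out here.
	if ($ini == 0) return "";
	$ini += strlen($start);
	$len = strpos($string,$end,$ini) - $ini;
	// $end ("46chan") sits right after a backslash, so drop that trailing backslash as well.
	$final = substr($string,$ini,$len-1);
	return $final;
}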
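
To illustrate the trimming step in isolation, here is a minimal sketch that can be run without contacting Google. The name extractBetween and the shortened sample string are assumptions for illustration only. Unlike the function above, this variant includes the backslash in the end marker and checks strpos results with strict comparisons, so it returns an empty string when either marker is missing.

```php
<?php
// Hypothetical variant of html_to_URL: returns the text between two markers,
// or '' when either marker is missing.
function extractBetween(string $haystack, string $start, string $end): string
{
    $from = strpos($haystack, $start);
    if ($from === false) {
        return '';
    }
    $from += strlen($start);
    $to = strpos($haystack, $end, $from);
    if ($to === false) {
        return '';
    }
    return substr($haystack, $from, $to - $from);
}

// Shortened sample modelled on the viewer script quoted earlier.
// Single quotes keep the backslashes literal.
$js = 'gviewApp.setDisplayData({svUrl:\'?url\75http://www.example.com/test.docx\46docid\75abc123\46chan\75xyz\46a\75sv\',biUrl:\'\'});';
echo extractBetween($js, "{svUrl:'", '\46chan'), "\n";
// prints ?url\75http://www.example.com/test.docx\46docid\75abc123
```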
The raw URL begins with "?url\75http://www...". This is obviously the url parameter in the final image's URL, so we need to make the beginning look like this instead: "?url=http://www...". A little later, we see a "\46". This is supposed to be an ampersand that introduces the next parameter, which is the most important one: docid. So this last function takes the generated URL and makes it usable. The function, called doURL, takes 2 parameters: $url_final and $width. $url_final is the raw URL quoted above. This function will turn
?url\75http://www.patsmitty.com/gview/word_documents/test.docx\46docid\7593c3e45e33f8c096913511ef8fe32a92
into something of this form (shown here for another document):
?url=http://www.patsmitty.com/word_documents/patty.docx&docid=3e4a3f6ecf6625dccc407c11df17dbfc
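function doURL($url_final, $width) {
	// "\75" and "\46" are the escaped forms of "=" and "&" in Google's script.
	$url_final = str_replace("\\75", "=", $url_final);
	$url_final = str_replace("\/", "%2F", $url_final);
	$url_final = str_replace("\\46", "&", $url_final);
	// Prepend the viewer address, then append the image parameters.
	$url_final = "http://docs.google.com/gview" . $url_final;
	$url_final = $url_final . "&a=bi&pagenumber=1&w=" . $width;
	return $url_final;
}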
After replacing the escapes, the function adds "http://docs.google.com/gview" to the beginning and "&a=bi&pagenumber=1&w=" . $width to the end. So now our URL looks exactly like it does when we choose "view image" or "copy image location" from the context menu after right-clicking the image of the first page of our msword document in Google docs!
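
The "\75" and "\46" sequences are JavaScript octal escapes: octal 75 is the character code of "=" and octal 46 is the code of "&". As an alternative to replacing each one by hand, the sketch below decodes any such escape in one pass. The name decodeOctalEscapes and the sample string are assumptions for illustration, not part of the original source.

```php
<?php
// Hypothetical helper: decodes JavaScript legacy octal escapes such as
// \75 ("=") and \46 ("&"). An escape starting with 0-3 may have up to three
// digits; one starting with 4-7 has at most two.
function decodeOctalEscapes(string $s): string
{
    return preg_replace_callback(
        '/\\\\([0-3][0-7]{0,2}|[4-7][0-7]?)/',
        function (array $m): string {
            return chr(octdec($m[1]));
        },
        $s
    );
}

// Single quotes keep the backslashes literal, as in the scraped page.
$raw = '?url\75http://www.example.com/test.docx\46docid\7593c3e45e';
echo 'http://docs.google.com/gview' . decodeOctalEscapes($raw) . '&a=bi&pagenumber=1&w=125', "\n";
// prints http://docs.google.com/gview?url=http://www.example.com/test.docx&docid=93c3e45e&a=bi&pagenumber=1&w=125
```

Limiting escapes that start with 4-7 to two digits matters here, because the docid that follows "\75" can itself begin with a digit.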

We're not done though. For some reason, when we loaded the html from the Google docs page, the docid was generated but the image was not. This is probably because something in Google's gview app assumes that the request comes from "docs.google.com" and not from another site like mine: "www.patsmitty.com". So when you try to take that URL and view it, an error 400 springs up. No problem, the next step explains how to bypass this.

2. Force Google docs to load your image

Now, in my example I have an upload form that uploads the msword document and then gets the image's URL via the methods described above. Next I have to get Google docs' gview app to actually render the image before I can use the URL without getting a 400. To do this I load the Google docs URL into the src of a hidden iFrame and wait 5 seconds, which gives Google ample time to generate the image of my document. Once that finishes I call my final script that ties the whole thing together. Create a new file called "getImages.php". It goes through the documents in the given directory and generates the URL for each one. This is a bit redundant: a more efficient way is to create an XML file and store the URLs there when the documents are uploaded. But this is doable. Here is the code for this file:

<?php
// Must define getImageUrl() and its helpers (step 1 names that file scrapeIt.php).
include('msword.php');
if ($handle = opendir('word_documents')) {
	echo '<table>';
	$c = 0; // number of cells written in the current row
    while (false !== ($file = readdir($handle))) {
		// Only handle files whose name contains ".doc" (covers .doc and .docx).
		if(strripos($file, ".doc")!==false):
			// Start a new row after every 4 thumbnails.
			if($c==4):
				echo '</tr>';
				$c=0;
			endif;
			if($c==0):
				echo '<tr>';
			endif;
			$url = getImageUrl('http://www.patsmitty.com/gview/word_documents/'.$file, '125');
			$click_url = 'http://docs.google.com/gview?url=http://www.patsmitty.com/gview/word_documents/'.$file;
			echo '<td><div style="padding-left:5px;padding-top:5px;" class="imgdiv"><img class="docs" onmouseover="this.style.cursor=\'pointer\'" onclick="window.location=\''.$click_url.'\'" src="'.$url.'" /></div></td>';
			$c++;
		endif;
    }
	// Close the last row if one is still open.
	if($c>0):
		echo '</tr>';
	endif;
	echo '</table>';
    closedir($handle);
}

?>
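
The text before the listing mentions storing the URLs in an XML file at upload time instead of regenerating them on every page load. A minimal sketch of that idea, assuming PHP's SimpleXML extension is available, might look like the following. The function names, the element names (documents, document, url, image) and the temporary file path are all hypothetical, and error handling for unreadable or malformed files is left out.

```php
<?php
// Hypothetical helper: records one document/image URL pair in an XML file,
// creating the file on first use.
function rememberThumbnail(string $xmlPath, string $docUrl, string $imageUrl): void
{
    $xml = is_file($xmlPath)
        ? simplexml_load_file($xmlPath)
        : new SimpleXMLElement('<documents/>');

    $entry = $xml->addChild('document');
    // Assigning through property syntax lets SimpleXML escape "&" in the URLs.
    $entry->url = $docUrl;
    $entry->image = $imageUrl;
    $xml->asXML($xmlPath);
}

// Reads the stored pairs back as [document URL => image URL].
function loadThumbnails(string $xmlPath): array
{
    $map = [];
    foreach (simplexml_load_file($xmlPath)->document as $doc) {
        $map[(string) $doc->url] = (string) $doc->image;
    }
    return $map;
}

$path = sys_get_temp_dir() . '/gview_thumbnails_' . uniqid() . '.xml';
rememberThumbnail(
    $path,
    'http://www.example.com/a.docx',
    'http://docs.google.com/gview?url=http://www.example.com/a.docx&docid=abc&a=bi&pagenumber=1&w=125'
);
print_r(loadThumbnails($path));
unlink($path);
```

With something like this in place, the upload handler would call rememberThumbnail once per document, and the gallery page would only need loadThumbnails.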
That's It!!!
We now have images of our word documents that we can show on our page. Take note that Google Docs supports formats other than .doc and .docx, such as .pdf files and spreadsheets, so these are all feasible with my hack as well. I have only tested .doc, .docx, and .pdf. Also, please look at my source code, as it shows how to connect all these functions together; I am not explaining any of the html, parsing, css, or other php functions used here. I'm simply explaining my hack so you too can enhance your web applications. Also remember to change all the references from "www.patsmitty.com/gview/..." to your respective servers.
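
Since other formats such as .pdf are mentioned as workable, the directory scan could filter on a list of extensions rather than looking for ".doc" anywhere in the file name (which would also match a name like "my.docs.txt"). The sketch below shows one way to do that; the name filterDocuments and the default extension list are assumptions based only on the formats reported as tested above.

```php
<?php
// Hypothetical helper: keeps only file names whose extension is in the allowed list.
function filterDocuments(array $fileNames, array $extensions = ['doc', 'docx', 'pdf']): array
{
    $keep = [];
    foreach ($fileNames as $name) {
        // pathinfo returns the text after the last dot; compare case-insensitively.
        $ext = strtolower(pathinfo($name, PATHINFO_EXTENSION));
        if (in_array($ext, $extensions, true)) {
            $keep[] = $name;
        }
    }
    return $keep;
}

print_r(filterDocuments(['resume.docx', 'notes.DOC', 'report.pdf', 'photo.png', 'my.docs.txt']));
```

In the gallery script, the input list could come from scandir('word_documents') instead of the hard-coded sample.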
If you have any questions please post here or email me at psmith@patsmitty.com and I'll post them here.
gview.zip

Translated from: https://www.experts-exchange.com/articles/6830/Displaying-Microsoft-Word-documents-in-a-Web-Page-as-Images-Programatically.html
