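I recently ran into a requirement: display Word documents with a lot of content (text and tables). On PC you can just use a plugin, or let the user download the file. But what if the requirement is on mobile? Then convert it to HTML... but how do you handle a Word document that runs to dozens of pages? To make life easier for everyone, I spent a few days hacking together a plugin, word-to-html, which can convert nested tables and tables with merged cells. The code is on GitHub.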
emmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm!
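A quick plug for my other projects:
I was short on time, so the code is a bit messy. I'll paste the README here; try it and you'll see. If it solves your pain point, please don't be stingy with a star, and issues are very welcome so we can discuss and improve it together. The tool uses jsdom, and the DOMParser that jsdom emulates is somewhat weak. If you use the browser method I provide instead, you can even convert the different fonts and font sizes of the text within each line of the Word document into matching HTML. The source for the approach that relies on the browser's JS console is in the test/browser folder of the project on GitHub.
Here is the README: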
word-to-html
A tiny tool to convert Microsoft Word documents to HTML in Node.js and in Chrome.
You can use it to convert tables with merged cells and nested tables to an HTML file in Node.js or Chrome; the online tool wordhtml cannot do this.
Beyond that, in Chrome you can convert a line whose words use different font-family or font-size values to an HTML string.
table example
Attention
If a line of text in your .docx mixes different font-family or font-size values, the conversion
in Node.js will not produce the expected HTML, because the npm package jsdom does not fully
implement DOMParser. This can be worked around in browsers such as Chrome.
So if you want font-family and font-size converted exactly, see how to use word2html.js in browsers!
Install
npm i word-to-html --save-dev
or
yarn add word-to-html
api in nodejs: word2html(absPath [,options,env])
absPath: string | Array
absPath is your file's absolute path
options: {tdTextAlign:string,tdVerticalAlign:string}
tdTextAlign controls the <td> tag's text-align
tdVerticalAlign controls the <td> tag's vertical-align
env: 'node' | 'browser'
The default value is 'node'. If you want to convert font-family/font-size/color, env must be 'browser'; you then get a .html file. Open that .html file in Chrome and a string is printed in the console panel; that string is the result you want.
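As a rough illustration of the signature described above, here is a minimal sketch of an argument check you could run before calling word2html. The helper name buildArgs is hypothetical and not part of word-to-html; the allowed option keys and env values are taken only from this README, so the package itself may accept more than this sketch allows.

```javascript
const path = require('path');

// Values listed in the README; the real package may accept others (assumption).
const ALLOWED_ENVS = ['node', 'browser'];
const ALLOWED_OPTION_KEYS = ['tdTextAlign', 'tdVerticalAlign'];

// Hypothetical helper, not part of word-to-html: validates the arguments and
// returns them as an array that can be spread into word2html(...).
function buildArgs(absPath, options = {}, env = 'node') {
  // absPath may be a single string or an array; normalize for checking only.
  const paths = Array.isArray(absPath) ? absPath : [absPath];
  for (const p of paths) {
    if (typeof p !== 'string' || !path.isAbsolute(p)) {
      throw new TypeError('absPath must be an absolute path: ' + String(p));
    }
  }
  for (const key of Object.keys(options)) {
    if (!ALLOWED_OPTION_KEYS.includes(key)) {
      throw new TypeError('unknown option: ' + key);
    }
  }
  if (!ALLOWED_ENVS.includes(env)) {
    throw new RangeError("env must be 'node' or 'browser'");
  }
  return [absPath, options, env];
}

// Usage (assumes word-to-html is installed):
// const word2html = require('word-to-html');
// word2html(...buildArgs(path.join(__dirname, 'test.docx'), { tdVerticalAlign: 'top' }));
```

Failing early on a relative path or a misspelled option name gives a clearer error than a conversion that silently produces nothing useful.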
Usage in nodejs
var path = require('path');
var word2html = require('word-to-html');
//Word document's absolute path
var absPath = path.join(__dirname,'test.docx');
word2html(absPath,{tdVerticalAlign:'top'})
The HTML file is generated in your workspace.
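Since absPath can also be an Array, a batch conversion might look like the sketch below. The helper collectDocxPaths is hypothetical, and the sketch assumes that the Array form is a list of absolute path strings, which the README does not spell out.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper, not part of word-to-html: returns the absolute paths
// of all .docx files directly inside `dir`, sorted by file name.
function collectDocxPaths(dir) {
  return fs
    .readdirSync(dir, { withFileTypes: true })
    .filter((entry) => entry.isFile())
    .map((entry) => entry.name)
    .filter((name) => path.extname(name).toLowerCase() === '.docx')
    // Word writes "~$name.docx" lock files while a document is open; skip them.
    .filter((name) => !name.startsWith('~$'))
    .sort()
    .map((name) => path.resolve(dir, name));
}

// Usage (assumes word-to-html is installed and accepts an array of paths):
// const word2html = require('word-to-html');
// word2html(collectDocxPaths(__dirname), { tdVerticalAlign: 'top' });
```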
Usage in browsers
step 1: run the code below in Node.js
var path = require('path');
var word2html = require('word-to-html');
//Word document's absolute path
var absPath = path.join(__dirname,'test.docx');
word2html(absPath,{tdVerticalAlign:'top'},'browser')
step 2: get the html string in your browser
Open the HTML file generated just now and copy the result string from the console panel into your HTML template; the content of your .docx file will then appear in your template.
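If you would rather not paste the string by hand, step 2 can be scripted. The sketch below assumes you have saved the copied console string to a variable or file yourself; the fillTemplate helper and the <!-- WORD_CONTENT --> placeholder are hypothetical names chosen for illustration, not something word-to-html provides.

```javascript
// Hypothetical placeholder marker; any unique string in your template works.
const PLACEHOLDER = '<!-- WORD_CONTENT -->';

// Hypothetical helper: inserts the converted HTML string into a template
// at the first occurrence of the placeholder.
function fillTemplate(template, htmlString) {
  if (!template.includes(PLACEHOLDER)) {
    throw new Error('template has no ' + PLACEHOLDER + ' placeholder');
  }
  // A replacer function keeps sequences such as "$&" or "$1" in htmlString
  // literal; a plain string replacement would treat them as patterns.
  return template.replace(PLACEHOLDER, () => htmlString);
}

// Usage:
// const page = fillTemplate(
//   '<html><body><!-- WORD_CONTENT --></body></html>',
//   copiedStringFromConsole
// );
```

The replacer function matters for documents that contain dollar signs, for example price tables, which would otherwise be mangled during replacement.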