dom to html,Convert an HTML file to a DOM document

mlreportgen.dom.HTMLFile class

Package: mlreportgen.dom

Convert an HTML file to a DOM document

Description

Converts the contents of an HTML file to an mlreportgen.dom.HTMLFile object containing DOM objects having the same content and format. You can append the HTMLFile object to a DOM document of any type, including Word and PDF documents.

Construction

htmlFileObj = HTMLFile(htmlFile) converts the HTML file to an HTMLFile object containing DOM objects having the same content and format.

An HTMLFile object supports these HTML elements and attributes. In addition, HTMLFile objects accept HTML that contains custom CSS properties, which begin with a hyphen. Custom CSS properties are supported in HTML, Microsoft® Word, and PDF output.

HTML Element              Attributes
a                         class, style, href, name
address                   class, style
b                         class, style
big                       class, style
blockquote                class, style
body                      class, style
br                        n/a
center                    class, style
cite                      class, style
code                      class, style
dd                        class, style
del                       class, style
dfn                       class, style
div                       class, style
dl                        class, style
dt                        class, style
em                        class, style
font                      class, style, color, face, size
h1, h2, h3, h4, h5, h6    class, style, align
hr                        class, style, align
i                         class, style
ins                       class, style
img                       class, style, src, height, width
kbd                       class, style
li                        class, style
mark                      class, style
nobr                      class, style
ol                        class, style
p                         class, style, align
pre                       class, style
s                         class, style
samp                      class, style
small                     class, style
span                      class, style
strike                    class, style
strong                    class, style
sub                       class, style
sup                       class, style
table                     class, style, align, bgcolor, border, cellspacing, cellpadding, frame, rules, width
tbody                     class, style, align, valign
tfoot                     class, style, align, valign
thead                     class, style, align, valign
td                        class, style, bgcolor, height, width, colspan, rowspan, align, valign, nowrap
th                        class, style, bgcolor, height, width, colspan, rowspan, align, valign, nowrap
tr                        class, style, align, bgcolor, valign
tt                        class, style
u                         class, style
ul                        class, style
var                       class, style

These CSS formats are supported:

background-color

border

border-bottom

border-bottom-color

border-bottom-style

border-bottom-width

border-color

border-left

border-left-color

border-left-style

border-left-width

border-right

border-right-color

border-right-style

border-right-width

border-style

border-top

border-top-color

border-top-style

border-top-width

border-width

color

counter-increment

counter-reset

display

font-family

font-size

font-style

font-weight

height

line-height

list-style-type

margin

margin-bottom

margin-left

margin-right

margin-top

padding

padding-bottom

padding-left

padding-right

padding-top

text-align

text-decoration

text-indent

vertical-align

white-space

width

Input Arguments

htmlFile — HTML file path

character vector

HTML file path, specified as a character vector.

Properties

Note

For HTML markup to display correctly in your report, you must include end tags for empty elements and enclose attribute values in quotation marks. If you want to show a reserved XML markup character as text, you must use its equivalent named or numeric XML character.

Reserved Character    Description              Equivalent Character
<                     Less than                &lt;
>                     Greater than             &gt;
&                     Ampersand                &amp;
"                     Double quotation mark    &quot;
'                     Single quotation mark    &apos;
%                     Percent                  &#37;
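The substitutions in the table can be applied to plain text before it is placed inside HTML markup. The following is a minimal sketch using only base MATLAB string functions; the variable names and the sample text are illustrative assumptions, not part of the HTMLFile API.

```matlab
% Sample text containing reserved characters (illustrative only).
txt = '5 > 3 & "50%" isn''t less';

% Replace the ampersand first so that the ampersands introduced by the
% later replacements are not escaped a second time.
esc = strrep(txt, '&', '&amp;');
esc = strrep(esc, '<', '&lt;');
esc = strrep(esc, '>', '&gt;');
esc = strrep(esc, '"', '&quot;');
esc = strrep(esc, '''', '&apos;');
esc = strrep(esc, '%', '&#37;');

% Wrap the escaped text in a paragraph with an explicit end tag.
htmlText = ['<p>' esc '</p>'];
```

The order matters only for the ampersand; the remaining replacements are independent of each other.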

Id — ID for HTMLFile object

character vector

A session-unique ID is generated as part of HTMLFile object creation. You can specify an ID to replace the generated ID.

HTMLTag — HTML tag name of HTML container element

'div' (default) | character vector

Tag name of the HTML container element corresponding to this HTMLFile object, specified as a character vector, such as 'div', 'section', or 'article'. This property applies only to HTML output.
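As an illustration of the HTMLTag property, the sketch below writes a small sample file, converts it, and requests a section container instead of the default div. The file name sectionSample.html, its content, and the report name TagReport are assumptions made for this example.

```matlab
import mlreportgen.dom.*

% Write a small sample file (hypothetical name and content).
fid = fopen('sectionSample.html','w');
fprintf(fid,'%s','<p>Hello World</p>');
fclose(fid);

rpt = Document('TagReport','html');          % HTMLTag applies only to HTML output
htmlFile = HTMLFile('sectionSample.html');   % convert the file to DOM objects
htmlFile.HTMLTag = 'section';                % use 'section' instead of the default 'div'
append(rpt,htmlFile);
close(rpt);
```

For Word or PDF output the property has no effect, so setting it is only useful when the document type is 'html'.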

Children — Children of this HTMLFile object

cell array of mlreportgen.dom.Element objects

This read-only property lists child elements that the HTMLFile object contains.

Parent — Parent of this HTMLFile object

a DOM object

This read-only property lists the parent of this HTMLFile object.

Style — Formatting to apply to HTMLFile object

cell array of format objects

Formatting to apply to the HTMLFile object, specified as a cell array of DOM format objects. The children of this HTMLFile object inherit any of these formats that they do not override.

StyleName — Style name of HTMLFile object

character vector

Style name of this HTMLFile object, specified as a character vector. Use a name of a style specified in the style sheet of the document to which this HTMLFile object is appended. The specified style defines the appearance of the HTMLFile object in the output document where not overridden by the formats specified by the Style property of this HTMLFile object.

Tag — Tag for HTMLFile object

character vector

Tag for HTMLFile object, specified as a character vector.

A session-unique tag is generated as part of HTMLFile object creation. The generated tag has the form CLASS:ID, where CLASS is the class of the element and ID is the value of the Id property of the object. You can specify a tag to replace the generated tag.

Specify your own tag value, for example, to make it easier to identify where an issue occurred during document generation.

Note

HTMLFile ignores the KeepInterElementWhiteSpace property. If you want to preserve white space, use fileread to read your HTML file as text and then follow the procedure described for the KeepInterElementWhiteSpace property of mlreportgen.dom.HTML.
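The workaround in this note might look like the following sketch. It assumes that the procedure for mlreportgen.dom.HTML is to create an empty HTML object, set KeepInterElementWhiteSpace to true, and only then append the markup; the file name spaced.html, its content, and the report name are hypothetical.

```matlab
import mlreportgen.dom.*

% Write a sample file whose inline elements are separated by a space
% (hypothetical name and content).
fid = fopen('spaced.html','w');
fprintf(fid,'%s','<p><b>Hello</b> <i>World</i></p>');
fclose(fid);

htmlText = fileread('spaced.html');          % read the HTML file as text

htmlObj = HTML();                            % start from an empty HTML object
htmlObj.KeepInterElementWhiteSpace = true;   % assumed: set before appending markup
append(htmlObj,htmlText);                    % convert the text to DOM objects

rpt = Document('WhiteSpaceReport','docx');
append(rpt,htmlObj);
close(rpt);
```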

Methods

append — Append HTML to HTMLFile object

Examples

Convert HTML File to a Word Report

Create a text file named myHTML.html and save it in the current folder. Add this text into the file:

Hello World

This is me speaking
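The file can also be created from MATLAB. The sketch below assumes that each line of text is wrapped in a paragraph element; that markup is an assumption made for illustration, since only the text content is given above.

```matlab
% Create myHTML.html in the current folder.
% Assumed markup: one <p> element per line of text.
fid = fopen('myHTML.html','w');
fprintf(fid,'%s\n','<p>Hello World</p>','<p>This is me speaking</p>');
fclose(fid);
```

Both paragraphs carry explicit end tags, which the HTML parser requires.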

To convert the myHTML.html file to a Word report, run these commands:

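import mlreportgen.dom.*;
rpt = Document('MyReport','docx');     % create a Word (docx) document
htmlFile = HTMLFile('myHTML.html');    % convert the HTML file to DOM objects
append(rpt,htmlFile);                  % add the converted content to the report
close(rpt);                            % write the report to disk
rptview(rpt.OutputPath);               % open the generated report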

The resulting Word report contains the text that you specified in the HTML file.


Tips

MATLAB® Report Generator™ mlreportgen.dom.HTML or mlreportgen.dom.HTMLFile objects typically cannot accept the raw HTML output of third-party applications, such as Microsoft Word, that export native documents as HTML markup. In these cases, your Report API report generation program can use the mlreportgen.utils.html2dom.prepHTMLString and mlreportgen.utils.html2dom.prepHTMLFile functions to prepare the raw HTML for use with the mlreportgen.dom.HTML or mlreportgen.dom.HTMLFile objects. Typically, your program will have to further process the prepared HTML to remove valid but undesirable objects, such as line feeds that were in the raw content.
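A minimal sketch of that preparation step follows. It assumes that prepHTMLFile takes a file path and returns the prepared markup as text; the file name rawExport.html stands in for the output of a third-party application, and collapsing line feeds with regexprep is just one possible form of further processing.

```matlab
import mlreportgen.dom.*

% Stand-in for HTML exported by a third-party application (hypothetical file).
fid = fopen('rawExport.html','w');
fprintf(fid,'%s\n','<html><body>','<p>Exported','text</p>','</body></html>');
fclose(fid);

% Prepare the raw HTML for use with the DOM API (assumed to return text).
prepped = mlreportgen.utils.html2dom.prepHTMLFile('rawExport.html');

% Optional further processing: collapse line feeds left over from the raw content.
prepped = regexprep(prepped,'[\r\n]+',' ');

rpt = Document('PreparedReport','docx');
append(rpt,HTML(prepped));       % convert the prepared markup to DOM objects
close(rpt);
```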

Word and PDF documents require inline elements, such as text and links, to be contained in a paragraph. To meet this requirement, the HTML parser creates wrapper paragraphs to contain inline elements that are not already in a paragraph. If you create an mlreportgen.dom.HTML or mlreportgen.dom.HTMLFile object from HTML that contains inline elements that are not in paragraphs and add the object to an HTML document, the generated HTML can differ from the input HTML. To generate the inline elements without the added wrapper paragraphs, insert the HTML markup into an HTML document by using an mlreportgen.dom.RawText object.
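For example, inline markup could be passed through unchanged as sketched below. The markup and the report name InlineReport are illustrative assumptions.

```matlab
import mlreportgen.dom.*

rpt = Document('InlineReport','html');

% Inline elements that are not inside a paragraph. RawText inserts the
% markup as is, so no wrapper paragraph is added.
markup = '<b>Hello</b> <i>World</i>';
append(rpt,RawText(markup,'html'));

close(rpt);
```

Because RawText bypasses the HTML parser, the markup is not checked or converted, and it appears only in output of the matching document type.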

By default, the DOM API uses a base font size of 12 points to convert em units to actual font sizes. For example, a font size specified as 2em converts to 24 points. To specify a different base font size, add your content to a report by using an mlreportgen.dom.HTML object. Set the EMBaseFontSize property of the object to the base font size. For example, if you set the EMBaseFontSize property to 14, a font size of 2em converts to 28 points.
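The sketch below illustrates this tip under the assumption that EMBaseFontSize must be set on an empty HTML object before the markup is appended, so that the conversion uses the new base size. The markup and the report name are hypothetical.

```matlab
import mlreportgen.dom.*

htmlObj = HTML();                 % start from an empty HTML object
htmlObj.EMBaseFontSize = 14;      % base size in points used for em conversion

% With a base of 14 points, 2em is expected to convert to 2 * 14 = 28 points.
append(htmlObj,'<p style="font-size:2em">Large text</p>');

rpt = Document('EmReport','docx');
append(rpt,htmlObj);
close(rpt);
```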

Introduced in R2015a
