
Using regular expressions in XSLT 2.0 by example: analyze-string and tokenize()


One major improvement of XSLT 2.0 over XSLT 1.0 is its much stronger support for regular expressions.

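This post mainly covers the <xsl:analyze-string select="" regex=""> element.

The select attribute supplies the string to be matched, and regex supplies the regular expression.

The processor splits the input string into substrings that match the regex and substrings that do not. Each matching substring is processed by the <xsl:matching-substring/> child of the element.

Each non-matching substring is processed by the <xsl:non-matching-substring/> child.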
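
As a minimal sketch of that splitting behaviour, the stylesheet below runs analyze-string over a hard-coded string. The sample text and the element names pieces, number and text are assumptions made for illustration and do not come from the example later in this post.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>

  <!-- hypothetical sample input -->
  <xsl:variable name="text" select="'order 12 shipped, order 345 pending'"/>

  <xsl:template name="main">
    <pieces>
      <!-- the regex attribute is an attribute value template, so any
           curly braces in a regex would have to be doubled; \d+ has none -->
      <xsl:analyze-string select="$text" regex="\d+">
        <!-- evaluated once for each substring that matches the regex -->
        <xsl:matching-substring>
          <number><xsl:value-of select="."/></number>
        </xsl:matching-substring>
        <!-- evaluated once for each stretch of text between matches -->
        <xsl:non-matching-substring>
          <text><xsl:value-of select="."/></text>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
    </pieces>
  </xsl:template>
</xsl:stylesheet>
```

Run with main as the initial template, this should produce two number elements (12 and 345) interleaved with three text elements holding the surrounding text.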


The tokenize($content,$token) function splits a source string: $content is the string to split, and $token is a regular expression that matches the separators.
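
A short sketch of tokenize() on its own, assuming a single hard-coded record and a separator regex of a comma followed by optional whitespace. The element names fields and field are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>

  <xsl:template name="main">
    <fields>
      <!-- the second argument is a regex: a comma plus any whitespace after it -->
      <xsl:for-each select="tokenize('Joe, Fawcett, Developer, IT', ',\s*')">
        <!-- inside for-each the context item is the current token -->
        <field><xsl:value-of select="."/></field>
      </xsl:for-each>
    </fields>
  </xsl:template>
</xsl:stylesheet>
```

This yields four field elements: Joe, Fawcett, Developer and IT.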


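Example:

Step A. Read the data from an external file (using the unparsed-text() function).

Step B. Split the data into lines (using the tokenize() function).

Step C. Match each line against the expected format to find the lines that conform and those that do not (using the analyze-string element).

             (Note: conforming lines are serialized as XML; non-conforming lines are written to the output as comments.)

Step D. Sort the data and serialize it again (using the for-each-group and result-document elements).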


Data source: the empolyees.csv file (its ninth line does not follow the format).

Joe, Fawcett, Developer, IT  
Max, Bialystock, CEO, Management  
Phineas, Barnum, Head of Sales, Sales and Marketing  
Leo, Bloom, Auditor, Accounts  
Danny, Ayers, Developer, IT  
Carmen, Ghia, PA to the VP of Products, Management  
Ulla, Anderson, Head of Promotions, Sales and Marketing  
Grace, Hopper, Developer, IT  
<!--This line is invalid line-->
Bob, Cratchit, Bookkeeper, Accounts  
Charles, Babbage, Head of Infrastructure, IT  
Roger, De Bris, VP of Products, Management  
Willy, Loman, Salesman, Sales and Marketing  
Franz, Liebkind, Developer, IT  
Luca, Pacioli, Accountant, Accounts  
Lorenzo, St. DuBois, Project Manager, IT  

XSLT file: empolyees.xslt

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<xsl:stylesheet version="2.0"
				xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
				xmlns:xs="http://www.w3.org/2001/XMLSchema"
				xmlns:myFunction="http://www.ricky.com/myFunction"
				exclude-result-prefixes="xs myFunction">
	<xsl:output indent="yes"/>
	<!-- split the file into lines; \r?\n also handles Windows (CRLF) line endings -->
	<xsl:variable name="empolyeesInfo" select="tokenize(unparsed-text('empolyees.csv'),'\r?\n')" as="xs:string*"/>

	<xsl:template name="main">
		<xsl:call-template name="createEmpolyeeFile"/>
		<xsl:call-template name="sortAndResaveTheEmpolyees"/>
	</xsl:template>

	<xsl:template name="createEmpolyeeFile">
		<xsl:variable name="regex" select="'^\s*([^,]+)\s*,\s*([^,]+)\s*,\s*([^,]+)\s*,\s*([^,]+)\s*$'" as="xs:string"/>
		<xsl:result-document href="empolyees.xml">
			<empolyees>
				<xsl:for-each select="$empolyeesInfo">
					<xsl:variable name="data" select="." as="xs:string"/>
					<xsl:variable name="position" select="position()" as="xs:integer"/>
					<xsl:analyze-string select="$data" regex="{$regex}">
						<!-- the data must match the regex-expression  -->
						<xsl:matching-substring>
							<xsl:call-template name="saveSeparateEmpolyee">
								<xsl:with-param name="firstName" select="regex-group(1)" as="xs:string"/>
								<xsl:with-param name="lastName" select="regex-group(2)" as="xs:string"/>
								<xsl:with-param name="jobTitle" select="regex-group(3)" as="xs:string"/>
								<!-- strip trailing whitespace (including any line-ending character) from the last field -->
								<xsl:with-param name="department" select="myFunction:replaceAll(regex-group(4),'\s+$','')" as="xs:string"/>
							</xsl:call-template>
							<xsl:call-template name="createEmpolyee">
								<xsl:with-param name="firstName" select="regex-group(1)" as="xs:string"/>
								<xsl:with-param name="lastName" select="regex-group(2)" as="xs:string"/>
								<xsl:with-param name="jobTitle" select="regex-group(3)" as="xs:string"/>
								<!-- strip trailing whitespace (including any line-ending character) from the last field -->
								<xsl:with-param name="department" select="myFunction:replaceAll(regex-group(4),'\s+$','')" as="xs:string"/>
							</xsl:call-template>
						</xsl:matching-substring>
						<xsl:non-matching-substring>
							<xsl:call-template name="invalidData">
								<xsl:with-param name="data" select="$data"/>
								<xsl:with-param name="position" select="$position"/>
							</xsl:call-template>
						</xsl:non-matching-substring>
					</xsl:analyze-string>
				</xsl:for-each>
			</empolyees>
		</xsl:result-document>
	</xsl:template>

	<xsl:template name="sortAndResaveTheEmpolyees">
		
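		<!-- caution: this reads empolyees.xml, which createEmpolyeeFile writes in the same run; XSLT 2.0 treats reading and writing one resource in a single transformation as an error, so this depends on the processor's evaluation order -->
		<xsl:variable name="empolyees" select="document('empolyees.xml')"/>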
		<xsl:result-document href="empolyeesAfterSort.xml">
			<empolyees>
				<xsl:for-each-group select="$empolyees/empolyees/empolyee" group-by="@department">
					<xsl:sort select="@department" data-type="text"/>
					<department department="{current-grouping-key()}">
						<xsl:for-each select="current-group()">
							<xsl:sort select="@firstName" data-type="text"/>
							<xsl:sort select="@lastName" data-type="text"/>
							<empolyee>
								<firstName>
									<xsl:value-of select="@firstName"/>
								</firstName>
								<lastName>
									<xsl:value-of select="@lastName"/>
								</lastName>
								<jobTitle>
									<xsl:value-of select="@jobTitle"/>
								</jobTitle>
							</empolyee>
						</xsl:for-each>
					</department>
				</xsl:for-each-group>
			</empolyees>
		</xsl:result-document>
	</xsl:template>

	<xsl:template name="saveSeparateEmpolyee">
		<!-- save each empolyee in separate file. firstName-lastName.xml -->
		<xsl:param name="firstName" as="xs:string"/>
		<xsl:param name="lastName" as="xs:string"/>
		<xsl:param name="jobTitle" as="xs:string"/>
		<xsl:param name="department" as="xs:string"/>
		<xsl:result-document href="{concat($firstName,'-',$lastName)}.xml">
			<empolyee firstName="{$firstName}" lastName="{$lastName}" department="{$department}" jobTitle="{$jobTitle}"/>
		</xsl:result-document>

	</xsl:template>

	<xsl:template name="createEmpolyee">
		<xsl:param name="firstName" as="xs:string"/>
		<xsl:param name="lastName" as="xs:string"/>
		<xsl:param name="jobTitle" as="xs:string"/>
		<xsl:param name="department" as="xs:string"/>
		<empolyee firstName="{$firstName}" lastName="{$lastName}" department="{$department}" jobTitle="{$jobTitle}"/>
	</xsl:template>

	<xsl:template name="invalidData">
		<xsl:param name="data" as="xs:string"/>
		<xsl:param name="position" as="xs:integer"/>
		<xsl:comment>
			<xsl:value-of select="concat('Found invalid content.line:',$position,' content: ',$data)"/>
		</xsl:comment>
	</xsl:template>

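	<!-- helper function: replaces every match of $regex in $content with $replacement; $regex must not match the empty string, or the recursion never ends -->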
	<xsl:function name="myFunction:replaceAll">
		<xsl:param name="content" as="xs:string"/>
		<xsl:param name="regex" as="xs:string"/>
		<xsl:param name="replacement" as="xs:string"/>
		<xsl:analyze-string select="$content" regex="^(.*?){$regex}(.*)$">
			<xsl:matching-substring>
				<xsl:variable name="front" as="xs:string">
					<xsl:value-of select="regex-group(1)"/>
				</xsl:variable>
				<xsl:variable name="after" as="xs:string">
					<xsl:value-of select="regex-group(2)"/>
				</xsl:variable>
				<xsl:variable name="newContent" select="concat($front,$replacement,$after)"/>
				<xsl:value-of select="myFunction:replaceAll($newContent,$regex,$replacement)"/>
			</xsl:matching-substring>
			<xsl:non-matching-substring>
				<xsl:value-of select="$content"/>
			</xsl:non-matching-substring>
		</xsl:analyze-string>
	</xsl:function>
</xsl:stylesheet>
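
The stylesheet above writes empolyees.xml and then reads it back in the same run. One way to avoid depending on that ordering, sketched here under assumptions rather than as the original method, is to keep the parsed records in a variable and group from there. The variable names lines and parsed are hypothetical, the per-employee files and the invalid-line comments are left out for brevity, and the regex uses reluctant quantifiers so that the captured groups come out already trimmed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                exclude-result-prefixes="xs">
  <xsl:output indent="yes"/>

  <!-- one string per line of the CSV file -->
  <xsl:variable name="lines" as="xs:string*"
                select="tokenize(unparsed-text('empolyees.csv'), '\r?\n')"/>

  <!-- temporary tree: one <empolyee> per line that matches the format;
       lines that do not match are silently skipped in this sketch -->
  <xsl:variable name="parsed">
    <xsl:for-each select="$lines">
      <xsl:analyze-string select="."
          regex="^\s*([^,]+?)\s*,\s*([^,]+?)\s*,\s*([^,]+?)\s*,\s*([^,]+?)\s*$">
        <xsl:matching-substring>
          <empolyee firstName="{regex-group(1)}" lastName="{regex-group(2)}"
                    jobTitle="{regex-group(3)}" department="{regex-group(4)}"/>
        </xsl:matching-substring>
      </xsl:analyze-string>
    </xsl:for-each>
  </xsl:variable>

  <xsl:template name="main">
    <empolyees>
      <!-- group straight from the variable instead of re-reading a file -->
      <xsl:for-each-group select="$parsed/empolyee" group-by="@department">
        <xsl:sort select="@department"/>
        <department department="{current-grouping-key()}">
          <xsl:for-each select="current-group()">
            <xsl:sort select="@firstName"/>
            <xsl:sort select="@lastName"/>
            <xsl:copy-of select="."/>
          </xsl:for-each>
        </department>
      </xsl:for-each-group>
    </empolyees>
  </xsl:template>
</xsl:stylesheet>
```

Because the grouping step only touches the in-memory tree, it no longer matters when, or whether, a result document is written to disk.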

The Saxon processor is used to run the transformation.

Command: java net.sf.saxon.Transform -it:main -xsl:empolyees.xslt
