The difference between text() and string() in XPath

<table style="WIDTH: 95.45%; BORDER-COLLAPSE: collapse; EMPTY-CELLS: show; MARGIN-LEFT: 4.55%; MARGIN-TOP: 2pt" cellspacing="0" cellpadding="4">
  <tbody>
  <tr style="PAGE-BREAK-INSIDE: avoid">
    <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 1.39%; VERTICAL-ALIGN: top; WHITE-SPACE: nowrap; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 0pt; LINE-HEIGHT: 13pt; PADDING-RIGHT: 2pt"></td>
    <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 98.61%; VERTICAL-ALIGN: top; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 2pt; LINE-HEIGHT: 13pt">Delaware 
      VIP<sup style="FONT-SIZE: 85%; VERTICAL-ALIGN: text-top; TEXT-TRANSFORM: none; FONT-STYLE: normal"><font style="PADDING-LEFT: 1pt"></font>®</sup> Diversified Income Series 
      (Service Class): Maximum long-term total return consistent with reasonable 
      risk. </td></tr>
  <tr style="PAGE-BREAK-INSIDE: avoid">
    <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 1.39%; VERTICAL-ALIGN: top; WHITE-SPACE: nowrap; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 0pt; LINE-HEIGHT: 13pt; PADDING-RIGHT: 2pt"></td>
    <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 98.61%; VERTICAL-ALIGN: top; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 2pt; LINE-HEIGHT: 13pt">Delaware 
      VIP<sup style="FONT-SIZE: 85%; VERTICAL-ALIGN: text-top; TEXT-TRANSFORM: none; FONT-STYLE: normal"><font style="PADDING-LEFT: 1pt"></font>®</sup> Emerging Markets Series (Service 
      Class): Long-term capital appreciation. </td></tr>
  <tr style="PAGE-BREAK-INSIDE: avoid">
    <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 1.39%; VERTICAL-ALIGN: top; WHITE-SPACE: nowrap; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 0pt; LINE-HEIGHT: 13pt; PADDING-RIGHT: 2pt"></td>
    <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 98.61%; VERTICAL-ALIGN: top; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 2pt; LINE-HEIGHT: 13pt">Delaware 
      VIP<sup style="FONT-SIZE: 85%; VERTICAL-ALIGN: text-top; TEXT-TRANSFORM: none; FONT-STYLE: normal"><font style="PADDING-LEFT: 1pt"></font>®</sup> Limited-Term Diversified Income 
      Series (Service Class): Maximum total return, consistent with reasonable 
      risk. </td></tr>
  <tr style="PAGE-BREAK-INSIDE: avoid">
    <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 1.39%; VERTICAL-ALIGN: top; WHITE-SPACE: nowrap; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 0pt; LINE-HEIGHT: 13pt; PADDING-RIGHT: 2pt"></td>
    <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 98.61%; VERTICAL-ALIGN: top; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 2pt; LINE-HEIGHT: 13pt">Delaware 
      VIP<sup style="FONT-SIZE: 85%; VERTICAL-ALIGN: text-top; TEXT-TRANSFORM: none; FONT-STYLE: normal"><font style="PADDING-LEFT: 1pt"></font>®</sup> REIT Series (Service Class): 
      Maximum long-term total return, with capital appreciation as a secondary 
      objective. </td></tr>
  <tr style="PAGE-BREAK-INSIDE: avoid">
    <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 1.39%; VERTICAL-ALIGN: top; WHITE-SPACE: nowrap; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 0pt; LINE-HEIGHT: 13pt; PADDING-RIGHT: 2pt"></td>
    <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 98.61%; VERTICAL-ALIGN: top; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 2pt; LINE-HEIGHT: 13pt">Delaware 
      VIP<sup style="FONT-SIZE: 85%; VERTICAL-ALIGN: text-top; TEXT-TRANSFORM: none; FONT-STYLE: normal"><font style="PADDING-LEFT: 1pt"></font>®</sup> Small Cap Value Series (Service 
      Class): Capital appreciation. </td></tr>
  <tr style="PAGE-BREAK-INSIDE: avoid">
    <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 1.39%; VERTICAL-ALIGN: top; WHITE-SPACE: nowrap; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 0pt; LINE-HEIGHT: 13pt; PADDING-RIGHT: 2pt"></td>
    <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 98.61%; VERTICAL-ALIGN: top; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 2pt; LINE-HEIGHT: 13pt">Delaware 
      VIP<sup style="FONT-SIZE: 85%; VERTICAL-ALIGN: text-top; TEXT-TRANSFORM: none; FONT-STYLE: normal"><font style="PADDING-LEFT: 1pt"></font>®</sup> Smid Cap Core Series (Service 
      Class): Long-term capital appreciation. </td></tr>
  <tr style="PAGE-BREAK-INSIDE: avoid">
    <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 1.39%; VERTICAL-ALIGN: top; WHITE-SPACE: nowrap; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 0pt; LINE-HEIGHT: 13pt; PADDING-RIGHT: 2pt"></td>
    <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 98.61%; VERTICAL-ALIGN: top; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 2pt; LINE-HEIGHT: 13pt">Delaware 
      VIP<sup style="FONT-SIZE: 85%; VERTICAL-ALIGN: text-top; TEXT-TRANSFORM: none; FONT-STYLE: normal"><font style="PADDING-LEFT: 1pt"></font>®</sup> U.S. Growth Series (Service 
      Class): Long-term capital appreciation. </td></tr>
  <tr style="PAGE-BREAK-INSIDE: avoid">
    <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 1.39%; VERTICAL-ALIGN: top; WHITE-SPACE: nowrap; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; PADDING-BOTTOM: 0pt; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 0pt; LINE-HEIGHT: 13pt; PADDING-RIGHT: 2pt"></td>
    <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 98.61%; VERTICAL-ALIGN: top; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; PADDING-BOTTOM: 0pt; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 2pt; LINE-HEIGHT: 13pt">Delaware 
      VIP<sup style="FONT-SIZE: 85%; VERTICAL-ALIGN: text-top; TEXT-TRANSFORM: none; FONT-STYLE: normal"><font style="PADDING-LEFT: 1pt"></font>®</sup> Value Series (Service Class): 
      Long-term capital appreciation. </td></tr></tbody></table>

In the HTML table above, we want to extract the text content of the second td in each tr. The first XPath expression that came to mind was:

//td[contains(text(),':') and contains(text(),'(') and contains(text(),')') and (contains(text(),'Class') or contains(text(),'Shares'))]

This expression matched nothing. After replacing the text() function with string(), the cells were matched:

//td[contains(string(),':') and contains(string(),'(') and contains(string(),')') and (contains(string(),'Class') or contains(string(),'Shares'))]

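In the source document, the text of some td cells spans several lines and is interleaved with child tags such as sup. A child tag splits the cell's text into several text nodes, and in XPath 1.0 a call like contains(text(), ':') converts only the first of those text nodes to a string. Here that first node is just "Delaware VIP", which contains no colon, so the predicate fails. string() instead returns the text of all descendant nodes concatenated into a single string, so the predicate sees the whole cell. This is enough for the vast majority of cases.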
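The difference can be illustrated outside an XPath engine. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree`; it does not evaluate the XPath expressions themselves, it only mirrors what each function would hand to contains(). `td.text` stands in for the first text node that text() yields, and `"".join(td.itertext())` stands in for string(). The `CELL` snippet is a simplified copy of the first table cell with the style attributes dropped, and `matches` is a hypothetical helper that restates the predicate in Python.

```python
import xml.etree.ElementTree as ET

# Simplified copy of the first table cell above (style attributes dropped).
CELL = (
    "<td>Delaware \n      VIP<sup><font></font>\u00ae</sup> Diversified Income Series \n"
    "      (Service Class): Maximum long-term total return consistent with reasonable \n"
    "      risk. </td>"
)

td = ET.fromstring(CELL)

# Stand-in for what contains(text(), ...) tests in XPath 1.0:
# only the first text node, i.e. the text before the <sup> child.
first_text_node = td.text or ""

# Stand-in for string(): the text of every descendant, in document order.
string_value = "".join(td.itertext())


def matches(s):
    """Hypothetical helper: the conditions of the XPath predicate on a Python string."""
    return ":" in s and "(" in s and ")" in s and ("Class" in s or "Shares" in s)


# Collapse the line breaks and indentation that come with the cell text.
normalized = " ".join(string_value.split())

print(matches(first_text_node))  # False: "Delaware VIP" has no colon or parentheses
print(matches(string_value))     # True: the whole cell text is visible
print(normalized)
```

The last line prints the cell text with its line breaks and indentation collapsed to single spaces, which is usually what one wants to store after extraction.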

 

Reposted from: https://www.cnblogs.com/JTCLASSROOM/p/11023284.html
