Using Regular Expressions in XPath


PresleyR


Article link: https://blog.csdn.net/PresleyR/article/details/105804807


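I have never actually needed this myself, but I am writing it down in case it comes in handy someday.
Say the main content of a page on some site sits at: //*[@id='postmessage_32199']
and the main content of a sibling page sits at: //*[@id='postmessage_32153']
To scrape content like this, plain XPath is enough: //*[starts-with(@id, 'postmessage_')]
or //*[contains(@id, 'postmessage_')]
With lxml you can also use a regular expression inside the XPath, through the EXSLT regular-expressions namespace: doc.xpath(r'//*[re:match(@id, "postmessage_\d+")]', namespaces={"re": "http://exslt.org/regular-expressions"})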
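
To make the difference between the prefix match and the regex match concrete, here is a minimal sketch using lxml. The sample markup is made up for illustration (only the two numeric ids come from the pages above), and it uses EXSLT's re:test rather than re:match, with the pattern anchored so that only ids of the form postmessage_<digits> qualify. Treat it as one possible way to do this, not the only one.

```python
from lxml import html

# Hypothetical sample markup; only the two numeric ids come from the post.
SAMPLE = """
<html><body>
  <div id="postmessage_32199">first post</div>
  <div id="postmessage_32153">second post</div>
  <div id="postmessage_draft">not a numbered post</div>
  <div id="sidebar">unrelated</div>
</body></html>
"""

# EXSLT regular-expressions namespace, which lxml implements.
RE_NS = {"re": "http://exslt.org/regular-expressions"}

doc = html.fromstring(SAMPLE)

# Plain XPath 1.0: matches every id with the prefix,
# including the non-numeric "postmessage_draft".
by_prefix = doc.xpath("//*[starts-with(@id, 'postmessage_')]")

# Regex via re:test, anchored so only "postmessage_<digits>" matches.
# Note the keyword argument is `namespaces` (plural).
by_regex = doc.xpath(
    r'//*[re:test(@id, "^postmessage_\d+$")]',
    namespaces=RE_NS,
)

prefix_ids = [el.get("id") for el in by_prefix]
regex_ids = [el.get("id") for el in by_regex]
print(prefix_ids)
print(regex_ids)
```

The regex version is only worth the extra namespace plumbing when starts-with or contains would match too much, as with the postmessage_draft element here.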