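Regex for a nested UL in C#

I have a nested UL/LI list in my HTML. How can I write a regular expression that matches from an opening ul tag to the end of that same ul node, including any ul nested inside it? In this example I need to get 2 matches.
The first match should be: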
- This is First List
- This is Second List
- This is Second UL First List
- This is Second UL Second List
- This is Third List
The second match should be:
- This is Next List
- This is Test
- This is Third List
- This is Test
My HTML code:
This is First Paragraph
- This is First List
- This is Second List
- This is Second UL First List
- This is Second UL Second List
- This is Third List
This is Second Paragraph
- This is Next List
- This is Test
- This is Third List
- This is Test
Don't use regular expressions to parse HTML. See: http://stackoverflow.com/a/1732454/4664094
[Obligatory link](http://stackoverflow.com/a/1732454/2307070)
You could try the HTML Agility Pack (https://htmlagilitypack.codeplex.com). As the previous commenters pointed out, don't use regex for this.
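
To make the HTML Agility Pack suggestion concrete, here is a minimal sketch of how the two top-level lists could be pulled out without a regex. It assumes the HtmlAgilityPack NuGet package is installed. The markup in `html` is a hypothetical reconstruction, since the tags of the original sample are not shown above; in particular, where the nested ul sits and how the second list is structured are guesses. The XPath `//ul[not(ancestor::ul)]` selects only ul elements that are not inside another ul, so each nested list comes back as part of its parent's `OuterHtml` rather than as a separate match.

```csharp
using System;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is referenced

class Program
{
    static void Main()
    {
        // Hypothetical markup: the original tags are not visible in the question,
        // so this nesting is an assumption made for illustration only.
        string html = @"
<p>This is First Paragraph</p>
<ul>
  <li>This is First List</li>
  <li>This is Second List
    <ul>
      <li>This is Second UL First List</li>
      <li>This is Second UL Second List</li>
    </ul>
  </li>
  <li>This is Third List</li>
</ul>
<p>This is Second Paragraph</p>
<ul>
  <li>This is Next List</li>
  <li>This is Test</li>
  <li>This is Third List</li>
  <li>This is Test</li>
</ul>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Select only the outermost lists: any <ul> with no <ul> ancestor.
        HtmlNodeCollection lists = doc.DocumentNode.SelectNodes("//ul[not(ancestor::ul)]");

        // SelectNodes returns null (not an empty collection) when nothing matches.
        if (lists == null)
        {
            Console.WriteLine("No <ul> elements found.");
            return;
        }

        foreach (HtmlNode ul in lists)
        {
            // OuterHtml includes the <ul> tags and everything nested inside them.
            Console.WriteLine(ul.OuterHtml);
            Console.WriteLine("-----");
        }
    }
}
```

With markup shaped like the sample, this should yield two nodes, one per top-level list. If only the item text is needed, iterating `ul.SelectNodes(".//li")` and reading `InnerText` is one possible variation.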