From "4、简单的网络爬虫例子.docx" (a simple web crawler example), we arrive at the final regular expression:
<div\s+class="(result|result-op)\s+.*?<a.*?>(?<标题>.*?)</a>
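The result it produces is:
Next, this needs to be implemented in VB.NET.
- Click "Use" and select the corresponding high-level language
- Copy the relevant code into the project: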
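Imports System.Text.RegularExpressions

Public Class Form1
    Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
        ' Record the start time so the elapsed time can be shown in the title bar
        Dim t1 As Long = Now.Ticks
        Dim SubjectString As String = TextBox1.Text
        Try
            ' Singleline lets "." match newlines, so one match can span several lines of HTML.
            ' Multiline only changes ^ and $, which this pattern does not use.
            Dim mRegex As New Regex("<div\s+class=""(result|result-op)\s+.*?<a.*?>(?<标题>.*?)</a>", RegexOptions.Singleline Or RegexOptions.Multiline)
            Dim mMatchs As MatchCollection = mRegex.Matches(SubjectString)
            ' MatchCollection is populated lazily; reading Count forces every match
            ' to be found before the timer is stopped
            Dim matchCount As Integer = mMatchs.Count
            ' One tick is 100 ns, so 10000 ticks make one millisecond
            Dim t2 As Long = (Now.Ticks - t1) \ 10000
            Me.Text = t2.ToString()
            For Each item As Match In mMatchs
                ' Add the text captured by the named group "标题" (title)
                ListBox1.Items.Add(item.Groups("标题").Value)
            Next
        Catch ex As ArgumentException
            ' Syntax error in the regular expression
        End Try
    End Sub
End Class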
Result:
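Separately from the form above, the pattern can also be tried without a UI. The following is a minimal console sketch, not part of the original project: the module name `RegexSketch` and the sample HTML are hypothetical stand-ins for the page source that would normally be pasted into `TextBox1`. Only `RegexOptions.Singleline` is used here, on the assumption that the matches need to span line breaks and that the pattern has no `^` or `$` anchors.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RegexSketch
    Sub Main()
        ' Hypothetical sample input: two result blocks, one for each class name
        ' accepted by the pattern ("result" and "result-op")
        Dim html As String =
            "<div class=""result c-container"" id=""1"">" & vbCrLf &
            "  <h3><a href=""http://example.com/a"">First title</a></h3>" & vbCrLf &
            "</div>" & vbCrLf &
            "<div class=""result-op c-container"" id=""2"">" & vbCrLf &
            "  <h3><a href=""http://example.com/b"">Second title</a></h3>" & vbCrLf &
            "</div>"

        ' Same pattern as in the form; the named group "标题" (title) captures the link text
        Dim pattern As String = "<div\s+class=""(result|result-op)\s+.*?<a.*?>(?<标题>.*?)</a>"
        Dim mRegex As New Regex(pattern, RegexOptions.Singleline)

        ' Print the captured title of every match
        For Each m As Match In mRegex.Matches(html)
            Console.WriteLine(m.Groups("标题").Value)
        Next
    End Sub
End Module
```

With this sample input, the loop should print `First title` and `Second title`. Note that `<a.*?>` also accepts any tag whose name merely starts with `a`, so real pages may need a stricter pattern such as `<a\s.*?>`.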