Regex to find a href in HTML: a regular expression for extracting the 'href' value of <a> links

Latest recommended article published on 2023-01-06 18:50:04

梁颖聪



Try this:

using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
using System.Windows.Forms;

public partial class Form1 : Form
{
    public Form1()
    {
        InitializeComponent();
    }

    private void Form1_Load(object sender, EventArgs e)
    {
        // html is the input string to scan for links.
        var res = Find(html);
    }

    public static List<LinkItem> Find(string file)
    {
        List<LinkItem> list = new List<LinkItem>();

        // 1. Find all <a>...</a> elements in the file.
        MatchCollection m1 = Regex.Matches(file, @"(<a.*?>.*?</a>)",
            RegexOptions.Singleline);

        // 2. Loop over each match.
        foreach (Match m in m1)
        {
            string value = m.Groups[1].Value;
            LinkItem i = new LinkItem();

            // 3. Get the href attribute (double-quoted values only).
            Match m2 = Regex.Match(value, @"href=\""(.*?)\""",
                RegexOptions.Singleline);
            if (m2.Success)
            {
                i.Href = m2.Groups[1].Value;
            }

            // 4. Remove inner tags from the text.
            string t = Regex.Replace(value, @"\s*<.+?>\s*", "",
                RegexOptions.Singleline);
            i.Text = t;

            list.Add(i);
        }
        return list;
    }

    public struct LinkItem
    {
        public string Href;
        public string Text;

        public override string ToString()
        {
            return Href + "\n\t" + Text;
        }
    }
}

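Input:

string html = "<a href=\"www.aaa.xx/xx.zz?id=xxxx&name=xxxx\"></a> 2. <a href=\"http://www.aaa.xx/xx.zz?id=xxxx&name=xxxx\"></a>";

Results: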

[0] = {www.aaa.xx/xx.zz?id=xxxx&name=xxxx}

[1] = {http://www.aaa.xx/xx.zz?id=xxxx&name=xxxx}
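The href pattern used in Find only matches double-quoted attribute values and is case-sensitive. As a minimal sketch, assuming you also want to accept single-quoted values and upper-case markup such as `<A HREF='...'>`, a single pattern with a backreference to the opening quote may be enough. The names HrefExtractor and ExtractHrefs are hypothetical and not part of the original answer.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class HrefExtractor
{
    // "<a", any attributes, whitespace, "href", "=", an opening quote (group 1),
    // the value (group 2), then the same quote character again (\1).
    private static readonly Regex HrefPattern = new Regex(
        @"<a\b[^>]*?\shref\s*=\s*(['""])(.*?)\1",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Returns the href value of every <a> tag that has a quoted href attribute.
    public static List<string> ExtractHrefs(string html)
    {
        var hrefs = new List<string>();
        foreach (Match m in HrefPattern.Matches(html))
        {
            hrefs.Add(m.Groups[2].Value);
        }
        return hrefs;
    }
}
```

Calling `HrefExtractor.ExtractHrefs(html)` returns only the link targets as strings. Unquoted attribute values and tags whose attributes contain a literal `>` are not handled by this sketch; an HTML parser is the safer choice for such markup.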

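Scraping HTML links in C#

Scraping HTML extracts the important elements of a page. It has many legitimate uses for webmasters and ASP.NET developers. With the Regex type and WebClient, we can implement HTML screen scraping.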
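The combination of WebClient and Regex described here might look like the following. This is a minimal sketch, assuming the page is reachable over HTTP and that its href values are double-quoted; the names LinkScraper, ScrapeLinks, and ParseLinks are hypothetical. WebClient is the type the text names; newer .NET versions mark it obsolete in favor of HttpClient.

```csharp
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

public static class LinkScraper
{
    // Downloads the page at the given URL and returns the href values found in it.
    public static List<string> ScrapeLinks(string url)
    {
        using (var client = new WebClient())
        {
            string html = client.DownloadString(url);
            return ParseLinks(html);
        }
    }

    // Pure string step, kept separate so it can be exercised without a network.
    public static List<string> ParseLinks(string html)
    {
        var links = new List<string>();

        // Find every opening <a ...> tag.
        MatchCollection anchors = Regex.Matches(html, @"<a\s[^>]*>",
            RegexOptions.IgnoreCase);

        foreach (Match a in anchors)
        {
            // Pull the double-quoted href value out of the tag, if there is one.
            Match href = Regex.Match(a.Value, @"href=\""(.*?)\""",
                RegexOptions.IgnoreCase);
            if (href.Success)
            {
                links.Add(href.Groups[1].Value);
            }
        }
        return links;
    }
}
```

A call such as `LinkScraper.ScrapeLinks("http://example.com/")` (a placeholder URL) would download the page and list its link targets. Anchors without an href attribute are skipped.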

Edited

Another simple approach: you can use the WebBrowser control to read the href from each a tag, for example (see my sample):

public Form1()
{
    InitializeComponent();
    webBrowser1.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(webBrowser1_DocumentCompleted);
}

private void Form1_Load(object sender, EventArgs e)
{
    // Load the markup to parse; html is assumed to be the same input string as in the first example.
    webBrowser1.DocumentText = html;
}

void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    // Collect the href attribute of every <a> element in the loaded document.
    List<string> href = new List<string>();
    foreach (HtmlElement el in webBrowser1.Document.GetElementsByTagName("a"))
    {
        href.Add(el.GetAttribute("href"));
    }
}
