Replacing links in article content with PHP: replace URLs in text with HTML links

Latest recommended article published 2021-04-10 16:14:24

歌酒诗


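Let's look at the requirements. You have some user-supplied plain text, and you want to display it with its URLs turned into hyperlinks.

> The "http://" protocol prefix should be optional.

> Both domain names and IP addresses should be accepted.

> Any valid top-level domain should be accepted, e.g. .aero and .xn--jxalpdlp.

> Port numbers should be allowed.

> URLs must be allowed in normal sentence contexts. For instance, in "Visit stackoverflow.com.", the final period is not part of the URL.

> You probably want to allow "https://" URLs as well, and perhaps other schemes too.

> As always when displaying user-supplied text in HTML, you want to prevent cross-site scripting (XSS). You also want ampersands in URLs to be correctly escaped as &amp;.

> You probably don't need support for IPv6 addresses.

> Edit: As noted in the comments, support for email addresses is definitely a plus.

> Edit: Only plain-text input is supported. HTML tags in the input are not honoured. (The Bitbucket version supports HTML input.)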

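Edit: Check out Bitbucket for the latest version, with support for email addresses, authenticated URLs, URLs in quotes and parentheses, HTML input, and an updated TLD list.

Please use the Bitbucket issue tracker for bug reports and enhancement requests. They are much easier to track that way (and don't clutter up the comment area).

Here's my take: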

$text = <<<EOD
Here are some URLs:
stackoverflow.com/questions/1188129/pregreplace-to-detect-html-php
Here's the answer: http://www.google.com/search?rls=en&q=42&ie=utf-8&oe=utf-8&hl=en. What was the question?
A quick look at http://en.wikipedia.org/wiki/URI_scheme#Generic_syntax is helpful.
There is no place like 127.0.0.1! Except maybe http://news.bbc.co.uk/1/hi/england/surrey/8168892.stm?
Ports: 192.168.0.1:8080, https://example.net:1234/.
Beware of Greeks bringing internationalized top-level domains: xn--hxajbheg2az3al.xn--jxalpdlp.
And remember.Nobody is perfect.
EOD;

// Building blocks of the URL pattern; each one is a capture group.
$rexProtocol = '(https?://)?';
$rexDomain   = '((?:[-a-zA-Z0-9]{1,63}\.)+[-a-zA-Z0-9]{2,63}|(?:[0-9]{1,3}\.){3}[0-9]{1,3})';
$rexPort     = '(:[0-9]{1,5})?';
$rexPath     = '(/[!$-/0-9:;=@_\':;!a-zA-Z\x7f-\xff]*?)?';
$rexQuery    = '(\?[!$-/0-9:;=@_\':;!a-zA-Z\x7f-\xff]+?)?';
$rexFragment = '(#[!$-/0-9:;=@_\':;!a-zA-Z\x7f-\xff]+?)?';

// Solution 1:

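function callback($match)
{
    // Prepend http:// if no protocol specified
    $completeUrl = $match[1] ? $match[0] : "http://{$match[0]}";

    // Link to the complete URL, but display only domain, port and path.
    return '<a href="' . $completeUrl . '">'
        . $match[2] . $match[3] . $match[4] . '</a>';
}

print "<pre>";
// The lookahead lets one trailing punctuation mark follow the URL without becoming part of it.
print preg_replace_callback("&\\b$rexProtocol$rexDomain$rexPort$rexPath$rexQuery$rexFragment(?=[?.!,;:\"]?(\s|$))&",
    'callback', htmlspecialchars($text));
print "</pre>";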

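> To escape special characters properly, the whole text is passed through htmlspecialchars before matching. This is not ideal, because the HTML escaping can cause URL boundaries to be misdetected.

> As the "And remember.Nobody is perfect." line demonstrates (remember.Nobody is treated as a URL because of the missing space), further checking for valid top-level domains may be in order.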
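Solution 1 runs htmlspecialchars over the whole text before matching. The following is a minimal sketch of why that ordering can affect boundary detection. It assumes a simplified pattern (domain part and trailing-punctuation lookahead only) and a hypothetical input string; neither is taken from the original answer.

```php
<?php
// Simplified stand-in for the Solution 1 pattern: domain only, followed by
// the same lookahead (optional punctuation mark, then whitespace or end).
$rex = '~\b((?:[-a-zA-Z0-9]{1,63}\.)+[-a-zA-Z0-9]{2,63})(?=[?.!,;:"]?(\s|$))~';

$raw     = 'Visit "example.com" now';  // hypothetical input
$escaped = htmlspecialchars($raw);     // each double quote becomes &quot;

// In the raw text the closing quote satisfies the lookahead.
$foundInRaw = preg_match($rex, $raw, $m);

// In the escaped text the URL is followed by "&quot;", which the lookahead rejects.
$foundInEscaped = preg_match($rex, $escaped);

printf("raw: %d, escaped: %d\n", $foundInRaw, $foundInEscaped);
```

With this input the pattern matches example.com in the raw text and finds nothing in the escaped text, which is the kind of misdetection that matching first and escaping afterwards avoids.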

Edit: The following code fixes both of the problems above, but it is somewhat more verbose, since it more or less re-implements preg_replace_callback using preg_match.

// Solution 2:

$validTlds = array_fill_keys(explode(" ", ".aero .asia .biz .cat .com .coop .edu .gov .info .int .jobs .mil .mobi .museum .name .net .org .pro .tel .travel .ac .ad .ae .af .ag .ai .al .am .an .ao .aq .ar .as .at .au .aw .ax .az .ba .bb .bd .be .bf .bg .bh .bi .bj .bm .bn .bo .br .bs .bt .bv .bw .by .bz .ca .cc .cd .cf .cg .ch .ci .ck .cl .cm .cn .co .cr .cu .cv .cx .cy .cz .de .dj .dk .dm .do .dz .ec .ee .eg .er .es .et .eu .fi .fj .fk .fm .fo .fr .ga .gb .gd .ge .gf .gg .gh .gi .gl .gm .gn .gp .gq .gr .gs .gt .gu .gw .gy .hk .hm .hn .hr .ht .hu .id .ie .il .im .in .io .iq .ir .is .it .je .jm .jo .jp .ke .kg .kh .ki .km .kn .kp .kr .kw .ky .kz .la .lb .lc .li .lk .lr .ls .lt .lu .lv .ly .ma .mc .md .me .mg .mh .mk .ml .mm .mn .mo .mp .mq .mr .ms .mt .mu .mv .mw .mx .my .mz .na .nc .ne .nf .ng .ni .nl .no .np .nr .nu .nz .om .pa .pe .pf .pg .ph .pk .pl .pm .pn .pr .ps .pt .pw .py .qa .re .ro .rs .ru .rw .sa .sb .sc .sd .se .sg .sh .si .sj .sk .sl .sm .sn .so .sr .st .su .sv .sy .sz .tc .td .tf .tg .th .tj .tk .tl .tm .tn .to .tp .tr .tt .tv .tw .tz .ua .ug .uk .us .uy .uz .va .vc .ve .vg .vi .vn .vu .wf .ws .ye .yt .yu .za .zm .zw .xn--0zwm56d .xn--11b5bs3a9aj6g .xn--80akhbyknj4f .xn--9t4b11yi5a .xn--deba0ad .xn--g6w251d .xn--hgbk6aj7f53bba .xn--hlcj6aya9esc7a .xn--jxalpdlp .xn--kgbechtv .xn--zckzah .arpa"), true);

$position = 0;

// $match is filled in by preg_match; with PREG_OFFSET_CAPTURE each entry is array(text, offset).
while (preg_match("{\\b$rexProtocol$rexDomain$rexPort$rexPath$rexQuery$rexFragment(?=[?.!,;:\"]?(\s|$))}", $text, $match, PREG_OFFSET_CAPTURE, $position))
{
    list($url, $urlPosition) = $match[0];

    // Print the text leading up to the URL.
    print(htmlspecialchars(substr($text, $position, $urlPosition - $position)));

    $domain = $match[2][0];
    $port   = $match[3][0];
    $path   = $match[4][0];

    // Check if the TLD is valid - or that $domain is an IP address.
    $tld = strtolower(strrchr($domain, '.'));
    if (preg_match('{\.[0-9]{1,3}}', $tld) || isset($validTlds[$tld]))
    {
        // Prepend http:// if no protocol specified
        $completeUrl = $match[1][0] ? $url : "http://$url";

        // Print the hyperlink.
        printf('<a href="%s">%s</a>', htmlspecialchars($completeUrl), htmlspecialchars("$domain$port$path"));
    }
    else
    {
        // Not a valid URL.
        print(htmlspecialchars($url));
    }

    // Continue text parsing from after the URL.
    $position = $urlPosition + strlen($url);
}

// Print the remainder of the text.
print(htmlspecialchars(substr($text, $position)));
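For reuse, the Solution 2 loop could be packaged as a function that returns the HTML instead of printing it. This is only a sketch: the name linkify(), the "~" delimiter, the anchored IP check and the one-entry TLD table in the usage lines are assumptions made for illustration and are not part of the original answer.

```php
<?php
// Hypothetical wrapper around the Solution 2 loop; returns HTML as a string.
// $validTlds is a lookup table keyed by lower-case TLD, e.g. array('.com' => true).
function linkify(string $text, array $validTlds): string
{
    $rexProtocol = '(https?://)?';
    $rexDomain   = '((?:[-a-zA-Z0-9]{1,63}\.)+[-a-zA-Z0-9]{2,63}|(?:[0-9]{1,3}\.){3}[0-9]{1,3})';
    $rexPort     = '(:[0-9]{1,5})?';
    $rexPath     = '(/[!$-/0-9:;=@_\':;!a-zA-Z\x7f-\xff]*?)?';
    $rexQuery    = '(\?[!$-/0-9:;=@_\':;!a-zA-Z\x7f-\xff]+?)?';
    $rexFragment = '(#[!$-/0-9:;=@_\':;!a-zA-Z\x7f-\xff]+?)?';
    $rex = "~\\b$rexProtocol$rexDomain$rexPort$rexPath$rexQuery$rexFragment(?=[?.!,;:\"]?(\\s|$))~";

    $html = '';
    $position = 0;
    while (preg_match($rex, $text, $match, PREG_OFFSET_CAPTURE, $position)) {
        [$url, $urlPosition] = $match[0];

        // Escape and keep the text leading up to the URL.
        $html .= htmlspecialchars(substr($text, $position, $urlPosition - $position));

        $domain = $match[2][0];
        $port   = $match[3][0];
        $path   = $match[4][0];

        // Accept the match only if the last label is numeric (IP) or a known TLD.
        $tld = strtolower(strrchr($domain, '.'));
        if (preg_match('~^\.[0-9]{1,3}$~', $tld) || isset($validTlds[$tld])) {
            // Prepend http:// if no protocol was captured.
            $completeUrl = $match[1][0] !== '' ? $url : "http://$url";
            $html .= sprintf('<a href="%s">%s</a>',
                htmlspecialchars($completeUrl), htmlspecialchars("$domain$port$path"));
        } else {
            // Not a valid URL: emit it as escaped plain text.
            $html .= htmlspecialchars($url);
        }

        // Continue after the URL.
        $position = $urlPosition + strlen($url);
    }

    // Append the escaped remainder of the text.
    return $html . htmlspecialchars(substr($text, $position));
}

// Usage with a deliberately tiny TLD table (assumption for the example).
$tlds = array('.com' => true);
$out1 = linkify('Visit stackoverflow.com. And remember.Nobody is perfect.', $tlds);
$out2 = linkify('See http://www.google.com/search?rls=en&q=42.', $tlds);
print $out1 . "\n" . $out2 . "\n";
```

Returning a string rather than printing makes the result easier to test or embed in a template; in practice the full $validTlds table from Solution 2 would be passed in instead of the one-entry table used here.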
