PHP case-insensitive matching, PHP – case-insensitive matching of exact phrases and spaces

Latest recommended article published on 2021-04-06 09:44:45

weixin_39777213

Article tags: php case-insensitive matching

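It sounds like part 1 of the question has already been solved, so this answer focuses only on part 2. As I understand it, you are trying to determine whether a given input message contains every word in a list, in any order.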

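For each message, this can be done with a regular expression and a single preg_match call, but it is very inefficient if you have a large word list. If N is the number of words you are searching for and M is the length of the message, the algorithm should be O(N * M). Notice that the regular expression contains two .* terms for each keyword. Because each keyword sits in its own lookahead assertion, the regex engine has to scan the message once per keyword. Here is the sample code: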

<?php
// sample messages
$msg1 = "Lose all the weight all the weight you want. It's fast and easy!";
$msg2 = 'Are you over weight? lose the pounds fast!';
$msg3 = 'Lose weight slowly by working really hard!';

// spam-defining keywords (all required, but in any order)
$keywords = array('lose', 'weight', 'fast');

// build the regex pattern from the keyword array: one lookahead per keyword
// (separator first: the legacy implode() argument order was removed in PHP 8)
$patt = '/(?=.*\b' . implode('\b.*)(?=.*\b', $keywords) . '\b.*)/is';

echo "The pattern is: '" . $patt . "'\n";
echo 'msg1 ' . (preg_match($patt, $msg1) ? 'is' : 'is not') . " spam\n";
echo 'msg2 ' . (preg_match($patt, $msg2) ? 'is' : 'is not') . " spam\n";
echo 'msg3 ' . (preg_match($patt, $msg3) ? 'is' : 'is not') . " spam\n";
?>

The output is:

The pattern is: '/(?=.*\blose\b.*)(?=.*\bweight\b.*)(?=.*\bfast\b.*)/is'
msg1 is spam
msg2 is spam
msg3 is not spam
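The lookahead pattern interpolates the keywords verbatim and is not anchored, so when a message does not match the engine may retry the lookaheads at every starting position. The following is a minimal sketch of a more defensive builder, not part of the original answer. The function name `buildAllWordsPattern` is hypothetical, and the sketch assumes each keyword begins and ends with a word character so that `\b` behaves as intended. It escapes each keyword with preg_quote and anchors the pattern with `^` so that the lookaheads are evaluated only at the start of the message.

```php
<?php
// Hypothetical helper (assumption, not from the original answer):
// builds an anchored "all words, any order" lookahead pattern.
function buildAllWordsPattern(array $keywords): string
{
    $lookaheads = array_map(
        function ($word) {
            // Escape regex metacharacters and the '/' delimiter.
            return '(?=.*\b' . preg_quote($word, '/') . '\b)';
        },
        $keywords
    );

    // '^' limits the attempt to the start of the message;
    // 'i' = case-insensitive, 's' = '.' also matches newlines.
    return '/^' . implode('', $lookaheads) . '/is';
}

$patt = buildAllWordsPattern(array('lose', 'weight', 'fast'));

echo $patt, "\n"; // prints /^(?=.*\blose\b)(?=.*\bweight\b)(?=.*\bfast\b)/is
echo preg_match($patt, 'Are you over weight? lose the pounds fast!'), "\n"; // prints 1
echo preg_match($patt, 'Lose weight slowly by working really hard!'), "\n"; // prints 0
```

The trailing `.*` inside each lookahead is dropped here because a lookahead only needs to find the keyword; what follows it does not affect the result.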

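The second solution looks more complicated because there is more code, but the regular expression is simpler. It has no lookahead assertions and no .* terms. The preg_match function is called in a while loop, but that is not a big deal: each call resumes at the offset where the previous match ended, so each message is traversed only once. The complexity should therefore be roughly O(M), although the alternation adds some per-position cost that grows with the number of keywords. This can also be done with a single preg_match_all call, but then you have to run array_search to get the final counts.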

<?php
// sample messages
$msg1 = "Lose all the weight all the weight you want. It's fast and easy!";
$msg2 = 'Are you over weight? lose the pounds fast!';
$msg3 = 'Lose weight slowly by working really hard!';

// spam-defining keywords (all required, but in any order)
$keywords = array('lose', 'weight', 'fast');

// build the regex pattern from the keyword array: one alternation of whole words
// (separator first: the legacy implode() argument order was removed in PHP 8)
$patt = '/(\b' . implode('\b|\b', $keywords) . '\b)/is';

echo "The pattern is: '" . $patt . "'\n";
echo 'msg1 ' . (matchall($patt, $msg1, $keywords) ? 'is' : 'is not') . " spam\n";
echo 'msg2 ' . (matchall($patt, $msg2, $keywords) ? 'is' : 'is not') . " spam\n";
echo 'msg3 ' . (matchall($patt, $msg3, $keywords) ? 'is' : 'is not') . " spam\n";

// Counts how often each keyword occurs in $msg and returns the smallest count,
// which is 0 (falsy) as soon as any keyword is missing.
function matchall($patt, $msg, $keywords)
{
    $offset = 0;
    $matches = array();
    $index = array_fill_keys($keywords, 0);

    // $matches is already taken by reference by preg_match();
    // call-time pass-by-reference (&$matches) is a fatal error since PHP 5.4.
    while (preg_match($patt, $msg, $matches, PREG_OFFSET_CAPTURE, $offset)) {
        // continue searching right after the current match
        $offset = $matches[1][1] + strlen($matches[1][0]);
        $index[strtolower($matches[1][0])] += 1;
    }

    return min($index);
}
?>

The output is:

The pattern is: '/(\blose\b|\bweight\b|\bfast\b)/is'
msg1 is spam
msg2 is spam
msg3 is not spam
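The text mentions that a single preg_match_all call can replace the loop. The following is a minimal sketch of that variant, not taken from the original answer, using the same keywords and messages. The function name `containsAllKeywords` is hypothetical. Instead of tallying with array_search, this sketch lower-cases the matches and compares the set of distinct matches against the keyword list with array_diff, which is a substitution for the counting step the text describes.

```php
<?php
// Hypothetical helper (assumption, not from the original answer):
// returns true if $msg contains every keyword, in any order.
function containsAllKeywords(array $keywords, string $msg): bool
{
    // Escape each keyword, then join them into one alternation.
    $quoted = array_map(
        function ($word) {
            return preg_quote($word, '/');
        },
        $keywords
    );
    $patt = '/\b(?:' . implode('|', $quoted) . ')\b/i';

    // Collect every keyword occurrence in a single call.
    if (!preg_match_all($patt, $msg, $matches)) {
        return false; // no keyword found, or a regex error
    }

    // Normalise case so that "Lose" counts as "lose".
    $found = array_unique(array_map('strtolower', $matches[0]));
    $wanted = array_map('strtolower', $keywords);

    // Every wanted keyword must appear among the found ones.
    return count(array_diff($wanted, $found)) === 0;
}

$keywords = array('lose', 'weight', 'fast');

var_dump(containsAllKeywords($keywords, 'Are you over weight? lose the pounds fast!')); // bool(true)
var_dump(containsAllKeywords($keywords, 'Lose weight slowly by working really hard!')); // bool(false)
```

Comparing sets rather than counts is enough here because the question only asks whether each keyword is present, not how many times it occurs.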
