PHP: removing special symbols (removing all special characters from a string)

Latest recommended article published on 2024-05-20 21:23:32

weixin_39946534

Tags: php, remove special symbols

The solution further below has a more "SEO-friendly" version:

function hyphenize($string) {
    $dict = array(
        "I'm"   => "I am",
        "thier" => "their",
        // Add your own replacements here
    );
    return strtolower(
        preg_replace(
            array('#[\\s-]+#', '#[^A-Za-z0-9\. -]+#'),
            array('-', ''),
            // the full cleanString() can be downloaded from
            // http://www.unexpectedit.com/php/php-clean-string-of-utf8-chars-convert-to-similar-ascii-char
            cleanString(
                str_replace( // preg_replace can be used to support more complicated replacements
                    array_keys($dict),
                    array_values($dict),
                    urldecode($string)
                )
            )
        )
    );
}

function cleanString($text) {

    $utf8 = array(
        '/[áàâãªä]/u' => 'a',
        '/[ÁÀÂÃÄ]/u' => 'A',
        '/[ÍÌÎÏ]/u' => 'I',
        '/[íìîï]/u' => 'i',
        '/[éèêë]/u' => 'e',
        '/[ÉÈÊË]/u' => 'E',
        '/[óòôõºö]/u' => 'o',
        '/[ÓÒÔÕÖ]/u' => 'O',
        '/[úùûü]/u' => 'u',
        '/[ÚÙÛÜ]/u' => 'U',
        '/ç/' => 'c',
        '/Ç/' => 'C',
        '/ñ/' => 'n',
        '/Ñ/' => 'N',
        '/–/' => '-', // UTF-8 dash to "normal" hyphen
        '/[’‘‹›‚]/u' => ' ', // Typographic single quotes
        '/[“”«»„]/u' => ' ', // Typographic double quotes
        '/\x{00A0}/u' => ' ', // Non-breaking space (U+00A0, decimal 160)
    );
    return preg_replace(array_keys($utf8), array_values($utf8), $text);
}

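The rationale for the functions above (which I find quite inefficient; the one below is better) is that a service that shall not be named apparently ran spell checks and keyword recognition on the URLs.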

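After losing a long time to a customer's paranoia, I found out they were not imagining things after all: their SEO experts (I am definitely not one) reported that converting, say, "Viaggi Economy Perù" to viaggi-economy-peru "behaved better" than viaggi-economy-per (the previous "cleaning" simply removed UTF-8 characters; Bogotà became bogot, Medellìn became medelln, and so on).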

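There were also some common misspellings that seemed to influence the results, and the only explanation that made sense to me is that our URLs were being unpacked, the words singled out, and used to drive God knows what ranking algorithms. Those algorithms had apparently been fed UTF-8-cleaned strings, so that "Perù" became "Peru" instead of "Per". "Per" did not match and, so to speak, took it in the neck.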

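In order to both keep UTF-8 characters and replace some misspellings, the faster function below became the more accurate (?) function above. $dict needs to be hand-tailored, of course.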
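The comment inside hyphenize() notes that preg_replace can support more complicated replacements than str_replace. As a rough sketch of what that might look like (fixMisspellings() and the "teh" entry are my own hypothetical additions, not part of the original function), the dictionary can be applied to whole words only, so that a misspelling is not "fixed" in the middle of an unrelated word:

```php
<?php
// Hypothetical helper: applies a misspelling dictionary to whole words only,
// ignoring case. Assumes the replacement values contain no "$" or "\".
function fixMisspellings(string $text, array $dict): string
{
    foreach ($dict as $wrong => $right) {
        // preg_quote() escapes regex metacharacters in the dictionary key;
        // \b anchors the match at word boundaries.
        $pattern = '/\b' . preg_quote($wrong, '/') . '\b/i';
        $text = preg_replace($pattern, $right, $text);
    }
    return $text;
}

$dict = array(
    "I'm"   => "I am",
    "thier" => "their",
    "teh"   => "the", // hypothetical extra entry for illustration
);

echo fixMisspellings("teh statehood of thier land", $dict), "\n";
// the statehood of their land
```

A plain str_replace with the same dictionary would also rewrite the "teh" inside "statehood". The price is one regex per dictionary entry, so this only makes sense for a small, hand-tailored $dict.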

Previous answer

A simple approach:

// Remove all characters except A-Z, a-z, 0-9, dots, hyphens and spaces
// Note that the hyphen must go last not to be confused with a range (A-Z)
// and the dot, being special, is escaped with \
$str = preg_replace('/[^A-Za-z0-9\. -]/', '', $str);

// Replace sequences of spaces with a hyphen
// (/ +/ means "a space repeated one or more times")
$str = preg_replace('/ +/', '-', $str);

// You may also want to try this alternative:
$str = preg_replace('/\\s+/', '-', $str);
// where \s+ means "one or more whitespace characters" (a space is not
// necessarily the same as a whitespace) just to be sure and include everything

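Please note that you might have to urldecode() the URL first, since %20 and + are both actually spaces. I mean, if you have "Never%20gonna%20give%20you%20up" you want it to become Never-gonna-give-you-up, not Never20gonna20give20you20up. You might not need it, but I thought I'd mention the possibility.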
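To make the point about decoding concrete, here is a small sketch using the same patterns as the simple approach (the variable names are only for illustration):

```php
<?php
$raw = 'Never%20gonna%20give%20you%20up';

// Without decoding, the "%" is stripped and the "20" survives.
$undecoded = preg_replace('/[^A-Za-z0-9\. -]/', '', $raw);
echo $undecoded, "\n"; // Never20gonna20give20you20up

// urldecode() turns %20 (and +) into spaces before the cleanup runs.
$decoded = preg_replace('/ +/', '-', urldecode($raw));
echo $decoded, "\n"; // Never-gonna-give-you-up

// rawurldecode() decodes %20 but leaves a literal + untouched.
echo urldecode('give+you%20up'), "\n";    // give you up
echo rawurldecode('give+you%20up'), "\n"; // give+you up
```

Which of the two decoders fits depends on where the string comes from: + means a space in query strings, but not necessarily in a path segment.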

So, the finished function along with test cases:

function hyphenize($string) {
    // Lines marked ## are commented out (# starts a comment in PHP);
    // remove the markers to enable cleanString() and strtolower().
    return
    ## strtolower(
        preg_replace(
            array('#[\\s-]+#', '#[^A-Za-z0-9\. -]+#'),
            array('-', ''),
    ##      cleanString(
                urldecode($string)
    ##      )
        )
    ## )
    ;
}

print implode("\n", array_map(
    function($s) {
        return $s . ' becomes ' . hyphenize($s);
    },
    array(
        'Never%20gonna%20give%20you%20up',
        "I'm not the man I was",
        "'Légeresse', dit sa majesté",
    )
));

With the ## lines enabled, this prints:

Never%20gonna%20give%20you%20up becomes never-gonna-give-you-up
I'm not the man I was becomes im-not-the-man-i-was
'Légeresse', dit sa majesté becomes legeresse-dit-sa-majeste

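To handle UTF-8 I used a cleanString implementation found online (the link has since broken, but a stripped-down copy with all the not-too-esoteric UTF-8 characters is at the beginning of this answer; it is also easy to add more characters if you need them) that converts UTF-8 characters to plain ASCII ones, thus preserving the "look" of the word as much as possible. It could be simplified and wrapped inside the function here for performance.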

The function above also implements conversion to lowercase, but that is a matter of taste. The code to do so has been commented out.
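As for simplifying cleanString, one possible direction (a sketch only; cleanStringFast() is a hypothetical name, and I have not benchmarked it) is to swap the list of regular expressions for a single strtr() pass over a character map built from the same set of characters:

```php
<?php
// Hypothetical replacement for cleanString(): one strtr() pass with a
// character map instead of one preg_replace() per character class.
// Assumes PHP 7+ and a UTF-8 encoded source file.
function cleanStringFast(string $text): string
{
    static $map = null;
    if ($map === null) {
        // ASCII replacement => UTF-8 characters it stands in for
        // (same set as the $utf8 table in cleanString()).
        $groups = array(
            'a' => 'áàâãªä', 'A' => 'ÁÀÂÃÄ',
            'e' => 'éèêë',   'E' => 'ÉÈÊË',
            'i' => 'íìîï',   'I' => 'ÍÌÎÏ',
            'o' => 'óòôõºö', 'O' => 'ÓÒÔÕÖ',
            'u' => 'úùûü',   'U' => 'ÚÙÛÜ',
            'c' => 'ç', 'C' => 'Ç', 'n' => 'ñ', 'N' => 'Ñ',
            '-' => '–',
            ' ' => "’‘‹›‚“”«»„\u{00A0}",
        );
        $map = array();
        foreach ($groups as $ascii => $chars) {
            // Split the UTF-8 string into single characters.
            foreach (preg_split('//u', $chars, -1, PREG_SPLIT_NO_EMPTY) as $char) {
                $map[$char] = $ascii;
            }
        }
    }
    return strtr($text, $map);
}

echo cleanStringFast("'Légeresse', dit sa majesté"), "\n";
// 'Legeresse', dit sa majeste
```

The map is built once and cached in a static variable, so repeated calls only pay for the strtr() lookup. Adding a character means appending it to the relevant group string.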
