PHP similar_text: usage, manual | sample code

Well, as mentioned above, similar_text runs in O(N^3). I've written a longest common subsequence (LCS) version that runs in O(m·n), where m and n are the lengths of str1 and str2. The result is a percentage, and it seems to be exactly the same as the similar_text percentage, but with better performance. Here are the 3 functions I'm using:

<?php
function LCS_Length($s1, $s2)
{
    $m = strlen($s1);
    $n = strlen($s2);

    // This table will be used to compute the LCS length.
    // PHP arrays grow on demand, so no fixed size is needed.
    $LCS_Length_Table = array();

    // Reset the first column and the first row of the table.
    for ($i = 0; $i <= $m; $i++) $LCS_Length_Table[$i][0] = 0;
    for ($j = 0; $j <= $n; $j++) $LCS_Length_Table[0][$j] = 0;

    for ($i = 1; $i <= $m; $i++) {
        for ($j = 1; $j <= $n; $j++) {
            if ($s1[$i-1] == $s2[$j-1])
                // Characters match: extend the LCS of both prefixes.
                $LCS_Length_Table[$i][$j] = $LCS_Length_Table[$i-1][$j-1] + 1;
            else if ($LCS_Length_Table[$i-1][$j] >= $LCS_Length_Table[$i][$j-1])
                $LCS_Length_Table[$i][$j] = $LCS_Length_Table[$i-1][$j];
            else
                $LCS_Length_Table[$i][$j] = $LCS_Length_Table[$i][$j-1];
        }
    }

    return $LCS_Length_Table[$m][$n];
}

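function str_lcsfix($s)
{
    // Remove spaces, then fold accented letters to their base letter.
    // The original note used ereg_replace(), which was removed in PHP 7.0,
    // so preg_replace() with the /u (UTF-8) modifier is used here.
    // The exact accented characters in each class are an assumption.
    $s = str_replace(" ", "", $s);
    $s = preg_replace("/[éèêëÉÈÊË]/u", "e", $s);
    $s = preg_replace("/[àáâãäåÀÁÂÃÄÅ]/u", "a", $s);
    $s = preg_replace("/[ìíîïÌÍÎÏ]/u", "i", $s);
    $s = preg_replace("/[òóôõöÒÓÔÕÖ]/u", "o", $s);
    $s = preg_replace("/[ùúûüÙÚÛÜ]/u", "u", $s);
    $s = preg_replace("/[Ç]/u", "c", $s);

    return $s;
}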

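function get_lcs($s1, $s2)
{
    // Strip spaces and accents, then lowercase both strings.
    $s1 = strtolower(str_lcsfix($s1));
    $s2 = strtolower(str_lcsfix($s2));

    $lcs = LCS_Length($s1, $s2); // longest common subsequence
    $ms = (strlen($s1) + strlen($s2)) / 2; // mean length of the two strings

    if ($ms == 0) return 0; // both strings empty: avoid division by zero

    return (($lcs * 100) / $ms);
}
?>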
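To check how closely the LCS percentage tracks similar_text, the two can be run side by side. The sketch below is self-contained: `lcs_len` and `lcs_percent` are hypothetical compact stand-ins for LCS_Length and get_lcs (without the accent and space handling), and the sample strings are made up for illustration. For many inputs the two percentages agree, but they can differ, because similar_text recursively matches contiguous substrings while an LCS may skip characters.

```php
<?php
// Hypothetical helper: classic dynamic-programming LCS length, O(m*n).
function lcs_len(string $a, string $b): int
{
    $m = strlen($a);
    $n = strlen($b);
    // (m+1) x (n+1) table of zeros; row 0 and column 0 stay zero.
    $t = array_fill(0, $m + 1, array_fill(0, $n + 1, 0));
    for ($i = 1; $i <= $m; $i++) {
        for ($j = 1; $j <= $n; $j++) {
            $t[$i][$j] = ($a[$i - 1] === $b[$j - 1])
                ? $t[$i - 1][$j - 1] + 1
                : max($t[$i - 1][$j], $t[$i][$j - 1]);
        }
    }
    return $t[$m][$n];
}

// Same formula as get_lcs(): LCS length over the mean string length.
function lcs_percent(string $a, string $b): float
{
    $mean = (strlen($a) + strlen($b)) / 2;
    return $mean > 0 ? lcs_len($a, $b) * 100 / $mean : 0.0;
}

foreach ([['World', 'Word'], ['ZZabc', 'a-b-cZZ']] as [$a, $b]) {
    similar_text($a, $b, $simPercent);
    printf("%-6s %-8s similar_text=%.2f%% lcs=%.2f%%\n",
        $a, $b, $simPercent, lcs_percent($a, $b));
}
```

The first pair gives 88.89% from both methods. For the second pair, similar_text anchors on the common substring "ZZ" and finds nothing on either side of it (33.33%), while the LCS picks up "abc" (50.00%).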

You can skip calling str_lcsfix if you don't care about accented characters and the like, or you can extend it or modify it for better performance; I don't think ereg is the fastest way to do this.
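One possible way to make the accent handling faster is a single `strtr()` pass with a lookup table instead of several regex calls. This is only a sketch: `str_lcsfix_strtr` is a hypothetical name, the mapping is an illustrative subset rather than a complete list, and the input is assumed to be UTF-8.

```php
<?php
// Hypothetical alternative to str_lcsfix(): one strtr() pass over the string.
// The map below is an illustrative subset, assuming UTF-8 encoded input.
function str_lcsfix_strtr(string $s): string
{
    static $map = [
        ' ' => '', // drop spaces, as str_lcsfix() does
        'é' => 'e', 'è' => 'e', 'ê' => 'e', 'ë' => 'e',
        'É' => 'e', 'È' => 'e', 'Ê' => 'e', 'Ë' => 'e',
        'à' => 'a', 'á' => 'a', 'â' => 'a', 'ä' => 'a',
        'ì' => 'i', 'í' => 'i', 'î' => 'i', 'ï' => 'i',
        'ò' => 'o', 'ó' => 'o', 'ô' => 'o', 'ö' => 'o',
        'ù' => 'u', 'ú' => 'u', 'û' => 'u', 'ü' => 'u',
        'Ç' => 'c', 'ç' => 'c',
    ];
    // With an array argument, strtr() replaces every key it finds in one scan.
    return strtr($s, $map);
}

echo str_lcsfix_strtr('Crème Brûlée'), "\n"; // prints "CremeBrulee"
```

Whether this is actually faster depends on the PHP version and the input, so it is worth timing on real data before switching.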

hope this helps.

Georges
