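A business requirement called for a fuzzy equality check on user-entered text. For example, "红豆薏米" and "薏米红豆" (the same characters in a different order) can be treated as equal, and so can "书香酮" and "舒香桐" (different characters with the same pronunciation). This is only a rough check used to give the user a hint: as long as a human reader would not be misled, the input does not need to be corrected. I came up with the following simple algorithm: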
/// <summary>
/// Determines whether two strings are fuzzily equal.
/// </summary>
/// <param name="sInputA">The first string to compare.</param>
/// <param name="sInputB">The second string to compare.</param>
/// <returns>true if the averaged similarity reaches the threshold; otherwise false.</returns>
public static bool IsBasicEqual(string sInputA, string sInputB)
{
    int nSimilarRate = 75; // Similarity threshold, in percent

    // sMin is the shorter input and sMax the longer one
    // (when the lengths are equal, sInputB is treated as the shorter).
    string sMin = sInputA.Length >= sInputB.Length ? sInputB : sInputA;
    string sMax = sInputA.Length >= sInputB.Length ? sInputA : sInputB;

    // PinYin.GetFirstLetter is a helper that is not shown in this post;
    // it is used here to obtain the pinyin initials of each string.
    string sMin_py = PinYin.GetFirstLetter(sMin);
    string sMax_py = PinYin.GetFirstLetter(sMax);

    int nRate_hz_char = CompareString(sMin, sMax);       // overlap of the characters themselves
    int nRate_py_char = CompareString(sMin_py, sMax_py); // overlap of the pinyin initials

    // The two scores are averaged (integer division) and compared with the threshold.
    return (nRate_hz_char + nRate_py_char) / 2 >= nSimilarRate;
}
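/// <summary>
/// Compares two strings character by character and returns a similarity score.
/// </summary>
/// <param name="sMin">The shorter string.</param>
/// <param name="sMax">The longer string.</param>
/// <returns>The number of characters of sMin found in sMax, as a percentage of the length of sMax.</returns>
public static int CompareString(string sMin, string sMax)
{
    int nLength_max = sMax.Length;
    if (nLength_max == 0)
    {
        // Guard against division by zero; two empty strings are treated as identical.
        return sMin.Length == 0 ? 100 : 0;
    }

    int nRate = 0;
    foreach (char c in sMin)
    {
        // The char overload of IndexOf is an ordinal search, so the result
        // does not depend on the current culture.
        if (sMax.IndexOf(c) >= 0)
        {
            nRate += 1;
        }
    }
    return nRate * 100 / nLength_max;
}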
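To make the scoring concrete, here is a small worked example for the two pairs mentioned at the top. It is only a sketch: the class name `FuzzyScoreDemo` is made up for illustration, and because `PinYin.GetFirstLetter` is not shown, the pinyin initials are written out by hand on the assumption that the helper returns one uppercase initial per character.

```csharp
using System;

public static class FuzzyScoreDemo
{
    // Same scoring rule as CompareString above: how many characters of the
    // shorter string occur in the longer one, as a percentage of the longer length.
    public static int CompareString(string sMin, string sMax)
    {
        if (sMax.Length == 0) return sMin.Length == 0 ? 100 : 0;
        int nRate = 0;
        foreach (char c in sMin)
        {
            if (sMax.IndexOf(c) >= 0) nRate++;
        }
        return nRate * 100 / sMax.Length;
    }

    public static void Main()
    {
        // Pair 1: same characters in a different order.
        int hz1 = CompareString("红豆薏米", "薏米红豆"); // 4 of 4 found
        // ASSUMPTION: hand-written initials standing in for PinYin.GetFirstLetter.
        int py1 = CompareString("HDYM", "YMHD");
        Console.WriteLine($"{hz1} {py1} {(hz1 + py1) / 2}"); // prints "100 100 100"

        // Pair 2: homophones, only the middle character is shared.
        int hz2 = CompareString("书香酮", "舒香桐");     // 1 of 3 found
        int py2 = CompareString("SXT", "SXT");           // ASSUMPTION: same initials
        Console.WriteLine($"{hz2} {py2} {(hz2 + py2) / 2}"); // prints "33 100 66"
    }
}
```

Under these assumptions the first pair averages 100 and passes the threshold of 75, while the homophone pair averages 66 and would not pass. If homophones should count as equal, the threshold or the weighting of the two scores may need adjusting.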
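Later I refined this further: Chinese characters in the string are compared fuzzily, while Western (Latin) characters are compared strictly, which fits the business requirement better.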
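The refined version is not shown in this post, so the following is only one possible sketch of the idea, not the actual implementation. It assumes that "Chinese character" means the CJK Unified Ideographs range U+4E00 to U+9FFF, that everything else must match exactly, and that the relative position of the two kinds of characters can be ignored. The names `MixedFuzzyCompare`, `IsHanzi`, `OverlapRate` and `IsMixedEqual` are hypothetical, and the pinyin score is left out because the `PinYin` helper is not available here.

```csharp
using System;
using System.Linq;

public static class MixedFuzzyCompare
{
    // ASSUMPTION: only the basic CJK Unified Ideographs block counts as Chinese.
    static bool IsHanzi(char c) => c >= '\u4E00' && c <= '\u9FFF';

    // Percentage of characters of the shorter string that occur in the longer one.
    static int OverlapRate(string a, string b)
    {
        string sMin = a.Length <= b.Length ? a : b;
        string sMax = a.Length <= b.Length ? b : a;
        if (sMax.Length == 0) return 100; // both empty: nothing differs
        int hits = sMin.Count(c => sMax.IndexOf(c) >= 0);
        return hits * 100 / sMax.Length;
    }

    public static bool IsMixedEqual(string a, string b, int threshold = 75)
    {
        // Split each input into its Chinese part and its non-Chinese part.
        string hanziA = new string(a.Where(c => IsHanzi(c)).ToArray());
        string hanziB = new string(b.Where(c => IsHanzi(c)).ToArray());
        string otherA = new string(a.Where(c => !IsHanzi(c)).ToArray());
        string otherB = new string(b.Where(c => !IsHanzi(c)).ToArray());

        // Strict part: non-Chinese characters must match exactly.
        if (!string.Equals(otherA, otherB, StringComparison.Ordinal)) return false;

        // Fuzzy part: Chinese characters only need to overlap enough.
        return OverlapRate(hanziA, hanziB) >= threshold;
    }

    public static void Main()
    {
        Console.WriteLine(IsMixedEqual("红豆薏米500g", "薏米红豆500g")); // True
        Console.WriteLine(IsMixedEqual("红豆薏米500g", "薏米红豆250g")); // False
    }
}
```

Splitting first keeps the strict rule simple: a different number or unit rejects the pair immediately, regardless of how similar the Chinese part is.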