It's easy to tell if two words are identical - just check the letters. But how do you tell if two words are almost identical? And how close is "almost"?
There are lots of techniques for approximate word matching. One is to determine the best substring match, which is the number of common letters when the words are compared letter-by-letter.
The key to this approach is that the words can overlap in any way. For example, consider the words CAPILLARY and MARSUPIAL. One way to compare them is to overlay them:
CAPILLARY
MARSUPIAL
There is only one common letter (A). Better is the following overlay:
CAPILLARY
     MARSUPIAL
with two common letters (A and R), but the best is:
   CAPILLARY
MARSUPIAL
which has three common letters (P, I and L).
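Counting the common letters for one particular overlay can be sketched in C as below. The function name count_common and the signed shift parameter (how far the second word is slid to the right of the first, negative meaning to the left) are illustrative choices, not part of the problem statement.

```c
#include <assert.h>
#include <string.h>

/* Count the positions where the two words agree when word b is slid
 * `shift` places to the right of word a (a negative shift slides it left).
 * The name and the shift convention are assumptions made for this sketch. */
int count_common(const char *a, const char *b, int shift)
{
    assert(a != NULL && b != NULL);
    int len_a = (int)strlen(a);
    int len_b = (int)strlen(b);
    int count = 0;
    for (int i = 0; i < len_a; i++) {
        int j = i - shift;              /* index in b that sits under a[i] */
        if (j >= 0 && j < len_b && a[i] == b[j])
            count++;
    }
    return count;
}
```

With this convention, the three overlays of CAPILLARY and MARSUPIAL shown above correspond to shifts of 0, 5 and -3.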
The approximation measure appx(word1, word2) for two words is given by:
common letters * 2
-----------------------------
length(word1) + length(word2)
Thus, for this example, appx(CAPILLARY, MARSUPIAL) = 6 / (9 + 9) = 1/3. Obviously, for any word W appx(W, W) = 1, which is a nice property, while words with no common letters have an appx value of 0.
Input:
The input for your program will be a series of words, two per line, until the end-of-file flag of -1.
Using the above technique, you are to calculate appx() for the pair of words on the line and print the result. For example:
CAR CART
TURKEY CHICKEN
MONEY POVERTY
ROUGH PESKY
A A
-1
The words will all be uppercase.
Output:
Print the value for appx() for each pair as a reduced fraction, like this:
appx(CAR,CART) = 6/7
appx(TURKEY,CHICKEN) = 4/13
appx(MONEY,POVERTY) = 1/3
appx(ROUGH,PESKY) = 0
appx(A,A) = 1
Fractions reducing to zero or one should have no denominator.
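One way to handle the output rule is to reduce the fraction first and drop the denominator whenever it comes out as 1, which covers both the 0 and the 1 case. The sketch below assumes the best number of common letters is already known; the helper names reduce_appx and print_fraction are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Euclid's algorithm; gcd(0, n) is n. */
static int gcd(int a, int b)
{
    while (b != 0) {
        int t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Reduce (2 * common) / total to lowest terms.
 * `common` is the best number of matching letters and `total` is
 * length(word1) + length(word2). Both names are assumptions of this sketch. */
void reduce_appx(int common, int total, int *num, int *den)
{
    assert(total > 0 && common >= 0 && 2 * common <= total);
    int n = 2 * common;
    int d = gcd(n, total);      /* when n is 0 this is total, giving 0/1 */
    *num = n / d;
    *den = total / d;
}

/* Print the reduced value; a denominator of 1 is omitted. */
void print_fraction(int num, int den)
{
    if (den == 1)
        printf("%d\n", num);
    else
        printf("%d/%d\n", num, den);
}
```

For CAR and CART, with 3 common letters and a total length of 7, reduce_appx yields 6 and 7, and print_fraction writes 6/7.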
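Analysis: the task is to find the best substring match between two words, that is, the largest number of letters that coincide when one word is slid across the other.
The problem statement already describes how to compute it. Split the alignments into two cases, count the matching letters Num for every alignment, and take the maximum.
1. Align the last character of string a with the first character of string b. Each time a moves one position to the right, compare the overlapping characters of a and b.
2. Align the last character of string b with the first character of string a. Each time b moves one position to the right, compare the overlapping characters of a and b.
Note that the final result must be reduced to lowest terms!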
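The two cases can also be folded into a single loop over a signed offset, which may be easier to reason about. This is a minimal sketch of that variant, not the accepted submission; the function name best_common and the shift variable are illustrative.

```c
#include <assert.h>
#include <string.h>

/* Largest number of matching letters over every way of sliding b across a.
 * `shift` is how far b's first letter sits to the right of a's first letter;
 * negative values cover the alignments where b starts to the left of a. */
int best_common(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    int len_a = (int)strlen(a);
    int len_b = (int)strlen(b);
    int best = 0;
    for (int shift = -(len_b - 1); shift <= len_a - 1; shift++) {
        int count = 0;
        for (int i = 0; i < len_a; i++) {
            int j = i - shift;          /* index in b that sits under a[i] */
            if (j >= 0 && j < len_b && a[i] == b[j])
                count++;
        }
        if (count > best)
            best = count;
    }
    return best;
}
```

For the sample pairs this gives 3 for CAR and CART, 2 for TURKEY and CHICKEN, 2 for MONEY and POVERTY, 0 for ROUGH and PESKY, and 1 for A and A, matching the expected fractions once doubled and divided by the total length.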
Accepted (AC) code:
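#include <stdio.h>
#include <string.h>

// Greatest common divisor (Euclid's algorithm), used to reduce the fraction.
int gcd(int a,int b)
{
    if(b==0)
        return a;
    else
        return gcd(b,a%b);
}

int main()
{
    char a[300],b[300];
    int len1,len2;
    int lensum;
    // Read pairs until the "-1" sentinel. Comparing against 1 also stops
    // the loop on EOF, where scanf returns EOF (a nonzero value).
    while(scanf("%299s",a)==1&&strcmp(a,"-1")!=0)
    {
        if(scanf("%299s",b)!=1)
            break;
        len1=(int)strlen(a);
        len2=(int)strlen(b);
        lensum=len1+len2;
        int max=0;   // best number of matching letters found so far
        int num;     // matching letters for the current alignment
        int i,j,k;
        // Case 1: start with the last character of a over the first
        // character of b, then slide a to the right one step at a time.
        for(k=len1-1;k>0;k--)
        {
            num=0;
            i=k;
            j=0;
            while(1)
            {
                if(a[i++]==b[j++])
                    num++;
                // Stop when either word runs out of letters.
                if(a[i]=='\0'||b[j]=='\0')
                    break;
            }
            if(num>max)
                max=num;
        }
        // Case 2: the first character of a sits over b[k], from the fully
        // aligned position (k=0) up to the last character of b.
        for(k=0;k<=len2-1;k++)
        {
            num=0;
            i=0;
            j=k;
            while(1)
            {
                if(a[i++]==b[j++])
                    num++;
                if(a[i]=='\0'||b[j]=='\0')
                    break;
            }
            if(num>max)
                max=num;
        }
        // Numerator of appx() is twice the number of common letters.
        max*=2;
        if(max==0)
            printf("appx(%s,%s) = 0\n",a,b);
        else if(max==lensum)
            printf("appx(%s,%s) = 1\n",a,b);
        else
        {
            // Reduce the fraction to lowest terms before printing.
            int d=gcd(max,lensum);
            printf("appx(%s,%s) = ",a,b);
            printf("%d/%d\n",max/d,lensum/d);
        }
    }
    return 0;
}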