# Shell script: print the longest matching part of two strings
For example, given the input:
00000000000000000011000000111001
01000110010000001110010000000000
the output is:
00000000000000000011000000111001
         010001100[1000000111001]0000000000
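My initial idea was:
Compare the corresponding characters of str1 and str2 one by one, recording the length of the longest run of consecutive matches (max), the shift at which that run was found (shifti), and the index in str2 at which the run starts (shiftj). Then keep str1 fixed, shift str2 right by one position, and repeat. Stop when every shift has been tried (str1 and str2 no longer overlap) or when the longest run found so far is already at least as long as the remaining overlap (max >= length-i).
As illustrated below: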
First comparison
00000000000000000011000000111001
01000110010000001110010000000000
Second comparison
00000000000000000011000000111001
 01000110010000001110010000000000
i-th comparison (str2 shifted right by i-1 positions; shown here with a shift of 10)
00000000000000000011000000111001
          01000110010000001110010000000000
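#!/bin/bash
# Find the longest matching part of two binary strings.
# str2 is only ever shifted to the right relative to str1.
read -r -p "Please input binary string1 :> " str1
read -r -p "Please input binary string2 :> " str2
length=${#str1}
lengthminus1=$((length - 1))
# the longest match character count
max=0
count=0
shifti=0    # shift of str2 at which the longest run was found
shiftj=0    # index in str2 where the longest run starts
for ((i = 0; i < length; i++))
do
    # stop once the remaining overlap cannot beat the current maximum
    if ((max >= length - i)); then
        break
    fi
    startj=0
    # j indexes str2, t = i + j indexes str1; scan the whole overlap,
    # because a run still in progress may yet exceed max
    for ((j = 0, t = i; t < length; j++, t++))
    do
        if [[ "${str2:j:1}" == "${str1:t:1}" ]]; then
            ((count += 1))
            # a run that reaches the end of str1 must be recorded here
            if ((t == lengthminus1 && count > max)); then
                max=$count
                shifti=$i
                shiftj=$startj
            fi
        else
            # the run just ended: record it if it is the longest so far
            if ((count > max)); then
                max=$count
                shifti=$i
                shiftj=$startj
            fi
            count=0
            startj=$((j + 1))
        fi
    done
    count=0
done
echo "$str1"
if ((max == 0)); then
    echo "$str2"
    echo "nothing match"
    exit
fi
# pad with shifti-1 spaces; together with the inserted "[" this lines the
# matched part up under str1 (for shifti >= 1)
for ((i = 0; i < shifti - 1; i++))
do
    echo -n " "
done
for ((j = 0; j < ${#str2}; j++))
do
    if ((j == shiftj)); then
        echo -n "[${str2:j:1}"
        # a one-character match opens and closes on the same character
        if ((j == shiftj + max - 1)); then
            echo -n "]"
        fi
    elif ((j == shiftj + max - 1)); then
        echo -n "${str2:j:1}]"
    else
        echo -n "${str2:j:1}"
    fi
done
echo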
Sample run:
$ bash find_binary_match_parts.sh
Please input binary string1 :> 00000000000000000011000000111001
Please input binary string2 :> 01000110010000001110010000000000
00000000000000000011000000111001
         010001100[1000000111001]0000000000
This seems to solve the problem, but swapping str1 and str2 changes the result, as shown below:
$ bash find_binary_match_parts.sh
Please input binary string1 :> abcdefghi
Please input binary string2 :> ghifabcde
abcdefghi
     [ghi]fabcde
$ bash find_binary_match_parts.sh
Please input binary string1 :> ghifabcde
Please input binary string2 :> abcdefghi
ghifabcde
   [abcde]fghi
The program above only shifts str2 in one direction (to the right), so it can easily miss the true longest matching substring.
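One way to cover both directions, given here as a rough sketch rather than a repair of the script above, is to let the shift take negative values too, so that the second string is also tried to the left of the first. The helper name longest_run_both_ways is made up for illustration, and the sketch only prints the matched text without the bracket display.

```shell
#!/bin/bash
# Sketch: longest run of matching characters over all alignments.
# A shift d > 0 moves b to the right of a, d < 0 moves it to the left.
# longest_run_both_ways is a hypothetical helper name.
longest_run_both_ways() {
    local a=$1 b=$2
    local la=${#a} lb=${#b}
    local best="" d i j run start
    for ((d = -(lb - 1); d < la; d++)); do
        run=0
        # i indexes a, j = i - d indexes b; stay inside both strings
        for ((i = (d > 0 ? d : 0); i < la && i - d < lb; i++)); do
            j=$((i - d))
            if [[ "${a:i:1}" == "${b:j:1}" ]]; then
                ((run == 0)) && start=$i    # a new run begins here
                run=$((run + 1))
                if ((run > ${#best})); then
                    best=${a:start:run}
                fi
            else
                run=0
            fi
        done
    done
    printf '%s\n' "$best"
}

longest_run_both_ways abcdefghi ghifabcde
longest_run_both_ways ghifabcde abcdefghi
```

Both calls print abcde, because the alignment that lines up abcde is now tried whichever string comes first.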
So I searched online for a simpler, more general approach and found the following script:
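#!/bin/bash
# Print the longest common substring of the two arguments.
word1="$1"
word2="$2"
# make word2 the shorter string, so candidates are cut from it
if [ ${#word1} -lt ${#word2} ]
then
    word1="$2"
    word2="$1"
fi
# try candidate lengths from longest to shortest
for ((i=${#word2}; i>0; i--)); do
    for ((j=0; j<=${#word2}-i; j++)); do
        # quoting the right-hand side makes =~ match it literally, not as a regex
        if [[ $word1 =~ "${word2:j:i}" ]]
        then
            echo "${word2:j:i}"
            exit
        fi
    done
done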
Sample run:
$ bash common_substr.sh abcdefghi ghifabcde
abcde
$ bash common_substr.sh ghifabcde abcdefghi
abcde
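The result is the same regardless of the input order. (If several common substrings tie for the longest, which one is printed can still depend on which string the candidates are cut from.)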
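To get back to the bracketed display from the start of this post, the same brute-force search can be combined with parameter expansion. The following is only a sketch under a few assumptions: lcs and mark are hypothetical helper names, the substring test uses a glob match (== with *"$sub"*) in place of =~, and only the first occurrence in each string is bracketed.

```shell
#!/bin/bash
# Sketch: brute-force longest common substring, then bracket it in both inputs.
# lcs and mark are hypothetical helper names.
lcs() {
    local long=$1 short=$2 i j sub
    # always cut candidates from the shorter string
    if ((${#long} < ${#short})); then
        long=$2
        short=$1
    fi
    for ((i = ${#short}; i > 0; i--)); do
        for ((j = 0; j <= ${#short} - i; j++)); do
            sub=${short:j:i}
            # the quoted "$sub" inside the glob is matched literally
            if [[ $long == *"$sub"* ]]; then
                printf '%s\n' "$sub"
                return 0
            fi
        done
    done
    return 1
}

# Wrap the first occurrence of $2 in $1 in brackets.
# Assumes $2 really occurs in $1.
mark() {
    local s=$1 sub=$2
    local rest=${s#*"$sub"}        # text after the first occurrence
    local head=${s%"$sub$rest"}    # text before the first occurrence
    printf '%s[%s]%s\n' "$head" "$sub" "$rest"
}

if common=$(lcs abcdefghi ghifabcde); then
    mark abcdefghi "$common"
    mark ghifabcde "$common"
fi
```

For these two inputs the sketch prints [abcde]fghi followed by ghif[abcde].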
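Note: =~ is bash's regular-expression match operator. When the right-hand side is quoted, it is matched as a literal string, so it works as a substring test, comparable to checking indexOf(...) >= 0 in Java, as shown below: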
$ [[ "abcdefgh" =~ "abc" ]] && echo $?
0
$ [[ "abcdefgh" =~ "bcd" ]] && echo $?
0
$ [[ "abcdefgh" =~ "bar" ]]
$ echo $?
1
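The quoting matters as soon as the text contains regex metacharacters. A small sketch, assuming bash 3.2 or later (where a quoted right-hand side is matched literally), with the variable name pat chosen only for illustration:

```shell
#!/bin/bash
# Sketch: an unquoted right-hand side of =~ is a regex, a quoted one is literal.
pat='a.c'

# unquoted: "." matches any character, so abc matches
if [[ "abc" =~ $pat ]]; then
    echo "regex: a.c matches abc"
fi

# quoted: the three characters a . c must appear literally
if [[ "abc" =~ "$pat" ]]; then
    echo "literal: a.c found in abc"
else
    echo "literal: a.c not found in abc"
fi

if [[ "xa.cx" =~ "$pat" ]]; then
    echo "literal: a.c found in xa.cx"
fi
```

So an unquoted substring such as ${word2:j:i} can produce false matches when the input contains characters like . or *.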
Source of the script above: