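Problem description:
A string S is called balanced if its characters can be partitioned into two multisets M1 and M2 such that M1 = M2.
For example, "abccba", "abaccb" and "aabbcc" are all balanced, because each can be split into two multisets M1 and M2 with:
M1=M2={a,b,c}
The strings "ababab" and "abcabb" are not balanced.
Given a string S, count how many of its substrings are balanced.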
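Analysis:
When two equal multisets are merged, every element of the result occurs an even number of times,
so every character of a balanced string must occur an even number of times. Conversely, if every character occurs an even number of times, putting half of the copies of each character into M1 and the other half into M2 gives M1 = M2.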
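The even-count criterion can be checked directly. A minimal sketch, where `is_balanced` is a name chosen here for illustration and is not part of the original solution:

```python
from collections import Counter

def is_balanced(s):
    """Return True if every character of s occurs an even number of times."""
    return all(n % 2 == 0 for n in Counter(s).values())

# The five strings from the problem description.
for s in ["abccba", "abaccb", "aabbcc", "ababab", "abcabb"]:
    print(s, is_balanced(s))
```

This takes time linear in the length of one string, so it is only useful for checking individual strings, not for counting all substrings of a long input.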
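For a string made up only of the letter 'a', one way to test whether 'a' occurs an even number of times is the following:
for the i-th element, set A[i] to 1 if i is odd and to -1 otherwise, then sum the elements of A. If the sum is zero, 'a' occurs an even number of times.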
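In the same way, give each of the 26 letters its own number p. For a letter L, its odd-numbered occurrences (1st, 3rd, ...) get p and its even-numbered occurrences (2nd, 4th, ...) get -p.
To keep the values of p for different letters from interfering with each other, take p=1<<k, where k is the position of L in the alphabet minus one.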
For example:
letters:  a  a  b  b  c  b  a  c  b  a
A:        1 -1  2 -2  4  2  1 -4 -2 -1
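The assignment of signed weights can be sketched as follows. The helper name `signed_weights` is hypothetical, and the sketch assumes the input contains lowercase letters only:

```python
def signed_weights(s):
    """A[i] = +p on odd-numbered occurrences of a letter, -p on even-numbered ones,
    where p = 1 << k and k is the letter's zero-based position in the alphabet."""
    seen = [0] * 26          # how many times each letter has appeared so far
    weights = []
    for ch in s:
        k = ord(ch) - ord('a')
        sign = 1 if seen[k] % 2 == 0 else -1
        weights.append(sign * (1 << k))
        seen[k] += 1
    return weights

print(signed_weights("aabbcbacba"))
# [1, -1, 2, -2, 4, 2, 1, -4, -2, -1]
```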
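With this encoding, if a substring is balanced, the numbers assigned to its letters must sum to 0: the signs of each letter alternate, so an even number of occurrences cancels out. Conversely, because the values of p are distinct powers of two, a sum of 0 forces every letter to occur an even number of times in the substring.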
Now let B[i]=A[1]+...+A[i].
For the same string this gives:
letters:  a  a  b  b  c  b  a  c  b  a
B:        1  0  2  0  4  6  7  3  1  0
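The prefix sums for this example can be reproduced with `itertools.accumulate`. The sketch below also compares B with the running parity bitmask of the letters (bit k set when letter k has appeared an odd number of times so far). That comparison is an observation added here, not something stated in the original write-up:

```python
from itertools import accumulate

s = "aabbcbacba"
A = [1, -1, 2, -2, 4, 2, 1, -4, -2, -1]   # signed weights from the example
B = list(accumulate(A))
print(B)   # [1, 0, 2, 0, 4, 6, 7, 3, 1, 0]

# Running parity bitmask, built with XOR instead of signed addition.
mask = 0
masks = []
for ch in s:
    mask ^= 1 << (ord(ch) - ord('a'))
    masks.append(mask)

print(masks == B)   # True for this example
```

Each letter contributes either p or 0 to a prefix sum, depending on whether it has appeared an odd or even number of times, which is why the two sequences agree.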
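Looking at these values, a substring whose first and last letters carry the same number is always a balanced string with one extra letter in front. In other words, if B[i]=B[j] with i<j, the substring from position i+1 to position j is balanced, because its numbers sum to B[j]-B[i]=0.
Take the substring from the second to the tenth letter in the example above:
"a b b c b a c b a"
Dropping the leading a leaves a balanced string.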
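If a value of B[i] occurs n times, it accounts for n(n-1)/2 balanced substrings. The empty prefix, with sum 0, has to be counted among the values so that balanced substrings starting at the first letter are included; the code below adds this extra 0.
The problem therefore reduces to counting how many times each value of B[i] occurs.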
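The counting step can be sketched with `collections.Counter` in place of sorting. The function name `count_equal_pairs` is made up for this illustration, and the list B below is the example's prefix sums with the extra 0 for the empty prefix placed in front:

```python
from collections import Counter

def count_equal_pairs(prefix):
    """Number of index pairs i < j with prefix[i] == prefix[j]."""
    return sum(n * (n - 1) // 2 for n in Counter(prefix).values())

B = [0, 1, 0, 2, 0, 4, 6, 7, 3, 1, 0]   # leading 0 is the empty prefix
print(count_equal_pairs(B))   # 7
```

Here the value 0 occurs four times (6 pairs) and the value 1 occurs twice (1 pair), so "aabbcbacba" has 7 balanced substrings.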
Python code:
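T = int(input())
for _ in range(T):
    s = input().strip()
    lib = [0] * 26               # occurrences of each letter seen so far
    fun = [0] * (len(s) + 1)     # prefix sums; fun[0] = 0 is the empty prefix
    for j, ch in enumerate(s):
        h = ord(ch) - ord('a')
        g = (1 << h) * (-1) ** lib[h]   # +p on odd occurrences, -p on even ones
        fun[j + 1] = fun[j] + g
        lib[h] += 1
    # Sort so that equal prefix sums are adjacent, then count each run.
    p = sorted(fun)
    k = 0
    pre = p[0]
    count = 0
    for j in p:
        if j == pre:
            count += 1
        else:
            k += count * (count - 1) // 2
            count = 1
            pre = j
    k += count * (count - 1) // 2       # last run
    print(k)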
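A brute-force cross-check can help validate the counting on small inputs. The sketch below is not part of the original solution: `count_fast` and `count_brute` are names chosen for illustration, and `count_fast` tracks the parity of each letter with XOR on a bitmask, a variant that produces the same prefix values as the signed sums.

```python
import random
from collections import Counter

def count_fast(s):
    """Count balanced substrings via equal parity bitmasks of prefixes."""
    mask = 0
    seen = Counter({0: 1})       # the empty prefix has mask 0
    total = 0
    for ch in s:
        mask ^= 1 << (ord(ch) - ord('a'))
        total += seen[mask]      # pairs with every earlier equal prefix
        seen[mask] += 1
    return total

def count_brute(s):
    """Check every substring directly; quadratic, for testing only."""
    total = 0
    for i in range(len(s)):
        odd = set()              # letters with an odd count in s[i:j+1]
        for ch in s[i:]:
            odd ^= {ch}
            if not odd:
                total += 1
    return total

random.seed(0)
for _ in range(200):
    t = "".join(random.choice("abc") for _ in range(random.randint(1, 12)))
    assert count_fast(t) == count_brute(t), t
print(count_fast("aabbcbacba"))   # 7
```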
Original problem:
Chandan got bored playing with the arrays all the time. Therefore he has decided to buy a string S consisting of N lower case alphabets. Once he purchased the string, he starts formulating his own terminologies over his string S. Chandan calls a string str A Balanced String if and only if the characters of the string str can be partitioned into two multisets M1 and M2 such that M1 = M2.
For eg:
Strings like "abccba" , "abaccb" , "aabbcc" are all balanced strings as their characters can be partitioned in the two multisets M1 and M2 such that M1 = M2.
M1 = {a,b,c}
M2 = {c,b,a}
whereas strings like ababab , abcabb are not balanced at all.
Chandan wonders how many substrings of his string S are Balanced Strings? Chandan is a little guy and does not know how to calculate the count of such substrings.
Can you help him in accomplishing this task ?
Input
First line of input contains a single integer T denoting the number of test cases. First and the only line of each test case contains a string consisting of lower case alphabets only, denoting string S.
Output
For each test case, print the count of Balanced Substrings of string S.
Constraints
1 ≤ T ≤ 10^5
1 ≤ |S| ≤ 10^5
S consists of lower case alphabets only.
NOTE : sum of |S| over all the test case will not exceed 10^6.