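I have a list: results = [['TTTT', 'CCCZ'], ['ATTA', 'CZZC']]
I created a counter that stores the number of characters in each element of each sub-list, counting a character only if it is A, T, G, or C (not Z).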
**
Code:
from collections import Counter

counters = [Counter(sub_list) for sub_list in results]
nn = []
d = []
for counter in counters:
    atgc_count = sum(val for key, val in counter.items() if key in "ATGC")
    nn.append(atgc_count)
d = [i - 1 for i in nn]
correctionfactor = [float(b) / float(m) for b, m in zip(nn, d)]
print(nn)
print(correctionfactor)
"Failed" Output:
[0, 0]
Desired Output
nn = [[4,3],[4,2]]
correctionfactor = [[1.33, 1.5],[1.33,2]]
**
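One likely reason the counts come out as 0: `Counter(sub_list)` counts the whole strings (`'TTTT'`, `'CCCZ'`) as keys, and `'TTTT' in "ATGC"` is False, so nothing is ever summed. A minimal sketch of one way to get the nested format, counting characters per string instead; `VALID_BASES` and `count_valid` are hypothetical names introduced here, and the division assumes every string contains at least two valid bases:

```python
from collections import Counter

results = [['TTTT', 'CCCZ'], ['ATTA', 'CZZC']]
VALID_BASES = set("ATGC")  # hypothetical helper name, not from the post


def count_valid(seq):
    """Count the characters of seq that are A, T, G or C."""
    counts = Counter(seq)  # a Counter over a string counts its characters
    return sum(n for base, n in counts.items() if base in VALID_BASES)


# One count per string, keeping the nesting of `results`.
nn = [[count_valid(seq) for seq in sub_list] for sub_list in results]

# n / (n - 1); assumes n >= 2, otherwise this raises ZeroDivisionError.
correctionfactor = [[n / (n - 1) for n in row] for row in nn]

print(nn)  # [[4, 3], [4, 2]]
print([[round(c, 2) for c in row] for row in correctionfactor])
# [[1.33, 1.5], [1.33, 2.0]]
```

The nested comprehension mirrors the shape of `results`, which is what keeps the later steps in the same list-of-lists format.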
Next, I compute the frequency of each character (p_i), square it, sum the squares, and then calculate het = 1 - sum. The desired output is [[1,2],[1,2]] #NOTE: These are NOT the real expected values. I just need the real values to be in this format.
**
Code:
list_of_hets = []
for idx, element in enumerate(sample):
    count_dict = {}
    square_dict = {}
    for base in list(element):
        if base in count_dict:
            count_dict[base] += 1
        else:
            count_dict[base] = 1
    for allele in count_dict:
        square_freq = (count_dict[allele] / float(nn[idx]))**2
        square_dict[allele] = square_freq
    pf = 0.0
    for i in square_dict:
        pf += square_dict[i]  # pf --> pi^2 + pj^2...pn^2
    het = 1-pf
    list_of_hets.append(het)
print(list_of_hets)
"Failed" OUTPUT:
[-0.0, -0.0]
**
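A possible sketch of the het = 1 - sum(p_i^2) step applied per string, so the result keeps the nested format. The function name `heterozygosity` is hypothetical, and two choices here are assumptions rather than something the post states: Z is left out of both the counts and the denominator, and a string with no valid bases yields NaN.

```python
from collections import Counter

results = [['TTTT', 'CCCZ'], ['ATTA', 'CZZC']]


def heterozygosity(seq, valid="ATGC"):
    """Return 1 - sum(p_i ** 2) over the valid bases of seq."""
    counts = Counter(base for base in seq if base in valid)
    total = sum(counts.values())
    if total == 0:
        return float("nan")  # assumption: no valid bases means undefined
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# One value per string, keeping the nesting of `results`.
list_of_hets = [[heterozygosity(seq) for seq in sub_list]
                for sub_list in results]

print(list_of_hets)  # [[0.0, 0.0], [0.5, 0.0]]
```

Only 'ATTA' has two different bases (A and T at frequency 0.5 each), which is why it is the only non-zero entry.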
I need to multiply every element in the list by the correction factor:
h = [float(n) * float(p) for n, p in zip(correctionfactor, list_of_hets)]
With the values given above:
h = [[1.33, 1.5],[1.33,2]] #correctionfactor multiplied by list_of_hets
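Since both inputs are meant to be lists of lists, a flat `zip` pairs whole sub-lists and `float()` cannot convert them. A minimal sketch of an element-wise product over nested lists follows; the numbers are the post's placeholder values, so only the nested shape of the result is meaningful:

```python
correctionfactor = [[4 / 3, 1.5], [4 / 3, 2.0]]
list_of_hets = [[1, 2], [1, 2]]  # placeholder values, not real results

# Pair up the rows first, then the elements inside each pair of rows.
h = [[c * het for c, het in zip(cf_row, het_row)]
     for cf_row, het_row in zip(correctionfactor, list_of_hets)]

print(h)
```

The outer `zip` walks the two lists row by row and the inner `zip` walks each pair of rows element by element, so `h` has the same shape as its inputs.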
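Finally, I need to find the mean of each element in h and store it in a new list. The desired output should read as [1.33, 1.75].
hs = [mean(i) for i in zip(*h)]
But I get the following error: "TypeError: zip argument #1 must support iteration"
I know that correcting the code in the first step would probably solve this. I tried entering the "desired output" manually and running the rest of the code, but with no luck.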
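That TypeError is what `zip(*h)` raises when `h` is a flat list of numbers, because each number is unpacked as a separate argument to `zip` and a float is not iterable. A small sketch, assuming `h` really is nested and using `statistics.mean` from the standard library (the post does not say which `mean` it imports):

```python
from statistics import mean

h = [[1.33, 1.5], [1.33, 2]]

# zip(*h) groups the values by position across the sub-lists.
hs = [mean(column) for column in zip(*h)]
print(hs)  # [1.33, 1.75]

# With a flat list, zip(*...) receives bare floats and fails.
flat_h = [1.33, 1.5]
try:
    list(zip(*flat_h))
    flat_failed = False
except TypeError:
    flat_failed = True
print(flat_failed)  # True
```

If `h` turns out flat at this point, the nesting was most likely lost in an earlier step rather than in the averaging itself.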