Python gap merging: filling in the blanks when merging data in Python

>>> from collections import defaultdict
>>> import glob
>>> pos = defaultdict(dict)
>>> for index, infile in enumerate(glob.glob('D:\\DATA\\FP12210\\My Documents\\Temp\\Python\\sample*.vcf'), 1):
...     with open(infile) as f:
...         for line in f:
...             # Column 2 is the position (converted to int right away), column 4 the letter
...             fields = line.split()
...             val, letter = int(fields[1]), fields[3]
...             pos[val][index] = letter
...
>>> def print_pos(pos):
...     """Print one row per sample, with NaN wherever a position has no value."""
...     # Header: every position from the smallest to the largest key of pos
...     values = sorted(pos)
...     print(' ' * 9, end=' ')
...     for val in range(values[0], values[-1] + 1):
...         print('{0:5}'.format(val), end=' ')
...     print()
...     # pos is keyed by position (the header row); build pos2 keyed by sample number
...     pos2 = defaultdict(dict)
...     for val, d in pos.items():
...         for index, letter in d.items():
...             pos2[index][val] = letter
...     # Now the rows are easy to print
...     for index in sorted(pos2):
...         print('sample{0:2} '.format(index), end=' ')
...         for val in range(values[0], values[-1] + 1):
...             if val in pos2[index]:
...                 print('{0:>5}'.format(pos2[index][val]), end=' ')
...             else:
...                 print('{0:>5}'.format('NaN'), end=' ')
...         print()
...
>>> print_pos(pos)
           2025  2026  2027  2028  2029  2030  2031  2032
sample 1      A   NaN     C     T   NaN   NaN   NaN   NaN
sample 2      G     A   NaN   NaN   NaN   NaN   NaN     T

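I use pos to collect the values, and I also keep the same data in pos2, because:

- pos is keyed by value (position), which is useful for getting the range of values
- pos2 is keyed by sample, which is useful for getting the values of a given sample number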
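The relationship between the two dictionaries can be sketched as a plain transposition of a dict of dicts. This is only an illustration: the helper name `transpose` is hypothetical, and the literal `pos` below is typed in by hand from the output shown earlier rather than read from the files.

```python
from collections import defaultdict


def transpose(nested):
    """Swap the outer and inner keys of a dict of dicts (hypothetical helper)."""
    swapped = defaultdict(dict)
    for outer, inner in nested.items():
        for key, value in inner.items():
            swapped[key][outer] = value
    return dict(swapped)


# Position-keyed data, {position: {sample: letter}}, copied from the printed table
pos = {
    2025: {1: "A", 2: "G"},
    2026: {2: "A"},
    2027: {1: "C"},
    2028: {1: "T"},
    2032: {2: "T"},
}

pos2 = transpose(pos)

print(min(pos), max(pos))  # the range of positions comes straight from pos: 2025 2032
print(pos2[2])             # all values of sample 2: {2025: 'G', 2026: 'A', 2032: 'T'}
```

Transposing twice gives back the original mapping, so either dictionary could be rebuilt from the other; keeping both simply trades a little memory for simpler lookups.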

To keep the range from getting too large, I used the following values:

- sample1.vcf:

(file listing missing; per the output above, it supplies A at 2025, C at 2027 and T at 2028)

- sample2.vcf:

1 2025 blah G . blah PASS AC=0 GT:DP 0/0:61
2 2026 blah A . blah blah AC=0 GT:DP 0/0:61
3 2032 blah T . blah PASS AC=0 GT:DP 0/0:61
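For anyone who wants to try the gap filling without files on disk, here is a minimal sketch that works on in-memory text. The names `collect` and `fill_row` are hypothetical. `SAMPLE2` is the listing above; `SAMPLE1` is an assumed stand-in in which only the positions and letters come from the printed output, and the remaining fields just copy the pattern of sample2.vcf.

```python
from collections import defaultdict

# ASSUMPTION: stand-in for sample1.vcf; only the positions and letters are known
SAMPLE1 = """\
1 2025 blah A . blah PASS AC=0 GT:DP 0/0:61
2 2027 blah C . blah PASS AC=0 GT:DP 0/0:61
3 2028 blah T . blah PASS AC=0 GT:DP 0/0:61
"""

# The sample2.vcf listing
SAMPLE2 = """\
1 2025 blah G . blah PASS AC=0 GT:DP 0/0:61
2 2026 blah A . blah blah AC=0 GT:DP 0/0:61
3 2032 blah T . blah PASS AC=0 GT:DP 0/0:61
"""


def collect(samples):
    """Return {sample_number: {position: letter}} from a list of text blobs."""
    table = defaultdict(dict)
    for index, text in enumerate(samples, 1):
        for line in text.splitlines():
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            table[index][int(fields[1])] = fields[3]
    return table


def fill_row(row, start, stop, missing="NaN"):
    """List the value at every position in [start, stop], using `missing` for gaps."""
    return [row.get(p, missing) for p in range(start, stop + 1)]


table = collect([SAMPLE1, SAMPLE2])
positions = [p for row in table.values() for p in row]
start, stop = min(positions), max(positions)

for index in sorted(table):
    print(index, fill_row(table[index], start, stop))
# 1 ['A', 'NaN', 'C', 'T', 'NaN', 'NaN', 'NaN', 'NaN']
# 2 ['G', 'A', 'NaN', 'NaN', 'NaN', 'NaN', 'NaN', 'T']
```

Using `dict.get` with a default keeps the gap handling in one place, so the placeholder can be swapped (for `'.'` or `None`, say) without touching the loop.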
