python运行不出结果_帮忙看下这个python程序啊，运行没错误，但就是没结果，求指导...-CSDN博客

libralibra

是基因序列?楼主你上传你的src.fa文件才好测试啊,否则只能看意思写一个,你自己测试.

1.空行跳过

2.已有title的序列跳过

3.无内容空序列跳过

4.含有非法字符序列跳过

5.输出文件序列之间添加空行

6.序列输出长度最多70

7.原序列先后顺序不变

代码CODE:

# output results: a dict

# title1: seq

output = {}

# keep title order

titles = []

# title

title = ''

# seq

seq = []

# read content

for line in open('src.fa','r'):

sline = line.strip()

if len(sline)==0: continue # jump empty line

if sline.startswith('>'):

# not the first title, save previous one

if len(title)!=0 and title not in titles and len(seq)>0: # check and save

output[title] = seq

titles.append(title)

title = sline # save current title

seq = [] # clear seq

elif set(sline.lower())==set('atcg'): # seq only has atcg

seq.append(sline)

# output content

result = []

# line length

length = 70

for t in titles:

seq = ''.join(output[t])

result.append(t+'\n'+'\n'.join([seq[i:i+length] for i in range(0,len(seq),length)]))

# write file

fout = open('out_05.fa','w')

fout.write('\n\n'.join(result))

fout.close()

print 'DONE'

测试文件:CODE:

>test1(illegal letters)

atcqATCQadkfaldfjaldasnlfnaldkjfaldkjflafjdfalidjLKJLIJILNLKJKLJLJLJIOJLJLJKIOJLJLKJKJKLJLKJ

>test2(OK, length is bigger than 70)

ATCGatcgaTCGATCGatcgaTCGATCGatcgaTCGATCGatcgaTCGATCGatcgaTCGATCGatcgaTCGATCGatcgaTCG

>test3(no empty line, illegal letters)

aadfalskdfalsdkjfalsdjfaskljdfl;aksjdfl;kajsdl;fka;sdfa;ldkfja;skldjf;alksdjf;akd

>test4(empty seq)

>test5(duplicate title)

atcg

>test5(duplicate title)

测试结果:CODE:

>test2(OK)

ATCGatcgaTCGATCGatcgaTCGATCGatcgaTCGATCGatcgaTCGATCGatcgaTCGATCGatcgaT

CGATCGatcgaTCG

>test5(duplicate title)

atcg

，

luck_cc

引用回帖:

libralibra at 2013-07-17 16:36:59

是基因序列?楼主你上传你的src.fa文件才好测试啊,否则只能看意思写一个,你自己测试.

1.空行跳过

2.已有title的序列跳过

3.无内容空序列跳过

4.含有非法字符序列跳过

5.输出文件序列之间添加空行

6.序列输出长 ...

真的很佩服你啊，果然厉害，你的脚步脚本运行完全正确！

不过我有些不懂的：

1.if len(title)!=0 and title not in titles and len(seq)>0: # check and save

output[title] = seq

titles.append(title)

比如这一句，虽然定义了 output是一个字典，但是你怎么指定连续的一个带大于号的标题行，与接下来的序列，联系起来呢，就是作为字典的key,value调用的？

2.result.append(t+'\n'+'\n'.join([seq[i:i+length] for i in range(0,len(seq),length)]))

这个其中的 \n是要换行的吧，为什么要连续两个呢，并且都是\n,为什么不用\r呢？不知道这两个有什么区别，分别在什么时候用？

3.整天来说，你写的已经很完美了，很谢谢你。

我看了你的个人资料上有QQ:790404545,能不能加你好友呢？十分感谢！

libralibra

引用回帖:

luck_cc at 2013-07-17 18:35:46

真的很佩服你啊，果然厉害，你的脚步脚本运行完全正确！

不过我有些不懂的：

1.if len(title)!=0 and title not in titles and len(seq)>0: # check and save

output = seq

...

1这个默认了你的序列没有乱存,也就是title下面肯定是同一个的seq,同一个seq的内容之间不会有空行.否则可能需要多些判断.只要满足条件,读到>title的时候,有2种情况,要么是第一次进入(以前titia是空),要么title已经有值(前一个),

1.1 如果是第一次进入,保存title,清空seq准备接受序列;

1.2 如果不是第一次,那么title已经有值,seq也有值(可能是空),就写入结果到字典.但是如果title已经有(重复),seq是空(序列为空)或者title是空的时候不写入.然后保存当前title,清空seq准备接受序列内容.

2.python一般都用\n表示换行,result.append的时候,写完title需要换行.第二个换行符用来连接seq内容,这是python的'str'.join()函数.

qq一般不上,可以qq邮件.

luck_cc

引用回帖:

libralibra at 2013-07-17 19:31:28

1这个默认了你的序列没有乱存,也就是title下面肯定是同一个的seq,同一个seq的内容之间不会有空行.否则可能需要多些判断.只要满足条件,读到>title的时候,有2种情况,要么是第一次进入(以前titia是空),要么title已 ...

不好意思，关于第一个问题，我还是不太懂。原数据是规则的，每一个title下面是自己的seq，或者空行，但是，没有哪显示指定了他们之间的关系，怎么就赋给了字典中的对应key-value对呢？

libralibra

引用回帖:

luck_cc at 2013-07-17 21:07:52

不好意思，关于第一个问题，我还是不太懂。原数据是规则的，每一个title下面是自己的seq，或者空行，但是，没有哪显示指定了他们之间的关系，怎么就赋给了字典中的对应key-value对呢？...

if len(sline)==0: continue # jump empty line

if sline.startswith('>'):

# not the first title, save previous one

if len(title)!=0 and title not in titles and len(seq)>0: # check and save

output[title] = seq

titles.append(title)

title = sline # save current title

seq = [] # clear seq

elif set(sline.lower())==set('atcg'): # seq only has atcg

seq.append(sline)

主要看这几行

就是认为title下面肯定是对应的seq,才判断当前行是不是以>开头,如果是就记录title,如果不是,读取内容到seq.等到下次读取到title的时候,原来的title和seq不就是上一个对应关系吗?记录之后修改title,然后清空seq准备读取下一对.

luck_cc

引用回帖:

libralibra at 2013-07-17 23:01:45

if len(sline)==0: continue # jump empty line

if sline.startswith('>'):

# not the first title, save previous one

if len(title)!=0 and title not in titles and len(seq)>0: ...

if sline.startswith('>'):

# not the first title, save previous one

if len(title)!=0 and title not in titles and len(seq)>0: # check and save

这一行，有些疑问：

这个是刚开始判断，就是读取每一行，然后去掉两端空格，记为 sline对吧。

如果 sline是以大于号开始，把它赋给title，但是这个赋值是第二行的判断之后。

我不明白，第二个if判断的时候，还没有赋值，怎么就有title，和seq了呢？

luck_cc

引用回帖:

libralibra at 2013-07-17 23:01:45

if len(sline)==0: continue # jump empty line

if sline.startswith('>'):

# not the first title, save previous one

if len(title)!=0 and title not in titles and len(seq)>0: ...

还有这个：

if len(title)!=0 and title not in titles and len(seq)>0: # check and save

output[title] = seq

第二句，当前判断的还是title对吧，即使写的是seq。我自己这么判断的，不知道对不对？

那这个赋值就看不懂了， output[title] = seq，是把当前读的序列赋给了title呢，还是字典中title对应的value呢，关于字典的用法，还不是很清楚。