How do I extract specific columns from a space-separated file in Python?

weixin_39727336

Published 2021-01-13 16:57:26


I'm trying to process a file from the Protein Data Bank whose fields are separated by spaces (not \t). I have a .txt file and I want to extract specific rows and, from those rows, only a few columns.

I need to do it in Python. I first tried on the command line with awk and had no problem, but I have no idea how to do the same in Python.

Here is an extract of my file:

[...]

SEQRES 6 B 80 ALA LEU SER ILE LYS LYS ALA GLN THR PRO GLN GLN TRP

SEQRES 7 B 80 LYS PRO

HELIX 1 1 THR A 68 SER A 81 1 14

HELIX 2 2 CYS A 97 LEU A 110 1 14

HELIX 3 3 ASN A 122 SER A 133 1 12

[...]

For example, I'd like to take only the 'HELIX' rows and then the 4th, 6th, 7th and 9th columns. I started reading the file line by line with a for loop and then extracted those rows starting with 'HELIX'... and that's all.

EDIT: This is the code I have right now, but the print doesn't work properly: it prints only the first line of each block (HELIX, SHEET and DBREF).

#!/usr/bin/python

import sys

for line in open(sys.argv[1]):
    if 'HELIX' in line:
        helix = line.split()
    elif 'SHEET' in line:
        sheet = line.split()
    elif 'DBREF' in line:
        dbref = line.split()

print (helix), (sheet), (dbref)

Solution

If you have already extracted the line, you can split it on whitespace with line.split(). This returns a list, from which you can pick the elements you need by index (starting at 0):

>>> test='HELIX 2 2 CYS A 97'

>>> test.split()

['HELIX', '2', '2', 'CYS', 'A', '97']

>>> test.split()[3]

'CYS'
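
Putting the split together with the row filter from the question, a minimal sketch could look like the following. The function name extract_columns and its parameters are made up for illustration, and it assumes every matching line has at least as many whitespace-separated fields as the largest requested column. Rows are appended to a list, so each HELIX line is kept instead of a single variable being overwritten on every pass through the loop.

```python
def extract_columns(lines, record="HELIX", columns=(4, 6, 7, 9)):
    """Return the chosen 1-based columns of every line whose first field is `record`."""
    rows = []
    for line in lines:
        fields = line.split()  # splits on any run of whitespace
        # Compare the first field instead of using `record in line`,
        # so a line that merely contains the word is not picked up.
        if fields and fields[0] == record:
            # Columns are 1-based (as in awk), list indices are 0-based.
            rows.append([fields[c - 1] for c in columns])
    return rows


# Sample lines copied from the extract in the question.
sample = [
    "SEQRES 7 B 80 LYS PRO",
    "HELIX 1 1 THR A 68 SER A 81 1 14",
    "HELIX 2 2 CYS A 97 LEU A 110 1 14",
    "HELIX 3 3 ASN A 122 SER A 133 1 12",
]

for row in extract_columns(sample):
    print(" ".join(row))
# THR 68 SER 81
# CYS 97 LEU 110
# ASN 122 SER 133
```

To run it on a real file, pass an open file object instead of the sample list, for example the one returned by open(sys.argv[1]). Note that PDB records are defined by fixed character columns, so on files where adjacent fields touch or a field is blank, whitespace splitting may shift the column numbers; slicing the line by character position would be the safer choice there.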
