RDKit | Reading Molecules

GitHub: link

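Reading SMILES/SMARTS

from rdkit import Chem

m = Chem.MolFromSmiles('C[C@H](O)c1ccccc1')  # molecule from a SMILES string
m = Chem.MolFromSmarts('Cc1ccccc1')  # query molecule from a SMARTS pattern; rebinds m
m  # in a notebook, displays the molecule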
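
MolFromSmiles returns None instead of raising when a string cannot be parsed, so it is worth checking the result before using it. A minimal sketch, assuming RDKit is installed; the helper name parse_smiles and the invalid input are made up for illustration:

```python
from rdkit import Chem


def parse_smiles(smiles):
    """Return an RDKit Mol for a SMILES string, or raise ValueError (hypothetical helper)."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:  # RDKit signals a parse failure with None
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    return mol


mol = parse_smiles("C[C@H](O)c1ccccc1")
# Hydrogens are implicit here, so only heavy atoms are counted
num_heavy_atoms = mol.GetNumAtoms()

try:
    parse_smiles("c1ccccc")  # the ring opened by "1" is never closed
except ValueError as err:
    error_message = str(err)

print(num_heavy_atoms, error_message)
```

Raising early like this keeps a None from propagating into later calls such as Chem.MolToSmiles, which does not accept it.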

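Batch reading from files

Batch reading from .csv: SmilesMolSupplier(data, delimiter, smilesColumn, nameColumn, titleLine, sanitize)
data: path to the data file
delimiter: field separator; defaults to whitespace
smilesColumn: index of the column holding the SMILES; defaults to 0
nameColumn: index of the column holding the molecule names; defaults to 1
titleLine: whether the file has a header line; defaults to True
sanitize: whether to sanitize (validate) each molecule; defaults to True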

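# Read molecules directly from the file
suppl = Chem.SmilesMolSupplier(data="./data/batch.csv", delimiter=",")
smiles = [Chem.MolToSmiles(m) for m in suppl if m is not None]  # rows that fail to parse yield None
print(smiles)

# Alternatively, load the file contents first and parse from text
with open("./data/batch.csv", "r", encoding="utf-8") as f:
    content = f.read()
suppl = Chem.SmilesMolSupplierFromText(text=content, delimiter=",")
smiles = [Chem.MolToSmiles(m) for m in suppl if m is not None]
print(smiles)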
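
To try the calls above, a small input file is needed. The sketch below writes one with the standard library and reads it back to show how titleLine, smilesColumn, and nameColumn map onto the layout. The file name batch_example.csv, the SMILES/Name headers, and the three molecules are assumptions for illustration, not the contents of the author's ./data/batch.csv.

```python
import csv
import os
import tempfile

rows = [
    ("SMILES", "Name"),        # header row, corresponds to titleLine=True
    ("CCO", "ethanol"),        # column 0 is smilesColumn, column 1 is nameColumn
    ("c1ccccc1", "benzene"),
    ("CC(=O)O", "acetic acid"),
]

csv_path = os.path.join(tempfile.mkdtemp(), "batch_example.csv")
with open(csv_path, "w", newline="", encoding="utf-8") as f:
    csv.writer(f, delimiter=",").writerows(rows)

# Read it back the way the supplier is configured: skip the title line,
# then take the SMILES and the name from their columns.
with open(csv_path, newline="", encoding="utf-8") as f:
    reader = csv.reader(f, delimiter=",")
    next(reader)  # title line
    records = [(row[0], row[1]) for row in reader]

print(records)
```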

Batch reading from a DataFrame

Reading SMILES from a DataFrame: AddMoleculeColumnToFrame(frame, smilesCol, molCol, includeFingerprints)
frame: the DataFrame object
smilesCol: name of the column holding the SMILES
molCol: name of the new column that will hold the generated RDKit mol objects
includeFingerprints: whether to generate fingerprints

from rdkit.Chem import PandasTools
import pandas as pd

df = pd.read_csv('./data/batch.csv')
PandasTools.AddMoleculeColumnToFrame(frame=df, smilesCol='SMILES', molCol='mol', includeFingerprints=True)


Next, we can compute the molecular weight of each molecule:

from rdkit.Chem import Descriptors

df["MW"] = df["mol"].apply(Descriptors.MolWt)
df

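Batch reading from .sdf: SDMolSupplier(fileName, sanitize, removeHs, strictParsing)
fileName: path to the file
sanitize: check valences and compute aromaticity, conjugation, hybridization, and Kekulé form; defaults to True
removeHs: whether to remove hydrogen atoms; defaults to True
strictParsing: whether to parse in strict mode; defaults to True

suppl = Chem.SDMolSupplier("./data/batch.sdf")
smiles = [Chem.MolToSmiles(m) for m in suppl if m is not None]  # records that fail to parse yield None
print(smiles)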
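
The effect of removeHs is easiest to see on a file that actually contains explicit hydrogens. The sketch below, which assumes RDKit is installed, writes ethanol with explicit hydrogens to a temporary SDF (the file name is made up) and reads it back both ways:

```python
import os
import tempfile

from rdkit import Chem

# Ethanol with explicit hydrogens: 3 heavy atoms plus 6 hydrogens
mol_with_hs = Chem.AddHs(Chem.MolFromSmiles("CCO"))

sdf_path = os.path.join(tempfile.mkdtemp(), "ethanol_example.sdf")
writer = Chem.SDWriter(sdf_path)
writer.write(mol_with_hs)
writer.close()

# Default behaviour (removeHs=True) strips the hydrogens on read
default_suppl = Chem.SDMolSupplier(sdf_path)
heavy_only = default_suppl[0]

# removeHs=False keeps them as explicit atoms
keep_hs_suppl = Chem.SDMolSupplier(sdf_path, removeHs=False)
with_hs = keep_hs_suppl[0]

print(heavy_only.GetNumAtoms(), with_hs.GetNumAtoms())
```

Keeping the hydrogens matters mainly when the file carries 3D coordinates that later steps rely on.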

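Reading from a file object or a .gz archive

import gzip

with gzip.open("./data/batch.sdf.gz", "rb") as gz_file:
    suppl = Chem.ForwardSDMolSupplier(gz_file)
    # The supplier reads lazily, so consume it while the file is still open
    smiles = [Chem.MolToSmiles(m) for m in suppl if m is not None]
print(smiles)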
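
ForwardSDMolSupplier consumes its stream once, front to back, which is why it suits a compressed file object. The standard-library sketch below illustrates the same single-pass idea without RDKit by counting the $$$$ record terminators in a gzip stream. The two-record payload and the file name are placeholders, not valid molfile data.

```python
import gzip
import os
import tempfile

# Placeholder SD-style text: each record ends with a "$$$$" line.
sd_text = "record one\n$$$$\nrecord two\n$$$$\n"

gz_path = os.path.join(tempfile.mkdtemp(), "example.sdf.gz")
with gzip.open(gz_path, "wb") as f:
    f.write(sd_text.encode("utf-8"))

# gzip.open in mode "rb" (or plain "r") yields bytes; walk the lines in one pass.
record_count = 0
with gzip.open(gz_path, "rb") as f:
    for line in f:
        if line.strip() == b"$$$$":
            record_count += 1

print(record_count)
```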

Reading .mol

Reading from a .mol file: MolFromMolFile(fileName, sanitize, removeHs, strictParsing)

m = Chem.MolFromMolFile('./data/single.mol')
m

Reading .mol2

Not recommended, since this parser is bug-prone: MolFromMol2File(…)

m = Chem.MolFromMol2File('data/batch.mol2')
print(Chem.MolToSmiles(m))

Reading PDB

mol = Chem.MolFromPDBFile("./data/single.pdb")
print(Chem.MolToSmiles(mol))
mol = Chem.MolFromPDBBlock("""COMPND    UNNAMED
AUTHOR    GENERATED BY OPEN BABEL 3.1.1
HETATM    1  C   UNL     1       0.000   0.000   0.000  1.00  0.00           C  
HETATM    2  C   UNL     1       0.000   0.000   0.000  1.00  0.00           C  
HETATM    3  C   UNL     1       0.000   0.000   0.000  1.00  0.00           C  
HETATM    4  C   UNL     1       0.000   0.000   0.000  1.00  0.00           C  
HETATM    5  C   UNL     1       0.000   0.000   0.000  1.00  0.00           C  
HETATM    6  C   UNL     1       0.000   0.000   0.000  1.00  0.00           C  
HETATM    7  C   UNL     1       0.000   0.000   0.000  1.00  0.00           C  
HETATM    8  C   UNL     1       0.000   0.000   0.000  1.00  0.00           C  
CONECT    1    8    2    2                                            
CONECT    2    1    1    3                                            
CONECT    3    2    4    4                                            
CONECT    4    3    3    5                                            
CONECT    5    4    6    6                                            
CONECT    6    5    5    7                                            
CONECT    7    6    8    8                                            
CONECT    8    7    7    1                                            
MASTER        0    0    0    0    0    0    0    0    8    0    8    0
END""")
print(Chem.MolToSmiles(mol))
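
In the CONECT records of the block above, a neighbor serial is listed twice wherever a double bond is intended, which is how this Open Babel output encodes bond order. The sketch below counts those repeats with plain Python to recover the bond orders. It assumes whitespace-separated fields (which breaks down for very large serial numbers), and bond_orders is a hypothetical helper, not an RDKit function.

```python
from collections import Counter

conect_lines = """\
CONECT    1    8    2    2
CONECT    2    1    1    3
CONECT    3    2    4    4
CONECT    4    3    3    5
CONECT    5    4    6    6
CONECT    6    5    5    7
CONECT    7    6    8    8
CONECT    8    7    7    1
""".splitlines()


def bond_orders(lines):
    """Map each bonded pair (low serial, high serial) to how often it is repeated."""
    orders = {}
    for line in lines:
        fields = line.split()
        if not fields or fields[0] != "CONECT":
            continue
        atom = int(fields[1])
        neighbor_counts = Counter(int(x) for x in fields[2:])
        for neighbor, count in neighbor_counts.items():
            pair = (min(atom, neighbor), max(atom, neighbor))
            orders[pair] = count  # the partner's record repeats the same value
    return orders


orders = bond_orders(conect_lines)
print(orders)
```

For these records the result is an eight-membered ring with alternating single and double bonds.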

Reading FASTA sequences

mol = Chem.MolFromFASTA(""">3CA7_1|Chain A|Protein spitz|Drosophila melanogaster (7227)
TFPTYKCPETFDAWYCLNDAHCFAVKIADLPVYSCECAIGFMGQRCEYKEID""")
mol
mol = Chem.MolFromSequence("TFPTYKCPETFDAWYCLNDAHCFAVKIADLPVYSCECAIGFMGQRCEYKEID")
mol
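
In the calls above, MolFromFASTA is given the record together with its > header line, whereas MolFromSequence is given the bare one-letter sequence. A small sketch of getting from the first input to the second; fasta_to_sequence is a hypothetical helper and handles only a single record:

```python
def fasta_to_sequence(fasta_text):
    """Drop '>' header lines and join the remaining lines of a single-record FASTA."""
    lines = [ln.strip() for ln in fasta_text.splitlines()]
    return "".join(ln for ln in lines if ln and not ln.startswith(">"))


fasta = """>3CA7_1|Chain A|Protein spitz|Drosophila melanogaster (7227)
TFPTYKCPETFDAWYCLNDAHCFAVKIADLPVYSCECAIGFMGQRCEYKEID"""

sequence = fasta_to_sequence(fasta)
print(sequence, len(sequence))
```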

Reading InChI

mol = Chem.MolFromInchi("InChI=1S/C8H10O/c1-7(9)8-5-3-2-4-6-8/h2-7,9H,1H3/t7-/m0/s1")
print(Chem.MolToSmiles(mol))