MoleculeNet Datasets

Datasets (moleculenet.org)

Dataset Details

(a) All MoleculeNet datasets are split into training, validation, and test subsets following an 80/10/10 ratio. Different splitting methods are recommended depending on each dataset's contents. For details of the splitting methods, please refer to the paper.
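As a rough illustration of the 80/10/10 ratio, the sketch below performs a plain random split using only the Python standard library. The function name `random_split` and its parameters are hypothetical, and random shuffling is only one possible strategy; as noted above, the paper recommends different splitting methods for different datasets.

```python
import random


def random_split(items, frac_train=0.8, frac_valid=0.1, seed=0):
    """Shuffle items and cut them into train/valid/test lists.

    Hypothetical helper for illustration; not the splitter used by MoleculeNet.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = int(frac_train * len(shuffled))
    n_valid = int(frac_valid * len(shuffled))
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # whatever remains goes to test
    return train, valid, test


# Stand-in for a dataset: 100 row indices
train, valid, test = random_split(range(100))
print(len(train), len(valid), len(test))
```

Splitting row indices rather than the rows themselves keeps the sketch independent of how the molecules are stored.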

(b) Different classification and regression metrics are recommended based on previous work and each dataset's contents:

          ROC-AUC:  Area Under Curve of Receiver Operating Characteristics

          PRC-AUC:  Area Under Curve of Precision Recall Curve

          RMSE: Root-Mean-Square Error

          MAE: Mean Absolute Error

    For details of metrics please refer to the paper. 
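To make the metric names concrete, here is a minimal pure-Python sketch of RMSE, MAE, and ROC-AUC. The function names are hypothetical, and ROC-AUC is computed through its pairwise-ranking interpretation (the fraction of positive/negative pairs ranked correctly, with ties counted as half) rather than by tracing the curve. PRC-AUC is left out because its value depends on the interpolation convention chosen.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(labels, scores):
    """Probability that a random positive scores above a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy inputs chosen for illustration only
print(rmse([0.0, 0.0], [3.0, 4.0]))
print(mae([0.0, 0.0], [3.0, 4.0]))
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise form is quadratic in the number of samples, so it suits small checks; for full datasets a library implementation would be the more practical choice.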

Sample rows from the FreeSolv dataset

iupac                             smiles                  expt    calc
4-methoxy-N,N-dimethyl-benzamide  CN(C)C(=O)c1ccc(cc1)OC  -11.01  -9.625
methanesulfonyl chloride          CS(=O)(=O)Cl            -4.87   -6.219
3-methylbut-1-ene                 CC(C)C=C                1.83    2.452
2-ethylpyrazine                   CCc1cnccn1              -5.45   -5.809
heptan-1-ol                       CCCCCCCO                -4.21   -2.917
3,5-dimethylphenol                Cc1cc(cc(c1)O)C         -6.27   -5.444
2,3-dimethylbutane                CC(C)C(C)C              2.34    2.468
2-methylpentan-2-ol               CCCC(C)(C)O             -3.92   -2.779
1,2-dimethylcyclohexane           C[C@@H]1CCCC[C@@H]1C    1.58    1.685

Sample rows from the HIV dataset

smiles                                                                                activity  HIV_active
CCC1=[O+][Cu-3]2([O+]=C(CC)C1)[O+]=C(CC)CC(CC)=[O+]2                                  CI        0
C(=Cc1ccccc1)C1=[O+][Cu-3]2([O+]=C(C=Cc3ccccc3)CC(c3ccccc3)=[O+]2)[O+]=C(c2ccccc2)C1  CI        0
CC(=O)N1c2ccccc2Sc2c1ccc1ccccc21                                                      CI        0
Nc1ccc(C=Cc2ccc(N)cc2S(=O)(=O)O)c(S(=O)(=O)O)c1                                       CI        0
O=S(=O)(O)CCS(=O)(=O)O                                                                CI        0

Sample rows from the BBBP dataset

num  name                  p_np  smiles
1    Propanolol            1     [Cl].CC(C)NCC(O)COc1cccc2ccccc12
2    Terbutylchlorambucil  1     C(=O)(OC(C)(C)C)CCCc1ccc(cc1)N(CCCl)CCCl
3    40730                 1     c12c3c(N4CCN(C)CC4)c(F)cc1c(c(C(O)=O)cn2C(C)CO3)=O
4    24                    1     C1CCN(CC1)Cc1cccc(c1)OCCCNC(=O)C
5    cloxacillin           1     Cc1onc(c2ccccc2Cl)c1C(=O)N[C@H]3[C@H]4SC(C)(C)[C@@H](N4C3=O)C(O)=O
6    cefoperazone          1     CCN1CCN(C(=O)N[C@@H](C(=O)N[C@H]2[C@H]3SCC(=C(N3C2=O)C(O)=O)CSc4nnnn4C)c5ccc(O)cc5)C(=O)C1=O
7    rolitetracycline      1     CN(C)[C@H]1[C@@H]2C[C@H]3C(=C(O)c4c(O)cccc4[C@@]3(C)O)C(=O)[C@]2(O)C(=O)\C(=C(/O)NCN5CCCC5)C1=O
8    ondansetron           1     Cn1c2CCC(Cn3ccnc3C)C(=O)c2c4ccccc14
9    diltiazem             1     COc1ccc(cc1)[C@@H]2Sc3ccccc3N(CCN(C)C)C(=O)[C@@H]2OC(C)=O

Sample rows from the QM8 dataset

smiles                        E1-CC2    E2-CC2    f1-CC2    f2-CC2    E1-PBE0   E2-PBE0   f1-PBE0   f2-PBE0   E1-PBE0   E2-PBE0   f1-PBE0   f2-PBE0   E1-CAM    E2-CAM    f1-CAM    f2-CAM
[H]C([H])([H])[H]             0.432952  0.43296   0.249728  0.249736  0.430218  0.430236  0.181436  0.181502  0.430218  0.430236  0.181436  0.181502  0.409931  0.409939  0.1832    0.1832
[H]N([H])[H]                  0.26522   0.350081  0.067015  0.030049  0.268386  0.349106  0.040761  0.031641  0.268386  0.349106  0.040761  0.031641  0.253853  0.334481  0.0575    0.0238
[H]O[H]                       0.286537  0.363579  0.037755  0         0.291377  0.362091  0.019503  1E-08     0.291377  0.362091  0.019503  1E-08     0.278519  0.350074  0.0333    0
[H]C#C[H]                     0.358629  0.358629  0         0         0.256321  0.268469  0         0         0.256321  0.268469  0         0         0.244879  0.255051  0         0
[H]C#N                        0.319958  0.336074  0         0         0.295139  0.311657  0         0         0.295139  0.311657  0         0         0.283426  0.296993  0         0
[H]C([H])=O                   0.153914  0.291234  0         0.091023  0.148553  0.312962  0         0.157916  0.148553  0.312962  0         0.157916  0.146839  0.304442  0         0.0954
[H]C([H])([H])C([H])([H])[H]  0.376138  0.376146  0         0         0.372867  0.372891  0         0         0.372867  0.372891  0         0         0.354965  0.354976  0         0
[H]OC([H])([H])[H]            0.266691  0.333191  0.000944  0.071608  0.277884  0.331415  0.001311  0.056824  0.277884  0.331415  0.001311  0.056824  0.261225  0.325294  0.0003    0.0653
[H]C#CC([H])([H])[H]          0.273389  0.28575   0         0.001194  0.251415  0.26275   0         0.001653  0.251415  0.26275   0         0.001653  0.243832  0.253357  0         0.0009
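The QM8 header lists the four PBE0 columns twice, so a reader that keys rows by column name would silently keep only one copy of each. The sketch below shows one way to handle this with the standard `csv` module, assuming the data is available as a comma-separated file (the source does not state the file format). The helpers `unique_headers` and `read_rows` and the `.1` suffix scheme are hypothetical choices made for illustration.

```python
import csv
import io


def unique_headers(names):
    """Append .1, .2, ... to repeated column names so every key is distinct."""
    seen = {}
    result = []
    for name in names:
        count = seen.get(name, 0)
        result.append(name if count == 0 else f"{name}.{count}")
        seen[name] = count + 1
    return result


def read_rows(handle):
    """Read CSV rows into dicts keyed by de-duplicated column names."""
    reader = csv.reader(handle)
    header = unique_headers(next(reader))
    return [dict(zip(header, row)) for row in reader]


# In-memory stand-in for the real file: a subset of the columns,
# with values copied from the [H]O[H] sample row
sample = io.StringIO(
    "smiles,E1-PBE0,E2-PBE0,E1-PBE0,E2-PBE0\n"
    "[H]O[H],0.291377,0.362091,0.291377,0.362091\n"
)
rows = read_rows(sample)
print(sorted(rows[0]))
```

Values are left as strings here; converting them with `float` would be a natural next step, since entries such as `1E-08` parse directly.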
