1 MGSSL
1.1 torch 1.10
Find the install command for the matching version on the official PyTorch website.
For CPU:
conda install pytorch=1.8.1 torchvision torchaudio cpuonly -c pytorch
For GPU (CUDA 10.2):
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
Check the installed version:
import torch
print(torch.__version__)
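The printed version string may carry a local build tag such as +cpu or +cu102. A minimal stdlib sketch (the helper name version_tuple is my own, not part of torch) for comparing it against a required minimum:

```python
def version_tuple(version):
    """Convert a version string like '1.8.1+cpu' to a comparable tuple (1, 8, 1).

    The part after '+' is a local build tag (e.g. cpu, cu102) and is ignored.
    """
    return tuple(int(part) for part in version.split("+")[0].split(".")[:3])

# Example usage against an installed torch (commented out so the sketch is standalone):
# import torch
# assert version_tuple(torch.__version__) >= version_tuple("1.8.1")
print(version_tuple("1.8.1+cpu"))  # -> (1, 8, 1)
```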
1.2 torch-geometric
conda install pyg -c pyg -c conda-forge
1.3 RDKit
See https://blog.csdn.net/qq_35337126/article/details/107801288 for details.
1.4 tensorboardX 1.6
Install with pip:
pip install tensorboardX==1.6
2 Meta-MGNN (2022/3/12)
conda create -n Meta python=3.6
2.1 torch = 1.4.0
Method 1: install directly with pip or conda:
pip install torch==1.4.0
conda install pytorch=1.4.0
Method 2: download the matching PyTorch wheel (the installed package is named torch):
① First check the CUDA version
② Go to https://download.pytorch.org/whl/torch_stable.html
and download the matching PyTorch wheel
③ Install it with pip:
pip install torch-1.4.0-cp36-cp36m-linux_x86_64.whl
(The message shown here appears because torch was already installed via Method 1.)
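Picking the right wheel means matching the Python tag (cp36) and, where present, the CUDA tag (cu101) in the filename. Wheel filenames follow the {dist}-{version}-{python}-{abi}-{platform}.whl convention, so they can be checked with the stdlib alone; the helper wheel_tags below is my own illustration, not part of any of these packages:

```python
def wheel_tags(filename):
    """Split a wheel filename into its distribution, version, and compatibility tags.

    Wheel names follow: {dist}-{version}-{python tag}-{abi tag}-{platform tag}.whl
    (distribution names use underscores, never hyphens, inside a wheel filename).
    """
    stem = filename[:-len(".whl")]
    dist, version, python_tag, abi_tag, platform_tag = stem.split("-")
    # The local version tag after '+' encodes the CUDA build, e.g. cu101.
    cuda = version.split("+")[1] if "+" in version else None
    return {"dist": dist, "version": version, "python": python_tag,
            "abi": abi_tag, "platform": platform_tag, "cuda": cuda}

print(wheel_tags("torch-1.4.0-cp36-cp36m-linux_x86_64.whl"))
```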
2.2 torch-geometric = 1.6.1
pip install torch-geometric==1.6.1
2.3 torch-scatter = 2.0.4
From the official wheel index: ① download the matching wheel, ② upload it to the server, then ③ install it with pip.
① URL: https://pytorch-geometric.com/whl/torch-1.4.0.html
② Upload the wheel
③ Install with pip:
pip install torch_scatter-2.0.4+cu101-cp36-cp36m-linux_x86_64.whl
2.4 torch-sparse = 0.6.1
https://pytorch-geometric.com/whl/torch-1.4.0%2Bcu101.html
Download the matching wheel (CUDA 10.1, Python 3.6, torch-sparse = 0.6.1) and install it with pip as in 2.3.
2.5 scikit-learn = 0.23.2
pip install scikit-learn==0.23.2
2.6 tqdm = 4.50.0
pip install tqdm==4.50.0
2.7 rdkit
conda install rdkit
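With all of the dependencies above installed, a quick sanity check is to verify that each one can actually be imported. A stdlib-only sketch (missing_packages is my own helper; note that scikit-learn imports as sklearn and torch-geometric as torch_geometric):

```python
import importlib.util

# Import names for the Meta-MGNN dependencies listed above.
REQUIRED = ["torch", "torch_geometric", "torch_scatter", "torch_sparse",
            "sklearn", "tqdm", "rdkit"]

def missing_packages(names):
    """Return the subset of package names that cannot be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# An empty list means the environment is complete.
print(missing_packages(REQUIRED))
```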
3 GeoGNN
3.1 Pretraining
python pretrain.py --batch_size=256 --num_workers=4 --max_epoch=50 --lr=1e-3 --dropout_rate=0.2 --dataset="zinc" --data_path="./demo_zinc_smiles" --compound_encoder_config="model_configs/geognn_l8.json" --model_config="model_configs/pretrain_gem.json" --model_dir=./pretrain_models/zinc
The pretrained model is generated afterwards (only one epoch was run here).
3.2 Downstream finetuning
3.2.1
Firstly, download the pretrained model from the previous step:
wget https://baidu-nlp.bj.bcebos.com/PaddleHelix/pretrained_models/compound/pretrain_models-chemrl_gem.tgz
tar xzf pretrain_models-chemrl_gem.tgz
3.2.2
Download the downstream molecular property prediction datasets from MoleculeNet, including classification tasks and regression tasks:
wget https://baidu-nlp.bj.bcebos.com/PaddleHelix/datasets/compound_datasets/chemrl_downstream_datasets.tgz
tar xzf chemrl_downstream_datasets.tgz
Then delete pretrain_models-chemrl_gem.tgz and chemrl_downstream_datasets.tgz
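The two download-and-extract steps above can also be scripted; a stdlib-only sketch mirroring wget + tar xzf (the function name download_and_extract is my own):

```python
import tarfile
import urllib.request

def download_and_extract(url, archive_path, dest="."):
    """Download a .tgz archive and unpack it, mirroring `wget` + `tar xzf`."""
    urllib.request.urlretrieve(url, archive_path)
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest)

# Example (commented out to avoid the large download when run standalone):
# download_and_extract(
#     "https://baidu-nlp.bj.bcebos.com/PaddleHelix/datasets/compound_datasets/chemrl_downstream_datasets.tgz",
#     "chemrl_downstream_datasets.tgz")
```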
3.2.3
Run downstream finetuning and the final results will be saved under ./log/pretrain-$dataset/final_result
① Classification tasks
[1] bace
Prepare the cached data (--task=data):
python finetune_class.py --task=data --num_workers=10 --dataset_name="bace" --data_path="chemrl_downstream_datasets/bace" --cached_data_path="cached_data/bace" --compound_encoder_config="model_configs/geognn_l8.json" --model_config="model_configs/down_mlp2.json"
Run finetuning for results:
python finetune_class.py --batch_size=32 --max_epoch=100 --dataset_name="bace" --data_path="chemrl_downstream_datasets/bace" --cached_data_path="cached_data/bace" --split_type=scaffold --compound_encoder_config="model_configs/geognn_l8.json" --model_config="model_configs/down_mlp2.json" --init_model="pretrain_models-chemrl_gem/class.pdparams" --model_dir=./finetune_models/$dataset --encoder_lr="1e-3" --head_lr="1e-3" --dropout_rate=0.2
[2] bbbp
Prepare the cached data (--task=data):
python finetune_class.py --task=data --num_workers=10 --dataset_name="bbbp" --data_path="chemrl_downstream_datasets/bbbp" --cached_data_path="cached_data/bbbp" --compound_encoder_config="model_configs/geognn_l8.json" --model_config="model_configs/down_mlp2.json"
Run finetuning for results:
[3] clintox
python finetune_class.py --task=data --num_workers=10 --dataset_name="clintox" --data_path="chemrl_downstream_datasets/clintox" --cached_data_path="cached_data/clintox" --compound_encoder_config="model_configs/geognn_l8.json" --model_config="model_configs/down_mlp2.json"
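The data-caching commands above differ only in the dataset name, so they can be generated from one template; a small sketch (the dataset list is taken from the classification tasks in this section, and cache_command is my own helper):

```python
# Classification datasets used above; each gets an identical caching command
# except for the dataset name.
CLASS_DATASETS = ["bace", "bbbp", "clintox"]

TEMPLATE = (
    'python finetune_class.py --task=data --num_workers=10 '
    '--dataset_name="{d}" --data_path="chemrl_downstream_datasets/{d}" '
    '--cached_data_path="cached_data/{d}" '
    '--compound_encoder_config="model_configs/geognn_l8.json" '
    '--model_config="model_configs/down_mlp2.json"'
)

def cache_command(dataset):
    """Build the data-caching command line for one downstream dataset."""
    return TEMPLATE.format(d=dataset)

for d in CLASS_DATASETS:
    print(cache_command(d))
```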
② Regression tasks