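AlphaFold Installation: Non-Docker Setup (2)
Continuing from the previous post: not long after I talked to my senior labmate, he got things sorted out for me. We had a brainstorming session that ran over an hour, went out for dinner, and then came back to carry on!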
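8. Apply the OpenMM patch
This was actually the first time I had come across patch; I had never used it before. After looking it up, I learned that it is a command-line tool that applies a patch (a diff file) to existing files. Look it up yourself if you want more detail.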
cd /share2/pub/yangjy/yangjy/conda3/envs/alphafold/lib/python3.8/site-packages/ && patch -p0 < /share2/pub/yangjy/yangjy/softs/alphafold/docker/openmm.patch
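My understanding is that the command above uses the differences recorded in the .patch file to update the files installed in the alphafold conda environment. The -p0 option leaves the file paths in the patch unchanged, which is why you cd into site-packages first. I think the point is to avoid a mismatch in the later steps! (This is just my own understanding; if you know better, ping me~)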
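Since patch edits files in place, it can be worth checking that the patch applies cleanly before actually applying it. Below is a minimal sketch using the --dry-run option of GNU patch. The function name apply_patch_safely is my own invention for illustration, and the paths in the usage comment are placeholders, not the real paths on any particular machine.

```shell
# apply_patch_safely TARGET_DIR PATCH_FILE
# Hypothetical helper: does a dry run first and only applies the patch
# if the dry run reports no problems.
apply_patch_safely() {
    local target_dir="$1"
    local patch_file="$2"
    (
        cd "$target_dir" || exit 1
        # --dry-run prints what would happen without touching any file
        patch -p0 --dry-run < "$patch_file" > /dev/null || exit 1
        patch -p0 < "$patch_file"
    )
}

# Usage (placeholder paths, substitute your own):
# apply_patch_safely /path/to/envs/alphafold/lib/python3.8/site-packages \
#                    /path/to/alphafold/docker/openmm.patch
```

If the dry run fails, nothing is modified, so you can inspect the patch and the target files before trying again.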
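9. The script for running AlphaFold
There is a complete bash script for this. I am giving you the source link instead of pasting it here, because it is a bit long~
Source link: download it directly and put it in the alphafold directory, and that's it.
Appendix: parameter descriptions
A couple of side remarks from personal experience: Google Translate is sometimes really inaccurate. The translated text is stiff and hard to follow, whereas the original English is easier to understand and harder to misread. The meaning often shifts in translation, which can be misleading! Another good way to learn: whether it is an R package or a Python package or function, if something is unclear, read the source code directly. It is usually documented in detail, both the parameter descriptions and the examples.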
Usage: ./run_alphafold_v21.sh <OPTIONS>
Required Parameters:
-d <data_dir> Path to directory of supporting data
-o <output_dir> Path to a directory that will store the results.
-f <fasta_path> Path to a FASTA file containing sequence. If a FASTA file contains multiple sequences, then it will be folded as a multimer
-t <max_template_date> Maximum template release date to consider (ISO-8601 format - i.e. YYYY-MM-DD). Important if folding historical test sets
Optional Parameters:
-g <use_gpu> Enable NVIDIA runtime to run with GPUs (default: true)
-n <openmm_threads> OpenMM threads (default: all available cores)
-a <gpu_devices> Comma separated list of devices to pass to 'CUDA_VISIBLE_DEVICES' (default: 0)
-m <model_preset> Choose preset model configuration - the monomer model, the monomer model with extra ensembling, monomer model with pTM head, or multimer model (default: 'monomer')
-c <db_preset> Choose preset MSA database configuration - smaller genetic database config (reduced_dbs) or full genetic database config (full_dbs) (default: 'full_dbs')
-p <use_precomputed_msas> Whether to read MSAs that have been written to disk. WARNING: This will not check if the sequence, database or configuration have changed (default: 'false')
-l <is_prokaryote> Optional for multimer system, not used by the single chain system. A boolean specifying true where the target complex is from a prokaryote, and false where it is not, or where the origin is unknown. This value determine the pairing method for the MSA (default: 'None')
-b <benchmark> Run multiple JAX model evaluations to obtain a timing that excludes the compilation time, which should be more indicative of the time required for inferencing many proteins (default: 'false')
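Pay attention, this is the key point!! The script goes directly in the alphafold folder, not in its subfolder that is also named alphafold! Here is a screenshot:
10. The prepared databases
(Here is a picture of the downloaded databases. My senior labmate downloaded them earlier and I did not want to download them again. This is what last year's looked like; I don't know whether anything has changed this year.)
11. Running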
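The placement can also be checked from the command line. The sketch below assumes that run_alphafold.py lives at the top level of the AlphaFold checkout (as it does in the official repository) and simply tests that the bash script sits next to it. The function name check_script_location is hypothetical, and the script name is a parameter because this post refers to it both as run_alphafold.sh and run_alphafold_v21.sh.

```shell
# check_script_location REPO_DIR [SCRIPT_NAME]
# Hypothetical helper: succeeds only if SCRIPT_NAME is in the same
# folder as run_alphafold.py (the top-level alphafold folder).
check_script_location() {
    local repo_dir="$1"
    local script_name="${2:-run_alphafold.sh}"
    if [ ! -f "$repo_dir/run_alphafold.py" ]; then
        echo "run_alphafold.py not found in $repo_dir (wrong folder?)" >&2
        return 1
    fi
    if [ ! -f "$repo_dir/$script_name" ]; then
        echo "$script_name not found in $repo_dir" >&2
        return 1
    fi
    echo "OK: $script_name sits next to run_alphafold.py"
}

# Usage (placeholder path):
# check_script_location /path/to/alphafold run_alphafold.sh
```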
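Because the database layout may differ between releases, a quick way to compare a directory against whatever list you expect is sketched below. The helper missing_subdirs is hypothetical, and the subdirectory names in the usage comment are an assumption based on the AlphaFold README from around that time; check them against the version you actually installed.

```shell
# missing_subdirs DATA_DIR NAME...
# Hypothetical helper: prints every expected subdirectory that is absent
# and returns 1 if at least one is missing.
missing_subdirs() {
    local data_dir="$1"
    shift
    local rc=0
    local name
    for name in "$@"; do
        if [ ! -d "$data_dir/$name" ]; then
            echo "missing: $name"
            rc=1
        fi
    done
    return "$rc"
}

# Usage (placeholder path; directory names are assumed, not taken from this post):
# missing_subdirs /path/to/databases bfd mgnify params pdb70 pdb_mmcif uniclust30 uniref90
```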
# Example run (Uses the GPU with index id 0 as default)
bash run_alphafold.sh -d /share/pub/zhaohq/project/pumch/alphafold/version1/source/databases/ -o /share2/pub/yangjy/yangjy/softs/alphafold/result/query -f /share2/pub/yangjy/yangjy/softs/alphafold/example/query.fasta -t 2020-05-14
# or for CPU only run
bash run_alphafold.sh -d /share/pub/zhaohq/project/pumch/alphafold/version1/source/databases/ -o /share2/pub/yangjy/yangjy/softs/alphafold/result/query -f /share2/pub/yangjy/yangjy/softs/alphafold/example/query.fasta -t 2020-05-14 -g False
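A run takes a long time, so it may help to catch simple input mistakes before launching. The sketch below checks two of the required parameters described above: that the FASTA file exists and starts with a header line, and that the date passed to -t has the YYYY-MM-DD shape. The function name preflight is hypothetical, and the check only looks at the format of the date, not whether it is a valid calendar day.

```shell
# preflight FASTA_PATH MAX_TEMPLATE_DATE
# Hypothetical helper: basic sanity checks on the -f and -t arguments.
preflight() {
    local fasta_path="$1"
    local max_template_date="$2"
    # -s: file exists and is not empty
    if [ ! -s "$fasta_path" ]; then
        echo "FASTA file missing or empty: $fasta_path" >&2
        return 1
    fi
    # A FASTA record starts with a '>' header line
    if ! head -n 1 "$fasta_path" | grep -q '^>'; then
        echo "First line of $fasta_path is not a FASTA header" >&2
        return 1
    fi
    # Shape check only: four digits, two digits, two digits
    case "$max_template_date" in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
        *)
            echo "Date must look like YYYY-MM-DD: $max_template_date" >&2
            return 1
            ;;
    esac
    echo "inputs look OK"
}

# Usage (placeholder path):
# preflight /path/to/query.fasta 2020-05-14 && bash run_alphafold.sh ...
```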
That's it for now. I got an error, and it is a GPU problem. I'll keep debugging and continue once it's solved!