GATK supports Linux and macOS; Windows is not supported.
All of the steps below were therefore run on Linux.
Reference: https://gatk.broadinstitute.org/hc/en-us/articles/360036194592-Getting-started-with-GATK4
1. Download and unpack GATK
Download the GATK release package.
Reference: https://software.broadinstitute.org/gatk/
The package is a zip archive, so unpack it with unzip.
unzip gatk-4.2.0.0.zip -d /home/zxx/workplace/
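The later steps refer to /home/zxx/workplace/gatk, while a release archive normally unpacks into a directory named after the zip file (gatk-4.2.0.0 here). One way to bridge the two is a stable symlink. The following is a minimal sketch under that assumption; link_gatk is a hypothetical helper name, not part of GATK.

```shell
# Hypothetical helper (not part of GATK): point a stable "gatk" symlink
# at the versioned directory produced by unzip.
# Assumption: the archive unpacks into a directory named after the zip file.
link_gatk() (
    # The ( ) function body runs in a subshell, so these variables stay local.
    zip_name="$1"                               # e.g. gatk-4.2.0.0.zip
    dest="$2"                                   # directory passed to "unzip -d"
    versioned="$(basename "$zip_name" .zip)"    # e.g. gatk-4.2.0.0

    if [ ! -d "$dest/$versioned" ]; then
        echo "not found: $dest/$versioned" >&2
        exit 1
    fi

    # -s symbolic, -f replace an existing link, -n do not follow an existing link
    ln -sfn "$dest/$versioned" "$dest/gatk"
    echo "$dest/gatk -> $dest/$versioned"
)

# Example use after unpacking:
#   link_gatk gatk-4.2.0.0.zip /home/zxx/workplace
```

With the link in place, a later upgrade only needs the link re-pointed; the PATH entry can stay the same.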
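2. Configure the GATK environment variable
Open your shell startup file in an editor. The file lives in your own home directory, so sudo is not needed:
vi ~/.bashrc
Add the directory that contains the gatk launcher script to PATH. Here the unpacked directory is assumed to be /home/zxx/workplace/gatk; if yours kept the archive's name (for example gatk-4.2.0.0), use that path instead:
export PATH=/home/zxx/workplace/gatk:$PATH
Save the file, then reload it so the new PATH takes effect: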
source ~/.bashrc
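If you would rather not edit the file by hand, the same PATH line can be appended from the command line. The sketch below only adds the line when it is not already present, so running it twice does not create duplicates. add_to_path is a hypothetical helper name, and the paths in the example are the ones used in this walkthrough.

```shell
# Hypothetical helper: append "export PATH=<dir>:$PATH" to an rc file,
# but only if that exact line is not already there.
add_to_path() (
    # The ( ) function body runs in a subshell, so these variables stay local.
    dir="$1"        # directory that contains the gatk launcher
    rc_file="$2"    # e.g. ~/.bashrc
    line="export PATH=$dir:\$PATH"

    # grep flags: -q quiet, -x match the whole line, -F treat pattern as fixed text
    if ! grep -qxF "$line" "$rc_file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$rc_file"
    fi
)

# Example use:
#   add_to_path /home/zxx/workplace/gatk ~/.bashrc
#   source ~/.bashrc
```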
3. Verify the installation
At the command line, run:
gatk
Usage template for all tools (uses --spark-runner LOCAL when used with a Spark tool)
gatk AnyTool toolArgs
Usage template for Spark tools (will NOT work on non-Spark tools)
gatk SparkTool toolArgs [ -- --spark-runner <LOCAL | SPARK | GCS> sparkArgs ]
Getting help
gatk --list Print the list of available tools
gatk Tool --help Print help on a particular tool
Configuration File Specification
--gatk-config-file PATH/TO/GATK/PROPERTIES/FILE
gatk forwards commands to GATK and adds some sugar for submitting spark jobs
--spark-runner controls how spark tools are run
valid targets are:
LOCAL: run using the in-memory spark runner
SPARK: run using spark-submit on an existing cluster
--spark-master must be specified
--spark-submit-command may be specified to control the Spark submit command
arguments to spark-submit may optionally be specified after --
GCS: run using Google cloud dataproc
commands after the -- will be passed to dataproc
--cluster must be specified after the --
spark properties and some common spark-submit parameters will be translated
to dataproc equivalents
--dry-run may be specified to output the generated command line without running it
--java-options 'OPTION1[ OPTION2=Y ... ]' optional - pass the given string of options to the
java JVM at runtime.
Java options MUST be passed inside a single string with space-separated values.
--debug-port sets up a Java VM debug agent to listen to debugger connections on a
particular port number. This in turn will add the necessary java VM arguments
so that you don't need to explicitly indicate these using --java-options.
--debug-suspend sets the Java VM debug agent up so that the run get immediatelly suspended
waiting for a debugger to connect. By default the port number is 5005 but
can be customized using --debug-port
gatk -version
Using GATK jar /home/zxx/workplace/gatk/gatk-package-4.2.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /home/zxx/workplace/gatk/gatk-package-4.2.0.0-local.jar -version
The Genome Analysis Toolkit (GATK) v4.2.0.0
HTSJDK Version: 2.24.0
Picard Version: 2.25.0
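If gatk is not found, or the version command fails, a quick first check is whether the required commands are on PATH at all. The output above shows the launcher invoking java, so both gatk and java need to be resolvable. The following is a minimal sketch; missing_tools is a hypothetical helper name.

```shell
# Hypothetical helper: report which of the given commands are not on PATH.
# Prints "all found" and returns 0, or prints "missing: ..." and returns 1.
missing_tools() {
    _missing=""
    for _tool in "$@"; do
        # "command -v" succeeds only if the shell can resolve the name
        command -v "$_tool" >/dev/null 2>&1 || _missing="$_missing $_tool"
    done
    if [ -n "$_missing" ]; then
        echo "missing:$_missing"
        return 1
    fi
    echo "all found"
}

# Example use:
#   missing_tools gatk java
```

If gatk is reported missing right after editing ~/.bashrc, the usual cause is that the file has not been reloaded in the current shell or that the PATH entry points at the wrong directory.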