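Previously my computer had no graphics card, so to speed things up I set up the Condor job-processing system. Processing 288-direction data would otherwise have taken 7 to 8 days; running in parallel on 4 CPU cores brought that down to 2 days. The Condor setup steps are described separately.
This post covers configuring the GPU environment for FSL. The FSL wiki states that FSL bedpostx supports GPU parallel computing, and I later found that probtrackx can run on the GPU as well.
There are two main steps: bedpostx and probtrackx.
First install CUDA. There are plenty of guides online, so I won't repeat them here. I installed CUDA 8.0; my graphics card is an NVIDIA 1080 Ti.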
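Before touching FSL it may help to confirm that the driver and the CUDA toolkit are visible from the shell. The following is only a minimal sketch and was not part of the original setup: `report_tool` is a helper name made up here, while `nvidia-smi` (shipped with the driver) and `nvcc --version` (shipped with the toolkit) are the standard NVIDIA commands.

```shell
#!/bin/sh
# report_tool (hypothetical helper): print whether a command is on PATH.
report_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
        return 1
    fi
}

# nvidia-smi lists the GPUs the driver can see.
if report_tool nvidia-smi; then
    nvidia-smi || echo "nvidia-smi failed: is the driver loaded?"
fi

# nvcc reports the installed toolkit version (8.0 in this post).
if report_tool nvcc; then
    nvcc --version
fi
```

If `nvcc` is reported missing even though CUDA is installed, the toolkit's bin directory is probably not on `PATH`.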
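I. bedpostx
Key files: bedpostx_gpu bedpostx_postproc_gpu xfibres_gpu
First, following the FSL wiki, install FSL on Ubuntu 16.04. This should give version 5.0.8, which does not include the GPU files listed above and is also missing some shared libraries. The download links for the GPU version can be found in the FSL Archive. Note that I ended up installing CUDA 8.0, so download the bedpostx build that matches 8.0 and save the files into /usr/lib/fsl/5.0/ and /usr/share/fsl/5.0/bin/. The GPU versions of the other files have to be downloaded one by one as well. In addition, FSL has to be upgraded to 5.0.9: download the patch package from the official site and copy everything in its bin directory into /usr/lib/fsl/5.0/ and /usr/share/fsl/5.0/bin/. Some libraries from its lib directory have to be moved to /usr/lib. If anything is wrong at this stage, the log messages will say so and you can fix it as they indicate; in most cases this means creating symlinks.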
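The symlink part of this step might look roughly like the sketch below. It rests on assumptions: `link_into` is a made-up helper, and the source path in the usage comment is a placeholder for wherever the downloaded files were unpacked. Only the target directory /usr/lib comes from the text above.

```shell
#!/bin/sh
# link_into SRC_FILE TARGET_DIR (hypothetical helper):
# symlink SRC_FILE into TARGET_DIR under the same file name.
link_into() {
    # Resolve the source to an absolute path so the link works from anywhere.
    src_dir=$(cd "$(dirname "$1")" && pwd) || return 1
    src="$src_dir/$(basename "$1")"
    if [ ! -e "$src" ]; then
        echo "missing: $src" >&2
        return 1
    fi
    # -f replaces a stale link left over from an earlier attempt.
    ln -sf "$src" "$2/$(basename "$src")"
}

# Usage (placeholder source path; writing to /usr/lib needs root):
# link_into /path/to/unpacked/lib/libbedpostx_cuda.so /usr/lib
```

Linking rather than copying keeps a single copy of each library, so replacing the downloaded file later updates every location at once.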
Some of the GPU scripts also need fixes; the places that need changing are annotated in the scripts that follow.
#!/bin/sh
# Copyright (C) 2004 University of Oxford
#
# SHCOPYRIGHT
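# The next line needs changing, probably because of how this FSL version is packaged:
# the 5.0.8 install done from the command line has no ${FSLDIR}/lib directory.
# After changing it to /bin the script runs; otherwise it fails with
# "xfibres_gpu: symbol lookup error".
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${FSLDIR}/lib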
Usage() {
echo ""
echo "Usage: bedpostx <subject_directory> [options]"
echo ""
echo "expects to find bvals and bvecs in subject directory"
echo "expects to find data and nodif_brain_mask in subject directory"
echo "expects to find grad_dev in subject directory, if -g is set"
echo ""
echo "<options>:"
#echo "-QSYS (Queue System, 0 use fsl_sub: FMRIB, 1 TORQUE (default): WashU)"
echo "-Q (name of the GPU(s) queue, default cuda.q (defined in environment variable: FSLGECUDAQ)"
#echo "-Q (name of the GPU(s) queue, default cuda.q for QSYS=0 and no queue for QSYS=1)"
echo "-NJOBS (number of jobs to queue, the data is divided in NJOBS parts, usefull for a GPU cluster, default 4)"
echo "-n (number of fibres per voxel, default 3)"
echo "-w (ARD weight, more weight means less secondary fibres per voxel, default 1)"
echo "-b (burnin period, default 1000)"
echo "-j (number of jumps, default 1250)"
echo "-s (sample every, default 25)"
echo "-model (Deconvolution model. 1: with sticks, 2: with sticks with a range of diffusivities (default), 3: with zeppelins)"
echo "-g (consider gradient nonlinearities, default off)"
echo ""
echo ""
echo "ALTERNATIVELY: you can pass on xfibres options onto directly bedpostx"
echo " For example: bedpostx <subject directory> --noard --cnonlinear"
echo " Type 'xfibres --help' for a list of available options "
echo " Default options will be bedpostx default (see above), and not xfibres default."
echo ""
echo "Note: Use EITHER old OR new syntax."
exit 1
}
monitor(){
cat <<EOM > ${subjdir}.bedpostX/monitor
#!/bin/sh
nparts=0
if [ $njobs -eq 1 ]; then
#1 part (GPU) and several subparts
#voxels processed in each subpart are 12800 or more if the last one is less than 6400 (1 part less)
nparts=\$(($nvox/12800))
if [ \$nparts%12800 != 0 ];then
nparts=\$((\$nparts + 1))
fi
last_part=\$(($nvox-(((\$nparts-1))*12800)))
if [ \$last_part -lt 6400 ];then
nparts=\$((\$nparts - 1))
fi
else
nparts=$njobs
fi
echo
echo "----- Bedpostx Monitor -----"
finished=0
lastprinted=0
havedad=2
while [ \$finished -eq 0 ] ; do
nfin=0
part=0
errorFiles=\`ls ${subjdir}.bedpostX/logs/*.e* 2> /dev/null \`
for errorFile in \$errorFiles
do
if [ -s \$errorFile ]; then
echo An error ocurred. Please check file \$errorFile
kill -9 $$
exit 1
fi
done
while [ \$part -le \$nparts ];do
if [ -e ${subjdir}.bedpostX/logs/monitor/\$part ]; then
nfin=\$((\$nfin + 1))
fi
part=\$((\$part + 1))
done
newmessages=\$((\$nfin - \$lastprinted))
while [ "\$newmessages" -gt 0 ];do
lastprinted=\$((\$lastprinted + 1))
echo \$lastprinted parts processed out of \$nparts
newmessages=\$((\$newmessages - 1))
done
if [ -f ${subjdir}.bedpostX/xfms/eye.mat ] ; then
finished=1
echo "All parts processed"
exit
fi
if [ ! \$havedad -gt 0 ]; then
exit 0
fi
if [ "x$SGE_ROOT" = "x" ]; then
havedad=\`ps -e -o pid 2>&1| grep "$$\\b" | wc -l\`
fi
sleep 50;
done
EOM
chmod +x ${subjdir}.bedpostX/monitor
}
make_absolute(){
dir=$1;
if [ -d ${dir} ]; then
OLDWD=`pwd`
cd ${dir}
dir_all=`pwd`
cd $OLDWD
else
dir_all=${dir}
fi
echo ${dir_all}
}
[ "$1" = "" ] && Usage
subjdir=`make_absolute $1`
subjdir=`echo $subjdir | sed 's/\/$/$/g'`
echo "---------------------------------------------"
echo "------------ BedpostX GPU Version -----------"
echo "---------------------------------------------"
echo subjectdir is $subjdir
#parse option arguments
qsys=0
njobs=4
nfibres=3
fudge=1
burnin=1000
njumps=1250
sampleevery=25
model=2
gflag=0
other=""
queue=""
if [ $qsys -eq 0 ] && [ "x$SGE_ROOT" != "x" ]; then
queue="-q $FSLGECUDAQ"
fi
shift
while [ ! -z "$1" ]
do
case "$1" in
-QSYS) qsys=$2;shift;;
-Q) queue="-q $2";shift;;
-NJOBS) njobs=$2;shift;;
-n) nfibres=$2;shift;;
-w) fudge=$2;shift;;
-b) burnin=$2;shift;;
-j) njumps=$2;shift;;
-s) sampleevery=$2;shift;;
-model) model=$2;shift;;
-g) gflag=1;;
*) other=$other" "$1;;
esac
shift
done
opts="--nf=$nfibres --fudge=$fudge --bi=$burnin --nj=$njumps --se=$sampleevery --model=$model"
defopts="--cnonlinear"
opts="$opts $defopts $other"
#check that all required files exist
if [ ! -d $subjdir ]; then
echo "subject directory $1 not found"
exit 1
fi
if [ ! -e ${subjdir}/bvecs ]; then
if [ -e ${subjdir}/bvecs.txt ]; then
mv ${subjdir}/bvecs.txt ${subjdir}/bvecs
else
echo "${subjdir}/bvecs not found"
exit 1
fi
fi
if [ ! -e ${subjdir}/bvals ]; then
if [ -e ${subjdir}/bvals.txt ]; then
mv ${subjdir}/bvals.txt ${subjdir}/bvals
else
echo "${subjdir}/bvals not found"
exit 1
fi
fi
if [ `${FSLDIR}/bin/imtest ${subjdir}/data` -eq 0 ]; then
echo "${subjdir}/data not found"
exit 1
fi
if [ ${gflag} -eq 1 ]; then
if [ `${FSLDIR}/bin/imtest ${subjdir}/grad_dev` -eq 0 ]; then
echo "${subjdir}/grad_dev not found"
exit 1
fi
fi
if [ `${FSLDIR}/bin/imtest ${subjdir}/nodif_brain_mask` -eq 0 ]; then
echo "${subjdir}/nodif_brain_mask not found"
exit 1
fi
if [ -e ${subjdir}.bedpostX/xfms/eye.mat ]; then
echo "${subjdir} has already been processed: ${subjdir}.bedpostX."
echo "Delete or rename ${subjdir}.bedpostX before repeating the process."
exit 1
fi
echo Making bedpostx directory structure
mkdir -p ${subjdir}.bedpostX/
mkdir -p ${subjdir}.bedpostX/diff_parts
mkdir -p ${subjdir}.bedpostX/logs
mkdir -p ${subjdir}.bedpostX/logs/logs_gpu
mkdir -p ${subjdir}.bedpostX/logs/monitor
rm -f ${subjdir}.bedpostX/logs/monitor/*
mkdir -p ${subjdir}.bedpostX/xfms
#mailto=`whoami`@fmrib.ox.ac.uk
echo Copying files to bedpost directory
cp ${subjdir}/bvecs ${subjdir}/bvals ${subjdir}.bedpostX
${FSLDIR}/bin/imcp ${subjdir}/nodif_brain_mask ${subjdir}.bedpostX
if [ `${FSLDIR}/bin/imtest ${subjdir}/nodif` = 1 ] ; then
${FSLDIR}/bin/fslmaths ${subjdir}/nodif -mas ${subjdir}/nodif_brain_mask ${subjdir}.bedpostX/nodif_brain
fi
# Split the dataset in parts
echo Pre-processing stage
if [ ${gflag} -eq 1 ]; then
pre_command="$FSLDIR/bin/split_parts_gpu ${subjdir}/data ${subjdir}/nodif_brain_mask ${subjdir}.bedpostX/bvals ${subjdir}.bedpostX/bvecs ${subjdir}/grad_dev 1 $njobs ${subjdir}.bedpostX"
else
pre_command="$FSLDIR/bin/split_parts_gpu ${subjdir}/data ${subjdir}/nodif_brain_mask ${subjdir}.bedpostX/bvals ${subjdir}.bedpostX/bvecs NULL 0 $njobs ${subjdir}.bedpostX"
fi
if [ $qsys -eq 0 ]; then
#SGE
splitID=`${FSLDIR}/bin/fsl_sub -T 40 -l ${subjdir}.bedpostX/logs -N bedpostx_preproc_gpu $pre_command`
else
#TORQUE
echo $pre_command > ${subjdir}.bedpostX/temp
torque_command="qsub -V $queue -l nodes=1:ppn=1:gpus=1,walltime=00:40:00 -N bedpostx_preproc_gpu -o ${subjdir}.bedpostX/logs -e ${subjdir}.bedpostX/logs"
splitID=`exec $torque_command ${subjdir}.bedpostX/temp | awk '{print $1}' | awk -F. '{print $1}'`
rm ${subjdir}.bedpostX/temp
sleep 10
fi
nvox=`${FSLDIR}/bin/fslstats $subjdir.bedpostX/nodif_brain_mask -V | cut -d ' ' -f1 `
echo Queuing parallel processing stage
[ -f ${subjdir}.bedpostX/commands.txt ] && rm ${subjdir}.bedpostX/commands.txt
monitor
if [ "x$SGE_ROOT" = "x" ]; then
${subjdir}.bedpostX/monitor&
fi
part=0
while [ $part -lt $njobs ]
do
partzp=`$FSLDIR/bin/zeropad $part 4`
if [ ${gflag} -eq 1 ]; then
gopts="$opts --gradnonlin=${subjdir}.bedpostX/grad_dev_$part"
else
gopts=$opts
fi
echo "${FSLDIR}/bin/xfibres_gpu --data=${subjdir}.bedpostX/data_$part --mask=$subjdir.bedpostX/nodif_brain_mask -b ${subjdir}.bedpostX/bvals -r ${subjdir}.bedpostX/bvecs --forcedir --logdir=$subjdir.bedpostX/diff_parts/data_part_$partzp $gopts ${subjdir} $part $njobs $nvox" >> ${subjdir}.bedpostX/commands.txt
part=$(($part + 1))
done
if [ $qsys -eq 0 ]; then
#SGE
bedpostid=`${FSLDIR}/bin/fsl_sub $queue -l ${subjdir}.bedpostX/logs -N bedpostx_gpu -j $splitID -t ${subjdir}.bedpostX/commands.txt`
else
#TORQUE
taskfile=${subjdir}.bedpostX/commands.txt
echo "command=\`cat "$taskfile" | head -\$PBS_ARRAYID | tail -1\` ; exec \$command" > ${subjdir}.bedpostX/temp
tasks=`wc -l $taskfile | awk '{print $1}'`
sge_tasks="-t 1-$tasks"
#PBS -t x-y: x and y are the array bounds
torque_command="qsub -V $queue -l nodes=1:ppn=1:gpus=1,walltime=3:00:00,pmem=16gb -N bedpostx_gpu -o ${subjdir}.bedpostX/logs -e ${subjdir}.bedpostX/logs -W depend=afterok:$splitID $sge_tasks"
bedpostid=`exec $torque_command ${subjdir}.bedpostX/temp | awk '{print $1}' | awk -F. '{print $1}'`
rm ${subjdir}.bedpostX/temp
sleep 10
fi
# bash and dash differ here (details omitted): the .sh script below has to be run with bash in front of it.
echo Queuing post processing stage
post_command="${FSLDIR}/bin/bedpostx_postproc_gpu.sh --data=${subjdir}/data --mask=$subjdir.bedpostX/nodif_brain_mask -b ${subjdir}.bedpostX/bvals -r ${subjdir}.bedpostX/bvecs --forcedir --logdir=$subjdir.bedpostX/diff_parts $gopts $nvox $njobs ${subjdir} ${FSLDIR}"
if [ $qsys -eq 0 ]; then
#SGE
mergeid=`${FSLDIR}/bin/fsl_sub -T 120 -j $bedpostid -N bedpostx_postproc_gpu -l ${subjdir}.bedpostX/logs $post_command`
else
#TORQUE
echo $post_command > ${subjdir}.bedpostX/temp
torque_command="qsub -V $queue -l nodes=1:ppn=1:gpus=1,walltime=00:40:00 -N bedpostx_postproc_gpu -o ${subjdir}.bedpostX/logs -e ${subjdir}.bedpostX/logs -W depend=afterokarray:$bedpostid"
mergeid=`exec $torque_command ${subjdir}.bedpostX/temp | awk '{print $1}' | awk -F. '{print $1}'`
rm ${subjdir}.bedpostX/temp
sleep 10
fi
echo $mergeid > ${subjdir}.bedpostX/logs/postproc_ID
if [ "x$SGE_ROOT" != "x" ]; then
echo
echo Type ${subjdir}.bedpostX/monitor to show progress.
echo Type ${subjdir}.bedpostX/cancel to terminate all the queued tasks.
cat <<EOC > ${subjdir}.bedpostX/cancel
#!/bin/sh
qdel $mergeid $bedpostid
EOC
chmod +x ${subjdir}.bedpostX/cancel
echo
echo You will get an email at the end of the post-processing stage.
echo
else
sleep 60
fi
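The post-processing script (bedpostx_postproc_gpu.sh, called by the script above) also needs attention:
#!/bin/sh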
# Copyright (C) 2012 University of Oxford
#
# SHCOPYRIGHT
# last 2 parameters are subjdir and bindir
parameters=""
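# A "[[" error may be reported in this loop.
while [ ! -z "$2" ]
do
# [[ ... =~ ... ]] is a regular-expression match: if the argument contains --nf=, numfib is set from it.
if [[ $1 =~ "--nf=" ]]; then
# cut -d '=' splits '--nf=3' into '--nf' and '3'; -f 2 selects the second field, so numfib becomes 3.
# Note the space added between -f and 2. If the bug keeps coming back, echo numfib to inspect its value.
numfib=`echo $1 | cut -d '=' -f 2`
fi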
all=$all" "$1
subjdir=$1
shift
done
bindir=$1
$bindir/bin/merge_parts_gpu $all
fib=1
while [ $fib -le $numfib ]
do
${FSLDIR}/bin/fslmaths ${subjdir}.bedpostX/merged_th${fib}samples -Tmean ${subjdir}.bedpostX/mean_th${fib}samples
${FSLDIR}/bin/fslmaths ${subjdir}.bedpostX/merged_ph${fib}samples -Tmean ${subjdir}.bedpostX/mean_ph${fib}samples
${FSLDIR}/bin/fslmaths ${subjdir}.bedpostX/merged_f${fib}samples -Tmean ${subjdir}.bedpostX/mean_f${fib}samples
${FSLDIR}/bin/make_dyadic_vectors ${subjdir}.bedpostX/merged_th${fib}samples ${subjdir}.bedpostX/merged_ph${fib}samples ${subjdir}/nodif_brain_mask ${subjdir}.bedpostX/dyads${fib}
if [ $fib -ge 2 ];then
${FSLDIR}/bin/maskdyads ${subjdir}.bedpostX/dyads${fib} ${subjdir}.bedpostX/mean_f${fib}samples
${FSLDIR}/bin/fslmaths ${subjdir}.bedpostX/mean_f${fib}samples -div ${subjdir}.bedpostX/mean_f1samples ${subjdir}.bedpostX/mean_f${fib}_f1samples
${FSLDIR}/bin/fslmaths ${subjdir}.bedpostX/dyads${fib}_thr0.05 -mul ${subjdir}.bedpostX/mean_f${fib}_f1samples ${subjdir}.bedpostX/dyads${fib}_thr0.05_modf${fib}
${FSLDIR}/bin/imrm ${subjdir}.bedpostX/mean_f${fib}_f1samples
fi
fib=$(($fib + 1))
done
if [ `${FSLDIR}/bin/imtest ${subjdir}.bedpostX/mean_f1samples` -eq 1 ];then
${FSLDIR}/bin/fslmaths ${subjdir}.bedpostX/mean_f1samples -mul 0 ${subjdir}.bedpostX/mean_fsumsamples
fib=1
while [ $fib -le $numfib ]
do
${FSLDIR}/bin/fslmaths ${subjdir}.bedpostX/mean_fsumsamples -add ${subjdir}.bedpostX/mean_f${fib}samples ${subjdir}.bedpostX/mean_fsumsamples
fib=$(($fib + 1))
done
fi
echo Removing intermediate files
if [ `${FSLDIR}/bin/imtest ${subjdir}.bedpostX/merged_th1samples` -eq 1 ];then
if [ `${FSLDIR}/bin/imtest ${subjdir}.bedpostX/merged_ph1samples` -eq 1 ];then
if [ `${FSLDIR}/bin/imtest ${subjdir}.bedpostX/merged_f1samples` -eq 1 ];then
rm -rf ${subjdir}.bedpostX/diff_parts
rm -rf ${subjdir}.bedpostX/data*
rm -rf ${subjdir}.bedpostX/grad_dev*
fi
fi
fi
echo Creating identity xfm
xfmdir=${subjdir}.bedpostX/xfms
echo 1 0 0 0 > ${xfmdir}/eye.mat
echo 0 1 0 0 >> ${xfmdir}/eye.mat
echo 0 0 1 0 >> ${xfmdir}/eye.mat
echo 0 0 0 1 >> ${xfmdir}/eye.mat
echo Done
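The `[[ $1 =~ "--nf=" ]]` test in the post-processing script above is a bash feature, which is why it can fail when `/bin/sh` is dash, the default on Ubuntu. Besides running the script with bash, one option is to extract the value with POSIX constructs only. The sketch below is not the upstream FSL code; `get_numfib` is a name invented for illustration.

```shell
#!/bin/sh
# get_numfib (hypothetical helper): print the value of the first --nf=N
# argument, using only POSIX sh so it behaves the same under dash and bash.
get_numfib() {
    for arg in "$@"; do
        case "$arg" in
            --nf=*)
                # Strip the "--nf=" prefix; equivalent to
                # echo "$arg" | cut -d '=' -f 2
                echo "${arg#--nf=}"
                return 0
                ;;
        esac
    done
    # No --nf= argument was given.
    return 1
}

get_numfib --data=subj/data --nf=3 --fudge=1   # prints 3
echo "--nf=3" | cut -d '=' -f 2                # prints 3
```

The `case` pattern matches only arguments that start with `--nf=`, which is slightly stricter than the regex test in the script, where the string may appear anywhere in the argument.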
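The shared-library problems mainly involve libcudart, libcuda, libcurand and libbedpostx_cuda. If something goes wrong, go into the directory in question and run ldd -r * to inspect the shared-library dependencies, then fix whatever is missing with symlinks.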
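To narrow the `ldd` output down to what is actually unresolved, a small filter can help. This is a sketch under assumptions: `missing_libs` is a hypothetical helper, and the binary path in the usage comments assumes the install location mentioned earlier in this post.

```shell
#!/bin/sh
# missing_libs (hypothetical helper): read ldd output on stdin and print
# the names of the libraries that ldd marks as "not found".
missing_libs() {
    awk '/not found/ { print $1 }'
}

# Usage (path is an assumption based on the directories used in this post):
# ldd /usr/lib/fsl/5.0/xfibres_gpu | missing_libs
#
# ldd -r also reports unresolved symbols, which is the kind of problem
# behind a "symbol lookup error":
# ldd -r /usr/lib/fsl/5.0/xfibres_gpu
```

Each name printed is a library that still needs to be installed or symlinked into a directory the dynamic linker searches.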
Once everything is in place, bedpostx_gpu takes only 30 minutes for 98-direction data and a little over an hour for 288-direction data.
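A run could then be started along the following lines. This is a sketch, not the exact command used for the timings above: `have_inputs` is a hypothetical helper that checks for the input files named in the script's usage text, the `.nii.gz` extension is an assumption (FSL's own imtest accepts other image formats too), and `-NJOBS 1` is an assumed choice for a single-GPU machine, whereas the script defaults to 4.

```shell
#!/bin/sh
# have_inputs DIR (hypothetical helper): succeed only if DIR contains the
# inputs bedpostx expects. Assumes images are stored as .nii.gz.
have_inputs() {
    dir=$1
    for f in bvals bvecs data.nii.gz nodif_brain_mask.nii.gz; do
        if [ ! -e "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            return 1
        fi
    done
}

# Usage (subject path is a placeholder):
# have_inputs /path/to/subject && bedpostx_gpu /path/to/subject -NJOBS 1
```

Checking the inputs first avoids queuing a job that stops immediately on a missing bvals or mask file.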
II. probtrackx
Download link: this download is required, because even FSL 5.0.9 does not ship a GPU version of probtrackx.
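When running it, watch out for permission problems and add permissions with chmod +777 /path/to/probtrackx2_gpu (execute permission alone, chmod +x, should be enough).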
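The permission fix can be checked right away instead of waiting for the next failed run. A minimal sketch, with `make_executable` as a made-up helper name and a placeholder path in the usage comment:

```shell
#!/bin/sh
# make_executable FILE (hypothetical helper): add execute permission
# and confirm that the file is now executable for the current user.
make_executable() {
    chmod +x "$1" && [ -x "$1" ]
}

# Usage (placeholder path):
# make_executable /path/to/probtrackx2_gpu && echo "ready to run"
```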
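Afterwards, remember to give find_the_biggest an output name; the output is written to the home directory.
probtrackx runtime: a little over 2 hours on the CPU, about 7 minutes on the GPU.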
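Overall configuration:
1. Ubuntu 16.04
2. CUDA 8.0
3. FSL 5.0.8 plus the GPU files it lacks
4. Graphics card: GTX 1080 Ti
5. RAM: 32 GB
With this setup, the FDT part of FSL runs on the GPU without problems.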