Configuring a GPU environment for FSL to run bedpostx and probtrackx

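  Previously, my machine had no graphics card, so to speed things up I set up the Condor job scheduling system. Processing 288-direction data would otherwise have taken 7 to 8 days; running in parallel on a 4-core CPU cut that to 2 days. The Condor setup steps are described separately.

  This post covers setting up the GPU environment for FSL. The official FSL wiki states that bedpostx supports GPU parallel computation, and I later found that probtrackx can also run on the GPU.

  There are two main steps here: bedpostx and probtrackx.

  First install the CUDA environment. There are plenty of guides for this online, so I won't repeat the steps here. I installed CUDA 8.0, and my graphics card is an NVIDIA GTX 1080 Ti.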
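
  Before going further, it can help to confirm which CUDA toolkit release is actually on the PATH, because the GPU binaries downloaded later have to match it. A minimal sketch, assuming nvcc was installed with the toolkit; the helper name cuda_release is my own, and the parsing relies on the "release X.Y" text that nvcc --version prints:

```shell
#!/bin/sh
# Hypothetical helper: read `nvcc --version` output on stdin
# and print the release number (for example 8.0).
cuda_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Only query the toolkit when nvcc is available on this machine.
if command -v nvcc >/dev/null 2>&1; then
    echo "CUDA release: $(nvcc --version | cuda_release)"
fi
```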

  I. bedpostx

  Key files: bedpostx_gpu, bedpostx_postproc_gpu.sh, xfibres_gpu

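  First, following the FSL wiki, install FSL on Ubuntu 16.04; this should give version 5.0.8. That version does not include the GPU files listed above and is also missing some shared libraries. The download location for the GPU version can be found by searching the FSL Archive. Note that I ended up installing CUDA 8.0, so I downloaded the bedpostx build that matches CUDA 8.0 and saved the files to /usr/lib/fsl/5.0/ and /usr/share/fsl/5.0/bin/. The GPU versions of the other files also have to be downloaded one by one. In addition, FSL needs to be upgraded to 5.0.9: download the patch package from the official site and copy everything in its bin directory to /usr/lib/fsl/5.0/ and /usr/share/fsl/5.0/bin/. Some of the libraries in its lib directory have to be moved to /usr/lib. If anything is wrong here, the log messages will say so and you can fix it as instructed; in most cases the fix is to create a symbolic link.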
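
  The symbolic-link step can be sketched as follows. This is only an illustration under assumptions: the helper name link_lib is hypothetical, and the libraries to link are whichever ones the log messages report as missing.

```shell
#!/bin/sh
# Hypothetical helper: create (or refresh) a symlink to a shared library
# inside a target directory.
# Usage: link_lib /path/to/libfoo.so /usr/lib
link_lib() {
    src=$1
    dest_dir=$2
    if [ ! -e "$src" ]; then
        echo "link_lib: $src does not exist" >&2
        return 1
    fi
    # -s makes a symbolic link, -f replaces a stale link of the same name.
    ln -sf "$src" "$dest_dir/$(basename "$src")"
}
```

  Linking into /usr/lib normally requires root privileges.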

  Some of the GPU script files also need corrections.

            

bedpostx_gpu:

#!/bin/sh

#   Copyright (C) 2004 University of Oxford
#
#   SHCOPYRIGHT

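# NOTE: the line below needs to be changed. This is probably an FSL version issue:
# FSL 5.0.8 installed from the command line has no ${FSLDIR}/lib directory.
# After changing /lib to /bin it runs; otherwise it fails with
# "xfibres_gpu: symbol lookup error".
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${FSLDIR}/lib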

Usage() {
    echo ""
    echo "Usage: bedpostx <subject_directory> [options]"
    echo ""
    echo "expects to find bvals and bvecs in subject directory"
    echo "expects to find data and nodif_brain_mask in subject directory"
    echo "expects to find grad_dev in subject directory, if -g is set"
    echo ""
    echo "<options>:"
    #echo "-QSYS (Queue System, 0 use fsl_sub: FMRIB, 1 TORQUE (default): WashU)"
    echo "-Q (name of the GPU(s) queue, default cuda.q (defined in environment variable: FSLGECUDAQ)"
    #echo "-Q (name of the GPU(s) queue, default cuda.q for QSYS=0 and no queue for QSYS=1)"	
    echo "-NJOBS (number of jobs to queue, the data is divided in NJOBS parts, useful for a GPU cluster, default 4)"
    echo "-n (number of fibres per voxel, default 3)"
    echo "-w (ARD weight, more weight means less secondary fibres per voxel, default 1)"
    echo "-b (burnin period, default 1000)"
    echo "-j (number of jumps, default 1250)"
    echo "-s (sample every, default 25)"
    echo "-model (Deconvolution model. 1: with sticks, 2: with sticks with a range of diffusivities (default), 3: with zeppelins)"
    echo "-g (consider gradient nonlinearities, default off)"
    echo ""
    echo ""
    echo "ALTERNATIVELY: you can pass on xfibres options onto directly bedpostx"
    echo " For example:  bedpostx <subject directory> --noard --cnonlinear"
    echo " Type 'xfibres --help' for a list of available options "
    echo " Default options will be bedpostx default (see above), and not xfibres default."
    echo ""
    echo "Note: Use EITHER old OR new syntax."
    exit 1
}

monitor(){
    cat <<EOM > ${subjdir}.bedpostX/monitor
#!/bin/sh
nparts=0
if [ $njobs -eq 1 ]; then
#1 part (GPU) and several subparts
#voxels processed in each subpart are 12800 or more if the last one is less than 6400 (1 part less)
	nparts=\$(($nvox/12800))
	if [ \$nparts%12800 != 0 ];then 
		nparts=\$((\$nparts + 1)) 
	fi
	last_part=\$(($nvox-(((\$nparts-1))*12800)))
	if [ \$last_part -lt 6400 ];then 
		nparts=\$((\$nparts - 1)) 
	fi
else
	nparts=$njobs
fi

echo
echo "----- Bedpostx Monitor -----"
finished=0
lastprinted=0
havedad=2
while [ \$finished -eq 0 ] ; do
    nfin=0
    part=0
    errorFiles=\`ls ${subjdir}.bedpostX/logs/*.e* 2> /dev/null \`
    for errorFile in \$errorFiles
    do
        if [ -s \$errorFile ]; then
            echo An error occurred. Please check file \$errorFile
            kill -9 $$
            exit 1
        fi
    done
    while [ \$part -le \$nparts ];do
        if [ -e ${subjdir}.bedpostX/logs/monitor/\$part ]; then
            nfin=\$((\$nfin + 1))
        fi
        part=\$((\$part + 1))
    done
    newmessages=\$((\$nfin - \$lastprinted))
    while [ "\$newmessages" -gt 0 ];do
        lastprinted=\$((\$lastprinted + 1))
        echo \$lastprinted parts processed out of \$nparts
        newmessages=\$((\$newmessages - 1))
    done
    if [ -f ${subjdir}.bedpostX/xfms/eye.mat ] ; then
        finished=1
        echo "All parts processed"
	exit 
    fi
    if [ ! \$havedad -gt 0 ]; then
       exit 0
    fi
    if [ "x$SGE_ROOT" = "x" ]; then
        havedad=\`ps -e -o pid 2>&1| grep "$$\\b" | wc -l\`
    fi
    sleep 50;
done
EOM
    chmod +x ${subjdir}.bedpostX/monitor
}

make_absolute(){
    dir=$1;
    if [ -d ${dir} ]; then
	OLDWD=`pwd`
	cd ${dir}
	dir_all=`pwd`
	cd $OLDWD
    else
	dir_all=${dir}
    fi
    echo ${dir_all}
}

[ "$1" = "" ] && Usage

subjdir=`make_absolute $1`
subjdir=`echo $subjdir | sed 's/\/$/$/g'`

echo "---------------------------------------------"
echo "------------ BedpostX GPU Version -----------"
echo "---------------------------------------------"
echo subjectdir is $subjdir

#parse option arguments
qsys=0
njobs=4
nfibres=3
fudge=1
burnin=1000
njumps=1250
sampleevery=25
model=2
gflag=0
other=""
queue=""

if [ $qsys -eq 0 ] && [ "x$SGE_ROOT" != "x" ]; then
	queue="-q $FSLGECUDAQ"
fi

shift
while [ ! -z "$1" ]
do
  case "$1" in
      -QSYS) qsys=$2;shift;;
      -Q) queue="-q $2";shift;;
      -NJOBS) njobs=$2;shift;;
      -n) nfibres=$2;shift;;
      -w) fudge=$2;shift;;
      -b) burnin=$2;shift;;
      -j) njumps=$2;shift;;
      -s) sampleevery=$2;shift;;
      -model) model=$2;shift;;
      -g) gflag=1;; 
      *) other=$other" "$1;;
  esac
  shift
done
opts="--nf=$nfibres --fudge=$fudge --bi=$burnin --nj=$njumps --se=$sampleevery --model=$model"
defopts="--cnonlinear"
opts="$opts $defopts $other"

#check that all required files exist

if [ ! -d $subjdir ]; then
	echo "subject directory $1 not found"
	exit 1
fi

if [ ! -e ${subjdir}/bvecs ]; then
    if [ -e ${subjdir}/bvecs.txt ]; then
	mv ${subjdir}/bvecs.txt ${subjdir}/bvecs
    else
	echo "${subjdir}/bvecs not found"
	exit 1
    fi
fi

if [ ! -e ${subjdir}/bvals ]; then
    if [ -e ${subjdir}/bvals.txt ]; then
	mv ${subjdir}/bvals.txt ${subjdir}/bvals
    else
	echo "${subjdir}/bvals not found"
	exit 1
    fi
fi

if [ `${FSLDIR}/bin/imtest ${subjdir}/data` -eq 0 ]; then
	echo "${subjdir}/data not found"
	exit 1
fi

if [ ${gflag} -eq 1 ]; then
    if [ `${FSLDIR}/bin/imtest ${subjdir}/grad_dev` -eq 0 ]; then
	echo "${subjdir}/grad_dev not found"
	exit 1
    fi
fi

if [ `${FSLDIR}/bin/imtest ${subjdir}/nodif_brain_mask` -eq 0 ]; then
	echo "${subjdir}/nodif_brain_mask not found"
	exit 1
fi

if [ -e ${subjdir}.bedpostX/xfms/eye.mat ]; then
	echo "${subjdir} has already been processed: ${subjdir}.bedpostX." 
	echo "Delete or rename ${subjdir}.bedpostX before repeating the process."
	exit 1
fi

echo Making bedpostx directory structure

mkdir -p ${subjdir}.bedpostX/
mkdir -p ${subjdir}.bedpostX/diff_parts
mkdir -p ${subjdir}.bedpostX/logs
mkdir -p ${subjdir}.bedpostX/logs/logs_gpu
mkdir -p ${subjdir}.bedpostX/logs/monitor
rm -f ${subjdir}.bedpostX/logs/monitor/*
mkdir -p ${subjdir}.bedpostX/xfms

#mailto=`whoami`@fmrib.ox.ac.uk

echo Copying files to bedpost directory

cp ${subjdir}/bvecs ${subjdir}/bvals ${subjdir}.bedpostX
${FSLDIR}/bin/imcp ${subjdir}/nodif_brain_mask ${subjdir}.bedpostX
if [ `${FSLDIR}/bin/imtest ${subjdir}/nodif` = 1 ] ; then
    ${FSLDIR}/bin/fslmaths ${subjdir}/nodif -mas ${subjdir}/nodif_brain_mask ${subjdir}.bedpostX/nodif_brain
fi


# Split the dataset in parts 
echo Pre-processing stage

if [ ${gflag} -eq 1 ]; then
	pre_command="$FSLDIR/bin/split_parts_gpu ${subjdir}/data ${subjdir}/nodif_brain_mask ${subjdir}.bedpostX/bvals ${subjdir}.bedpostX/bvecs ${subjdir}/grad_dev 1 $njobs ${subjdir}.bedpostX"
else
	pre_command="$FSLDIR/bin/split_parts_gpu ${subjdir}/data ${subjdir}/nodif_brain_mask ${subjdir}.bedpostX/bvals ${subjdir}.bedpostX/bvecs NULL 0 $njobs ${subjdir}.bedpostX"
fi
if [ $qsys -eq 0 ]; then
	#SGE
	splitID=`${FSLDIR}/bin/fsl_sub -T 40 -l ${subjdir}.bedpostX/logs -N bedpostx_preproc_gpu $pre_command`
else
	#TORQUE
	echo $pre_command > ${subjdir}.bedpostX/temp
	torque_command="qsub -V $queue -l nodes=1:ppn=1:gpus=1,walltime=00:40:00 -N bedpostx_preproc_gpu -o ${subjdir}.bedpostX/logs -e ${subjdir}.bedpostX/logs"
	splitID=`exec $torque_command ${subjdir}.bedpostX/temp | awk '{print $1}' | awk -F. '{print $1}'`
        rm ${subjdir}.bedpostX/temp
	sleep 10
fi


nvox=`${FSLDIR}/bin/fslstats $subjdir.bedpostX/nodif_brain_mask -V  | cut -d ' ' -f1 `

echo Queuing parallel processing stage

[ -f ${subjdir}.bedpostX/commands.txt ] && rm ${subjdir}.bedpostX/commands.txt

monitor
if [ "x$SGE_ROOT" = "x" ]; then
    ${subjdir}.bedpostX/monitor&
fi

part=0
while [ $part -lt $njobs ]
do
    	partzp=`$FSLDIR/bin/zeropad $part 4`
    
	if [ ${gflag} -eq 1 ]; then
	    gopts="$opts --gradnonlin=${subjdir}.bedpostX/grad_dev_$part"
	else
	    gopts=$opts
	fi    

	echo "${FSLDIR}/bin/xfibres_gpu --data=${subjdir}.bedpostX/data_$part --mask=$subjdir.bedpostX/nodif_brain_mask -b ${subjdir}.bedpostX/bvals -r ${subjdir}.bedpostX/bvecs --forcedir --logdir=$subjdir.bedpostX/diff_parts/data_part_$partzp $gopts ${subjdir} $part $njobs $nvox" >> ${subjdir}.bedpostX/commands.txt
    
    	part=$(($part + 1))
done

if [ $qsys -eq 0 ]; then
	#SGE
	bedpostid=`${FSLDIR}/bin/fsl_sub $queue -l ${subjdir}.bedpostX/logs -N bedpostx_gpu -j $splitID -t ${subjdir}.bedpostX/commands.txt`
else
	#TORQUE
	taskfile=${subjdir}.bedpostX/commands.txt
	echo "command=\`cat "$taskfile" | head -\$PBS_ARRAYID | tail -1\` ; exec \$command" > ${subjdir}.bedpostX/temp
	tasks=`wc -l $taskfile | awk '{print $1}'`
	sge_tasks="-t 1-$tasks"
	#PBS -t x-y: x and y are the array bounds
	torque_command="qsub -V $queue -l nodes=1:ppn=1:gpus=1,walltime=3:00:00,pmem=16gb -N bedpostx_gpu -o ${subjdir}.bedpostX/logs -e ${subjdir}.bedpostX/logs -W depend=afterok:$splitID $sge_tasks"
	bedpostid=`exec $torque_command ${subjdir}.bedpostX/temp | awk '{print $1}' | awk -F. '{print $1}'`
        rm ${subjdir}.bedpostX/temp
	sleep 10
fi


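# NOTE: differences between shells such as bash and dash matter here (not covered in detail);
# the .sh script below has to be run with bash placed in front of it.
echo Queuing post processing stage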
post_command="${FSLDIR}/bin/bedpostx_postproc_gpu.sh --data=${subjdir}/data --mask=$subjdir.bedpostX/nodif_brain_mask -b ${subjdir}.bedpostX/bvals -r ${subjdir}.bedpostX/bvecs  --forcedir --logdir=$subjdir.bedpostX/diff_parts $gopts $nvox $njobs ${subjdir} ${FSLDIR}"
if [ $qsys -eq 0 ]; then
	#SGE
	mergeid=`${FSLDIR}/bin/fsl_sub -T 120 -j $bedpostid -N bedpostx_postproc_gpu -l ${subjdir}.bedpostX/logs $post_command`
else
	#TORQUE
	echo $post_command > ${subjdir}.bedpostX/temp
	torque_command="qsub -V $queue -l nodes=1:ppn=1:gpus=1,walltime=00:40:00 -N bedpostx_postproc_gpu -o ${subjdir}.bedpostX/logs -e ${subjdir}.bedpostX/logs -W depend=afterokarray:$bedpostid"
	mergeid=`exec $torque_command ${subjdir}.bedpostX/temp | awk '{print $1}' | awk -F. '{print $1}'`
        rm ${subjdir}.bedpostX/temp
	sleep 10
fi

echo $mergeid > ${subjdir}.bedpostX/logs/postproc_ID

if [ "x$SGE_ROOT" != "x" ]; then
    echo
    echo Type ${subjdir}.bedpostX/monitor to show progress.
    echo Type ${subjdir}.bedpostX/cancel to terminate all the queued tasks.
    cat <<EOC > ${subjdir}.bedpostX/cancel
#!/bin/sh
qdel $mergeid $bedpostid
EOC
    chmod +x ${subjdir}.bedpostX/cancel

    echo
    echo You will get an email at the end of the post-processing stage.
    echo
else
    sleep 60
fi

bedpostx_postproc_gpu.sh:

#!/bin/sh

#   Copyright (C) 2012 University of Oxford
#
#   SHCOPYRIGHT

# last 2 parameters are subjdir and bindir
parameters=""
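# NOTE: this loop may fail with a "[[" error; see the bash/dash remark in bedpostx_gpu.
while [ ! -z "$2" ]
do
	# [[ ... =~ ... ]] is a regular-expression match: when the argument contains --nf=,
	# numfib is set to its value. cut -d '=' splits '--nf=3' into '--nf' and '3',
	# and -f followed by 2 selects the second field, so 3 is assigned to numfib.
	# Add a space between -f and 2. If this keeps failing, echo the value of numfib to inspect it.
	if [[ $1 =~ "--nf=" ]]; then
    		numfib=`echo $1 | cut -d '=' -f2`
	fi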
 	all=$all" "$1
	subjdir=$1
	shift
done
bindir=$1

$bindir/bin/merge_parts_gpu $all

fib=1
while [ $fib -le $numfib ]
do
    ${FSLDIR}/bin/fslmaths ${subjdir}.bedpostX/merged_th${fib}samples -Tmean ${subjdir}.bedpostX/mean_th${fib}samples
    ${FSLDIR}/bin/fslmaths ${subjdir}.bedpostX/merged_ph${fib}samples -Tmean ${subjdir}.bedpostX/mean_ph${fib}samples
    ${FSLDIR}/bin/fslmaths ${subjdir}.bedpostX/merged_f${fib}samples -Tmean ${subjdir}.bedpostX/mean_f${fib}samples

    ${FSLDIR}/bin/make_dyadic_vectors ${subjdir}.bedpostX/merged_th${fib}samples ${subjdir}.bedpostX/merged_ph${fib}samples ${subjdir}/nodif_brain_mask ${subjdir}.bedpostX/dyads${fib}
    if [ $fib -ge 2 ];then
	${FSLDIR}/bin/maskdyads ${subjdir}.bedpostX/dyads${fib} ${subjdir}.bedpostX/mean_f${fib}samples
	${FSLDIR}/bin/fslmaths ${subjdir}.bedpostX/mean_f${fib}samples -div ${subjdir}.bedpostX/mean_f1samples ${subjdir}.bedpostX/mean_f${fib}_f1samples
	${FSLDIR}/bin/fslmaths ${subjdir}.bedpostX/dyads${fib}_thr0.05 -mul ${subjdir}.bedpostX/mean_f${fib}_f1samples ${subjdir}.bedpostX/dyads${fib}_thr0.05_modf${fib}
	${FSLDIR}/bin/imrm ${subjdir}.bedpostX/mean_f${fib}_f1samples
    fi

    fib=$(($fib + 1))

done

if [ `${FSLDIR}/bin/imtest ${subjdir}.bedpostX/mean_f1samples` -eq 1 ];then
    ${FSLDIR}/bin/fslmaths ${subjdir}.bedpostX/mean_f1samples -mul 0 ${subjdir}.bedpostX/mean_fsumsamples
    fib=1
    while [ $fib -le $numfib ]
    do
	${FSLDIR}/bin/fslmaths ${subjdir}.bedpostX/mean_fsumsamples -add ${subjdir}.bedpostX/mean_f${fib}samples ${subjdir}.bedpostX/mean_fsumsamples
	fib=$(($fib + 1))
    done	
fi



echo Removing intermediate files

if [ `${FSLDIR}/bin/imtest ${subjdir}.bedpostX/merged_th1samples` -eq 1 ];then
  if [ `${FSLDIR}/bin/imtest ${subjdir}.bedpostX/merged_ph1samples` -eq 1 ];then
    if [ `${FSLDIR}/bin/imtest ${subjdir}.bedpostX/merged_f1samples` -eq 1 ];then
      rm -rf ${subjdir}.bedpostX/diff_parts
      rm -rf ${subjdir}.bedpostX/data*
      rm -rf ${subjdir}.bedpostX/grad_dev*	  
    fi
  fi
fi

echo Creating identity xfm

xfmdir=${subjdir}.bedpostX/xfms
echo 1 0 0 0 > ${xfmdir}/eye.mat
echo 0 1 0 0 >> ${xfmdir}/eye.mat
echo 0 0 1 0 >> ${xfmdir}/eye.mat
echo 0 0 0 1 >> ${xfmdir}/eye.mat

echo Done
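
The "[[" error noted in the post-processing script appears when /bin/sh is dash (the default on Ubuntu), since [[ ... =~ ... ]] is a bash extension. Running the script with bash is the fix used here. As an alternative, the same --nf= extraction could be written with POSIX constructs only. A minimal sketch, where the function name parse_nf is hypothetical and not part of FSL:

```shell
#!/bin/sh
# Hypothetical helper: print N when the argument has the form --nf=N; fail otherwise.
parse_nf() {
    case "$1" in
        --nf=*) printf '%s\n' "${1#--nf=}" ;;
        *) return 1 ;;
    esac
}

# Scan an example argument list, keeping the last --nf= value seen.
numfib=""
for arg in --data=subj/data --nf=3 --fudge=1; do
    if value=$(parse_nf "$arg"); then
        numfib=$value
    fi
done
echo "numfib=$numfib"
```

Unlike the regular-expression test, this only matches arguments that start with --nf=, which is the form bedpostx_gpu passes.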

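The shared-library problems mainly involve libcudart, libcuda, libcurand, and libbedpostx_cuda. If something goes wrong, go into the relevant directory and run ldd -r * to inspect the shared-library dependencies, then fix the missing ones with symbolic links.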
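
The ldd check can be narrowed to just the unresolved libraries. A sketch, assuming GNU ldd, which prints "=> not found" for a library it cannot locate; the function name missing_libs is my own, and the binary path simply follows the install location used in this post:

```shell
#!/bin/sh
# Hypothetical filter: read ldd output on stdin and print the names
# of the libraries marked "not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Check one binary, but only if it exists on this machine.
bin=/usr/lib/fsl/5.0/xfibres_gpu
if [ -x "$bin" ]; then
    ldd -r "$bin" 2>&1 | missing_libs
fi
```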

Once everything is set up, bedpostx_gpu takes only 30 minutes on 98-direction data and a little over an hour on 288-direction data.
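
Since a run still takes half an hour or more, it may be worth checking the inputs before launching. The usage text in bedpostx_gpu lists what it expects in the subject directory (bvals, bvecs, data, nodif_brain_mask). A rough sketch; check_bedpostx_inputs is a hypothetical helper, and it assumes the image files carry an extension such as .nii or .nii.gz:

```shell
#!/bin/sh
# Hypothetical pre-flight check for the files named in the bedpostx usage text.
check_bedpostx_inputs() {
    dir=$1
    status=0
    for f in bvals bvecs; do
        if [ ! -e "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            status=1
        fi
    done
    for img in data nodif_brain_mask; do
        # ls fails when no file is named <img>.<extension>.
        if ! ls "$dir/$img".* >/dev/null 2>&1; then
            echo "missing image: $dir/$img" >&2
            status=1
        fi
    done
    return $status
}
```

The script itself performs equivalent checks with imtest; this version only avoids needing FSL on the PATH.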

II. probtrackx

  Download link: be sure to download it, because even FSL 5.0.9 does not include a GPU version of probtrackx.

When running it, watch out for permission problems and add the permissions: chmod +777 /path/to/probtrackx2_gpu
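
A narrower alternative to chmod +777 is to add only the execute bit when it is missing. A sketch; ensure_executable is a hypothetical helper, and the path to probtrackx2_gpu is whatever location the binary was downloaded to:

```shell
#!/bin/sh
# Hypothetical helper: add the execute permission only if the file
# exists and is not already executable.
ensure_executable() {
    if [ ! -f "$1" ]; then
        echo "ensure_executable: $1 not found" >&2
        return 1
    fi
    [ -x "$1" ] || chmod +x "$1"
}
```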

Afterwards, when running find_the_biggest, remember to specify the output name; the output is written to the home directory.

As for probtrackx run time: a little over 2 hours on the CPU, about 7 minutes on the GPU.



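Overall configuration:

1. Ubuntu 16.04

2. CUDA 8.0

3. FSL 5.0.8 plus the missing GPU files added afterwards

4. Graphics card: GTX 1080 Ti

5. Memory: 32 GB

The FDT part of FSL runs on the GPU without problems.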

