Many users have reported issues running HaplotypeCaller with the -nct argument, so we recommend using Queue to parallelize HaplotypeCaller instead of multithreading.
https://www.broadinstitute.org/gatk/guide/tooldocs/org_broadinstitute_gatk_tools_walkers_haplotypecaller_HaplotypeCaller.php
To make it run faster, you can run it on a machine with a large number of cores or on a Sun Grid Engine cluster. You can use the GATK Queue library together with a small Scala script to start HaplotypeCaller on multiple cores, locally or on a cluster.
I am using Queue. I think it is still useful even if you only have one multi-core workstation. It can speed up IndelRealignment and HaplotypeCaller significantly.
The documentation is clearly inadequate, but it is still possible to figure things out by reading scripts you can download from the internet. However, I can see that many scientists who are not from a programming background might find that learning a new language, Scala, is not worth the effort.
http://gatkforums.broadinstitute.org/gatk/discussion/5353/how-actively-used-is-queue
http://gatkforums.broadinstitute.org/gatk/discussion/5334/how-to-initiate-scatter-gather-on-one-machine
The HaplotypeCaller documentation recommends using Queue to parallelize HaplotypeCaller instead of -nct, so I've been attempting to do that. However, I can't seem to get Queue to do any kind of parallel processing. I'm currently working on a machine with 8 cores, and I can consistently get Queue to run, but it only runs single-threaded. I don't have access to a distributed computing environment, but I don't see why Queue wouldn't be able to parallelize on one machine with multiple cores, and I see no documentation indicating that parallelization by Queue is only available in distributed computing environments.
What I've done is make a minimal modification of the ExampleUnifiedGenotyper.scala script so that it runs HaplotypeCaller. I have run it several times to see how it behaves: a couple of times with just the reference file and mapping file as input, and a couple of times with an intervals file listing each of the chromosomes as a separate interval. Every time, it ran single-threaded.
I've found several articles and comments indicating that Queue should be used to scatter/gather a job, and some even explain how scatter/gather works. So I was under the assumption that this is just what Queue does and that it would use multi-core systems to their full advantage. That is not my experience, and I don't see anything in the documentation to explain why. If someone could explain either how I'm running the command wrong or why Queue can't be used to parallelize on one machine, I would be very grateful.
Ah, sorry for the confusion. Queue is intended more for compute farms and requires a job scheduler (e.g. LSF or GridEngine) to actually dispatch jobs to different nodes.
Do you know if the issues with running HaplotypeCaller with -nct have been addressed in version 3.3-0?
No, they have not been addressed, and probably will not be. We are moving away from multithreading in favor of scatter-gather. It is a much more stable method of parallelism and less prone to race conditions.
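To make the scatter idea concrete, here is a rough sketch in Python of how a run might be split into one independent HaplotypeCaller command per interval. This is not how Queue does it internally. The file names, the output naming scheme, and the function name `scatter_commands` are made up for illustration, and the jar name is assumed to be `GenomeAnalysisTK.jar`. The gather step (merging the per-interval VCFs) is not shown.

```python
def scatter_commands(reference, bam, intervals, gatk_jar="GenomeAnalysisTK.jar"):
    """Build one independent HaplotypeCaller command per interval (the scatter step).

    All paths and the output naming scheme are hypothetical placeholders.
    """
    commands = []
    for interval in intervals:
        out_vcf = f"calls.{interval}.vcf"  # hypothetical per-interval output name
        commands.append([
            "java", "-jar", gatk_jar,
            "-T", "HaplotypeCaller",
            "-R", reference,   # reference FASTA
            "-I", bam,         # input BAM
            "-L", interval,    # restrict this job to one interval
            "-o", out_vcf,
        ])
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them.
    for cmd in scatter_commands("ref.fasta", "sample.bam", ["chr1", "chr2", "chr3"]):
        print(" ".join(cmd))
```

Because each command only touches its own interval and writes its own output file, the jobs share no state, which is why this style of parallelism avoids the race conditions mentioned above.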
So I am trying to run HaplotypeCaller using Queue on a single machine with 32 cores and 256 GB of RAM. I have been using a modified version of this script. I am able to run the Walker (with -sg 50) and it seems to be running the scatter portion. However, I only see one java process running. Shouldn't I see 50? Is there some other switch I need to use to make the Walker run 50 times in parallel?
Queue's scatter-gather is designed to work on a cluster, not on a single machine. On a single machine it will run all jobs sequentially.
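If the jobs are independent, one possible workaround on a single multi-core machine is to launch them yourself with a small worker pool. The sketch below is a hand-rolled stand-in, not a Queue feature. `run_parallel` is a hypothetical helper, and the demo uses trivial placeholder commands rather than real GATK calls. In practice each command would be one per-interval HaplotypeCaller invocation, and `max_workers` would need to be chosen so that all the JVMs fit in memory at once.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_parallel(commands, max_workers=4):
    """Run each command as its own process, at most max_workers at a time.

    Returns the exit codes in the same order as the input commands.
    """
    def run_one(cmd):
        # Each call blocks one worker thread until its child process exits.
        return subprocess.run(cmd).returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order in its results.
        return list(pool.map(run_one, commands))


if __name__ == "__main__":
    # Placeholder jobs standing in for per-interval GATK commands.
    demo = [[sys.executable, "-c", f"print('interval {i} done')"] for i in range(3)]
    exit_codes = run_parallel(demo, max_workers=2)
    print(exit_codes)
```

A non-zero exit code in the returned list marks an interval that failed and would need to be rerun before gathering the results.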