Lab notes | 6/15: compiling and installing speedseq

I just spent an hour reviewing and organizing last week's work.
Next, let me retrace where the problem comes from.

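The result file from this experiment, germline_mutations.txt, shows that RNA variant calling invoked many callers, yet only varscan ended up producing calls. Why?
We need to go back to the somatic.pl source and see how it handles this step. (What can I do right now to solve this?)
With that noted, I am going to eat first and switch over to the Linux machine; the commands can keep running in the meantime.
(17:01) The program is running. Off to eat.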


First I will fix the incomplete installation of speedseq's submodules.
Trying the clone again:
git clone --recursive git://github.com/hall-lab/speedseq

Cloning into ‘speedseq’…
remote: Enumerating objects: 3302, done.
remote: Total 3302 (delta 0), reused 0 (delta 0), pack-reused 3302
Receiving objects: 100% (3302/3302), 14.81 MiB | 4.97 MiB/s, done.
Resolving deltas: 100% (1531/1531), done.
Submodule ‘src/CNVnator’ (https://github.com/abyzovlab/CNVnator) registered for path ‘src/CNVnator’
Submodule ‘src/bamkit’ (https://github.com/cc2qe/bamkit.git) registered for path ‘src/bamkit’
Submodule ‘src/bwa’ (https://github.com/lh3/bwa.git) registered for path ‘src/bwa’
Submodule ‘src/freebayes’ (https://github.com/ekg/freebayes.git) registered for path ‘src/freebayes’
Submodule ‘src/lumpy-sv’ (https://github.com/hall-lab/lumpy-sv.git) registered for path ‘src/lumpy-sv’
Submodule ‘src/parallel’ (http://git.savannah.gnu.org/r/parallel.git) registered for path ‘src/parallel’
Submodule ‘src/samblaster’ (https://github.com/GregoryFaust/samblaster.git) registered for path ‘src/samblaster’
Submodule ‘src/svtyper’ (https://github.com/hall-lab/svtyper.git) registered for path ‘src/svtyper’
Submodule ‘src/tabix’ (https://github.com/samtools/tabix.git) registered for path ‘src/tabix’
Submodule ‘src/vawk’ (https://github.com/cc2qe/vawk.git) registered for path ‘src/vawk’
Cloning into ‘src/CNVnator’…
remote: Enumerating objects: 573, done.
remote: Counting objects: 100% (17/17), done.
remote: Compressing objects: 100% (15/15), done.
remote: Total 573 (delta 3), reused 7 (delta 1), pack-reused 556
Receiving objects: 100% (573/573), 71.93 MiB | 17.40 MiB/s, done.
Resolving deltas: 100% (339/339), done.
Submodule path ‘src/CNVnator’: checked out ‘9d3a92b01ce4f554227b566e4c9b8ba8af42d0af’
Cloning into ‘src/bamkit’…
fatal: unable to access ‘https://github.com/cc2qe/bamkit.git/’: OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to github.com:443
Clone of ‘https://github.com/cc2qe/bamkit.git’ into submodule path ‘src/bamkit’ failed

Comparing two clone attempts made at different times, both stalled while downloading src/bamkit.
Fix:
Enter the cloned directory: cd speedseq
Inside it, run: git submodule update --init --recursive

Cloning into ‘src/bamkit’…
remote: Enumerating objects: 66, done.
remote: Total 66 (delta 0), reused 0 (delta 0), pack-reused 66
Unpacking objects: 100% (66/66), done.
Submodule path ‘src/bamkit’: checked out ‘b5ddbc560491d2e18f071951e55dddc75915922e’
Cloning into ‘src/bwa’…
fatal: unable to access ‘https://github.com/lh3/bwa.git/’: OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to github.com:443
Clone of ‘https://github.com/lh3/bwa.git’ into submodule path ‘src/bwa’ failed

Run the command again: git submodule update --init --recursive

Cloning into ‘src/bwa’…
remote: Enumerating objects: 4326, done.
remote: Counting objects: 100% (31/31), done.
remote: Compressing objects: 100% (23/23), done.
remote: Total 4326 (delta 9), reused 20 (delta 6), pack-reused 4295
Receiving objects: 100% (4326/4326), 1.68 MiB | 2.81 MiB/s, done.
Resolving deltas: 100% (3084/3084), done.
Submodule path ‘src/bwa’: checked out ‘705aa538947a0681de575f0f0d5c65593f80cf14’
Cloning into ‘src/freebayes’…
remote: Enumerating objects: 6389, done.
remote: Counting objects: 100% (256/256), done.
remote: Compressing objects: 100% (179/179), done.
remote: Total 6389 (delta 124), reused 151 (delta 77), pack-reused 6133
Receiving objects: 100% (6389/6389), 7.86 MiB | 2.57 MiB/s, done.
Resolving deltas: 100% (4191/4191), done.
Submodule path ‘src/freebayes’: checked out ‘c003c1e602ab1fc9a4d3389adb0582c40d65123f’
Submodule ‘bamtools’ (https://github.com/ekg/bamtools.git) registered for path ‘bamtools’
Submodule ‘intervaltree’ (https://github.com/ekg/intervaltree.git) registered for path ‘intervaltree’
Submodule ‘bash-tap’ (https://github.com/illusori/bash-tap.git) registered for path ‘test/bash-tap’
Submodule ‘test/test-simple-bash’ (https://github.com/ingydotnet/test-simple-bash.git) registered for path ‘test/test-simple-bash’
Submodule ‘vcflib’ (https://github.com/ekg/vcflib.git) registered for path ‘vcflib’
Cloning into ‘bamtools’…
(stuck again)
fatal: unable to access ‘https://github.com/ekg/bamtools.git/’: OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to github.com:443
Clone of ‘https://github.com/ekg/bamtools.git’ into submodule path ‘bamtools’ failed

Press Ctrl+C to abort the command.
Keep re-running git submodule update --init --recursive until every submodule has been fetched.
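Re-running the same command by hand can be wrapped in a small retry loop. The following is only a sketch under the assumption that the failures are transient network errors; retry is a hypothetical helper name, and the attempt count and delay are arbitrary.

```shell
#!/usr/bin/env bash
# retry MAX DELAY CMD...: run CMD until it succeeds, at most MAX times,
# sleeping DELAY seconds between attempts.
# Returns 0 as soon as CMD succeeds, 1 if every attempt failed.
retry() {
    local max=$1 delay=$2
    shift 2
    local attempt=1
    while true; do
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        echo "attempt $attempt failed, retrying in ${delay}s" >&2
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Intended use inside the cloned speedseq directory (not executed here):
#   retry 20 5 git submodule update --init --recursive
```

Because git submodule update skips submodules that are already checked out, each retry only has to fetch what is still missing.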

Cloning into ‘bamtools’…
remote: Enumerating objects: 4028, done.
remote: Total 4028 (delta 0), reused 0 (delta 0), pack-reused 4028
Receiving objects: 100% (4028/4028), 2.50 MiB | 1.41 MiB/s, done.
Resolving deltas: 100% (2683/2683), done.
Submodule path ‘src/freebayes/bamtools’: checked out ‘e77a43f5097ea7eee432ee765049c6b246d49baa’
Cloning into ‘intervaltree’…
remote: Enumerating objects: 181, done.
remote: Counting objects: 100% (9/9), done.
remote: Compressing objects: 100% (8/8), done.
remote: Total 181 (delta 1), reused 3 (delta 1), pack-reused 172
Receiving objects: 100% (181/181), 126.35 KiB | 0 bytes/s, done.
Resolving deltas: 100% (98/98), done.
Submodule path ‘src/freebayes/intervaltree’: checked out ‘d151b487804861dc9f932e9f1fe4f8c499673cec’
Cloning into ‘test/bash-tap’…
remote: Enumerating objects: 52, done.
remote: Total 52 (delta 0), reused 0 (delta 0), pack-reused 52
Unpacking objects: 100% (52/52), done.
Submodule path ‘src/freebayes/test/bash-tap’: checked out ‘c38fbfa401600cc81ccda66bfc0da3ea56288d03’
Cloning into ‘src/lumpy-sv’…
remote: Enumerating objects: 2942, done.
remote: Total 2942 (delta 0), reused 0 (delta 0), pack-reused 2942
Receiving objects: 100% (2942/2942), 183.20 MiB | 15.24 MiB/s, done.
Resolving deltas: 100% (1583/1583), done.
Submodule path ‘src/lumpy-sv’: checked out ‘dd4bf97704b253f7b5aef72fc28c10b2ef74a2c6’
Cloning into ‘test/test-simple-bash’…
remote: Enumerating objects: 87, done.
remote: Counting objects: 100% (87/87), done.
remote: Compressing objects: 100% (38/38), done.
remote: Total 87 (delta 32), reused 87 (delta 32), pack-reused 0
Unpacking objects: 100% (87/87), done.
Submodule path ‘src/freebayes/test/test-simple-bash’: checked out ‘124673ff204b01c8e96b7fc9f9b32ee35d898acc’
Cloning into ‘vcflib’…
remote: Enumerating objects: 5670, done.
remote: Counting objects: 100% (1756/1756), done.
remote: Compressing objects: 100% (518/518), done.
remote: Total 5670 (delta 1398), reused 1513 (delta 1233), pack-reused 3914
Receiving objects: 100% (5670/5670), 28.51 MiB | 11.86 MiB/s, done.
Resolving deltas: 100% (3900/3900), done.
Submodule path ‘src/freebayes/vcflib’: checked out ‘5ac091365fdc716cc47cc5410bb97ee5dc2a2c92’
Submodule ‘fastahack’ (https://github.com/ekg/fastahack.git) registered for path ‘fastahack’
Submodule ‘filevercmp’ (https://github.com/ekg/filevercmp.git) registered for path ‘filevercmp’
Submodule ‘fsom’ (https://github.com/ekg/fsom.git) registered for path ‘fsom’
Submodule ‘intervaltree’ (https://github.com/ekg/intervaltree.git) registered for path ‘intervaltree’
Submodule ‘multichoose’ (https://github.com/ekg/multichoose.git) registered for path ‘multichoose’
Submodule ‘smithwaterman’ (https://github.com/ekg/smithwaterman.git) registered for path ‘smithwaterman’
Submodule ‘tabixpp’ (https://github.com/ekg/tabixpp.git) registered for path ‘tabixpp’
Cloning into ‘fastahack’…
remote: Enumerating objects: 227, done.
remote: Total 227 (delta 0), reused 0 (delta 0), pack-reused 227
Receiving objects: 100% (227/227), 51.19 KiB | 0 bytes/s, done.
Resolving deltas: 100% (128/128), done.
Submodule path ‘src/freebayes/vcflib/fastahack’: checked out ‘c68cebb4f2e5d5d2b70cf08fbdf1944e9ab2c2dd’
Cloning into ‘src/svtyper’…
remote: Enumerating objects: 1954, done.
remote: Total 1954 (delta 0), reused 0 (delta 0), pack-reused 1954
Receiving objects: 100% (1954/1954), 2.25 MiB | 1.37 MiB/s, done.
Resolving deltas: 100% (1207/1207), done.
Submodule path ‘src/svtyper’: checked out ‘635b8f6e8b17345a1905963877b597110c6906e7’

During this process, an error appeared:

error: RPC failed; result=35, HTTP code = 0

Following this post: https://blog.csdn.net/wangwangstone/article/details/109443947
I ran git config --global http.postBuffer 80M, after which the update ran normally again.

Cloning into ‘filevercmp’…
remote: Enumerating objects: 30, done.
remote: Total 30 (delta 0), reused 0 (delta 0), pack-reused 30
Unpacking objects: 100% (30/30), done.
Submodule path ‘src/freebayes/vcflib/filevercmp’: checked out ‘1a9b779b93d0b244040274794d402106907b71b7’
Cloning into ‘fsom’…
remote: Enumerating objects: 42, done.
remote: Total 42 (delta 0), reused 0 (delta 0), pack-reused 42
Unpacking objects: 100% (42/42), done.
Submodule path ‘src/freebayes/vcflib/fsom’: checked out ‘a6ef318fbd347c53189384aef7f670c0e6ce89a3’

Then another problem came up:

error: RPC failed; result=52, HTTP code = 0

Following advice found online, I changed the network settings:
git config --global http.proxy http://127.0.0.1:1080
git config --global https.proxy http://127.0.0.1:1080
This made things fail even more thoroughly:

Cloning into ‘intervaltree’…
fatal: unable to access ‘https://github.com/ekg/intervaltree.git/’: Failed to connect to 127.0.0.1 port 1080: Connection refused
Clone of ‘https://github.com/ekg/intervaltree.git’ into submodule path ‘intervaltree’ failed
Failed to recurse into submodule path ‘src/freebayes/vcflib’
Cloning into ‘src/tabix’…
fatal: unable to access ‘https://github.com/samtools/tabix.git/’: Failed to connect to 127.0.0.1 port 1080: Connection refused
Clone of ‘https://github.com/samtools/tabix.git’ into submodule path ‘src/tabix’ failed
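The "Connection refused" messages mean nothing was listening on 127.0.0.1:1080, so the proxy setting pointed git at a dead end. Before setting a local proxy, it may help to test the port first. A minimal sketch, assuming bash is available (it relies on bash's /dev/tcp pseudo-device); port_open is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# port_open HOST PORT: succeed if a TCP connection to HOST:PORT can be opened.
# The connection is opened in a subshell and closed when the subshell exits.
port_open() {
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Intended use (not executed here):
#   if port_open 127.0.0.1 1080; then
#       git config --global http.proxy http://127.0.0.1:1080
#   else
#       echo "no proxy listening on 127.0.0.1:1080" >&2
#   fi
```

As far as I know, git consults http.proxy for both http:// and https:// URLs, so the separate https.proxy entry is probably not needed.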

After I removed the proxy settings,
git config --global --unset https.proxy
git config --global --unset http.proxy
things changed dramatically:

Cloning into ‘intervaltree’…
remote: Enumerating objects: 181, done.
remote: Counting objects: 100% (9/9), done.
remote: Compressing objects: 100% (8/8), done.
remote: Total 181 (delta 1), reused 3 (delta 1), pack-reused 172
Receiving objects: 100% (181/181), 126.35 KiB | 0 bytes/s, done.
Resolving deltas: 100% (98/98), done.
Submodule path ‘src/freebayes/vcflib/intervaltree’: checked out ‘1290744283cef8076bb8a2968d4899b7228435f4’
Cloning into ‘multichoose’…
remote: Enumerating objects: 109, done.
remote: Total 109 (delta 0), reused 0 (delta 0), pack-reused 109
Receiving objects: 100% (109/109), 24.24 KiB | 0 bytes/s, done.
Resolving deltas: 100% (56/56), done.
Submodule path ‘src/freebayes/vcflib/multichoose’: checked out ‘73d35daa18bf35729b9ba758041a9247a72484a5’
Cloning into ‘smithwaterman’…
remote: Enumerating objects: 312, done.
remote: Total 312 (delta 0), reused 0 (delta 0), pack-reused 312
Receiving objects: 100% (312/312), 90.24 KiB | 0 bytes/s, done.
Resolving deltas: 100% (197/197), done.
Submodule path ‘src/freebayes/vcflib/smithwaterman’: checked out ‘203218b47d45ac56ef234716f1bd4c741b289be1’
Cloning into ‘src/tabix’…
remote: Enumerating objects: 391, done.
remote: Total 391 (delta 0), reused 0 (delta 0), pack-reused 391
Receiving objects: 100% (391/391), 145.37 KiB | 0 bytes/s, done.
Resolving deltas: 100% (229/229), done.
Submodule path ‘src/tabix’: checked out ‘1ae158ac79b459f5feeed7490c67519b14ce9f35’
(my network really is fickle)
Cloning into ‘src/vawk’…
remote: Enumerating objects: 102, done.
remote: Counting objects: 100% (3/3), done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 102 (delta 0), reused 1 (delta 0), pack-reused 99
Receiving objects: 100% (102/102), 25.39 KiB | 0 bytes/s, done.
Resolving deltas: 100% (44/44), done.
Submodule path ‘src/vawk’: checked out ‘10b8cf0916edadd57f80d5e99e32bf7534523af3’

Another error:

OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to github.com:443

Trying git submodule update --init --recursive again:

Cloning into ‘tabixpp’…
remote: Enumerating objects: 148, done.
remote: Counting objects: 100% (13/13), done.
remote: Compressing objects: 100% (11/11), done.
remote: Total 148 (delta 4), reused 6 (delta 2), pack-reused 135
Receiving objects: 100% (148/148), 89.31 KiB | 0 bytes/s, done.
Resolving deltas: 100% (57/57), done.
Submodule path ‘src/freebayes/vcflib/tabixpp’: checked out ‘bbc63a49acc52212199f92e9e3b8fba0a593e3f7’

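Once running git submodule update --init --recursive no longer fetches anything, the speedseq clone is essentially complete.
Because this was so hard-won, with many interrupted downloads, I tried to back up the folder containing the downloaded submodules to Baidu Netdisk (the number of files exceeded its upload limit, so in the end I moved the copy to the F: drive on my computer instead).
The submodule installation is done.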
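Rather than inferring completion from the absence of output, the submodule state can be checked directly: in the output of git submodule status --recursive, a leading "-" marks a submodule that has not been initialized. A minimal sketch; count_missing is a hypothetical helper that only counts such lines from standard input.

```shell
#!/usr/bin/env bash
# count_missing: read `git submodule status --recursive` output on stdin and
# print how many submodules are still uninitialized (lines starting with "-").
# grep -c exits non-zero when the count is 0, so the exit status is masked.
count_missing() {
    grep -c '^-' || true
}

# Intended use inside the speedseq directory (not executed here):
#   git submodule status --recursive | count_missing
```

A result of 0 would suggest that every registered submodule has been checked out.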

Next, compile it:
make
make finished:

make[1]: Leaving directory `/home/xxzhang/workplace/software/speedseq’

The build is provisionally complete. How can I check whether speedseq is actually installed correctly?
Run:

/home/xxzhang/workplace/software/speedseq/bin/speedseq somatic -F 0.01 -q 10 -t 20 -T ./output_RNA/sptmp -o ./output_RNA/speedseq ./geneome/hg19/hg19.fa /home/xxzhang/workplace/QBRC/output_RNA/normal/normal.bam /home/xxzhang/workplace/QBRC/output_RNA/tumor/tumor.bam

This time the error message is:

Error: BAM file /home/xxzhang/workplace/QBRC/output_RNA/normal/normal.bam not found.

rather than the earlier:

Error: freebayes executable not found. Please set path in /home/xxzhang/workplace/software/speedseq/bin/speedseq.config file

So the earlier error no longer occurs, and the speedseq installation is provisionally complete.
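Since the remaining failure is just a missing input file, a small pre-flight check could catch it before launching a caller. A minimal sketch; missing_files is a hypothetical helper and not part of speedseq.

```shell
#!/usr/bin/env bash
# missing_files PATH...: print each PATH that is not a readable regular file.
# Returns 1 if at least one path was missing, 0 otherwise.
missing_files() {
    local status=0 f
    for f in "$@"; do
        if [ ! -f "$f" ] || [ ! -r "$f" ]; then
            echo "$f"
            status=1
        fi
    done
    return "$status"
}

# Intended use before the speedseq somatic call (not executed here):
#   missing_files ./geneome/hg19/hg19.fa \
#       /home/xxzhang/workplace/QBRC/output_RNA/normal/normal.bam \
#       /home/xxzhang/workplace/QBRC/output_RNA/tumor/tumor.bam
```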


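Next, go back to the notes from the previous pipeline run, look at the errors recorded there, and try to fix them. Once they are fixed, I hope to leave the pipeline running on the server before I go.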

  • shimmer
    /home/xxzhang/workplace/software/Shimmer/shimmer.pl --minqual 25 --ref ./geneome/hg19/hg19.fa /home/xxzhang/workplace/QBRC/output_RNA/normal/normal.bam /home/xxzhang/workplace/QBRC/output_RNA/tumor/tumor.bam --outdir ./output_RNA

/usr/bin/perl: symbol lookup error: /opt/perl5/lib/perl5/x86_64-linux-thread-multi/auto/List/Util/Util.so: undefined symbol: Perl_xs_handshake

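My guess at the cause:
To run a Perl script, the command should be prefixed with perl.
(The message itself says that List::Util's compiled Util.so under /opt/perl5 could not be loaded by /usr/bin/perl, which usually means the module was built for a different Perl version. The perl prefix therefore only helps if the perl found on PATH is the interpreter those modules were built for.)
So the command should be changed to: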
perl /home/xxzhang/workplace/software/Shimmer/shimmer.pl --minqual 25 --ref ./geneome/hg19/hg19.fa /home/xxzhang/workplace/QBRC/output_RNA/normal/normal.bam /home/xxzhang/workplace/QBRC/output_RNA/tumor/tumor.bam --outdir ./output_RNA

perl /home/xxzhang/workplace/QBRC//somatic_script/add_readct_shimmer.pl ./output_RNA/som_counts.bh.txt ./output_RNA/somatic_diffs.vcf ./output_RNA/somatic_diffs.readct.vcf

No such file or directory at /home/xxzhang/workplace/QBRC//somatic_script/add_readct_shimmer.pl line 6.

  • lofreq
    /home/xxzhang/miniconda3/bin/lofreq2_call_parallel.py --pp-threads 20 -s --sig 0.1 --bonf 1 -C 7 -f ./geneome/hg19/hg19.fa -S ./geneome/hg19/hg19.fa_resource/dbsnp.hg19.vcf --call-indels -l ./geneome/hg19/hg19.fa.exon.bed -o ./output_RNA/lofreq_t.vcf /home/xxzhang/workplace/QBRC/output_RNA/tumor/tumor.bam

FileNotFoundError: [Errno 2] No such file or directory: ‘lofreq’: ‘lofreq’

  • strelka
    /home/xxzhang/workplace/software/strelka/bin/configureStrelkaSomaticWorkflow.py --normalBam /home/xxzhang/workplace/QBRC/output_RNA/normal/normal.bam --tumorBam /home/xxzhang/workplace/QBRC/output_RNA/tumor/tumor.bam --referenceFasta ./geneome/hg19/hg19.fa --runDir ./output_RNA/strelka --exome --indelCandidates ./output_RNA/manta/results/variants/candidateSmallIndels.vcf.gz

Are we invoking these commands incorrectly during calling?
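For the shimmer failure, one thing that may be worth checking is which interpreter the script's shebang line selects, compared with the perl found on PATH, because the symbol lookup error names /usr/bin/perl while the module lives under /opt/perl5. A minimal sketch; shebang_interpreter is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# shebang_interpreter FILE: print the interpreter path from FILE's "#!" line,
# or print nothing if the first line is not a shebang.
# For "#!/usr/bin/env perl" this prints /usr/bin/env, not the resolved perl.
shebang_interpreter() {
    head -n 1 "$1" | sed -n 's/^#![[:space:]]*\([^[:space:]]*\).*/\1/p'
}

# Intended comparison (not executed here):
#   shebang_interpreter /home/xxzhang/workplace/software/Shimmer/shimmer.pl
#   command -v perl
```

If the two paths differ, running the script through the perl on PATH, as in the modified command, uses a different interpreter than the direct invocation did.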

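I have no ideas yet for lofreq and strelka. For now, run the whole pipeline on the server; if it fails again, tomorrow I will split it up and run the steps one by one.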
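One possible reading of the lofreq error: the FileNotFoundError names the bare command lofreq, which suggests that lofreq2_call_parallel.py launches lofreq by name and could not find it on PATH. A minimal sketch of putting a directory on PATH first; prepend_path is a hypothetical helper, and it is only an assumption that the lofreq binary sits in /home/xxzhang/miniconda3/bin next to the script.

```shell
#!/usr/bin/env bash
# prepend_path DIR: put DIR at the front of PATH unless it is already listed.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, leave PATH unchanged
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

# Intended use before the lofreq call (not executed here):
#   prepend_path /home/xxzhang/miniconda3/bin
#   command -v lofreq    # should print a path if the assumption holds
```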
qsub -I
This switches to a compute node.
nohup perl somatic.pl RNA:./data/SRR3052083.fastq.gz NA RNA:./data/SRR5297065.fastq.gz NA 20 hg19 ./geneome/hg19/hg19.fa /home/xxzhang/workplace/software/java/jdk1.7.0_80/bin/java ./output_RNA human 1 ./disambiguate_pipeline>pipeline_RNA_6.txt &

The job is left running in the background, with its screen output written to pipeline_RNA_6.txt; I will check the results tomorrow morning.
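The command above redirects only standard output explicitly. Adding 2>&1 makes the capture of error messages explicit instead of depending on how nohup treats a terminal. A minimal sketch; run_logged is a hypothetical wrapper name.

```shell
#!/usr/bin/env bash
# run_logged LOG CMD...: run CMD in the background under nohup, appending both
# stdout and stderr to LOG. The background PID is left in $! for the caller.
run_logged() {
    local log=$1
    shift
    nohup "$@" >>"$log" 2>&1 &
}

# Intended use for the pipeline (arguments elided, not executed here):
#   run_logged pipeline_RNA_6.txt perl somatic.pl ...
#   echo "started as PID $!"
```

With both streams in one file, Perl and caller error messages appear next to the regular progress output when the log is read the next morning.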
