Running FastQC on a Linux System or Server

FastQC website: Babraham Bioinformatics - FastQC, A Quality Control tool for High Throughput Sequence Data

The command that worked:

fastqc --noextract 201645A_200048_1_S1_L001_R1_001.fastq.gz

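The first thing to do after receiving sequencing data is usually quality control. FastQC is one of the more commonly used QC tools; its advantages are that it is small, easy to install, and very simple to operate.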

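FastQC can be used in two ways:

1. On Windows: the most basic way to use FastQC, through an interactive graphical interface. It is very easy to use, but it only suits checking small batches of data. For data sets over 100 GB, or in the terabyte range, try it only if you are not afraid of crashing your computer.

2. On Linux: run from the command line, which suits quality control of large batches of sequencing data.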

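Note that when I ran it directly with default parameters, it failed with the error below. The message is printed by the Java launcher when it cannot locate FastQC's main class: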

fastqc 201645A_200048_1_S1_L001_R1_001.fastq.gz 
Error: Could not find or load main class uk.ac.babraham.FastQC.FastQCApplication

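Reading the help file gives the impression that the FastQC authors recommend using Windows, because it devotes a lot of space to "Running FastQC Interactively".

The help file also explains how to run FastQC on Linux, along with the available options: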

To run non-interactively you simply have to specify a list of files to process
on the commandline

1. fastqc somefile.txt someotherfile.txt    (I first read this as: create a txt file containing the FASTQ file names)

You can specify as many files to process in a single run as you like.  If you don't
specify any files to process the program will try to open the interactive application
which may result in an error if you're running in a non-graphical environment.

There are a few extra options you can specify when running non-interactively.  Full
details of these can be found by running 

2. fastqc --help

By default, in non-interactive mode FastQC will create an HTML report with embedded
graphs, but also a zip file containing individual graph files and additional data files
containing the raw data from which plots were drawn.  The zip file will not be extracted
by default but you can enable this by adding:

3. --extract    (unpack the zip file)

To the launch command.

If you want to save your reports in a folder other than the folder which contained
your original FastQ files then you can specify an alternative location by setting a
--outdir value:

4. --outdir=/some/other/dir/

If you want to run fastqc on a stream of data to be read from standard input then you
can do this by specifying 'stdin' as the name of the file to be processed and then
streaming uncompressed fastq format data to the program.  For example:

zcat *fastq.gz | fastqc stdin

If you want the results from a streamed analysis sent to a file with a name other than
stdin then you can add a colon and put the file name you want, for example:

zcat *fastq.gz | fastqc stdin:my_results

..would write results to my_result.html and my_results.zip.
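The options quoted above can be combined into a small batch wrapper. What follows is only a sketch under stated assumptions: the function name run_fastqc_batch, the DRY_RUN variable, and the directory layout are all hypothetical, while --noextract and --outdir are the options shown in this post. The output directory is created up front because FastQC will not create it for you.

```shell
#!/bin/sh
# Sketch: run FastQC on every *.fastq.gz file in one directory.
# run_fastqc_batch and DRY_RUN are hypothetical names used for illustration.

run_fastqc_batch() {
    in_dir=$1
    out_dir=$2

    # Create the output directory first; FastQC expects it to exist.
    mkdir -p "$out_dir" || return 1

    # Collect the input files as positional parameters.
    set -- "$in_dir"/*.fastq.gz
    if [ ! -e "$1" ]; then
        echo "no .fastq.gz files found in $in_dir" >&2
        return 1
    fi

    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Print the command instead of running it.
        echo fastqc --noextract "--outdir=$out_dir" "$@"
    else
        fastqc --noextract "--outdir=$out_dir" "$@"
    fi
}
```

Calling it as `DRY_RUN=1 run_fastqc_batch ./raw ./qc` prints the command that would be run, which is a cheap way to check the file list before starting a long job.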

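Taking the help file to mean a text file listing the file names, I created 200048.txt containing the name of the file to be checked. This did not help. FastQC parses every argument as a sequence file in its own right, not as a list of file names, so it still fails: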

cat 200048.txt 
201645A_200048_1_S1_L001_R1_001.fastq.gz

fastqc 200048.txt 
Failed to process 200048.txt
uk.ac.babraham.FastQC.Sequence.SequenceFormatException: ID line didn't start with '@'
	at uk.ac.babraham.FastQC.Sequence.FastQFile.readNext(FastQFile.java:158)
	at uk.ac.babraham.FastQC.Sequence.FastQFile.<init>(FastQFile.java:89)
	at uk.ac.babraham.FastQC.Sequence.SequenceFactory.getSequenceFile(SequenceFactory.java:106)
	at uk.ac.babraham.FastQC.Sequence.SequenceFactory.getSequenceFile(SequenceFactory.java:62)
	at uk.ac.babraham.FastQC.Analysis.OfflineRunner.processFile(OfflineRunner.java:159)
	at uk.ac.babraham.FastQC.Analysis.OfflineRunner.<init>(OfflineRunner.java:121)
	at uk.ac.babraham.FastQC.FastQCApplication.main(FastQCApplication.java:316)
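The exception above is FastQC complaining that the first line of its input is not a FASTQ header, since every FASTQ record begins with a line starting with '@'. A quick pre-check along those lines might look like the sketch below. The function name looks_like_fastq is hypothetical, and the check is deliberately shallow: it only inspects the first line and does not validate the whole file.

```shell
#!/bin/sh
# Sketch: does the first line of a (possibly gzipped) file start with '@'?
# looks_like_fastq is a hypothetical helper, not part of FastQC.

looks_like_fastq() {
    file=$1

    # Read only the first line, decompressing .gz files on the fly.
    case $file in
        *.gz) first_line=$(gzip -cd < "$file" 2>/dev/null | head -n 1) ;;
        *)    first_line=$(head -n 1 < "$file") ;;
    esac

    # A FASTQ record starts with an ID line beginning with '@'.
    case $first_line in
        @*) return 0 ;;
        *)  return 1 ;;
    esac
}
```

With this in place, `looks_like_fastq 200048.txt || echo "not a FASTQ file"` would have flagged the text file before FastQC was ever started.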

Reading carefully through the output of fastqc --help, I found this option:

    --noextract     Do not uncompress the output file after creating it.  You
                    should set this option if you do not wish to uncompress
                    the output when running in non-interactive mode.

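After adding this option, the run succeeded in my case. Note that, according to the help text, leaving the zip file unextracted is already the default, so the earlier class-loading error may have had a different cause: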

fastqc --noextract  201645A_200048_1_S1_L001_R1_001.fastq.gz 
Started analysis of 201645A_200048_1_S1_L001_R1_001.fastq.gz
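Once a run has started successfully, it can be handy to confirm afterwards that a report was actually written. The sketch below assumes FastQC's usual naming for file inputs, where the report is named after the input with the .gz and .fastq (or .fq) suffixes removed and _fastqc.html appended. The helper name expected_report is hypothetical, and the naming rule should be checked against the version you have installed.

```shell
#!/bin/sh
# Sketch: derive the report name FastQC is assumed to write for an input file.
# expected_report is a hypothetical helper; the naming rule is an assumption.

expected_report() {
    name=$(basename "$1")
    name=${name%.gz}      # drop the compression suffix, if present
    name=${name%.fastq}   # drop the FASTQ suffix, if present
    name=${name%.fq}
    echo "${name}_fastqc.html"
}
```

For the file used in this post, `expected_report 201645A_200048_1_S1_L001_R1_001.fastq.gz` gives 201645A_200048_1_S1_L001_R1_001_fastqc.html, which can then be tested with `[ -s ... ]` in the output directory.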

Note: FastQC depends on Java, so a JRE must be installed before using it, whether on Windows or on Linux.
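Because of the Java dependency, it is worth checking that both java and fastqc are on the PATH before starting a batch. The following is a minimal sketch. The helper names have_cmd and preflight are hypothetical, and the check only confirms that the commands can be found, not that the installation itself is intact.

```shell
#!/bin/sh
# Sketch: verify that required commands can be found on the PATH.
# have_cmd and preflight are hypothetical helper names.

have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

preflight() {
    missing=0
    for tool in "$@"; do
        if ! have_cmd "$tool"; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    # Non-zero exit status if any tool was not found.
    return "$missing"
}
```

A typical use would be `preflight java fastqc && fastqc --version`, which prints the FastQC version only when both commands were found.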
