htsjdk库ReferenceSequenceFile接口介绍

ReferenceSequenceFile 是 HTSJDK 库中的一个接口,用于表示参考基因组文件。它定义了读取参考基因组序列的标准方法。这个接口使得不同类型的参考基因组文件(如 FASTA 格式)能够以统一的方式进行访问。

ReferenceSequenceFile 接口主要功能

  • 访问参考序列

    • 提供获取参考基因组序列的接口方法,使得用户能够从参考基因组文件中读取特定的序列数据(contig)。
  • 支持不同格式

    • 该接口可以被不同格式的参考基因组文件实现,例如 FASTA 文件。具体实现类会根据文件格式提供相应的功能。

ReferenceSequenceFile接口的实现类:

ReferenceSequenceFile接口源码:

/*
 * The MIT License
 *
 * Copyright (c) 2009 The Broad Institute
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 * THE SOFTWARE.
 */

package htsjdk.samtools.reference;

import htsjdk.samtools.SAMSequenceDictionary;

import java.io.Closeable;
import java.io.IOException;

/**
 * An interface for working with files of reference sequences regardless of the file format
 * being used.
 *
 * @author Tim Fennell
 */
public interface ReferenceSequenceFile extends Closeable {

    /**
     * Must return a sequence dictionary with at least the following fields completed
     * for each sequence: name, length.
     *
     * @return a list of sequence records representing the sequences in this reference file
     */
    public SAMSequenceDictionary getSequenceDictionary();

    /**
     * Retrieves the next whole sequences from the file.
     * @return a ReferenceSequence or null if at the end of the file
     */
    public ReferenceSequence nextSequence();

    /**
     * Resets the ReferenceSequenceFile so that the next call to nextSequence() will return
     * the first sequence in the file.
     */
    public void reset();


    /**
     * @return true if getSequence and getSubsequenceAt methods are allowed.
     */
    public boolean isIndexed();

    /**
     * Retrieves the complete sequence described by this contig.
     * @param contig contig whose data should be returned.
     * @return The full sequence associated with this contig.
     * @throws UnsupportedOperationException if !sIndexed.
     */
    public ReferenceSequence getSequence( String contig );

    /**
     * Gets the subsequence of the contig in the range [start,stop]
     * @param contig Contig whose subsequence to retrieve.
     * @param start inclusive, 1-based start of region.
     * @param stop inclusive, 1-based stop of region.
     * @return The partial reference sequence associated with this range.
     * @throws UnsupportedOperationException if !sIndexed.
     */
    public ReferenceSequence getSubsequenceAt( String contig, long start, long stop );
    
    /**
     * @return Reference name, file name, or something other human-readable representation.
     */
    public String toString();

    @Override
    public void close() throws IOException;
}

使用GATK的combinegvcf模块合并gvcf文件,可是到了这一步Using GATK jar /stor9000/apps/users/NWSUAF/2022050434/biosoft/gatk4.3/gatk-4.3.0.0/gatk-package-4.3.0.0-local.jar Running: java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /stor9000/apps/users/NWSUAF/2022050434/biosoft/gatk4.3/gatk-4.3.0.0/gatk-package-4.3.0.0-local.jar CombineGVCFs -R /stor9000/apps/users/NWSUAF/2008115251/genomes/ARS-UCD1.2_Btau5.0.1Y.fa --variant /stor9000/apps/users/NWSUAF/2020055419/home/xncattle/03.GVCF/01_out_GVCF/XN_22/1_XN_22.g.vcf.gz --variant /stor9000/apps/users/NWSUAF/2020055419/home/xncattle/03.GVCF/01_out_GVCF/XN_18/1_XN_18.g.vcf.gz -O /stor9000/apps/users/NWSUAF/2022050469/candy/bwa/gatk/Combine/chr1.g.vcf.gz 09:10:40.524 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/stor9000/apps/users/NWSUAF/2022050434/biosoft/gatk4.3/gatk-4.3.0.0/gatk-package-4.3.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so 09:10:50.696 INFO CombineGVCFs - ------------------------------------------------------------ 09:10:50.697 INFO CombineGVCFs - The Genome Analysis Toolkit (GATK) v4.3.0.0 09:10:50.697 INFO CombineGVCFs - For support and documentation go to https://software.broadinstitute.org/gatk/ 09:10:50.698 INFO CombineGVCFs - Executing as 2022050469@node54 on Linux v3.10.0-1127.el7.x86_64 amd64 09:10:50.698 INFO CombineGVCFs - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_72-b15 09:10:50.698 INFO CombineGVCFs - Start Date/Time: July 21, 2023 9:10:40 AM CST 09:10:50.698 INFO CombineGVCFs - ------------------------------------------------------------ 09:10:50.698 INFO CombineGVCFs - ------------------------------------------------------------ 09:10:50.698 INFO CombineGVCFs - HTSJDK Version: 3.0.1 09:10:50.699 INFO CombineGVCFs - Picard Version: 2.27.5 09:10:50.699 INFO CombineGVCFs - Built for Spark Version: 2.4.5 09:10:50.699 INFO CombineGVCFs - HTSJDK Defaults.COMPRESSION_LEVEL : 2 09:10:50.699 INFO CombineGVCFs - HTSJDK Defa就停止了,没有输出文件,也没有报错文件
07-22
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值