Introduction to the ReferenceSequence class in the htsjdk library

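ReferenceSequence is a class in the HTSJDK library that wraps a reference sequence read from a reference file, either a whole contig or a subsequence of one. It is the basic container for reference genome data: it holds the sequence name, the index of the contig in the source file, and the bases themselves.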

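Overview of the ReferenceSequence class

Features
  • Represents a reference sequence

    • ReferenceSequence wraps the sequence data of one contig (for example a chromosome or scaffold) in the reference genome: the sequence name, the 0-based index of the contig in the source file, and the actual nucleotide sequence.
  • Provides the sequence data

    • The class is a simple read-only holder. It exposes the bases as a byte array or as a String and reports the sequence length. It stores no start or end coordinates of its own.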
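Main properties and methods
  1. Sequence name and position

    • getName(): returns the name of the reference sequence (the contig name).
    • getContigIndex(): returns the 0-based index of this contig in the source file.
    • length(): returns the length of the sequence in bases.
  2. Sequence data

    • getBaseString(): returns the bases as a String, for example "ACGT". The bases are copied and converted to two-byte characters, so this is a convenience for short sequences rather than whole chromosomes.
    • getBases(): returns the byte array of bases. This is the internal array, not a copy, so callers must not modify it.
  3. Region access

    • ReferenceSequence stores no genomic coordinates and has no slicing method. To read a region, call ReferenceSequenceFile#getSubsequenceAt(String, long, long), which returns the requested region as a new ReferenceSequence.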
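The accessors can be tried out with a small sketch. The source listing in this article exposes getName(), getContigIndex(), getBases(), getBaseString() and length(). To run without htsjdk on the classpath, the sketch uses SimpleReferenceSequence, a hypothetical stand-in that mirrors those accessors. The region() helper is also hypothetical and assumes 1-based, inclusive coordinates. In real code a region would normally come from ReferenceSequenceFile#getSubsequenceAt(String, long, long).

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in mirroring the accessors of htsjdk's ReferenceSequence,
// so that this sketch compiles and runs without the htsjdk jar.
final class SimpleReferenceSequence {
    private final String name;
    private final int contigIndex;
    private final byte[] bases;

    SimpleReferenceSequence(String name, int index, byte[] bases) {
        this.name = name;
        this.contigIndex = index;
        this.bases = bases;
    }

    String getName() { return name; }
    int getContigIndex() { return contigIndex; }
    byte[] getBases() { return bases; }
    int length() { return bases.length; }
    String getBaseString() { return new String(bases, StandardCharsets.US_ASCII); }
}

class ReferenceSequenceDemo {
    // Hypothetical helper: copies the region [start, end], assumed to be
    // 1-based and inclusive, into a new sequence object.
    static SimpleReferenceSequence region(SimpleReferenceSequence seq, int start, int end) {
        if (start < 1 || end > seq.length() || start > end) {
            throw new IllegalArgumentException("Invalid region " + start + "-" + end);
        }
        byte[] slice = Arrays.copyOfRange(seq.getBases(), start - 1, end);
        return new SimpleReferenceSequence(seq.getName(), seq.getContigIndex(), slice);
    }

    public static void main(String[] args) {
        SimpleReferenceSequence chr1 = new SimpleReferenceSequence(
                "chr1", 0, "ACGTACGTNN".getBytes(StandardCharsets.US_ASCII));
        // Name, index in the source file, and length in bases.
        System.out.println(chr1.getName() + " index=" + chr1.getContigIndex()
                + " length=" + chr1.length());
        // Bases 3..6 of the sequence, as a String.
        System.out.println(region(chr1, 3, 6).getBaseString()); // prints "GTAC"
    }
}
```

The helper copies the slice with Arrays.copyOfRange, so the returned object does not share its array with the original sequence.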
Source code
/*
 * The MIT License
 *
 * Copyright (c) 2009 The Broad Institute
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 * THE SOFTWARE.
 */

package htsjdk.samtools.reference;

import htsjdk.beta.plugin.HtsRecord;
import htsjdk.samtools.util.StringUtil;

/**
 * Wrapper around a reference sequence that has been read from a reference file.
 *
 * @author Tim Fennell
 */
public class ReferenceSequence implements HtsRecord {
    private final String name;
    private final byte[] bases;
    private final int contigIndex;
    private final int length;

    /**
     * creates a fully formed ReferenceSequence
     *
     * @param name the name of the sequence from the source file
     * @param index the zero based index of this contig in the source file
     * @param bases the bases themselves stored as one-byte characters
     */
    public ReferenceSequence(String name, int index, byte[] bases) {
        this.name = name;
        this.contigIndex = index;
        this.bases = bases;
        this.length = bases.length;
    }

    /** Gets the set of names given to this sequence in the source file. */
    public String getName() { return name; }

    /**
     * Gets the array of bases that define this sequence. The bases can include any
     * letter and possibly include masking information in the form of lower case
     * letters.  This array is mutable (obviously!) and it NOT a clone of the array
     * held interally.  Do not modify it!!!
     */
    public byte[] getBases() { return bases; }

    /**
     * Returns the bases represented by this ReferenceSequence as a String. Since this will copy the bases
     * and convert them to two-byte characters, this should not be used on very long reference sequences,
     * but as a convenience when manipulating short sequences returned by
     * {@link ReferenceSequenceFile#getSubsequenceAt(String, long, long)}
     *
     * @return The set of bases represented by this ReferenceSequence, as a String
     */
    public String getBaseString() { return StringUtil.bytesToString(bases); }

    /** Gets the 0-based index of this contig in the source file from which it came. */
    public int getContigIndex() { return contigIndex; }

    /** Gets the length of this reference sequence in bases. */
    public int length() { return length; }
    
    public String toString() {
        return "ReferenceSequence " + getName();
    }
}
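The Javadoc of getBases() warns that the returned array is the internal one and that lower-case letters may carry masking information. A minimal sketch of working with such an array safely follows. BaseArrayDemo and its two helpers are hypothetical names and are not part of htsjdk. The byte array in main stands in for the value returned by getBases().

```java
import java.nio.charset.StandardCharsets;

class BaseArrayDemo {
    // Counts lower-case letters, which the Javadoc describes as masking information.
    static int countLowerCase(byte[] bases) {
        int n = 0;
        for (byte b : bases) {
            if (b >= 'a' && b <= 'z') {
                n++;
            }
        }
        return n;
    }

    // Returns an upper-cased copy; the input array is left untouched.
    static byte[] upperCaseCopy(byte[] bases) {
        byte[] copy = bases.clone();
        for (int i = 0; i < copy.length; i++) {
            if (copy[i] >= 'a' && copy[i] <= 'z') {
                copy[i] = (byte) (copy[i] - ('a' - 'A'));
            }
        }
        return copy;
    }

    public static void main(String[] args) {
        // Stands in for the array returned by ReferenceSequence.getBases().
        byte[] bases = "ACgtNNac".getBytes(StandardCharsets.US_ASCII);
        byte[] upper = upperCaseCopy(bases);
        System.out.println(countLowerCase(bases));                        // prints 4
        System.out.println(new String(upper, StandardCharsets.US_ASCII)); // prints "ACGTNNAC"
        System.out.println(new String(bases, StandardCharsets.US_ASCII)); // prints "ACgtNNac"
    }
}
```

Cloning before any modification keeps the original ReferenceSequence intact for other code that holds a reference to the same object.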
