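The Cigar class in the HTSJDK library represents how a read aligns to the reference sequence. A CIGAR (Compact Idiosyncratic Gapped Alignment Report) string is a sequence of operations, each consisting of a length followed by an operator that says how that run of bases is treated in the alignment. The Cigar class encapsulates this information and provides methods for accessing and manipulating it.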
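To make the string format concrete, the following is a minimal, self-contained sketch that splits a CIGAR string into its length-plus-operator tokens with a regular expression. It does not use HTSJDK: the class name `CigarStringSketch` and the method `parse` are hypothetical names chosen for illustration, and the sketch assumes well-formed input in which every operator is preceded by an explicit length (the `*` placeholder for an unavailable CIGAR is rejected).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of HTSJDK.
class CigarStringSketch {
    // One CIGAR element: a decimal length followed by a single operator character.
    private static final Pattern ELEMENT = Pattern.compile("(\\d+)([MIDNSHP=X])");

    /** Splits a CIGAR string such as "10M1D25M" into tokens such as "10M", "1D", "25M". */
    static List<String> parse(final String cigar) {
        final List<String> elements = new ArrayList<>();
        final Matcher matcher = ELEMENT.matcher(cigar);
        int end = 0;
        while (matcher.find()) {
            // Every element must start exactly where the previous one ended.
            if (matcher.start() != end) {
                throw new IllegalArgumentException("Unexpected text at offset " + end + " in: " + cigar);
            }
            elements.add(matcher.group(1) + matcher.group(2));
            end = matcher.end();
        }
        // Anything left over after the last element is not valid CIGAR text.
        if (end != cigar.length()) {
            throw new IllegalArgumentException("Unexpected text at offset " + end + " in: " + cigar);
        }
        return elements;
    }

    public static void main(final String[] args) {
        System.out.println(parse("10M1D25M")); // prints [10M, 1D, 25M]
    }
}
```

In real code the parsing would normally be left to HTSJDK itself; the sketch only shows how the compact string breaks down into elements.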
Class overview
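A Cigar is essentially a list of CigarElement objects, each holding an operator and a length. The operators are alignment match (M, which covers both sequence matches and mismatches), insertion (I), deletion (D), skipped reference region (N), soft clip (S), hard clip (H), padding (P), sequence match (=), and sequence mismatch (X).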
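The operator/length pairing can be sketched without HTSJDK as follows. The types `Op` and `Element` and the methods `readLength` and `referenceLength` are hypothetical stand-ins written for this example, not the library's classes. The consumes-read and consumes-reference flags follow the CIGAR table in the SAM specification, and the reference-length rule agrees with the operators counted by `getReferenceLength()` in the source listed below.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration; not the HTSJDK classes.
class CigarLengthSketch {
    enum Op {
        // (consumes read bases, consumes reference bases)
        M(true, true), I(true, false), D(false, true), N(false, true),
        S(true, false), H(false, false), P(false, false),
        EQ(true, true), X(true, true);

        final boolean consumesRead;
        final boolean consumesReference;

        Op(final boolean consumesRead, final boolean consumesReference) {
            this.consumesRead = consumesRead;
            this.consumesReference = consumesReference;
        }
    }

    // One element of the list: an operator applied to `length` consecutive bases.
    static final class Element {
        final int length;
        final Op op;

        Element(final int length, final Op op) {
            this.length = length;
            this.op = op;
        }
    }

    // Sums the lengths of all elements that consume read bases.
    static int readLength(final List<Element> elements) {
        int total = 0;
        for (final Element e : elements) {
            if (e.op.consumesRead) total += e.length;
        }
        return total;
    }

    // Sums the lengths of all elements that consume reference bases.
    static int referenceLength(final List<Element> elements) {
        int total = 0;
        for (final Element e : elements) {
            if (e.op.consumesReference) total += e.length;
        }
        return total;
    }

    public static void main(final String[] args) {
        // 10M1D25M: the deletion consumes one reference base but no read base.
        final List<Element> cigar = Arrays.asList(
                new Element(10, Op.M), new Element(1, Op.D), new Element(25, Op.M));
        System.out.println(readLength(cigar));      // prints 35
        System.out.println(referenceLength(cigar)); // prints 36
    }
}
```

Storing the two flags on the operator keeps the length calculations free of per-operator switch statements; the library source below uses an explicit switch for the reference length and a `consumesReadBases()` check for the read length.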
Cigar.java source code
/*
 * The MIT License
 *
 * Copyright (c) 2009 The Broad Institute
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 * THE SOFTWARE.
 */
package htsjdk.samtools;

import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

/**
 * A list of CigarElements, which describes how a read aligns with the reference.
 * E.g. the Cigar string 10M1D25M means
 * * match or mismatch for 10 bases
 * * deletion of 1 base
 * * match or mismatch for 25 bases
 *
 * c.f. https://samtools.github.io/hts-specs/SAMv1.pdf for complete CIGAR specification.
 */
public class Cigar implements Serializable, Iterable<CigarElement> {
    public static final long serialVersionUID = 1L;

    private final List<CigarElement> cigarElements = new ArrayList<CigarElement>();

    public Cigar() {
    }

    public Cigar(final List<CigarElement> cigarElements) {
        this.cigarElements.addAll(cigarElements);
    }

    public List<CigarElement> getCigarElements() {
        return Collections.unmodifiableList(cigarElements);
    }

    public CigarElement getCigarElement(final int i) {
        return cigarElements.get(i);
    }

    public void add(final CigarElement cigarElement) {
        cigarElements.add(cigarElement);
    }

    public int numCigarElements() {
        return cigarElements.size();
    }

    public boolean isEmpty() {
        return cigarElements.isEmpty();
    }

    /**
     * @return The number of reference bases that the read covers, excluding padding.
     */
    public int getReferenceLength() {
        int length = 0;
        for (final CigarElement element : cigarElements) {
            switch (element.getOperator()) {
                case M:
                case D:
                case N:
                case EQ:
                case X:
                    length += element.getLength();
                    break;
                default: break;
            }
        }
        return length;
    }

    /**
     * @return The number of reference bases that the read covers, including padding.
     */
    public int getPaddedReferenceLength() {
        int length = 0;
        for (final CigarElement element : cigarElements) {
            switch (element.getOperator()) {
                case M:
                case D:
                case N:
                case EQ:
                case X:
                case P:
                    length += element.getLength();
                    break;
                default: break;
            }
        }
        return length;
    }

    /**
     * @return The number of read bases that the read covers.
     */
    public int getReadLength() {
        return getReadLength(cigarElements);
    }

    /**
     * @return The number of read bases that the read covers.
     */
    public static int getReadLength(final List<CigarElement> cigarElements) {
        int length = 0;
        for (final CigarElement element : cigarElements) {
            if (element.getOperator().consumesReadBases()) {
                length += element.getLength();
            }
        }
        return length;
    }

    /**
     * Exhaustive validation of CIGAR.
     * Note that this method deliberately returns null rather than Collections.emptyList() if there
     * are no validation errors, because callers tend to assume that if a non-null list is returned, it is modifiable.
     * @param readName For error reporting only. May be null if not known.
     * @param recordNumber For error reporting only. May be -1 if not known.
     * @return List