[med-svn] [Git][med-team/rna-star][upstream] New upstream version 2.7.11b+dfsg
Lance Lin (@linqigang)
gitlab at salsa.debian.org
Fri Feb 9 12:42:39 GMT 2024
Lance Lin pushed to branch upstream at Debian Med / rna-star
Commits:
bd08540b by Lance Lin at 2024-02-06T20:28:53+07:00
New upstream version 2.7.11b+dfsg
- - - - -
10 changed files:
- CHANGES.md
- README.md
- doc/STARmanual.pdf
- extras/doc-latex/STARmanual.tex
- extras/doc-latex/parametersDefault.tex
- extras/docker/Dockerfile
- source/Parameters.cpp
- source/Parameters.h
- source/VERSION
- source/parametersDefault
Changes:
=====================================
CHANGES.md
=====================================
@@ -1,3 +1,8 @@
+STAR 2.7.11b --- 2024/01/24 ::: Minor in one parameter.
+===========================================
+* Replaced --quantTranscriptomeBan parameter with --quantTranscriptomeSAMoutput with more explicit naming of options. The default behavior is not affected.
+* New option: --quantTranscriptomeSAMoutput BanSingleEnd_ExtendSoftclip : prohibit single-end alignments, extend softclips, allow indels.
+
STAR 2.7.11a --- 2023/08/15 ::: STARdiploid
===========================================
* Implemented STARdiploid option --genomeTransformType Diploid that generates personal diploid genome. At the mapping step, --genomeTransformOutput options will transform the alignments into reference genome coordinates.
=====================================
README.md
=====================================
@@ -1,7 +1,7 @@
-STAR 2.7.11a
+STAR 2.7.11b
==========
Spliced Transcripts Alignment to a Reference
-© Alexander Dobin, 2009-2022
+© Alexander Dobin, 2009-2024
https://www.ncbi.nlm.nih.gov/pubmed/23104886
AUTHOR/SUPPORT
@@ -37,9 +37,9 @@ Download the latest [release from](https://github.com/alexdobin/STAR/releases) a
```bash
# Get latest STAR source from releases
-wget https://github.com/alexdobin/STAR/archive/2.7.11a.tar.gz
-tar -xzf 2.7.11a.tar.gz
-cd STAR-2.7.11a
+wget https://github.com/alexdobin/STAR/archive/2.7.11b.tar.gz
+tar -xzf 2.7.11b.tar.gz
+cd STAR-2.7.11b
# Alternatively, get STAR source using git
git clone https://github.com/alexdobin/STAR.git
=====================================
doc/STARmanual.pdf
=====================================
Binary files a/doc/STARmanual.pdf and b/doc/STARmanual.pdf differ
=====================================
extras/doc-latex/STARmanual.tex
=====================================
@@ -35,7 +35,7 @@
\newcommand{\sechyperref}[1]{\hyperref[#1]{Section \ref{#1}. \nameref{#1}}}
-\title{STAR manual 2.7.11a}
+\title{STAR manual 2.7.11b}
\author{Alexander Dobin\\
dobin at cshl.edu}
\maketitle
=====================================
extras/doc-latex/parametersDefault.tex
=====================================
@@ -66,7 +66,7 @@
\optLine{uint(s){\textgreater}0: genome files exact sizes in bytes. Typically, this should not be defined by the user.}
\optName{genomeTransformOutput}
\optValue{None}
- \optLine{string(s) which output to transform back to original genome}
+ \optLine{string(s): which output to transform back to original genome}
\begin{optOptTable}
\optOpt{SAM} \optOptLine{SAM/BAM alignments}
\optOpt{SJ} \optOptLine{splice junctions (SJ.out.tab)}
@@ -75,7 +75,7 @@
\end{optOptTable}
\optName{genomeChrSetMitochondrial}
\optValue{chrM M MT}
- \optLine{string(s) names of the mitochondrial chromosomes. Presently only used for STARsolo statisics output/}
+ \optLine{string(s): names of the mitochondrial chromosomes. Presently only used for STARsolo statistics output/}
\end{optTable}
\optSection{Genome Indexing Parameters - only used with --runMode genomeGenerate}\label{Genome_Indexing_Parameters_-_only_used_with_--runMode_genomeGenerate}
\begin{optTable}
@@ -107,7 +107,7 @@
\begin{optTable}
\optName{sjdbFileChrStartEnd}
\optValue{-}
- \optLine{string(s): path to the files with genomic coordinates (chr {\textless}tab{\textgreater} start {\textless}tab{\textgreater} end {\textless}tab{\textgreater} strand) for the splice junction introns. Multiple files can be supplied wand will be concatenated.}
+ \optLine{string(s): path to the files with genomic coordinates (chr {\textless}tab{\textgreater} start {\textless}tab{\textgreater} end {\textless}tab{\textgreater} strand) for the splice junction introns. Multiple files can be supplied and will be concatenated.}
\optName{sjdbGTFfile}
\optValue{-}
\optLine{string: path to the GTF file with annotations}
@@ -125,10 +125,10 @@
\optLine{string: GTF attribute name for parent gene ID (default "gene{\textunderscore}id" works for GTF files)}
\optName{sjdbGTFtagExonParentGeneName}
\optValue{gene{\textunderscore}name}
- \optLine{string(s): GTF attrbute name for parent gene name}
+ \optLine{string(s): GTF attribute name for parent gene name}
\optName{sjdbGTFtagExonParentGeneType}
\optValue{gene{\textunderscore}type gene{\textunderscore}biotype}
- \optLine{string(s): GTF attrbute name for parent gene type}
+ \optLine{string(s): GTF attribute name for parent gene type}
\optName{sjdbOverhang}
\optValue{100}
\optLine{int{\textgreater}0: length of the donor/acceptor sequence on each side of the junctions, ideally = (mate{\textunderscore}length - 1)}
@@ -240,7 +240,7 @@
\optLine{int{\textgreater}0: maximum available RAM (bytes) for genome generation}
\optName{limitIObufferSize}
\optValue{30000000 50000000}
- \optLine{int{\textgreater}0: max available buffers size (bytes) for input/output, per thread}
+ \optLine{int(s){\textgreater}0: max available buffers size (bytes) for input/output, per thread}
\optName{limitOutSAMoneReadBytes}
\optValue{100000}
\optLine{int{\textgreater}0: max size of the SAM record (bytes) for one read. Recommended value: {\textgreater}(2*(LengthMate1+LengthMate2+100)*outFilterMultimapNmax}
@@ -255,7 +255,7 @@
\optLine{int{\textgreater}=0: maximum available RAM (bytes) for sorting BAM. If =0, it will be set to the genome index size. 0 value can only be used with --genomeLoad NoSharedMemory option.}
\optName{limitSjdbInsertNsj}
\optValue{1000000}
- \optLine{int{\textgreater}=0: maximum number of junction to be inserted to the genome on the fly at the mapping stage, including those from annotations and those detected in the 1st step of the 2-pass run}
+ \optLine{int{\textgreater}=0: maximum number of junctions to be inserted to the genome on the fly at the mapping stage, including those from annotations and those detected in the 1st step of the 2-pass run}
\optName{limitNreadsSoft}
\optValue{-1}
\optLine{int: soft limit on the number of reads}
@@ -268,14 +268,16 @@
\optName{outTmpDir}
\optValue{-}
\optLine{string: path to a directory that will be used as temporary by STAR. All contents of this directory will be removed!}
- \optLine{- the temp directory will default to outFileNamePrefix{\textunderscore}STARtmp}
+\begin{optOptTable}
+ \optOpt{-} \optOptLine{the temp directory will default to outFileNamePrefix{\textunderscore}STARtmp}
+\end{optOptTable}
\optName{outTmpKeep}
\optValue{None}
- \optLine{string: whether to keep the tempporary files after STAR runs is finished}
+ \optLine{string: whether to keep the temporary files after STAR runs is finished}
\begin{optOptTable}
\optOpt{None} \optOptLine{remove all temporary files}
+ \optOpt{All} \optOptLine{keep all files}
\end{optOptTable}
- \optLine{All .. keep all files}
\optName{outStd}
\optValue{Log}
\optLine{string: which output will be directed to stdout (standard out)}
@@ -337,7 +339,7 @@
\end{optOptTable}
\optName{outSAMattributes}
\optValue{Standard}
- \optLine{string: a string of desired SAM attributes, in the order desired for the output SAM. Tags can be listed in any combination/order.}
+ \optLine{string(s): a string of desired SAM attributes, in the order desired for the output SAM. Tags can be listed in any combination/order.}
\optLine{***Presets:}
\begin{optOptTable}
\optOpt{None} \optOptLine{no attributes}
@@ -468,7 +470,7 @@
\optLine{int: {\textgreater}=0: number of threads for BAM sorting. 0 will default to min(6,--runThreadN).}
\optName{outBAMsortingBinsN}
\optValue{50}
- \optLine{int: {\textgreater}0: number of genome bins fo coordinate-sorting}
+ \optLine{int: {\textgreater}0: number of genome bins for coordinate-sorting}
\end{optTable}
\optSection{BAM processing}\label{BAM_processing}
\begin{optTable}
@@ -548,7 +550,7 @@
\optLine{int: alignment will be output only if its score is higher than or equal to this value.}
\optName{outFilterScoreMinOverLread}
\optValue{0.66}
- \optLine{real: same as outFilterScoreMin, but normalized to read length (sum of mates' lengths for paired-end reads)}
+ \optLine{real: same as outFilterScoreMin, but normalized to read length (sum of mates' lengths for paired-end reads)}
\optName{outFilterMatchNmin}
\optValue{0}
\optLine{int: alignment will be output only if the number of matched bases is higher than or equal to this value.}
@@ -624,28 +626,28 @@
\optLine{int: non-canonical junction penalty (in addition to scoreGap)}
\optName{scoreGapGCAG}
\optValue{-4}
- \optLine{GC/AG and CT/GC junction penalty (in addition to scoreGap)}
+ \optLine{int: GC/AG and CT/GC junction penalty (in addition to scoreGap)}
\optName{scoreGapATAC}
\optValue{-8}
- \optLine{AT/AC and GT/AT junction penalty (in addition to scoreGap)}
+ \optLine{int: AT/AC and GT/AT junction penalty (in addition to scoreGap)}
\optName{scoreGenomicLengthLog2scale}
\optValue{-0.25}
- \optLine{extra score logarithmically scaled with genomic length of the alignment: scoreGenomicLengthLog2scale*log2(genomicLength)}
+ \optLine{int: extra score logarithmically scaled with genomic length of the alignment: scoreGenomicLengthLog2scale*log2(genomicLength)}
\optName{scoreDelOpen}
\optValue{-2}
- \optLine{deletion open penalty}
+ \optLine{int: deletion open penalty}
\optName{scoreDelBase}
\optValue{-2}
- \optLine{deletion extension penalty per base (in addition to scoreDelOpen)}
+ \optLine{int: deletion extension penalty per base (in addition to scoreDelOpen)}
\optName{scoreInsOpen}
\optValue{-2}
- \optLine{insertion open penalty}
+ \optLine{int: insertion open penalty}
\optName{scoreInsBase}
\optValue{-2}
- \optLine{insertion extension penalty per base (in addition to scoreInsOpen)}
+ \optLine{int: insertion extension penalty per base (in addition to scoreInsOpen)}
\optName{scoreStitchSJshift}
\optValue{1}
- \optLine{maximum score reduction while searching for SJ boundaries in the stitching step}
+ \optLine{int: maximum score reduction while searching for SJ boundaries in the stitching step}
\end{optTable}
\optSection{Alignments and Seeding}\label{Alignments_and_Seeding}
\begin{optTable}
@@ -678,13 +680,13 @@
\optLine{int{\textgreater}0: min length of seeds to be mapped}
\optName{alignIntronMin}
\optValue{21}
- \optLine{minimum intron size: genomic gap is considered intron if its length{\textgreater}=alignIntronMin, otherwise it is considered Deletion}
+ \optLine{int: minimum intron size, genomic gap is considered intron if its length{\textgreater}=alignIntronMin, otherwise it is considered Deletion}
\optName{alignIntronMax}
\optValue{0}
- \optLine{maximum intron size, if 0, max intron size will be determined by (2\^{}winBinNbits)*winAnchorDistNbins}
+ \optLine{int: maximum intron size, if 0, max intron size will be determined by (2\^{}winBinNbits)*winAnchorDistNbins}
\optName{alignMatesGapMax}
\optValue{0}
- \optLine{maximum gap between two mates, if 0, max intron gap will be determined by (2\^{}winBinNbits)*winAnchorDistNbins}
+ \optLine{int: maximum gap between two mates, if 0, max intron gap will be determined by (2\^{}winBinNbits)*winAnchorDistNbins}
\optName{alignSJoverhangMin}
\optValue{5}
\optLine{int{\textgreater}0: minimum overhang (i.e. block size) for spliced alignments}
@@ -747,7 +749,7 @@
\begin{optTable}
\optName{peOverlapNbasesMin}
\optValue{0}
- \optLine{int{\textgreater}=0: minimum number of overlap bases to trigger mates merging and realignment. Specify {\textgreater}0 value to switch on the "merginf of overlapping mates" algorithm.}
+ \optLine{int{\textgreater}=0: minimum number of overlapping bases to trigger mates merging and realignment. Specify {\textgreater}0 value to switch on the "merginf of overlapping mates" algorithm.}
\optName{peOverlapMMp}
\optValue{0.01}
\optLine{real, {\textgreater}=0 {\&} {\textless}1: maximum proportion of mismatched bases in the overlap area}
@@ -847,7 +849,7 @@
\optOpt{GeneCounts} \optOptLine{count reads per gene}
\end{optOptTable}
\optName{quantTranscriptomeBAMcompression}
- \optValue{1 1}
+ \optValue{1}
\optLine{int: -2 to 10 transcriptome BAM compression level}
\begin{optOptTable}
\optOpt{-2} \optOptLine{no BAM output}
@@ -855,12 +857,13 @@
\optOpt{0} \optOptLine{no compression}
\optOpt{10} \optOptLine{maximum compression}
\end{optOptTable}
-\optName{quantTranscriptomeBan}
- \optValue{IndelSoftclipSingleend}
- \optLine{string: prohibit various alignment type}
+\optName{quantTranscriptomeSAMoutput}
+ \optValue{BanSingleEnd{\textunderscore}BanIndels{\textunderscore}ExtendSoftclip}
+ \optLine{string: alignment filtering for TranscriptomeSAM output}
\begin{optOptTable}
- \optOpt{IndelSoftclipSingleend} \optOptLine{prohibit indels, soft clipping and single-end alignments - compatible with RSEM}
- \optOpt{Singleend} \optOptLine{prohibit single-end alignments}
+ \optOpt{BanSingleEnd{\textunderscore}BanIndels{\textunderscore}ExtendSoftclip} \optOptLine{prohibit indels and single-end alignments, extend softclips - compatible with RSEM}
+ \optOpt{BanSingleEnd} \optOptLine{prohibit single-end alignments, allow indels and softclips}
+ \optOpt{BanSingleEnd{\textunderscore}ExtendSoftclip} \optOptLine{prohibit single-end alignments, extend softclips, allow indels}
\end{optOptTable}
\end{optTable}
\optSection{2-pass Mapping}\label{2-pass_Mapping}
@@ -936,7 +939,7 @@
\end{optOptTable}
\optName{soloCBposition}
\optValue{-}
- \optLine{strings(s) position of Cell Barcode(s) on the barcode read.}
+ \optLine{strings(s): position of Cell Barcode(s) on the barcode read.}
\optLine{Presently only works with --soloType CB{\textunderscore}UMI{\textunderscore}Complex, and barcodes are assumed to be on Read2.}
\optLine{Format for each barcode: startAnchor{\textunderscore}startPosition{\textunderscore}endAnchor{\textunderscore}endPosition}
\optLine{start(end)Anchor defines the Anchor Base for the CB: 0: read start; 1: read end; 2: adapter start; 3: adapter end}
@@ -946,7 +949,7 @@
\optLine{--soloCBposition 0{\textunderscore}0{\textunderscore}2{\textunderscore}-1 3{\textunderscore}1{\textunderscore}3{\textunderscore}8}
\optName{soloUMIposition}
\optValue{-}
- \optLine{string position of the UMI on the barcode read, same as soloCBposition}
+ \optLine{string: position of the UMI on the barcode read, same as soloCBposition}
\optLine{Example: inDrop (Zilionis et al, Nat. Protocols, 2017):}
\optLine{--soloCBposition 3{\textunderscore}9{\textunderscore}3{\textunderscore}14}
\optName{soloAdapterSequence}
@@ -1020,7 +1023,7 @@
\end{optOptTable}
\optName{soloUMIfiltering}
\optValue{-}
- \optLine{string(s) type of UMI filtering (for reads uniquely mapping to genes)}
+ \optLine{string(s): type of UMI filtering (for reads uniquely mapping to genes)}
\begin{optOptTable}
\optOpt{-} \optOptLine{basic filtering: remove UMIs with N and homopolymers (similar to CellRanger 2.2.0).}
\optOpt{MultiGeneUMI} \optOptLine{basic + remove lower-count UMIs that map to more than one gene.}
@@ -1030,7 +1033,7 @@
\optLine{Only works with --soloUMIdedup 1MM{\textunderscore}CR}
\optName{soloOutFileNames}
\optValue{Solo.out/ features.tsv barcodes.tsv matrix.mtx}
- \optLine{string(s) file names for STARsolo output:}
+ \optLine{string(s): file names for STARsolo output:}
\optLine{file{\textunderscore}name{\textunderscore}prefix gene{\textunderscore}names barcode{\textunderscore}sequences cell{\textunderscore}feature{\textunderscore}count{\textunderscore}matrix}
\optName{soloCellFilter}
\optValue{CellRanger2.2 3000 0.99 10}
=====================================
extras/docker/Dockerfile
=====================================
@@ -2,7 +2,7 @@ FROM debian:stable-slim
MAINTAINER dobin at cshl.edu
-ARG STAR_VERSION=2.7.11a
+ARG STAR_VERSION=2.7.11b
ENV PACKAGES gcc g++ make wget zlib1g-dev unzip
=====================================
source/Parameters.cpp
=====================================
@@ -261,7 +261,7 @@ Parameters::Parameters() {//initalize parameters info
//quant
parArray.push_back(new ParameterInfoVector <string> (-1, -1, "quantMode", &quant.mode));
parArray.push_back(new ParameterInfoScalar <int> (-1, -1, "quantTranscriptomeBAMcompression", &quant.trSAM.bamCompression));
- parArray.push_back(new ParameterInfoScalar <string> (-1, -1, "quantTranscriptomeBan", &quant.trSAM.ban));
+ parArray.push_back(new ParameterInfoScalar <string> (-1, -1, "quantTranscriptomeSAMoutput", &quant.trSAM.output));
//2-pass
parArray.push_back(new ParameterInfoScalar <uint> (-1, -1, "twopass1readsN", &twoPass.pass1readsN));
@@ -912,14 +912,18 @@ void Parameters::inputParameters (int argInN, char* argIn[]) {//input parameters
};
inOut->outQuantBAMfile=bgzf_open(outQuantBAMfileName.c_str(),("w"+to_string((long long) quant.trSAM.bamCompression)).c_str());
};
- if (quant.trSAM.ban=="IndelSoftclipSingleend") {
+ if (quant.trSAM.output=="BanSingleEnd_BanIndels_ExtendSoftclip") {
quant.trSAM.indel=false;
quant.trSAM.softClip=false;
quant.trSAM.singleEnd=false;
- } else if (quant.trSAM.ban=="Singleend") {
+ } else if (quant.trSAM.output=="BanSingleEnd") {
quant.trSAM.indel=true;
quant.trSAM.softClip=true;
quant.trSAM.singleEnd=false;
+ } else if (quant.trSAM.output=="BanSingleEnd_ExtendSoftclip") {
+ quant.trSAM.indel=true;
+ quant.trSAM.softClip=false;
+ quant.trSAM.singleEnd=false;
};
} else if (quant.mode.at(ii)=="GeneCounts") {
quant.geCount.yes=true;
=====================================
source/Parameters.h
=====================================
@@ -301,7 +301,7 @@ class Parameters {
bool softClip;
bool singleEnd;
int bamCompression;
- string ban;
+ string output;
} trSAM;
struct {
=====================================
source/VERSION
=====================================
@@ -1 +1 @@
-#define STAR_VERSION "2.7.11a"
+#define STAR_VERSION "2.7.11b"
=====================================
source/parametersDefault
=====================================
@@ -55,14 +55,14 @@ genomeFileSizes 0
uint(s)>0: genome files exact sizes in bytes. Typically, this should not be defined by the user.
genomeTransformOutput None
- string(s) which output to transform back to original genome
+ string(s): which output to transform back to original genome
SAM ... SAM/BAM alignments
SJ ... splice junctions (SJ.out.tab)
Quant ... quantifications (from --quantMode option)
None ... no transformation of the output
genomeChrSetMitochondrial chrM M MT
- string(s) names of the mitochondrial chromosomes. Presently only used for STARsolo statisics output/
+ string(s): names of the mitochondrial chromosomes. Presently only used for STARsolo statistics output/
### Genome Indexing Parameters - only used with --runMode genomeGenerate
genomeChrBinNbits 18
@@ -105,7 +105,7 @@ genomeType Full
### Splice Junctions Database
sjdbFileChrStartEnd -
- string(s): path to the files with genomic coordinates (chr <tab> start <tab> end <tab> strand) for the splice junction introns. Multiple files can be supplied wand will be concatenated.
+ string(s): path to the files with genomic coordinates (chr <tab> start <tab> end <tab> strand) for the splice junction introns. Multiple files can be supplied and will be concatenated.
sjdbGTFfile -
string: path to the GTF file with annotations
@@ -123,10 +123,10 @@ sjdbGTFtagExonParentGene gene_id
string: GTF attribute name for parent gene ID (default "gene_id" works for GTF files)
sjdbGTFtagExonParentGeneName gene_name
- string(s): GTF attrbute name for parent gene name
+ string(s): GTF attribute name for parent gene name
sjdbGTFtagExonParentGeneType gene_type gene_biotype
- string(s): GTF attrbute name for parent gene type
+ string(s): GTF attribute name for parent gene type
sjdbOverhang 100
int>0: length of the donor/acceptor sequence on each side of the junctions, ideally = (mate_length - 1)
@@ -230,7 +230,7 @@ limitGenomeGenerateRAM 31000000000
int>0: maximum available RAM (bytes) for genome generation
limitIObufferSize 30000000 50000000
- int>0: max available buffers size (bytes) for input/output, per thread
+ int(s)>0: max available buffers size (bytes) for input/output, per thread
limitOutSAMoneReadBytes 100000
int>0: max size of the SAM record (bytes) for one read. Recommended value: >(2*(LengthMate1+LengthMate2+100)*outFilterMultimapNmax
@@ -245,7 +245,7 @@ limitBAMsortRAM 0
int>=0: maximum available RAM (bytes) for sorting BAM. If =0, it will be set to the genome index size. 0 value can only be used with --genomeLoad NoSharedMemory option.
limitSjdbInsertNsj 1000000
- int>=0: maximum number of junction to be inserted to the genome on the fly at the mapping stage, including those from annotations and those detected in the 1st step of the 2-pass run
+ int>=0: maximum number of junctions to be inserted to the genome on the fly at the mapping stage, including those from annotations and those detected in the 1st step of the 2-pass run
limitNreadsSoft -1
int: soft limit on the number of reads
@@ -256,12 +256,12 @@ outFileNamePrefix ./
outTmpDir -
string: path to a directory that will be used as temporary by STAR. All contents of this directory will be removed!
- - the temp directory will default to outFileNamePrefix_STARtmp
+ - ... the temp directory will default to outFileNamePrefix_STARtmp
outTmpKeep None
- string: whether to keep the tempporary files after STAR runs is finished
+ string: whether to keep the temporary files after STAR runs is finished
None ... remove all temporary files
- All .. keep all files
+ All ... keep all files
outStd Log
string: which output will be directed to stdout (standard out)
@@ -307,7 +307,7 @@ outSAMstrandField None
intronMotif ... strand derived from the intron motif. This option changes the output alignments: reads with inconsistent and/or non-canonical introns are filtered out.
outSAMattributes Standard
- string: a string of desired SAM attributes, in the order desired for the output SAM. Tags can be listed in any combination/order.
+ string(s): a string of desired SAM attributes, in the order desired for the output SAM. Tags can be listed in any combination/order.
***Presets:
None ... no attributes
Standard ... NH HI AS nM
@@ -415,7 +415,7 @@ outBAMsortingThreadN 0
int: >=0: number of threads for BAM sorting. 0 will default to min(6,--runThreadN).
outBAMsortingBinsN 50
- int: >0: number of genome bins fo coordinate-sorting
+ int: >0: number of genome bins for coordinate-sorting
### BAM processing
bamRemoveDuplicatesType -
@@ -478,7 +478,7 @@ outFilterScoreMin 0
int: alignment will be output only if its score is higher than or equal to this value.
outFilterScoreMinOverLread 0.66
- real: same as outFilterScoreMin, but normalized to read length (sum of mates' lengths for paired-end reads)
+ real: same as outFilterScoreMin, but normalized to read length (sum of mates' lengths for paired-end reads)
outFilterMatchNmin 0
int: alignment will be output only if the number of matched bases is higher than or equal to this value.
@@ -540,28 +540,28 @@ scoreGapNoncan -8
int: non-canonical junction penalty (in addition to scoreGap)
scoreGapGCAG -4
- GC/AG and CT/GC junction penalty (in addition to scoreGap)
+ int: GC/AG and CT/GC junction penalty (in addition to scoreGap)
scoreGapATAC -8
- AT/AC and GT/AT junction penalty (in addition to scoreGap)
+ int: AT/AC and GT/AT junction penalty (in addition to scoreGap)
scoreGenomicLengthLog2scale -0.25
- extra score logarithmically scaled with genomic length of the alignment: scoreGenomicLengthLog2scale*log2(genomicLength)
+ int: extra score logarithmically scaled with genomic length of the alignment: scoreGenomicLengthLog2scale*log2(genomicLength)
scoreDelOpen -2
- deletion open penalty
+ int: deletion open penalty
scoreDelBase -2
- deletion extension penalty per base (in addition to scoreDelOpen)
+ int: deletion extension penalty per base (in addition to scoreDelOpen)
scoreInsOpen -2
- insertion open penalty
+ int: insertion open penalty
scoreInsBase -2
- insertion extension penalty per base (in addition to scoreInsOpen)
+ int: insertion extension penalty per base (in addition to scoreInsOpen)
scoreStitchSJshift 1
- maximum score reduction while searching for SJ boundaries in the stitching step
+ int: maximum score reduction while searching for SJ boundaries in the stitching step
### Alignments and Seeding
@@ -594,13 +594,13 @@ seedMapMin 5
int>0: min length of seeds to be mapped
alignIntronMin 21
- minimum intron size: genomic gap is considered intron if its length>=alignIntronMin, otherwise it is considered Deletion
+ int: minimum intron size, genomic gap is considered intron if its length>=alignIntronMin, otherwise it is considered Deletion
alignIntronMax 0
- maximum intron size, if 0, max intron size will be determined by (2^winBinNbits)*winAnchorDistNbins
+ int: maximum intron size, if 0, max intron size will be determined by (2^winBinNbits)*winAnchorDistNbins
alignMatesGapMax 0
- maximum gap between two mates, if 0, max intron gap will be determined by (2^winBinNbits)*winAnchorDistNbins
+ int: maximum gap between two mates, if 0, max intron gap will be determined by (2^winBinNbits)*winAnchorDistNbins
alignSJoverhangMin 5
int>0: minimum overhang (i.e. block size) for spliced alignments
@@ -653,7 +653,7 @@ alignInsertionFlush None
### Paired-End reads
peOverlapNbasesMin 0
- int>=0: minimum number of overlap bases to trigger mates merging and realignment. Specify >0 value to switch on the "merginf of overlapping mates" algorithm.
+ int>=0: minimum number of overlapping bases to trigger mates merging and realignment. Specify >0 value to switch on the "merginf of overlapping mates" algorithm.
peOverlapMMp 0.01
real, >=0 & <1: maximum proportion of mismatched bases in the overlap area
@@ -738,17 +738,19 @@ quantMode -
TranscriptomeSAM ... output SAM/BAM alignments to transcriptome into a separate file
GeneCounts ... count reads per gene
-quantTranscriptomeBAMcompression 1 1
+quantTranscriptomeBAMcompression 1
int: -2 to 10 transcriptome BAM compression level
-2 ... no BAM output
-1 ... default compression (6?)
0 ... no compression
10 ... maximum compression
-quantTranscriptomeBan IndelSoftclipSingleend
- string: prohibit various alignment type
- IndelSoftclipSingleend ... prohibit indels, soft clipping and single-end alignments - compatible with RSEM
- Singleend ... prohibit single-end alignments
+quantTranscriptomeSAMoutput BanSingleEnd_BanIndels_ExtendSoftclip
+ string: alignment filtering for TranscriptomeSAM output
+ BanSingleEnd_BanIndels_ExtendSoftclip ... prohibit indels and single-end alignments, extend softclips - compatible with RSEM
+ BanSingleEnd ... prohibit single-end alignments, allow indels and softclips
+ BanSingleEnd_ExtendSoftclip ... prohibit single-end alignments, extend softclips, allow indels
+
### 2-pass Mapping
twopassMode None
@@ -799,14 +801,14 @@ soloBarcodeReadLength 1
1 ... equal to sum of soloCBlen+soloUMIlen
0 ... not defined, do not check
-soloBarcodeMate 0
+soloBarcodeMate 0
int: identifies which read mate contains the barcode (CB+UMI) sequence
0 ... barcode sequence is on separate read, which should always be the last file in the --readFilesIn listed
1 ... barcode sequence is a part of mate 1
2 ... barcode sequence is a part of mate 2
soloCBposition -
- strings(s) position of Cell Barcode(s) on the barcode read.
+ strings(s): position of Cell Barcode(s) on the barcode read.
Presently only works with --soloType CB_UMI_Complex, and barcodes are assumed to be on Read2.
Format for each barcode: startAnchor_startPosition_endAnchor_endPosition
start(end)Anchor defines the Anchor Base for the CB: 0: read start; 1: read end; 2: adapter start; 3: adapter end
@@ -816,7 +818,7 @@ soloCBposition -
--soloCBposition 0_0_2_-1 3_1_3_8
soloUMIposition -
- string position of the UMI on the barcode read, same as soloCBposition
+ string: position of the UMI on the barcode read, same as soloCBposition
Example: inDrop (Zilionis et al, Nat. Protocols, 2017):
--soloCBposition 3_9_3_14
@@ -882,7 +884,7 @@ soloUMIdedup 1MM_All
1MM_CR ... CellRanger2-4 algorithm for 1MM UMI collapsing.
soloUMIfiltering -
- string(s) type of UMI filtering (for reads uniquely mapping to genes)
+ string(s): type of UMI filtering (for reads uniquely mapping to genes)
- ... basic filtering: remove UMIs with N and homopolymers (similar to CellRanger 2.2.0).
MultiGeneUMI ... basic + remove lower-count UMIs that map to more than one gene.
MultiGeneUMI_All ... basic + remove all UMIs that map to more than one gene.
@@ -890,7 +892,7 @@ soloUMIfiltering -
Only works with --soloUMIdedup 1MM_CR
soloOutFileNames Solo.out/ features.tsv barcodes.tsv matrix.mtx
- string(s) file names for STARsolo output:
+ string(s): file names for STARsolo output:
file_name_prefix gene_names barcode_sequences cell_feature_count_matrix
soloCellFilter CellRanger2.2 3000 0.99 10
View it on GitLab: https://salsa.debian.org/med-team/rna-star/-/commit/bd08540b37b705089082631a72c1ac0344883454
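
Note on the source/Parameters.cpp hunk above: each value of the renamed
--quantTranscriptomeSAMoutput option selects one combination of the three
quant.trSAM booleans. The sketch below restates that mapping as a standalone
function. It is a hedged illustration only: the struct TrSamFlags and the helper
trSamFlagsFor are hypothetical names that do not exist in STAR, where the fields
are assigned inline. Returning false for an unrecognized value is a choice made
for this sketch; the diff does not show how STAR treats other values.

```cpp
#include <string>

// Hypothetical stand-in for the three booleans of quant.trSAM in Parameters.h.
struct TrSamFlags {
    bool indel;     // true: alignments with indels are allowed
    bool softClip;  // true: soft clips are allowed; false: soft clips are extended
    bool singleEnd; // true: single-end alignments are allowed
};

// Hypothetical helper (not part of STAR): fills `flags` with the values the
// diff assigns for each option string. Returns false, leaving `flags`
// untouched, when the string is not one of the three documented values
// (an assumption of this sketch).
bool trSamFlagsFor(const std::string &output, TrSamFlags &flags) {
    if (output == "BanSingleEnd_BanIndels_ExtendSoftclip") {
        flags = TrSamFlags{false, false, false}; // default, RSEM-compatible
        return true;
    }
    if (output == "BanSingleEnd") {
        flags = TrSamFlags{true, true, false};
        return true;
    }
    if (output == "BanSingleEnd_ExtendSoftclip") {
        flags = TrSamFlags{true, false, false}; // new in 2.7.11b
        return true;
    }
    return false;
}
```

For example, trSamFlagsFor("BanSingleEnd_ExtendSoftclip", flags) sets indel to
true and the other two fields to false, matching the branch added in 2.7.11b.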