[med-svn] [Git][med-team/rna-star][master] 6 commits: routine-update: New upstream version

Lance Lin (@linqigang) gitlab at salsa.debian.org
Fri Feb 9 12:42:22 GMT 2024



Lance Lin pushed to branch master at Debian Med / rna-star


Commits:
b44f86bf by Lance Lin at 2024-02-06T20:28:46+07:00
routine-update: New upstream version

- - - - -
bd08540b by Lance Lin at 2024-02-06T20:28:53+07:00
New upstream version 2.7.11b+dfsg
- - - - -
6563d9ba by Lance Lin at 2024-02-06T20:28:54+07:00
Update upstream source from tag 'upstream/2.7.11b+dfsg'

Update to upstream version '2.7.11b+dfsg'
with Debian dir 4aa992f92c187d78b4d004f73e4ad75f3694c4f6
- - - - -
86c1f61a by Lance Lin at 2024-02-08T19:53:13+07:00
d/rules: Remove override_dh_auto_clean

- - - - -
27eab422 by Lance Lin at 2024-02-09T19:39:43+07:00
d/clean: Remove generated files (Closes: #1045634)

- - - - -
49493251 by Lance Lin at 2024-02-09T19:40:33+07:00
d/patches: Add patch to fix typos

- - - - -


15 changed files:

- CHANGES.md
- README.md
- debian/changelog
- debian/clean
- + debian/patches/fix_typos.patch
- debian/patches/series
- debian/rules
- doc/STARmanual.pdf
- extras/doc-latex/STARmanual.tex
- extras/doc-latex/parametersDefault.tex
- extras/docker/Dockerfile
- source/Parameters.cpp
- source/Parameters.h
- source/VERSION
- source/parametersDefault


Changes:

=====================================
CHANGES.md
=====================================
@@ -1,3 +1,8 @@
+STAR 2.7.11b --- 2024/01/24 ::: Minor in one parameter.
+===========================================
+* Replaced --quantTranscriptomeBan parameter with --quantTranscriptomeSAMoutput with more explicit naming of options. The default behavior is not affected.
+* New option: --quantTranscriptomeSAMoutput BanSingleEnd_ExtendSoftclip : prohibit single-end alignments, extend softclips, allow indels.
+
 STAR 2.7.11a --- 2023/08/15 ::: STARdiploid
 ===========================================
 * Implemented STARdiploid option --genomeTransformType Diploid that generates personal diploid genome. At the mapping step, --genomeTransformOutput options will transform the alignments into reference genome coordinates.


=====================================
README.md
=====================================
@@ -1,7 +1,7 @@
-STAR 2.7.11a
+STAR 2.7.11b
 ==========
 Spliced Transcripts Alignment to a Reference
-© Alexander Dobin, 2009-2022
+© Alexander Dobin, 2009-2024
 https://www.ncbi.nlm.nih.gov/pubmed/23104886
 
 AUTHOR/SUPPORT
@@ -37,9 +37,9 @@ Download the latest [release from](https://github.com/alexdobin/STAR/releases) a
 
 ```bash
 # Get latest STAR source from releases
-wget https://github.com/alexdobin/STAR/archive/2.7.11a.tar.gz
-tar -xzf 2.7.11a.tar.gz
-cd STAR-2.7.11a
+wget https://github.com/alexdobin/STAR/archive/2.7.11b.tar.gz
+tar -xzf 2.7.11b.tar.gz
+cd STAR-2.7.11b
 
 # Alternatively, get STAR source using git
 git clone https://github.com/alexdobin/STAR.git


=====================================
debian/changelog
=====================================
@@ -1,3 +1,13 @@
+rna-star (2.7.11b+dfsg-1) UNRELEASED; urgency=medium
+
+  * Team upload.
+  * New upstream version
+  * d/rules: Remove override_dh_auto_clean
+  * d/clean: Remove generated files (Closes: #1045634)
+  * d/patches: Add patch to fix typos
+
+ -- Lance Lin <lq27267 at gmail.com>  Tue, 06 Feb 2024 20:28:46 +0700
+
 rna-star (2.7.11a+dfsg-1) unstable; urgency=medium
 
   [ sangmeng ]


=====================================
debian/clean
=====================================
@@ -1,2 +1,4 @@
 source/parametersDefault.xxd
 opal/opal.o
+source/STAR-*
+source/STARlong-*


=====================================
debian/patches/fix_typos.patch
=====================================
@@ -0,0 +1,62 @@
+Description: Fix typos that extend into generated binaries
+Author: Lance Lin <lq27267 at gmail.com>
+Date: 08 Feb 2024
+
+--- a/docs/STARsolo.md
++++ b/docs/STARsolo.md
+@@ -448,10 +448,10 @@
+                             TopCells        ... only report top cells by UMI count, followed by the exact number of cells
+                             CellRanger2.2   ... simple filtering of CellRanger 2.2.
+                                                 Can be followed by numbers: number of expected cells, robust maximum percentile for UMI count, maximum to minimum ratio for UMI count
+-                                                The harcoded values are from CellRanger: nExpectedCells=3000;  maxPercentile=0.99;  maxMinRatio=10
++                                                The hardcoded values are from CellRanger: nExpectedCells=3000;  maxPercentile=0.99;  maxMinRatio=10
+                             EmptyDrops_CR   ... EmptyDrops filtering in CellRanger flavor. Please cite the original EmptyDrops paper: A.T.L Lun et al, Genome Biology, 20, 63 (2019): https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1662-y
+                                                 Can be followed by 10 numeric parameters:  nExpectedCells   maxPercentile   maxMinRatio   indMin   indMax   umiMin   umiMinFracMedian   candMaxN   FDR   simN
+-                                                The harcoded values are from CellRanger:             3000            0.99            10    45000    90000      500               0.01      20000  0.01  10000
++                                                The hardcoded values are from CellRanger:             3000            0.99            10    45000    90000      500               0.01      20000  0.01  10000
+ 
+ soloOutFormatFeaturesGeneField3	"Gene Expression"
+ 	string(s):				field 3 in the Gene features.tsv file. If "-", then no 3rd field is output.
+--- a/extras/doc-latex/parametersDefault.tex
++++ b/extras/doc-latex/parametersDefault.tex
+@@ -1044,12 +1044,12 @@
+   \optOpt{CellRanger2.2}   \optOptLine{simple filtering of CellRanger 2.2.}
+ \end{optOptTable}
+   \optLine{Can be followed by numbers: number of expected cells, robust maximum percentile for UMI count, maximum to minimum ratio for UMI count} 
+-  \optLine{The harcoded values are from CellRanger: nExpectedCells=3000;  maxPercentile=0.99;  maxMinRatio=10} 
++  \optLine{The hardcoded values are from CellRanger: nExpectedCells=3000;  maxPercentile=0.99;  maxMinRatio=10} 
+ \begin{optOptTable}
+   \optOpt{EmptyDrops{\textunderscore}CR}   \optOptLine{EmptyDrops filtering in CellRanger flavor. Please cite the original EmptyDrops paper: A.T.L Lun et al, Genome Biology, 20, 63 (2019): https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1662-y}
+ \end{optOptTable}
+   \optLine{Can be followed by 10 numeric parameters:  nExpectedCells   maxPercentile   maxMinRatio   indMin   indMax   umiMin   umiMinFracMedian   candMaxN   FDR   simN } 
+-  \optLine{The harcoded values are from CellRanger:             3000            0.99            10    45000    90000      500               0.01      20000  0.01  10000} 
++  \optLine{The hardcoded values are from CellRanger:             3000            0.99            10    45000    90000      500               0.01      20000  0.01  10000} 
+ \optName{soloOutFormatFeaturesGeneField3}
+   \optValue{"Gene Expression"}
+   \optLine{string(s):                field 3 in the Gene features.tsv file. If "-", then no 3rd field is output.} 
+--- a/source/parametersDefault
++++ b/source/parametersDefault
+@@ -901,10 +901,10 @@
+                             TopCells        ... only report top cells by UMI count, followed by the exact number of cells
+                             CellRanger2.2   ... simple filtering of CellRanger 2.2. 
+                                                 Can be followed by numbers: number of expected cells, robust maximum percentile for UMI count, maximum to minimum ratio for UMI count
+-                                                The harcoded values are from CellRanger: nExpectedCells=3000;  maxPercentile=0.99;  maxMinRatio=10
++                                                The hardcoded values are from CellRanger: nExpectedCells=3000;  maxPercentile=0.99;  maxMinRatio=10
+                             EmptyDrops_CR   ... EmptyDrops filtering in CellRanger flavor. Please cite the original EmptyDrops paper: A.T.L Lun et al, Genome Biology, 20, 63 (2019): https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1662-y
+                                                 Can be followed by 10 numeric parameters:  nExpectedCells   maxPercentile   maxMinRatio   indMin   indMax   umiMin   umiMinFracMedian   candMaxN   FDR   simN 
+-                                                The harcoded values are from CellRanger:             3000            0.99            10    45000    90000      500               0.01      20000  0.01  10000
++                                                The hardcoded values are from CellRanger:             3000            0.99            10    45000    90000      500               0.01      20000  0.01  10000
+ 
+ soloOutFormatFeaturesGeneField3    "Gene Expression"
+     string(s):                field 3 in the Gene features.tsv file. If "-", then no 3rd field is output.
+--- a/source/Parameters_openReadsFiles.cpp
++++ b/source/Parameters_openReadsFiles.cpp
+@@ -60,7 +60,7 @@
+ 					if (!rftry.good()){
+ 						exitWithError("EXITING: because of fatal INPUT file error: could not open read file: " + \
+ 									   readFilesNames[imate][ifile] + \
+-									   "\nSOLUTION: check that this file exists and has read permision.\n", \
++									   "\nSOLUTION: check that this file exists and has read permission.\n", \
+ 									   std::cerr, inOut->logMain, EXIT_CODE_PARAMETER, *this);
+ 					};
+ 					rftry.close();


=====================================
debian/patches/series
=====================================
@@ -5,3 +5,4 @@ mathRoutinesNotInScope.patch
 do-not-enforce-avx2.patch
 gcc-12.patch
 packaged_simde
+fix_typos.patch


=====================================
debian/rules
=====================================
@@ -41,9 +41,6 @@ else
 	dh_install source/STAR source/STARlong usr/bin/
 endif
 
-override_dh_auto_clean:
-	cd source && ${MAKE} clean
-
 override_dh_compress:
 	dh_compress --exclude=.pdf
 


=====================================
doc/STARmanual.pdf
=====================================
Binary files a/doc/STARmanual.pdf and b/doc/STARmanual.pdf differ


=====================================
extras/doc-latex/STARmanual.tex
=====================================
@@ -35,7 +35,7 @@
 
 \newcommand{\sechyperref}[1]{\hyperref[#1]{Section \ref{#1}. \nameref{#1}}}
 
-\title{STAR manual 2.7.11a}
+\title{STAR manual 2.7.11b}
 \author{Alexander Dobin\\
 dobin at cshl.edu}
 \maketitle


=====================================
extras/doc-latex/parametersDefault.tex
=====================================
@@ -66,7 +66,7 @@
   \optLine{uint(s){\textgreater}0: genome files exact sizes in bytes. Typically, this should not be defined by the user.} 
 \optName{genomeTransformOutput}
   \optValue{None}
-  \optLine{string(s)               which output to transform back to original genome} 
+  \optLine{string(s):              which output to transform back to original genome} 
 \begin{optOptTable}
   \optOpt{SAM}   \optOptLine{SAM/BAM alignments}
   \optOpt{SJ}   \optOptLine{splice junctions (SJ.out.tab)}
@@ -75,7 +75,7 @@
 \end{optOptTable}
 \optName{genomeChrSetMitochondrial}
   \optValue{chrM M MT}
-  \optLine{string(s)               names of the mitochondrial chromosomes. Presently only used for STARsolo statisics output/} 
+  \optLine{string(s):              names of the mitochondrial chromosomes. Presently only used for STARsolo statistics output/} 
 \end{optTable}
 \optSection{Genome Indexing Parameters - only used with --runMode genomeGenerate}\label{Genome_Indexing_Parameters_-_only_used_with_--runMode_genomeGenerate}
 \begin{optTable}
@@ -107,7 +107,7 @@
 \begin{optTable}
 \optName{sjdbFileChrStartEnd}
   \optValue{-}
-  \optLine{string(s): path to the files with genomic coordinates (chr {\textless}tab{\textgreater} start {\textless}tab{\textgreater} end {\textless}tab{\textgreater} strand) for the splice junction introns. Multiple files can be supplied wand will be concatenated.} 
+  \optLine{string(s): path to the files with genomic coordinates (chr {\textless}tab{\textgreater} start {\textless}tab{\textgreater} end {\textless}tab{\textgreater} strand) for the splice junction introns. Multiple files can be supplied and will be concatenated.} 
 \optName{sjdbGTFfile}
   \optValue{-}
   \optLine{string: path to the GTF file with annotations} 
@@ -125,10 +125,10 @@
   \optLine{string: GTF attribute name for parent gene ID (default "gene{\textunderscore}id" works for GTF files)} 
 \optName{sjdbGTFtagExonParentGeneName}
   \optValue{gene{\textunderscore}name}
-  \optLine{string(s): GTF attrbute name for parent gene name} 
+  \optLine{string(s): GTF attribute name for parent gene name} 
 \optName{sjdbGTFtagExonParentGeneType}
   \optValue{gene{\textunderscore}type gene{\textunderscore}biotype}
-  \optLine{string(s): GTF attrbute name for parent gene type} 
+  \optLine{string(s): GTF attribute name for parent gene type} 
 \optName{sjdbOverhang}
   \optValue{100}
   \optLine{int{\textgreater}0: length of the donor/acceptor sequence on each side of the junctions, ideally = (mate{\textunderscore}length - 1)} 
@@ -240,7 +240,7 @@
   \optLine{int{\textgreater}0: maximum available RAM (bytes) for genome generation} 
 \optName{limitIObufferSize}
   \optValue{30000000 50000000}
-  \optLine{int{\textgreater}0: max available buffers size (bytes) for input/output, per thread} 
+  \optLine{int(s){\textgreater}0: max available buffers size (bytes) for input/output, per thread} 
 \optName{limitOutSAMoneReadBytes}
   \optValue{100000}
   \optLine{int{\textgreater}0: max size of the SAM record (bytes) for one read. Recommended value: {\textgreater}(2*(LengthMate1+LengthMate2+100)*outFilterMultimapNmax} 
@@ -255,7 +255,7 @@
   \optLine{int{\textgreater}=0: maximum available RAM (bytes) for sorting BAM. If =0, it will be set to the genome index size. 0 value can only be used with --genomeLoad NoSharedMemory option.} 
 \optName{limitSjdbInsertNsj}
   \optValue{1000000}
-  \optLine{int{\textgreater}=0: maximum number of junction to be inserted to the genome on the fly at the mapping stage, including those from annotations and those detected in the 1st step of the 2-pass run} 
+  \optLine{int{\textgreater}=0: maximum number of junctions to be inserted to the genome on the fly at the mapping stage, including those from annotations and those detected in the 1st step of the 2-pass run} 
 \optName{limitNreadsSoft}
   \optValue{-1}
   \optLine{int: soft limit on the number of reads} 
@@ -268,14 +268,16 @@
 \optName{outTmpDir}
   \optValue{-}
   \optLine{string: path to a directory that will be used as temporary by STAR. All contents of this directory will be removed!} 
-  \optLine{- the temp directory will default to outFileNamePrefix{\textunderscore}STARtmp} 
+\begin{optOptTable}
+  \optOpt{-}   \optOptLine{the temp directory will default to outFileNamePrefix{\textunderscore}STARtmp}
+\end{optOptTable}
 \optName{outTmpKeep}
   \optValue{None}
-  \optLine{string: whether to keep the tempporary files after STAR runs is finished} 
+  \optLine{string: whether to keep the temporary files after STAR runs is finished} 
 \begin{optOptTable}
   \optOpt{None}   \optOptLine{remove all temporary files}
+  \optOpt{All}   \optOptLine{keep all files}
 \end{optOptTable}
-  \optLine{All .. keep all files} 
 \optName{outStd}
   \optValue{Log}
   \optLine{string: which output will be directed to stdout (standard out)} 
@@ -337,7 +339,7 @@
 \end{optOptTable}
 \optName{outSAMattributes}
   \optValue{Standard}
-  \optLine{string: a string of desired SAM attributes, in the order desired for the output SAM. Tags can be listed in any combination/order.} 
+  \optLine{string(s): a string of desired SAM attributes, in the order desired for the output SAM. Tags can be listed in any combination/order.} 
   \optLine{***Presets:} 
 \begin{optOptTable}
   \optOpt{None}   \optOptLine{no attributes}
@@ -468,7 +470,7 @@
   \optLine{int: {\textgreater}=0: number of threads for BAM sorting. 0 will default to min(6,--runThreadN).} 
 \optName{outBAMsortingBinsN}
   \optValue{50}
-  \optLine{int: {\textgreater}0:  number of genome bins fo coordinate-sorting} 
+  \optLine{int: {\textgreater}0:  number of genome bins for coordinate-sorting} 
 \end{optTable}
 \optSection{BAM processing}\label{BAM_processing}
 \begin{optTable}
@@ -548,7 +550,7 @@
   \optLine{int: alignment will be output only if its score is higher than or equal to this value.} 
 \optName{outFilterScoreMinOverLread}
   \optValue{0.66}
-  \optLine{real: same as outFilterScoreMin, but  normalized to read length (sum of mates' lengths for paired-end reads)} 
+  \optLine{real: same as outFilterScoreMin, but normalized to read length (sum of mates' lengths for paired-end reads)} 
 \optName{outFilterMatchNmin}
   \optValue{0}
   \optLine{int: alignment will be output only if the number of matched bases is higher than or equal to this value.} 
@@ -624,28 +626,28 @@
   \optLine{int: non-canonical junction penalty (in addition to scoreGap)} 
 \optName{scoreGapGCAG}
   \optValue{-4}
-  \optLine{GC/AG and CT/GC junction penalty (in addition to scoreGap)} 
+  \optLine{int: GC/AG and CT/GC junction penalty (in addition to scoreGap)} 
 \optName{scoreGapATAC}
   \optValue{-8}
-  \optLine{AT/AC  and GT/AT junction penalty  (in addition to scoreGap)} 
+  \optLine{int: AT/AC  and GT/AT junction penalty  (in addition to scoreGap)} 
 \optName{scoreGenomicLengthLog2scale}
   \optValue{-0.25}
-  \optLine{extra score logarithmically scaled with genomic length of the alignment: scoreGenomicLengthLog2scale*log2(genomicLength)} 
+  \optLine{int: extra score logarithmically scaled with genomic length of the alignment: scoreGenomicLengthLog2scale*log2(genomicLength)} 
 \optName{scoreDelOpen}
   \optValue{-2}
-  \optLine{deletion open penalty} 
+  \optLine{int: deletion open penalty} 
 \optName{scoreDelBase}
   \optValue{-2}
-  \optLine{deletion extension penalty per base (in addition to scoreDelOpen)} 
+  \optLine{int: deletion extension penalty per base (in addition to scoreDelOpen)} 
 \optName{scoreInsOpen}
   \optValue{-2}
-  \optLine{insertion open penalty} 
+  \optLine{int: insertion open penalty} 
 \optName{scoreInsBase}
   \optValue{-2}
-  \optLine{insertion extension penalty per base (in addition to scoreInsOpen)} 
+  \optLine{int: insertion extension penalty per base (in addition to scoreInsOpen)} 
 \optName{scoreStitchSJshift}
   \optValue{1}
-  \optLine{maximum score reduction while searching for SJ boundaries in the stitching step} 
+  \optLine{int: maximum score reduction while searching for SJ boundaries in the stitching step} 
 \end{optTable}
 \optSection{Alignments and Seeding}\label{Alignments_and_Seeding}
 \begin{optTable}
@@ -678,13 +680,13 @@
   \optLine{int{\textgreater}0: min length of seeds to be mapped} 
 \optName{alignIntronMin}
   \optValue{21}
-  \optLine{minimum intron size: genomic gap is considered intron if its length{\textgreater}=alignIntronMin, otherwise it is considered Deletion} 
+  \optLine{int: minimum intron size, genomic gap is considered intron if its length{\textgreater}=alignIntronMin, otherwise it is considered Deletion} 
 \optName{alignIntronMax}
   \optValue{0}
-  \optLine{maximum intron size, if 0, max intron size will be determined by (2\^{}winBinNbits)*winAnchorDistNbins} 
+  \optLine{int: maximum intron size, if 0, max intron size will be determined by (2\^{}winBinNbits)*winAnchorDistNbins} 
 \optName{alignMatesGapMax}
   \optValue{0}
-  \optLine{maximum gap between two mates, if 0, max intron gap will be determined by (2\^{}winBinNbits)*winAnchorDistNbins} 
+  \optLine{int: maximum gap between two mates, if 0, max intron gap will be determined by (2\^{}winBinNbits)*winAnchorDistNbins} 
 \optName{alignSJoverhangMin}
   \optValue{5}
   \optLine{int{\textgreater}0: minimum overhang (i.e. block size) for spliced alignments} 
@@ -747,7 +749,7 @@
 \begin{optTable}
 \optName{peOverlapNbasesMin}
   \optValue{0}
-  \optLine{int{\textgreater}=0:             minimum number of overlap bases to trigger mates merging and realignment. Specify {\textgreater}0 value to switch on the "merginf of overlapping mates" algorithm.} 
+  \optLine{int{\textgreater}=0:             minimum number of overlapping bases to trigger mates merging and realignment. Specify {\textgreater}0 value to switch on the "merginf of overlapping mates" algorithm.} 
 \optName{peOverlapMMp}
   \optValue{0.01}
   \optLine{real, {\textgreater}=0 {\&} {\textless}1:     maximum proportion of mismatched bases in the overlap area} 
@@ -847,7 +849,7 @@
   \optOpt{GeneCounts}   \optOptLine{count reads per gene}
 \end{optOptTable}
 \optName{quantTranscriptomeBAMcompression}
-  \optValue{1 1}
+  \optValue{1}
   \optLine{int: -2 to 10  transcriptome BAM compression level} 
 \begin{optOptTable}
   \optOpt{-2}   \optOptLine{no BAM output}
@@ -855,12 +857,13 @@
   \optOpt{0}   \optOptLine{no compression}
   \optOpt{10}   \optOptLine{maximum compression}
 \end{optOptTable}
-\optName{quantTranscriptomeBan}
-  \optValue{IndelSoftclipSingleend}
-  \optLine{string: prohibit various alignment type} 
+\optName{quantTranscriptomeSAMoutput}
+  \optValue{BanSingleEnd{\textunderscore}BanIndels{\textunderscore}ExtendSoftclip}
+  \optLine{string: alignment filtering for TranscriptomeSAM output} 
 \begin{optOptTable}
-  \optOpt{IndelSoftclipSingleend}   \optOptLine{prohibit indels, soft clipping and single-end alignments - compatible with RSEM}
-  \optOpt{Singleend}   \optOptLine{prohibit single-end alignments}
+  \optOpt{BanSingleEnd{\textunderscore}BanIndels{\textunderscore}ExtendSoftclip}   \optOptLine{prohibit indels and single-end alignments, extend softclips - compatible with RSEM}
+  \optOpt{BanSingleEnd}   \optOptLine{prohibit single-end alignments, allow indels and softclips}
+  \optOpt{BanSingleEnd{\textunderscore}ExtendSoftclip}   \optOptLine{prohibit single-end alignments, extend softclips, allow indels}
 \end{optOptTable}
 \end{optTable}
 \optSection{2-pass Mapping}\label{2-pass_Mapping}
@@ -936,7 +939,7 @@
 \end{optOptTable}
 \optName{soloCBposition}
   \optValue{-}
-  \optLine{strings(s)              position of Cell Barcode(s) on the barcode read.} 
+  \optLine{strings(s):             position of Cell Barcode(s) on the barcode read.} 
   \optLine{Presently only works with --soloType CB{\textunderscore}UMI{\textunderscore}Complex, and barcodes are assumed to be on Read2.} 
   \optLine{Format for each barcode: startAnchor{\textunderscore}startPosition{\textunderscore}endAnchor{\textunderscore}endPosition} 
   \optLine{start(end)Anchor defines the Anchor Base for the CB: 0: read start; 1: read end; 2: adapter start; 3: adapter end} 
@@ -946,7 +949,7 @@
   \optLine{--soloCBposition  0{\textunderscore}0{\textunderscore}2{\textunderscore}-1  3{\textunderscore}1{\textunderscore}3{\textunderscore}8} 
 \optName{soloUMIposition}
   \optValue{-}
-  \optLine{string                  position of the UMI on the barcode read, same as soloCBposition} 
+  \optLine{string:                  position of the UMI on the barcode read, same as soloCBposition} 
   \optLine{Example: inDrop (Zilionis et al, Nat. Protocols, 2017):} 
   \optLine{--soloCBposition  3{\textunderscore}9{\textunderscore}3{\textunderscore}14} 
 \optName{soloAdapterSequence}
@@ -1020,7 +1023,7 @@
 \end{optOptTable}
 \optName{soloUMIfiltering}
   \optValue{-}
-  \optLine{string(s)               type of UMI filtering (for reads uniquely mapping to genes)} 
+  \optLine{string(s):              type of UMI filtering (for reads uniquely mapping to genes)} 
 \begin{optOptTable}
   \optOpt{-}   \optOptLine{basic filtering: remove UMIs with N and homopolymers (similar to CellRanger 2.2.0).}
   \optOpt{MultiGeneUMI}   \optOptLine{basic + remove lower-count UMIs that map to more than one gene.}
@@ -1030,7 +1033,7 @@
   \optLine{Only works with --soloUMIdedup 1MM{\textunderscore}CR} 
 \optName{soloOutFileNames}
   \optValue{Solo.out/ features.tsv barcodes.tsv matrix.mtx}
-  \optLine{string(s)               file names for STARsolo output:} 
+  \optLine{string(s):              file names for STARsolo output:} 
   \optLine{file{\textunderscore}name{\textunderscore}prefix   gene{\textunderscore}names   barcode{\textunderscore}sequences   cell{\textunderscore}feature{\textunderscore}count{\textunderscore}matrix} 
 \optName{soloCellFilter}
   \optValue{CellRanger2.2 3000 0.99 10}


=====================================
extras/docker/Dockerfile
=====================================
@@ -2,7 +2,7 @@ FROM debian:stable-slim
 
 MAINTAINER dobin at cshl.edu
 
-ARG STAR_VERSION=2.7.11a
+ARG STAR_VERSION=2.7.11b
 
 ENV PACKAGES gcc g++ make wget zlib1g-dev unzip
 


=====================================
source/Parameters.cpp
=====================================
@@ -261,7 +261,7 @@ Parameters::Parameters() {//initalize parameters info
     //quant
     parArray.push_back(new ParameterInfoVector <string> (-1, -1, "quantMode", &quant.mode));
     parArray.push_back(new ParameterInfoScalar <int>     (-1, -1, "quantTranscriptomeBAMcompression", &quant.trSAM.bamCompression));
-    parArray.push_back(new ParameterInfoScalar <string>     (-1, -1, "quantTranscriptomeBan", &quant.trSAM.ban));
+    parArray.push_back(new ParameterInfoScalar <string>     (-1, -1, "quantTranscriptomeSAMoutput", &quant.trSAM.output));
 
     //2-pass
     parArray.push_back(new ParameterInfoScalar <uint>   (-1, -1, "twopass1readsN", &twoPass.pass1readsN));
@@ -912,14 +912,18 @@ void Parameters::inputParameters (int argInN, char* argIn[]) {//input parameters
                     };
                     inOut->outQuantBAMfile=bgzf_open(outQuantBAMfileName.c_str(),("w"+to_string((long long) quant.trSAM.bamCompression)).c_str());
                 };
-                if (quant.trSAM.ban=="IndelSoftclipSingleend") {
+                if (quant.trSAM.output=="BanSingleEnd_BanIndels_ExtendSoftclip") {
                     quant.trSAM.indel=false;
                     quant.trSAM.softClip=false;
                     quant.trSAM.singleEnd=false;
-                } else if (quant.trSAM.ban=="Singleend") {
+                } else if (quant.trSAM.output=="BanSingleEnd") {
                     quant.trSAM.indel=true;
                     quant.trSAM.softClip=true;
                     quant.trSAM.singleEnd=false;
+                } else if (quant.trSAM.output=="BanSingleEnd_ExtendSoftclip") {
+                    quant.trSAM.indel=true;
+                    quant.trSAM.softClip=false;
+                    quant.trSAM.singleEnd=false;
                 };
             } else if  (quant.mode.at(ii)=="GeneCounts") {
                 quant.geCount.yes=true;
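
Note on the hunk above: the three accepted values of --quantTranscriptomeSAMoutput each set the same three booleans in quant.trSAM. The following is a minimal standalone sketch of that mapping, not STAR's actual code. TrSamFlags and flagsFromOption are hypothetical names introduced here, and the false return for an unrecognised value is an assumption of the sketch (the hunk has no fallback branch). Reading softClip=false as "extend softclips" is inferred from the option names in the documentation.

```cpp
#include <string>

// Hypothetical stand-in for the three flags stored in quant.trSAM.
struct TrSamFlags {
    bool indel;      // true: alignments with indels are allowed
    bool softClip;   // true: soft-clipped alignments are allowed as-is
    bool singleEnd;  // true: single-end alignments are allowed
};

// Mirrors the if/else chain from the diff. Returns false when the value is
// not one of the three documented options (assumption: the original code
// leaves the flags untouched in that case).
bool flagsFromOption(const std::string &option, TrSamFlags &flags) {
    if (option == "BanSingleEnd_BanIndels_ExtendSoftclip") {
        flags = {false, false, false};  // default, described as RSEM-compatible
    } else if (option == "BanSingleEnd") {
        flags = {true, true, false};
    } else if (option == "BanSingleEnd_ExtendSoftclip") {
        flags = {true, false, false};   // the option added in 2.7.11b
    } else {
        return false;
    }
    return true;
}
```

In this sketch the old value names (IndelSoftclipSingleend, Singleend) are rejected, which matches the rename in the diff: only the new spellings are compared against.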


=====================================
source/Parameters.h
=====================================
@@ -301,7 +301,7 @@ class Parameters {
                 bool softClip;
                 bool singleEnd;
                 int bamCompression;
-                string ban;
+                string output;
             } trSAM;
 
             struct {


=====================================
source/VERSION
=====================================
@@ -1 +1 @@
-#define STAR_VERSION "2.7.11a"
+#define STAR_VERSION "2.7.11b"


=====================================
source/parametersDefault
=====================================
@@ -55,14 +55,14 @@ genomeFileSizes             0
     uint(s)>0: genome files exact sizes in bytes. Typically, this should not be defined by the user.
     
 genomeTransformOutput       None
-    string(s)               which output to transform back to original genome
+    string(s):              which output to transform back to original genome
                             SAM     ... SAM/BAM alignments
                             SJ      ... splice junctions (SJ.out.tab)
                             Quant   ... quantifications (from --quantMode option)
                             None    ... no transformation of the output        
 
 genomeChrSetMitochondrial   chrM M MT
-    string(s)               names of the mitochondrial chromosomes. Presently only used for STARsolo statisics output/
+    string(s):              names of the mitochondrial chromosomes. Presently only used for STARsolo statistics output/
 
 ### Genome Indexing Parameters - only used with --runMode genomeGenerate
 genomeChrBinNbits           18
@@ -105,7 +105,7 @@ genomeType                  Full
 
 ### Splice Junctions Database
 sjdbFileChrStartEnd                     -
-    string(s): path to the files with genomic coordinates (chr <tab> start <tab> end <tab> strand) for the splice junction introns. Multiple files can be supplied wand will be concatenated.
+    string(s): path to the files with genomic coordinates (chr <tab> start <tab> end <tab> strand) for the splice junction introns. Multiple files can be supplied and will be concatenated.
 
 sjdbGTFfile                             -
     string: path to the GTF file with annotations
@@ -123,10 +123,10 @@ sjdbGTFtagExonParentGene                gene_id
     string: GTF attribute name for parent gene ID (default "gene_id" works for GTF files)
 
 sjdbGTFtagExonParentGeneName            gene_name
-    string(s): GTF attrbute name for parent gene name
+    string(s): GTF attribute name for parent gene name
 
 sjdbGTFtagExonParentGeneType            gene_type gene_biotype
-    string(s): GTF attrbute name for parent gene type
+    string(s): GTF attribute name for parent gene type
 
 sjdbOverhang                            100
     int>0: length of the donor/acceptor sequence on each side of the junctions, ideally = (mate_length - 1)
@@ -230,7 +230,7 @@ limitGenomeGenerateRAM               31000000000
     int>0: maximum available RAM (bytes) for genome generation
 
 limitIObufferSize                    30000000 50000000
-    int>0: max available buffers size (bytes) for input/output, per thread
+    int(s)>0: max available buffers size (bytes) for input/output, per thread
 
 limitOutSAMoneReadBytes              100000
     int>0: max size of the SAM record (bytes) for one read. Recommended value: >(2*(LengthMate1+LengthMate2+100)*outFilterMultimapNmax
@@ -245,7 +245,7 @@ limitBAMsortRAM                         0
     int>=0: maximum available RAM (bytes) for sorting BAM. If =0, it will be set to the genome index size. 0 value can only be used with --genomeLoad NoSharedMemory option.
 
 limitSjdbInsertNsj                     1000000
-    int>=0: maximum number of junction to be inserted to the genome on the fly at the mapping stage, including those from annotations and those detected in the 1st step of the 2-pass run
+    int>=0: maximum number of junctions to be inserted to the genome on the fly at the mapping stage, including those from annotations and those detected in the 1st step of the 2-pass run
 
 limitNreadsSoft                        -1
     int: soft limit on the number of reads
@@ -256,12 +256,12 @@ outFileNamePrefix               ./
 
 outTmpDir                       -
     string: path to a directory that will be used as temporary by STAR. All contents of this directory will be removed!
-            - the temp directory will default to outFileNamePrefix_STARtmp
+                                - ... the temp directory will default to outFileNamePrefix_STARtmp
 
 outTmpKeep                      None
-    string: whether to keep the tempporary files after STAR runs is finished
+    string: whether to keep the temporary files after STAR runs is finished
                                 None ... remove all temporary files
-                                All .. keep all files
+                                All ... keep all files
 
 outStd                          Log
     string: which output will be directed to stdout (standard out)
@@ -307,7 +307,7 @@ outSAMstrandField               None
                                 intronMotif ... strand derived from the intron motif. This option changes the output alignments: reads with inconsistent and/or non-canonical introns are filtered out.
 
 outSAMattributes                Standard
-    string: a string of desired SAM attributes, in the order desired for the output SAM. Tags can be listed in any combination/order.
+    string(s): a string of desired SAM attributes, in the order desired for the output SAM. Tags can be listed in any combination/order.
                                 ***Presets:
                                 None        ... no attributes
                                 Standard    ... NH HI AS nM
@@ -415,7 +415,7 @@ outBAMsortingThreadN    0
     int: >=0: number of threads for BAM sorting. 0 will default to min(6,--runThreadN).
 
 outBAMsortingBinsN      50
-    int: >0:  number of genome bins fo coordinate-sorting
+    int: >0:  number of genome bins for coordinate-sorting
 
 ### BAM processing
 bamRemoveDuplicatesType  -
@@ -478,7 +478,7 @@ outFilterScoreMin               0
     int: alignment will be output only if its score is higher than or equal to this value.
 
 outFilterScoreMinOverLread      0.66
-    real: same as outFilterScoreMin, but  normalized to read length (sum of mates' lengths for paired-end reads)
+    real: same as outFilterScoreMin, but normalized to read length (sum of mates' lengths for paired-end reads)
 
 outFilterMatchNmin              0
     int: alignment will be output only if the number of matched bases is higher than or equal to this value.
@@ -540,28 +540,28 @@ scoreGapNoncan               -8
     int: non-canonical junction penalty (in addition to scoreGap)
 
 scoreGapGCAG                 -4
-    GC/AG and CT/GC junction penalty (in addition to scoreGap)
+    int: GC/AG and CT/GC junction penalty (in addition to scoreGap)
 
 scoreGapATAC                 -8
-    AT/AC  and GT/AT junction penalty  (in addition to scoreGap)
+    int: AT/AC  and GT/AT junction penalty  (in addition to scoreGap)
 
 scoreGenomicLengthLog2scale   -0.25
-    extra score logarithmically scaled with genomic length of the alignment: scoreGenomicLengthLog2scale*log2(genomicLength)
+    int: extra score logarithmically scaled with genomic length of the alignment: scoreGenomicLengthLog2scale*log2(genomicLength)
 
 scoreDelOpen                 -2
-    deletion open penalty
+    int: deletion open penalty
 
 scoreDelBase                 -2
-    deletion extension penalty per base (in addition to scoreDelOpen)
+    int: deletion extension penalty per base (in addition to scoreDelOpen)
 
 scoreInsOpen                 -2
-    insertion open penalty
+    int: insertion open penalty
 
 scoreInsBase                 -2
-    insertion extension penalty per base (in addition to scoreInsOpen)
+    int: insertion extension penalty per base (in addition to scoreInsOpen)
 
 scoreStitchSJshift           1
-    maximum score reduction while searching for SJ boundaries in the stitching step
+    int: maximum score reduction while searching for SJ boundaries in the stitching step
 
 
 ### Alignments and Seeding
@@ -594,13 +594,13 @@ seedMapMin              5
     int>0: min length of seeds to be mapped
 
 alignIntronMin              21
-    minimum intron size: genomic gap is considered intron if its length>=alignIntronMin, otherwise it is considered Deletion
+    int: minimum intron size, genomic gap is considered intron if its length>=alignIntronMin, otherwise it is considered Deletion
 
 alignIntronMax              0
-    maximum intron size, if 0, max intron size will be determined by (2^winBinNbits)*winAnchorDistNbins
+    int: maximum intron size, if 0, max intron size will be determined by (2^winBinNbits)*winAnchorDistNbins
 
 alignMatesGapMax            0
-    maximum gap between two mates, if 0, max intron gap will be determined by (2^winBinNbits)*winAnchorDistNbins
+    int: maximum gap between two mates, if 0, max intron gap will be determined by (2^winBinNbits)*winAnchorDistNbins
 
 alignSJoverhangMin          5
     int>0: minimum overhang (i.e. block size) for spliced alignments
@@ -653,7 +653,7 @@ alignInsertionFlush     None
 
 ### Paired-End reads
 peOverlapNbasesMin          0
-    int>=0:             minimum number of overlap bases to trigger mates merging and realignment. Specify >0 value to switch on the "merginf of overlapping mates" algorithm.
+    int>=0:             minimum number of overlapping bases to trigger mates merging and realignment. Specify >0 value to switch on the "merginf of overlapping mates" algorithm.
 
 peOverlapMMp                0.01
     real, >=0 & <1:     maximum proportion of mismatched bases in the overlap area
@@ -738,17 +738,19 @@ quantMode                   -
                             TranscriptomeSAM ... output SAM/BAM alignments to transcriptome into a separate file
                             GeneCounts       ... count reads per gene
 
-quantTranscriptomeBAMcompression    1       1
+quantTranscriptomeBAMcompression    1
     int: -2 to 10  transcriptome BAM compression level
                             -2  ... no BAM output
                             -1  ... default compression (6?)
                              0  ... no compression
                              10 ... maximum compression
 
-quantTranscriptomeBan       IndelSoftclipSingleend
-    string: prohibit various alignment type
-                            IndelSoftclipSingleend  ... prohibit indels, soft clipping and single-end alignments - compatible with RSEM
-                            Singleend               ... prohibit single-end alignments
+quantTranscriptomeSAMoutput BanSingleEnd_BanIndels_ExtendSoftclip
+    string: alignment filtering for TranscriptomeSAM output
+                            BanSingleEnd_BanIndels_ExtendSoftclip ... prohibit indels and single-end alignments, extend softclips - compatible with RSEM
+                            BanSingleEnd               ... prohibit single-end alignments, allow indels and softclips
+                            BanSingleEnd_ExtendSoftclip ... prohibit single-end alignments, extend softclips, allow indels
+
 
 ### 2-pass Mapping
 twopassMode                 None
@@ -799,14 +801,14 @@ soloBarcodeReadLength       1
                             1   ... equal to sum of soloCBlen+soloUMIlen
                             0   ... not defined, do not check
 
-soloBarcodeMate             0                            
+soloBarcodeMate             0
     int: identifies which read mate contains the barcode (CB+UMI) sequence
                             0   ... barcode sequence is on separate read, which should always be the last file in the --readFilesIn listed
                             1   ... barcode sequence is a part of mate 1
                             2   ... barcode sequence is a part of mate 2
 
 soloCBposition              -
-    strings(s)              position of Cell Barcode(s) on the barcode read.
+    strings(s):             position of Cell Barcode(s) on the barcode read.
                             Presently only works with --soloType CB_UMI_Complex, and barcodes are assumed to be on Read2.
                             Format for each barcode: startAnchor_startPosition_endAnchor_endPosition
                             start(end)Anchor defines the Anchor Base for the CB: 0: read start; 1: read end; 2: adapter start; 3: adapter end
@@ -816,7 +818,7 @@ soloCBposition              -
                             --soloCBposition  0_0_2_-1  3_1_3_8
 
 soloUMIposition             -
-    string                  position of the UMI on the barcode read, same as soloCBposition
+    string:                  position of the UMI on the barcode read, same as soloCBposition
                             Example: inDrop (Zilionis et al, Nat. Protocols, 2017):
                             --soloCBposition  3_9_3_14
 
@@ -882,7 +884,7 @@ soloUMIdedup                1MM_All
                             1MM_CR                      ... CellRanger2-4 algorithm for 1MM UMI collapsing.
 
 soloUMIfiltering            -
-    string(s)               type of UMI filtering (for reads uniquely mapping to genes)
+    string(s):              type of UMI filtering (for reads uniquely mapping to genes)
                             -                  ... basic filtering: remove UMIs with N and homopolymers (similar to CellRanger 2.2.0).
                             MultiGeneUMI       ... basic + remove lower-count UMIs that map to more than one gene.
                             MultiGeneUMI_All   ... basic + remove all UMIs that map to more than one gene.
@@ -890,7 +892,7 @@ soloUMIfiltering            -
                                                    Only works with --soloUMIdedup 1MM_CR
                                                 
 soloOutFileNames            Solo.out/          features.tsv barcodes.tsv        matrix.mtx
-    string(s)               file names for STARsolo output:
+    string(s):              file names for STARsolo output:
                             file_name_prefix   gene_names   barcode_sequences   cell_feature_count_matrix
 
 soloCellFilter              CellRanger2.2 3000 0.99 10



View it on GitLab: https://salsa.debian.org/med-team/rna-star/-/compare/916fa04036cca31804032446f7f2fc04d6486eeb...494932511928e3e94400423f7828a4aa426e93e4


