[med-svn] [Git][med-team/salmon][master] 4 commits: routine-update: New upstream version

Steffen Möller (@moeller) gitlab at salsa.debian.org
Tue Dec 14 15:09:15 GMT 2021



Steffen Möller pushed to branch master at Debian Med / salmon


Commits:
2cbddf72 by Steffen Moeller at 2021-12-13T15:40:50+01:00
routine-update: New upstream version

- - - - -
fdb785fc by Steffen Moeller at 2021-12-13T15:40:52+01:00
New upstream version 1.6.0+ds1
- - - - -
7495e0c7 by Steffen Moeller at 2021-12-13T15:40:57+01:00
Update upstream source from tag 'upstream/1.6.0+ds1'

Update to upstream version '1.6.0+ds1'
with Debian dir 5076e564b49ed8732453b7e3826841cedbef49d2
- - - - -
b7cbdadd by Steffen Moeller at 2021-12-13T15:51:34+01:00
Builds and cowbuilds.

- - - - -


29 changed files:

- current_version.txt
- debian/changelog
- debian/control
- doc/source/alevin.rst
- doc/source/conf.py
- docker/Dockerfile
- docker/build_test.sh
- include/AlevinUtils.hpp
- include/ReadExperiment.hpp
- include/SalmonConfig.hpp
- include/SalmonDefaults.hpp
- include/SalmonIndex.hpp
- include/SingleCellProtocols.hpp
- include/cuckoohash_map.hh
- include/json.hpp
- scripts/fetchPufferfish.sh
- src/Alevin.cpp
- src/AlevinHash.cpp
- src/AlevinUtils.cpp
- src/BuildSalmonIndex.cpp
- src/CollapsedCellOptimizer.cpp
- src/GZipWriter.cpp
- src/ProgramOptionsGenerator.cpp
- src/Salmon.cpp
- src/SalmonAlevin.cpp
- src/SalmonQuantMerge.cpp
- src/SalmonQuantify.cpp
- src/SalmonQuantifyAlignments.cpp
- src/WhiteList.cpp


Changes:

=====================================
current_version.txt
=====================================
@@ -1,3 +1,3 @@
 VERSION_MAJOR 1
-VERSION_MINOR 5
-VERSION_PATCH 2
+VERSION_MINOR 6
+VERSION_PATCH 0


=====================================
debian/changelog
=====================================
@@ -1,3 +1,10 @@
+salmon (1.6.0+ds1-1) unstable; urgency=medium
+
+  * Team upload.
+  * New upstream version
+
+ -- Steffen Moeller <moeller at debian.org>  Mon, 13 Dec 2021 15:41:10 +0100
+
 salmon (1.5.2+ds1-1) unstable; urgency=medium
 
   [ Steffen Moeller ]


=====================================
debian/control
=====================================
@@ -12,7 +12,7 @@ Build-Depends: debhelper-compat (= 13),
                libboost-thread-dev,
                libboost-program-options-dev,
                libboost-timer-dev,
-               libjellyfish-2.0-dev (>> 2.2.3-2),
+               libjellyfish-2.0-dev,
                libpthread-stubs0-dev,
                libsparsehash-dev,
                libdivsufsort-dev,
@@ -44,10 +44,10 @@ Rules-Requires-Root: no
 
 Package: salmon
 Architecture: any-amd64 arm64
-Built-Using: ${sphinxdoc:Built-Using}
 Depends: ${shlibs:Depends},
          ${misc:Depends},
          ${sphinxdoc:Depends}
+Built-Using: ${sphinxdoc:Built-Using}
 Description: wicked-fast transcript quantification from RNA-seq data
  Salmon is a wicked-fast program to produce highly-accurate, transcript-level
  quantification estimates from RNA-seq data. Salmon achieves its accuracy and


=====================================
doc/source/alevin.rst
=====================================
@@ -1,10 +1,14 @@
 Alevin
 ================
 
-Alevin is a tool --- integrated with the salmon software --- that introduces a family of algorithms for quantification and analysis of 3' tagged-end single-cell sequencing data. Currently alevin supports the following two major droplet based single-cell protocols:
+Alevin is a tool --- integrated with the salmon software --- that introduces a family of algorithms for quantification and analysis of 3' tagged-end single-cell sequencing data. Currently alevin supports the following single-cell protocols:
 
 1. Drop-seq
 2. 10x-Chromium v1/2/3
+3. inDropV2
+4. CELSeq 1/2
+5. Quartz-Seq2
+6. sci-RNA-seq3
 
 Alevin works under the same indexing scheme (as salmon) for the reference, and consumes the set of FASTA/Q files(s) containing the Cellular Barcode(CB) + Unique Molecule identifier (UMI) in one read file and the read sequence in the other.  Given just the transcriptome and the raw read files, alevin generates a cell-by-gene count matrix (in a fraction of the time compared to other tools).
 
@@ -177,6 +181,18 @@ map end-to-end.  Instead, the score of the mapping will be the position along th
 highest score.  This is the score which must reach the fraction threshold for the read to be considered
 as valid.
 
+Single-cell protocol specific notes
+------------------------------------
+
+In cases where a single-cell protocol supports variable-length cell barcodes, alevin adds nucleotide padding to make the lengths uniform.
+Furthermore, the padding scheme ensures that no collisions are introduced in the process. The padding scheme is as follows:
+
+1. sci-RNA-seq3: The barcode is composed of a 9-10 bp hairpin adaptor and a 10 bp reverse transcription index, making it 19-20 bp long. If
+the barcode is 20 bp long, alevin adds `A`, and it adds `AC` if it is 19 bp long. Thus, the length of the barcode in the output is 21 bp.
+2. inDropV2: An 8-11 bp barcode1 along with an 8 bp barcode2 makes up the barcode. For barcode lengths of 16, 17, 18, and 19 bp, alevin adds
+`AAAC`, `AAG`, `AT`, and `A` respectively. Thus, the length of the barcode in the output is 20 bp. Furthermore, the position of barcode1
+depends on finding an exact match of the sequence `w1`. If an exact match is not found, a search for `w1` is performed allowing a maximum Hamming
+ distance of 2 between `w1` and each read2 substring of `w1` length within the required bounds; the first match is returned.
 
 Output
 ------
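
Editor's note on the alevin.rst hunk above: the padding scheme it documents can be sketched as below. This is a minimal illustration, not upstream code. The function names padSciSeq3Sketch and padInDropV2Sketch are hypothetical, as is the choice to return an empty string for an unsupported length; only the suffixes and the resulting lengths (21 bp and 20 bp) come from the documentation text.

```cpp
#include <string>

// Hypothetical helper (not part of salmon): pad a sci-RNA-seq3 barcode
// of 19 or 20 bp to 21 bp using the suffixes named in the documentation.
// Returns an empty string for any other length (an assumption of this sketch).
std::string padSciSeq3Sketch(const std::string& bc) {
  switch (bc.size()) {
    case 19: return bc + "AC";
    case 20: return bc + "A";
    default: return std::string();
  }
}

// Hypothetical helper (not part of salmon): pad an inDropV2 barcode
// of 16 to 19 bp to 20 bp using the suffixes named in the documentation.
std::string padInDropV2Sketch(const std::string& bc) {
  switch (bc.size()) {
    case 16: return bc + "AAAC";
    case 17: return bc + "AAG";
    case 18: return bc + "AT";
    case 19: return bc + "A";
    default: return std::string();
  }
}
```

Within each protocol, every suffix ends in a different nucleotide, so the last base of a padded barcode identifies the original length. That is one way to see why two barcodes of different lengths cannot collide after padding.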


=====================================
doc/source/conf.py
=====================================
@@ -48,16 +48,16 @@ master_doc = 'index'
 
 # General information about the project.
 project = u'Salmon'
-copyright = u'2013-2017, Rob Patro, Geet Duggal, Mike Love, Rafael Irizarry and Carl Kingsford'
+copyright = u'2013-2021, Rob Patro, Geet Duggal, Mike Love, Rafael Irizarry and Carl Kingsford'
 
 # The version info for the project you're documenting, acts as replacement for
 # |version| and |release|, also used in various other places throughout the
 # built documents.
 #
 # The short X.Y version.
-version = '1.5'
+version = '1.6'
 # The full version, including alpha/beta/rc tags.
-release = '1.5.2'
+release = '1.6.0'
 
 # The language for content autogenerated by Sphinx. Refer to documentation
 # for a list of supported languages.


=====================================
docker/Dockerfile
=====================================
@@ -6,7 +6,7 @@ MAINTAINER salmon.maintainer at gmail.com
 
 ENV PACKAGES git gcc make g++ libboost-all-dev liblzma-dev libbz2-dev \
     ca-certificates zlib1g-dev libcurl4-openssl-dev curl unzip autoconf apt-transport-https ca-certificates gnupg software-properties-common wget
-ENV SALMON_VERSION 1.5.2
+ENV SALMON_VERSION 1.6.0
 
 # salmon binary will be installed in /home/salmon/bin/salmon
 


=====================================
docker/build_test.sh
=====================================
@@ -1,3 +1,3 @@
 #! /bin/bash
-SALMON_VERSION=1.5.2
+SALMON_VERSION=1.6.0
 docker build --no-cache -t combinelab/salmon:${SALMON_VERSION} -t combinelab/salmon:latest .


=====================================
include/AlevinUtils.hpp
=====================================
@@ -22,6 +22,7 @@
 #include <algorithm>
 #include <limits>
 #include <string>
+#include <numeric>
 
 #include "spdlog/spdlog.h"
 
@@ -72,6 +73,8 @@ namespace alevin{
     void readWhitelist(bfs::path& filePath,
                        TrueBcsT& trueBarcodes);
 
+    unsigned int hammingDistance(const std::string s1, const std::string s2);
+
     template <typename ProtocolT>
     bool processAlevinOpts(AlevinOpts<ProtocolT>& aopt,
                            SalmonOpts& sopt, bool noTgMap,
@@ -97,7 +100,7 @@ namespace alevin{
                       OrderedOptionsT& orderedOptions) {
       std::ofstream os(cmdInfoPath.string());
       cereal::JSONOutputArchive oa(os);
-      oa(cereal::make_nvp("salmon_version:", std::string(salmon::version)));
+      oa(cereal::make_nvp("salmon_version", std::string(salmon::version)));
       for (auto& opt : orderedOptions.options) {
         if (opt.value.size() == 1) {
           oa(cereal::make_nvp(opt.string_key, opt.value.front()));
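
Editor's note on the AlevinUtils.hpp hunk above: it declares hammingDistance and adds the <numeric> header, but the definition is not part of this header diff. One plausible way to write such a function with std::inner_product is sketched below. The name hammingDistanceSketch and the handling of unequal lengths (each extra character counts as one mismatch) are assumptions of this sketch, not the upstream behaviour.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <numeric>
#include <string>

// Hypothetical sketch (not the upstream definition): count positions at
// which two strings differ. Characters beyond the shorter string are
// each counted as one mismatch, which is an assumption of this sketch.
unsigned int hammingDistanceSketch(const std::string& s1,
                                   const std::string& s2) {
  const std::size_t n = std::min(s1.size(), s2.size());
  // inner_product with (plus, not_equal_to) sums the per-position mismatches.
  const unsigned int mismatches = std::inner_product(
      s1.begin(), s1.begin() + n, s2.begin(), 0u,
      std::plus<unsigned int>(), std::not_equal_to<char>());
  const std::size_t lengthDiff = std::max(s1.size(), s2.size()) - n;
  return mismatches + static_cast<unsigned int>(lengthDiff);
}
```

A caller could then accept a candidate `w1` position when the returned value is at most 2, matching the maxHammingDist default shown in the SingleCellProtocols.hpp hunk.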


=====================================
include/ReadExperiment.hpp
=====================================
@@ -25,6 +25,7 @@
 
 // Boost includes
 #include <boost/filesystem.hpp>
+#include <boost/filesystem/path.hpp>
 #include <boost/range/irange.hpp>
 
 // Cereal includes
@@ -48,10 +49,12 @@ class ReadExperiment {
 public:
   ReadExperiment(std::vector<ReadLibrary>& readLibraries,
                  // const boost::filesystem::path& transcriptFile,
-                 const boost::filesystem::path& indexDirectory,
+                 SalmonIndex* salmonIndex,
+                 // const boost::filesystem::path& indexDirectory,
                  SalmonOpts& sopt)
       : readLibraries_(readLibraries),
         // transcriptFile_(transcriptFile),
+        salmonIndex_(salmonIndex),
         transcripts_(std::vector<Transcript>()), totalAssignedFragments_(0),
         fragStartDists_(5), posBiasFW_(5), posBiasRC_(5), posBiasExpectFW_(5),
         posBiasExpectRC_(5), /*seqBiasModel_(1.0),*/ eqBuilder_(sopt.jointLog, sopt.maxHashResizeThreads),
@@ -115,24 +118,6 @@ public:
     }
     */
 
-    // ==== Figure out the index type
-    boost::filesystem::path versionPath = indexDirectory / "versionInfo.json";
-    SalmonIndexVersionInfo versionInfo;
-    versionInfo.load(versionPath);
-    if (versionInfo.indexVersion() == 0) {
-      fmt::MemoryWriter infostr;
-      infostr << "Error: The index version file " << versionPath.string()
-              << " doesn't seem to exist.  Please try re-building the salmon "
-                 "index.";
-      throw std::invalid_argument(infostr.str());
-    }
-    // Check index version compatibility here
-    auto indexType = versionInfo.indexType();
-    // ==== Figure out the index type
-
-    salmonIndex_.reset(new SalmonIndex(sopt.jointLog, indexType));
-    salmonIndex_->load(indexDirectory);
-
     // Now we'll have either an FMD-based index or a QUASI index
     // dispatch on the correct type.
     fmt::MemoryWriter infostr;
@@ -159,7 +144,7 @@ public:
     // Create the cluster forest for this set of transcripts
     clusters_.reset(new ClusterForest(transcripts_.size(), transcripts_));
   }
-
+  
   EQBuilderT& equivalenceClassBuilder() { return eqBuilder_; }
 
   std::string getIndexSeqHash256() const { return salmonIndex_->seqHash256(); }
@@ -262,7 +247,7 @@ public:
     }
   }
 
-  SalmonIndex* getIndex() { return salmonIndex_.get(); }
+  SalmonIndex* getIndex() { return salmonIndex_; }
 
   template <typename PuffIndexT>
   void loadTranscriptsFromPuff(PuffIndexT* idx_, const SalmonOpts& sopt) {
@@ -416,7 +401,7 @@ public:
     std::atomic<bool> burnedIn{
         totalAssignedFragments_ + numAssignedFragments_ >= sopt.numBurninFrags};
     for (auto& rl : readLibraries_) {
-      processReadLibrary(rl, salmonIndex_.get(), transcripts_, clusterForest(),
+      processReadLibrary(rl, salmonIndex_, transcripts_, clusterForest(),
                          *(fragLengthDist_.get()), numAssignedFragments_,
                          numThreads, burnedIn);
     }
@@ -806,7 +791,7 @@ private:
   /**
    * The index we've built on the set of transcripts.
    */
-  std::unique_ptr<SalmonIndex> salmonIndex_{nullptr};
+  SalmonIndex* salmonIndex_{nullptr};
   /**
    * The cluster forest maintains the dynamic relationship
    * defined by transcripts and reads --- if two transcripts


=====================================
include/SalmonConfig.hpp
=====================================
@@ -26,9 +26,9 @@
 
 namespace salmon {
 constexpr char majorVersion[] = "1";
-constexpr char minorVersion[] = "5";
-constexpr char patchVersion[] = "2";
-constexpr char version[] = "1.5.2";
+constexpr char minorVersion[] = "6";
+constexpr char patchVersion[] = "0";
+constexpr char version[] = "1.6.0";
 constexpr uint32_t indexVersion = 5;
 constexpr char requiredQuasiIndexVersion[] = "p7";
 } // namespace salmon


=====================================
include/SalmonDefaults.hpp
=====================================
@@ -140,6 +140,7 @@ namespace defaults {
   constexpr const bool isCELSeq{false};
   constexpr const bool isCELSeq2{false};
   constexpr const bool isQuartzSeq2{false};
+  constexpr const bool isSciSeq3{false};
   constexpr const bool noQuant{false};
   constexpr const bool dumpFQ{false};
   constexpr const bool dumpArborescences{false};


=====================================
include/SalmonIndex.hpp
=====================================
@@ -246,4 +246,9 @@ private:
   std::string decoyNameHash256_;
 };
 
+// Convenience function to load an index
+std::unique_ptr<SalmonIndex>
+checkLoadIndex(const boost::filesystem::path& indexDirectory,
+               std::shared_ptr<spdlog::logger>& logger);
+
 #endif //__SALMON_INDEX_HPP
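
Editor's note on the ReadExperiment.hpp and SalmonIndex.hpp hunks above: together they move ownership of the index out of ReadExperiment. A loader returns a std::unique_ptr to the caller, and ReadExperiment keeps only a raw, non-owning pointer. The sketch below shows that ownership pattern with stand-in types. IndexSketch, loadIndexSketch, and ExperimentSketch are hypothetical names, and nothing here reproduces the real SalmonIndex API.

```cpp
#include <memory>
#include <string>

// Stand-in for SalmonIndex (hypothetical, for illustration only).
struct IndexSketch {
  std::string directory;
};

// Stand-in for a loader such as checkLoadIndex: the caller receives
// ownership of the loaded index through the returned unique_ptr.
std::unique_ptr<IndexSketch> loadIndexSketch(const std::string& dir) {
  std::unique_ptr<IndexSketch> idx(new IndexSketch{dir});
  return idx;
}

// Stand-in for ReadExperiment: it stores a non-owning pointer, so the
// owner of the index must outlive every experiment that refers to it.
class ExperimentSketch {
public:
  explicit ExperimentSketch(IndexSketch* index) : index_(index) {}
  IndexSketch* getIndex() const { return index_; }

private:
  IndexSketch* index_{nullptr};
};
```

With this layout the index can be loaded once and shared by several consumers through `owner.get()`. The trade-off is that lifetime is no longer enforced by the type system and has to be guaranteed by the caller.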


=====================================
include/SingleCellProtocols.hpp
=====================================
@@ -124,17 +124,19 @@ namespace alevin{
       DropSeq(): Rule(12, 8, BarcodeEnd::FIVE, 16777216){}
     };
 
-    struct InDrop : Rule{
-        //InDrop starts from 5end with variable
-        //length barcodes so provide the full
-        // length of the barcod eincluding w1.
-        // UMI length is 6
-      InDrop(): Rule(42, 6, BarcodeEnd::FIVE, 22347776){}
+    struct InDropV2 : Rule{
+        //InDropV2 starts from 5end with variable
+        //length barcodes where barcode1 varies from 8 to 11 bp
+        // followed by w1 sequence, 8 bp barcode2 and 6bp UMI
+      InDropV2(): Rule(20, 6, BarcodeEnd::FIVE, 22347776){}
 
       std::string w1;
+      std::size_t w1Length, maxHammingDist = 2, bc2Len = 8;
       void setW1(std::string& w1_){
         w1 = w1_;
+        w1Length = w1.length();
       }
+      std::size_t w1Pos = 0, bc2EndPos;
     };
 
     struct CITESeq : Rule{
@@ -179,7 +181,14 @@ namespace alevin{
     struct Custom : Rule{
       Custom() : Rule(0,0,BarcodeEnd::FIVE,0){}
     };
-    
+    struct SciSeq3 : Rule{
+      SciSeq3() : Rule(21, 8, BarcodeEnd::FIVE, 1073741824){}
+      std::string anchorSeq = "CAGAGC";
+      std::size_t anchorSeqLen = anchorSeq.length();
+      std::size_t anchorPos = 0;
+      u_int16_t const maxHairpinIndexLen = 10;
+      u_int16_t const rtIdxLen = 10; // rev transcription index length
+    };
 
     // for the new type of specification
     struct CustomGeometry {


=====================================
include/cuckoohash_map.hh
=====================================
@@ -113,7 +113,7 @@ public:
         maximum_hashpower_(NO_MAXIMUM_HASHPOWER),
         max_num_worker_threads_(0) {
     all_locks_.emplace_back(std::min(bucket_count(), size_type(kMaxNumLocks)),
-                            spinlock(), get_allocator());
+                            get_allocator());
   }
 
   /**
@@ -695,7 +695,7 @@ private:
 
   void add_locks_from_other(const cuckoohash_map &other) {
     locks_t &other_locks = other.get_current_locks();
-    all_locks_.emplace_back(other_locks.size(), spinlock(), get_allocator());
+    all_locks_.emplace_back(other_locks.size(), get_allocator());
     std::copy(other_locks.begin(), other_locks.end(),
               get_current_locks().begin());
   }
@@ -794,7 +794,7 @@ private:
   // under this lock. One can compute the size of the table by summing the
   // elem_counter over all locks.
   //
-  // - is_migrated: When resizing with cuckoo_fast_doulbe, we do not
+  // - is_migrated: When resizing with cuckoo_fast_double, we do not
   // immediately rehash elements from the old buckets array to the new one.
   // Instead, we'll mark all of the locks as not migrated. So anybody trying to
   // acquire the lock must also migrate the corresponding buckets if
@@ -1823,7 +1823,7 @@ private:
     }
 
     locks_t new_locks(std::min(size_type(kMaxNumLocks), new_bucket_count),
-                      spinlock(), get_allocator());
+                      get_allocator());
     assert(new_locks.size() > current_locks.size());
     std::copy(current_locks.begin(), current_locks.end(), new_locks.begin());
     for (spinlock &lock : new_locks) {
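
Editor's note on the cuckoohash_map.hh hunks above: they replace the (count, prototype, allocator) form of construction with the (count, allocator) form. The diff does not say why. The sketch below only illustrates the difference between the two standard std::vector constructors: the (count, allocator) form default-constructs each element in place and therefore also works for element types that cannot be copied. LockSketch and makeLocksSketch are hypothetical names, and the real spinlock type may well be copyable.

```cpp
#include <atomic>
#include <cstddef>
#include <list>
#include <memory>
#include <vector>

// Hypothetical lock type: default-constructible but not copyable.
struct LockSketch {
  LockSketch() : flag(false) {}
  LockSketch(const LockSketch&) = delete;
  LockSketch& operator=(const LockSketch&) = delete;
  std::atomic<bool> flag;
};

using LocksSketch = std::vector<LockSketch>;

// Builds a lock array the way the patched code does: emplace_back
// forwards (count, allocator) to the vector constructor, which
// default-constructs `count` elements without copying a prototype.
std::size_t makeLocksSketch(std::size_t count) {
  std::list<LocksSketch> allLocks;
  allLocks.emplace_back(count, std::allocator<LockSketch>());
  return allLocks.back().size();
}
```

Passing a prototype, as in `emplace_back(count, LockSketch(), alloc)`, would not compile for this type because that constructor copies the prototype into every slot.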


=====================================
include/json.hpp
=====================================
The diff for this file was not included because it is too large.

=====================================
scripts/fetchPufferfish.sh
=====================================
@@ -22,11 +22,11 @@ if [ -d ${INSTALL_DIR}/src/pufferfish ] ; then
     rm -fr ${INSTALL_DIR}/src/pufferfish
 fi
 
-SVER=salmon-v1.5.2
+SVER=salmon-v1.6.0
 #SVER=develop
 #SVER=sketch-mode
 
-EXPECTED_SHA256=86c7ff465d40b8184dca7f6afee693ad1db63be5bf63242161ea39d3507d6d25
+EXPECTED_SHA256=f71b3c08f254200fcdc2eb8fe3dcca8a8e9489e79ef5952a4958d8b9979831dc
 
 mkdir -p ${EXTERNAL_DIR}
 curl -k -L https://github.com/COMBINE-lab/pufferfish/archive/${SVER}.zip -o ${EXTERNAL_DIR}/pufferfish.zip


=====================================
src/Alevin.cpp
=====================================
@@ -19,6 +19,7 @@
 <HEADER
 **/
 
+#include <memory>
 #include <random>
 #include <algorithm>
 #include <atomic>
@@ -63,11 +64,11 @@
 
 // salmon includes
 #include "FastxParser.hpp"
+#include "ProgramOptionsGenerator.hpp"
 #include "SalmonConfig.hpp"
 #include "SalmonDefaults.hpp"
 #include "SalmonOpts.hpp"
 #include "SalmonUtils.hpp"
-#include "ProgramOptionsGenerator.hpp"
 
 using paired_parser_qual = fastx_parser::FastxParser<fastx_parser::ReadQualPair>;
 using single_parser = fastx_parser::FastxParser<fastx_parser::ReadSeq>;
@@ -78,20 +79,18 @@ namespace apt = alevin::protocols;
 namespace aut = alevin::utils;
 
 template <typename ProtocolT>
-int alevin_sc_align(AlevinOpts<ProtocolT>& aopt,
-                    SalmonOpts& sopt,
-                    boost::program_options::parsed_options& orderedOptions);
+int alevin_sc_align(AlevinOpts<ProtocolT>& aopt, SalmonOpts& sopt,
+                    boost::program_options::parsed_options& orderedOptions,
+                    std::unique_ptr<SalmonIndex>& salmonIndex);
 
 template <typename ProtocolT>
-int alevinQuant(AlevinOpts<ProtocolT>& aopt,
-                SalmonOpts& sopt,
-                SoftMapT& barcodeMap,
-                TrueBcsT& trueBarcodes,
+int alevinQuant(AlevinOpts<ProtocolT>& aopt, SalmonOpts& sopt,
+                SoftMapT& barcodeMap, TrueBcsT& trueBarcodes,
                 spp::sparse_hash_map<uint32_t, uint32_t>& txpToGeneMap,
                 spp::sparse_hash_map<std::string, uint32_t>& geneIdxMap,
                 boost::program_options::parsed_options& orderedOptions,
-                CFreqMapT& freqCounter,
-                size_t numLowConfidentBarcode);
+                CFreqMapT& freqCounter, size_t numLowConfidentBarcode,
+                std::unique_ptr<SalmonIndex>& salmonIndex);
 
 //colors for progress monitoring
 const char RESET_COLOR[] = "\x1b[0m";
@@ -835,7 +834,8 @@ void initiatePipeline(AlevinOpts<ProtocolT>& aopt,
                       boost::program_options::variables_map& vm,
                       std::string commentString, bool noTgMap,
                       std::vector<std::string> barcodeFiles,
-                      std::vector<std::string> readFiles){
+                      std::vector<std::string> readFiles,
+                      std::unique_ptr<SalmonIndex>& salmonIndex){
   bool isOptionsOk = aut::processAlevinOpts(aopt, sopt, noTgMap, vm);
   if (!isOptionsOk){
     aopt.jointLog->flush();
@@ -894,14 +894,14 @@ void initiatePipeline(AlevinOpts<ProtocolT>& aopt,
     // write out the cmd_info.json to make sure we have that
     boost::filesystem::path outputDirectory = vm["output"].as<std::string>();
     bool isWriteOk = aut::writeCmdInfo(outputDirectory / "cmd_info.json", orderedOptions);
+ 
     if(!isWriteOk){
       fmt::print(stderr, "Writing cmd_info.json in output directory failed.\nExiting now.");
       exit(1);
     }
 
     // do the actual mapping
-    auto rc = alevin_sc_align(aopt, sopt, orderedOptions);
-
+    auto rc = alevin_sc_align(aopt, sopt, orderedOptions, salmonIndex);
     if (rc == 0) {
       aopt.jointLog->info("sc-align successful.");
     } else {
@@ -949,7 +949,7 @@ void initiatePipeline(AlevinOpts<ProtocolT>& aopt,
     aopt.jointLog->info("Done with Barcode Processing; Moving to Quantify\n");
     alevinQuant(aopt, sopt, barcodeSoftMap, trueBarcodes,
                 txpToGeneMap, geneIdxMap, orderedOptions,
-                freqCounter, numLowConfidentBarcode);
+                freqCounter, numLowConfidentBarcode, salmonIndex);
   }
   else{
     boost::filesystem::path cmdInfoPath = vm["output"].as<std::string>();
@@ -962,7 +962,7 @@ void initiatePipeline(AlevinOpts<ProtocolT>& aopt,
   }
 }
 
-int salmonBarcoding(int argc, const char* argv[]) {
+int salmonBarcoding(int argc, const char* argv[], std::unique_ptr<SalmonIndex>& salmonIndex) {
   namespace bfs = boost::filesystem;
   namespace po = boost::program_options;
 
@@ -1022,7 +1022,7 @@ salmon-based processing of single-cell RNA-seq data..
 
     bool noTgMap {false};
     bool dropseq = vm["dropseq"].as<bool>();
-    bool indrop = vm["indrop"].as<bool>();
+    bool indropV2 = vm["indropV2"].as<bool>();
     bool citeseq = vm["citeseq"].as<bool>();
     bool chromV3 = vm["chromiumV3"].as<bool>();
     bool chrom = vm["chromium"].as<bool>();
@@ -1030,6 +1030,7 @@ salmon-based processing of single-cell RNA-seq data..
     bool celseq = vm["celseq"].as<bool>();
     bool celseq2 = vm["celseq2"].as<bool>();
     bool quartzseq2 = vm["quartzseq2"].as<bool>();
+    bool sciseq3 = vm["sciseq3"].as<bool>();
     bool custom_old =  vm.count("barcodeLength") and
                    vm.count("umiLength") and
                    vm.count("end");
@@ -1039,7 +1040,7 @@ salmon-based processing of single-cell RNA-seq data..
 
     uint8_t validate_num_protocols {0};
     if (dropseq) validate_num_protocols += 1;
-    if (indrop) validate_num_protocols += 1;
+    if (indropV2) validate_num_protocols += 1;
     if (citeseq) { validate_num_protocols += 1; noTgMap = true;}
     if (chromV3) validate_num_protocols += 1;
     if (chrom) validate_num_protocols += 1;
@@ -1047,6 +1048,7 @@ salmon-based processing of single-cell RNA-seq data..
     if (celseq) validate_num_protocols += 1;
     if (celseq2) validate_num_protocols += 1;
     if (quartzseq2) validate_num_protocols += 1;
+    if (sciseq3) validate_num_protocols += 1;
     if (custom) validate_num_protocols += 1;
 
     if ( validate_num_protocols != 1 ) {
@@ -1077,22 +1079,20 @@ salmon-based processing of single-cell RNA-seq data.
       //aopt.jointLog->warn("Using DropSeq Setting for Alevin");
       initiatePipeline(aopt, sopt, orderedOptions,
                        vm, commentString, noTgMap,
-                       barcodeFiles, readFiles);
+                       barcodeFiles, readFiles, salmonIndex);
     }
-    else if(indrop){
-      std::cout<<"Indrop get neighbors removed, please use other protocols";
-      exit(1);
+    else if(indropV2){
       if(vm.count("w1") != 0){
         std::string w1 = vm["w1"].as<std::string>();
-        AlevinOpts<apt::InDrop> aopt;
+        AlevinOpts<apt::InDropV2> aopt;
         aopt.protocol.setW1(w1);
-        //aopt.jointLog->warn("Using InDrop Setting for Alevin");
+        //aopt.jointLog->warn("Using InDropV2 Setting for Alevin");
         initiatePipeline(aopt, sopt, orderedOptions,
                          vm, commentString, noTgMap,
-                         barcodeFiles, readFiles);
+                         barcodeFiles, readFiles, salmonIndex);
       }
       else{
-        fmt::print(stderr, "ERROR: indrop needs w1 flag too.\n Exiting Now");
+        fmt::print(stderr, "ERROR: indropV2 needs w1 flag too.\n Exiting Now");
         exit(1);
       }
     }
@@ -1102,10 +1102,10 @@ salmon-based processing of single-cell RNA-seq data.
         aopt.protocol.setFeatureLength(vm["featureLength"].as<size_t>());
         aopt.protocol.setFeatureStart(vm["featureStart"].as<size_t>());
 
-        //aopt.jointLog->warn("Using InDrop Setting for Alevin");
+        //aopt.jointLog->warn("Using InDropV2 Setting for Alevin");
         initiatePipeline(aopt, sopt, orderedOptions,
                          vm, commentString, noTgMap,
-                         barcodeFiles, readFiles);
+                         barcodeFiles, readFiles, salmonIndex);
       }
       else{
         fmt::print(stderr, "ERROR: citeseq needs featureStart and featureLength flag too.\n Exiting Now");
@@ -1117,54 +1117,60 @@ salmon-based processing of single-cell RNA-seq data.
       //aopt.jointLog->warn("Using 10x v3 Setting for Alevin");
       initiatePipeline(aopt, sopt, orderedOptions,
                        vm, commentString, noTgMap,
-                       barcodeFiles, readFiles);
+                       barcodeFiles, readFiles, salmonIndex);
     }
     else if(chrom){
       AlevinOpts<apt::Chromium> aopt;
       //aopt.jointLog->warn("Using 10x v2 Setting for Alevin");
       initiatePipeline(aopt, sopt, orderedOptions,
                        vm, commentString, noTgMap,
-                       barcodeFiles, readFiles);
+                       barcodeFiles, readFiles, salmonIndex);
     }
     else if(gemcode){
       AlevinOpts<apt::Gemcode> aopt;
       //aopt.jointLog->warn("Using 10x v1 Setting for Alevin");
       initiatePipeline(aopt, sopt, orderedOptions,
                        vm, commentString, noTgMap,
-                       unmateFiles, readFiles);
+                       unmateFiles, readFiles, salmonIndex);
     }
     else if(celseq){
       AlevinOpts<apt::CELSeq> aopt;
       //aopt.jointLog->warn("Using CEL-Seq Setting for Alevin");
       initiatePipeline(aopt, sopt, orderedOptions,
                        vm, commentString, noTgMap,
-                       barcodeFiles, readFiles);
+                       barcodeFiles, readFiles, salmonIndex);
     }
     else if(celseq2){
       AlevinOpts<apt::CELSeq2> aopt;
       //aopt.jointLog->warn("Using CEL-Seq2 Setting for Alevin");
       initiatePipeline(aopt, sopt, orderedOptions,
                        vm, commentString, noTgMap,
-                       barcodeFiles, readFiles);
+                       barcodeFiles, readFiles, salmonIndex);
     }
     else if(quartzseq2){
       AlevinOpts<apt::QuartzSeq2> aopt;
       //aopt.jointLog->warn("Using Quartz-Seq2 Setting for Alevin");
       initiatePipeline(aopt, sopt, orderedOptions,
                        vm, commentString, noTgMap,
-                       barcodeFiles, readFiles);
+                       barcodeFiles, readFiles, salmonIndex);
+    } else if(sciseq3){
+      AlevinOpts<apt::SciSeq3> aopt;
+      //aopt.jointLog->warn("Using Sci-Seq3 Setting for Alevin");
+      initiatePipeline(aopt, sopt, orderedOptions,
+                       vm, commentString, noTgMap,
+                       barcodeFiles, readFiles, salmonIndex);
     } else if (custom_old) {
       AlevinOpts<apt::Custom> aopt;
       //aopt.jointLog->warn("Using Custom Setting for Alevin");
       initiatePipeline(aopt, sopt, orderedOptions,
                        vm, commentString, noTgMap,
-                       barcodeFiles, readFiles);
+                       barcodeFiles, readFiles, salmonIndex);
     } else if (custom_new) {
       AlevinOpts<apt::CustomGeometry> aopt;
       //aopt.jointLog->warn("Using Custom Setting for Alevin");
       initiatePipeline(aopt, sopt, orderedOptions,
                        vm, commentString, noTgMap,
-                       barcodeFiles, readFiles);
+                       barcodeFiles, readFiles, salmonIndex);
     }
 
   } catch (po::error& e) {
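
Editor's note on the Alevin.cpp hunks above: salmonBarcoding counts how many protocol flags are set (now including indropV2 and sciseq3) and proceeds only when exactly one is set. The same check can be written compactly as below. This is a sketch under the assumption that each flag is a plain bool, and the name exactlyOneProtocolSketch is hypothetical.

```cpp
#include <algorithm>
#include <initializer_list>

// Hypothetical helper (not part of salmon): true when exactly one of
// the given protocol flags is set.
bool exactlyOneProtocolSketch(std::initializer_list<bool> flags) {
  return std::count(flags.begin(), flags.end(), true) == 1;
}
```

For example, `exactlyOneProtocolSketch({dropseq, indropV2, sciseq3})` would be true only if a single one of those three flags were set. The upstream code additionally sets noTgMap as a side effect when citeseq is selected, which this sketch leaves out.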


=====================================
src/AlevinHash.cpp
=====================================
@@ -294,7 +294,7 @@ int salmonHashQuantify(AlevinOpts<apt::CITESeq>& aopt,
                        bfs::path& outputDirectory,
                        CFreqMapT& freqCounter);
 template
-int salmonHashQuantify(AlevinOpts<apt::InDrop>& aopt,
+int salmonHashQuantify(AlevinOpts<apt::InDropV2>& aopt,
                        bfs::path& outputDirectory,
                        CFreqMapT& freqCounter);
 template
@@ -310,6 +310,10 @@ int salmonHashQuantify(AlevinOpts<apt::QuartzSeq2>& aopt,
                        bfs::path& outputDirectory,
                        CFreqMapT& freqCounter);
 template
+int salmonHashQuantify(AlevinOpts<apt::SciSeq3>& aopt,
+                       bfs::path& outputDirectory,
+                       CFreqMapT& freqCounter);
+template
 int salmonHashQuantify(AlevinOpts<apt::Custom>& aopt,
                        bfs::path& outputDirectory,
                        CFreqMapT& freqCounter);


=====================================
src/AlevinUtils.cpp
=====================================
@@ -101,6 +101,14 @@ namespace alevin {
       return &seq2;
     }
     template <>
+    std::string*  getReadSequence(apt::SciSeq3& protocol,
+                         std::string& seq,
+                         std::string& seq2,
+                         std::string& subseq){
+      (void)seq;
+      return &seq2;
+    }
+    template <>
     std::string*  getReadSequence(apt::Custom& protocol,
                          std::string& seq,
                          std::string& seq2,
@@ -126,12 +134,12 @@ namespace alevin {
       return &seq2;
     }
     template <>
-    std::string*  getReadSequence(apt::InDrop& protocol,
+    std::string*  getReadSequence(apt::InDropV2& protocol,
                          std::string& seq,
                          std::string& seq2,
                          std::string& subseq){
-      (void)seq;
-      return &seq2;
+      (void)seq2;
+      return &seq;
     }
     // end of read extraction
 
@@ -223,6 +231,16 @@ namespace alevin {
         (umi.assign(read, 0, pt.umiLength), true) : false;
     }
     template <>
+    bool extractUMI<apt::SciSeq3>(std::string& read,
+                                     std::string& read2,
+                                     apt::SciSeq3& pt,
+                                     std::string& umi){
+      (void)read2;
+      return (read.length() >= pt.barcodeLength + pt.umiLength && 
+        pt.anchorPos != std::string::npos) ? // for the rare case if barcode has one N and thus gets recovered
+          (umi.assign(read, pt.anchorPos + pt.anchorSeqLen, pt.umiLength), true) : false;
+    }
+    template <>
     bool extractUMI<apt::CELSeq>(std::string& read,
                                  std::string& read2,
                                  apt::CELSeq& pt,
@@ -233,14 +251,14 @@ namespace alevin {
       return true;
     }
     template <>
-    bool extractUMI<apt::InDrop>(std::string& read,
+    bool extractUMI<apt::InDropV2>(std::string& read,
                                  std::string& read2,
-                                 apt::InDrop& pt,
+                                 apt::InDropV2& pt,
                                  std::string& umi){
       (void)read;
-      (void)read2;
-      std::cout<<"Incorrect call for umi extract";
-      exit(1);
+       return (read2.length() >= pt.w1Length + pt.barcodeLength + pt.umiLength) ?
+        (umi.assign(read2, pt.bc2EndPos, pt.umiLength), true) : false;
+      return true;
     }
 
     template <>
@@ -289,6 +307,27 @@ namespace alevin {
         (bc.assign(read, 0, pt.barcodeLength), true) : false;
     }
     template <>
+    bool extractBarcode<apt::SciSeq3>(std::string& read,
+                                         std::string& read2,
+                                          apt::SciSeq3& pt,
+                                          std::string& bc){
+      (void)read2;
+      pt.anchorPos = read.find(pt.anchorSeq);
+      if (pt.anchorPos != std::string::npos && ( pt.anchorPos == pt.maxHairpinIndexLen || pt.anchorPos == pt.maxHairpinIndexLen -1) // only 2 possible values of pt.anchorPos
+         && read.length() >= pt.barcodeLength + pt.umiLength + pt.anchorSeqLen) {
+           std::string bcAssign = read.substr(0,pt.anchorPos) + read.substr(pt.anchorPos + pt.anchorSeqLen + pt.umiLength, pt.rtIdxLen);
+        if (pt.anchorPos < pt.maxHairpinIndexLen) { // hairpin index can be 9 or 10 bp
+           bcAssign += "AC";
+        } else {
+          bcAssign += "A";
+        }
+        bc.assign(bcAssign);
+        return true;
+      } else {
+        return false;
+      }
+    }
+    template <>
     bool extractBarcode<apt::Custom>(std::string& read,
                                      std::string& read2,
                                      apt::Custom& pt,
@@ -339,21 +378,49 @@ namespace alevin {
         (bc.assign(read, pt.umiLength, pt.barcodeLength), true) : false;
     }
     template <>
-    bool extractBarcode<apt::InDrop>(std::string& read, 
+    bool extractBarcode<apt::InDropV2>(std::string& read, 
                                      std::string& read2, 
-                                     apt::InDrop& pt, std::string& bc){
-      (void)read2;
-      std::string::size_type index = read.find(pt.w1);
-      if (index == std::string::npos){
-        return false;
+                                     apt::InDropV2& pt, std::string& bc){
+      (void)read;
+      if(read2.length() >= (pt.w1Length + pt.barcodeLength + pt.umiLength)) {
+      pt.w1Pos = read2.find(pt.w1);
+      if (pt.w1Pos == std::string::npos){
+        bool found = false;
+        for( int i = 8; i <= 11; i++){
+          if (hammingDistance(pt.w1, read2.substr(i,pt.w1Length)) <= pt.maxHammingDist) {
+            pt.w1Pos = i;
+            found = true;
+            break;
+          }
+        }
+        if (!found) {return false;}
       }
-      bc = read.substr(0, index);
-      if(bc.size()<8 or bc.size()>12){
+      if(pt.w1Pos < 8 or pt.w1Pos > 11){
         return false;
       }
+      bc = read2.substr(0, pt.w1Pos);
       uint32_t offset = bc.size()+pt.w1.size();
-      bc += read.substr(offset, offset+8);
+      bc += read2.substr(offset, pt.bc2Len);
+      switch (pt.barcodeLength - bc.size())
+      {
+      case 1:
+        bc += "A";
+        break;
+      case 2:
+        bc += "AT";
+        break;
+      case 3:
+        bc += "AAG";
+        break;
+      case 4:
+        bc += "AAAC";
+        break;
+      }
+      pt.bc2EndPos = offset+pt.bc2Len;
       return true;
+      } else {
+        return false;
+      }
     }
 
     void getIndelNeighbors(
@@ -437,6 +504,16 @@ namespace alevin {
                         neighbors);
     }
 
+    unsigned int hammingDistance(const std::string s1, const std::string s2){
+      if(s1.size() != s2.size()){
+        throw std::invalid_argument("Strings have different lengths, can't compute hamming distance");
+      }
+
+      // compute dot product for all positions, start with 0 and add if the values are not equal
+      return std::inner_product(s1.begin(),s1.end(),s2.begin(), 0, std::plus<unsigned int>(),
+        std::not2(std::equal_to<std::string::value_type>()));
+    }
+
     void getTxpToGeneMap(spp::sparse_hash_map<uint32_t, uint32_t>& txpToGeneMap,
                          spp::sparse_hash_map<std::string, uint32_t>& geneIdxMap,
                          const std::string& t2gFileName,
@@ -1274,7 +1351,7 @@ namespace alevin {
                            SalmonOpts& sopt, bool noTgMap,
                            boost::program_options::variables_map& vm);
     template
-    bool processAlevinOpts(AlevinOpts<apt::InDrop>& aopt,
+    bool processAlevinOpts(AlevinOpts<apt::InDropV2>& aopt,
                            SalmonOpts& sopt, bool noTgMap,
                            boost::program_options::variables_map& vm);
     template
@@ -1306,6 +1383,10 @@ namespace alevin {
                            SalmonOpts& sopt, bool noTgMap,
                            boost::program_options::variables_map& vm);
     template
+    bool processAlevinOpts(AlevinOpts<apt::SciSeq3>& aopt,
+                           SalmonOpts& sopt, bool noTgMap,
+                           boost::program_options::variables_map& vm);
+    template
     bool processAlevinOpts(AlevinOpts<apt::QuartzSeq2>& aopt,
                            SalmonOpts& sopt, bool noTgMap,
                            boost::program_options::variables_map& vm);
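
The hammingDistance helper added in the src/AlevinUtils.cpp hunks above counts mismatching positions with std::inner_product, and the InDropV2 barcode extraction falls back to it when the w1 spacer is not found verbatim. The following is a minimal stand-alone sketch of that idea, not the upstream code. The names hammingDistanceSketch and findSpacerSketch, the parameter names, and the example spacer are assumptions made for illustration. The sketch uses std::not_equal_to in place of std::not2, because std::not2 was deprecated in C++17 and removed in C++20.

```cpp
#include <cstddef>
#include <functional>
#include <numeric>
#include <stdexcept>
#include <string>

// Hypothetical helper: number of positions at which two equal-length strings differ.
unsigned int hammingDistanceSketch(const std::string& s1, const std::string& s2) {
  if (s1.size() != s2.size()) {
    throw std::invalid_argument("strings have different lengths");
  }
  // inner_product with (plus, not_equal_to): add 1 for every mismatching pair.
  return std::inner_product(s1.begin(), s1.end(), s2.begin(), 0u,
                            std::plus<unsigned int>(),
                            std::not_equal_to<char>());
}

// Hypothetical helper: try an exact match first; if that fails, accept the first
// start offset in [lo, hi] whose window is within maxMismatch of the spacer.
// Returns std::string::npos when nothing matches. As in the diff, the caller is
// still expected to check that an exact-match position lies in the allowed range.
std::size_t findSpacerSketch(const std::string& read, const std::string& spacer,
                             std::size_t lo, std::size_t hi,
                             unsigned int maxMismatch) {
  std::size_t pos = read.find(spacer);
  if (pos != std::string::npos) {
    return pos;
  }
  for (std::size_t i = lo; i <= hi; ++i) {
    if (i + spacer.size() > read.size()) {
      break;  // window would run past the end of the read
    }
    if (hammingDistanceSketch(spacer, read.substr(i, spacer.size())) <= maxMismatch) {
      return i;
    }
  }
  return std::string::npos;
}
```

Called as findSpacerSketch(read2, w1, 8, 11, maxMismatch), this would correspond roughly to the search window used for InDropV2 in the diff. The bounds check inside the loop is an addition of the sketch; the diff relies on a length check made before the loop.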


=====================================
src/BuildSalmonIndex.cpp
=====================================
@@ -43,7 +43,7 @@
 // http://stackoverflow.com/questions/108318/whats-the-simplest-way-to-test-whether-a-number-is-a-power-of-2-in-c
 bool isPowerOfTwo(uint32_t n) { return (n > 0 and (n & (n - 1)) == 0); }
 
-int salmonIndex(int argc, const char* argv[]) {
+int salmonIndex(int argc, const char* argv[], std::unique_ptr<SalmonIndex>& /* salmonIndex */) {
 
   using std::string;
   namespace bfs = boost::filesystem;
@@ -256,3 +256,25 @@ Creates a salmon index.
   }
   return ret;
 }
+
+std::unique_ptr<SalmonIndex> checkLoadIndex(const boost::filesystem::path& indexDirectory, std::shared_ptr<spdlog::logger>& logger) {
+  // ==== Figure out the index type
+  boost::filesystem::path versionPath =
+    indexDirectory / "versionInfo.json";
+  SalmonIndexVersionInfo versionInfo;
+  versionInfo.load(versionPath);
+  if (versionInfo.indexVersion() == 0) {
+    fmt::MemoryWriter infostr;
+    infostr
+      << "Error: The index version file " << versionPath.string()
+      << " doesn't seem to exist.  Please try re-building the salmon "
+      "index.";
+    throw std::invalid_argument(infostr.str());
+  }
+  // Check index version compatibility here
+  auto indexType = versionInfo.indexType();
+  // ==== Figure out the index type
+  std::unique_ptr<SalmonIndex> res(new SalmonIndex(logger, indexType));
+  res->load(indexDirectory);
+  return res;
+}
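
The checkLoadIndex function added above returns a std::unique_ptr<SalmonIndex>, and elsewhere in this commit callers only invoke it when the pointer they were handed is still empty. A minimal sketch of that load-once pattern follows. It is an illustration under assumptions, not the salmon API: LazyIndex, loadLazyIndex, ensureLazyIndex and the loadCount parameter are hypothetical names.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an index object; it only remembers its directory.
struct LazyIndex {
  std::string dir;
};

// Hypothetical loader. loadCount exists only so that the number of actual
// loads can be observed from outside.
std::unique_ptr<LazyIndex> loadLazyIndex(const std::string& dir, int& loadCount) {
  if (dir.empty()) {
    throw std::invalid_argument("index directory is empty");
  }
  ++loadCount;
  return std::unique_ptr<LazyIndex>(new LazyIndex{dir});
}

// Load the index only if the caller did not already supply one.
LazyIndex& ensureLazyIndex(std::unique_ptr<LazyIndex>& index,
                           const std::string& dir, int& loadCount) {
  if (!index) {
    index = loadLazyIndex(dir, loadCount);
  }
  return *index;
}
```

Passing the std::unique_ptr by reference lets a caller that already holds a loaded index reuse it, while a caller that passes an empty pointer gets it filled in on first use.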


=====================================
src/CollapsedCellOptimizer.cpp
=====================================
@@ -1438,7 +1438,7 @@ template
 bool CollapsedCellOptimizer::optimize(EqMapT& fullEqMap,
                                       spp::sparse_hash_map<uint32_t, uint32_t>& txpToGeneMap,
                                       spp::sparse_hash_map<std::string, uint32_t>& geneIdxMap,
-                                      AlevinOpts<apt::InDrop>& aopt,
+                                      AlevinOpts<apt::InDropV2>& aopt,
                                       GZipWriter& gzw,
                                       std::vector<std::string>& trueBarcodes,
                                       std::vector<uint32_t>& umiCount,
@@ -1516,6 +1516,16 @@ bool CollapsedCellOptimizer::optimize(EqMapT& fullEqMap,
                                       CFreqMapT& freqCounter,
                                       size_t numLowConfidentBarcode);
 
+template
+bool CollapsedCellOptimizer::optimize(EqMapT& fullEqMap,
+                                      spp::sparse_hash_map<uint32_t, uint32_t>& txpToGeneMap,
+                                      spp::sparse_hash_map<std::string, uint32_t>& geneIdxMap,
+                                      AlevinOpts<apt::SciSeq3>& aopt,
+                                      GZipWriter& gzw,
+                                      std::vector<std::string>& trueBarcodes,
+                                      std::vector<uint32_t>& umiCount,
+                                      CFreqMapT& freqCounter,
+                                      size_t numLowConfidentBarcode);
 
 template
 bool CollapsedCellOptimizer::optimize(EqMapT& fullEqMap,


=====================================
src/GZipWriter.cpp
=====================================
@@ -1865,8 +1865,8 @@ bool GZipWriter::writeEquivCounts<SCExpT, apt::CITESeq>(
                                                         const AlevinOpts<apt::CITESeq>& aopts,
                                                         SCExpT& readExp);
 template
-bool GZipWriter::writeEquivCounts<SCExpT, apt::InDrop>(
-                                                       const AlevinOpts<apt::InDrop>& aopts,
+bool GZipWriter::writeEquivCounts<SCExpT, apt::InDropV2>(
+                                                       const AlevinOpts<apt::InDropV2>& aopts,
                                                        SCExpT& readExp);
 template
 bool GZipWriter::writeEquivCounts<SCExpT, apt::ChromiumV3>(
@@ -1893,6 +1893,10 @@ bool GZipWriter::writeEquivCounts<SCExpT, apt::QuartzSeq2>(
                                                         const AlevinOpts<apt::QuartzSeq2>& aopts,
                                                         SCExpT& readExp);
 template
+bool GZipWriter::writeEquivCounts<SCExpT, apt::SciSeq3>(
+                                                         const AlevinOpts<apt::SciSeq3>& aopts,
+                                                         SCExpT& readExp);
+template
 bool GZipWriter::writeEquivCounts<SCExpT, apt::Custom>(
                                                        const AlevinOpts<apt::Custom>& aopts,
                                                        SCExpT& readExp);
@@ -1907,7 +1911,7 @@ template bool
 GZipWriter::writeMetaAlevin<apt::CITESeq>(const AlevinOpts<apt::CITESeq>& opts,
                                           boost::filesystem::path aux_dir);
 template bool
-GZipWriter::writeMetaAlevin<apt::InDrop>(const AlevinOpts<apt::InDrop>& opts,
+GZipWriter::writeMetaAlevin<apt::InDropV2>(const AlevinOpts<apt::InDropV2>& opts,
                                          boost::filesystem::path aux_dir);
 template bool
 GZipWriter::writeMetaAlevin<apt::Chromium>(const AlevinOpts<apt::Chromium>& opts,
@@ -1925,6 +1929,9 @@ template bool
 GZipWriter::writeMetaAlevin<apt::QuartzSeq2>(const AlevinOpts<apt::QuartzSeq2>& opts,
                                              boost::filesystem::path aux_dir);
 template bool
+GZipWriter::writeMetaAlevin<apt::SciSeq3>(const AlevinOpts<apt::SciSeq3>& opts,
+                                           boost::filesystem::path aux_dir);
+template bool
 GZipWriter::writeMetaAlevin<apt::Custom>(const AlevinOpts<apt::Custom>& opts,
                                          boost::filesystem::path aux_dir);
 template bool


=====================================
src/ProgramOptionsGenerator.cpp
=====================================
@@ -335,8 +335,8 @@ namespace salmon {
                                        "alevin-developer Options");
     alevindevs.add_options()
       (
-       "indrop", po::bool_switch()->default_value(alevin::defaults::isInDrop),
-       "Use inDrop (not extensively tested) Single Cell protocol for the library. must specify w1 too.")
+       "indropV2", po::bool_switch()->default_value(alevin::defaults::isInDrop),
+       "Use inDropV2 Single Cell protocol for the library. Must specify w1 too.")
       (
        "w1", po::value<std::string>(),
        "Must be used in conjunction with inDrop;")
@@ -413,6 +413,9 @@ namespace salmon {
       (
        "quartzseq2", po::bool_switch()->default_value(alevin::defaults::isQuartzSeq2),
        "Use Quartz-Seq2 v3.2 Single Cell protocol for the library assumes 15 length barcode and 8 length UMI.")
+      (
+       "sciseq3", po::bool_switch()->default_value(alevin::defaults::isSciSeq3),
+       "Use sci-RNA-seq3 protocol for the library.")
       (
        "whitelist", po::value<std::string>(),
        "File containing white-list barcodes")


=====================================
src/Salmon.cpp
=====================================
@@ -46,8 +46,9 @@
 #include "GenomicFeature.hpp"
 #include "SalmonConfig.hpp"
 #include "VersionChecker.hpp"
+#include "SalmonIndex.hpp"
 
-int help(const std::vector<std::string>& /*opts*/) { 
+int help(const std::vector<std::string>& /*opts*/) {
   fmt::MemoryWriter helpMsg;
   helpMsg.write("salmon v{}\n\n", salmon::version);
   helpMsg.write(
@@ -90,10 +91,12 @@ int dualModeMessage() {
   return 0;
 }
 
+typedef std::function<int(int, const char*[], std::unique_ptr<SalmonIndex>& index)> SubCmdType;
+
 /**
  * Bonus!
  */
-int salmonSwim(int /*argc*/, const char* /*argv*/[]) {
+int salmonSwim(int /*argc*/, const char* /*argv*/[], std::unique_ptr<SalmonIndex>& /*index*/) {
 
   std::cout << R"(
     _____       __
@@ -144,17 +147,18 @@ bibtex:
 )";
 }
 
-int salmonIndex(int argc, const char* argv[]);
-int salmonQuantify(int argc, const char* argv[]);
-int salmonAlignmentQuantify(int argc, const char* argv[]);
+int salmonIndex(int argc, const char* argv[], std::unique_ptr<SalmonIndex>& index);
+int salmonQuantify(int argc, const char* argv[], std::unique_ptr<SalmonIndex>& index);
+int salmonAlignmentQuantify(int argc, const char* argv[], std::unique_ptr<SalmonIndex>& index);
+int salmonAlignmentDualMode(int argc, const char* argv[], std::unique_ptr<SalmonIndex>& index);
 // TODO : PF_INTEGRATION
-int salmonBarcoding(int argc, const char* argv[]);
-int salmonQuantMerge(int argc, const char* argv[]);
+int salmonBarcoding(int argc, const char* argv[], std::unique_ptr<SalmonIndex>& index);
+int salmonQuantMerge(int argc, const char* argv[],
+                     std::unique_ptr<SalmonIndex>& index);
 
 bool verbose = false;
 
-int main(int argc, char* argv[]) {
-  show_backtrace();
+int main(int argc, const char* argv[]) {
   using std::string;
   namespace po = boost::program_options;
   std::setlocale(LC_ALL, "en_US.UTF-8");
@@ -235,9 +239,9 @@ int main(int argc, char* argv[]) {
       opts.insert(opts.begin(), "--help");
     }
 
-    std::unordered_map<string, std::function<int(int, const char* [])>> cmds(
+    std::unordered_map<string, SubCmdType> cmds(
         {{"index", salmonIndex},
-         {"quant", salmonQuantify},
+         {"quant", salmonAlignmentDualMode},
          {"quantmerge", salmonQuantMerge},
          // TODO : PF_INTEGRATION
          {"alevin", salmonBarcoding},
@@ -251,64 +255,24 @@ int main(int argc, char* argv[]) {
     std::copy_n( &argv[topLevelArgc], argc-topLevelArgc, &argv2[1] );
     */
 
-    int32_t subCommandArgc = opts.size() + 1;
-    std::unique_ptr<const char* []> argv2(new const char*[subCommandArgc]);
+    std::unique_ptr<SalmonIndex> preloadedIndex;
+
+    int32_t nargc = opts.size() + 1;
+    std::unique_ptr<const char* []> argv2(new const char*[nargc]);
     argv2[0] = argv[0];
-    for (int32_t i = 0; i < subCommandArgc - 1; ++i) {
+    for (int32_t i = 0; i < nargc - 1; ++i) {
       argv2[i + 1] = opts[i].c_str();
     }
+    const char** nargv = argv2.get();
 
-    auto cmdMain = cmds.find(cmd);
-    if (cmdMain == cmds.end()) {
-      // help(subCommandArgc, argv2);
-      return help(opts);
-    } else {
-      // If the command is quant; determine whether
-      // we're quantifying with raw sequences or alignments
-      if (cmdMain->first == "quant") {
-
-        if (subCommandArgc < 2) {
-          return dualModeMessage();
-        }
-        // detect mode-specific help request
-        if (strncmp(argv2[1], "--help-alignment", 16) == 0) {
-          std::vector<char> helpStr{'-', '-', 'h', 'e', 'l', 'p', '\0'};
-          const char* helpArgv[] = {argv[0], &helpStr[0]};
-          return salmonAlignmentQuantify(2, helpArgv);
-        } else if (strncmp(argv2[1], "--help-reads", 12) == 0) {
-          std::vector<char> helpStr{'-', '-', 'h', 'e', 'l', 'p', '\0'};
-          const char* helpArgv[] = {argv[0], &helpStr[0]};
-          return salmonQuantify(2, helpArgv);
-        }
-
-        // detect general help request
-        if (strncmp(argv2[1], "--help", 6) == 0 or
-            strncmp(argv2[1], "-h", 2) == 0) {
-          return dualModeMessage();
-        }
-
-        // otherwise, detect and dispatch the correct mode
-        bool useSalmonAlign{false};
-        for (int32_t i = 0; i < subCommandArgc; ++i) {
-          if (strncmp(argv2[i], "-a", 2) == 0 or
-              strncmp(argv2[i], "-e", 2) == 0 or
-              strncmp(argv2[i], "--alignments", 12) == 0 or
-              strncmp(argv2[i], "--eqclasses", 11) == 0 or
-              strcmp(argv2[i], "--ont") == 0) {
-            useSalmonAlign = true;
-            break;
-          }
-        }
-        if (useSalmonAlign) {
-          return salmonAlignmentQuantify(subCommandArgc, argv2.get());
-        } else {
-          return salmonQuantify(subCommandArgc, argv2.get());
-        }
-      } else {
-        return cmdMain->second(subCommandArgc, argv2.get());
+    while(true) {
+      auto cmdMain = cmds.find(cmd);
+      if (cmdMain == cmds.end()) {
+        // help(subCommandArgc, argv2);
+        return help(opts);
       }
+      return cmdMain->second(nargc, nargv, preloadedIndex);
     }
-
   } catch (po::error& e) {
     std::cerr << "Program Option Error (main) : [" << e.what()
               << "].\n Exiting.\n";
@@ -321,3 +285,41 @@ int main(int argc, char* argv[]) {
 
   return 0;
 }
+
+int salmonAlignmentDualMode(int argc, const char* argv[], std::unique_ptr<SalmonIndex>& index) {
+  // If the command is quant; determine whether
+  // we're quantifying with raw sequences or alignments
+  if (argc < 2) {
+    return dualModeMessage();
+  }
+  // detect mode-specific help request
+  if (strncmp(argv[1], "--help-alignment", 16) == 0) {
+    const char* helpArgv[] = {argv[0], "--help", nullptr};
+    return salmonAlignmentQuantify(2, helpArgv, index);
+  } else if (strncmp(argv[1], "--help-reads", 12) == 0) {
+    const char* helpArgv[] = {argv[0], "--help", nullptr};
+    return salmonQuantify(2, helpArgv, index);
+  }
+
+  // detect general help request
+  if (strncmp(argv[1], "--help", 6) == 0 or strncmp(argv[1], "-h", 2) == 0) {
+    return dualModeMessage();
+  }
+
+  // otherwise, detect and dispatch the correct mode
+  bool useSalmonAlign{false};
+  for (int i = 0; i < argc; ++i) {
+    if (strncmp(argv[i], "-a", 2) == 0 or
+        strncmp(argv[i], "-e", 2) == 0 or
+        strncmp(argv[i], "--alignments", 12) == 0 or
+        strncmp(argv[i], "--eqclasses", 11) == 0) {
+      useSalmonAlign = true;
+      break;
+    }
+  }
+  if (useSalmonAlign) {
+    return salmonAlignmentQuantify(argc, argv, index);
+  } else {
+    return salmonQuantify(argc, argv, index);
+  }
+}
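
The src/Salmon.cpp changes above move the "quant" mode detection out of main into salmonAlignmentDualMode, so that every sub-command shares one signature that carries a reference to a possibly preloaded index. The sketch below illustrates that dispatch shape in isolation. It is hypothetical and not the upstream code: IndexStub, SubCmdSketch and the *Sketch functions are assumed names, and only two of the flags from the diff are checked. As in the diff, strncmp performs a prefix match, so any argument that merely starts with "-a" selects the alignment path.

```cpp
#include <cstring>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>

struct IndexStub {};  // hypothetical stand-in for the shared index type

// One signature for every sub-command, including the shared index reference.
using SubCmdSketch =
    std::function<int(int, const char*[], std::unique_ptr<IndexStub>&)>;

int readsModeSketch(int, const char*[], std::unique_ptr<IndexStub>&) { return 1; }
int alignModeSketch(int, const char*[], std::unique_ptr<IndexStub>&) { return 2; }

// Pick the alignment path if any argument starts with one of the flags below.
int dualModeSketch(int argc, const char* argv[], std::unique_ptr<IndexStub>& index) {
  for (int i = 1; i < argc; ++i) {
    if (std::strncmp(argv[i], "-a", 2) == 0 ||
        std::strncmp(argv[i], "--alignments", 12) == 0) {
      return alignModeSketch(argc, argv, index);
    }
  }
  return readsModeSketch(argc, argv, index);
}

// Look the sub-command up in a table; -1 stands in for "print help".
int dispatchSketch(const std::string& cmd, int argc, const char* argv[],
                   std::unique_ptr<IndexStub>& index) {
  static const std::unordered_map<std::string, SubCmdSketch> cmds{
      {"quant", dualModeSketch}};
  auto it = cmds.find(cmd);
  if (it == cmds.end()) {
    return -1;
  }
  return it->second(argc, argv, index);
}
```

With this shape the mode detection is an ordinary table entry, so main does not need a special case for "quant".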


=====================================
src/SalmonAlevin.cpp
=====================================
@@ -28,6 +28,7 @@
 #include <functional>
 #include <iterator>
 #include <map>
+#include <memory>
 #include <mutex>
 #include <queue>
 #include <random>
@@ -633,6 +634,7 @@ void process_reads_sc_sketch(paired_parser* parser, ReadExperimentT& readExp, Re
     	extraBAMtags.reserve(reserveSize);
     }
 
+    auto localProtocol = alevinOpts.protocol;
     for (size_t i = 0; i < rangeSize; ++i) { // For all the read in this batch
       auto& rp = rg[i];
       readLenLeft = rp.first.seq.length();
@@ -661,7 +663,7 @@ void process_reads_sc_sketch(paired_parser* parser, ReadExperimentT& readExp, Re
       //barcode.clear();
       nonstd::optional<uint32_t> barcodeIdx;
       extraBAMtags.clear();
-      bool seqOk;
+      bool seqOk = false;
 
       // keep track of the *least* frequently 
       // occurring hit in this fragment to consider 
@@ -671,19 +673,30 @@ void process_reads_sc_sketch(paired_parser* parser, ReadExperimentT& readExp, Re
 
       if (alevinOpts.protocol.end == bcEnd::FIVE ||
           alevinOpts.protocol.end == bcEnd::THREE){
-        bool extracted_bc = aut::extractBarcode(rp.first.seq, rp.second.seq, alevinOpts.protocol, barcode);
-        seqOk = (extracted_bc) ?
-          aut::sequenceCheck(barcode, Sequence::BARCODE) : false;
-
-        if (not seqOk){
-          bool recovered = aut::recoverBarcode(barcode);
-          if (recovered) { seqOk = true; }
+        // If the barcode sequence could be extracted, then this is set to true,
+        // but the barcode sequence itself may still be invalid (e.g. contain `N` characters).
+        // However, if extracted_barcode is false here, there is no hope to even recover the
+        // barcode and we shouldn't attempt it.
+        bool extracted_barcode = aut::extractBarcode(rp.first.seq, rp.second.seq, localProtocol, barcode);
+        // If we could pull out something where the barcode sequence should have been
+        // then continue to process it.
+        if (extracted_barcode) {
+          // if the barcode consisted of valid nucleotides, then seqOk is true
+          // otherwise false
+          seqOk =  aut::sequenceCheck(barcode, Sequence::BARCODE);
+          if (not seqOk){
+            // If the barcode contained invalid nucleotides
+            // this attempts to replace the first one with an `A`.
+            // If this returns true, there was only one `N` and we
+            // replaced it; otherwise there was more than one `N`
+            // and the barcode sequence should be treated as invalid.
+            seqOk = aut::recoverBarcode(barcode);
+          }
         }
 
         // If we have a valid barcode
         if (seqOk) {
-          bool umi_ok = aut::extractUMI(rp.first.seq, rp.second.seq, alevinOpts.protocol, umi);
-          //aopt.jointLog->info("BC : {}, UMI : {}". barcode, umi);
+          bool umi_ok = aut::extractUMI(rp.first.seq, rp.second.seq, localProtocol, umi);
           if ( !umi_ok ) {
             smallSeqs += 1;
           } else {
@@ -1162,6 +1175,7 @@ void process_reads_sc_align(paired_parser* parser, ReadExperimentT& readExp, Rea
     	extraBAMtags.reserve(reserveSize);
     }
 
+    auto localProtocol = alevinOpts.protocol;
     for (size_t i = 0; i < rangeSize; ++i) { // For all the read in this batch
       auto& rp = rg[i];
       readLenLeft = rp.first.seq.length();
@@ -1188,23 +1202,34 @@ void process_reads_sc_align(paired_parser* parser, ReadExperimentT& readExp, Rea
       //barcode.clear();
       nonstd::optional<uint32_t> barcodeIdx;
       extraBAMtags.clear();
-      bool seqOk;
+      bool seqOk = false;
 
       if (alevinOpts.protocol.end == bcEnd::FIVE ||
           alevinOpts.protocol.end == bcEnd::THREE){
-        bool extracted_barcode = aut::extractBarcode(rp.first.seq, rp.second.seq, alevinOpts.protocol, barcode);
-        seqOk = (extracted_barcode) ?
-          aut::sequenceCheck(barcode, Sequence::BARCODE) : false;
-
-        if (not seqOk){
-          bool recovered = aut::recoverBarcode(barcode);
-          if (recovered) { seqOk = true; }
+        // If the barcode sequence could be extracted, then this is set to true,
+        // but the barcode sequence itself may still be invalid (e.g. contain `N` characters).
+        // However, if extracted_barcode is false here, there is no hope to even recover the
+        // barcode and we shouldn't attempt it.
+        bool extracted_barcode = aut::extractBarcode(rp.first.seq, rp.second.seq, localProtocol, barcode);
+        // If we could pull out something where the barcode sequence should have been
+        // then continue to process it.
+        if (extracted_barcode) {
+          // if the barcode consisted of valid nucleotides, then seqOk is true
+          // otherwise false
+          seqOk =  aut::sequenceCheck(barcode, Sequence::BARCODE);
+          if (not seqOk){
+            // If the barcode contained invalid nucleotides
+            // this attempts to replace the first one with an `A`.
+            // If this returns true, there was only one `N` and we
+            // replaced it; otherwise there was more than one `N`
+            // and the barcode sequence should be treated as invalid.
+            seqOk = aut::recoverBarcode(barcode);
+          }
         }
 
         // If we have a valid barcode
         if (seqOk) {
-          bool umi_ok = aut::extractUMI(rp.first.seq, rp.second.seq, alevinOpts.protocol, umi);
-
+          bool umi_ok = aut::extractUMI(rp.first.seq, rp.second.seq, localProtocol, umi);
           if ( !umi_ok ) {
             smallSeqs += 1;
           } else {
@@ -1384,7 +1409,7 @@ void process_reads_sc_align(paired_parser* parser, ReadExperimentT& readExp, Rea
         } else {
           if (barcode_ok) {
             unmapped_bc_map[bck.word(0)] += 1;
-          }
+          } 
         }
 
 
@@ -1619,6 +1644,7 @@ void processReadsQuasi(
     	extraBAMtags.reserve(reserveSize);
     }
 
+    auto localProtocol = alevinOpts.protocol;
     for (size_t i = 0; i < rangeSize; ++i) { // For all the read in this batch
       auto& rp = rg[i];
       readLenLeft = rp.first.seq.length();
@@ -1646,17 +1672,29 @@ void processReadsQuasi(
       //barcode.clear();
       nonstd::optional<uint32_t> barcodeIdx;
       extraBAMtags.clear();
-      bool seqOk;
+      bool seqOk = false;
 
       if (alevinOpts.protocol.end == bcEnd::FIVE ||
           alevinOpts.protocol.end == bcEnd::THREE){
-        bool extracted_barcode = aut::extractBarcode(rp.first.seq, rp.second.seq, alevinOpts.protocol, barcode);
-        seqOk = (extracted_barcode) ?
-          aut::sequenceCheck(barcode, Sequence::BARCODE) : false;
-
-        if (not seqOk){
-          bool recovered = aut::recoverBarcode(barcode);
-          if (recovered) { seqOk = true; }
+        // If the barcode sequence could be extracted, then this is set to true,
+        // but the barcode sequence itself may still be invalid (e.g. contain `N` characters).
+        // However, if extracted_barcode is false here, there is no hope to even recover the
+        // barcode and we shouldn't attempt it.
+        bool extracted_barcode = aut::extractBarcode(rp.first.seq, rp.second.seq, localProtocol, barcode);
+        // If we could pull out something where the barcode sequence should have been
+        // then continue to process it.
+        if (extracted_barcode) {
+          // if the barcode consisted of valid nucleotides, then seqOk is true
+          // otherwise false
+          seqOk =  aut::sequenceCheck(barcode, Sequence::BARCODE);
+          if (not seqOk){
+            // If the barcode contained invalid nucleotides
+            // this attempts to replace the first one with an `A`.
+            // If this returns true, there was only one `N` and we
+            // replaced it; otherwise there was more than one `N`
+            // and the barcode sequence should be treated as invalid.
+            seqOk = aut::recoverBarcode(barcode);
+          }
         }
 
         // If we have a barcode sequence, but not yet an index
@@ -1694,7 +1732,7 @@ void processReadsQuasi(
         if (barcodeIdx) {
           //corrBarcodeIndex = barcodeMap[barcodeIndex];
           jointHitGroup.setBarcode(*barcodeIdx);
-          bool umi_ok = aut::extractUMI(rp.first.seq, rp.second.seq, alevinOpts.protocol, umi);
+          bool umi_ok = aut::extractUMI(rp.first.seq, rp.second.seq, localProtocol, umi);
 
           if ( !umi_ok ) {
             smallSeqs += 1;
@@ -2662,7 +2700,8 @@ void alevinOptimize( std::vector<std::string>& trueBarcodesVec,
 template <typename ProtocolT>
 int alevin_sc_align(AlevinOpts<ProtocolT>& aopt,
                     SalmonOpts& sopt,
-                    boost::program_options::parsed_options& orderedOptions){
+                    boost::program_options::parsed_options& orderedOptions,
+                    std::unique_ptr<SalmonIndex>& salmonIndex){
   using std::cerr;
   using std::vector;
   using std::string;
@@ -2689,13 +2728,12 @@ int alevin_sc_align(AlevinOpts<ProtocolT>& aopt,
     }
     // ==== END: Library format processing ===
 
-    SalmonIndexVersionInfo versionInfo;
-    boost::filesystem::path versionPath = indexDirectory / "versionInfo.json";
-    versionInfo.load(versionPath);
-    auto idxType = versionInfo.indexType();
+    if(!salmonIndex)
+      salmonIndex = checkLoadIndex(indexDirectory, sopt.jointLog);
+    auto idxType = salmonIndex->indexType();
 
     MappingStatistics mstats;
-    ReadExperimentT experiment(readLibraries, indexDirectory, sopt);
+    ReadExperimentT experiment(readLibraries, salmonIndex.get(), sopt);
 
     // We currently do not support decoy sequence in the 
     // --justAlign or --sketch modes, so check that the 
@@ -2778,7 +2816,8 @@ int alevinQuant(AlevinOpts<ProtocolT>& aopt,
                 spp::sparse_hash_map<uint32_t, uint32_t>& txpToGeneMap,
                 spp::sparse_hash_map<std::string, uint32_t>& geneIdxMap,
                 boost::program_options::parsed_options& orderedOptions,
-                CFreqMapT& freqCounter, size_t numLowConfidentBarcode){
+                CFreqMapT& freqCounter, size_t numLowConfidentBarcode,
+                std::unique_ptr<SalmonIndex>& salmonIndex){
   using std::cerr;
   using std::vector;
   using std::string;
@@ -2804,14 +2843,12 @@ int alevinQuant(AlevinOpts<ProtocolT>& aopt,
       std::exit(1);
     }
     // ==== END: Library format processing ===
-
-    SalmonIndexVersionInfo versionInfo;
-    boost::filesystem::path versionPath = indexDirectory / "versionInfo.json";
-    versionInfo.load(versionPath);
-    auto idxType = versionInfo.indexType();
+    if(!salmonIndex)
+      salmonIndex = checkLoadIndex(indexDirectory, sopt.jointLog);
+    auto idxType = salmonIndex->indexType();
 
     MappingStatistics mstats;
-    ReadExperimentT experiment(readLibraries, indexDirectory, sopt);
+    ReadExperimentT experiment(readLibraries, salmonIndex.get(), sopt);
     //experiment.computePolyAPositions();
 
     // This will be the class in charge of maintaining our
@@ -2996,133 +3033,132 @@ int alevinQuant(AlevinOpts<ProtocolT>& aopt,
 
 namespace apt = alevin::protocols;
 
-template 
-int alevin_sc_align(AlevinOpts<apt::DropSeq>& aopt,
-                    SalmonOpts& sopt,
-                    boost::program_options::parsed_options& orderedOptions);
-
 template
-int alevinQuant(AlevinOpts<apt::DropSeq>& aopt,
-                SalmonOpts& sopt,
-                SoftMapT& barcodeMap,
-                TrueBcsT& trueBarcodes,
-                spp::sparse_hash_map<uint32_t, uint32_t>& txpToGeneMap,
-                spp::sparse_hash_map<std::string, uint32_t>& geneIdxMap,
-                boost::program_options::parsed_options& orderedOptions,
-                CFreqMapT& freqCounter,
-                size_t numLowConfidentBarcode);
+int alevin_sc_align(AlevinOpts<apt::DropSeq>& aopt, SalmonOpts& sopt,
+                    boost::program_options::parsed_options& orderedOptions,
+                    std::unique_ptr<SalmonIndex>& salmonIndex);
+
+template int
+alevinQuant(AlevinOpts<apt::DropSeq>& aopt, SalmonOpts& sopt,
+            SoftMapT& barcodeMap, TrueBcsT& trueBarcodes,
+            spp::sparse_hash_map<uint32_t, uint32_t>& txpToGeneMap,
+            spp::sparse_hash_map<std::string, uint32_t>& geneIdxMap,
+            boost::program_options::parsed_options& orderedOptions,
+            CFreqMapT& freqCounter, size_t numLowConfidentBarcode,
+            std::unique_ptr<SalmonIndex>& salmonIndex);
 
-template 
-int alevin_sc_align(AlevinOpts<apt::CITESeq>& aopt,
-                    SalmonOpts& sopt,
-                    boost::program_options::parsed_options& orderedOptions);
 template
-int alevinQuant(AlevinOpts<apt::CITESeq>& aopt,
-                SalmonOpts& sopt,
-                SoftMapT& barcodeMap,
-                TrueBcsT& trueBarcodes,
-                spp::sparse_hash_map<uint32_t, uint32_t>& txpToGeneMap,
-                spp::sparse_hash_map<std::string, uint32_t>& geneIdxMap,
+int alevin_sc_align(AlevinOpts<apt::CITESeq>& aopt, SalmonOpts& sopt,
+                    boost::program_options::parsed_options& orderedOptions,
+                    std::unique_ptr<SalmonIndex>& salmonIndex);
+template int
+alevinQuant(AlevinOpts<apt::CITESeq>& aopt, SalmonOpts& sopt,
+            SoftMapT& barcodeMap, TrueBcsT& trueBarcodes,
+            spp::sparse_hash_map<uint32_t, uint32_t>& txpToGeneMap,
+            spp::sparse_hash_map<std::string, uint32_t>& geneIdxMap,
+            boost::program_options::parsed_options& orderedOptions,
+            CFreqMapT& freqCounter, size_t numLowConfidentBarcode,
+            std::unique_ptr<SalmonIndex>& salmonIndex);
+
+template int
+alevin_sc_align(AlevinOpts<apt::InDropV2>& aopt, SalmonOpts& sopt,
                 boost::program_options::parsed_options& orderedOptions,
-                CFreqMapT& freqCounter,
-                size_t numLowConfidentBarcode);
-
-template 
-int alevin_sc_align(AlevinOpts<apt::InDrop>& aopt,
-                    SalmonOpts& sopt,
-                    boost::program_options::parsed_options& orderedOptions);
-template
-int alevinQuant(AlevinOpts<apt::InDrop>& aopt,
-                SalmonOpts& sopt,
-                SoftMapT& barcodeMap,
-                TrueBcsT& trueBarcodes,
-                spp::sparse_hash_map<uint32_t, uint32_t>& txpToGeneMap,
-                spp::sparse_hash_map<std::string, uint32_t>& geneIdxMap,
+                std::unique_ptr<SalmonIndex>& salmonIndex);
+template int
+alevinQuant(AlevinOpts<apt::InDropV2>& aopt, SalmonOpts& sopt,
+            SoftMapT& barcodeMap, TrueBcsT& trueBarcodes,
+            spp::sparse_hash_map<uint32_t, uint32_t>& txpToGeneMap,
+            spp::sparse_hash_map<std::string, uint32_t>& geneIdxMap,
+            boost::program_options::parsed_options& orderedOptions,
+            CFreqMapT& freqCounter, size_t numLowConfidentBarcode,
+            std::unique_ptr<SalmonIndex>& salmonIndex);
+
+template int
+alevin_sc_align(AlevinOpts<apt::ChromiumV3>& aopt, SalmonOpts& sopt,
                 boost::program_options::parsed_options& orderedOptions,
-                CFreqMapT& freqCounter,
-                size_t numLowConfidentBarcode);
-
-template 
-int alevin_sc_align(AlevinOpts<apt::ChromiumV3>& aopt,
-                    SalmonOpts& sopt,
-                    boost::program_options::parsed_options& orderedOptions);
-template
-int alevinQuant(AlevinOpts<apt::ChromiumV3>& aopt,
-                SalmonOpts& sopt,
-                SoftMapT& barcodeMap,
-                TrueBcsT& trueBarcodes,
-                spp::sparse_hash_map<uint32_t, uint32_t>& txpToGeneMap,
-                spp::sparse_hash_map<std::string, uint32_t>& geneIdxMap,
+                std::unique_ptr<SalmonIndex>& salmonIndex);
+template int
+alevinQuant(AlevinOpts<apt::ChromiumV3>& aopt, SalmonOpts& sopt,
+            SoftMapT& barcodeMap, TrueBcsT& trueBarcodes,
+            spp::sparse_hash_map<uint32_t, uint32_t>& txpToGeneMap,
+            spp::sparse_hash_map<std::string, uint32_t>& geneIdxMap,
+            boost::program_options::parsed_options& orderedOptions,
+            CFreqMapT& freqCounter, size_t numLowConfidentBarcode,
+            std::unique_ptr<SalmonIndex>& salmonIndex);
+
+template int
+alevin_sc_align(AlevinOpts<apt::Chromium>& aopt, SalmonOpts& sopt,
                 boost::program_options::parsed_options& orderedOptions,
-                CFreqMapT& freqCounter,
-                size_t numLowConfidentBarcode);
-
-template 
-int alevin_sc_align(AlevinOpts<apt::Chromium>& aopt,
-                    SalmonOpts& sopt,
-                    boost::program_options::parsed_options& orderedOptions);
-template
-int alevinQuant(AlevinOpts<apt::Chromium>& aopt,
-                SalmonOpts& sopt,
-                SoftMapT& barcodeMap,
-                TrueBcsT& trueBarcodes,
-                spp::sparse_hash_map<uint32_t, uint32_t>& txpToGeneMap,
-                spp::sparse_hash_map<std::string, uint32_t>& geneIdxMap,
+                std::unique_ptr<SalmonIndex>& salmonIndex);
+template int
+alevinQuant(AlevinOpts<apt::Chromium>& aopt, SalmonOpts& sopt,
+            SoftMapT& barcodeMap, TrueBcsT& trueBarcodes,
+            spp::sparse_hash_map<uint32_t, uint32_t>& txpToGeneMap,
+            spp::sparse_hash_map<std::string, uint32_t>& geneIdxMap,
+            boost::program_options::parsed_options& orderedOptions,
+            CFreqMapT& freqCounter, size_t numLowConfidentBarcode,
+            std::unique_ptr<SalmonIndex>& salmonIndex);
+
+template int
+alevin_sc_align(AlevinOpts<apt::Gemcode>& aopt, SalmonOpts& sopt,
                 boost::program_options::parsed_options& orderedOptions,
-                CFreqMapT& freqCounter,
-                size_t numLowConfidentBarcode);
-
-template 
-int alevin_sc_align(AlevinOpts<apt::Gemcode>& aopt,
-                    SalmonOpts& sopt,
-                    boost::program_options::parsed_options& orderedOptions);
-template
-int alevinQuant(AlevinOpts<apt::Gemcode>& aopt,
-                SalmonOpts& sopt,
-                SoftMapT& barcodeMap,
-                TrueBcsT& trueBarcodes,
-                spp::sparse_hash_map<uint32_t, uint32_t>& txpToGeneMap,
-                spp::sparse_hash_map<std::string, uint32_t>& geneIdxMap,
+                std::unique_ptr<SalmonIndex>& salmonIndex);
+template int
+alevinQuant(AlevinOpts<apt::Gemcode>& aopt, SalmonOpts& sopt,
+            SoftMapT& barcodeMap, TrueBcsT& trueBarcodes,
+            spp::sparse_hash_map<uint32_t, uint32_t>& txpToGeneMap,
+            spp::sparse_hash_map<std::string, uint32_t>& geneIdxMap,
+            boost::program_options::parsed_options& orderedOptions,
+            CFreqMapT& freqCounter, size_t numLowConfidentBarcode,
+            std::unique_ptr<SalmonIndex>& salmonIndex);
+
+template int
+alevin_sc_align(AlevinOpts<apt::CELSeq>& aopt, SalmonOpts& sopt,
                 boost::program_options::parsed_options& orderedOptions,
-                CFreqMapT& freqCounter,
-                size_t numLowConfidentBarcode);
-
-template 
-int alevin_sc_align(AlevinOpts<apt::CELSeq>& aopt,
-                    SalmonOpts& sopt,
-                    boost::program_options::parsed_options& orderedOptions);
-template
-int alevinQuant(AlevinOpts<apt::CELSeq>& aopt,
-                SalmonOpts& sopt,
-                SoftMapT& barcodeMap,
-                TrueBcsT& trueBarcodes,
-                spp::sparse_hash_map<uint32_t, uint32_t>& txpToGeneMap,
-                spp::sparse_hash_map<std::string, uint32_t>& geneIdxMap,
+                std::unique_ptr<SalmonIndex>& salmonIndex);
+template int
+alevinQuant(AlevinOpts<apt::CELSeq>& aopt, SalmonOpts& sopt,
+            SoftMapT& barcodeMap, TrueBcsT& trueBarcodes,
+            spp::sparse_hash_map<uint32_t, uint32_t>& txpToGeneMap,
+            spp::sparse_hash_map<std::string, uint32_t>& geneIdxMap,
+            boost::program_options::parsed_options& orderedOptions,
+            CFreqMapT& freqCounter, size_t numLowConfidentBarcode,
+            std::unique_ptr<SalmonIndex>& salmonIndex);
+
+template int
+alevin_sc_align(AlevinOpts<apt::CELSeq2>& aopt, SalmonOpts& sopt,
                 boost::program_options::parsed_options& orderedOptions,
-                CFreqMapT& freqCounter,
-                size_t numLowConfidentBarcode);
-                
-template 
-int alevin_sc_align(AlevinOpts<apt::CELSeq2>& aopt,
-                    SalmonOpts& sopt,
-                    boost::program_options::parsed_options& orderedOptions);
-template
-int alevinQuant(AlevinOpts<apt::CELSeq2>& aopt,
-                SalmonOpts& sopt,
-                SoftMapT& barcodeMap,
-                TrueBcsT& trueBarcodes,
-                spp::sparse_hash_map<uint32_t, uint32_t>& txpToGeneMap,
-                spp::sparse_hash_map<std::string, uint32_t>& geneIdxMap,
+                std::unique_ptr<SalmonIndex>& salmonIndex);
+template int
+alevinQuant(AlevinOpts<apt::CELSeq2>& aopt, SalmonOpts& sopt,
+            SoftMapT& barcodeMap, TrueBcsT& trueBarcodes,
+            spp::sparse_hash_map<uint32_t, uint32_t>& txpToGeneMap,
+            spp::sparse_hash_map<std::string, uint32_t>& geneIdxMap,
+            boost::program_options::parsed_options& orderedOptions,
+            CFreqMapT& freqCounter, size_t numLowConfidentBarcode,
+            std::unique_ptr<SalmonIndex>& salmonIndex);
+
+template int
+alevin_sc_align(AlevinOpts<apt::QuartzSeq2>& aopt, SalmonOpts& sopt,
                 boost::program_options::parsed_options& orderedOptions,
-                CFreqMapT& freqCounter,
-                size_t numLowConfidentBarcode);
+                std::unique_ptr<SalmonIndex>& salmonIndex);
+
+template int
+alevinQuant(AlevinOpts<apt::QuartzSeq2>& aopt, SalmonOpts& sopt,
+            SoftMapT& barcodeMap, TrueBcsT& trueBarcodes,
+            spp::sparse_hash_map<uint32_t, uint32_t>& txpToGeneMap,
+            spp::sparse_hash_map<std::string, uint32_t>& geneIdxMap,
+            boost::program_options::parsed_options& orderedOptions,
+            CFreqMapT& freqCounter, size_t numLowConfidentBarcode,
+            std::unique_ptr<SalmonIndex>& salmonIndex);
 
 template 
-int alevin_sc_align(AlevinOpts<apt::QuartzSeq2>& aopt,
+int alevin_sc_align(AlevinOpts<apt::SciSeq3>& aopt,
                     SalmonOpts& sopt,
-                    boost::program_options::parsed_options& orderedOptions);
+                    boost::program_options::parsed_options& orderedOptions,
+                    std::unique_ptr<SalmonIndex>& salmonIndex);
 template
-int alevinQuant(AlevinOpts<apt::QuartzSeq2>& aopt,
+int alevinQuant(AlevinOpts<apt::SciSeq3>& aopt,
                 SalmonOpts& sopt,
                 SoftMapT& barcodeMap,
                 TrueBcsT& trueBarcodes,
@@ -3130,34 +3166,34 @@ int alevinQuant(AlevinOpts<apt::QuartzSeq2>& aopt,
                 spp::sparse_hash_map<std::string, uint32_t>& geneIdxMap,
                 boost::program_options::parsed_options& orderedOptions,
                 CFreqMapT& freqCounter,
-                size_t numLowConfidentBarcode);
+                size_t numLowConfidentBarcode,
+                std::unique_ptr<SalmonIndex>& salmonIndex);
 
-template 
-int alevin_sc_align(AlevinOpts<apt::Custom>& aopt,
-                    SalmonOpts& sopt,
-                    boost::program_options::parsed_options& orderedOptions);
-template 
-int alevin_sc_align(AlevinOpts<apt::CustomGeometry>& aopt,
-                    SalmonOpts& sopt,
-                    boost::program_options::parsed_options& orderedOptions);
 
-template
-int alevinQuant(AlevinOpts<apt::Custom>& aopt,
-                SalmonOpts& sopt,
-                SoftMapT& barcodeMap,
-                TrueBcsT& trueBarcodes,
-                spp::sparse_hash_map<uint32_t, uint32_t>& txpToGeneMap,
-                spp::sparse_hash_map<std::string, uint32_t>& geneIdxMap,
+template int
+alevin_sc_align(AlevinOpts<apt::Custom>& aopt, SalmonOpts& sopt,
                 boost::program_options::parsed_options& orderedOptions,
-                CFreqMapT& freqCounter,
-                size_t numLowConfidentBarcode);
-template
-int alevinQuant(AlevinOpts<apt::CustomGeometry>& aopt,
-                SalmonOpts& sopt,
-                SoftMapT& barcodeMap,
-                TrueBcsT& trueBarcodes,
-                spp::sparse_hash_map<uint32_t, uint32_t>& txpToGeneMap,
-                spp::sparse_hash_map<std::string, uint32_t>& geneIdxMap,
+                std::unique_ptr<SalmonIndex>& salmonIndex);
+
+template int
+alevinQuant(AlevinOpts<apt::Custom>& aopt, SalmonOpts& sopt,
+            SoftMapT& barcodeMap, TrueBcsT& trueBarcodes,
+            spp::sparse_hash_map<uint32_t, uint32_t>& txpToGeneMap,
+            spp::sparse_hash_map<std::string, uint32_t>& geneIdxMap,
+            boost::program_options::parsed_options& orderedOptions,
+            CFreqMapT& freqCounter, size_t numLowConfidentBarcode,
+            std::unique_ptr<SalmonIndex>& salmonIndex);
+
+template int
+alevin_sc_align(AlevinOpts<apt::CustomGeometry>& aopt, SalmonOpts& sopt,
                 boost::program_options::parsed_options& orderedOptions,
-                CFreqMapT& freqCounter,
-                size_t numLowConfidentBarcode);
+                std::unique_ptr<SalmonIndex>& salmonIndex);
+
+template int
+alevinQuant(AlevinOpts<apt::CustomGeometry>& aopt, SalmonOpts& sopt,
+            SoftMapT& barcodeMap, TrueBcsT& trueBarcodes,
+            spp::sparse_hash_map<uint32_t, uint32_t>& txpToGeneMap,
+            spp::sparse_hash_map<std::string, uint32_t>& geneIdxMap,
+            boost::program_options::parsed_options& orderedOptions,
+            CFreqMapT& freqCounter, size_t numLowConfidentBarcode,
+            std::unique_ptr<SalmonIndex>& salmonIndex);


=====================================
src/SalmonQuantMerge.cpp
=====================================
@@ -27,6 +27,7 @@
 // C++ string formatting library #include "spdlog/fmt/fmt.h"
 // logger includes
 #include "spdlog/spdlog.h"
+#include "SalmonIndex.hpp"
 
 enum class TargetColumn { LEN, ELEN, TPM, NREADS };
 
@@ -213,7 +214,7 @@ bool doMerge(QuantMergeOptions& qmOpts) {
   return true;
 }
 
-int salmonQuantMerge(int argc, const char* argv[]) {
+int salmonQuantMerge(int argc, const char* argv[], std::unique_ptr<SalmonIndex>& /* salmonIndex */) {
   using std::cerr;
   using std::vector;
   using std::string;


=====================================
src/SalmonQuantify.cpp
=====================================
@@ -27,6 +27,7 @@
 #include <functional>
 #include <iterator>
 #include <map>
+#include <memory>
 #include <mutex>
 #include <queue>
 #include <random>
@@ -2385,7 +2386,7 @@ void quantifyLibrary(ReadExperimentT& experiment,
   jointLog->info("finished quantifyLibrary()");
 }
 
-int salmonQuantify(int argc, const char* argv[]) {
+int salmonQuantify(int argc, const char* argv[], std::unique_ptr<SalmonIndex>& salmonIndex) {
   using std::cerr;
   using std::vector;
   using std::string;
@@ -2489,13 +2490,12 @@ transcript abundance from RNA-seq reads
     }
     // ==== END: Library format processing ===
 
-    SalmonIndexVersionInfo versionInfo;
-    boost::filesystem::path versionPath = indexDirectory / "versionInfo.json";
-    versionInfo.load(versionPath);
-    auto idxType = versionInfo.indexType();
+    if(!salmonIndex) {
+      salmonIndex = checkLoadIndex(indexDirectory, sopt.jointLog);
+    }
 
     MappingStatistics mstats;
-    ReadExperimentT experiment(readLibraries, indexDirectory, sopt);
+    ReadExperimentT experiment(readLibraries, salmonIndex.get(), sopt);
 
     // This will be the class in charge of maintaining our
     // rich equivalence classes


=====================================
src/SalmonQuantifyAlignments.cpp
=====================================
@@ -1573,7 +1573,7 @@ bool runSingleEndSample(std::vector<bfs::path>& alignmentFiles, bfs::path& trans
   return processSample<UnpairedRead>(alnLib, requiredObservations, sopt, sopt.outputDirectory);
 }
 
-int salmonAlignmentQuantify(int argc, const char* argv[]) {
+int salmonAlignmentQuantify(int argc, const char* argv[], std::unique_ptr<SalmonIndex>& /* salmon_index */) {
   using std::cerr;
   using std::vector;
   using std::string;


=====================================
src/WhiteList.cpp
=====================================
@@ -260,7 +260,7 @@ namespace alevin {
                                       std::vector<std::string>& trueBarcodes,
                                       bool useRibo, bool useMito,
                                       size_t numLowConfidentBarcode);
-    template bool performWhitelisting(AlevinOpts<alevin::protocols::InDrop>& aopt,
+    template bool performWhitelisting(AlevinOpts<alevin::protocols::InDropV2>& aopt,
                                       std::vector<std::string>& trueBarcodes,
                                       bool useRibo, bool useMito,
                                       size_t numLowConfidentBarcode);
@@ -288,6 +288,10 @@ namespace alevin {
                                       std::vector<std::string>& trueBarcodes,
                                       bool useRibo, bool useMito,
                                       size_t numLowConfidentBarcode);
+    template bool performWhitelisting(AlevinOpts<alevin::protocols::SciSeq3>& aopt,
+                                      std::vector<std::string>& trueBarcodes,
+                                      bool useRibo, bool useMito,
+                                      size_t numLowConfidentBarcode);
     template bool performWhitelisting(AlevinOpts<alevin::protocols::Custom>& aopt,
                                       std::vector<std::string>& trueBarcodes,
                                       bool useRibo, bool useMito,
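
Note on the pattern in the hunks above: each entry point now takes a `std::unique_ptr<SalmonIndex>&`, and `salmonQuantify` loads the index only when that pointer is still empty. The following is a minimal sketch of that load-once idiom, not the upstream implementation. `FakeIndex`, `loadIndex`, `quantify` and `loadCount` are hypothetical stand-ins invented for illustration; the real `SalmonIndex` and `checkLoadIndex` (which also takes a logger) are not reproduced here.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for SalmonIndex; the real class is not shown here.
struct FakeIndex {
  std::string dir;
};

// Counts how many times the (pretend) expensive load ran.
static int loadCount = 0;

// Hypothetical stand-in for checkLoadIndex(); the real one also takes a logger.
std::unique_ptr<FakeIndex> loadIndex(const std::string& dir) {
  ++loadCount;
  return std::make_unique<FakeIndex>(FakeIndex{dir});
}

// Same shape as the new entry points: the caller owns the pointer and the
// callee fills it in only if it is still empty.
int quantify(const std::string& indexDir, std::unique_ptr<FakeIndex>& index) {
  if (!index) {
    index = loadIndex(indexDir);
  }
  // Non-owning view, analogous to passing salmonIndex.get() onward.
  FakeIndex* view = index.get();
  return view->dir.empty() ? 1 : 0;
}
```

Under these assumptions, calling `quantify` twice with the same `std::unique_ptr` runs the load only once, because the second call finds the pointer already populated.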



View it on GitLab: https://salsa.debian.org/med-team/salmon/-/compare/bf67a606acd3cf2440116d1959a85805f5160060...b7cbdadd83f32b00b5b052c3972aa5720f986f31


