[Debian-med-packaging] I failed Fwd: [med-svn] [Git][med-team/gatb-core][master] 7 commits: New upstream version
    Steffen Möller
    steffen_moeller at gmx.de
    Thu Dec  5 00:21:23 GMT 2019

Hi all, and Andres in particular,
I tried my best with routine-update on this one, but the package now
fails to build and I suspect the shared library symbols files. This
seems related to your previous fix. Could you please have a look?
Cheers,
Steffen
-------- Forwarded Message --------
Subject: 	[med-svn] [Git][med-team/gatb-core][master] 7 commits: New
upstream version
Date: 	Wed, 04 Dec 2019 23:52:04 +0000
From: 	Steffen Möller <gitlab at salsa.debian.org>
Reply-To: 	noreply at salsa.debian.org
To: 	debian-med-commit at lists.alioth.debian.org
      Steffen Möller pushed to branch master at Debian Med / gatb-core
      <https://salsa.debian.org/med-team/gatb-core>
        Commits:
  * *f9b878ed
    <https://salsa.debian.org/med-team/gatb-core/commit/f9b878ed15fdddf238c53c3d55c1ce95614a9c87>*
    by Steffen Moeller /at 2019-12-04T23:22:51Z/
    New upstream version
  * *d6238780
    <https://salsa.debian.org/med-team/gatb-core/commit/d6238780168fff561ee805fa30923f988a8b9a3e>*
    by Steffen Moeller /at 2019-12-04T23:22:52Z/
    New upstream version 1.4.1+git20191130.664696c+dfsg
  * *79bb050d
    <https://salsa.debian.org/med-team/gatb-core/commit/79bb050d74f76e42f0b97d4a18c90b1de5c2c6db>*
    by Steffen Moeller /at 2019-12-04T23:22:56Z/
    Update upstream source from tag 'upstream/1.4.1+git20191130.664696c+dfsg'
    Update to upstream version '1.4.1+git20191130.664696c+dfsg'
    with Debian dir df4adf125d7696c65abb84dcc14f05bf87112c14
  * *94a8395a
    <https://salsa.debian.org/med-team/gatb-core/commit/94a8395a6a637982e9e4f1288366e79e10705388>*
    by Steffen Moeller /at 2019-12-04T23:22:59Z/
    Standards-Version: 4.4.1
  * *f2ed00d9
    <https://salsa.debian.org/med-team/gatb-core/commit/f2ed00d9a9919e39a8221f149baeaded6dd9170a>*
    by Steffen Moeller /at 2019-12-04T23:23:01Z/
    Set upstream metadata fields: Repository-Browse.
  * *61534d82
    <https://salsa.debian.org/med-team/gatb-core/commit/61534d822bc66ac7c4c18a62eb4f00f6acbb06d9>*
    by Steffen Moeller /at 2019-12-04T23:23:01Z/
    Remove obsolete fields Name from debian/upstream/metadata.
  * *c58b23ef
    <https://salsa.debian.org/med-team/gatb-core/commit/c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad>*
    by Steffen Moeller /at 2019-12-04T23:51:27Z/
    FTBFS
        27 changed files:
  * debian/changelog <#9c96da0e9f91d7d8937b69b524702c106258f0d1>
  * debian/control <#58ef006ab62b83b4bec5d81fe5b32c3b4c2d1cc2>
  * debian/upstream/metadata <#f5606a935d95a2f20059a7ac1136f47b2edadbf6>
  * gatb-core/CMakeLists.txt <#4c007bda60857aed5186f73b4dd88ee753952d0d>
  * gatb-core/doc/doxygen/src/dbgh5page.hpp
    <#2fba22cc64bca97d995e238b012045ac90567b80>
  * gatb-core/src/gatb/bank/impl/BankFasta.cpp
    <#6861fd4c43e7c21bb057ecc2ad7a189b8aaf5b33>
  * gatb-core/src/gatb/bcalm2/bglue_algo.cpp
    <#490f86cdfc07e7db3455fffd8859052bd5dbeb50>
  * gatb-core/src/gatb/bcalm2/bglue_algo.hpp
    <#349369b0ff3f84575fe0d9513fb9bfcadb1b3612>
  * gatb-core/src/gatb/debruijn/impl/Graph.cpp
    <#fa53c8abddac2ffbd797f8ad40c9772134223fed>
  * gatb-core/src/gatb/debruijn/impl/GraphUnitigs.cpp
    <#2ff315124cb4ed9eebb9d35c35743144de29a781>
  * gatb-core/src/gatb/debruijn/impl/LinkTigs.cpp
    <#34ae9ab374303a1000194fa7d58aeba9d58f32ce>
  * gatb-core/src/gatb/debruijn/impl/LinkTigs.hpp
    <#8d46876b0b01c8687ffec95a3e56a84f1d5ffc83>
  * gatb-core/src/gatb/debruijn/impl/UnitigsConstructionAlgorithm.cpp
    <#47e6fff3caa93f4ef7cc54b7f08ec8ac25b21393>
  * gatb-core/src/gatb/kmer/impl/SortingCountAlgorithm.cpp
    <#ed783240a52e80bd1c3a470d6646b3f6b3ab63e4>
  * gatb-core/src/gatb/system/impl/FileSystemCommon.hpp
    <#6dc8c95d8c6c04e96a48b4cf5c3c2ea31cc93750>
  * gatb-core/src/gatb/template/TemplateSpecialization10.cpp.in
    <#4c12716c6fcb516eda353ab99ab2b81002b77515>
  * gatb-core/src/gatb/tools/collections/impl/IteratorFile.hpp
    <#9be95217b1b4429f2eeabc2213c457fd389f08a0>
  * gatb-core/src/gatb/tools/misc/api/StringsRepository.hpp
    <#4ae6c527aafe52f4f16907962f525902d4b5ef6f>
  * gatb-core/src/gatb/tools/misc/impl/Tool.cpp
    <#3b8fda1a420e311e4fb4b0d9fe344d4bbc2f4e97>
  * gatb-core/src/gatb/tools/storage/impl/CollectionHDF5Patch.hpp
    <#f84f48b86f89b6fe6f3b401f6ed674d1ffab829b>
  * gatb-core/src/gatb/tools/storage/impl/Storage.hpp
    <#9e0109d139212f56eef6153e59e64fa4c6670360>
  * gatb-core/src/gatb/tools/storage/impl/StorageFile.hpp
    <#efdf123a4ecf95f1a9326ff6d3443694ceed944f>
  * gatb-core/src/gatb/tools/storage/impl/StorageHDF5.hpp
    <#dd31cf0d08329fcf5ce3e2647df00c99baa267ba>
  * gatb-core/test/unit/src/debruijn/TestDebruijn.cpp
    <#d7401b2bb5a2b6245ce8cd38b4b1c20e7ea1a058>
  * gatb-core/test/unit/src/kmer/TestDSK.cpp
    <#49ce0c04f76f93b18c25528d056dc803569026e0>
  * gatb-core/test/unit/src/tools/storage/TestStorage.cpp
    <#e2e4e0dc5c61805b798db5e2dee529b92e327235>
  * + gatb-core/thirdparty/update-boost.sh
    <#e79cab2022b11c71b2d8095f0ebe49260b3cc139>
        Changes:
# *debian/changelog*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#9c96da0e9f91d7d8937b69b524702c106258f0d1>
------------------------------------------------------------------------
	1
+gatb-core (1.4.1+git20191130.664696c+dfsg-1) UNRELEASED; urgency=medium
	2
+
	3
+* Team upload.
	4
+* New upstream version
	5
+* Standards-Version: 4.4.1
	6
+* Set upstream metadata fields: Repository-Browse.
	7
+* Remove obsolete fields Name from debian/upstream/metadata.
	8
+
	9
+* FTBFS: Problem with symbols files, I presume
	10
+
	11
+-- Steffen Moeller <moeller at debian.org> Thu, 05 Dec 2019 00:23:01 +0100
	12
+
1 	13
  gatb-core (1.4.1+git20190813.a73b6dd+dfsg-1) unstable; urgency=medium
2 	14
3 	15
  * New upstream version
# *debian/control*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#58ef006ab62b83b4bec5d81fe5b32c3b4c2d1cc2>
------------------------------------------------------------------------
... 	... 	@@ -13,7 +13,7 @@ Build-Depends: debhelper-compat (= 12),
13 	13
  libjsoncpp-dev,
14 	14
  doxygen,
15 	15
  graphviz
16
-Standards-Version: 4.4.0
	16
+Standards-Version: 4.4.1
17 	17
  Vcs-Browser: https://salsa.debian.org/med-team/gatb-core
18 	18
  Vcs-Git: https://salsa.debian.org/med-team/gatb-core.git
19 	19
  Homepage: https://github.com/GATB/gatb-core
# *debian/upstream/metadata*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#f5606a935d95a2f20059a7ac1136f47b2edadbf6>
------------------------------------------------------------------------
1
-Name: gatb-core
2 	1
  Cite-As: >
3 	2
  E. Drezen, G. Rizk, R. Chikhi, C. Deltel, C. Lemaitre, P. Peterlongo,
4 	3
  D. Lavenier. (2014)
... 	... 	@@ -6,10 +5,10 @@ Cite-As: >
6 	5
  Bioinformatics, 30(20):2959-2961. / BioIT 2014 poster
7 	6
  Reference:
8 	7
  Author: >
9
-Erwan Drezen and Guillaume Rizk and Rayan Chikhi and Charles Deltel
10
-and Claire Lemaitre and Pierre Peterlongo and Dominique Lavenier
	8
+Erwan Drezen and Guillaume Rizk and Rayan Chikhi and Charles Deltel
	9
+and Claire Lemaitre and Pierre Peterlongo and Dominique Lavenier
11 	10
  Title: >
12
-GATB: Genome Assembly & Analysis Tool Box
	11
+GATB: Genome Assembly & Analysis Tool Box
13 	12
  Journal: Bioinformatics
14 	13
  Year: 2014
15 	14
  Volume: 30
... 	... 	@@ -19,7 +18,8 @@ Reference:
19 	18
  URL: http://dx.doi.org/10.1093/bioinformatics/btu406
20 	19
  Repository: https://github.com/GATB/gatb-core
21 	20
  Registry:
22
-- Name: OMICtools
23
-Entry: OMICS_04834
24
-- Name: conda:bioconda
25
-Entry: gatb
	21
+- Name: OMICtools
	22
+Entry: OMICS_04834
	23
+- Name: conda:bioconda
	24
+Entry: gatb
	25
+Repository-Browse: https://github.com/GATB/gatb-core
# *gatb-core/CMakeLists.txt*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#4c007bda60857aed5186f73b4dd88ee753952d0d>
------------------------------------------------------------------------
... 	... 	@@ -110,8 +110,6 @@ if (debug)
110 	110
  set (LIBRARY_COMPILE_DEFINITIONS "${LIBRARY_COMPILE_DEFINITIONS}-g -p
${LIB_COMPILE_WARNINGS}")
111 	111
  set (CMAKE_BUILD_TYPE Debug) # else CMake adds DNDEBUG
112 	112
  message("-- COMPILATION IN DEBUG MODE")
113
-else()
114
-set (LIBRARY_COMPILE_DEFINITIONS "${LIBRARY_COMPILE_DEFINITIONS}-O3
-DNDEBUG ${LIB_COMPILE_WARNINGS}")
115 	113
  endif()
116 	114
117 	115
  if (INT128_FOUND)
# *gatb-core/doc/doxygen/src/dbgh5page.hpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#2fba22cc64bca97d995e238b012045ac90567b80>
------------------------------------------------------------------------
... 	... 	@@ -91,6 +91,8 @@
91 	91
  -verbose (1 arg) : verbosity level [default '1']
92 	92
  -email (1 arg) : send statistics to the given email address [default '']
93 	93
  -email-fmt (1 arg) : 'raw' or 'xml' [default 'raw']
	94
+-edge-km (1 arg) : Kececioglu-Myers edge representation [default '0']
	95
+
94 	96
  * \endcode
95 	97
  *
96 	98
  *
# *gatb-core/src/gatb/bank/impl/BankFasta.cpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#6861fd4c43e7c21bb057ecc2ad7a189b8aaf5b33>
------------------------------------------------------------------------
... 	... 	@@ -651,6 +651,7 @@ void BankFasta::Iterator::init ()
651 	651
  *bf = (buffered_file_t *) CALLOC (1, sizeof(buffered_file_t));
652 	652
  (*bf)->buffer = (unsigned char*) MALLOC (BUFFER_SIZE);
653 	653
  (*bf)->stream = gzopen (fname, "r");
	654
+gzbuffer((*bf)->stream,2*1024*1024);
654 	655
655 	656
  /** We check that we can open the file. */
656 	657
  if ((*bf)->stream == NULL)
# *gatb-core/src/gatb/bcalm2/bglue_algo.cpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#490f86cdfc07e7db3455fffd8859052bd5dbeb50>
------------------------------------------------------------------------
... 	... 	@@ -63,12 +63,9 @@ using namespace
gatb::core::tools::collections::impl;
63 	63
  using namespace std;
64 	64
65 	65
  // let's be clear here:
66
-// UF hashes will be stored in 32 bits for efficiency (as I don't want
to have a 64-bits UF for memory reasons, also, would require to modify
unionFind.hpp)
67
-typedef uint32_t uf_hashes_t;
68
-// but there can be more than 2^{32} sequences in the glue file
	66
+typedef uint64_t uf_hashes_t; // UF hashes are the hash values of k-mers
to be inserted into the UF data structure. Don't try setting to
uint32_t, would be a disaster
69 	67
  typedef uint64_t seq_idx_t;
70
-// so, potentially, more than 2^{32} UF hashes (but not necessarily,
consider that some sequences don't need to be glued)
71
-// what will happen is that more one UF class won't be linked to a
single unitig, but multiple unitigs
	68
+typedef uint32_t uf_class_t; // UF class is the identifier of an element
in the UF
72 	69
  // let's hope that there won't be saturation (only 1 UF class with all
unitigs)
73 	70
  // if this happens, then "Top 10 glue partitions by size:" will show
only one entry and BCALM will blow up in memory
74 	71
  // a fix would be to use a 64 bits UF (to be coded later)
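The hunk above widens uf_hashes_t to 64 bits and adds a separate 32-bit uf_class_t. The new comment only says that a 32-bit hash type "would be a disaster"; one plausible reading is that distinct k-mer hashes would become indistinguishable once truncated. The following is a minimal sketch of that effect, not upstream code. The two typedefs are taken from the diff; truncate_to_32_bits and collide_when_truncated are hypothetical helpers introduced only for illustration.

```cpp
#include <cstdint>

// Typedefs as they appear on the "+" side of the hunk.
typedef uint64_t uf_hashes_t; // hash value of a k-mer
typedef uint32_t uf_class_t;  // identifier of an element in the UF

// Hypothetical helper: what storing a hash in a 32-bit type would keep.
inline uint32_t truncate_to_32_bits(uf_hashes_t h)
{
    return static_cast<uint32_t>(h); // drops the upper 32 bits
}

// Hypothetical helper: true if two different 64-bit hashes can no longer
// be told apart after truncation.
inline bool collide_when_truncated(uf_hashes_t a, uf_hashes_t b)
{
    return a != b && truncate_to_32_bits(a) == truncate_to_32_bits(b);
}
```

For instance, 0x100000001 and 0x200000001 differ only in their upper 32 bits, so collide_when_truncated reports a collision for that pair, while small values such as 1 and 2 stay distinct.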
... 	... 	@@ -197,6 +194,24 @@ static string skip_first_abundance(const
string& list)
197 	194
  return res;
198 	195
  }
199 	196
	197
+static string make_header(const int seq_size, const string& abundances,
bool all_abundance_counts)
	198
+{
	199
+string header;
	200
+float mean_abundance = get_mean_abundance(abundances);
	201
+uint64_t sum_abundances = get_sum_abundance(abundances);
	202
+if (all_abundance_counts)
	203
+{
	204
+// in this setting, all kmer wabundances are printed in the order of the
kmers in the sequence
	205
+header = "LN:i:" + to_string(seq_size) + " ab:Z:" + abundances;
	206
+}
	207
+else
	208
+{
	209
+// km is not a standard GFA field so i'm putting it in lower case as per
the spec
	210
+header = "LN:i:" + to_string(seq_size) + " KC:i:" +
to_string(sum_abundances) + " km:f:" +
to_string_with_precision(mean_abundance);
	211
+}
	212
+return header;
	213
+}
	214
+
200 	215
  template<int SPAN>
201 	216
  struct markedSeq
202 	217
  {
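To make the two header shapes produced by the new make_header concrete, here is a self-contained sketch. The body of make_header follows the hunk above. The three helpers it calls are not shown in this diff, so the versions below are assumptions: they treat the abundance string as space-separated integers and print the mean with one decimal place, which may differ from what upstream does.

```cpp
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Assumed stand-in: sums space-separated integer abundances.
inline uint64_t get_sum_abundance(const std::string& abundances)
{
    std::istringstream in(abundances);
    uint64_t sum = 0, value = 0;
    while (in >> value) sum += value;
    return sum;
}

// Assumed stand-in: arithmetic mean of the same values (0 if empty).
inline float get_mean_abundance(const std::string& abundances)
{
    std::istringstream in(abundances);
    uint64_t sum = 0, value = 0, count = 0;
    while (in >> value) { sum += value; ++count; }
    return count == 0 ? 0.0f : static_cast<float>(sum) / count;
}

// Assumed stand-in: fixed notation, one decimal place by default.
inline std::string to_string_with_precision(float value, int digits = 1)
{
    std::ostringstream out;
    out << std::fixed << std::setprecision(digits) << value;
    return out.str();
}

// Same logic as the make_header added in the diff.
inline std::string make_header(const int seq_size, const std::string& abundances,
                               bool all_abundance_counts)
{
    std::string header;
    float mean_abundance = get_mean_abundance(abundances);
    uint64_t sum_abundances = get_sum_abundance(abundances);
    if (all_abundance_counts)
    {
        // every k-mer abundance, in sequence order
        header = "LN:i:" + std::to_string(seq_size) + " ab:Z:" + abundances;
    }
    else
    {
        // summary only: total count (KC) and mean (km)
        header = "LN:i:" + std::to_string(seq_size) + " KC:i:" +
                 std::to_string(sum_abundances) + " km:f:" +
                 to_string_with_precision(mean_abundance);
    }
    return header;
}
```

Under these assumptions, a 33-base sequence with abundances "2 4 6" gets the header "LN:i:33 KC:i:12 km:f:4.0" by default and "LN:i:33 ab:Z:2 4 6" when all_abundance_counts is set.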
... 	... 	@@ -699,6 +714,7 @@ void bglue(Storage *storage,
699 	714
  int kmerSize,
700 	715
  int nb_glue_partitions,
701 	716
  int nb_threads,
	717
+bool all_abundance_counts,
702 	718
  bool verbose
703 	719
  )
704 	720
  {
... 	... 	@@ -804,7 +820,7 @@ void bglue(Storage *storage,
804 	820
  }
805 	821
806 	822
  // create a UF data structure
807
-// this one stores nb_uf_keys * uint64_t (actually, atomic's).so it's
bigger than uf_hashes
	823
+// this one stores nb_uf_keys * uint64_t (actually, atomic's).
808 	824
  unionFind ufkmers(nb_uf_keys);
809 	825
810 	826
  #if 0
... 	... 	@@ -911,13 +927,13 @@ void bglue(Storage *storage,
911 	927
  if (only_uf) // for debugging
912 	928
  return;
913 	929
914
-/* now we're mirroring the UF to a vector of uint32_t's, it will take
less space, and strictly same information
	930
+/* now we're mirroring the UF to a vector of uint32_t's(uf_class_t), it
will take less space, and strictly same information
915 	931
  * this is to get rid of the rank (one uint32) per element in the current
UF implementation.
916 	932
  * To do this, we're using the disk to save space of populating one
vector from the other in memory.
917 	933
  * (saves having to allocate both vectors at the same time) */
918 	934
919
-BagFile<uf_hashes_t> *ufkmers_bagf = new BagFile<uf_hashes_t>(prefix+".glue.uf");
LOCAL(ufkmers_bagf);
920
-BagCache<uf_hashes_t> *ufkmers_bag = new BagCache<uf_hashes_t>(
ufkmers_bagf, 10000 ); LOCAL(ufkmers_bag);
	935
+BagFile<uf_class_t> *ufkmers_bagf = new BagFile<uf_class_t>(prefix+".glue.uf");
LOCAL(ufkmers_bagf);
	936
+BagCache<uf_class_t> *ufkmers_bag = new BagCache<uf_class_t>( ufkmers_bagf,
10000 ); LOCAL(ufkmers_bag);
921 	937
922 	938
  for (unsigned long i = 0; i < nb_uf_keys; i++)
923 	939
  //ufkmers_vector[i] = ufkmers.find(i); // just in-memory without the disk
... 	... 	@@ -930,15 +946,15 @@ void bglue(Storage *storage,
930 	946
931 	947
  ufkmers_bag->flush();
932 	948
933
-std::vector<uf_hashes_t> ufkmers_vector(nb_uf_keys);
934
-IteratorFile<uf_hashes_t> ufkmers_file(prefix+".glue.uf");
	949
+std::vector<uf_class_t> ufkmers_vector(nb_uf_keys);
	950
+IteratorFile<uf_class_t> ufkmers_file(prefix+".glue.uf");
935 	951
  unsigned long i = 0;
936 	952
  for (ufkmers_file.first(); !ufkmers_file.isDone(); ufkmers_file.next())
937 	953
  ufkmers_vector[i++] = ufkmers_file.item();
938 	954
939 	955
  System::file().remove (prefix+".glue.uf");
940 	956
941
-logging("loaded 32-bit UF (" +
to_string(nb_uf_keys*sizeof(uf_hashes_t)/1024/1024) + " MB)");
	957
+logging("loaded 32-bit UF (" +
to_string(nb_uf_keys*sizeof(uf_class_t)/1024/1024) + " MB)");
942 	958
943 	959
  // setup output file
944 	960
  string output_prefix = prefix;
... 	... 	@@ -1000,7 +1016,7 @@ void bglue(Storage *storage,
1000 	1016
1001 	1017
  // partition the glue into many files, à la dsk
1002 	1018
  auto partitionGlue = [k, &modelCanon /* crashes if copied!*/, \
1003
-&get_UFclass, &gluePartitions,
	1019
+&get_UFclass, &gluePartitions,all_abundance_counts,
1004 	1020
  &out, &outLock, &nb_seqs_in_partition, nbGluePartitions]
1005 	1021
  (const Sequence& sequence)
1006 	1022
  {
... 	... 	@@ -1024,11 +1040,8 @@ void bglue(Storage *storage,
1024 	1040
  if (!found_class) // this one doesn't need to be glued
1025 	1041
  {
1026 	1042
  const string abundances = comment.substr(3);
1027
-float mean_abundance = get_mean_abundance(abundances);
1028
-uint64_t sum_abundances = get_sum_abundance(abundances);
1029
-
1030
-// km is not a standard GFA field so i'm putting it in lower case as per
the spec
1031
-output(seq, out, "LN:i:" + to_string(seq.size()) + " KC:i:" +
to_string(sum_abundances) + " km:f:" +
to_string_with_precision(mean_abundance));
	1043
+string header = make_header(seq.size(),abundances, all_abundance_counts);
	1044
+output(seq, out, header);
1032 	1045
  return;
1033 	1046
  }
1034 	1047
... 	... 	@@ -1082,7 +1095,7 @@ void bglue(Storage *storage,
1082 	1095
  for (int partition = 0; partition < nbGluePartitions; partition++)
1083 	1096
  {
1084 	1097
  auto glue_partition = [&modelCanon, &ufkmers, partition,
&gluePartition_prefix, nbGluePartitions, &copy_nb_seqs_in_partition,
1085
-&get_UFclass, &out, &outLock, kmerSize]( int thread_id)
	1098
+&get_UFclass, &out, &outLock, kmerSize,all_abundance_counts]( int thread_id)
1086 	1099
  {
1087 	1100
  int k = kmerSize;
1088 	1101
... 	... 	@@ -1172,10 +1185,9 @@ void bglue(Storage *storage,
1172 	1185
  string seq, abs;
1173 	1186
  glue_sequences(seqs_to_glue[i], seqs_to_glue_is_circular[i], sequences,
abundances, kmerSize, seq, abs); // takes as input the indices of
ordered sequences, whether that sequence is circular, and the
markedSeq's themselves along with their abundances
1174 	1187
1175
-float mean_abundance = get_mean_abundance(abs);
1176
-uint32_t sum_abundances = get_sum_abundance(abs);
1177 	1188
  {
1178
-output(seq, out, "LN:i:" + to_string(seq.size()) + " KC:i:" +
to_string(sum_abundances) + " km:f:" +
to_string_with_precision(mean_abundance));
	1189
+string header = make_header(seq.size(),abs, all_abundance_counts);
	1190
+output(seq, out, header);
1179 	1191
  }
1180 	1192
  }
1181 	1193
... 	... 	@@ -1198,7 +1210,7 @@ void bglue(Storage *storage,
1198 	1210
1199 	1211
  logging("end");
1200 	1212
1201
-bool debug_keep_glue_files = true;// for debugging // TODO enable it if
-redo-bglue param was provided (need some info from
UnitigsConstructionAlgorithm).
	1213
+bool debug_keep_glue_files = false;// for debugging // TODO warning: if
debug_keep_glue_files is set to 'false,' then the debug option
'-redo-bglue' cannot work because it needs those bglue files
1202 	1214
  if (debug_keep_glue_files)
1203 	1215
  {
1204 	1216
  std::cout << "debug: not deleting glue files" << std::endl;
# *gatb-core/src/gatb/bcalm2/bglue_algo.hpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#349369b0ff3f84575fe0d9513fb9bfcadb1b3612>
------------------------------------------------------------------------
... 	... 	@@ -150,6 +150,7 @@ void
bglue(gatb::core::tools::storage::impl::Storage* storage,
150 	150
  int kmerSize,
151 	151
  int nb_glue_partitions,
152 	152
  int nb_threads,
	153
+bool all_abundance_counts,
153 	154
  bool verbose
154 	155
  );
155 	156
# *gatb-core/src/gatb/debruijn/impl/Graph.cpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#fa53c8abddac2ffbd797f8ad40c9772134223fed>
------------------------------------------------------------------------
... 	... 	@@ -648,6 +648,8 @@ IOptionsParser* GraphTemplate<Node, Edge,
GraphDataVariant>::getOptionsParser (b
648 	648
  IOptionsParser* parserGeneral = new OptionsParser ("general");
649 	649
  parserGeneral->push_front (new OptionOneParam (STR_INTEGER_PRECISION,
"integers precision (0 for optimized value)", false, "0", false));
650 	650
  parserGeneral->push_front (new OptionOneParam (STR_VERBOSE, "verbosity
level", false, "1" ));
	651
+parserGeneral->push_front (new OptionOneParam
(STR_EDGE_KM_REPRESENTATION, "edge km representation", false, "0" ));
	652
+parserGeneral->push_front (new OptionNoParam (STR_ALL_ABUNDANCE_COUNTS,
"output all k-mer abundance counts instead of mean" ));
651 	653
  parserGeneral->push_front (new OptionOneParam (STR_NB_CORES, "number of
cores", false, "0" ));
652 	654
  parserGeneral->push_front (new OptionNoParam (STR_CONFIG_ONLY, "dump
config only"));
653 	655
... 	... 	@@ -661,7 +663,7 @@ IOptionsParser* GraphTemplate<Node, Edge,
GraphDataVariant>::getOptionsParser (b
661 	663
  parserDebug->push_front (new OptionNoParam ("-skip-links", "same, but
skip links"));
662 	664
  parserDebug->push_front (new OptionNoParam ("-redo-links", "same, but
redo links"));
663 	665
  parserDebug->push_front (new OptionNoParam ("-skip-bglue", "same, but
skip bglue"));
664
-parserDebug->push_front (new OptionNoParam ("-redo-bglue", "same, but
redo bglue"));
	666
+parserDebug->push_front (new OptionNoParam ("-redo-bglue", "same, but
redo bglue(needs debug_keep_glue_files=true in source code)"));
665 	667
  parserDebug->push_front (new OptionNoParam ("-skip-bcalm", "same, but
skip bcalm"));
666 	668
  parserDebug->push_front (new OptionNoParam ("-redo-bcalm", "debug
function, redo the bcalm algo"));
667 	669
# *gatb-core/src/gatb/debruijn/impl/GraphUnitigs.cpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#2ff315124cb4ed9eebb9d35c35743144de29a781>
------------------------------------------------------------------------
... 	... 	@@ -259,7 +259,7 @@ void
GraphUnitigsTemplate<span>::build_unitigs_postsolid(std::string unitigs_fil
259 	259
  }
260 	260
261 	261
  bool redo_bcalm = props->get("-redo-bcalm");
262
-bool redo_bglue = props->get("-redo-bglue");
	262
+bool redo_bglue = props->get("-redo-bglue");// note: if that option is
to be used, make sure to enable debug_keep_glue_files=true in bglue_algo.cpp
263 	263
  bool redo_links = props->get("-redo-links");
264 	264
265 	265
  bool skip_bcalm = props->get("-skip-bcalm");
# *gatb-core/src/gatb/debruijn/impl/LinkTigs.cpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#34ae9ab374303a1000194fa7d58aeba9d58f32ce>
------------------------------------------------------------------------
... 	... 	@@ -52,7 +52,7 @@ namespace gatb { namespace core { namespace
debruijn { namespace impl {
52 	52
  * Normally bcalm outputs consecutive unitig ID's but LinkTigs can also
work with non-consecutive, non-sorted IDs
53 	53
  */
54 	54
  template<size_t span>
55
-void link_tigs(string unitigs_filename, int kmerSize, int nb_threads,
uint64_t &nb_unitigs, bool verbose, bool renumber_unitigs)
	55
+void link_tigs(string unitigs_filename, int kmerSize, int nb_threads,
uint64_t &nb_unitigs, bool verbose, bool edge_km_representation, bool
renumber_unitigs)
56 	56
  {
57 	57
  bcalm_logging = verbose;
58 	58
  BankFasta* out = new BankFasta(unitigs_filename+".linked");
... 	... 	@@ -60,7 +60,7 @@ void link_tigs(string unitigs_filename, int
kmerSize, int nb_threads, uint64_t &
60 	60
  logging("Finding links between unitigs");
61 	61
62 	62
  for (int pass = 0; pass < nb_passes; pass++)
63
-link_unitigs_pass<span>(unitigs_filename, verbose, pass, kmerSize,
renumber_unitigs);
	63
+link_unitigs_pass<span>(unitigs_filename, verbose, pass, kmerSize,
edge_km_representation,renumber_unitigs);
64 	64
65 	65
  write_final_output(unitigs_filename, verbose, out, nb_unitigs,
renumber_unitigs);
66 	66
... 	... 	@@ -265,7 +265,7 @@ static void record_links(uint64_t utig_id,
int pass, const string &link, std::of
265 	265
266 	266
267 	267
  template<size_t span>
268
-void link_unitigs_pass(const string unitigs_filename, bool verbose,
const int pass, const int kmerSize, const bool renumber_unitigs)
	268
+void link_unitigs_pass(const string unitigs_filename, bool verbose,
const int pass, const int kmerSize, bool edge_km_representation, const
bool renumber_unitigs)
269 	269
  {
270 	270
  typedef typename kmer::impl::Kmer<span>::ModelCanonical Model;
271 	271
  typedef typename kmer::impl::Kmer<span>::Type Type;
... 	... 	@@ -376,7 +376,12 @@ void link_unitigs_pass(const string
unitigs_filename, bool verbose, const int pa
376 	376
  //bool rc = e_in.rc ^ (!beginInSameOrientation); // "rc" sets the
destination strand // i don't think it's the right formula because of
k-1-mers that are their self revcomp. see the mikko bug in the test
folder, that provides a nice illustration of that
377 	377
  bool rc = e_in.pos == UNITIG_END; // a better way to determine the rc
flag is just looking at position of e_in k-1-mer
378 	378
379
-in_links += "L:-:" + to_string(e_in.unitig) + ":" + (rc?"-":"+") + " ";
	379
+
	380
+if(edge_km_representation){
	381
+in_links += "J:0:" + to_string(e_in.unitig) + ":" + (rc?"1":"0") + " ";
	382
+}else{
	383
+in_links += "L:-:" + to_string(e_in.unitig) + ":" + (rc?"-":"+") + " ";
	384
+}
380 	385
381 	386
  /* what to do when kmerBegin is same as forward and reverse?
382 	387
  used to have this:
... 	... 	@@ -432,7 +437,13 @@ void link_unitigs_pass(const string
unitigs_filename, bool verbose, const int pa
432 	437
433 	438
  bool rc = e_out.pos == UNITIG_END; // a better way to determine the rc
flag is just looking at position of e_in k-1-mer
434 	439
435
-out_links += "L:+:" + to_string(e_out.unitig) + ":" + (rc?"-":"+") + " ";
	440
+if(edge_km_representation){
	441
+out_links += "J:1:" + to_string(e_out.unitig) + ":" + (rc?"1":"0") + " ";
	442
+}else{
	443
+out_links += "L:+:" + to_string(e_out.unitig) + ":" + (rc?"-":"+") + " ";
	444
+}
	445
+
	446
+
436 	447
437 	448
  if (debug) std::cout << " [valid] ";
438 	449
  }
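The two link hunks above switch between the existing "L:" records and the new "J:" records depending on edge_km_representation. The sketch below gathers the four string patterns from the diff into one function so the mapping is easier to see. format_link and LinkSide are hypothetical names that do not exist upstream; the string layouts themselves are copied from the diff.

```cpp
#include <cstdint>
#include <string>

// Hypothetical: which side of the unitig the link is recorded for.
enum class LinkSide { Incoming, Outgoing };

// Hypothetical helper combining the in_links and out_links branches.
inline std::string format_link(LinkSide side, uint64_t unitig, bool rc,
                               bool edge_km_representation)
{
    const bool outgoing = (side == LinkSide::Outgoing);
    if (edge_km_representation)
    {
        // new form: J:<0 incoming | 1 outgoing>:<unitig>:<1 if rc else 0>
        return std::string("J:") + (outgoing ? "1" : "0") + ":" +
               std::to_string(unitig) + ":" + (rc ? "1" : "0") + " ";
    }
    // existing form: L:<- incoming | + outgoing>:<unitig>:<- if rc else +>
    return std::string("L:") + (outgoing ? "+" : "-") + ":" +
           std::to_string(unitig) + ":" + (rc ? "-" : "+") + " ";
}
```

With this helper, an incoming link to unitig 7 on the reverse strand is "L:-:7:- " in the existing form and "J:0:7:1 " in the new form; both keep the trailing space used as a separator in the diff.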
# *gatb-core/src/gatb/debruijn/impl/LinkTigs.hpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#8d46876b0b01c8687ffec95a3e56a84f1d5ffc83>
------------------------------------------------------------------------
... 	... 	@@ -30,10 +30,10 @@ namespace gatb { namespace core {
namespace debruijn { namespace impl {
30 	30
31 	31
32 	32
  template<size_t SPAN>
33
-void link_tigs( std::string prefix, int kmerSize, int nb_threads,
uint64_t &nb_unitigs, bool verbose, bool renumber_unitigs = false);
	33
+void link_tigs( std::string prefix, int kmerSize, int nb_threads,
uint64_t &nb_unitigs, bool verbose, bool edge_km_representation, bool
renumber_unitigs = false);
34 	34
35 	35
  template<size_t span>
36
-void link_unitigs_pass(const std::string unitigs_filename, bool verbose,
const int pass, const int kmerSize, const bool renumber_unitigs);
	36
+void link_unitigs_pass(const std::string unitigs_filename, bool verbose,
const int pass, const int kmerSize,
bool edge_km_representation, const bool renumber_unitigs);
37 	37
38 	38
  }}}}
39 	39
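A side note on the header change above: edge_km_representation is inserted before the defaulted renumber_unitigs parameter. In C++ this means a caller that previously passed an explicit value for renumber_unitigs still compiles, but that value now binds to edge_km_representation. Adding a parameter also changes the mangled names of the exported template instantiations, which may be relevant to the symbols-file problem suspected in the changelog, although nothing in this diff confirms that. The sketch below shows only the rebinding effect, using two hypothetical, simplified functions rather than the real link_tigs.

```cpp
#include <string>

// Hypothetical stand-in for the old declaration: the optional flag is the
// last parameter and has a default.
inline std::string link_tigs_old(bool verbose, bool renumber_unitigs = false)
{
    (void)verbose; // unused in this sketch
    return std::string("renumber=") + (renumber_unitigs ? "1" : "0");
}

// Hypothetical stand-in for the new declaration: a required flag is
// inserted before the defaulted one.
inline std::string link_tigs_new(bool verbose, bool edge_km_representation,
                                 bool renumber_unitigs = false)
{
    (void)verbose; // unused in this sketch
    return std::string("edge_km=") + (edge_km_representation ? "1" : "0") +
           " renumber=" + (renumber_unitigs ? "1" : "0");
}
```

Calling link_tigs_old(true, true) yields "renumber=1", whereas the textually identical call link_tigs_new(true, true) yields "edge_km=1 renumber=0". Callers that relied on the default, such as the one updated in UnitigsConstructionAlgorithm.cpp, fail to compile until they pass the new argument, which is the safer failure mode.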
# *gatb-core/src/gatb/debruijn/impl/UnitigsConstructionAlgorithm.cpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#47e6fff3caa93f4ef7cc54b7f08ec8ac25b21393>
------------------------------------------------------------------------
... 	... 	@@ -91,17 +91,15 @@
UnitigsConstructionAlgorithm<span>::~UnitigsConstructionAlgorithm ()
91 	91
  template <size_t span>
92 	92
  void UnitigsConstructionAlgorithm<span>::execute ()
93 	93
  {
94
-kmerSize =
95
-getInput()->getInt(STR_KMER_SIZE);
96
-int abundance =
97
-getInput()->getInt(STR_KMER_ABUNDANCE_MIN); // note: doesn't work when
it's "auto"
98
-int minimizerSize =
99
-getInput()->getInt(STR_MINIMIZER_SIZE);
100
-int nb_threads =
101
-getInput()->getInt(STR_NB_CORES);
102
-int minimizer_type =
103
-getInput()->getInt(STR_MINIMIZER_TYPE);
104
-bool verbose = getInput()->getInt(STR_VERBOSE);
	94
+kmerSize = getInput()->getInt(STR_KMER_SIZE);
	95
+int abundance = getInput()->getInt(STR_KMER_ABUNDANCE_MIN); // note:
doesn't work when it's "auto"
	96
+int minimizerSize = getInput()->getInt(STR_MINIMIZER_SIZE);
	97
+int nb_threads = getInput()->getInt(STR_NB_CORES);
	98
+int minimizer_type = getInput()->getInt(STR_MINIMIZER_TYPE);
	99
+bool verbose = getInput()->getInt(STR_VERBOSE);
	100
+bool edge_km_representation =
getInput()->getInt(STR_EDGE_KM_REPRESENTATION);
	101
+bool all_abundance_counts = getInput()->get(STR_ALL_ABUNDANCE_COUNTS);
	102
+
105 	103
  int nb_glue_partitions = 0;
106 	104
  if (getInput()->get("-nb-glue-partitions"))
107 	105
  nb_glue_partitions = getInput()->getInt("-nb-glue-partitions");
... 	... 	@@ -110,9 +108,9 @@ void
UnitigsConstructionAlgorithm<span>::execute ()
110 	108
  if ((unsigned int)nb_threads > nbThreads)
111 	109
  std::cout << "Uh. Unitigs graph construction called with nb_threads " <<
nb_threads << " but dispatcher has nbThreads " << nbThreads << std::endl;
112 	110
113
-if (do_bcalm) bcalm2<span>(&_storage, unitigs_filename, kmerSize,
abundance, minimizerSize, nbThreads, minimizer_type, verbose);
114
-if (do_bglue) bglue<span> (&_storage, unitigs_filename, kmerSize,
nb_glue_partitions, nbThreads, verbose);
115
-if (do_links) link_tigs<span>(unitigs_filename, kmerSize, nbThreads,
nb_unitigs, verbose);
	111
+if (do_bcalm) bcalm2<span>(&_storage, unitigs_filename, kmerSize,
abundance, minimizerSize, nbThreads, minimizer_type, verbose);
	112
+if (do_bglue) bglue<span> (&_storage, unitigs_filename, kmerSize,
nb_glue_partitions, nbThreads, all_abundance_counts, verbose);
	113
+if (do_links) link_tigs<span>(unitigs_filename, kmerSize, nbThreads,
nb_unitigs, verbose,edge_km_representation);
116 	114
117 	115
  /** We gather some statistics. */
118 	116
  // nb_unitigs will be used in GraphUnitigs
# *gatb-core/src/gatb/kmer/impl/SortingCountAlgorithm.cpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#ed783240a52e80bd1c3a470d6646b3f6b3ab63e4>
------------------------------------------------------------------------
... 	... 	@@ -1300,6 +1300,16 @@ void
SortingCountAlgorithm<span>::fillPartitions (size_t pass, Iterator<Sequence
1300 	1300
  itBanks[i]->finalize();
1301 	1301
  }
1302 	1302
  }
	1303
+
	1304
+// force close partitions and re-open them for reading
	1305
+// may prevent crash in large multi-bank counting instance on Lustre
filesystems
	1306
+if(_config._solidityKind != KMER_SOLIDITY_SUM)
	1307
+{
	1308
+string tmpStorageName = getInput()->getStr(STR_URI_OUTPUT_TMP) + "/" +
System::file().getTemporaryFilename("dsk_partitions");
	1309
+setPartitions (0); // close the partitions first, otherwise new files
are opened before closing parti from previous pass
	1310
+setPartitions ( & (*_tmpPartitionsStorage)().getPartition<Type> ("parts"));
	1311
+
	1312
+}
1303 	1313
  }
1304 	1314
1305 	1315
  /*********************************************************************
# *gatb-core/src/gatb/system/impl/FileSystemCommon.hpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#6dc8c95d8c6c04e96a48b4cf5c3c2ea31cc93750>
------------------------------------------------------------------------
... 	... 	@@ -36,6 +36,7 @@
36 	36
  #include <string.h>
37 	37
  #include <sys/stat.h>
38 	38
  #include <unistd.h>
	39
+#include <iostream>
39 	40
40 	41
  /********************************************************************************/
41 	42
  namespace gatb {
... 	... 	@@ -60,6 +61,7 @@ public:
60 	61
  {
61 	62
  _isStdout = path && strcmp(path,"stdout")==0;
62 	63
  _handle = _isStdout ? stdout : fopen (path, mode);
	64
+//std::cout << "opening file " << _path << " handle " << _handle <<
std::endl;
63 	65
  if(_handle == 0)
64 	66
  {
65 	67
  throw Exception ("cannot open %s %s",path,strerror(errno));
... 	... 	@@ -67,7 +69,9 @@ public:
67 	69
  }
68 	70
69 	71
  /** Destructor. */
70
-virtual ~CommonFile () { if (_handle && !_isStdout) { fclose (_handle); } }
	72
+virtual ~CommonFile () { if (_handle && !_isStdout) {
	73
+//std::cout << "closing file " << _path << " handle " << _handle <<
std::endl;
	74
+fclose (_handle); } }
71 	75
72 	76
  /** \copydoc IFile::isOpen */
73 	77
  bool isOpen () { return getHandle() != 0; }
# *gatb-core/src/gatb/template/TemplateSpecialization10.cpp.in*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#4c12716c6fcb516eda353ab99ab2b81002b77515>
------------------------------------------------------------------------
@@ -25,15 +25,16 @@ template void bglue<${KSIZE}>(Storage* storage,
 int kmerSize,
 int nb_glue_partitions,
 int nb_threads,
+bool all_abundance_counts,
 bool verbose
 );

 template class graph3<${KSIZE}>; // graph3<span> switch

 template void link_tigs<${KSIZE}>
-(std::string unitigs_filename, int kmerSize, int nb_threads, uint64_t &nb_unitigs, bool verbose, bool renumber_unitigs = false);
+(std::string unitigs_filename, int kmerSize, int nb_threads, uint64_t &nb_unitigs, bool verbose, bool edge_km_representation, bool renumber_unitigs = false);

-template void link_unitigs_pass<${KSIZE}>(const std::string unitigs_filename, bool verbose, const int pass, const int kmerSize, const bool renumber_unitigs);
+template void link_unitigs_pass<${KSIZE}>(const std::string unitigs_filename, bool verbose, const int pass, const int kmerSize, bool edge_km_representation, const bool renumber_unitigs);


 /********************************************************************************/
# *gatb-core/src/gatb/tools/collections/impl/IteratorFile.hpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#9be95217b1b4429f2eeabc2213c457fd389f08a0>
------------------------------------------------------------------------
@@ -239,6 +239,7 @@ public:
 _filename(it._filename), _gzfile(0), _buffer(0), _cpt_buffer(0), _idx(0), _cacheItemsNb(it._cacheItemsNb), _isDone(true)
 {
 _gzfile = gzopen(_filename.c_str(),"rb");
+gzbuffer(_gzfile,2*1024*1024);
 _buffer = (Item*) MALLOC (sizeof(Item) * _cacheItemsNb);
 }

@@ -248,6 +249,7 @@ public:

 {
 _gzfile = gzopen(_filename.c_str(),"rb");
+gzbuffer(_gzfile,2*1024*1024);
 _buffer = (Item*) MALLOC (sizeof(Item) * _cacheItemsNb);
 }

@@ -273,6 +275,7 @@ public:
 _isDone = it._isDone;

 _gzfile = gzopen(_filename.c_str(),"r");
+gzbuffer(_gzfile,2*1024*1024);
 _buffer = (Item*) MALLOC (sizeof(Item) * it._cacheItemsNb);
 }
 return *this;
# *gatb-core/src/gatb/tools/misc/api/StringsRepository.hpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#4ae6c527aafe52f4f16907962f525902d4b5ef6f>
------------------------------------------------------------------------
@@ -83,6 +83,8 @@ public:
 const char* graph () { return "-graph"; }
 const char* kmer_size () { return "-kmer-size"; }
 const char* minimizer_size () { return "-minimizer-size"; }
+const char* edge_km_representation () { return "-edge-km"; }
+const char* all_abundance_counts () { return "-all-abundance-counts"; }
 const char* kmer_abundance () { return "-abundance"; }
 const char* kmer_abundance_min () { return "-abundance-min"; }
 const char* kmer_abundance_min_threshold () { return "-abundance-min-threshold"; }
@@ -138,6 +140,8 @@ public:
 #define STR_URI_GRAPH gatb::core::tools::misc::StringRepository::singleton().graph ()
 #define STR_KMER_SIZE gatb::core::tools::misc::StringRepository::singleton().kmer_size ()
 #define STR_MINIMIZER_SIZE gatb::core::tools::misc::StringRepository::singleton().minimizer_size ()
+#define STR_EDGE_KM_REPRESENTATION gatb::core::tools::misc::StringRepository::singleton().edge_km_representation ()
+#define STR_ALL_ABUNDANCE_COUNTS gatb::core::tools::misc::StringRepository::singleton().all_abundance_counts ()
 #define STR_INTEGER_PRECISION gatb::core::tools::misc::StringRepository::singleton().integer_precision ()
 #define STR_KMER_ABUNDANCE gatb::core::tools::misc::StringRepository::singleton().kmer_abundance ()
 #define STR_KMER_ABUNDANCE_MIN gatb::core::tools::misc::StringRepository::singleton().kmer_abundance_min ()
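The two hunks above follow one pattern: each option name is an accessor on a singleton, and a macro gives call sites a short alias for it. A reduced sketch of that pattern, assuming a hypothetical OptionNames class and OPT_* macros (only the option strings themselves are taken from the diff), could look like this:

```cpp
#include <cstring>

// Hypothetical, reduced version of the StringRepository pattern.
class OptionNames {
public:
    // Function-local static: constructed on first use, shared afterwards.
    static OptionNames& singleton() {
        static OptionNames instance;
        return instance;
    }

    const char* kmer_size() const { return "-kmer-size"; }
    const char* edge_km_representation() const { return "-edge-km"; }
    const char* all_abundance_counts() const { return "-all-abundance-counts"; }

private:
    OptionNames() = default;
};

// Short aliases so call sites do not spell out the singleton access.
#define OPT_KMER_SIZE OptionNames::singleton().kmer_size()
#define OPT_EDGE_KM_REPRESENTATION OptionNames::singleton().edge_km_representation()
#define OPT_ALL_ABUNDANCE_COUNTS OptionNames::singleton().all_abundance_counts()

// Example call site: does a command-line token name the new -edge-km option?
inline bool isEdgeKmOption(const char* token) {
    return token != nullptr &&
           std::strcmp(token, OPT_EDGE_KM_REPRESENTATION) == 0;
}
```

Adding an option then means touching two places, the accessor and its macro, which is what the upstream commit does for -edge-km and -all-abundance-counts.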
# *gatb-core/src/gatb/tools/misc/impl/Tool.cpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#3b8fda1a420e311e4fb4b0d9fe344d4bbc2f4e97>
------------------------------------------------------------------------
@@ -57,7 +57,6 @@ Tool::Tool (const std::string& name) : userDisplayHelp(0), _helpTarget(0),userDi

 getParser()->push_back (new OptionOneParam (STR_NB_CORES, "number of cores", false, "0" ));
 getParser()->push_back (new OptionOneParam (STR_VERBOSE, "verbosity level", false, "1" ));
-
 getParser()->push_back (new OptionNoParam (STR_VERSION, "version", false));
 getParser()->push_back (new OptionNoParam (STR_HELP, "help", false));

# *gatb-core/src/gatb/tools/storage/impl/CollectionHDF5Patch.hpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#f84f48b86f89b6fe6f3b401f6ed674d1ffab829b>
------------------------------------------------------------------------
@@ -266,6 +266,7 @@ public:
 herr_t status = 0;

 {
+//std::cout << "begin insert" << std::endl;
 system::LocalSynchronizer localsynchro (_common->_synchro);

 /** We get the dataset id. */
@@ -300,6 +301,7 @@ public:
 status = H5Sclose (filespaceId);
 status = H5Sclose (memspaceId);
 if (status != 0) { std::cout << "err H5Sclose" << std::endl; }
+//std::cout << "end insert" << std::endl;
 }

 /** We periodically clean up some HDF5 resources. */
@@ -373,12 +375,14 @@ private:
 * NOTE !!! the 'clean' method called after this block is also synchronized,
 * and therefore must not be in the same instruction block. */
 {
+//std::cout << "begin retrievecache" << std::endl;
 system::LocalSynchronizer localsynchro (_common->_synchro);

 hid_t memspaceId = H5Screate_simple (1, &count, NULL);

 /** Select hyperslab on file dataset. */
 hid_t filespaceId = H5Dget_space(_common->getDatasetId());
+//std::cout << "filespaceId " << filespaceId << std::endl;
 status = H5Sselect_hyperslab (filespaceId, H5S_SELECT_SET, &start, NULL, &count, NULL);
 if (status < 0) { throw gatb::core::system::Exception ("HDF5 error (H5Sselect_hyperslab), status %d", status); }

@@ -390,6 +394,7 @@ private:
 status = H5Sclose (filespaceId);
 status = H5Sclose (memspaceId);
 if (status < 0) { throw gatb::core::system::Exception ("HDF5 error (H5Sclose), status %d", status); }
+//std::cout << "end retrievecache" << std::endl;
 }

 /** We periodically clean up some HDF5 resources. */
# *gatb-core/src/gatb/tools/storage/impl/Storage.hpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#9e0109d139212f56eef6153e59e64fa4c6670360>
------------------------------------------------------------------------
@@ -181,7 +181,7 @@ public:

 /** Get a child partition from its name. Created if not already exists.
 * \param[in] name : name of the child partition to be retrieved.
-* \param[in] nb : in case of creation, tells how many collection belong to the partition.
+* \param[in] nb : in case of creation, tells how many collection belong to the partition.IMPORTANT: if nb != 0, StorageFile will erase the partition before opening it. So if you're opening a partition, just set nb=0 and let it autodetect the size
 * \return the child partition.
 */
 template <class Type> Partition<Type>& getPartition (const std::string& name, size_t nb=0);
# *gatb-core/src/gatb/tools/storage/impl/StorageFile.hpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#efdf123a4ecf95f1a9326ff6d3443694ceed944f>
------------------------------------------------------------------------
@@ -96,7 +96,8 @@ namespace impl {
 /** */
 ~GroupFile()
 {
-system::impl::System::file().rmdir(folder); // hack to remove the trashme folers. I'd have liked to make that call in remove() but for some reason remove() isn't called
+//std::cout << "groupfile destructor called, removing folder " << folder << std::endl;
+system::impl::System::file().rmdir(folder); // hack to remove the trashme folers. I'd have liked to make that call in remove() but for some reason remove() isn't called
 }

 /** */
@@ -219,17 +220,24 @@ public:
 if (!system::impl::System::file().isFolderEndingWith(storage_prefix,"_gatb"))
 file_folder += "_gatb/";

-std::string filename = file_folder + parent->getFullId('.') + std::string(".") + name;
-std::string folder = system::impl::System::file().getDirectory(filename);
-std::string prefix = system::impl::System::file().getBaseName(filename) + std::string(".") + name; // because gatb's getBaseName is stupid and cuts after the last dot
+std::string full_path = file_folder;
+std::string parent_base = parent->getFullId('.');
+std::string base_name = parent_base;
+if (parent_base.size() > 0)
+base_name += std::string("."); // because gatb's getBaseName is stupid and cuts after the last dot
+base_name += name;
+
+full_path += base_name; // but then base_name might have a suffix like ".1" for partitions
+
+//std::cout <<"name: " << name << " filename " << full_path << " prefix " << base_name<< std::endl;

 if (nb == 0)
 { // if nb is 0, it means we're opening partitions and not creating them, thus we need to get the number of partitions.

 int nb_partitions=0;
-for (auto filename : system::impl::System::file().listdir(folder))
+for (auto filename : system::impl::System::file().listdir(file_folder))
 {
-if (!filename.compare(0, prefix.size(),prefix)) // startswith
+if (!filename.compare(0, base_name.size(),base_name)) // startswith
 {
 nb_partitions++;
 }
@@ -240,19 +248,20 @@ public:
 std::cout << "error: could not get number of partition for " << name << " using StorageFile" << std::endl;
 exit(1);
 }
+//std::cout << "got " << nb << " partitions" << std::endl;
 }
 else
 {
 // else, if nb is set, means we're creating some partitions. let's delete all the previous ones to avoid wrongly counting
-for (auto filename : system::impl::System::file().listdir(folder))
+for (auto filename : system::impl::System::file().listdir(file_folder))
 {
-//std::cout <<"name: " << name << " comparing " << filename << " with prefix " << prefix << std::endl;
-if (!filename.compare(0, prefix.size(),prefix)) // startswith
+//std::cout <<"name: " << name << " comparing " << filename << " with prefix " << base_name << std::endl;
+if (!filename.compare(0, base_name.size(),base_name)) // startswith
 {
 // some additional guard:
 if (filename == "." ||filename == "..") continue;
-system::impl::System::file().remove(folder + "/" + filename);
-//std::cout << "deleting" << folder << "/" << filename << std::endl;
+system::impl::System::file().remove(file_folder + "/" + filename);
+//std::cout << "deleting" << file_folder << "/" << filename << std::endl;
 }
 }
 }
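The partition autodetection in the StorageFile.hpp hunks above counts directory entries whose name starts with base_name, using the compare(0, n, prefix) == 0 idiom. The following is a small standalone sketch of that counting step, not the upstream code: the function names startsWith and countPartitions are hypothetical, the directory listing is passed in as a plain vector rather than read from disk, and skipping "." and ".." in the counting loop is a choice made for this sketch (the diff applies that guard only in the deleting branch).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// True when `name` begins with `prefix`; same idiom as in the diff.
inline bool startsWith(const std::string& name, const std::string& prefix) {
    return name.compare(0, prefix.size(), prefix) == 0;
}

// Count entries that begin with `baseName`, ignoring "." and "..".
inline std::size_t countPartitions(const std::vector<std::string>& entries,
                                   const std::string& baseName) {
    std::size_t count = 0;
    for (const std::string& entry : entries) {
        if (entry == "." || entry == "..") {
            continue;
        }
        if (startsWith(entry, baseName)) {
            ++count;
        }
    }
    return count;
}
```

A pure prefix test is permissive: with base name "dsk.parts", an unrelated entry such as "dsk.partsX" is counted along with "dsk.parts.0" and "dsk.parts.1". Appending the separator to the prefix ("dsk.parts.") is one way to narrow the match, at the cost of assuming every partition file carries a dotted suffix.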
# *gatb-core/src/gatb/tools/storage/impl/StorageHDF5.hpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#dd31cf0d08329fcf5ce3e2647df00c99baa267ba>
------------------------------------------------------------------------
@@ -268,6 +268,7 @@ private:
 std::string actualName = this->getFullId('/');

 /** We create the HDF5 group if needed. */
+//std::cout << "actualname: "<< actualName << " end"<<std::endl;
 htri_t doesExist = H5Lexists (storage->getFileId(), actualName.c_str(), H5P_DEFAULT);

 if (doesExist <= 0)
# *gatb-core/test/unit/src/debruijn/TestDebruijn.cpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#d7401b2bb5a2b6245ce8cd38b4b1c20e7ea1a058>
------------------------------------------------------------------------
@@ -87,6 +87,7 @@ class TestDebruijn : public Test
 /********************************************************************************/
 CPPUNIT_TEST_SUITE_GATB (TestDebruijn);

+CPPUNIT_TEST_GATB (debruijn_build);
 CPPUNIT_TEST_GATB (debruijn_test_small_kmers);
 CPPUNIT_TEST_GATB (debruijn_large_abundance_query);
 CPPUNIT_TEST_GATB (debruijn_test7);
@@ -104,7 +105,6 @@ class TestDebruijn : public Test
 CPPUNIT_TEST_GATB (debruijn_test12);
 CPPUNIT_TEST_GATB (debruijn_test13);
 // CPPUNIT_TEST_GATB (debruijn_mutation); // has been removed due to it crashing clang, and since mutate() isn't really used in apps, i didn't bother.
-CPPUNIT_TEST_GATB (debruijn_build);
 CPPUNIT_TEST_GATB (debruijn_checkbranching);
 CPPUNIT_TEST_GATB (debruijn_mphf);
 CPPUNIT_TEST_GATB (debruijn_mphf_nodeindex);
@@ -908,13 +908,22 @@ public:
 IBank* inputBank = new BankStrings (sequences, nbSequences);
 LOCAL (inputBank);

+
+//std::cout << "g1 create" << std::endl;
 Graph::create (inputBank, "-kmer-size 31 -out %s -abundance-min 1 -verbose 0 -max-memory %d", "g1", MAX_MEMORY);
+
+//std::cout << "g2 create" << std::endl;
 Graph::create (inputBank, "-kmer-size 31 -out %s -abundance-min 1 -verbose 0 -branching-nodes none -max-memory %d", "g2", MAX_MEMORY);
-Graph::create (inputBank, "-kmer-size 31 -out %s -abundance-min 1 -verbose 0 -solid-kmers-out none -max-memory %d", "g3", MAX_MEMORY);
+
+// This test doesn't work anymore.
+// It's probably a small fix somewehre
+// But I'd argue that the gatb feature of 'not outputting solid kmers to disk' is useless
+// So instead of bothering, I'm just removing the present unit test.
+//Graph::create (inputBank, "-kmer-size 31 -out %s -abundance-min 1 -verbose 0 -solid-kmers-out none -debloom none -branching-nodes none -max-memory %d", "g3", MAX_MEMORY);

 debruijn_build_entry r1 = debruijn_build_aux_aux ("g1", true, true);
 debruijn_build_entry r2 = debruijn_build_aux_aux ("g2", true, true);
-debruijn_build_entry r3 = debruijn_build_aux_aux ("g3", false, true);
+//debruijn_build_entry r3 = debruijn_build_aux_aux ("g3", false, true);

 CPPUNIT_ASSERT (r1.nbNodes == r2.nbNodes);
 CPPUNIT_ASSERT (r1.checksumNodes == r2.checksumNodes);
@@ -925,8 +934,8 @@ public:

 CPPUNIT_ASSERT (r1.nbBranchingNodes == r2.nbBranchingNodes);
 CPPUNIT_ASSERT (r1.checksumBranchingNodes == r2.checksumBranchingNodes);
-CPPUNIT_ASSERT(r1.nbBranchingNodes==r3.nbBranchingNodes);
-CPPUNIT_ASSERT (r1.checksumBranchingNodes == r3.checksumBranchingNodes);
+//CPPUNIT_ASSERT (r1.nbBranchingNodes == r3.nbBranchingNodes); // uncomment if we ever fix r3 (see long comment above)
+//CPPUNIT_ASSERT (r1.checksumBranchingNodes == r3.checksumBranchingNodes);
 }

 /********************************************************************************/
# *gatb-core/test/unit/src/kmer/TestDSK.cpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#49ce0c04f76f93b18c25528d056dc803569026e0>
------------------------------------------------------------------------
@@ -471,6 +471,9 @@ public:
 // printf ("min=%ld max=%ld nb=%ld check=%ld \n",
 // nksMin, nksMax, sortingCount.getSolidCounts()->getNbItems(),checkNb
 // );
+
+if (sortingCount.getSolidCounts()->getNbItems() != (int)checkNb)
+std::cout << "counted " <<sortingCount.getSolidCounts()->getNbItems()<< " kmers, expected " << (int)checkNb << std::endl;

 CPPUNIT_ASSERT (sortingCount.getSolidCounts()->getNbItems() == (int)checkNb);
 }
# *gatb-core/test/unit/src/tools/storage/TestStorage.cpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#e2e4e0dc5c61805b798db5e2dee529b92e327235>
------------------------------------------------------------------------
@@ -132,7 +132,10 @@ public:
 {
 size_t nbIter = 0;
 Iterator<NativeInt64>* it = partition[i].iterator(); LOCAL(it);
-for (it->first(); !it->isDone(); it->next(), nbIter++) { CPPUNIT_ASSERT (it->item() == 2*i); }
+for (it->first(); !it->isDone(); it->next(), nbIter++) {
+if (it->item() != 2*i)
+std::cout << std::endl << "item " << it->item() << " expected: " << 2*i << std::endl;
+CPPUNIT_ASSERT (it->item() == 2*i); }
 CPPUNIT_ASSERT (nbIter == 1);
 }

@@ -152,7 +155,9 @@ public:
 Iterator<NativeInt64>* it = partition[i].iterator(); LOCAL(it);
 for (it->first(); !it->isDone(); it->next(), nbIter++)
 {
-if (nbIter==0) { CPPUNIT_ASSERT (it->item() == 2*i ); }
+if (nbIter==0) { if (it->item() != 2*i)
+std::cout << "item " << it->item() << " expected: " << 2*i << std::endl;
+CPPUNIT_ASSERT (it->item() == 2*i ); }
 if (nbIter==1) { CPPUNIT_ASSERT (it->item() == 2*i+1); }
 }
 CPPUNIT_ASSERT (nbIter == 2);
# *gatb-core/thirdparty/update-boost.sh*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#e79cab2022b11c71b2d8095f0ebe49260b3cc139>
------------------------------------------------------------------------
+#this is the procedure I use to update to newer versions of boost in gatb-core
+#pretty simple but gets the job done
+#to be run within thirdparty/
+#-Rayan
+
+newdir=boost_1_71_0/boost/
+olddir=boost
+
+for file in `ls $olddir`
+do
+echo $file
+cp -R $newdir/$file $olddir/
+done
—
View it on GitLab
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad>.