[Debian-med-packaging] I failed Fwd: [med-svn] [Git][med-team/gatb-core][master] 7 commits: New upstream version
Steffen Möller
steffen_moeller at gmx.de
Thu Dec 5 00:21:23 GMT 2019
Hi, hi Andres in particular,
I tried my best with routine-update on this one, but the package now
fails to build (FTBFS) and I suspect the symbols files of the shared
library. This seems related to your previous fix. Could you please have
a look?
Cheers,
Steffen
-------- Forwarded Message --------
Subject: [med-svn] [Git][med-team/gatb-core][master] 7 commits: New
upstream version
Date: Wed, 04 Dec 2019 23:52:04 +0000
From: Steffen Möller <gitlab at salsa.debian.org>
Reply-To: noreply at salsa.debian.org
To: debian-med-commit at lists.alioth.debian.org
Steffen Möller pushed to branch master at Debian Med / gatb-core
<https://salsa.debian.org/med-team/gatb-core>
Commits:
* *f9b878ed
<https://salsa.debian.org/med-team/gatb-core/commit/f9b878ed15fdddf238c53c3d55c1ce95614a9c87>*
by Steffen Moeller /at 2019-12-04T23:22:51Z/
New upstream version
* *d6238780
<https://salsa.debian.org/med-team/gatb-core/commit/d6238780168fff561ee805fa30923f988a8b9a3e>*
by Steffen Moeller /at 2019-12-04T23:22:52Z/
New upstream version 1.4.1+git20191130.664696c+dfsg
* *79bb050d
<https://salsa.debian.org/med-team/gatb-core/commit/79bb050d74f76e42f0b97d4a18c90b1de5c2c6db>*
by Steffen Moeller /at 2019-12-04T23:22:56Z/
Update upstream source from tag 'upstream/1.4.1+git20191130.664696c+dfsg'
Update to upstream version '1.4.1+git20191130.664696c+dfsg'
with Debian dir df4adf125d7696c65abb84dcc14f05bf87112c14
* *94a8395a
<https://salsa.debian.org/med-team/gatb-core/commit/94a8395a6a637982e9e4f1288366e79e10705388>*
by Steffen Moeller /at 2019-12-04T23:22:59Z/
Standards-Version: 4.4.1
* *f2ed00d9
<https://salsa.debian.org/med-team/gatb-core/commit/f2ed00d9a9919e39a8221f149baeaded6dd9170a>*
by Steffen Moeller /at 2019-12-04T23:23:01Z/
Set upstream metadata fields: Repository-Browse.
* *61534d82
<https://salsa.debian.org/med-team/gatb-core/commit/61534d822bc66ac7c4c18a62eb4f00f6acbb06d9>*
by Steffen Moeller /at 2019-12-04T23:23:01Z/
Remove obsolete fields Name from debian/upstream/metadata.
* *c58b23ef
<https://salsa.debian.org/med-team/gatb-core/commit/c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad>*
by Steffen Moeller /at 2019-12-04T23:51:27Z/
FTBFS
27 changed files:
* debian/changelog <#9c96da0e9f91d7d8937b69b524702c106258f0d1>
* debian/control <#58ef006ab62b83b4bec5d81fe5b32c3b4c2d1cc2>
* debian/upstream/metadata <#f5606a935d95a2f20059a7ac1136f47b2edadbf6>
* gatb-core/CMakeLists.txt <#4c007bda60857aed5186f73b4dd88ee753952d0d>
* gatb-core/doc/doxygen/src/dbgh5page.hpp
<#2fba22cc64bca97d995e238b012045ac90567b80>
* gatb-core/src/gatb/bank/impl/BankFasta.cpp
<#6861fd4c43e7c21bb057ecc2ad7a189b8aaf5b33>
* gatb-core/src/gatb/bcalm2/bglue_algo.cpp
<#490f86cdfc07e7db3455fffd8859052bd5dbeb50>
* gatb-core/src/gatb/bcalm2/bglue_algo.hpp
<#349369b0ff3f84575fe0d9513fb9bfcadb1b3612>
* gatb-core/src/gatb/debruijn/impl/Graph.cpp
<#fa53c8abddac2ffbd797f8ad40c9772134223fed>
* gatb-core/src/gatb/debruijn/impl/GraphUnitigs.cpp
<#2ff315124cb4ed9eebb9d35c35743144de29a781>
* gatb-core/src/gatb/debruijn/impl/LinkTigs.cpp
<#34ae9ab374303a1000194fa7d58aeba9d58f32ce>
* gatb-core/src/gatb/debruijn/impl/LinkTigs.hpp
<#8d46876b0b01c8687ffec95a3e56a84f1d5ffc83>
* gatb-core/src/gatb/debruijn/impl/UnitigsConstructionAlgorithm.cpp
<#47e6fff3caa93f4ef7cc54b7f08ec8ac25b21393>
* gatb-core/src/gatb/kmer/impl/SortingCountAlgorithm.cpp
<#ed783240a52e80bd1c3a470d6646b3f6b3ab63e4>
* gatb-core/src/gatb/system/impl/FileSystemCommon.hpp
<#6dc8c95d8c6c04e96a48b4cf5c3c2ea31cc93750>
* gatb-core/src/gatb/template/TemplateSpecialization10.cpp.in
<#4c12716c6fcb516eda353ab99ab2b81002b77515>
* gatb-core/src/gatb/tools/collections/impl/IteratorFile.hpp
<#9be95217b1b4429f2eeabc2213c457fd389f08a0>
* gatb-core/src/gatb/tools/misc/api/StringsRepository.hpp
<#4ae6c527aafe52f4f16907962f525902d4b5ef6f>
* gatb-core/src/gatb/tools/misc/impl/Tool.cpp
<#3b8fda1a420e311e4fb4b0d9fe344d4bbc2f4e97>
* gatb-core/src/gatb/tools/storage/impl/CollectionHDF5Patch.hpp
<#f84f48b86f89b6fe6f3b401f6ed674d1ffab829b>
* gatb-core/src/gatb/tools/storage/impl/Storage.hpp
<#9e0109d139212f56eef6153e59e64fa4c6670360>
* gatb-core/src/gatb/tools/storage/impl/StorageFile.hpp
<#efdf123a4ecf95f1a9326ff6d3443694ceed944f>
* gatb-core/src/gatb/tools/storage/impl/StorageHDF5.hpp
<#dd31cf0d08329fcf5ce3e2647df00c99baa267ba>
* gatb-core/test/unit/src/debruijn/TestDebruijn.cpp
<#d7401b2bb5a2b6245ce8cd38b4b1c20e7ea1a058>
* gatb-core/test/unit/src/kmer/TestDSK.cpp
<#49ce0c04f76f93b18c25528d056dc803569026e0>
* gatb-core/test/unit/src/tools/storage/TestStorage.cpp
<#e2e4e0dc5c61805b798db5e2dee529b92e327235>
* + gatb-core/thirdparty/update-boost.sh
<#e79cab2022b11c71b2d8095f0ebe49260b3cc139>
Changes:
# *debian/changelog*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#9c96da0e9f91d7d8937b69b524702c106258f0d1>
------------------------------------------------------------------------
1
+gatb-core (1.4.1+git20191130.664696c+dfsg-1) UNRELEASED; urgency=medium
2
+
3
+* Team upload.
4
+* New upstream version
5
+* Standards-Version: 4.4.1
6
+* Set upstream metadata fields: Repository-Browse.
7
+* Remove obsolete fields Name from debian/upstream/metadata.
8
+
9
+* FTBFS: Problem with symbols files, I presume
10
+
11
+-- Steffen Moeller <moeller at debian.org> Thu, 05 Dec 2019 00:23:01 +0100
12
+
1 13
gatb-core (1.4.1+git20190813.a73b6dd+dfsg-1) unstable; urgency=medium
2 14
3 15
* New upstream version
# *debian/control*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#58ef006ab62b83b4bec5d81fe5b32c3b4c2d1cc2>
------------------------------------------------------------------------
... ... @@ -13,7 +13,7 @@ Build-Depends: debhelper-compat (= 12),
13 13
libjsoncpp-dev,
14 14
doxygen,
15 15
graphviz
16
-Standards-Version: 4.4.0
16
+Standards-Version: 4.4.1
17 17
Vcs-Browser: https://salsa.debian.org/med-team/gatb-core
18 18
Vcs-Git: https://salsa.debian.org/med-team/gatb-core.git
19 19
Homepage: https://github.com/GATB/gatb-core
# *debian/upstream/metadata*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#f5606a935d95a2f20059a7ac1136f47b2edadbf6>
------------------------------------------------------------------------
1
-Name: gatb-core
2 1
Cite-As: >
3 2
E. Drezen, G. Rizk, R. Chikhi, C. Deltel, C. Lemaitre, P. Peterlongo,
4 3
D. Lavenier. (2014)
... ... @@ -6,10 +5,10 @@ Cite-As: >
6 5
Bioinformatics, 30(20):2959-2961. / BioIT 2014 poster
7 6
Reference:
8 7
Author: >
9
-Erwan Drezen and Guillaume Rizk and Rayan Chikhi and Charles Deltel
10
-and Claire Lemaitre and Pierre Peterlongo and Dominique Lavenier
8
+Erwan Drezen and Guillaume Rizk and Rayan Chikhi and Charles Deltel
9
+and Claire Lemaitre and Pierre Peterlongo and Dominique Lavenier
11 10
Title: >
12
-GATB: Genome Assembly & Analysis Tool Box
11
+GATB: Genome Assembly & Analysis Tool Box
13 12
Journal: Bioinformatics
14 13
Year: 2014
15 14
Volume: 30
... ... @@ -19,7 +18,8 @@ Reference:
19 18
URL: http://dx.doi.org/10.1093/bioinformatics/btu406
20 19
Repository: https://github.com/GATB/gatb-core
21 20
Registry:
22
-- Name: OMICtools
23
-Entry: OMICS_04834
24
-- Name: conda:bioconda
25
-Entry: gatb
21
+- Name: OMICtools
22
+Entry: OMICS_04834
23
+- Name: conda:bioconda
24
+Entry: gatb
25
+Repository-Browse: https://github.com/GATB/gatb-core
# *gatb-core/CMakeLists.txt*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#4c007bda60857aed5186f73b4dd88ee753952d0d>
------------------------------------------------------------------------
... ... @@ -110,8 +110,6 @@ if (debug)
110 110
set (LIBRARY_COMPILE_DEFINITIONS "${LIBRARY_COMPILE_DEFINITIONS}-g -p
${LIB_COMPILE_WARNINGS}")
111 111
set (CMAKE_BUILD_TYPE Debug) # else CMake adds DNDEBUG
112 112
message("-- COMPILATION IN DEBUG MODE")
113
-else()
114
-set (LIBRARY_COMPILE_DEFINITIONS "${LIBRARY_COMPILE_DEFINITIONS}-O3
-DNDEBUG ${LIB_COMPILE_WARNINGS}")
115 113
endif()
116 114
117 115
if (INT128_FOUND)
# *gatb-core/doc/doxygen/src/dbgh5page.hpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#2fba22cc64bca97d995e238b012045ac90567b80>
------------------------------------------------------------------------
... ... @@ -91,6 +91,8 @@
91 91
-verbose (1 arg) : verbosity level [default '1']
92 92
-email (1 arg) : send statistics to the given email address [default '']
93 93
-email-fmt (1 arg) : 'raw' or 'xml' [default 'raw']
94
+-edge-km (1 arg) : Kececioglu-Myers edge representation [default '0']
95
+
94 96
* \endcode
95 97
*
96 98
*
# *gatb-core/src/gatb/bank/impl/BankFasta.cpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#6861fd4c43e7c21bb057ecc2ad7a189b8aaf5b33>
------------------------------------------------------------------------
... ... @@ -651,6 +651,7 @@ void BankFasta::Iterator::init ()
651 651
*bf = (buffered_file_t *) CALLOC (1, sizeof(buffered_file_t));
652 652
(*bf)->buffer = (unsigned char*) MALLOC (BUFFER_SIZE);
653 653
(*bf)->stream = gzopen (fname, "r");
654
+gzbuffer((*bf)->stream,2*1024*1024);
654 655
655 656
/** We check that we can open the file. */
656 657
if ((*bf)->stream == NULL)
# *gatb-core/src/gatb/bcalm2/bglue_algo.cpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#490f86cdfc07e7db3455fffd8859052bd5dbeb50>
------------------------------------------------------------------------
... ... @@ -63,12 +63,9 @@ using namespace
gatb::core::tools::collections::impl;
63 63
using namespace std;
64 64
65 65
// let's be clear here:
66
-// UF hashes will be stored in 32 bits for efficiency (as I don't want
to have a 64-bits UF for memory reasons, also, would require to modify
unionFind.hpp)
67
-typedef uint32_t uf_hashes_t;
68
-// but there can be more than 2^{32} sequences in the glue file
66
+typedef uint64_t uf_hashes_t; // UF hashes are the hash values of k-mers
to be inserted into the UF data structure. Don't try setting to
uint32_t, would be a disaster
69 67
typedef uint64_t seq_idx_t;
70
-// so, potentially, more than 2^{32} UF hashes (but not necessarily,
consider that some sequences don't need to be glued)
71
-// what will happen is that more one UF class won't be linked to a
single unitig, but multiple unitigs
68
+typedef uint32_t uf_class_t; // UF class is the identifier of an element
in the UF
72 69
// let's hope that there won't be saturation (only 1 UF class with all
unitigs)
73 70
// if this happens, then "Top 10 glue partitions by size:" will show
only one entry and BCALM will blow up in memory
74 71
// a fix would be to use a 64 bits UF (to be coded later)
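The typedef change in the hunk above widens uf_hashes_t to 64 bits and warns against going back to uint32_t. A minimal sketch of what narrowing would do, assuming plain integer truncation; truncate_to_32, collide_after_truncation and the two constants are hypothetical names for illustration and are not part of gatb-core:

```cpp
#include <cstdint>

// Hypothetical helper: what storing a 64-bit hash in a 32-bit type does.
// Only the low 32 bits survive the conversion.
inline std::uint32_t truncate_to_32(std::uint64_t h)
{
    return static_cast<std::uint32_t>(h);
}

// Two distinct 64-bit values that differ only in their high 32 bits.
constexpr std::uint64_t kHashA = 0x0000000100000007ULL;
constexpr std::uint64_t kHashB = 0x0000000200000007ULL;

// True when two different hashes become indistinguishable after narrowing.
inline bool collide_after_truncation(std::uint64_t a, std::uint64_t b)
{
    return a != b && truncate_to_32(a) == truncate_to_32(b);
}
```

Both constants narrow to 7, so distinct hashes would be treated as the same key. This only illustrates the arithmetic; how often it happens on real k-mer hashes is not something the diff states.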
... ... @@ -197,6 +194,24 @@ static string skip_first_abundance(const
string& list)
197 194
return res;
198 195
}
199 196
197
+static string make_header(const int seq_size, const string& abundances,
bool all_abundance_counts)
198
+{
199
+string header;
200
+float mean_abundance = get_mean_abundance(abundances);
201
+uint64_t sum_abundances = get_sum_abundance(abundances);
202
+if (all_abundance_counts)
203
+{
204
+// in this setting, all kmer wabundances are printed in the order of the
kmers in the sequence
205
+header = "LN:i:" + to_string(seq_size) + " ab:Z:" + abundances;
206
+}
207
+else
208
+{
209
+// km is not a standard GFA field so i'm putting it in lower case as per
the spec
210
+header = "LN:i:" + to_string(seq_size) + " KC:i:" +
to_string(sum_abundances) + " km:f:" +
to_string_with_precision(mean_abundance);
211
+}
212
+return header;
213
+}
214
+
200 215
template<int SPAN>
201 216
struct markedSeq
202 217
{
... ... @@ -699,6 +714,7 @@ void bglue(Storage *storage,
699 714
int kmerSize,
700 715
int nb_glue_partitions,
701 716
int nb_threads,
717
+bool all_abundance_counts,
702 718
bool verbose
703 719
)
704 720
{
... ... @@ -804,7 +820,7 @@ void bglue(Storage *storage,
804 820
}
805 821
806 822
// create a UF data structure
807
-// this one stores nb_uf_keys * uint64_t (actually, atomic's).so it's
bigger than uf_hashes
823
+// this one stores nb_uf_keys * uint64_t (actually, atomic's).
808 824
unionFind ufkmers(nb_uf_keys);
809 825
810 826
#if 0
... ... @@ -911,13 +927,13 @@ void bglue(Storage *storage,
911 927
if (only_uf) // for debugging
912 928
return;
913 929
914
-/* now we're mirroring the UF to a vector of uint32_t's, it will take
less space, and strictly same information
930
+/* now we're mirroring the UF to a vector of uint32_t's(uf_class_t), it
will take less space, and strictly same information
915 931
* this is to get rid of the rank (one uint32) per element in the current
UF implementation.
916 932
* To do this, we're using the disk to save space of populating one
vector from the other in memory.
917 933
* (saves having to allocate both vectors at the same time) */
918 934
919
-BagFile<uf_hashes_t>* ufkmers_bagf = new BagFile<uf_hashes_t>(prefix+".glue.uf"); LOCAL(ufkmers_bagf);
920
-BagCache<uf_hashes_t>* ufkmers_bag = new BagCache<uf_hashes_t>( ufkmers_bagf, 10000 ); LOCAL(ufkmers_bag);
935
+BagFile<uf_class_t>* ufkmers_bagf = new BagFile<uf_class_t>(prefix+".glue.uf"); LOCAL(ufkmers_bagf);
936
+BagCache<uf_class_t>* ufkmers_bag = new BagCache<uf_class_t>( ufkmers_bagf, 10000 ); LOCAL(ufkmers_bag);
921 937
922 938
for (unsigned long i = 0; i < nb_uf_keys; i++)
923 939
//ufkmers_vector[i] = ufkmers.find(i); // just in-memory without the disk
... ... @@ -930,15 +946,15 @@ void bglue(Storage *storage,
930 946
931 947
ufkmers_bag->flush();
932 948
933
-std::vector<uf_hashes_t> ufkmers_vector(nb_uf_keys);
934
-IteratorFile<uf_hashes_t> ufkmers_file(prefix+".glue.uf");
949
+std::vector<uf_class_t> ufkmers_vector(nb_uf_keys);
950
+IteratorFile<uf_class_t> ufkmers_file(prefix+".glue.uf");
935 951
unsigned long i = 0;
936 952
for (ufkmers_file.first(); !ufkmers_file.isDone(); ufkmers_file.next())
937 953
ufkmers_vector[i++] = ufkmers_file.item();
938 954
939 955
System::file().remove (prefix+".glue.uf");
940 956
941
-logging("loaded 32-bit UF (" +
to_string(nb_uf_keys*sizeof(uf_hashes_t)/1024/1024) + " MB)");
957
+logging("loaded 32-bit UF (" +
to_string(nb_uf_keys*sizeof(uf_class_t)/1024/1024) + " MB)");
942 958
943 959
// setup output file
944 960
string output_prefix = prefix;
... ... @@ -1000,7 +1016,7 @@ void bglue(Storage *storage,
1000 1016
1001 1017
// partition the glue into many files, à la dsk
1002 1018
auto partitionGlue = [k, &modelCanon /* crashes if copied!*/, \
1003
-&get_UFclass, &gluePartitions,
1019
+&get_UFclass, &gluePartitions,all_abundance_counts,
1004 1020
&out, &outLock, &nb_seqs_in_partition, nbGluePartitions]
1005 1021
(const Sequence& sequence)
1006 1022
{
... ... @@ -1024,11 +1040,8 @@ void bglue(Storage *storage,
1024 1040
if (!found_class) // this one doesn't need to be glued
1025 1041
{
1026 1042
const string abundances = comment.substr(3);
1027
-float mean_abundance = get_mean_abundance(abundances);
1028
-uint64_t sum_abundances = get_sum_abundance(abundances);
1029
-
1030
-// km is not a standard GFA field so i'm putting it in lower case as per
the spec
1031
-output(seq, out, "LN:i:" + to_string(seq.size()) + " KC:i:" +
to_string(sum_abundances) + " km:f:" +
to_string_with_precision(mean_abundance));
1043
+string header = make_header(seq.size(),abundances, all_abundance_counts);
1044
+output(seq, out, header);
1032 1045
return;
1033 1046
}
1034 1047
... ... @@ -1082,7 +1095,7 @@ void bglue(Storage *storage,
1082 1095
for (int partition = 0; partition < nbGluePartitions; partition++)
1083 1096
{
1084 1097
auto glue_partition = [&modelCanon, &ufkmers, partition,
&gluePartition_prefix, nbGluePartitions, &copy_nb_seqs_in_partition,
1085
-&get_UFclass, &out, &outLock, kmerSize]( int thread_id)
1098
+&get_UFclass, &out, &outLock, kmerSize,all_abundance_counts]( int thread_id)
1086 1099
{
1087 1100
int k = kmerSize;
1088 1101
... ... @@ -1172,10 +1185,9 @@ void bglue(Storage *storage,
1172 1185
string seq, abs;
1173 1186
glue_sequences(seqs_to_glue[i], seqs_to_glue_is_circular[i], sequences,
abundances, kmerSize, seq, abs); // takes as input the indices of
ordered sequences, whether that sequence is circular, and the
markedSeq's themselves along with their abundances
1174 1187
1175
-float mean_abundance = get_mean_abundance(abs);
1176
-uint32_t sum_abundances = get_sum_abundance(abs);
1177 1188
{
1178
-output(seq, out, "LN:i:" + to_string(seq.size()) + " KC:i:" +
to_string(sum_abundances) + " km:f:" +
to_string_with_precision(mean_abundance));
1189
+string header = make_header(seq.size(),abs, all_abundance_counts);
1190
+output(seq, out, header);
1179 1191
}
1180 1192
}
1181 1193
... ... @@ -1198,7 +1210,7 @@ void bglue(Storage *storage,
1198 1210
1199 1211
logging("end");
1200 1212
1201
-bool debug_keep_glue_files = true;// for debugging // TODO enable it if
-redo-bglue param was provided (need some info from
UnitigsConstructionAlgorithm).
1213
+bool debug_keep_glue_files = false;// for debugging // TODO warning: if
debug_keep_glue_files is set to 'false,' then the debug option
'-redo-bglue' cannot work because it needs those bglue files
1202 1214
if (debug_keep_glue_files)
1203 1215
{
1204 1216
std::cout << "debug: not deleting glue files" << std::endl;
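The new make_header function in this file switches between two header layouts depending on all_abundance_counts. A self-contained sketch of that switch follows. The diff does not show get_mean_abundance, get_sum_abundance or to_string_with_precision, so sum_abundance, mean_abundance and with_precision below are hypothetical stand-ins, and the space-separated abundance list and one-digit precision are assumptions:

```cpp
#include <cstddef>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical stand-in: sums a space-separated list of integer abundances.
static std::uint64_t sum_abundance(const std::string& abundances)
{
    std::istringstream in(abundances);
    std::uint64_t sum = 0, value = 0;
    while (in >> value) sum += value;
    return sum;
}

// Hypothetical stand-in: arithmetic mean of the same list (0 if empty).
static float mean_abundance(const std::string& abundances)
{
    std::istringstream in(abundances);
    std::uint64_t sum = 0, value = 0, n = 0;
    while (in >> value) { sum += value; ++n; }
    return n == 0 ? 0.0f : static_cast<float>(sum) / static_cast<float>(n);
}

// Hypothetical stand-in: fixed-point formatting with an assumed precision.
static std::string with_precision(float v, int digits = 1)
{
    std::ostringstream out;
    out << std::fixed << std::setprecision(digits) << v;
    return out.str();
}

// Mirrors the two branches of make_header shown in the diff.
static std::string make_header_sketch(std::size_t seq_size,
                                      const std::string& abundances,
                                      bool all_abundance_counts)
{
    const std::string length = "LN:i:" + std::to_string(seq_size);
    if (all_abundance_counts)
        return length + " ab:Z:" + abundances;   // every k-mer count, in order
    return length + " KC:i:" + std::to_string(sum_abundance(abundances))
                  + " km:f:" + with_precision(mean_abundance(abundances));
}
```

Under these assumptions, make_header_sketch(5, "2 4 6", true) yields "LN:i:5 ab:Z:2 4 6" and the same call with false yields "LN:i:5 KC:i:12 km:f:4.0".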
# *gatb-core/src/gatb/bcalm2/bglue_algo.hpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#349369b0ff3f84575fe0d9513fb9bfcadb1b3612>
------------------------------------------------------------------------
... ... @@ -150,6 +150,7 @@ void
bglue(gatb::core::tools::storage::impl::Storage* storage,
150 150
int kmerSize,
151 151
int nb_glue_partitions,
152 152
int nb_threads,
153
+bool all_abundance_counts,
153 154
bool verbose
154 155
);
155 156
# *gatb-core/src/gatb/debruijn/impl/Graph.cpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#fa53c8abddac2ffbd797f8ad40c9772134223fed>
------------------------------------------------------------------------
... ... @@ -648,6 +648,8 @@ IOptionsParser* GraphTemplate<Node, Edge,
GraphDataVariant>::getOptionsParser (b
648 648
IOptionsParser* parserGeneral = new OptionsParser ("general");
649 649
parserGeneral->push_front (new OptionOneParam (STR_INTEGER_PRECISION,
"integers precision (0 for optimized value)", false, "0", false));
650 650
parserGeneral->push_front (new OptionOneParam (STR_VERBOSE, "verbosity
level", false, "1" ));
651
+parserGeneral->push_front (new OptionOneParam
(STR_EDGE_KM_REPRESENTATION, "edge km representation", false, "0" ));
652
+parserGeneral->push_front (new OptionNoParam (STR_ALL_ABUNDANCE_COUNTS,
"output all k-mer abundance counts instead of mean" ));
651 653
parserGeneral->push_front (new OptionOneParam (STR_NB_CORES, "number of
cores", false, "0" ));
652 654
parserGeneral->push_front (new OptionNoParam (STR_CONFIG_ONLY, "dump
config only"));
653 655
... ... @@ -661,7 +663,7 @@ IOptionsParser* GraphTemplate<Node, Edge,
GraphDataVariant>::getOptionsParser (b
661 663
parserDebug->push_front (new OptionNoParam ("-skip-links", "same, but
skip links"));
662 664
parserDebug->push_front (new OptionNoParam ("-redo-links", "same, but
redo links"));
663 665
parserDebug->push_front (new OptionNoParam ("-skip-bglue", "same, but
skip bglue"));
664
-parserDebug->push_front (new OptionNoParam ("-redo-bglue", "same, but
redo bglue"));
666
+parserDebug->push_front (new OptionNoParam ("-redo-bglue", "same, but
redo bglue(needs debug_keep_glue_files=true in source code)"));
665 667
parserDebug->push_front (new OptionNoParam ("-skip-bcalm", "same, but
skip bcalm"));
666 668
parserDebug->push_front (new OptionNoParam ("-redo-bcalm", "debug
function, redo the bcalm algo"));
667 669
# *gatb-core/src/gatb/debruijn/impl/GraphUnitigs.cpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#2ff315124cb4ed9eebb9d35c35743144de29a781>
------------------------------------------------------------------------
... ... @@ -259,7 +259,7 @@ void
GraphUnitigsTemplate<span>::build_unitigs_postsolid(std::string unitigs_fil
259 259
}
260 260
261 261
bool redo_bcalm = props->get("-redo-bcalm");
262
-bool redo_bglue = props->get("-redo-bglue");
262
+bool redo_bglue = props->get("-redo-bglue");// note: if that option is
to be used, make sure to enable debug_keep_glue_files=true in bglue_algo.cpp
263 263
bool redo_links = props->get("-redo-links");
264 264
265 265
bool skip_bcalm = props->get("-skip-bcalm");
# *gatb-core/src/gatb/debruijn/impl/LinkTigs.cpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#34ae9ab374303a1000194fa7d58aeba9d58f32ce>
------------------------------------------------------------------------
... ... @@ -52,7 +52,7 @@ namespace gatb { namespace core { namespace
debruijn { namespace impl {
52 52
* Normally bcalm outputs consecutive unitig ID's but LinkTigs can also
work with non-consecutive, non-sorted IDs
53 53
*/
54 54
template<size_t span>
55
-void link_tigs(string unitigs_filename, int kmerSize, int nb_threads,
uint64_t &nb_unitigs, bool verbose, bool renumber_unitigs)
55
+void link_tigs(string unitigs_filename, int kmerSize, int nb_threads,
uint64_t &nb_unitigs, bool verbose, bool edge_km_representation, bool
renumber_unitigs)
56 56
{
57 57
bcalm_logging = verbose;
58 58
BankFasta* out = new BankFasta(unitigs_filename+".linked");
... ... @@ -60,7 +60,7 @@ void link_tigs(string unitigs_filename, int
kmerSize, int nb_threads, uint64_t &
60 60
logging("Finding links between unitigs");
61 61
62 62
for (int pass = 0; pass < nb_passes; pass++)
63
-link_unitigs_pass<span>(unitigs_filename, verbose, pass, kmerSize,
renumber_unitigs);
63
+link_unitigs_pass<span>(unitigs_filename, verbose, pass, kmerSize,
edge_km_representation,renumber_unitigs);
64 64
65 65
write_final_output(unitigs_filename, verbose, out, nb_unitigs,
renumber_unitigs);
66 66
... ... @@ -265,7 +265,7 @@ static void record_links(uint64_t utig_id,
int pass, const string &link, std::of
265 265
266 266
267 267
template<size_t span>
268
-void link_unitigs_pass(const string unitigs_filename, bool verbose,
const int pass, const int kmerSize, const bool renumber_unitigs)
268
+void link_unitigs_pass(const string unitigs_filename, bool verbose,
const int pass, const int kmerSize, bool edge_km_representation, const
bool renumber_unitigs)
269 269
{
270 270
typedef typename kmer::impl::Kmer<span>::ModelCanonical Model;
271 271
typedef typename kmer::impl::Kmer<span>::Type Type;
... ... @@ -376,7 +376,12 @@ void link_unitigs_pass(const string
unitigs_filename, bool verbose, const int pa
376 376
//bool rc = e_in.rc ^ (!beginInSameOrientation); // "rc" sets the
destination strand // i don't think it's the right formula because of
k-1-mers that are their self revcomp. see the mikko bug in the test
folder, that provides a nice illustration of that
377 377
bool rc = e_in.pos == UNITIG_END; // a better way to determine the rc
flag is just looking at position of e_in k-1-mer
378 378
379
-in_links += "L:-:" + to_string(e_in.unitig) + ":" + (rc?"-":"+") + " ";
379
+
380
+if(edge_km_representation){
381
+in_links += "J:0:" + to_string(e_in.unitig) + ":" + (rc?"1":"0") + " ";
382
+}else{
383
+in_links += "L:-:" + to_string(e_in.unitig) + ":" + (rc?"-":"+") + " ";
384
+}
380 385
381 386
/* what to do when kmerBegin is same as forward and reverse?
382 387
used to have this:
... ... @@ -432,7 +437,13 @@ void link_unitigs_pass(const string
unitigs_filename, bool verbose, const int pa
432 437
433 438
bool rc = e_out.pos == UNITIG_END; // a better way to determine the rc
flag is just looking at position of e_in k-1-mer
434 439
435
-out_links += "L:+:" + to_string(e_out.unitig) + ":" + (rc?"-":"+") + " ";
440
+if(edge_km_representation){
441
+out_links += "J:1:" + to_string(e_out.unitig) + ":" + (rc?"1":"0") + " ";
442
+}else{
443
+out_links += "L:+:" + to_string(e_out.unitig) + ":" + (rc?"-":"+") + " ";
444
+}
445
+
446
+
436 447
437 448
if (debug) std::cout << " [valid] ";
438 449
}
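The two hunks above choose between the existing "L:" link strings and the new "J:" strings when -edge-km is set. A small sketch that collects the four cases in one place, using the literals from the diff; format_link and LinkDirection are hypothetical names, not gatb-core API:

```cpp
#include <cstdint>
#include <string>

enum class LinkDirection { Incoming, Outgoing };

// Builds one link token the way the diff does for in_links and out_links.
// Default form:  "L:-:<id>:<+|->" (incoming) or "L:+:<id>:<+|->" (outgoing).
// Edge-km form:  "J:0:<id>:<0|1>" (incoming) or "J:1:<id>:<0|1>" (outgoing).
static std::string format_link(LinkDirection dir, std::uint64_t unitig,
                               bool rc, bool edge_km_representation)
{
    const bool outgoing = (dir == LinkDirection::Outgoing);
    if (edge_km_representation)
        return std::string("J:") + (outgoing ? "1" : "0") + ":"
             + std::to_string(unitig) + ":" + (rc ? "1" : "0") + " ";
    return std::string("L:") + (outgoing ? "+" : "-") + ":"
         + std::to_string(unitig) + ":" + (rc ? "-" : "+") + " ";
}
```

For unitig 42, an incoming forward link is "L:-:42:+ " in the default form, and an outgoing reverse-complement link is "J:1:42:1 " in the edge-km form.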
# *gatb-core/src/gatb/debruijn/impl/LinkTigs.hpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#8d46876b0b01c8687ffec95a3e56a84f1d5ffc83>
------------------------------------------------------------------------
... ... @@ -30,10 +30,10 @@ namespace gatb { namespace core {
namespace debruijn { namespace impl {
30 30
31 31
32 32
template<size_t SPAN>
33
-void link_tigs( std::string prefix, int kmerSize, int nb_threads,
uint64_t &nb_unitigs, bool verbose, bool renumber_unitigs = false);
33
+void link_tigs( std::string prefix, int kmerSize, int nb_threads,
uint64_t &nb_unitigs, bool verbose, bool edge_km_representation, bool
renumber_unitigs = false);
34 34
35 35
template<size_t span>
36
-void link_unitigs_pass(const std::string unitigs_filename, bool verbose,
const int pass, const int kmerSize, const bool renumber_unitigs);
36
+void link_unitigs_pass(const std::string unitigs_filename, bool verbose,
const int pass, const int kmerSize,
bool edge_km_representation, const bool renumber_unitigs);
37 37
38 38
}}}}
39 39
# *gatb-core/src/gatb/debruijn/impl/UnitigsConstructionAlgorithm.cpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#47e6fff3caa93f4ef7cc54b7f08ec8ac25b21393>
------------------------------------------------------------------------
... ... @@ -91,17 +91,15 @@
UnitigsConstructionAlgorithm<span>::~UnitigsConstructionAlgorithm ()
91 91
template <size_t span>
92 92
void UnitigsConstructionAlgorithm<span>::execute ()
93 93
{
94
-kmerSize =
95
-getInput()->getInt(STR_KMER_SIZE);
96
-int abundance =
97
-getInput()->getInt(STR_KMER_ABUNDANCE_MIN); // note: doesn't work when
it's "auto"
98
-int minimizerSize =
99
-getInput()->getInt(STR_MINIMIZER_SIZE);
100
-int nb_threads =
101
-getInput()->getInt(STR_NB_CORES);
102
-int minimizer_type =
103
-getInput()->getInt(STR_MINIMIZER_TYPE);
104
-bool verbose = getInput()->getInt(STR_VERBOSE);
94
+kmerSize = getInput()->getInt(STR_KMER_SIZE);
95
+int abundance = getInput()->getInt(STR_KMER_ABUNDANCE_MIN); // note:
doesn't work when it's "auto"
96
+int minimizerSize = getInput()->getInt(STR_MINIMIZER_SIZE);
97
+int nb_threads = getInput()->getInt(STR_NB_CORES);
98
+int minimizer_type = getInput()->getInt(STR_MINIMIZER_TYPE);
99
+bool verbose = getInput()->getInt(STR_VERBOSE);
100
+bool edge_km_representation =
getInput()->getInt(STR_EDGE_KM_REPRESENTATION);
101
+bool all_abundance_counts = getInput()->get(STR_ALL_ABUNDANCE_COUNTS);
102
+
105 103
int nb_glue_partitions = 0;
106 104
if (getInput()->get("-nb-glue-partitions"))
107 105
nb_glue_partitions = getInput()->getInt("-nb-glue-partitions");
... ... @@ -110,9 +108,9 @@ void
UnitigsConstructionAlgorithm<span>::execute ()
110 108
if ((unsigned int)nb_threads > nbThreads)
111 109
std::cout << "Uh. Unitigs graph construction called with nb_threads " <<
nb_threads << " but dispatcher has nbThreads " << nbThreads << std::endl;
112 110
113
-if (do_bcalm) bcalm2<span>(&_storage, unitigs_filename, kmerSize,
abundance, minimizerSize, nbThreads, minimizer_type, verbose);
114
-if (do_bglue) bglue<span> (&_storage, unitigs_filename, kmerSize,
nb_glue_partitions, nbThreads, verbose);
115
-if (do_links) link_tigs<span>(unitigs_filename, kmerSize, nbThreads,
nb_unitigs, verbose);
111
+if (do_bcalm) bcalm2<span>(&_storage, unitigs_filename, kmerSize,
abundance, minimizerSize, nbThreads, minimizer_type, verbose);
112
+if (do_bglue) bglue<span> (&_storage, unitigs_filename, kmerSize,
nb_glue_partitions, nbThreads, all_abundance_counts, verbose);
113
+if (do_links) link_tigs<span>(unitigs_filename, kmerSize, nbThreads,
nb_unitigs, verbose,edge_km_representation);
116 114
117 115
/** We gather some statistics. */
118 116
// nb_unitigs will be used in GraphUnitigs
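The link_tigs call above now passes edge_km_representation in the position that renumber_unitigs used to occupy, and relies on the default for renumber_unitigs. A sketch of why this matters for any other caller that still passes a sixth positional bool; old_signature, new_signature and Flags are hypothetical, reduced to the two trailing flags for brevity:

```cpp
// Records which flag each positional argument ended up in.
struct Flags
{
    bool edge_km_representation;
    bool renumber_unitigs;
};

// Shape before the change: the trailing bool is renumber_unitigs.
static Flags old_signature(bool /*verbose*/, bool renumber_unitigs = false)
{
    return Flags{false, renumber_unitigs};
}

// Shape after the change: a new bool sits in front of the defaulted one.
static Flags new_signature(bool /*verbose*/, bool edge_km_representation,
                           bool renumber_unitigs = false)
{
    return Flags{edge_km_representation, renumber_unitigs};
}
```

A call written as f(verbose, true) meant "renumber" with the old shape, but with the new shape it still compiles and silently sets edge_km_representation instead. Whether any such caller exists outside this diff is not known from the mail.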
# *gatb-core/src/gatb/kmer/impl/SortingCountAlgorithm.cpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#ed783240a52e80bd1c3a470d6646b3f6b3ab63e4>
------------------------------------------------------------------------
... ... @@ -1300,6 +1300,16 @@ void
SortingCountAlgorithm<span>::fillPartitions (size_t pass, Iterator<Sequence
1300 1300
itBanks[i]->finalize();
1301 1301
}
1302 1302
}
1303
+
1304
+// force close partitions and re-open them for reading
1305
+// may prevent crash in large multi-bank counting instance on Lustre
filesystems
1306
+if(_config._solidityKind != KMER_SOLIDITY_SUM)
1307
+{
1308
+string tmpStorageName = getInput()->getStr(STR_URI_OUTPUT_TMP) + "/" +
System::file().getTemporaryFilename("dsk_partitions");
1309
+setPartitions (0); // close the partitions first, otherwise new files
are opened before closing parti from previous pass
1310
+setPartitions ( & (*_tmpPartitionsStorage)().getPartition<Type> ("parts"));
1311
+
1312
+}
1303 1313
}
1304 1314
1305 1315
/*********************************************************************
# *gatb-core/src/gatb/system/impl/FileSystemCommon.hpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#6dc8c95d8c6c04e96a48b4cf5c3c2ea31cc93750>
------------------------------------------------------------------------
... ... @@ -36,6 +36,7 @@
36 36
#include <string.h>
37 37
#include <sys/stat.h>
38 38
#include <unistd.h>
39
+#include <iostream>
39 40
40 41
/********************************************************************************/
41 42
namespace gatb {
... ... @@ -60,6 +61,7 @@ public:
60 61
{
61 62
_isStdout = path && strcmp(path,"stdout")==0;
62 63
_handle = _isStdout ? stdout : fopen (path, mode);
64
+//std::cout << "opening file " << _path << " handle " << _handle <<
std::endl;
63 65
if(_handle == 0)
64 66
{
65 67
throw Exception ("cannot open %s %s",path,strerror(errno));
... ... @@ -67,7 +69,9 @@ public:
67 69
}
68 70
69 71
/** Destructor. */
70
-virtual ~CommonFile () { if (_handle && !_isStdout) { fclose (_handle); } }
72
+virtual ~CommonFile () { if (_handle && !_isStdout) {
73
+//std::cout << "closing file " << _path << " handle " << _handle <<
std::endl;
74
+fclose (_handle); } }
71 75
72 76
/** \copydoc IFile::isOpen */
73 77
bool isOpen () { return getHandle() != 0; }
# *gatb-core/src/gatb/template/TemplateSpecialization10.cpp.in*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#4c12716c6fcb516eda353ab99ab2b81002b77515>
------------------------------------------------------------------------
@@ -25,15 +25,16 @@ template void bglue<${KSIZE}>(Storage* storage,
int kmerSize,
int nb_glue_partitions,
int nb_threads,
+bool all_abundance_counts,
bool verbose
);

template class graph3<${KSIZE}>; // graph3<span> switch

template void link_tigs<${KSIZE}>
-(std::string unitigs_filename, int kmerSize, int nb_threads, uint64_t &nb_unitigs, bool verbose, bool renumber_unitigs = false);
+(std::string unitigs_filename, int kmerSize, int nb_threads, uint64_t &nb_unitigs, bool verbose, bool edge_km_representation, bool renumber_unitigs = false);

-template void link_unitigs_pass<${KSIZE}>(const std::string unitigs_filename, bool verbose, const int pass, const int kmerSize, const bool renumber_unitigs);
+template void link_unitigs_pass<${KSIZE}>(const std::string unitigs_filename, bool verbose, const int pass, const int kmerSize, bool edge_km_representation, const bool renumber_unitigs);


/********************************************************************************/
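
The template file above holds explicit instantiations, with ${KSIZE} presumably substituted per k-mer span at build time. Whenever a parameter such as edge_km_representation is added to the template, every explicit instantiation must repeat the new parameter list exactly. The sketch below illustrates this with a hypothetical function link_tigs_sketch; its body and return value are placeholders, not the upstream algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical template standing in for link_tigs<span>; the body is a placeholder.
template <std::size_t span>
std::uint64_t link_tigs_sketch(const std::string& unitigs_filename, int kmerSize,
                               bool verbose, bool edge_km_representation,
                               bool renumber_unitigs)
{
    assert(kmerSize > 0 && static_cast<std::size_t>(kmerSize) < span);
    (void)unitigs_filename;
    (void)verbose;
    (void)renumber_unitigs;
    // Placeholder result that simply encodes the span and the new flag.
    return static_cast<std::uint64_t>(span) * 2 + (edge_km_representation ? 1 : 0);
}

// Explicit instantiation definitions, one per supported span. The parameter
// list has to match the template declaration, including the added bool.
template std::uint64_t link_tigs_sketch<32>(const std::string&, int, bool, bool, bool);
template std::uint64_t link_tigs_sketch<64>(const std::string&, int, bool, bool, bool);
```

If the instantiation lines were left with the old four-parameter list, the compiler would report that no matching template exists. This is why the .cpp.in file changes together with the headers.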
# *gatb-core/src/gatb/tools/collections/impl/IteratorFile.hpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#9be95217b1b4429f2eeabc2213c457fd389f08a0>
------------------------------------------------------------------------
@@ -239,6 +239,7 @@ public:
_filename(it._filename), _gzfile(0), _buffer(0), _cpt_buffer(0), _idx(0), _cacheItemsNb(it._cacheItemsNb), _isDone(true)
{
_gzfile = gzopen(_filename.c_str(),"rb");
+gzbuffer(_gzfile,2*1024*1024);
_buffer = (Item*) MALLOC (sizeof(Item) * _cacheItemsNb);
}

@@ -248,6 +249,7 @@ public:

{
_gzfile = gzopen(_filename.c_str(),"rb");
+gzbuffer(_gzfile,2*1024*1024);
_buffer = (Item*) MALLOC (sizeof(Item) * _cacheItemsNb);
}

@@ -273,6 +275,7 @@ public:
_isDone = it._isDone;

_gzfile = gzopen(_filename.c_str(),"r");
+gzbuffer(_gzfile,2*1024*1024);
_buffer = (Item*) MALLOC (sizeof(Item) * it._cacheItemsNb);
}
return *this;
# *gatb-core/src/gatb/tools/misc/api/StringsRepository.hpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#4ae6c527aafe52f4f16907962f525902d4b5ef6f>
------------------------------------------------------------------------
@@ -83,6 +83,8 @@ public:
const char* graph () { return "-graph"; }
const char* kmer_size () { return "-kmer-size"; }
const char* minimizer_size () { return "-minimizer-size"; }
+const char* edge_km_representation () { return "-edge-km"; }
+const char* all_abundance_counts () { return "-all-abundance-counts"; }
const char* kmer_abundance () { return "-abundance"; }
const char* kmer_abundance_min () { return "-abundance-min"; }
const char* kmer_abundance_min_threshold () { return "-abundance-min-threshold"; }
@@ -138,6 +140,8 @@ public:
#define STR_URI_GRAPH gatb::core::tools::misc::StringRepository::singleton().graph ()
#define STR_KMER_SIZE gatb::core::tools::misc::StringRepository::singleton().kmer_size ()
#define STR_MINIMIZER_SIZE gatb::core::tools::misc::StringRepository::singleton().minimizer_size ()
+#define STR_EDGE_KM_REPRESENTATION gatb::core::tools::misc::StringRepository::singleton().edge_km_representation ()
+#define STR_ALL_ABUNDANCE_COUNTS gatb::core::tools::misc::StringRepository::singleton().all_abundance_counts ()
#define STR_INTEGER_PRECISION gatb::core::tools::misc::StringRepository::singleton().integer_precision ()
#define STR_KMER_ABUNDANCE gatb::core::tools::misc::StringRepository::singleton().kmer_abundance ()
#define STR_KMER_ABUNDANCE_MIN gatb::core::tools::misc::StringRepository::singleton().kmer_abundance_min ()
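
The two new options follow the existing pattern: an accessor on the string repository plus a STR_* macro that forwards to the singleton. The sketch below shows that pattern in isolation. StringRepositorySketch, the *_SKETCH macros and is_option are hypothetical names, and the function-local static is an assumption about how such a singleton could be built, not a statement about the upstream implementation.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical miniature of the option-name repository in the diff.
class StringRepositorySketch {
public:
    static StringRepositorySketch& singleton() {
        static StringRepositorySketch instance;  // constructed on first use
        return instance;
    }
    const char* edge_km_representation() { return "-edge-km"; }
    const char* all_abundance_counts() { return "-all-abundance-counts"; }

private:
    StringRepositorySketch() = default;
};

// Macros forward to the singleton so call sites never spell the literal.
#define STR_EDGE_KM_REPRESENTATION_SKETCH \
    StringRepositorySketch::singleton().edge_km_representation()
#define STR_ALL_ABUNDANCE_COUNTS_SKETCH \
    StringRepositorySketch::singleton().all_abundance_counts()

// Helper for the sketch: true when a command-line token names the option.
inline bool is_option(const char* arg, const char* option) {
    assert(arg != nullptr && option != nullptr);
    return std::strcmp(arg, option) == 0;
}
```

Keeping the literal in one accessor means the option parser and the code that reads the parsed value cannot drift apart on spelling.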
# *gatb-core/src/gatb/tools/misc/impl/Tool.cpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#3b8fda1a420e311e4fb4b0d9fe344d4bbc2f4e97>
------------------------------------------------------------------------
@@ -57,7 +57,6 @@ Tool::Tool (const std::string& name) : userDisplayHelp(0), _helpTarget(0),userDi

getParser()->push_back (new OptionOneParam (STR_NB_CORES, "number of cores", false, "0" ));
getParser()->push_back (new OptionOneParam (STR_VERBOSE, "verbosity level", false, "1" ));
-
getParser()->push_back (new OptionNoParam (STR_VERSION, "version", false));
getParser()->push_back (new OptionNoParam (STR_HELP, "help", false));

# *gatb-core/src/gatb/tools/storage/impl/CollectionHDF5Patch.hpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#f84f48b86f89b6fe6f3b401f6ed674d1ffab829b>
------------------------------------------------------------------------
@@ -266,6 +266,7 @@ public:
herr_t status = 0;

{
+//std::cout << "begin insert" << std::endl;
system::LocalSynchronizer localsynchro (_common->_synchro);

/** We get the dataset id. */
@@ -300,6 +301,7 @@ public:
status = H5Sclose (filespaceId);
status = H5Sclose (memspaceId);
if (status != 0) { std::cout << "err H5Sclose" << std::endl; }
+//std::cout << "end insert" << std::endl;
}

/** We periodically clean up some HDF5 resources. */
@@ -373,12 +375,14 @@ private:
* NOTE !!! the 'clean' method called after this block is also synchronized,
* and therefore must not be in the same instruction block. */
{
+//std::cout << "begin retrievecache" << std::endl;
system::LocalSynchronizer localsynchro (_common->_synchro);

hid_t memspaceId = H5Screate_simple (1, &count, NULL);

/** Select hyperslab on file dataset. */
hid_t filespaceId = H5Dget_space(_common->getDatasetId());
+//std::cout << "filespaceId " << filespaceId << std::endl;
status = H5Sselect_hyperslab (filespaceId, H5S_SELECT_SET, &start, NULL, &count, NULL);
if (status < 0) { throw gatb::core::system::Exception ("HDF5 error (H5Sselect_hyperslab), status %d", status); }

@@ -390,6 +394,7 @@ private:
status = H5Sclose (filespaceId);
status = H5Sclose (memspaceId);
if (status < 0) { throw gatb::core::system::Exception ("HDF5 error (H5Sclose), status %d", status); }
+//std::cout << "end retrievecache" << std::endl;
}

/** We periodically clean up some HDF5 resources. */
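
The NOTE in the hunk above says the synchronized block must end before the (also synchronized) clean method runs. The sketch below illustrates why, assuming the synchronizer behaves like a non-recursive mutex. That assumption is not confirmed by the diff, and std::mutex with std::lock_guard are swapped in here for gatb's LocalSynchronizer. CacheSketch is a hypothetical class.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical cache whose insert() and clean() share one mutex.
class CacheSketch {
public:
    explicit CacheSketch(std::size_t maxItems) : _maxItems(maxItems) {
        assert(maxItems > 0);
    }

    void insert(int value) {
        {
            // The lock lives only inside this block ...
            std::lock_guard<std::mutex> localsynchro(_synchro);
            _items.push_back(value);
        }
        // ... so clean(), which locks the same mutex, is called after release.
        // Calling it inside the block would relock a non-recursive mutex from
        // the same thread, which is undefined behaviour (typically a deadlock).
        clean();
    }

    std::size_t size() {
        std::lock_guard<std::mutex> localsynchro(_synchro);
        return _items.size();
    }

private:
    void clean() {
        std::lock_guard<std::mutex> localsynchro(_synchro);
        while (_items.size() > _maxItems) { _items.erase(_items.begin()); }
    }

    std::size_t _maxItems;
    std::mutex _synchro;
    std::vector<int> _items;
};
```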
# *gatb-core/src/gatb/tools/storage/impl/Storage.hpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#9e0109d139212f56eef6153e59e64fa4c6670360>
------------------------------------------------------------------------
@@ -181,7 +181,7 @@ public:

/** Get a child partition from its name. Created if not already exists.
* \param[in] name : name of the child partition to be retrieved.
-* \param[in] nb : in case of creation, tells how many collection belong to the partition.
+* \param[in] nb : in case of creation, tells how many collection belong to the partition.IMPORTANT: if nb != 0, StorageFile will erase the partition before opening it. So if you're opening a partition, just set nb=0 and let it autodetect the size
* \return the child partition.
*/
template <class Type> Partition<Type>& getPartition (const std::string& name, size_t nb=0);
# *gatb-core/src/gatb/tools/storage/impl/StorageFile.hpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#efdf123a4ecf95f1a9326ff6d3443694ceed944f>
------------------------------------------------------------------------
@@ -96,7 +96,8 @@ namespace impl {
/** */
~GroupFile()
{
-system::impl::System::file().rmdir(folder); // hack to remove the trashme folers. I'd have liked to make that call in remove() but for some reason remove() isn't called
+//std::cout << "groupfile destructor called, removing folder " << folder << std::endl;
+system::impl::System::file().rmdir(folder); // hack to remove the trashme folers. I'd have liked to make that call in remove() but for some reason remove() isn't called
}

/** */
@@ -219,17 +220,24 @@ public:
if (!system::impl::System::file().isFolderEndingWith(storage_prefix,"_gatb"))
file_folder += "_gatb/";

-std::string filename = file_folder + parent->getFullId('.') + std::string(".") + name;
-std::string folder = system::impl::System::file().getDirectory(filename);
-std::string prefix = system::impl::System::file().getBaseName(filename) + std::string(".") + name; // because gatb's getBaseName is stupid and cuts after the last dot
+std::string full_path = file_folder;
+std::string parent_base = parent->getFullId('.');
+std::string base_name = parent_base;
+if (parent_base.size() > 0)
+base_name += std::string("."); // because gatb's getBaseName is stupid and cuts after the last dot
+base_name += name;
+
+full_path += base_name; // but then base_name might have a suffix like ".1" for partitions
+
+//std::cout <<"name: " << name << " filename " << full_path << " prefix " << base_name<< std::endl;

if (nb == 0)
{ // if nb is 0, it means we're opening partitions and not creating them, thus we need to get the number of partitions.

int nb_partitions=0;
-for (auto filename : system::impl::System::file().listdir(folder))
+for (auto filename : system::impl::System::file().listdir(file_folder))
{
-if (!filename.compare(0, prefix.size(),prefix)) // startswith
+if (!filename.compare(0, base_name.size(),base_name)) // startswith
{
nb_partitions++;
}
@@ -240,19 +248,20 @@ public:
std::cout << "error: could not get number of partition for " << name << " using StorageFile" << std::endl;
exit(1);
}
+//std::cout << "got " << nb << " partitions" << std::endl;
}
else
{
// else, if nb is set, means we're creating some partitions. let's delete all the previous ones to avoid wrongly counting
-for (auto filename : system::impl::System::file().listdir(folder))
+for (auto filename : system::impl::System::file().listdir(file_folder))
{
-//std::cout <<"name: " << name << " comparing " << filename << " with prefix " << prefix << std::endl;
-if (!filename.compare(0, prefix.size(),prefix)) // startswith
+//std::cout <<"name: " << name << " comparing " << filename << " with prefix " << base_name << std::endl;
+if (!filename.compare(0, base_name.size(),base_name)) // startswith
{
// some additional guard:
if (filename == "." ||filename == "..") continue;
-system::impl::System::file().remove(folder + "/" + filename);
-//std::cout << "deleting" << folder << "/" << filename << std::endl;
+system::impl::System::file().remove(file_folder + "/" + filename);
+//std::cout << "deleting" << file_folder << "/" << filename << std::endl;
}
}
}
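
The rewritten StorageFile logic above builds base_name from the parent id and the partition name, then counts (or deletes) directory entries that start with it. The sketch below reproduces that naming and prefix test on an in-memory listing. The function names make_base_name, starts_with and count_partitions are hypothetical, and the listing is a plain vector rather than gatb's listdir.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Join parent id and name with a dot, but only when the parent id is non-empty
// (this mirrors the 'if (parent_base.size() > 0)' branch in the diff).
std::string make_base_name(const std::string& parent_base, const std::string& name) {
    assert(!name.empty());
    std::string base_name = parent_base;
    if (!parent_base.empty()) { base_name += "."; }
    base_name += name;
    return base_name;
}

// Same test as 'filename.compare(0, prefix.size(), prefix) == 0' in the diff.
bool starts_with(const std::string& filename, const std::string& prefix) {
    return filename.compare(0, prefix.size(), prefix) == 0;
}

// Count directory entries that look like partitions of base_name.
std::size_t count_partitions(const std::vector<std::string>& listing,
                             const std::string& base_name) {
    std::size_t nb_partitions = 0;
    for (const std::string& filename : listing) {
        if (filename == "." || filename == "..") { continue; }  // guard added in the sketch
        if (starts_with(filename, base_name)) { ++nb_partitions; }
    }
    return nb_partitions;
}
```

A pure prefix test also matches unrelated entries that happen to share the prefix (for example a hypothetical "dsk.solid_backup" when counting "dsk.solid"). That may be worth keeping in mind when reading the nb != 0 branch, which deletes every match.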
# *gatb-core/src/gatb/tools/storage/impl/StorageHDF5.hpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#dd31cf0d08329fcf5ce3e2647df00c99baa267ba>
------------------------------------------------------------------------
@@ -268,6 +268,7 @@ private:
std::string actualName = this->getFullId('/');

/** We create the HDF5 group if needed. */
+//std::cout << "actualname: "<< actualName << " end"<<std::endl;
htri_t doesExist = H5Lexists (storage->getFileId(), actualName.c_str(), H5P_DEFAULT);

if (doesExist <= 0)
# *gatb-core/test/unit/src/debruijn/TestDebruijn.cpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#d7401b2bb5a2b6245ce8cd38b4b1c20e7ea1a058>
------------------------------------------------------------------------
@@ -87,6 +87,7 @@ class TestDebruijn : public Test
/********************************************************************************/
CPPUNIT_TEST_SUITE_GATB (TestDebruijn);

+CPPUNIT_TEST_GATB (debruijn_build);
CPPUNIT_TEST_GATB (debruijn_test_small_kmers);
CPPUNIT_TEST_GATB (debruijn_large_abundance_query);
CPPUNIT_TEST_GATB (debruijn_test7);
@@ -104,7 +105,6 @@ class TestDebruijn : public Test
CPPUNIT_TEST_GATB (debruijn_test12);
CPPUNIT_TEST_GATB (debruijn_test13);
// CPPUNIT_TEST_GATB (debruijn_mutation); // has been removed due to it crashing clang, and since mutate() isn't really used in apps, i didn't bother.
-CPPUNIT_TEST_GATB (debruijn_build);
CPPUNIT_TEST_GATB (debruijn_checkbranching);
CPPUNIT_TEST_GATB (debruijn_mphf);
CPPUNIT_TEST_GATB (debruijn_mphf_nodeindex);
@@ -908,13 +908,22 @@ public:
IBank* inputBank = new BankStrings (sequences, nbSequences);
LOCAL (inputBank);

+
+//std::cout << "g1 create" << std::endl;
Graph::create (inputBank, "-kmer-size 31 -out %s -abundance-min 1 -verbose 0 -max-memory %d", "g1", MAX_MEMORY);
+
+//std::cout << "g2 create" << std::endl;
Graph::create (inputBank, "-kmer-size 31 -out %s -abundance-min 1 -verbose 0 -branching-nodes none -max-memory %d", "g2", MAX_MEMORY);
-Graph::create (inputBank, "-kmer-size 31 -out %s -abundance-min 1 -verbose 0 -solid-kmers-out none -max-memory %d", "g3", MAX_MEMORY);
+
+// This test doesn't work anymore.
+// It's probably a small fix somewehre
+// But I'd argue that the gatb feature of 'not outputting solid kmers to disk' is useless
+// So instead of bothering, I'm just removing the present unit test.
+//Graph::create (inputBank, "-kmer-size 31 -out %s -abundance-min 1 -verbose 0 -solid-kmers-out none -debloom none -branching-nodes none -max-memory %d", "g3", MAX_MEMORY);

debruijn_build_entry r1 = debruijn_build_aux_aux ("g1", true, true);
debruijn_build_entry r2 = debruijn_build_aux_aux ("g2", true, true);
-debruijn_build_entry r3 = debruijn_build_aux_aux ("g3", false, true);
+//debruijn_build_entry r3 = debruijn_build_aux_aux ("g3", false, true);

CPPUNIT_ASSERT (r1.nbNodes == r2.nbNodes);
CPPUNIT_ASSERT (r1.checksumNodes == r2.checksumNodes);
@@ -925,8 +934,8 @@ public:

CPPUNIT_ASSERT (r1.nbBranchingNodes == r2.nbBranchingNodes);
CPPUNIT_ASSERT (r1.checksumBranchingNodes == r2.checksumBranchingNodes);
-CPPUNIT_ASSERT(r1.nbBranchingNodes==r3.nbBranchingNodes);
-CPPUNIT_ASSERT (r1.checksumBranchingNodes == r3.checksumBranchingNodes);
+//CPPUNIT_ASSERT (r1.nbBranchingNodes == r3.nbBranchingNodes); // uncomment if we ever fix r3 (see long comment above)
+//CPPUNIT_ASSERT (r1.checksumBranchingNodes == r3.checksumBranchingNodes);
}

/********************************************************************************/
# *gatb-core/test/unit/src/kmer/TestDSK.cpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#49ce0c04f76f93b18c25528d056dc803569026e0>
------------------------------------------------------------------------
@@ -471,6 +471,9 @@ public:
// printf ("min=%ld max=%ld nb=%ld check=%ld \n",
// nksMin, nksMax, sortingCount.getSolidCounts()->getNbItems(),checkNb
// );
+
+if (sortingCount.getSolidCounts()->getNbItems() != (int)checkNb)
+std::cout << "counted " <<sortingCount.getSolidCounts()->getNbItems()<< " kmers, expected " << (int)checkNb << std::endl;

CPPUNIT_ASSERT (sortingCount.getSolidCounts()->getNbItems() == (int)checkNb);
}
# *gatb-core/test/unit/src/tools/storage/TestStorage.cpp*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#e2e4e0dc5c61805b798db5e2dee529b92e327235>
------------------------------------------------------------------------
@@ -132,7 +132,10 @@ public:
{
size_t nbIter = 0;
Iterator<NativeInt64>* it = partition[i].iterator(); LOCAL(it);
-for (it->first(); !it->isDone(); it->next(), nbIter++) { CPPUNIT_ASSERT (it->item() == 2*i); }
+for (it->first(); !it->isDone(); it->next(), nbIter++) {
+if (it->item() != 2*i)
+std::cout << std::endl << "item " << it->item() << " expected: " << 2*i << std::endl;
+CPPUNIT_ASSERT (it->item() == 2*i); }
CPPUNIT_ASSERT (nbIter == 1);
}

@@ -152,7 +155,9 @@ public:
Iterator<NativeInt64>* it = partition[i].iterator(); LOCAL(it);
for (it->first(); !it->isDone(); it->next(), nbIter++)
{
-if (nbIter==0) { CPPUNIT_ASSERT (it->item() == 2*i ); }
+if (nbIter==0) { if (it->item() != 2*i)
+std::cout << "item " << it->item() << " expected: " << 2*i << std::endl;
+CPPUNIT_ASSERT (it->item() == 2*i ); }
if (nbIter==1) { CPPUNIT_ASSERT (it->item() == 2*i+1); }
}
CPPUNIT_ASSERT (nbIter == 2);
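
The test changes above print the observed and expected values just before a boolean assertion, since a bare CPPUNIT_ASSERT only reports the failing expression. The same idea can be factored into a small helper, sketched below. report_if_different is a hypothetical name, and the message format simply copies the one used in the diff.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Returns true when the values match; otherwise writes a diagnostic first, so
// the log shows both values before the test framework reports the failure.
template <class Actual, class Expected>
bool report_if_different(const Actual& actual, const Expected& expected,
                         std::ostream& out) {
    assert(out.good());
    if (actual == expected) { return true; }
    out << "item " << actual << " expected: " << expected << std::endl;
    return false;
}
```

In a CppUnit test this could be used as CPPUNIT_ASSERT (report_if_different (value, expected, std::cout)). CppUnit's own CPPUNIT_ASSERT_EQUAL also reports both values, provided the two arguments share a type that can be streamed.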
# *gatb-core/thirdparty/update-boost.sh*
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad#e79cab2022b11c71b2d8095f0ebe49260b3cc139>
------------------------------------------------------------------------
+#this is the procedure I use to update to newer versions of boost in gatb-core
+#pretty simple but gets the job done
+#to be run within thirdparty/
+#-Rayan
+
+newdir=boost_1_71_0/boost/
+olddir=boost
+
+for file in `ls $olddir`
+do
+echo $file
+cp -R $newdir/$file $olddir/
+done
—
View it on GitLab
<https://salsa.debian.org/med-team/gatb-core/compare/79d0f52f9ef343e1e3980713d2fc11c1f3e51014...c58b23ef69cbe45b0c25effd4d5d8410e7bfb1ad>.