[med-svn] [Git][med-team/xxsds-dynamic][master] 13 commits: Fix Vcs fields
Andreas Tille (@tille)
gitlab at salsa.debian.org
Mon Dec 13 19:09:02 GMT 2021
Andreas Tille pushed to branch master at Debian Med / xxsds-dynamic
Commits:
a1a3754b by Andreas Tille at 2021-12-13T20:00:36+01:00
Fix Vcs fields
- - - - -
d3b0dd04 by Andreas Tille at 2021-12-13T20:01:27+01:00
Debian Med team maintenance
- - - - -
4748e52a by Andreas Tille at 2021-12-13T20:06:49+01:00
Use git mode in watch file
- - - - -
4cdccbd6 by Andreas Tille at 2021-12-13T20:07:20+01:00
New upstream version 1.0~alpha.1+git20210426.548c6f7
- - - - -
a82a0a1f by Andreas Tille at 2021-12-13T20:07:20+01:00
Update upstream source from tag 'upstream/1.0_alpha.1+git20210426.548c6f7'
Update to upstream version '1.0~alpha.1+git20210426.548c6f7'
with Debian dir 0e426670ec76c373b752fc40f2bf3e4239f81e9a
- - - - -
9575a435 by Andreas Tille at 2021-12-13T20:07:39+01:00
routine-update: Standards-Version: 4.6.0
- - - - -
150db82a by Andreas Tille at 2021-12-13T20:07:39+01:00
routine-update: debhelper-compat 13
- - - - -
7ae43126 by Andreas Tille at 2021-12-13T20:07:41+01:00
routine-update: Remove trailing whitespace in debian/changelog
- - - - -
e4b2846d by Andreas Tille at 2021-12-13T20:07:41+01:00
routine-update: Rules-Requires-Root: no
- - - - -
8e63d7f4 by Andreas Tille at 2021-12-13T20:07:44+01:00
Set upstream metadata fields: Bug-Database, Bug-Submit.
Changes-By: lintian-brush
Fixes: lintian: upstream-metadata-file-is-missing
See-also: https://lintian.debian.org/tags/upstream-metadata-file-is-missing.html
Fixes: lintian: upstream-metadata-missing-bug-tracking
See-also: https://lintian.debian.org/tags/upstream-metadata-missing-bug-tracking.html
- - - - -
cf6748bc by Andreas Tille at 2021-12-13T20:07:45+01:00
Avoid explicitly specifying -Wl,--as-needed linker flag.
Changes-By: lintian-brush
Fixes: lintian: debian-rules-uses-as-needed-linker-flag
See-also: https://lintian.debian.org/tags/debian-rules-uses-as-needed-linker-flag.html
- - - - -
451f824c by Andreas Tille at 2021-12-13T20:07:47+01:00
Apply multi-arch hints.
+ libxxsds-dynamic-dev: Add Multi-Arch: foreign.
Changes-By: apply-multiarch-hints
- - - - -
8e74cdc1 by Andreas Tille at 2021-12-13T20:08:36+01:00
routine-update: Ready to upload to unstable
- - - - -
7 changed files:
- debian/changelog
- debian/control
- debian/rules
- + debian/upstream/metadata
- debian/watch
- h0_lz77.cpp
- include/dynamic/algorithms/h0_lz77.hpp
Changes:
=====================================
debian/changelog
=====================================
@@ -1,3 +1,20 @@
+xxsds-dynamic (1.0~alpha.1+git20210426.548c6f7-1) unstable; urgency=medium
+
+ * Team upload.
+ * Fix Vcs fields
+ * Debian Med team maintenance
+ * Use git mode in watch file
+ * Standards-Version: 4.6.0 (routine-update)
+ * debhelper-compat 13 (routine-update)
+ * Remove trailing whitespace in debian/changelog (routine-update)
+ * Rules-Requires-Root: no (routine-update)
+ * Set upstream metadata fields: Bug-Database, Bug-Submit.
+ * Avoid explicitly specifying -Wl,--as-needed linker flag.
+ * Apply multi-arch hints.
+ + libxxsds-dynamic-dev: Add Multi-Arch: foreign.
+
+ -- Andreas Tille <tille at debian.org> Mon, 13 Dec 2021 20:07:49 +0100
+
xxsds-dynamic (1.0~alpha.1+2020072524git5390b6c-3) unstable; urgency=medium
* Fixed omitted trailer in d/changelog
@@ -14,7 +31,7 @@ xxsds-dynamic (1.0~alpha.1+2020072524git5390b6c-2) unstable; urgency=medium
(thanks go to FTPmaster Thorsten for spotting that omission).
-- Steffen Moeller <moeller at debian.org> Sun, 13 Sep 2020 19:48:50 +0200
-
+
xxsds-dynamic (1.0~alpha.1+2020072524git5390b6c-1) UNRELEASED; urgency=medium
* Initial submission to main.
=====================================
debian/control
=====================================
@@ -1,19 +1,22 @@
Source: xxsds-dynamic
Priority: optional
-Maintainer: Steffen Moeller <moeller at debian.org>
-Build-Depends: debhelper-compat (= 12),
+Maintainer: Debian Med Packaging Team <debian-med-packaging at lists.alioth.debian.org>
+Uploaders: Steffen Moeller <moeller at debian.org>
+Build-Depends: debhelper-compat (= 13),
cmake,
libtsl-hopscotch-map-dev
-Standards-Version: 4.5.0
+Standards-Version: 4.6.0
Section: libs
+Vcs-Browser: https://salsa.debian.org/med-team/xxsds-dynamic
+Vcs-Git: https://salsa.debian.org/med-team/xxsds-dynamic.git
Homepage: https://github.com/xxsds/DYNAMIC
-Vcs-Browser: https://salsa.debian.org/med-team/dynamic
-Vcs-Git: https://salsa.debian.org/med-team/dynamic.git
+Rules-Requires-Root: no
Package: libxxsds-dynamic-dev
Section: libdevel
Architecture: all
Depends: ${misc:Depends}
+Multi-Arch: foreign
Description: succinct and compressed fully-dynamic data structures library
This library offers space- and time-efficient implementations of some
basic succinct/compressed dynamic data structures. It only ships header
=====================================
debian/rules
=====================================
@@ -3,7 +3,6 @@ export DH_VERBOSE = 1
export DEB_BUILD_MAINT_OPTIONS = hardening=+all
export DEB_CFLAGS_MAINT_APPEND = -Wall -pedantic
-export DEB_LDFLAGS_MAINT_APPEND = -Wl,--as-needed
%:
=====================================
debian/upstream/metadata
=====================================
@@ -0,0 +1,3 @@
+---
+Bug-Database: https://github.com/xxsds/DYNAMIC/issues
+Bug-Submit: https://github.com/xxsds/DYNAMIC/issues/new
=====================================
debian/watch
=====================================
@@ -1,4 +1,8 @@
version=4
-opts="uversionmangle=s/-/~/" \
- https://github.com/xxsds/DYNAMIC/tags \
- (?:.*?/)?v(\d\.[0-9alphbetgm.-]*)\.tar\.gz
+
+opts="uversionmangle=s/-/~/,mode=git,pretty=1.0~alpha.1+git%cd.%h" \
+ https://github.com/xxsds/DYNAMIC.git HEAD
+
+#opts="uversionmangle=s/-/~/" \
+# https://github.com/xxsds/DYNAMIC/tags \
+# (?:.*?/)?v(\d\.[0-9alphbetgm.-]*)\.tar\.gz
=====================================
h0_lz77.cpp
=====================================
@@ -23,65 +23,105 @@
using namespace std;
using namespace dyn;
+ulint sa_rate = 0;
+bool int_file = false;
+
+void help(){
+
+cout << "Build LZ77 using a zero-order compressed FM index." << endl << endl;
+ cout << "Usage: h0_lz77 [options] <input_file> <output_file> " << endl;
+ cout << "Options: " << endl;
+ cout << "-s <sample_rate> store one SA sample every sample_rate positions. default: 256." << endl;
+ cout << "-i Interpret the file as a stream of 32-bits integers." << endl;
+ cout << "input_file: file to be parsed" << endl;
+ cout << "output_file: LZ77 triples <start,length,trailing_character> will be saved in binary format in this file" << endl << endl;
+ cout << "Note: the file should terminate with a character (or int if -i) not appearing elsewhere." << endl;
+
+ exit(0);
+
+}
+
+
+void parse_args(char** argv, int argc, int &ptr){
+
+ assert(ptr<argc);
+
+ string s(argv[ptr]);
+ ptr++;
+
+ if(s.compare("-s")==0){
+
+ sa_rate = atoi(argv[ptr++]);
+
+ }else if(s.compare("-i")==0){
+
+ int_file = true;
+
+ }else{
+ cout << "Error: unrecognized '" << s << "' option." << endl;
+ help();
+ }
+
+}
+
+
int main(int argc,char** argv) {
using std::chrono::high_resolution_clock;
using std::chrono::duration_cast;
using std::chrono::duration;
- if(argc!=3 and argc !=4){
+ if(argc < 3) help();
- cout << "Build LZ77 using a zero-order compressed FM index." << endl << endl;
- cout << "Usage: h0_lz77 [sample_rate] <input_file> <output_file> " << endl;
- cout << " sample_rate: store one SA sample every sample_rate positions. default: 256." << endl;
- cout << " input_file: file to be parsed" << endl;
- cout << " output_file: LZ77 triples <start,length,char> will be saved in text format in this file" << endl;
+ //parse options
- exit(0);
+ int ptr = 1;
- }
+ if(argc<3) help();
- using lz77_t = h0_lz77<wt_fmi>;
+ while(ptr<argc-2)
+ parse_args(argv, argc, ptr);
- /*
- * uncomment this (and comment the above line) to use instead a
- * run-length encoded FM index.
- */
- //using lz77_t = h0_lz77<rle_fmi>;
+ string in = string(argv[ptr++]);
+ string out = string(argv[ptr]);
- auto t1 = high_resolution_clock::now();
+ using lz77_t = h0_lz77<wt_fmi>;
lz77_t lz77;
ulint DEFAULT_SA_RATE = lz77_t::DEFAULT_SA_RATE;
- ulint sa_rate = argc == 3 ? DEFAULT_SA_RATE : atoi(argv[1]);
+ sa_rate = not sa_rate ? DEFAULT_SA_RATE : sa_rate;
- sa_rate = sa_rate == 0 ? 1 : sa_rate;
+ auto t1 = high_resolution_clock::now();
- string in(argv[1+(argc==4)]);
- string out(argv[2+(argc==4)]);
cout << "Sample rate is " << sa_rate << endl;
- {
+ if(not int_file){
- cout << "Detecting alphabet ... " << flush;
- std::ifstream ifs(in);
+ {
+ cout << "Detecting alphabet ... " << flush;
+ std::ifstream ifs(in);
- lz77 = lz77_t(ifs, sa_rate);
- ifs.close();
+ lz77 = lz77_t(ifs, sa_rate);
- cout << "done." << endl;
+ cout << "done." << endl;
+ }
- }
+ std::ifstream ifs(in);
+ std::ofstream os(out, ios::binary);
+
+ lz77.parse(ifs,os,1,true);
- std::ifstream ifs(in);
- std::ofstream os(out);
+ }else{
- lz77.parse(ifs,os,15,true);
+ lz77 = lz77_t(~uint(0), sa_rate);
+ std::ifstream ifs(in, ios::binary);
+ std::ofstream os(out, ios::binary);
- ifs.close();
- os.close();
+ lz77.parse_int(ifs,os,1,true);
+
+ }
auto t2 = high_resolution_clock::now();
=====================================
include/dynamic/algorithms/h0_lz77.hpp
=====================================
@@ -104,9 +104,7 @@ public:
* input: an input stream and an output stream
* the algorithms scans the input (just 1 scan) and
* saves to the output stream (could be a file) a series
- * of triples <pos,len,c> of type <ulint,ulint,uchar>. Types
- * are converted to char* before streaming them to out
- * (i.e. ulint to 8 bytes and uchar to 1 byte). len is the length
+ * of triples <pos,len,c> of type <ulint,ulint,uchar>. len is the length
* of the copied string (i.e. excluded skipped characters in the end)
*
* after the end of a phrase, skip 'skip'>0 characters, included trailing character (LZ77
@@ -184,12 +182,9 @@ public:
exit(0);
}
- auto start = (char*)(new ulint(p));
- auto l = (char*)(new ulint(len));
-
- out.write(start,sizeof(ulint));
- out.write(l,sizeof(ulint));
- out.write(&cc,1);
+ out.write((char*)&p,sizeof(ulint));
+ out.write((char*)&len,sizeof(ulint));
+ out.write((char*)&cc,sizeof(cc));
gamma_bits += gamma(uint64_t(backward_pos+1));
gamma_bits += gamma(uint64_t(len+1));
@@ -199,10 +194,6 @@ public:
delta_bits += delta(uint64_t(len+1));
delta_bits += delta(uint64_t(uint8_t(cc)));
-
- delete start;
- delete l;
-
z++;
len = 0;
p = 0;
@@ -248,6 +239,140 @@ public:
}
+ /*
+ * input: an input integer stream (32 bits) and an output stream
+ * the algorithms scans the input (just 1 scan) and
+ * saves to the output stream (could be a file) a series
+ * of triples <pos,len,c> of type <ulint,ulint,int>. len is the length
+ * of the copied string (i.e. excluded skipped characters in the end)
+ *
+ * after the end of a phrase, skip 'skip'>0 characters, included trailing character (LZ77
+ * sparsification, experimental)
+ *
+ * to get also the last factor, input stream should
+ * terminate with a character that does not appear elsewhere
+ * in the stream
+ *
+ */
+ void parse_int(istream& in, ostream& out, ulint skip = 1, bool verbose = false){
+
+ //size of the output if this is compressed using gamma/delta encoding
+ uint64_t gamma_bits = 0;
+ uint64_t delta_bits = 0;
+
+ assert(skip>0);
+
+ long int step = 100000; //print status every step characters
+ long int last_step = 0;
+
+ assert(fmi.size()==1); //only terminator
+
+ pair<ulint, ulint> range = fmi.get_full_interval(); //BWT range of current phrase
+
+ ulint len = 0; //length of current LZ phrase
+ ulint i = 0; //position of terminator character in bwt
+ ulint p = 0; //phrase occurrence
+
+ ulint z = 0; //number of LZ77 phrases
+
+ if(verbose) cout << "Parsing ..." << endl;
+
+ int cc;
+ ulint n = 0;
+ while(in.read((char*)&cc,sizeof(int))){
+
+ n++;
+ //cout << cc;
+
+ if(verbose){
+
+ if(n>last_step+(step-1)){
+
+ last_step = n;
+ cout << " " << n << " integers processed ..." << endl;
+
+ }
+
+ }
+
+ uint c(cc);
+
+ auto new_range = fmi.LF(range,c);
+
+ if(new_range.first >= new_range.second){
+
+ //cout << ":";
+
+ //empty range: new factor
+
+ ulint occ;
+
+ if(len>0){
+
+ occ = i == range.first ? range.second-1 : range.first;
+ p = fmi.locate(occ) - len;
+
+ }
+
+ fmi.extend(c);
+
+ uint64_t backward_pos = len == 0 ? 0 : (fmi.text_length() - len - 1) - p;
+
+ if(backward_pos > fmi.text_length()){
+ cout << "err" << endl;
+ exit(0);
+ }
+
+ out.write((char*)&p,sizeof(ulint));
+ out.write((char*)&len,sizeof(ulint));
+ out.write((char*)&cc,sizeof(cc));
+
+ z++;
+ len = 0;
+ p = 0;
+
+ //skip characters
+
+ ulint k = 0;
+
+ while(k < skip-1 && in.read((char*)&cc,sizeof(int))){
+
+ //cout << cc;
+
+ fmi.extend(uint(cc));
+ k++;
+ n++;
+
+ }
+
+ //cout << "|";
+
+ range = fmi.get_full_interval();
+
+ }else{
+
+ len++; //increase current phrase length
+ fmi.extend(c); //insert character c in the BWT
+ i = fmi.get_terminator_position(); //get new terminator position
+ range = {new_range.first, new_range.second+1}; //new suffix falls inside current range: extend
+
+ }
+
+
+ }
+
+ if(verbose){
+
+ cout << "\nNumber of integers: " << n << endl;
+ cout << "Number of LZ77 phrases: " << z << endl;
+
+
+ }
+
+
+ }
+
+
/*
* Total number of bits allocated in RAM for this structure
*
View it on GitLab: https://salsa.debian.org/med-team/xxsds-dynamic/-/compare/79ff5f4c3b95d8f14e9200f6588620206d98b292...8e74cdc1b7c31209c14dd30a8646ce30668866e7
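
For readers who want to inspect the binary output written by the revised parse() and parse_int() above, here is a minimal sketch of a matching reader. It is not part of the commit. It assumes ulint is a 64-bit unsigned integer (the comment removed from h0_lz77.hpp mentions "ulint to 8 bytes"), that the file is read on a machine with the same byte order and int size as the one that wrote it, and that each record is laid out as pos, len, trailing symbol with no padding, as the three out.write() calls suggest. The names Triple, write_triple, read_triples and demo_round_trip are hypothetical helpers introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Assumption: ulint is a 64-bit unsigned integer, as the removed
// "ulint to 8 bytes" comment in h0_lz77.hpp suggests.
using ulint = std::uint64_t;

// One LZ77 triple <pos,len,c>. Symbol would be char for parse() output
// and int for parse_int() output (hypothetical type, not from the library).
template <typename Symbol>
struct Triple {
    ulint pos;
    ulint len;
    Symbol c;
};

// Hypothetical writer mirroring the three out.write() calls in the diff:
// each field is written separately, so the record has no struct padding.
template <typename Symbol>
void write_triple(std::ostream& out, const Triple<Symbol>& t) {
    out.write(reinterpret_cast<const char*>(&t.pos), sizeof(ulint));
    out.write(reinterpret_cast<const char*>(&t.len), sizeof(ulint));
    out.write(reinterpret_cast<const char*>(&t.c), sizeof(Symbol));
}

// Hypothetical reader: collects triples until the stream ends or a
// record is truncated. Fields are read one by one for the same reason.
template <typename Symbol>
std::vector<Triple<Symbol>> read_triples(std::istream& in) {
    std::vector<Triple<Symbol>> result;
    Triple<Symbol> t{};
    while (in.read(reinterpret_cast<char*>(&t.pos), sizeof(ulint)) &&
           in.read(reinterpret_cast<char*>(&t.len), sizeof(ulint)) &&
           in.read(reinterpret_cast<char*>(&t.c), sizeof(Symbol))) {
        result.push_back(t);
    }
    return result;
}

// Round trip through an in-memory stream, standing in for the output file.
std::size_t demo_round_trip() {
    std::stringstream buf(std::ios::in | std::ios::out | std::ios::binary);
    write_triple<int>(buf, {0, 0, 7});
    write_triple<int>(buf, {0, 1, 42});
    return read_triples<int>(buf).size();
}
```

To read a real output file, one would open it with std::ifstream in ios::binary mode and call read_triples<char> for default runs or read_triples<int> for runs made with -i.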