[med-svn] [Git][med-team/xxsds-dynamic][master] 13 commits: Fix Vcs fields

Andreas Tille (@tille) gitlab at salsa.debian.org
Mon Dec 13 19:09:02 GMT 2021



Andreas Tille pushed to branch master at Debian Med / xxsds-dynamic


Commits:
a1a3754b by Andreas Tille at 2021-12-13T20:00:36+01:00
Fix Vcs fields

- - - - -
d3b0dd04 by Andreas Tille at 2021-12-13T20:01:27+01:00
Debian Med team maintenance

- - - - -
4748e52a by Andreas Tille at 2021-12-13T20:06:49+01:00
Use git mode in watch file

- - - - -
4cdccbd6 by Andreas Tille at 2021-12-13T20:07:20+01:00
New upstream version 1.0~alpha.1+git20210426.548c6f7
- - - - -
a82a0a1f by Andreas Tille at 2021-12-13T20:07:20+01:00
Update upstream source from tag 'upstream/1.0_alpha.1+git20210426.548c6f7'

Update to upstream version '1.0~alpha.1+git20210426.548c6f7'
with Debian dir 0e426670ec76c373b752fc40f2bf3e4239f81e9a
- - - - -
9575a435 by Andreas Tille at 2021-12-13T20:07:39+01:00
routine-update: Standards-Version: 4.6.0

- - - - -
150db82a by Andreas Tille at 2021-12-13T20:07:39+01:00
routine-update: debhelper-compat 13

- - - - -
7ae43126 by Andreas Tille at 2021-12-13T20:07:41+01:00
routine-update: Remove trailing whitespace in debian/changelog

- - - - -
e4b2846d by Andreas Tille at 2021-12-13T20:07:41+01:00
routine-update: Rules-Requires-Root: no

- - - - -
8e63d7f4 by Andreas Tille at 2021-12-13T20:07:44+01:00
Set upstream metadata fields: Bug-Database, Bug-Submit.

Changes-By: lintian-brush
Fixes: lintian: upstream-metadata-file-is-missing
See-also: https://lintian.debian.org/tags/upstream-metadata-file-is-missing.html
Fixes: lintian: upstream-metadata-missing-bug-tracking
See-also: https://lintian.debian.org/tags/upstream-metadata-missing-bug-tracking.html

- - - - -
cf6748bc by Andreas Tille at 2021-12-13T20:07:45+01:00
Avoid explicitly specifying -Wl,--as-needed linker flag.

Changes-By: lintian-brush
Fixes: lintian: debian-rules-uses-as-needed-linker-flag
See-also: https://lintian.debian.org/tags/debian-rules-uses-as-needed-linker-flag.html

- - - - -
451f824c by Andreas Tille at 2021-12-13T20:07:47+01:00
Apply multi-arch hints.
+ libxxsds-dynamic-dev: Add Multi-Arch: foreign.

Changes-By: apply-multiarch-hints

- - - - -
8e74cdc1 by Andreas Tille at 2021-12-13T20:08:36+01:00
routine-update: Ready to upload to unstable

- - - - -


7 changed files:

- debian/changelog
- debian/control
- debian/rules
- + debian/upstream/metadata
- debian/watch
- h0_lz77.cpp
- include/dynamic/algorithms/h0_lz77.hpp


Changes:

=====================================
debian/changelog
=====================================
@@ -1,3 +1,20 @@
+xxsds-dynamic (1.0~alpha.1+git20210426.548c6f7-1) unstable; urgency=medium
+
+  * Team upload.
+  * Fix Vcs fields
+  * Debian Med team maintenance
+  * Use git mode in watch file
+  * Standards-Version: 4.6.0 (routine-update)
+  * debhelper-compat 13 (routine-update)
+  * Remove trailing whitespace in debian/changelog (routine-update)
+  * Rules-Requires-Root: no (routine-update)
+  * Set upstream metadata fields: Bug-Database, Bug-Submit.
+  * Avoid explicitly specifying -Wl,--as-needed linker flag.
+  * Apply multi-arch hints.
+    + libxxsds-dynamic-dev: Add Multi-Arch: foreign.
+
+ -- Andreas Tille <tille at debian.org>  Mon, 13 Dec 2021 20:07:49 +0100
+
 xxsds-dynamic (1.0~alpha.1+2020072524git5390b6c-3) unstable; urgency=medium
 
   * Fixed omitted trailer in d/changelog
@@ -14,7 +31,7 @@ xxsds-dynamic (1.0~alpha.1+2020072524git5390b6c-2) unstable; urgency=medium
     (thanks go to FTPmaster Thorsten for spotting that omission).
 
  -- Steffen Moeller <moeller at debian.org>  Sun, 13 Sep 2020 19:48:50 +0200
-    
+
 xxsds-dynamic (1.0~alpha.1+2020072524git5390b6c-1) UNRELEASED; urgency=medium
 
   * Initial submission to main.


=====================================
debian/control
=====================================
@@ -1,19 +1,22 @@
 Source: xxsds-dynamic
 Priority: optional
-Maintainer: Steffen Moeller <moeller at debian.org>
-Build-Depends: debhelper-compat (= 12),
+Maintainer: Debian Med Packaging Team <debian-med-packaging at lists.alioth.debian.org>
+Uploaders: Steffen Moeller <moeller at debian.org>
+Build-Depends: debhelper-compat (= 13),
                cmake,
                libtsl-hopscotch-map-dev
-Standards-Version: 4.5.0
+Standards-Version: 4.6.0
 Section: libs
+Vcs-Browser: https://salsa.debian.org/med-team/xxsds-dynamic
+Vcs-Git: https://salsa.debian.org/med-team/xxsds-dynamic.git
 Homepage: https://github.com/xxsds/DYNAMIC
-Vcs-Browser: https://salsa.debian.org/med-team/dynamic
-Vcs-Git: https://salsa.debian.org/med-team/dynamic.git
+Rules-Requires-Root: no
 
 Package: libxxsds-dynamic-dev
 Section: libdevel
 Architecture: all
 Depends: ${misc:Depends}
+Multi-Arch: foreign
 Description: succinct and compressed fully-dynamic data structures library
  This library offers space- and time-efficient implementations of some
  basic succinct/compressed dynamic data structures. It only ships header


=====================================
debian/rules
=====================================
@@ -3,7 +3,6 @@ export DH_VERBOSE = 1
 
 export DEB_BUILD_MAINT_OPTIONS = hardening=+all
 export DEB_CFLAGS_MAINT_APPEND  = -Wall -pedantic
-export DEB_LDFLAGS_MAINT_APPEND = -Wl,--as-needed
 
 
 %:


=====================================
debian/upstream/metadata
=====================================
@@ -0,0 +1,3 @@
+---
+Bug-Database: https://github.com/xxsds/DYNAMIC/issues
+Bug-Submit: https://github.com/xxsds/DYNAMIC/issues/new


=====================================
debian/watch
=====================================
@@ -1,4 +1,8 @@
 version=4
-opts="uversionmangle=s/-/~/" \
-   https://github.com/xxsds/DYNAMIC/tags \
-   (?:.*?/)?v(\d\.[0-9alphbetgm.-]*)\.tar\.gz
+
+opts="uversionmangle=s/-/~/,mode=git,pretty=1.0~alpha.1+git%cd.%h" \
+    https://github.com/xxsds/DYNAMIC.git HEAD
+
+#opts="uversionmangle=s/-/~/" \
+#   https://github.com/xxsds/DYNAMIC/tags \
+#   (?:.*?/)?v(\d\.[0-9alphbetgm.-]*)\.tar\.gz
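
The new `pretty=` rule above is what produces the upstream version string seen in this push. As a rough illustration only: `%cd` stands for the commit date and `%h` for the abbreviated commit hash, and the date is assumed here to be rendered as `%Y%m%d` (believed to be uscan's default `date=` rule). The helper name `expand_pretty` is hypothetical and is not part of uscan.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, for illustration only: expands the two placeholders
// used in the watch file's pretty= rule.
//   %cd -> commit date (assumed to be already formatted as YYYYMMDD)
//   %h  -> abbreviated commit hash
std::string expand_pretty(std::string rule,
                          const std::string& commit_date,
                          const std::string& short_hash) {
    assert(!short_hash.empty());  // a version without a hash would be ambiguous

    // Replace the first occurrence of key in rule, if there is one.
    auto replace_first = [&rule](const std::string& key, const std::string& value) {
        const std::string::size_type pos = rule.find(key);
        if (pos != std::string::npos) rule.replace(pos, key.size(), value);
    };

    replace_first("%cd", commit_date);
    replace_first("%h", short_hash);
    return rule;
}
```

With the rule from the watch file, the date `20210426` and the hash `548c6f7`, this yields `1.0~alpha.1+git20210426.548c6f7`, the version named in the changelog entry.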


=====================================
h0_lz77.cpp
=====================================
@@ -23,65 +23,105 @@
 using namespace std;
 using namespace dyn;
 
+ulint sa_rate = 0;
+bool int_file = false;
+
+void help(){
+
+cout << "Build LZ77 using a zero-order compressed FM index." << endl << endl;
+		cout << "Usage: h0_lz77 [options] <input_file> <output_file> " << endl;
+		cout << "Options: " << endl;
+		cout << "-s <sample_rate>   store one SA sample every sample_rate positions. default: 256." << endl;
+		cout << "-i                 Interpret the file as a stream of 32-bits integers." << endl;
+		cout << "input_file: file to be parsed" << endl;
+		cout << "output_file: LZ77 triples <start,length,trailing_character> will be saved in binary format in this file" << endl << endl;
+		cout << "Note: the file should terminate with a character (or int if -i) not appearing elsewhere." << endl;
+
+		exit(0);
+
+}
+
+
+void parse_args(char** argv, int argc, int &ptr){
+
+	assert(ptr<argc);
+
+	string s(argv[ptr]);
+	ptr++;
+
+	if(s.compare("-s")==0){
+
+		sa_rate = atoi(argv[ptr++]);
+
+	}else if(s.compare("-i")==0){
+
+		int_file = true;
+
+	}else{
+		cout << "Error: unrecognized '" << s << "' option." << endl;
+		help();
+	}
+
+}
+
+
 int main(int argc,char** argv) {
 
 	using std::chrono::high_resolution_clock;
 	using std::chrono::duration_cast;
 	using std::chrono::duration;
 
-	if(argc!=3 and argc !=4){
+	if(argc < 3) help();
 
-		cout << "Build LZ77 using a zero-order compressed FM index." << endl << endl;
-		cout << "Usage: h0_lz77 [sample_rate] <input_file> <output_file> " << endl;
-		cout << "   sample_rate: store one SA sample every sample_rate positions. default: 256." << endl;
-		cout << "   input_file: file to be parsed" << endl;
-		cout << "   output_file: LZ77 triples <start,length,char> will be saved in text format in this file" << endl;
+	//parse options
 
-		exit(0);
+	int ptr = 1;
 
-	}
+	if(argc<3) help();
 
-	using lz77_t = h0_lz77<wt_fmi>;
+	while(ptr<argc-2)
+		parse_args(argv, argc, ptr);
 
-	/*
-	 * uncomment this (and comment the above line) to use instead a
-	 * run-length encoded FM index.
-	 */
-	//using lz77_t = h0_lz77<rle_fmi>;
+	string in = string(argv[ptr++]);
+	string out = string(argv[ptr]);
 
-	auto t1 = high_resolution_clock::now();
+	using lz77_t = h0_lz77<wt_fmi>;
 
 	lz77_t lz77;
 	ulint DEFAULT_SA_RATE = lz77_t::DEFAULT_SA_RATE;
 
-	ulint sa_rate = argc == 3 ? DEFAULT_SA_RATE : atoi(argv[1]);
+	sa_rate = not sa_rate ? DEFAULT_SA_RATE : sa_rate;
 
-	sa_rate = sa_rate == 0 ? 1 : sa_rate;
+	auto t1 = high_resolution_clock::now();
 
-	string in(argv[1+(argc==4)]);
-	string out(argv[2+(argc==4)]);
 
 	cout << "Sample rate is " << sa_rate << endl;
 
-	{
+	if(not int_file){
 
-		cout << "Detecting alphabet ... " << flush;
-		std::ifstream ifs(in);
+		{
+			cout << "Detecting alphabet ... " << flush;
+			std::ifstream ifs(in);
 
-		lz77 = lz77_t(ifs, sa_rate);
-		ifs.close();
+			lz77 = lz77_t(ifs, sa_rate);
 
-		cout << "done." << endl;
+			cout << "done." << endl;
+		}
 
-	}
+		std::ifstream ifs(in);
+		std::ofstream os(out, ios::binary);
+
+		lz77.parse(ifs,os,1,true);
 
-	std::ifstream ifs(in);
-	std::ofstream os(out);
+	}else{
 
-	lz77.parse(ifs,os,15,true);
+		lz77 = lz77_t(~uint(0), sa_rate);
+		std::ifstream ifs(in, ios::binary);
+		std::ofstream os(out, ios::binary);
 
-	ifs.close();
-	os.close();
+		lz77.parse_int(ifs,os,1,true);
+
+	}
 
 	auto t2 = high_resolution_clock::now();
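
The option loop in the patch treats everything before the last two arguments as options, and `-s` reads its value with `argv[ptr++]` without checking how many arguments remain. An invocation such as `h0_lz77 -s in out` would therefore consume `in` as the sample rate and run past the end of `argv`. A minimal sketch of a stricter variant follows; the `Options` struct and `parse_options` function are hypothetical names, not upstream code, and the argument vector is assumed to exclude the program name.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical container; the field names mirror the variables in h0_lz77.cpp.
struct Options {
    std::uint64_t sa_rate = 0;  // 0 means "use the default", as in the patch
    bool int_file = false;
    std::string in;
    std::string out;
};

// True if s is a non-empty run of at most 18 decimal digits (fits in 64 bits).
static bool is_small_number(const std::string& s) {
    if (s.empty() || s.size() > 18) return false;
    for (unsigned char ch : s)
        if (!std::isdigit(ch)) return false;
    return true;
}

// Parses "[options] <input_file> <output_file>". Returns false on any
// malformed command line instead of reading past the argument list.
bool parse_options(const std::vector<std::string>& args, Options& opt) {
    if (args.size() < 2) return false;  // both file names are mandatory

    std::size_t i = 0;
    // Everything before the last two arguments is treated as an option.
    while (i + 2 < args.size()) {
        const std::string& a = args[i++];
        if (a == "-i") {
            opt.int_file = true;
        } else if (a == "-s") {
            // The value must not be one of the two trailing file names.
            if (i + 3 > args.size() || !is_small_number(args[i])) return false;
            opt.sa_rate = std::stoull(args[i++]);
        } else {
            return false;  // unrecognized option
        }
    }

    assert(i + 2 == args.size());  // exactly the two file names remain
    opt.in = args[i];
    opt.out = args[i + 1];
    return true;
}
```

A caller could build the vector from `argv + 1` to `argv + argc` and print the usage text when the function returns false.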
 


=====================================
include/dynamic/algorithms/h0_lz77.hpp
=====================================
@@ -104,9 +104,7 @@ public:
 	 * input: an input stream and an output stream
 	 * the algorithms scans the input (just 1 scan) and
 	 * saves to the output stream (could be a file) a series
-	 * of triples <pos,len,c> of type <ulint,ulint,uchar>. Types
-	 * are converted to char* before streaming them to out
-	 * (i.e. ulint to 8 bytes and uchar to 1 byte). len is the length
+	 * of triples <pos,len,c> of type <ulint,ulint,uchar>. len is the length
 	 * of the copied string (i.e. excluded skipped characters in the end)
 	 *
 	 * after the end of a phrase, skip 'skip'>0 characters, included trailing character (LZ77
@@ -184,12 +182,9 @@ public:
 					exit(0);
 				}
 
-				auto start = (char*)(new ulint(p));
-				auto l = (char*)(new ulint(len));
-
-				out.write(start,sizeof(ulint));
-				out.write(l,sizeof(ulint));
-				out.write(&cc,1);
+				out.write((char*)&p,sizeof(ulint));
+				out.write((char*)&len,sizeof(ulint));
+				out.write((char*)&cc,sizeof(cc));
 
 				gamma_bits += gamma(uint64_t(backward_pos+1));
 				gamma_bits += gamma(uint64_t(len+1));
@@ -199,10 +194,6 @@ public:
 				delta_bits += delta(uint64_t(len+1));
 				delta_bits += delta(uint64_t(uint8_t(cc)));
 
-
-				delete start;
-				delete l;
-
 				z++;
 				len = 0;
 				p = 0;
@@ -248,6 +239,140 @@ public:
 
 	}
 
+	/*
+	 * input: an input integer stream (32 bits) and an output stream
+	 * the algorithms scans the input (just 1 scan) and
+	 * saves to the output stream (could be a file) a series
+	 * of triples <pos,len,c> of type <ulint,ulint,int>. len is the length
+	 * of the copied string (i.e. excluded skipped characters in the end)
+	 *
+	 * after the end of a phrase, skip 'skip'>0 characters, included trailing character (LZ77
+	 * sparsification, experimental)
+	 *
+	 * to get also the last factor, input stream should
+	 * terminate with a character that does not appear elsewhere
+	 * in the stream
+	 *
+	 */
+	void parse_int(istream& in, ostream& out, ulint skip = 1, bool verbose = false){
+
+		//size of the output if this is compressed using gamma/delta encoding
+		uint64_t gamma_bits = 0;
+		uint64_t delta_bits = 0;
+
+		assert(skip>0);
+
+		long int step = 100000;	//print status every step characters
+		long int last_step = 0;
+
+		assert(fmi.size()==1);	//only terminator
+
+		pair<ulint, ulint> range = fmi.get_full_interval();	//BWT range of current phrase
+
+		ulint len = 0;	//length of current LZ phrase
+		ulint i = 0;	//position of terminator character in bwt
+		ulint p = 0;	//phrase occurrence
+
+		ulint z = 0; 	//number of LZ77 phrases
+
+		if(verbose) cout << "Parsing ..." << endl;
+
+		int cc;
+		ulint n = 0;
+		while(in.read((char*)&cc,sizeof(int))){
+
+			n++;
+			//cout << cc;
+
+			if(verbose){
+
+				if(n>last_step+(step-1)){
+
+					last_step = n;
+					cout << " " << n << " integers processed ..." << endl;
+
+				}
+
+			}
+
+			uint c(cc);
+
+			auto new_range = fmi.LF(range,c);
+
+			if(new_range.first >= new_range.second){
+
+				//cout << ":";
+
+				//empty range: new factor
+
+				ulint occ;
+
+				if(len>0){
+
+					occ = i == range.first ? range.second-1 : range.first;
+					p = fmi.locate(occ) - len;
+
+				}
+
+				fmi.extend(c);
+
+				uint64_t backward_pos = len == 0 ? 0 : (fmi.text_length() - len - 1) - p;
+
+				if(backward_pos > fmi.text_length()){
+					cout << "err" << endl;
+					exit(0);
+				}
+
+				out.write((char*)&p,sizeof(ulint));
+				out.write((char*)&len,sizeof(ulint));
+				out.write((char*)&cc,sizeof(cc));
+
+				z++;
+				len = 0;
+				p = 0;
+
+				//skip characters
+
+				ulint k = 0;
+
+				while(k < skip-1 && in.read((char*)&cc,sizeof(int))){
+
+					//cout << cc;
+
+					fmi.extend(uint(cc));
+					k++;
+					n++;
+
+				}
+
+				//cout << "|";
+
+				range = fmi.get_full_interval();
+
+			}else{
+
+				len++;			//increase current phrase length
+				fmi.extend(c);	//insert character c in the BWT
+				i = fmi.get_terminator_position();				//get new terminator position
+				range = {new_range.first, new_range.second+1};	//new suffix falls inside current range: extend
+
+			}
+
+
+		}
+
+		if(verbose){
+
+			cout << "\nNumber of integers: " << n << endl;
+			cout << "Number of LZ77 phrases: " << z << endl;
+
+
+		}
+
+
+	}
+
+
 	/*
 	 * Total number of bits allocated in RAM for this structure
 	 *
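
After this change, `parse` and `parse_int` write each phrase as three raw fields in a row (`pos`, `len`, trailing symbol), taken straight from the variables' memory without the earlier heap copies. The sketch below shows how such a file could be written and read back. It assumes `ulint` is a 64-bit unsigned integer and that the trailing symbol is 1 byte in character mode and a 4-byte `int` with `-i`; the `Triple`, `write_triple` and `read_triples` names are hypothetical and not part of DYNAMIC.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

using ulint = std::uint64_t;  // assumption: matches DYNAMIC's ulint

// One LZ77 phrase. C is unsigned char for parse() output and int for
// parse_int() output (assumption based on the patch).
template <typename C>
struct Triple {
    ulint pos;  // start of the copied string
    ulint len;  // length of the copied string
    C c;        // trailing symbol
};

// Writes the fields back to back in native byte order, with no padding,
// mirroring the three out.write calls in the patch.
template <typename C>
void write_triple(std::ostream& out, const Triple<C>& t) {
    assert(out.good());  // precondition: the stream is writable
    out.write(reinterpret_cast<const char*>(&t.pos), sizeof t.pos);
    out.write(reinterpret_cast<const char*>(&t.len), sizeof t.len);
    out.write(reinterpret_cast<const char*>(&t.c), sizeof t.c);
}

// Reads triples until the stream ends; a truncated final record is dropped.
template <typename C>
std::vector<Triple<C>> read_triples(std::istream& in) {
    std::vector<Triple<C>> result;
    Triple<C> t{};
    while (in.read(reinterpret_cast<char*>(&t.pos), sizeof t.pos) &&
           in.read(reinterpret_cast<char*>(&t.len), sizeof t.len) &&
           in.read(reinterpret_cast<char*>(&t.c), sizeof t.c)) {
        result.push_back(t);
    }
    return result;
}
```

Because the fields are dumped in native byte order, a file produced this way is only readable on machines with the same endianness and the same `int` width as the one that wrote it.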



View it on GitLab: https://salsa.debian.org/med-team/xxsds-dynamic/-/compare/79ff5f4c3b95d8f14e9200f6588620206d98b292...8e74cdc1b7c31209c14dd30a8646ce30668866e7
