[med-svn] [Git][med-team/placnet][master] 16 commits: New upstream version 1.04

Andreas Tille (@tille) gitlab at salsa.debian.org
Tue Sep 28 09:08:41 BST 2021



Andreas Tille pushed to branch master at Debian Med / placnet


Commits:
e8962f52 by Andreas Tille at 2021-09-28T09:35:36+02:00
New upstream version 1.04
- - - - -
ef9c3b3d by Andreas Tille at 2021-09-28T09:35:36+02:00
Update upstream source from tag 'upstream/1.04'

Update to upstream version '1.04'
with Debian dir 88c8ab6eafe603cf94b0267c22b0d58ac45ecc79
- - - - -
dcdab407 by Andreas Tille at 2021-09-28T09:36:32+02:00
New upstream version

- - - - -
13399585 by Andreas Tille at 2021-09-28T09:38:14+02:00
Use fake watch file

- - - - -
3ee6c374 by Andreas Tille at 2021-09-28T09:43:05+02:00
Add some documentation for internal use at RKI

- - - - -
51dd0871 by Andreas Tille at 2021-09-28T09:43:35+02:00
Refresh patch

- - - - -
e338e322 by Andreas Tille at 2021-09-28T09:43:54+02:00
routine-update: Standards-Version: 4.6.0

- - - - -
936a2fd7 by Andreas Tille at 2021-09-28T09:43:54+02:00
routine-update: debhelper-compat 13

- - - - -
8449607f by Andreas Tille at 2021-09-28T09:43:57+02:00
routine-update: Secure URI in copyright format

- - - - -
22aa678e by Andreas Tille at 2021-09-28T10:05:36+02:00
d/rules: do not read d/changelog

- - - - -
9b23aef1 by Andreas Tille at 2021-09-28T10:05:57+02:00
Fix d/get-orig-source

- - - - -
28024d78 by Andreas Tille at 2021-09-28T10:06:05+02:00
routine-update: Do not parse d/changelog

- - - - -
acc37cfe by Andreas Tille at 2021-09-28T10:06:05+02:00
routine-update: Add salsa-ci file

- - - - -
e2030d97 by Andreas Tille at 2021-09-28T10:06:05+02:00
routine-update: Rules-Requires-Root: no

- - - - -
ec35f278 by Andreas Tille at 2021-09-28T10:07:34+02:00
Add debian/source/include-binaries

- - - - -
b18a4fd6 by Andreas Tille at 2021-09-28T10:08:10+02:00
Upload to unstable

- - - - -


14 changed files:

- + CHANGES.txt
- debian/changelog
- − debian/compat
- debian/control
- debian/copyright
- + debian/docs/placnet_documentation.odt
- debian/get-orig-source
- debian/patches/fix_usage_output.patch
- debian/rules
- + debian/salsa-ci.yml
- + debian/source/include-binaries
- debian/watch
- + makeRefDB.pl
- placnet.pl


Changes:

=====================================
CHANGES.txt
=====================================
@@ -0,0 +1,4 @@
+VERSION 1.04
+
+- Fixed placnet to work with new NCBI specifications (remove GI identifiers)
+- Added new script to download and format the Reference Database for Placnet.
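
Note on the GI identifier change: judging from the makeRefDB.pl and placnet.pl hunks in this diff, reference headers are now tagged with a "refDB|" prefix and placnet.pl matches that prefix instead of "gi|". A minimal sketch of the tagging step follows; the accession and description are made up for illustration, and tag_header is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: tag a FASTA header the way makeRefDB.pl tags all.fasta.
# tag_header is a hypothetical name; the sed expression is the one in the diff.
tag_header() {
    sed 's/>/>refDB|/'
}

# Illustrative header in the GI-less style (accession invented for this example)
printf '>NZ_EXAMPLE01.1 Example plasmid\n' | tag_header
# prints: >refDB|NZ_EXAMPLE01.1 Example plasmid

# Count tagged lines, i.e. lines a /refDB\|/ match would select
printf '>NZ_EXAMPLE01.1 Example plasmid\n' | tag_header | grep -c 'refDB|'
# prints: 1
```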


=====================================
debian/changelog
=====================================
@@ -1,9 +1,23 @@
-placnet (1.03-4) UNRELEASED; urgency=medium
+placnet (1.04-1) unstable; urgency=medium
 
+  [ Debian Janitor ]
   * Apply multi-arch hints.
     + placnet: Add Multi-Arch: foreign.
 
- -- Debian Janitor <janitor@jelmer.uk>  Wed, 28 Oct 2020 13:43:49 -0000
+  [ Andreas Tille ]
+  * New upstream version
+  * Use fake watch file
+  * Add some documentation for internal use at RKI
+  * Standards-Version: 4.6.0 (routine-update)
+  * debhelper-compat 13 (routine-update)
+  * Secure URI in copyright format (routine-update)
+  * d/rules: do not read d/changelog
+  * Fix d/get-orig-source
+  * Do not parse d/changelog (routine-update)
+  * Add salsa-ci file (routine-update)
+  * Rules-Requires-Root: no (routine-update)
+
+ -- Andreas Tille <tille@debian.org>  Tue, 28 Sep 2021 10:07:42 +0200
 
 placnet (1.03-3) unstable; urgency=medium
 


=====================================
debian/compat deleted
=====================================
@@ -1 +0,0 @@
-11


=====================================
debian/control
=====================================
@@ -3,11 +3,12 @@ Maintainer: Debian Med Packaging Team <debian-med-packaging@lists.alioth.debian.
 Uploaders: Andreas Tille <tille@debian.org>
 Section: science
 Priority: optional
-Build-Depends: debhelper (>= 11~)
-Standards-Version: 4.1.4
+Build-Depends: debhelper-compat (= 13)
+Standards-Version: 4.6.0
 Vcs-Browser: https://salsa.debian.org/med-team/placnet
 Vcs-Git: https://salsa.debian.org/med-team/placnet.git
 Homepage: http://sourceforge.net/projects/placnet/
+Rules-Requires-Root: no
 
 Package: placnet
 Architecture: all


=====================================
debian/copyright
=====================================
@@ -1,4 +1,4 @@
-Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
+Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
 Upstream-Name: Placnet
 Source: http://sourceforge.net/projects/placnet/files/
 


=====================================
debian/docs/placnet_documentation.odt
=====================================
Binary files /dev/null and b/debian/docs/placnet_documentation.odt differ


=====================================
debian/get-orig-source
=====================================
@@ -21,7 +21,9 @@ mkdir -p ../tarballs
 cd ../tarballs
 mkdir -p ${TARDIR}
 cd ${TARDIR}
-wget -q http://heanet.dl.sourceforge.net/project/placnet/placnet.pl
+wget -q http://master.dl.sourceforge.net/project/placnet/placnet.pl
+wget -q http://master.dl.sourceforge.net/project/placnet/makeRefDB.pl
+wget -q http://master.dl.sourceforge.net/project/placnet/CHANGES.txt
 DOWNLOADVERSION=`grep '^#v[0-9]' placnet.pl | sed 's/^#v//'`
 if [ "${VERSION}" != "${DOWNLOADVERSION}" ] ; then
     echo "The downloaded version $DOWNLOADVERSION does not fit the Debian version $VERSION."
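
Note on the version check: the script reads the upstream version from the "#vX.YY" marker line in placnet.pl and compares it with the Debian version. A small sketch of the extraction alone, fed with the first lines of placnet.pl as they appear after this update; extract_version is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: pull the version out of a "#vX.YY" marker line,
# using the same grep/sed pair as debian/get-orig-source.
extract_version() {
    grep '^#v[0-9]' | sed 's/^#v//'
}

# Head of placnet.pl after the update, reproduced from the diff
printf '#!/usr/bin/perl\n#v1.04\n\nuse strict;\n' | extract_version
# prints: 1.04
```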


=====================================
debian/patches/fix_usage_output.patch
=====================================
@@ -4,8 +4,8 @@ Description: Drop .pl extension in usage output
 
 --- a/placnet.pl
 +++ b/placnet.pl
-@@ -465,8 +465,8 @@ sub usage
- 	print "Placnet v1.03 10/06/2015\n";
+@@ -424,8 +424,8 @@ sub usage
+ 	print "Placnet v1.04 10/15/20116\n";
  	print "writen by: Val F. Lanza (valfernandez.vf\@gmail.com) and Maria de Toro (mdtorohernando\@gmail.com\n\n";\
  	print "Please cite PLACNET as: \nLanza VF, de Toro M, Garcillán-Barcia MP, Mora A, Blanco J, Coque TM, de la Cruz F: \nPlasmid Flux in Escherichia coli ST131 Sublineages, Analyzed by Plasmid Constellation Network (PLACNET),\na New Method for Plasmid Reconstruction from Whole Genome Sequences. \nPLoS Genet 2014, 10:e1004766\n\n";
 -	print "Write inputFile Template\n\nplacnet.pl -generate\n\n";


=====================================
debian/rules
=====================================
@@ -2,14 +2,14 @@
 
 # DH_VERBOSE := 1
 
-DEBPKGNAME     := $(shell dpkg-parsechangelog | awk '/^Source:/ {print $$2}')
+include /usr/share/dpkg/default.mk
 
 %:
 	dh $@
 
 override_dh_install:
-	mkdir -p debian/$(DEBPKGNAME)/usr/bin
-	cp -a placnet.pl debian/$(DEBPKGNAME)/usr/bin/placnet
+	mkdir -p debian/$(DEB_SOURCE)/usr/bin
+	cp -a placnet.pl debian/$(DEB_SOURCE)/usr/bin/placnet
 
 get-orig-source:
 	. debian/get-orig-source


=====================================
debian/salsa-ci.yml
=====================================
@@ -0,0 +1,4 @@
+---
+include:
+  - https://salsa.debian.org/salsa-ci-team/pipeline/raw/master/salsa-ci.yml
+  - https://salsa.debian.org/salsa-ci-team/pipeline/raw/master/pipeline-jobs.yml


=====================================
debian/source/include-binaries
=====================================
@@ -0,0 +1 @@
+debian/docs/placnet_documentation.odt


=====================================
debian/watch
=====================================
@@ -1 +1,7 @@
-# There is no tarball download location, this software is only available in SVN
+# There is no tarball download location, this software is only available at
+#  https://sourceforge.net/projects/placnet/files/
+# with unversioned files
+
+version=4
+opts=dversionmangle=s/.*/0.No-Track/ \
+https://people.debian.org/~eriberto/ FakeWatchNoUpstreamTrackingForThisPackage-(\d\S+)\.gz


=====================================
makeRefDB.pl
=====================================
@@ -0,0 +1,94 @@
+#!/usr/bin/perl
+
+
+########################################################################
+# Perl scritp for download the placnet RefDB database of genomes and   #
+# plasmids from NCBI databases. Script download all complete genomes   #
+# from RefSeq bacteria and all isolate Plamids (whitout associated     #
+# chromosome). Additionally script create a headersRefDB.txt file to   #
+# import description information in Placnet networks                   #
+#                                                                      #
+#                                                                      #
+# Just run: ./makeRefDB                                                #
+#                                                                      #
+# outputs: RefDB.XX.nXX (blast nucleotide database)                    #
+#          headersRefDB.txt (TAB file with genome description)         #
+########################################################################
+
+print("\n\nDownloading index of RefSeq Bacteria Database\n");
+system("wget -nv --show-progress ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/bacteria/assembly_summary.txt");
+
+open(SUM,"assembly_summary.txt");
+open(OUT,">down.list");
+@summary = <SUM>;
+
+@complete = grep(/Complete Genome/,@summary);
+
+foreach $l (@complete)
+{
+	chomp $l;
+	@c = split(/\t/,$l);
+	@c2 = split(/\//,$c[19]);
+	print OUT "$c[-1]/$c2[-1]_genomic.fna.gz\n";	
+}
+close SUM;
+close OUT;
+
+print ("\nDownloading complete genomes...\n");
+system("wget -nv --show-progress -i down.list");
+
+print ("\nDownloading complete plasmids...\n");
+system("wget ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/plasmid/*genomic.fna*");
+
+print("Decompressing files...\n");
+system("gzip -d *.gz");
+system("cat plasmid*.fna > all_plasmid_tmp.fna");
+system("grep '>' GC*fna | cut -f2 --delimiter='>' | cut -f1 --delimiter=' ' > acc.txt");
+
+####removing duplicates between plasmids.*.fna and GCA_*.fna
+open(A,"acc.txt");
+@acc = <A>;
+close A;
+
+foreach $l (@acc)
+{
+	chomp $l;
+	$hash{$l} =1;
+}
+
+
+$prt =1;
+open(F,"all_plasmid_tmp.fna");
+open(O,">all_plasmid_nr.fna");
+while ($l = <F>)
+{
+	if($l =~ />/)
+	{
+		@c = split(/\|/,$l);
+		if (exists($hash{$c[3]}))
+		{
+			$prt = 0;
+		}else{
+			$prt = 1;
+		}
+	}
+	
+	if($prt ==1)
+	{
+			print O $l;
+	}
+	
+}
+
+
+system("cat GC* all_plasmid_nr.fna > all.fasta");
+
+print ("Making Blast Datadase...\n");
+system("sed -i 's/>/>refDB|/' all.fasta");
+system("makeblastdb -in all.fasta -out RefDB -dbtype nucl");
+system("grep '>' all.fasta | sed 's/ /\t/' | sed 's/>//' > headersRefDB.txt");
+
+#system("rm all_plasmid_tmp.fna acc.txt plasmid*.fna");
+
+
+print("\n\nFINISHED\n");
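
Note on the duplicate removal: the script splits each plasmid header on "|" and looks up the fourth field in the accession list, which fits the older "gi|...|ref|ACCESSION|" header layout. If the plasmid headers no longer carry those fields, that lookup would presumably never match. The sketch below is not the upstream method; it shows the same filtering idea keyed on the first token after ">" instead, and assumes headers of the form ">ACCESSION description". dedup_plasmids and all file contents are hypothetical.

```shell
#!/bin/sh
# Sketch only: drop FASTA records whose accession is already listed.
# Assumes headers look like ">ACCESSION description" (an assumption, not from the source).
# Usage: dedup_plasmids ACC_LIST PLASMID_FASTA
dedup_plasmids() {
    awk '
        # First file: one accession per line, as in acc.txt
        FILENAME == ARGV[1] { seen[$1] = 1; next }
        # Header line: decide once for the whole record
        /^>/ { acc = substr($1, 2); keep = !(acc in seen) }
        # Print header and sequence lines of records that are kept
        keep
    ' "$1" "$2"
}

# Tiny demonstration with invented accessions
tmp=$(mktemp -d)
printf 'ACC_A.1\n' > "$tmp/acc.txt"
printf '>ACC_A.1 already in a genome\nAAAA\n>ACC_B.1 plasmid only\nCCCC\n' > "$tmp/plasmids.fna"
dedup_plasmids "$tmp/acc.txt" "$tmp/plasmids.fna"
rm -r "$tmp"
```

With the demonstration input above, only the ACC_B.1 record (header and sequence line) is printed.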


=====================================
placnet.pl
=====================================
@@ -1,5 +1,5 @@
 #!/usr/bin/perl  
-#v1.03
+#v1.04
 
 use strict;
 #use warnings;
@@ -115,18 +115,9 @@ if($contigsFile eq "")
 }
 else{ 
 	
-	system("gmhmmp_heuristic.pl -s $contigsFile -out tmpGM -a");
-	$fastaProt = gm2fasta("tmpGM.lst");
-	system("gmhmmp_heuristic.pl -s $contigsFile -out tmpGM -d");
-	$fastaNucl = gm2fasta("tmpGM.lst");
+	system("prodigal -q -a $prefix.prod.faa -d $prefix.prod.cds -i $contigsFile ");
+
 	
-	#### CDS and ORF prediction ######
-	open(FPROT,">$prefix.gm.faa");
-	print FPROT $fastaProt;
-	close FPROT;
-	open(FNUCL, ">$prefix.gm.cds");
-	print FNUCL $fastaNucl;
-	close FNUCL;
 }
 		
 #}
@@ -199,11 +190,11 @@ sub blastRefDB     #### blastRefDB(type)
 	
 	if($type eq "blast")
 	{
-		system("blastn -query $contigsFile -db $refDBFile -out tmpMegaBlast.txt -num_alignments 0 -evalue 1e-25");
+		system("blastn -query $contigsFile -db $refDBFile -out tmpMegaBlast.txt -num_alignments 0 -evalue 1e-25 -num_threads 24");
 	}elsif ($type eq "fasta")
 	{
 		system("makeblastdb -in $refDBFile -out tmpRefDB -dbtype nucl");
-		system("blastn -query $contigsFile -db tmpRefDB -out tmpMegaBlast.txt -num_alignments 0 -evalue 1e-25");
+		system("blastn -query $contigsFile -db tmpRefDB -out tmpMegaBlast.txt -num_alignments 0 -evalue 1e-25 -num_threads 24");
 	}else{
 		print "Error in Reference DB format\n";
 		exit 0;
@@ -222,7 +213,7 @@ sub blastRefDB     #### blastRefDB(type)
 			$n=1;
 			#print "$node\n";
 		}
-		if ($l =~ /gi\|/) 
+		if ($l =~ /refDB\|/) 
 		{
 			#print $l;
 			@c = split(' ',$l);
@@ -253,8 +244,8 @@ sub sam2scaffold    #### attr:	sam2scaffold(SamDefinition)
 	@c = split('\t',$s);
 	
 	my $samFile = $c[0];
-	my $readLength = $c[1];
-	my $insert = $c[2];
+	my $insert = $c[1];
+	my $readLength = $c[2];
 	
 	print "$c[0]\t$c[1]\t$c[2]\n";
 	
@@ -393,7 +384,7 @@ sub database   #### Attributes name,fastaFile,type,threshold
 	
 	if($dbType eq "prot")
 	{
-		system("blastp -query $prefix.gm.faa -db $db -outfmt 6 -evalue $evalue -out tmp$name.blast -num_alignments 1"); 
+		system("blastp -query $prefix.prod.faa -db $db -outfmt 6 -evalue $evalue -out tmp$name.blast -num_alignments 1"); 
 	}elsif ($dbType eq "nucl")
 	{
 		system("blastn -query $contigsFile -db $db -outfmt 6 -evalue $evalue -out tmp$name.blast -num_alignments 1");
@@ -410,8 +401,9 @@ sub database   #### Attributes name,fastaFile,type,threshold
 		foreach $line (@dbText)
 		{
 			@fields1 = split('\t',$line);
-			@fields2 = split('\|',$fields1[0]);
-			print OUT "$fields2[1]\t$fields1[1]\n";
+			@fields2 = split('_',$fields1[0]);
+			#@node_name = "$fields2[0]_$fields2[1]_$fields2[2]_$fields2[3]_$fields2[4]_$fields2[5]";
+			print OUT "$fields2[0]_$fields2[1]\t$fields1[1]\n";
 		}
 		close OUT;
 	}else{
@@ -424,45 +416,12 @@ sub database   #### Attributes name,fastaFile,type,threshold
 	}	
 }
 	
-sub gm2fasta ############## attr: (geneMarkOutput.lst) return: fasta
-{
-	
-	open(A,"tmpGM");
-	my @txt = <A>;
-	close A;
-
 
-	my $out ="";
-	my $cond=0;
-	foreach $line (@txt)
-	{
-		if($line =~ />/)
-		{
-			$line =~ s/\|GeneMark\.hmm\|\d+_(aa|nt)\|(\-|\+)\|\d+\|\d+\t>/\|/;
-		}
-		if($line =~ /#===/)
-		{
-			$cond=0;
-		}
-		if ($cond==1)
-		{
-			$out .= $line;
-		}
-		if($line =~ /Predicted proteins:/ | $line =~ /Nucleotide sequence of predicted genes:/)
-		{
-		   $cond=1;
-		}
-	}
-	
-	
-	return $out;
-	
-}
 
 sub usage
 {
 	print "Usage:\n\n";
-	print "Placnet v1.03 10/06/2015\n";
+	print "Placnet v1.04 10/15/20116\n";
 	print "writen by: Val F. Lanza (valfernandez.vf\@gmail.com) and Maria de Toro (mdtorohernando\@gmail.com\n\n";\
 	print "Please cite PLACNET as: \nLanza VF, de Toro M, Garcillán-Barcia MP, Mora A, Blanco J, Coque TM, de la Cruz F: \nPlasmid Flux in Escherichia coli ST131 Sublineages, Analyzed by Plasmid Constellation Network (PLACNET),\na New Method for Plasmid Reconstruction from Whole Genome Sequences. \nPLoS Genet 2014, 10:e1004766\n\n";
 	print "Write inputFile Template\n\nplacnet.pl -generate\n\n";
@@ -492,7 +451,7 @@ SAM:	file2.sam	readLength2	insertSize2
 REFDB:	refdb	type(fasta/blast)
 
 
-##### Optional Attributes
+##### Optional Attibutes
 
 DB1:	name	file.fasta	type(nucl/prot)	threshold(E value)	format (fasta/blast)
 DB2:	name	file.fasta	type(nucl/prot)	threshold(E value)	format (fasta/blast)
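
Note on the node name change in database(): the first column is now split on "_" and the first two fields are kept, where it was previously split on "|". A sketch of the effect, assuming query IDs built from "NODE_<n>_length_<l>_cov_<c>" contig names with a gene index appended; that naming is an assumption suggested by the commented-out line in the hunk, and node_name is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: keep the first two underscore-separated fields of a query ID,
# mirroring "$fields2[0]_$fields2[1]" in the database() hunk.
# For IDs without any underscore the Perl code and cut behave differently.
node_name() {
    cut -d_ -f1,2
}

# Hypothetical query ID: contig name plus gene index suffix
printf 'NODE_12_length_5000_cov_10.5_3\n' | node_name
# prints: NODE_12
```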



View it on GitLab: https://salsa.debian.org/med-team/placnet/-/compare/c5cc7b26199e675910a3d025789bf6f92546662c...b18a4fd64c5a2c07fd9b16727d03580463a7365a
