[med-svn] [Git][med-team/proteinortho][upstream] New upstream version 6.0.33+dfsg
Andreas Tille (@tille)
gitlab at salsa.debian.org
Mon Jan 24 18:29:53 GMT 2022
Andreas Tille pushed to branch upstream at Debian Med / proteinortho
Commits:
50d748b5 by Andreas Tille at 2022-01-24T18:17:27+01:00
New upstream version 6.0.33+dfsg
- - - - -
9 changed files:
- .gitignore
- CHANGELOG
- CHANGEUID
- proteinortho6.pl
- src/.gitlab-ci.yml
- src/proteinortho_grab_proteins.pl
- test/C.faa
- test/C.gff
- test/C2.faa
Changes:
=====================================
.gitignore
=====================================
@@ -9,6 +9,7 @@ test/*.psq
test.*
.log
+*cache*
dev/*
backup/*
=====================================
CHANGELOG
=====================================
@@ -83,31 +83,31 @@
added support for tblastx+ and tblastx legacy
thanks to Clemens Thölken
2018 ** Proteinortho6 **
- 20. Juni-4.Juli
+ 20. June-4.July
openmp support (max_of_diag,get_new_x,makeOrthogonal,normalize,getY)
bitscore integration in the convergence (weighted algebraic connectivity)
protein output is now sorted in descending degree-order (sort with comparator_pairDoubleUInt)
getConnectivity: special case checks now if the induced subgraph is complete (K_n)
added various test functions
- 5. Juli
+ 5. July
added kmere heuristic for splitting groups in proteinortho_clustering. After the calculation of an fiedler vector, the kmere heuristic splits the graph not only in the positive and negative entries of the vector but in k clusters. k=2 -> the original split (without the purity).
- 16. Juli
+ 16. July
added LAPACK support for CC with less than 2^15 nodes (since it uses quadratic space -> (2^15)^2=2^30) for the calculation of the algebraic connectivity.
added all other proteinortho files to this repository.
graphMinusRemoveGraph.cpp implements proteinortho5_clean_edges2.pl in c++
- 23.Juli
+ 23.July
openMP support for laplacian declaration (for lapack).
'make test' clean up.
jackhmmer, phmmer, diamond, usearch support.
- 24 Juli
+ 24 July
last integration.
phmmer+jackhmmer fix/workaround (there is no local identity in the output -> disabled).
proteinortho.pl : set cluster algorithm to weighted-mode as default.
- 30. Juli
+ 30. July
rapsearch integration.
proteinortho_clustering.cpp : -ramLapack is now -ram and is the threshold for laplace matrix + graph struct.
added dynamic memory management (proteinortho.pl + clustering.cpp) using the free -m command (if exists)
- 31. Juli
+ 31. July
rapsearch fix (wrong order of db and q)
purity is back, now 0.1 (and fallback function, if all nodes are below purity threshold -> remove purity for this connected component)
more options for proteinortho.pl -p=diamond-moresensitive|usearch-ublast
@@ -132,7 +132,7 @@
BUGfix in the DFS calculation (recursion in c++ failed with segmentation fault if the recursion was too deep) -> now iteratively (memory ineffciently) with Queue
10. Okt
DFS -> BFS since recursive calls can only be so deep.
- 18. Okt
+ 18. Okt
purity is now 1e-7, evalue 1e-8 (http://people.sc.fsu.edu/~jburkardt/c_src/power_method/power_method_prb.c)
kmere heuristic minNodes = 2^20 (~1e+6), kmere now checks if the "normal" split would result in a good partition.
30. Okt
@@ -206,11 +206,11 @@
-selfblast generates duplicated hits -> automatically calls cleanupblastgraph
3. Juni (uid:3142)
refined error messages on duplicated inputs, <2 inputs
- 27. Juni (uid:3492)
+ 27. June (uid:3492)
proteinortho6.pl now writes databases (-step=1) into -tmp directory if system call failed.
fixed small issue that tmp directories are created inside eachother.
better stderr outputs e.g. if blast fails -> try -check ...
- 1. Juli (uid:3511)
+ 1. July (uid:3511)
fixed the -ram issue (used free memory, now total memory) in case there is a swap using up all free memory (also proteinortho_clustering now throws a warning not a error)
10. Juli (uid:3697)
fixed proteinortho_grab_proteins.pl: -tofiles option now escapes if -exact, replaced chomp with s/[\r\n]+$//
@@ -250,13 +250,13 @@
29. April (4799)
improved error messages (e.g. for missmatching files with --synteny)
fixed small bugs for --synteny and the *.summary, *.html files
- 12. Juni (4999)
+ 12. June (4999)
reduced the IO work by directly importing the diamond results to proteinortho (no temporary file is generated, except if -keep is set).
same ^ for ncbi-blast+
added -mtune -march g++ compiler options for the clustering script
- 18. Juni (5000)
+ 18. June (5000)
the -mtune and -march options are now optional due to some incompatibility...
- 14. Juli (5029)
+ 14. July (5029)
fixed a small bug, s.t. if write permission is missing in the directory of the fasta files, then the database files are generated in the tmp directory properly.
4. Aug (5090)
proteinortho_clustering now properly displays progress (Clusterin: 10%, 20% ...)
@@ -297,5 +297,13 @@
6. April (5584)
fixed a bug, where generated databases are always overwritten (https://gitlab.com/paulklemm_PHD/proteinortho/-/issues/48). Thanks to Bjoern
fixed another bug that caused the compilation of proteinortho_clustering to fail on CentOS (OMP_PROC_BIND=true) (https://gitlab.com/paulklemm_PHD/proteinortho/-/issues/39).
- 20. Juli (5594)
- Makefile modification to account for issue 51+52: the -mtune and -march option are now totally optional.
+ 20. June (5594)
+ Makefile modification to account for issue 51+52 (https://gitlab.com/paulklemm_PHD/proteinortho/-/issues): the -mtune and -march option are now totally optional, use 'make MTUNEARCH=TRUE all' to enable those features.
+ 7. July (5630)
+ Improving error messages for step=1 and 2.
+ 23. Dec (5699)
+ Fixed a bug for fasta headers with comma (generate a _clean version with ; instead of ,)
+ 18. Jan (5733)
+ Fixed a bug if a faa sequence only contains fna symbols (#55), now diamond will automatically restart with the --ignore-warnings option.
+ Added a new option to add parameters to the database generation step 2 (--subparaMakeBlast), thanks to @kullrich
+ Fixed 2 small bugs (a) binaries could not be found in $PATH (changed whereis to which) and (b) the database generation never fails (now it does...)
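Note on the 23. Dec entry above: the changelog only says that a `_clean` copy is generated with `;` in place of `,`. The following is a minimal shell sketch of that idea, not the upstream implementation. It assumes that only FASTA header lines (those starting with `>`) are rewritten; the function name `clean_fasta_headers` and the sample record are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print a FASTA file with commas in header lines
# replaced by semicolons. Sequence lines are passed through unchanged.
clean_fasta_headers() {
    sed '/^>/ s/,/;/g' "$1"
}

# Small made-up input so the sketch can be run as-is.
demo_fasta=$(mktemp)
printf '>p1 kinase, putative\nMKVLA\n' > "$demo_fasta"

clean_fasta_headers "$demo_fasta"
```

Writing the result to a sibling file such as `C_clean.faa` would then match the naming that `search_clean_file` in proteinortho6.pl looks for.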
=====================================
CHANGEUID
=====================================
@@ -1 +1 @@
-5594
+5733
=====================================
proteinortho6.pl
=====================================
@@ -183,6 +183,12 @@ additional parameters for the search tool (-p=blast,diamond,...) example -subpar
=back
+=item B<--subparaMakeBlast>='options'
+
+additional parameters for the database generation (-p=blast -> makeblastdb,diamond -> diamond makedb,...) example -subpara='--ignore-warnings' for a common diamond error
+
+=back
+
=head2 Synteny options (optional, step 2)
(output: <myproject>.ffadj-graph, <myproject>.poff-graph.tsv)
@@ -473,7 +479,7 @@ use POSIX;
##########################################################################################
# Variables
##########################################################################################
-our $version = "6.0.31";
+our $version = "6.0.33";
our $step = 0; # 0/1/2/3 -> do all / only apply step 1 / only apply step 2 / only apply step 3
our $verbose = 1; # 0/1 -> don't / be verbose
our $debug = 0; # 0/1 -> don't / show debug data
@@ -503,6 +509,7 @@ our $twilight = 0;
our $singles = 0;
our $clean = 0;
our $blastOptions = "";
+our $makeBlastOptions = "";
our $clusterOptions = "";
our $nograph = 0;
our $doxml = 0;
@@ -645,6 +652,7 @@ foreach my $option (@ARGV) {
elsif ($option =~ m/^--?desc$/) { $desc = 1; }
elsif ($option =~ m/^--?project=(.*)$/) { $project = $1; $project=~s/[\/* \t\:\~\&\%\$\§\"\(\)\[\]\{\}\^\\]//g; }
elsif ($option =~ m/^--?(subparaBlast|subpara)=(.*)$/i) { $blastOptions = $2;}
+ elsif ($option =~ m/^--?(subparaMakeBlast)=(.*)$/i) { $makeBlastOptions = $2;}
elsif ($option =~ m/^--?subparaCluster=(.*)$/i) { $clusterOptions = $1;}
elsif ($option =~ m/^--?v(ersion)?$/i) { print $version."\n"; exit 0;}
elsif ($option !~ /^-/) { if(!exists($files_map{$option})){$files_map{$option}=1;push(@files,$option);}else{print STDERR "$ORANGE"."[WARNING]$NC The input $option was is skipped, since it was allready given as input.$NC\nPress 'strg+c' to prevent me from proceeding or wait 10 seconds to continue...\n";sleep 10;print STDERR "Well then, proceeding...\n"} }
@@ -748,7 +756,12 @@ sub reset_locale{
}
sub get_parameter{
- return "Parameter-vector : (",'version',"=$version",",",'step',"=$step",",",'verbose',"=$verbose",",",'debug',"=$debug",",",'exactstep3',"=$exactstep3",",",'synteny',"=$synteny",",",'duplication',"=$duplication",",",'cs',"=$cs",",",'alpha',"=$alpha",",",'connectivity',"=$connectivity",",",'cpus',"=$cpus",",",'evalue',"=$evalue",",",'purity',"=$purity",",",'coverage',"=$coverage",",",'identity',"=$identity",",",'blastmode',"=$blastmode",",",'sim',"=$sim",",",'report',"=$report",",",'keep',"=$keep",",",'force',"=$force",",",'selfblast',"=$selfblast",",",'twilight',"=$twilight",",",'singles',"=$singles",",",'clean',"=$clean",",",'blastOptions',"=$blastOptions",",",'nograph',"=$nograph",",",'xml',"=$doxml",",",'desc',"=$desc",",",'tmp_path',"=$tmp_path,",'blastversion',"=$blastversion",",",'binpath',"=$binpath",",",'makedb',"=$makedb",",",'blast',"=$blast",",",'jobs_todo',"=$jobs_todo",",",'project',"=$project",",",'po_path',"=$po_path",",",'run_id',"=$run_id",",",'threads_per_process',"=$threads_per_process",",","useMcl","=$useMcl",',freemem',"=$freemem_inMB",")\n";
+ return "Parameter-vector : (",'version',"=$version",",",'step',"=$step",",",'verbose',"=$verbose",",",'debug',"=$debug",",",'exactstep3',"=$exactstep3",",",'synteny',"=$synteny",",",'duplication',"=$duplication",",",'cs',"=$cs",",",'alpha',"=$alpha",",",'connectivity',"=$connectivity",",",'cpus',"=$cpus",",",'evalue',"=$evalue",",",'purity',"=$purity",",",'coverage',"=$coverage",",",'identity',"=$identity",",",'blastmode',"=$blastmode",",",'sim',"=$sim",",",'report',"=$report",",",'keep',"=$keep",",",'force',"=$force",",",'selfblast',"=$selfblast",",",'twilight',"=$twilight",",",'singles',"=$singles",",",'clean',"=$clean",",",'blastOptions',"=$blastOptions",",",'makeBlastOptions',"=$makeBlastOptions",",",'nograph',"=$nograph",",",'xml',"=$doxml",",",'desc',"=$desc",",",'tmp_path',"=$tmp_path,",'blastversion',"=$blastversion",",",'binpath',"=$binpath",",",'makedb',"=$makedb",",",'blast',"=$blast",",",'jobs_todo',"=$jobs_todo",",",'project',"=$project",",",'po_path',"=$po_path",",",'run_id',"=$run_id",",",'threads_per_process',"=$threads_per_process",",","useMcl","=$useMcl",',freemem',"=$freemem_inMB",")\n";
+}
+
+sub uniq {
+ my %seen;
+ grep !$seen{$_}++, @_;
}
if (-e "$project.proteinortho.tsv" && $step!=1 && $step!=4 && ( ( scalar(@files) > 2 && $step!=3 ) || $step==3 ) ) {
@@ -820,6 +833,16 @@ if($step < 3){ # don't check blast-bins (e.g. diamond) for step 3=clustering (no
if( $keep && $cpus > 8 ){ print STDERR "!!!$ORANGE\n[Warning]:$NC The '-keep' option can result in a I/O bottleneck if used with many cores !!!\n$NC"; }
+# 6.0.32 check if there are _clean files available -> use it instead.
+sub search_clean_file{
+ my $file = shift;
+ my $file_clean = abs_path $file;
+ $file_clean=~s/(\.[^.]+)$/_clean$1/;
+ if(-e $file_clean){return $file_clean}
+ return $file;
+}
+if($step < 3 || $step==4){ $_ = search_clean_file($_) for @files; @files = uniq @files }
+
# do uniprot _additional* file appending for --isoform
if($isoform eq "uniprot"){
if($verbose){print STDERR "preparing files for isoform processing.\n";}
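Note on the hunk above: `search_clean_file` inserts `_clean` before the last file extension and returns that path if it exists, and `uniq` then drops repeated entries while keeping their order. A rough shell analogue is sketched below for illustration only. It skips the `abs_path` call of the Perl version, and the names `clean_name` and `uniq_keep_order` as well as the demo files are made up.

```shell
#!/bin/sh
# Insert "_clean" before the last extension, like s/(\.[^.]+)$/_clean$1/.
clean_name() {
    printf '%s\n' "$1" | sed -E 's/(\.[^.]+)$/_clean\1/'
}

# Prefer the "_clean" sibling if it exists, else keep the given path.
search_clean_file() {
    candidate=$(clean_name "$1")
    if [ -e "$candidate" ]; then
        printf '%s\n' "$candidate"
    else
        printf '%s\n' "$1"
    fi
}

# Order-preserving de-duplication, the awk counterpart of grep !$seen{$_}++.
uniq_keep_order() {
    awk '!seen[$0]++'
}

# Demo with throwaway files: C.faa has a _clean sibling, E.faa has none.
demo_dir=$(mktemp -d)
touch "$demo_dir/C.faa" "$demo_dir/C_clean.faa" "$demo_dir/E.faa"
for f in "$demo_dir/C.faa" "$demo_dir/C_clean.faa" "$demo_dir/E.faa"; do
    search_clean_file "$f"
done | uniq_keep_order
```

In the demo both `C.faa` and `C_clean.faa` resolve to the same `_clean` path, which is why the de-duplication step is needed after the substitution.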
@@ -858,6 +881,9 @@ if($step < 3 || $step==4){ # don't check for step=3=clustering (not necessary) a
@files = ();
foreach my $file (sort { if ($gene_counter{$a} == $gene_counter{$b}) {$a cmp $b;} else {$gene_counter{$b} <=> $gene_counter{$a};} } keys %gene_counter) {push(@files,$file);} # Biggest first # Alphabet otherwise 5.16
+# 6.0.32 check if there are _clean files available -> use it instead.
+if($step < 3 || $step==4){ $_ = search_clean_file($_) for @files; @files = uniq @files }
+
if($verbose && $isoform ne ""){print STDERR "found ".(scalar keys %isoform_mapping)." isoforms in total.\n";}
if(scalar keys %isoform_mapping == 0 && $isoform ne ""){print STDERR "\n!!!!\nWARNING\n!!!!\n I did not found any isoforms as expected from the --isoform= option, please check your input data!\nProceeding anyway...\n";}
@@ -971,8 +997,8 @@ our @hmm_filenames;
our %hmm_filenames2colid;
our $hmm_num_cols = 0;
-if ($step == 4) {
- if($verbose){print STDERR "\n$GREEN**Step 4**$NC\n";}
+if ($step == 4 && $debug) {
+ if($verbose){print STDERR "\n$GREEN**Step 4 (experimental feature)**$NC\n";}
&hmmenriched;
}
@@ -1204,7 +1230,7 @@ sub cluster {
system ("OMP_PROC_BIND=$ompprocbind $po_path/proteinortho_clustering $cluster_verbose_level -minspecies $minspecies -ram ".$freemem_inMB." -kmere ".(1-$exactstep3)." -debug $debug -cpus $cpus -weighted 1 -conn $connectivity -purity $purity ".($clusterOptions ne "" ? "$clusterOptions" : "" )." -rmgraph '$rm_simgraph' '$simgraph'* >'$simtable' ".($verbose == 2 ? "" : "2>/dev/null"));
if ($? != 0) {
- if($verbose){print STDERR "$ORANGE"."[WARNING]$NC proteinortho_clustering failed. I will now retry without the OMP_PROC_BIND flag.$NC\n";} # minimum 5 MB
+ if($verbose){print STDERR "$ORANGE"."[NOTE]$NC I restart proteinortho_clustering, it seems like the OMP_PROC_BIND flag is not compatible with the system (this has no effect on the output)$NC\n";}
system ("$po_path/proteinortho_clustering $cluster_verbose_level -minspecies $minspecies -ram ".$freemem_inMB." -kmere ".(1-$exactstep3)." -debug $debug -cpus $cpus -weighted 1 -conn $connectivity -purity $purity ".($clusterOptions ne "" ? "$clusterOptions" : "" )." -rmgraph '$rm_simgraph' '$simgraph'* >'$simtable' ".($verbose == 2 ? "" : "2>/dev/null"));
if ($? != 0) {
&Error("'proteinortho_clustering' failed with code $?.$NC (Please visit https://gitlab.com/paulklemm_PHD/proteinortho/wikis/Error%20Code)\nMaybe your operating system does not support the statically compiled version, please try recompiling proteinortho with 'make clean' and 'make' (and 'make install PREFIX=...').");
@@ -1222,7 +1248,7 @@ sub cluster {
if($verbose){print STDERR "[OUTPUT] -> Orthologous groups are written to $simtable\n";}
if(scalar @files < 10){
- if($verbose){print STDERR "You can extract the fasta files of each orthology group with\nproteinortho_grab_proteins.pl ".($isoform ne "" ? "--isoform" : "")." -tofiles $simtable '".join("' '",@files)."'\n (Careful: This will generate a file foreach line in the file $simtable).\n";}
+ if($verbose){print STDERR "You can extract the fasta files of each orthology group with\nproteinortho_grab_proteins.pl ".($isoform ne "" ? "--isoform " : "")."-tofiles $simtable '".join("' '",@files)."'\n (Careful: This will generate a file foreach line in the file $simtable).\n";}
}
system("(head -n 1 '$simtable' && tail -n +2 '$simtable' | LC_ALL=C sort -k1,1nr -k2,2nr -k3,3nr ) > '$simtable.sort'; mv '$simtable.sort' '$simtable'");
@@ -1273,9 +1299,9 @@ sub cluster {
system ("OMP_PROC_BIND=$ompprocbind $po_path/proteinortho_clustering $cluster_verbose_level ".($clusterOptions ne "" ? "$clusterOptions" : "" )." -minspecies $minspecies -ram ".$freemem_inMB." -kmere ".(1-$exactstep3)." -debug $debug -cpus $cpus -weighted 1 -conn $connectivity -purity $purity -rmgraph '$rm_syngraph' '$syngraph'* >'$syntable' ".($verbose == 2 ? "" : "2>/dev/null"));
- if($verbose){print STDERR "$ORANGE"."[WARNING]$NC proteinortho_clustering failed. I will now retry without the OMP_PROC_BIND flag.$NC\n";} # minimum 5 MB
+ if($verbose){print STDERR "$ORANGE"."[NOTE]$NC I restart proteinortho_clustering, it seems like the OMP_PROC_BIND flag is not compatible with the system (this has no effect on the output)$NC\n";}
if ($? != 0) {
- if($verbose){print STDERR "$ORANGE"."[WARNING]$NC proteinortho_clustering failed. I will now retry without the OMP_PROC_BIND flag.$NC\n";} # minimum 5 MB
+ if($verbose){print STDERR "$ORANGE"."[NOTE]$NC I restart proteinortho_clustering, it seems like the OMP_PROC_BIND flag is not compatible with the system (this has no effect on the output)$NC\n";}
system ("$po_path/proteinortho_clustering $cluster_verbose_level ".($clusterOptions ne "" ? "$clusterOptions" : "" )." -minspecies $minspecies -ram ".$freemem_inMB." -kmere ".(1-$exactstep3)." -debug $debug -cpus $cpus -weighted 1 -conn $connectivity -purity $purity -rmgraph '$rm_syngraph' '$syngraph'* >'$syntable' ".($verbose == 2 ? "" : "2>/dev/null"));
if ($? != 0) {
&Error("proteinortho_clustering failed with code $?.$NC (Please visit https://gitlab.com/paulklemm_PHD/proteinortho/wikis/Error%20Code)\nDid you use a static version? Maybe your operating system does not support the static compiled version, please recompile 'make clean' and 'make' or 'make USEPRECOMPILEDLAPACK=FALSE'.");
@@ -1289,7 +1315,7 @@ sub cluster {
system("(head -n 1 '$syntable' && tail -n +2 '$syntable' | LC_ALL=C sort -k1,1nr -k2,2nr -k3,3nr ) > '$syntable.sort'; mv '$syntable.sort' '$syntable'");
- if($verbose){print STDERR "[OUTPUT] -> Orthologous groups are written to $syntable\nYou can extract the fasta files of each orthology group with\nproteinortho_grab_proteins.pl ".($isoform ne "" ? "--isoform" : "")." -tofiles $syntable '".join("' '",@files)."'\n(Careful: This will generate a file foreach line in the file $syntable).\n";}
+ if($verbose){print STDERR "[OUTPUT] -> Orthologous groups are written to $syntable\nYou can extract the fasta files of each orthology group with\nproteinortho_grab_proteins.pl ".($isoform ne "" ? "--isoform " : "")."-tofiles $syntable '".join("' '",@files)."'\n(Careful: This will generate a file foreach line in the file $syntable).\n";}
if ($singles) {
if($verbose){print STDERR "Adding singles...\n";}
@@ -1447,6 +1473,7 @@ Options:
-cov= min. coverage of best blast alignments in % [default: 50]
-subparaBlast= additional parameters for the search tool
example -subparaBlast='-seg no' or -subparaBlast='--more-sensitive' for diamond
+ -subparaMakeBlast= additional parameters for the database generation
[Synteny options]
-synteny activate PoFF extension to separate similar sequences print
@@ -1902,7 +1929,7 @@ sub synteny_matches {
my %track;
for my $file ($file_i, $file_j) {
# Get Coordinates for all genes
- my %coords = %{&read_details($file)};
+ my %coords = %{&read_details($file,1)};
my $counter = 0;
# Number them according to their order
foreach my $id (sort
@@ -2191,6 +2218,7 @@ sub auto_cpus {
sub generate_indices {
my $oldkeep=$keep;
+ my $cmd="";
if($verbose){print STDERR "Generating indices";if($force){print STDERR " anyway (forced).\n"}else{print STDERR ".\n";}}
if ($blastmode eq "rapsearch") {
foreach my $file (@_) {
@@ -2198,15 +2226,15 @@ sub generate_indices {
if ($verbose) {print STDERR "The database for '$file' is present and will be used\n";}
}else{
if ($verbose) {print STDERR "Building database for '$file'\t(".$gene_counter{$file}." sequences)\n";}
- system("$makedb -d '$file' -n '$file.$blastmode' >\/dev\/null");
- if ($? != 0) {print STDERR ("$ORANGE\n[WARNING]$NC ".$blastmode." failed to create a database. Most likely you don't have write permissions in the directory of the fasta files. I will now proceed with writing the database files to the DB/ directory in $tmp_path (-tmp)."); if($step==1){print STDERR "$ORANGE Please ensure that you use -tmp=$tmp_path -keep (and use the same -project= name) for future analysis.$NC";}print "\n";
-
+ $cmd="$makedb -d '$file' -n '$file.$blastmode'";
+ my $makedb_ret = `$cmd 3>&1 1>&2 2>&3`;
+ if ($? != 0) {print STDERR ("$ORANGE\n[WARNING]$NC ".$blastmode." failed to create a database ($makedb -d '$file' -n '$file.$blastmode'). The output is:\n-----------------\n$makedb_ret-----------------\nI will now proceed with writing the database files to the DB/ directory in $tmp_path (-tmp)."); if($step==1){print STDERR "$ORANGE Please ensure that you use -tmp=$tmp_path -keep (and use the same -project= name) for future analysis.$NC";}print "\n";
mkdir("$tmp_path/DB");
if($step==1){$oldkeep=$keep;$keep=1;}
- system("$makedb -d '$file' -n '$tmp_path/DB/".basename($file).".$blastmode' >\/dev\/null");
- system("ln -s '".abs_path($file)."' '$tmp_path/DB/".basename($file)."'");
-
- if($?!=0){ $keep=$oldkeep; &Error("The database generation failed once again, please retry with 'sudo' or move the fasta files to a directory with write permissions. If this fails too, then there is something wrong with the fasta files or the version of $blastmode cannot handle the database generation. So please try one of the following:\n- update $blastmode\n- consider another blast algorithm (-p)\n- consider to submitting (mailing) this case to incoming+paulklemm-phd-proteinortho-7278443-issue-\@incoming.gitlab.com.");}
+ $cmd = "$makedb -d '$file' -n '$tmp_path/DB/".basename($file).".$blastmode'";
+ $makedb_ret = `$cmd 3>&1 1>&2 2>&3`;
+ if($?!=0){ $keep=$oldkeep; &Error("The database generation failed once again, please investigate the output from above. There is probably something wrong with the fasta files or the version of $blastmode cannot handle the database generation. So please try one of the following:\n- update $blastmode\n- consider another blast algorithm (-p)\n- send this issue to incoming+paulklemm-phd-proteinortho-7278443-issue-\@incoming.gitlab.com.");}
+ system("ln -s '".abs_path($file)."' '$tmp_path/DB/".basename($file)."' 2>/dev/null");
}
}
}
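Note on the `3>&1 1>&2 2>&3` suffix in the hunk above: it swaps stdout and stderr, so the backticks capture what the tool writes to stderr while its regular output goes to the caller's stderr. A minimal shell sketch of the same redirection follows; `noisy_tool` is a hypothetical stand-in for `$makedb`, not a real command.

```shell
#!/bin/sh
# Hypothetical stand-in for a database builder: progress on stdout,
# a diagnostic on stderr, and a non-zero exit status.
noisy_tool() {
    echo "progress"
    echo "diagnostic" >&2
    return 3
}

# fd 3 is a copy of the capture pipe (the original stdout),
# stdout is then pointed at the original stderr,
# and stderr is finally pointed at fd 3, i.e. the capture pipe.
captured=$(noisy_tool 3>&1 1>&2 2>&3)
status=$?

echo "exit status: $status"
echo "captured: $captured"
```

The exit status of the assignment is the status of the command substitution, which is what the Perl code reads from `$?` right after the backticks.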
@@ -2217,15 +2245,20 @@ sub generate_indices {
if ($verbose) {print STDERR "The database for '$file' is present and will be used\n";}
}else{
if ($verbose) {print STDERR "Building database for '$file'\t(".$gene_counter{$file}." sequences)\n";}
- system("$makedb '$file' -d '$file.$blastmode' --quiet >\/dev\/null 2>\/dev\/null");
- if ($? != 0) {print STDERR ("$ORANGE\n[WARNING]$NC ".$blastmode." failed to create a database. Most likely you don't have write permissions in the directory of the fasta files. I will now proceed with writing the database files to the DB/ directory in $tmp_path (-tmp)."); if($step==1){print STDERR "$ORANGE Please ensure that you use -tmp=$tmp_path -keep (and use the same -project= name) for future analysis.$NC";}print "\n";
-
+ $cmd="$makedb '$file' -d '$file.$blastmode' --quiet";
+ my $makedb_ret = `$cmd 3>&1 1>&2 2>&3`;
+ if ($? != 0 && $makedb_ret =~/--ignore-warnings/) {
+ print STDERR ("$ORANGE\n[WARNING]$NC ".$blastmode." failed to create a database ($makedb '$file' -d '$file.$blastmode'). The output is:\n-----------------\n$makedb_ret-----------------\nI will now retry with the '--ignore-warnings' option!\n");
+ $cmd="$makedb '$file' -d '$file.$blastmode' --quiet";
+ my $makedb_ret = `$cmd --ignore-warnings 3>&1 1>&2 2>&3`;
+ if($?!=0){ $keep=$oldkeep; &Error("The database generation failed once again, please investigate the output from above. There is probably something wrong with the fasta files or the version of $blastmode cannot handle the database generation. So please try one of the following:\n- update $blastmode\n- consider another blast algorithm (-p)\n- send this issue to incoming+paulklemm-phd-proteinortho-7278443-issue-\@incoming.gitlab.com.");}
+ }elsif($? != 0){print STDERR ("$ORANGE\n[WARNING]$NC ".$blastmode." failed to create a database ($makedb '$file' -d '$file.$blastmode'). The output is:\n-----------------\n$makedb_ret-----------------\nI will now proceed with writing the database files to the DB/ directory in $tmp_path (-tmp)."); if($step==1){print STDERR "$ORANGE Please ensure that you use -tmp=$tmp_path -keep (and use the same -project= name) for future analysis.$NC";}print "\n";
mkdir("$tmp_path/DB");
if($step==1){$oldkeep=$keep;$keep=1;}
- system("$makedb '$file' -d '$tmp_path/DB/".basename($file).".$blastmode' --quiet >\/dev\/null 2>\/dev\/null");
- system("ln -s '".abs_path($file)."' '$tmp_path/DB/".basename($file)."'");
-
- if($?!=0){ $keep=$oldkeep; &Error("The database generation failed once again, please retry with 'sudo' or move the fasta files to a directory with write permissions. If this fails too, then there is something wrong with the fasta files or the version of $blastmode cannot handle the database generation. So please try one of the following:\n- update $blastmode\n- consider another blast algorithm (-p)\n- consider to submitting (mailing) this case to incoming+paulklemm-phd-proteinortho-7278443-issue-\@incoming.gitlab.com.");}
+ $cmd = "$makedb '$file' -d '$tmp_path/DB/".basename($file).".$blastmode' --quiet";
+ $makedb_ret = `$cmd 3>&1 1>&2 2>&3`;
+ if($?!=0){ $keep=$oldkeep; &Error("The database generation failed once again, please investigate the output from above. There is probably something wrong with the fasta files or the version of $blastmode cannot handle the database generation. So please try one of the following:\n- update $blastmode\n- consider another blast algorithm (-p)\n- send this issue to incoming+paulklemm-phd-proteinortho-7278443-issue-\@incoming.gitlab.com.");}
+ system("ln -s '".abs_path($file)."' '$tmp_path/DB/".basename($file)."' 2>/dev/null");
}
}
}
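Note on the diamond branch above: the database build is retried with `--ignore-warnings` only if the first attempt fails and its captured error output mentions that option. The control flow can be sketched in shell as follows. This is an illustration under assumptions: `fake_makedb` and its error text are invented for the demo and do not reproduce diamond's real messages.

```shell
#!/bin/sh
# Hypothetical builder: succeeds only when --ignore-warnings is passed,
# otherwise prints a made-up hint on stderr and fails.
fake_makedb() {
    for arg in "$@"; do
        if [ "$arg" = "--ignore-warnings" ]; then
            return 0
        fi
    done
    echo "made-up error: rerun with --ignore-warnings to proceed" >&2
    return 1
}

# Run once with stderr captured; retry with the extra flag only if the
# captured text names it. Any other failure is passed on to the caller.
build_db() {
    out=$(fake_makedb "$@" 3>&1 1>&2 2>&3) && return 0
    case $out in
        *--ignore-warnings*) fake_makedb "$@" --ignore-warnings ;;
        *) return 1 ;;
    esac
}

if build_db sample.faa; then
    echo "database built"
fi
```

Keying the retry on the tool's own message keeps unrelated failures, such as missing write permission, on the separate fallback path that writes to the `-tmp` directory.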
@@ -2234,20 +2267,20 @@ sub generate_indices {
foreach my $file (@_) {
#if ($file =~ /\s/) {print STDERR ("$ORANGE\n[WARNING]$NC : File name '$file' contains whitespaces. This might lead to undesired effects. If you encounter unusual behavior, please change the file name!$NC\n");}
-
if(!$force && `ls '${file}'.${blastmode}* 2>/dev/null` ne ""){
if ($verbose) {print STDERR "The database for '$file' is present and will be used\n";}
}else{
if ($verbose) {print STDERR "Building database for '$file'\t(".$gene_counter{$file}." sequences)\n";}
- system("$makedb index -f '$file' -p '$file.$blastmode' >\/dev\/null 2>\/dev\/null");
- if ($? != 0) {print STDERR ("$ORANGE\n[WARNING]$NC ".$blastmode." failed to create a database. Most likely you don't have write permissions in the directory of the fasta files. I will now proceed with writing the database files to the DB/ directory in $tmp_path (-tmp)."); if($step==1){print STDERR "$ORANGE Please ensure that you use -tmp=$tmp_path -keep (and use the same -project= name) for future analysis.$NC";}print "\n";
+ $cmd="$makedb index -f '$file' -p '$file.$blastmode'";
+ my $makedb_ret = `$cmd 3>&1 1>&2 2>&3`;
+ if ($? != 0) {print STDERR ("$ORANGE\n[WARNING]$NC ".$blastmode." failed to create a database ($makedb index -f '$file' -p '$file.$blastmode'). The output is:\n-----------------\n$makedb_ret-----------------\nI will now proceed with writing the database files to the DB/ directory in $tmp_path (-tmp)."); if($step==1){print STDERR "$ORANGE Please ensure that you use -tmp=$tmp_path -keep (and use the same -project= name) for future analysis.$NC";}print "\n";
mkdir("$tmp_path/DB");
if($step==1){$oldkeep=$keep;$keep=1;}
- system("$makedb index -f '$file' -p '$tmp_path/DB/".basename($file).".$blastmode' >\/dev\/null 2>\/dev\/null");
+ $cmd = "$makedb index -f '$file' -p '$tmp_path/DB/".basename($file).".$blastmode'";
+ $makedb_ret = `$cmd 3>&1 1>&2 2>&3`;
+ if($?!=0){ $keep=$oldkeep; &Error("The database generation failed once again, please investigate the output from above. There is probably something wrong with the fasta files or the version of $blastmode cannot handle the database generation. So please try one of the following:\n- update $blastmode\n- consider another blast algorithm (-p)\n- send this issue to incoming+paulklemm-phd-proteinortho-7278443-issue-\@incoming.gitlab.com.");}
system("ln -s '".abs_path($file)."' '$tmp_path/DB/".basename($file)."'");
-
- if($?!=0){ $keep=$oldkeep; &Error("The database generation failed once again, please retry with 'sudo' or move the fasta files to a directory with write permissions. If this fails too, then there is something wrong with the fasta files or the version of $blastmode cannot handle the database generation. So please try one of the following:\n- update $blastmode\n- consider another blast algorithm (-p)\n- consider to submitting (mailing) this case to incoming+paulklemm-phd-proteinortho-7278443-issue-\@incoming.gitlab.com.");}
}
}
}
@@ -2258,15 +2291,16 @@ sub generate_indices {
if ($verbose) {print STDERR "The database for '$file' is present and will be used\n";}
}else{
if ($verbose) {print STDERR "Building database for '$file'\t(".$gene_counter{$file}." sequences)\n";}
- system("$makedb --dbtype 1 '$file' '$file.$blastmode' >\/dev\/null 2>\/dev\/null");
- if ($? != 0) {print STDERR ("$ORANGE\n[WARNING]$NC ".$blastmode." failed to create a database. Most likely you don't have write permissions in the directory of the fasta files. I will now proceed with writing the database files to the DB/ directory in $tmp_path (-tmp)."); if($step==1){print STDERR "$ORANGE Please ensure that you use -tmp=$tmp_path -keep (and use the same -project= name) for future analysis.$NC";}print "\n";
+ $cmd="$makedb --dbtype 1 '$file' '$file.$blastmode'";
+ my $makedb_ret = `$cmd 3>&1 1>&2 2>&3`;
+ if ($? != 0) {print STDERR ("$ORANGE\n[WARNING]$NC ".$blastmode." failed to create a database ($makedb --dbtype 1 '$file' '$file.$blastmode'). The output is:\n-----------------\n$makedb_ret-----------------\nI will now proceed with writing the database files to the DB/ directory in $tmp_path (-tmp)."); if($step==1){print STDERR "$ORANGE Please ensure that you use -tmp=$tmp_path -keep (and use the same -project= name) for future analysis.$NC";}print "\n";
mkdir("$tmp_path/DB");
if($step==1){$oldkeep=$keep;$keep=1;}
- system("$makedb --dbtype 1 '$file' '$tmp_path/DB/".basename($file).".$blastmode' >\/dev\/null 2>\/dev\/null");
- system("ln -s '".abs_path($file)."' '$tmp_path/DB/".basename($file)."'");
-
- if($?!=0){ $keep=$oldkeep; &Error("The database generation failed once again, please retry with 'sudo' or move the fasta files to a directory with write permissions. If this fails too, then there is something wrong with the fasta files or the version of $blastmode cannot handle the database generation. So please try one of the following:\n- update $blastmode\n- consider another blast algorithm (-p)\n- consider to submitting (mailing) this case to incoming+paulklemm-phd-proteinortho-7278443-issue-\@incoming.gitlab.com.");}
+ $cmd = "$makedb --dbtype 1 '$file' '$tmp_path/DB/".basename($file).".$blastmode'";
+ $makedb_ret = `$cmd 3>&1 1>&2 2>&3`;
+ if($?!=0){ $keep=$oldkeep; &Error("The database generation failed once again, please investigate the output from above. There is probably something wrong with the fasta files or the version of $blastmode cannot handle the database generation. So please try one of the following:\n- update $blastmode\n- consider another blast algorithm (-p)\n- send this issue to incoming+paulklemm-phd-proteinortho-7278443-issue-\@incoming.gitlab.com.");}
+ system("ln -s '".abs_path($file)."' '$tmp_path/DB/".basename($file)."' 2>/dev/null");
}
}
}
@@ -2277,15 +2311,16 @@ sub generate_indices {
if ($verbose) {print STDERR "The database for '$file' is present and will be used\n";}
}else{
if ($verbose) {print STDERR "Building database for '$file'\t(".$gene_counter{$file}." sequences)\n";}
- system("$makedb --dbtype 2 '$file' '$file.$blastmode' >\/dev\/null 2>\/dev\/null");
- if ($? != 0) {print STDERR ("$ORANGE\n[WARNING]$NC ".$blastmode." failed to create a database. Most likely you don't have write permissions in the directory of the fasta files. I will now proceed with writing the database files to the DB/ directory in $tmp_path (-tmp)."); if($step==1){print STDERR "$ORANGE Please ensure that you use -tmp=$tmp_path -keep (and use the same -project= name) for future analysis.$NC";}print "\n";
+ $cmd="$makedb --dbtype 2 '$file' '$file.$blastmode'";
+ my $makedb_ret = `$cmd 3>&1 1>&2 2>&3`;
+ if ($? != 0) {print STDERR ("$ORANGE\n[WARNING]$NC ".$blastmode." failed to create a database ($makedb --dbtype 2 '$file' '$file.$blastmode'). The output is:\n-----------------\n$makedb_ret-----------------\nI will now proceed with writing the database files to the DB/ directory in $tmp_path (-tmp)."); if($step==1){print STDERR "$ORANGE Please ensure that you use -tmp=$tmp_path -keep (and use the same -project= name) for future analysis.$NC";}print "\n";
mkdir("$tmp_path/DB");
if($step==1){$oldkeep=$keep;$keep=1;}
- system("$makedb --dbtype 2 '$file' '$tmp_path/DB/".basename($file).".$blastmode' >\/dev\/null 2>\/dev\/null");
- system("ln -s '".abs_path($file)."' '$tmp_path/DB/".basename($file)."'");
-
- if($?!=0){ $keep=$oldkeep; &Error("The database generation failed once again, please retry with 'sudo' or move the fasta files to a directory with write permissions. If this fails too, then there is something wrong with the fasta files or the version of $blastmode cannot handle the database generation. So please try one of the following:\n- update $blastmode\n- consider another blast algorithm (-p)\n- consider to submitting (mailing) this case to incoming+paulklemm-phd-proteinortho-7278443-issue-\@incoming.gitlab.com.");}
+ $cmd = "$makedb --dbtype 2 '$file' '$tmp_path/DB/".basename($file).".$blastmode'";
+ $makedb_ret = `$cmd 3>&1 1>&2 2>&3`;
+ if($?!=0){ $keep=$oldkeep; &Error("The database generation failed once again, please investigate the output from above. There is probably something wrong with the fasta files or the version of $blastmode cannot handle the database generation. So please try one of the following:\n- update $blastmode\n- consider another blast algorithm (-p)\n- send this issue to incoming+paulklemm-phd-proteinortho-7278443-issue-\@incoming.gitlab.com.");}
+ system("ln -s '".abs_path($file)."' '$tmp_path/DB/".basename($file)."' 2>/dev/null");
}
}
}
@@ -2296,15 +2331,16 @@ sub generate_indices {
if ($verbose) {print STDERR "The database for '$file' is present and will be used\n";}
}else{
if ($verbose) {print STDERR "Building database for '$file'\t(".$gene_counter{$file}." sequences)\n";}
- system("$makedb '$file' -output '$file.$blastmode' >\/dev\/null 2>\/dev\/null");
- if ($? != 0) {print STDERR ("$ORANGE\n[WARNING]$NC ".$blastmode." failed to create a database. Most likely you don't have write permissions in the directory of the fasta files. I will now proceed with writing the database files to the DB/ directory in $tmp_path (-tmp)."); if($step==1){print STDERR "$ORANGE Please ensure that you use -tmp=$tmp_path -keep (and use the same -project= name) for future analysis.$NC";}print "\n";
+ $cmd="$makedb '$file' -output '$file.$blastmode'";
+ my $makedb_ret = `$cmd 3>&1 1>&2 2>&3`;
+ if ($? != 0) {print STDERR ("$ORANGE\n[WARNING]$NC ".$blastmode." failed to create a database ($makedb '$file' -output '$file.$blastmode'). The output is:\n-----------------\n$makedb_ret-----------------\nI will now proceed with writing the database files to the DB/ directory in $tmp_path (-tmp)."); if($step==1){print STDERR "$ORANGE Please ensure that you use -tmp=$tmp_path -keep (and use the same -project= name) for future analysis.$NC";}print "\n";
mkdir("$tmp_path/DB");
if($step==1){$oldkeep=$keep;$keep=1;}
- system("$makedb '$file' -output '$tmp_path/DB/".basename($file).".$blastmode' >\/dev\/null 2>\/dev\/null");
- system("ln -s '".abs_path($file)."' '$tmp_path/DB/".basename($file)."'");
-
- if($?!=0){ $keep=$oldkeep; &Error("The database generation failed once again, please retry with 'sudo' or move the fasta files to a directory with write permissions. If this fails too, then there is something wrong with the fasta files or the version of $blastmode cannot handle the database generation. So please try one of the following:\n- update $blastmode\n- consider another blast algorithm (-p)\n- consider to submitting (mailing) this case to incoming+paulklemm-phd-proteinortho-7278443-issue-\@incoming.gitlab.com.");}
+ $cmd = "$makedb '$file' -output '$tmp_path/DB/".basename($file).".$blastmode'";
+ $makedb_ret = `$cmd 3>&1 1>&2 2>&3`;
+ if($?!=0){ $keep=$oldkeep; &Error("The database generation failed once again, please investigate the output from above. There is probably something wrong with the fasta files or the version of $blastmode cannot handle the database generation. So please try one of the following:\n- update $blastmode\n- consider another blast algorithm (-p)\n- send this issue to incoming+paulklemm-phd-proteinortho-7278443-issue-\@incoming.gitlab.com.");}
+ system("ln -s '".abs_path($file)."' '$tmp_path/DB/".basename($file)."' 2>/dev/null");
}
}
}
@@ -2316,26 +2352,28 @@ sub generate_indices {
}else{
if ($verbose) {print STDERR "Building database for '$file'\t(".$gene_counter{$file}." sequences)\n";}
if($blastmode eq "lastp"){
- system("$makedb -p '$file.$blastmode' '$file'");
- if ($? != 0) {print STDERR ("$ORANGE\n[WARNING]$NC ".$blastmode." failed to create a database. Most likely you don't have write permissions in the directory of the fasta files. I will now proceed with writing the database files to the DB/ directory in $tmp_path (-tmp)."); if($step==1){print STDERR "$ORANGE Please ensure that you use -tmp=$tmp_path -keep (and use the same -project= name) for future analysis.$NC";}print "\n";
+ $cmd="$makedb -p '$file.$blastmode' '$file'";
+ my $makedb_ret = `$cmd 3>&1 1>&2 2>&3`;
+ if ($? != 0) {print STDERR ("$ORANGE\n[WARNING]$NC ".$blastmode." failed to create a database ($makedb -p '$file.$blastmode' '$file'). The output is:\n-----------------\n$makedb_ret-----------------\nI will now proceed with writing the database files to the DB/ directory in $tmp_path (-tmp)."); if($step==1){print STDERR "$ORANGE Please ensure that you use -tmp=$tmp_path -keep (and use the same -project= name) for future analysis.$NC";}print "\n";
mkdir("$tmp_path/DB");
if($step==1){$oldkeep=$keep;$keep=1;}
- system("$makedb -p '$tmp_path/DB/".basename($file).".$blastmode' '$file'");
- system("ln -s '".abs_path($file)."' '$tmp_path/DB/".basename($file)."'");
-
- if($?!=0){ &Error("The database generation failed once again, please retry with 'sudo' or move the fasta files to a directory with write permissions. If this fails too, then there is }something wrong with the fasta files or the version of $blastmode cannot handle the database generation. So try to update $blastmode, consider another blast algorithm (-p) or consider to submitting this case to .");}
+ $cmd = "$makedb -p '$tmp_path/DB/".basename($file).".$blastmode' '$file'";
+ $makedb_ret = `$cmd 3>&1 1>&2 2>&3`;
+ if($?!=0){ &Error("The database generation failed once again, please investigate the output from above. There is probably }something wrong with the fasta files or the version of $blastmode cannot handle the database generation. So try to update $blastmode, consider another blast algorithm (-p) or consider to submitting this case to .");}
+ system("ln -s '".abs_path($file)."' '$tmp_path/DB/".basename($file)."' 2>/dev/null");
}
}else{
- system("$makedb '$file.$blastmode' '$file'");
- if ($? != 0) {print STDERR ("$ORANGE\n[WARNING]$NC ".$blastmode." failed to create a database. Most likely you don't have write permissions in the directory of the fasta files. I will now proceed with writing the database files to the DB/ directory in $tmp_path (-tmp)."); if($step==1){print STDERR "$ORANGE Please ensure that you use -tmp=$tmp_path -keep (and use the same -project= name) for future analysis.$NC";}print "\n";
+ $cmd="$makedb '$file.$blastmode' '$file'";
+ my $makedb_ret = `$cmd 3>&1 1>&2 2>&3`;
+ if ($? != 0) {print STDERR ("$ORANGE\n[WARNING]$NC ".$blastmode." failed to create a database ($makedb '$file.$blastmode' '$file'). The output is:\n-----------------\n$makedb_ret-----------------\nI will now proceed with writing the database files to the DB/ directory in $tmp_path (-tmp)."); if($step==1){print STDERR "$ORANGE Please ensure that you use -tmp=$tmp_path -keep (and use the same -project= name) for future analysis.$NC";}print "\n";
mkdir("$tmp_path/DB");
if($step==1){$oldkeep=$keep;$keep=1;}
- system("$makedb '$tmp_path/DB/".basename($file).".$blastmode' '$file'");
- system("ln -s '".abs_path($file)."' '$tmp_path/DB/".basename($file)."'");
-
- if($?!=0){ $keep=$oldkeep; &Error("The database generation failed once again, please retry with 'sudo' or move the fasta files to a directory with write permissions. If this fails too, then there is something wrong with the fasta files or the version of $blastmode cannot handle the database generation. So please try one of the following:\n- update $blastmode\n- consider another blast algorithm (-p)\n- consider to submitting (mailing) this case to incoming+paulklemm-phd-proteinortho-7278443-issue-\@incoming.gitlab.com.");}
+ $cmd = "$makedb '$tmp_path/DB/".basename($file).".$blastmode' '$file'";
+ $makedb_ret = `$cmd 3>&1 1>&2 2>&3`;
+ if($?!=0){ $keep=$oldkeep; &Error("The database generation failed once again, please investigate the output from above. There is probably something wrong with the fasta files or the version of $blastmode cannot handle the database generation. So please try one of the following:\n- update $blastmode\n- consider another blast algorithm (-p)\n- send this issue to incoming+paulklemm-phd-proteinortho-7278443-issue-\@incoming.gitlab.com.");}
+ system("ln -s '".abs_path($file)."' '$tmp_path/DB/".basename($file)."' 2>/dev/null");
}
}
}
@@ -2350,18 +2388,21 @@ sub generate_indices {
if ($verbose) {print STDERR "The database for '$file' is present and will be used\n";}
}else{
if ($verbose) {print STDERR "Building database for '$file'\t(".$gene_counter{$file}." sequences)\n";}
- if ($debug) {print STDERR "$makedb '$file' -out '$file.$blastmode' >\/dev\/null\n";}
- system("$makedb '$file' -out '$file.$blastmode' >\/dev\/null");
- if ($? != 0) {print STDERR ("$ORANGE\n[WARNING]$NC ".$blastmode." failed to create a database. Most likely you don't have write permissions in the directory of the fasta files. I will now proceed with writing the database files to the DB/ directory in $tmp_path (-tmp)."); if($step==1){print STDERR "$ORANGE Please ensure that you use -tmp=$tmp_path -keep (and use the same -project= name) for future analysis.$NC";}print "\n";
+ if ($debug) {print STDERR "$makedb '$file' -out '$file.$blastmode'\n";}
+ $cmd="$makedb '$file' -out '$file.$blastmode'";
+ my $makedb_ret = `$cmd 3>&1 1>&2 2>&3`;
+ if ($? != 0) {print STDERR ("$ORANGE\n[WARNING]$NC ".$blastmode." failed to create a database ($makedb '$file' -out '$file.$blastmode'). The output is:\n-----------------\n$makedb_ret-----------------\nI will now proceed with writing the database files to the DB/ directory in $tmp_path (-tmp)."); if($step==1){print STDERR "$ORANGE Please ensure that you use -tmp=$tmp_path -keep (and use the same -project= name) for future analysis.$NC";}print "\n";
mkdir("$tmp_path/DB");
if($step==1){$oldkeep=$keep;$keep=1;}
- system("$makedb '$file' -out '$tmp_path/DB/".basename($file).".$blastmode' >\/dev\/null");
+ $cmd = "$makedb '$file' -out '$tmp_path/DB/".basename($file).".$blastmode'";
+ $makedb_ret = `$cmd 3>&1 1>&2 2>&3`;
+ system("ln -s '".abs_path($file)."' '$tmp_path/DB/".basename($file)."' 2>/dev/null");
- if($?!=0){ $keep=$oldkeep; &Error("The database generation failed once again, please retry with 'sudo' or move the fasta files to a directory with write permissions. If this fails too, then there is something wrong with the fasta files or the version of $blastmode cannot handle the database generation. So please try one of the following:\n- update $blastmode\n- consider another blast algorithm (-p)\n- consider to submitting (mailing) this case to incoming+paulklemm-phd-proteinortho-7278443-issue-\@incoming.gitlab.com.");}}
+ if($?!=0){ $keep=$oldkeep; &Error("The database generation failed once again, please investigate the output from above. There is probably something wrong with the fasta files or the version of $blastmode cannot handle the database generation. So please try one of the following:\n- update $blastmode\n- consider another blast algorithm (-p)\n- send this issue to incoming+paulklemm-phd-proteinortho-7278443-issue-\@incoming.gitlab.com.");}}
}
}
- unlink('formatdb.log');
+ unlink('formatdb.log') if -e 'formatdb.log';
}elsif ($blastmode =~ m/autoblast/) { # new blast+
foreach my $file (@_) {
if(!$force && `ls '${file}'.${blastmode}* 2>/dev/null` ne ""){
@@ -2372,36 +2413,42 @@ sub generate_indices {
print STDERR "$ORANGE\n[WARNING]$NC I could not detect the type of '$file', i assume aminoacid sequences...\n";
$autoblast_fileis{$file}="prot";
}
- if ($debug) {print STDERR "$makedb -dbtype ".$autoblast_fileis{$file}." -in '$file' -out '$file.$blastmode' >\/dev\/null\n";}
- system("$makedb -dbtype ".$autoblast_fileis{$file}." -in '$file' -out '$file.$blastmode' >\/dev\/null");
- if ($? != 0) {print STDERR ("$ORANGE\n[WARNING]$NC ".$blastmode." failed to create a database. Most likely you don't have write permissions in the directory of the fasta files. I will now proceed with writing the database files to the DB/ directory in $tmp_path (-tmp)."); if($step==1){print STDERR "$ORANGE Please ensure that you use -tmp=$tmp_path -keep (and use the same -project= name) for future analysis.$NC";}print "\n";
+ if ($debug) {print STDERR "$makedb -dbtype ".$autoblast_fileis{$file}." -in '$file' -out '$file.$blastmode'\n";}
+ $cmd="$makedb -dbtype ".$autoblast_fileis{$file}." -in '$file' -out '$file.$blastmode'";
+ my $makedb_ret = `$cmd 3>&1 1>&2 2>&3`;
+ if ($? != 0) {print STDERR ("$ORANGE\n[WARNING]$NC ".$blastmode." failed to create a database ($makedb -dbtype ".$autoblast_fileis{$file}." -in '$file' -out '$file.$blastmode'). The output is:\n-----------------\n$makedb_ret-----------------\nI will now proceed with writing the database files to the DB/ directory in $tmp_path (-tmp)."); if($step==1){print STDERR "$ORANGE Please ensure that you use -tmp=$tmp_path -keep (and use the same -project= name) for future analysis.$NC";}print "\n";
mkdir("$tmp_path/DB");
if($step==1){$oldkeep=$keep;$keep=1;}
- system("$makedb '$file' -out '$tmp_path/DB/".basename($file).".$blastmode' >\/dev\/null");
+ $cmd = "$makedb '$file' -out '$tmp_path/DB/".basename($file).".$blastmode'";
+ $makedb_ret = `$cmd 3>&1 1>&2 2>&3`;
+ system("ln -s '".abs_path($file)."' '$tmp_path/DB/".basename($file)."' 2>/dev/null");
- if($?!=0){ $keep=$oldkeep; &Error("The database generation failed once again, please retry with 'sudo' or move the fasta files to a directory with write permissions. If this fails too, then there is something wrong with the fasta files or the version of $blastmode cannot handle the database generation. So please try one of the following:\n- update $blastmode\n- consider another blast algorithm (-p)\n- consider to submitting (mailing) this case to incoming+paulklemm-phd-proteinortho-7278443-issue-\@incoming.gitlab.com.");}}
+ if($?!=0){ $keep=$oldkeep; &Error("The database generation failed once again, please investigate the output from above. There is probably something wrong with the fasta files or the version of $blastmode cannot handle the database generation. So please try one of the following:\n- update $blastmode\n- consider another blast algorithm (-p)\n- send this issue to incoming+paulklemm-phd-proteinortho-7278443-issue-\@incoming.gitlab.com.");}}
}
}
- unlink('formatdb.log');
+ unlink('formatdb.log') if -e 'formatdb.log';
}else { # old blastall
foreach my $file (@_) {
if(!$force && `ls '${file}'.${blastmode}* 2>/dev/null` ne ""){
if ($verbose) {print STDERR "The database for '$file' is present and will be used\n";}
}else{
if ($verbose) {print STDERR "Building database for '$file'\t(".$gene_counter{$file}." sequences)\n";}
- if ($debug) {print STDERR "$makedb '$file' -out '$file.$blastmode' >\/dev\/null\n";}
- system("$makedb '$file' -n '$file.$blastmode' >\/dev\/null");
- if ($? != 0) {print STDERR ("$ORANGE\n[WARNING]$NC ".$blastmode." failed to create a database. Most likely you don't have write permissions in the directory of the fasta files. I will now proceed with writing the database files to the DB/ directory in $tmp_path (-tmp)."); if($step==1){print STDERR "$ORANGE Please ensure that you use -tmp=$tmp_path -keep (and use the same -project= name) for future analysis.$NC";}print "\n";
+ if ($debug) {print STDERR "$makedb '$file' -out '$file.$blastmode'\n";}
+ $cmd="$makedb '$file' -n '$file.$blastmode'";
+ my $makedb_ret = `$cmd 3>&1 1>&2 2>&3`;
+ if ($? != 0) {print STDERR ("$ORANGE\n[WARNING]$NC ".$blastmode." failed to create a database ($makedb '$file' -n '$file.$blastmode'). The output is:\n-----------------\n$makedb_ret-----------------\nI will now proceed with writing the database files to the DB/ directory in $tmp_path (-tmp)."); if($step==1){print STDERR "$ORANGE Please ensure that you use -tmp=$tmp_path -keep (and use the same -project= name) for future analysis.$NC";}print "\n";
mkdir("$tmp_path/DB");
if($step==1){$oldkeep=$keep;$keep=1;}
- system("$makedb '$file' -n '$tmp_path/DB/".basename($file).".$blastmode' >\/dev\/null");
+ $cmd = "$makedb '$file' -n '$tmp_path/DB/".basename($file).".$blastmode'";
+ $makedb_ret = `$cmd 3>&1 1>&2 2>&3`;
+ system("ln -s '".abs_path($file)."' '$tmp_path/DB/".basename($file)."' 2>/dev/null");
- if($?!=0){ $keep=$oldkeep; &Error("The database generation failed once again, please retry with 'sudo' or move the fasta files to a directory with write permissions. If this fails too, then there is something wrong with the fasta files or the version of $blastmode cannot handle the database generation. So please try one of the following:\n- update $blastmode\n- consider another blast algorithm (-p)\n- consider to submitting (mailing) this case to incoming+paulklemm-phd-proteinortho-7278443-issue-\@incoming.gitlab.com.");}}
+ if($?!=0){ $keep=$oldkeep; &Error("The database generation failed once again, please investigate the output from above. There is probably something wrong with the fasta files or the version of $blastmode cannot handle the database generation. So please try one of the following:\n- update $blastmode\n- consider another blast algorithm (-p)\n- send this issue to incoming+paulklemm-phd-proteinortho-7278443-issue-\@incoming.gitlab.com.");}}
}
}
- unlink('formatdb.log');
+ unlink('formatdb.log') if -e 'formatdb.log';
}
##MARK_FOR_NEW_BLAST_ALGORITHM
@@ -2431,9 +2478,7 @@ sub blast {
}
my $printSTDERR="";
- if($verbose != 2){
- $printSTDERR='2>/dev/null';
- }
+ if($verbose != 2){ $printSTDERR='2>/dev/null'; }
my ($fileType) = $a =~ /\.(\w*)$/;
@@ -2488,15 +2533,18 @@ sub blast {
if ($debug || $verbose==2) {print STDERR "$command\n";} # 5.16
if ($blastmode eq "diamond") {
- @data=`$command`; # run diamond
- if ($? != 0) {&Error($blastmode." failed with code $?.$NC (Please visit https://gitlab.com/paulklemm_PHD/proteinortho/wikis/Error%20Code)\nThe most common sources of this error are:\n- no space left on device error.\n- outdated $blastmode, please update $blastmode or consider another -p algorithm.\n- the databases are missing. Maybe you ran --step=1 and removed the databases afterwards? Please rerun 'proteinortho --step=1 --force /path/to/fastas'\n- maybe the fasta files are mixed nucleotide and aminoacid sequences or just not suited for $blastmode? (For example diamond only processes protein sequences) Try 'proteinortho --step=1 --check --force /path/to/fastas'.");}
+ @data=`$command`;
+ if ($? != 0) {
+ @data=`$command --ignore-warnings`;
+ if ($? != 0) { &Error($blastmode." failed with code $?$NC ($command). (Please visit https://gitlab.com/paulklemm_PHD/proteinortho/wikis/Error%20Code)\nThe most common sources of this error are:\n- no space left on device error.\n- outdated $blastmode, please update $blastmode or consider another -p algorithm.\n- the databases are missing. Maybe you ran --step=1 and removed the databases afterwards? Please rerun 'proteinortho --step=1 --force /path/to/fastas'\n- maybe the fasta files are mixed nucleotide and aminoacid sequences or just not suited for $blastmode? (For example diamond only processes protein sequences) Try 'proteinortho --step=1 --check --force /path/to/fastas'.") }
+ }
if($debug eq "test_sort"){while (<"$bla.tmp">){if ($_ =~ /[^\t]+([,])[^\t]+[eE]/) {print "found forbidden symbol '$1' at $_ in $bla\n";&reset_locale();die;}}}
if($keep){system("mv '$bla.tmp' '$bla'");}
}elsif ($blastmode eq "rapsearch") {
system("$command");
- if ($? != 0) {&Error($blastmode." failed with code $?.$NC (Please visit https://gitlab.com/paulklemm_PHD/proteinortho/wikis/Error%20Code)\nThe most common sources of this error are:\n- no space left on device error.\n- outdated $blastmode, please update $blastmode or consider another -p algorithm.\n- the databases are missing. Maybe you ran --step=1 and removed the databases afterwards? Please rerun 'proteinortho --step=1 --force /path/to/fastas'\n- maybe the fasta files are mixed nucleotide and aminoacid sequences or just not suited for $blastmode? (For example diamond only processes protein sequences) Try 'proteinortho --step=1 --check --force /path/to/fastas'.");}
+ if ($? != 0) {&Error($blastmode." failed with code $?$NC ($command). (Please visit https://gitlab.com/paulklemm_PHD/proteinortho/wikis/Error%20Code)\nThe most common sources of this error are:\n- no space left on device error.\n- outdated $blastmode, please update $blastmode or consider another -p algorithm.\n- the databases are missing. Maybe you ran --step=1 and removed the databases afterwards? Please rerun 'proteinortho --step=1 --force /path/to/fastas'\n- maybe the fasta files are mixed nucleotide and aminoacid sequences or just not suited for $blastmode? (For example diamond only processes protein sequences) Try 'proteinortho --step=1 --check --force /path/to/fastas'.");}
# NOTE -s f does not work everytime, therefore here is a conversion of -s t to f (ln -> evalues)
open(OUTM,">>$bla.m82");open(INM,"<$bla.tmp.m8");
while (<INM>){if(length($_)==0 || substr($_,0,1)eq"#"){next;}my @arr=split("\t",$_); print OUTM $arr[0]."\t".$arr[1]."\t".$arr[2]."\t".$arr[3]."\t".$arr[4]."\t".$arr[5]."\t".$arr[6]."\t".$arr[7]."\t".$arr[8]."\t".$arr[9]."\t".exp($arr[10])."\t".$arr[11];}
@@ -2510,7 +2558,7 @@ sub blast {
}elsif ($blastmode eq "usearch" || $blastmode eq "ublast") {
system("$command");
- if ($? != 0) {&Error($blastmode." failed with code $?.$NC (Please visit https://gitlab.com/paulklemm_PHD/proteinortho/wikis/Error%20Code)\nThe most common sources of this error are:\n- no space left on device error.\n- outdated $blastmode, please update $blastmode or consider another -p algorithm.\n- the databases are missing. Maybe you ran --step=1 and removed the databases afterwards? Please rerun 'proteinortho --step=1 --force /path/to/fastas'\n- maybe the fasta files are mixed nucleotide and aminoacid sequences or just not suited for $blastmode? (For example diamond only processes protein sequences) Try 'proteinortho --step=1 --check --force /path/to/fastas'.");}
+ if ($? != 0) {&Error($blastmode." failed with code $?$NC ($command). (Please visit https://gitlab.com/paulklemm_PHD/proteinortho/wikis/Error%20Code)\nThe most common sources of this error are:\n- no space left on device error.\n- outdated $blastmode, please update $blastmode or consider another -p algorithm.\n- the databases are missing. Maybe you ran --step=1 and removed the databases afterwards? Please rerun 'proteinortho --step=1 --force /path/to/fastas'\n- maybe the fasta files are mixed nucleotide and aminoacid sequences or just not suited for $blastmode? (For example diamond only processes protein sequences) Try 'proteinortho --step=1 --check --force /path/to/fastas'.");}
if($debug eq "test_sort"){while (<"$bla.tmp">){if ($_ =~ /[^\t]+([,])[^\t]+[eE]/) {print "found forbidden symbol '$1' at $_ in $bla\n";&reset_locale();die;}}}
system("perl $po_path/proteinortho_formatUsearch.pl '$bla.tmp' >'$bla'"); # problem with ublast/usearch: gene names include the description..
unlink "$bla.tmp";
@@ -2518,7 +2566,7 @@ sub blast {
}elsif ($blastmode eq "topaz") {
system("$command");
- if ($? != 0) {&Error($blastmode." failed with code $?.$NC (Please visit https://gitlab.com/paulklemm_PHD/proteinortho/wikis/Error%20Code)\nThe most common sources of this error are:\n- no space left on device error.\n- outdated $blastmode, please update $blastmode or consider another -p algorithm.\n- the databases are missing. Maybe you ran --step=1 and removed the databases afterwards? Please rerun 'proteinortho --step=1 --force /path/to/fastas'\n- maybe the fasta files are mixed nucleotide and aminoacid sequences or just not suited for $blastmode? (For example diamond only processes protein sequences) Try 'proteinortho --step=1 --check --force /path/to/fastas'.");}
+ if ($? != 0) {&Error($blastmode." failed with code $?$NC ($command). (Please visit https://gitlab.com/paulklemm_PHD/proteinortho/wikis/Error%20Code)\nThe most common sources of this error are:\n- no space left on device error.\n- outdated $blastmode, please update $blastmode or consider another -p algorithm.\n- the databases are missing. Maybe you ran --step=1 and removed the databases afterwards? Please rerun 'proteinortho --step=1 --force /path/to/fastas'\n- maybe the fasta files are mixed nucleotide and aminoacid sequences or just not suited for $blastmode? (For example diamond only processes protein sequences) Try 'proteinortho --step=1 --check --force /path/to/fastas'.");}
if($debug eq "test_sort"){while (<"$bla.tmp">){if ($_ =~ /[^\t]+([,])[^\t]+[eE]/) {print "found forbidden symbol '$1' at $_ in $bla\n";&reset_locale();die;}}}
system("tail -n +2 '$bla.tmp' > '$bla'");
unlink "timing.txt";
@@ -2527,14 +2575,14 @@ sub blast {
}elsif ($blastmode eq "lastp" || $blastmode eq "lastn") {
system("$command");
- if ($? != 0) {&Error($blastmode." failed with code $?.$NC (Please visit https://gitlab.com/paulklemm_PHD/proteinortho/wikis/Error%20Code)\nThe most common sources of this error are:\n- no space left on device error.\n- outdated $blastmode, please update $blastmode or consider another -p algorithm.\n- the databases are missing. Maybe you ran --step=1 and removed the databases afterwards? Please rerun 'proteinortho --step=1 --force /path/to/fastas'\n- maybe the fasta files are mixed nucleotide and aminoacid sequences or just not suited for $blastmode? (For example diamond only processes protein sequences) Try 'proteinortho --step=1 --check --force /path/to/fastas'.");}
+ if ($? != 0) {&Error($blastmode." failed with code $?$NC ($command). (Please visit https://gitlab.com/paulklemm_PHD/proteinortho/wikis/Error%20Code)\nThe most common sources of this error are:\n- no space left on device error.\n- outdated $blastmode, please update $blastmode or consider another -p algorithm.\n- the databases are missing. Maybe you ran --step=1 and removed the databases afterwards? Please rerun 'proteinortho --step=1 --force /path/to/fastas'\n- maybe the fasta files are mixed nucleotide and aminoacid sequences or just not suited for $blastmode? (For example diamond only processes protein sequences) Try 'proteinortho --step=1 --check --force /path/to/fastas'.");}
if($debug eq "test_sort"){while (<"$bla.tmp">){if ($_ =~ /[^\t]+([,])[^\t]+[eE]/) {print "found forbidden symbol '$1' at $_ in $bla\n";&reset_locale();die;}}}
system("mv '$bla.tmp' '$bla'");
}elsif ($blastmode =~ m/.*blat.*/) {
system("$command");
- if ($? != 0) {&Error($blastmode." failed with code $?.$NC (Please visit https://gitlab.com/paulklemm_PHD/proteinortho/wikis/Error%20Code)\nThe most common sources of this error are:\n- no space left on device error.\n- outdated $blastmode, please update $blastmode or consider another -p algorithm.\n- the databases are missing. Maybe you ran --step=1 and removed the databases afterwards? Please rerun 'proteinortho --step=1 --force /path/to/fastas'\n- maybe the fasta files are mixed nucleotide and aminoacid sequences or just not suited for $blastmode? (For example diamond only processes protein sequences) Try 'proteinortho --step=1 --check --force /path/to/fastas'.");}
+ if ($? != 0) {&Error($blastmode." failed with code $?$NC ($command). (Please visit https://gitlab.com/paulklemm_PHD/proteinortho/wikis/Error%20Code)\nThe most common sources of this error are:\n- no space left on device error.\n- outdated $blastmode, please update $blastmode or consider another -p algorithm.\n- the databases are missing. Maybe you ran --step=1 and removed the databases afterwards? Please rerun 'proteinortho --step=1 --force /path/to/fastas'\n- maybe the fasta files are mixed nucleotide and aminoacid sequences or just not suited for $blastmode? (For example diamond only processes protein sequences) Try 'proteinortho --step=1 --check --force /path/to/fastas'.");}
if($debug eq "test_sort"){while (<"$bla.tmp">){if ($_ =~ /[^\t]+([,])[^\t]+[eE]/) {print "found forbidden symbol '$1' at $_ in $bla\n";&reset_locale();die;}}}
system('awk -F\'\t\' \'{if($11<'.$evalue.')print $0}\' \''.$bla.'.tmp\' > \''.$bla.'\'');
unlink "$bla.tmp";
@@ -2542,7 +2590,7 @@ sub blast {
}elsif ($blastmode eq "mmseqsn" || $blastmode eq "mmseqsp") {
system("$command");
- if ($? != 0) {&Error($blastmode." failed with code $?.$NC (Please visit https://gitlab.com/paulklemm_PHD/proteinortho/wikis/Error%20Code)\nThe most common sources of this error are:\n- no space left on device error.\n- outdated $blastmode, please update $blastmode or consider another -p algorithm.\n- the databases are missing. Maybe you ran --step=1 and removed the databases afterwards? Please rerun 'proteinortho --step=1 --force /path/to/fastas'\n- maybe the fasta files are mixed nucleotide and aminoacid sequences or just not suited for $blastmode? (For example diamond only processes protein sequences) Try 'proteinortho --step=1 --check --force /path/to/fastas'.");}
+ if ($? != 0) {&Error($blastmode." failed with code $?$NC ($command). (Please visit https://gitlab.com/paulklemm_PHD/proteinortho/wikis/Error%20Code)\nThe most common sources of this error are:\n- no space left on device error.\n- outdated $blastmode, please update $blastmode or consider another -p algorithm.\n- the databases are missing. Maybe you ran --step=1 and removed the databases afterwards? Please rerun 'proteinortho --step=1 --force /path/to/fastas'\n- maybe the fasta files are mixed nucleotide and aminoacid sequences or just not suited for $blastmode? (For example diamond only processes protein sequences) Try 'proteinortho --step=1 --check --force /path/to/fastas'.");}
system($binpath."mmseqs convertalis '$b.$blastmode' '$a.$blastmode' '$bla.tmp' '$bla.tmp2' >\/dev\/null 2>\/dev\/null");
if($debug eq "test_sort"){while (<"$bla.tmp">){if ($_ =~ /[^\t]+([,])[^\t]+[eE]/) {print "found forbidden symbol '$1' at $_ in $bla.tmp\n";&reset_locale();die;}}}
@@ -2556,7 +2604,7 @@ sub blast {
}else { # -p=blastp,blastn,autoblast, ...
@data=`$command`;
- if ($? != 0) {&Error($blastmode." failed with code $?.$NC (Please visit https://gitlab.com/paulklemm_PHD/proteinortho/wikis/Error%20Code)\nThe most common sources of this error are:\n- no space left on device error.\n- outdated $blastmode, please update $blastmode or consider another -p algorithm.\n- the databases are missing. Maybe you ran --step=1 and removed the databases afterwards? Please rerun 'proteinortho --step=1 --force /path/to/fastas'\n- maybe the fasta files are mixed nucleotide and aminoacid sequences or just not suited for $blastmode? (For example diamond only processes protein sequences) Try 'proteinortho --step=1 --check --force /path/to/fastas'.");}
+ if ($? != 0) {&Error($blastmode." failed with code $?$NC ($command). (Please visit https://gitlab.com/paulklemm_PHD/proteinortho/wikis/Error%20Code)\nThe most common sources of this error are:\n- no space left on device error.\n- outdated $blastmode, please update $blastmode or consider another -p algorithm.\n- the databases are missing. Maybe you ran --step=1 and removed the databases afterwards? Please rerun 'proteinortho --step=1 --force /path/to/fastas'\n- maybe the fasta files are mixed nucleotide and aminoacid sequences or just not suited for $blastmode? (For example diamond only processes protein sequences) Try 'proteinortho --step=1 --check --force /path/to/fastas'.");}
if($debug eq "test_sort"){while (<"$bla.tmp">){if ($_ =~ /[^\t]+([,])[^\t]+[eE]/) {print "found forbidden symbol '$1' at $_ in $bla.tmp\n";&reset_locale();die;}}}
if($keep){system("mv '$bla.tmp' '$bla'");}
@@ -2614,8 +2662,8 @@ sub check_bins {
my @version = split(/\s+/,$1);
my $versionnumber = pop @version;
# Commands
- if ($blastmode eq "blastp+") {$makedb = $binpath."makeblastdb -dbtype prot -in";}
- elsif ($blastmode eq "blastn+" || $blastmode eq "tblastx+") {$makedb = $binpath."makeblastdb -dbtype nucl -in";}
+ if ($blastmode eq "blastp+") {$makedb = $binpath."makeblastdb $makeBlastOptions -dbtype prot -in";}
+ elsif ($blastmode eq "blastn+" || $blastmode eq "tblastx+") {$makedb = $binpath."makeblastdb $makeBlastOptions -dbtype nucl -in";}
else {&Error("This should not happen! Please submit the FASTA file(s) and the parameter vector (above) to incoming+paulklemm-phd-proteinortho-7278443-issue-\@incoming.gitlab.com to help fixing this issue.");}
if($verbose){print STDERR "Detected '$blastmode' version $versionnumber\n";$blastversion=$versionnumber;}
@@ -2634,7 +2682,7 @@ sub check_bins {
my @version = split(/\s+/,$1);
my $versionnumber = pop @version;
# Commands
- $makedb = $binpath."makeblastdb ";
+ $makedb = $binpath."makeblastdb $makeBlastOptions ";
if($verbose){print STDERR "Detected 'blast+' version $versionnumber\n";$blastversion=$versionnumber;}
return;
@@ -2650,9 +2698,9 @@ sub check_bins {
$_=~s/[\r\n]+$//;
if ($_ =~ /blastall.+?([^\s]+)/) {
my $versionnumber = $1;
- if ($blastmode eq "blastp_legacy") {$makedb = $binpath."formatdb -p T -o F -i";}
- elsif ($blastmode eq "blastn") {$makedb = $binpath."formatdb -p F -o F -i";}
- elsif ($blastmode eq "tblastx") {$makedb = $binpath."formatdb -p F -o F -i";}
+ if ($blastmode eq "blastp_legacy") {$makedb = $binpath."formatdb $makeBlastOptions -p T -o F -i";}
+ elsif ($blastmode eq "blastn") {$makedb = $binpath."formatdb $makeBlastOptions -p F -o F -i";}
+ elsif ($blastmode eq "tblastx") {$makedb = $binpath."formatdb $makeBlastOptions -p F -o F -i";}
else {&Error("This should not happen! Please submit the FASTA file(s) and the parameter vector (above to incoming+paulklemm-phd-proteinortho-7278443-issue-\@incoming.gitlab.com to help fixing this issue.");}
if($verbose){print STDERR "Detected '$blastmode' version $versionnumber\n";$blastversion=$versionnumber;}
@@ -2669,7 +2717,7 @@ sub check_bins {
foreach (@topazv) {
$_=~s/[\r\n]+$//;
if ($_ =~ /usage: TOPAZ(.+)/) {
- $makedb = $binpath."topaz";
+ $makedb = $binpath."topaz $makeBlastOptions";
if($verbose){print STDERR "Detected '$blastmode'\n";}
return;
}
@@ -2719,7 +2767,7 @@ sub check_bins {
if (defined($out) && $out =~ /rapsearch\sv([\d\.]*)/) {
my @version = split(/\s+/,$1);
my $versionnumber = pop @version;
- $makedb = $binpath."prerapsearch";
+ $makedb = $binpath."prerapsearch $makeBlastOptions";
if($verbose){print STDERR "Detected '$blastmode' version $versionnumber\n";$blastversion=$versionnumber;}
return;
}else{
@@ -2738,7 +2786,7 @@ sub check_bins {
if (defined($out) && $out =~ /([a-zA-Z0-9]+)/) {
my @version = split(/\s+/,$1);
my $versionnumber = pop @version;
- $makedb = $binpath."mmseqs createdb";
+ $makedb = $binpath."mmseqs createdb $makeBlastOptions";
if($verbose){print STDERR "Detected '$blastmode' version $versionnumber\n";$blastversion=$versionnumber;}
return;
}else{
@@ -2759,7 +2807,7 @@ sub check_bins {
if (defined($out) && $out =~ /diamond\sversion\s(.+)\n/) {
my @version = split(/\s+/,$1);
my $versionnumber = pop @version;
- $makedb = $binpath."diamond makedb --in";
+ $makedb = $binpath."diamond makedb $makeBlastOptions --in";
if($verbose){print STDERR "Detected '$blastmode' version $versionnumber\n";$blastversion=$versionnumber;}
if($versionnumber =~ m/^0\.9\.(\d+)/){ if($1 < 29){
print STDERR "\n!!!! \nWARNING '$blastmode' version $versionnumber has a known bug that incorrectly computes the length of an alignment, thus the coverage threshold can produce wrong results leading in false negatives. See https://gitlab.com/paulklemm_PHD/proteinortho/issues/24 for more details.\n\n >>> Please update diamond to 0.9.29 or higher <<<\n";
@@ -2783,7 +2831,7 @@ sub check_bins {
if (defined($out) && $out =~ /usearch\sv(.+)\n/) {
my @version = split(/\s+/,$1);
my $versionnumber = pop @version;
- $makedb = $binpath."usearch -makeudb_ublast";
+ $makedb = $binpath."usearch $makeBlastOptions -makeudb_ublast";
if($verbose){print STDERR "Detected '$blastmode' version $versionnumber\n";$blastversion=$versionnumber;}
return;
}else{
@@ -2802,7 +2850,7 @@ sub check_bins {
if (defined($out) && $out =~ /usearch\sv(.+)\n/) {
my @version = split(/\s+/,$1);
my $versionnumber = pop @version;
- $makedb = $binpath."usearch -makeudb_usearch";
+ $makedb = $binpath."usearch $makeBlastOptions -makeudb_usearch";
if($verbose){print STDERR "Detected '$blastmode' version $versionnumber\n";$blastversion=$versionnumber;}
return;
}else{
@@ -2821,7 +2869,7 @@ sub check_bins {
if (defined($out) && $out =~ /lastal\s(.+)\n/) {
my @version = split(/\s+/,$1);
my $versionnumber = pop @version;
- $makedb = $binpath."lastdb";
+ $makedb = $binpath."lastdb $makeBlastOptions";
if($verbose){print STDERR "Detected '$blastmode' version $versionnumber\n";$blastversion=$versionnumber;}
return;
}else{
@@ -2843,15 +2891,25 @@ sub check_bins {
# Check plausibility of files
sub check_files {
+ RESTART_check_files:
+
if ( ( scalar(@files) == 0 || (scalar(@files) == 1 && $selfblast==0) ) && $step != 3) {&print_usage; &Error("I need at least two files to compare something!");}
if($verbose){print STDERR "Checking input files";if($checkfasta){print STDERR " carefully (-checkfasta).\n"}else{print STDERR ".\n";}}
+ my %files_dedup;
+
foreach my $file (@files) {
- if ($verbose) {print STDERR "Checking $file... ";}
- &read_details($file,1);
- if ($blastmode eq "autoblast") {print STDERR " (".$autoblast_fileis{$file}.") ";}
- if ($verbose) {print STDERR "ok\n";}
+ if(-e $file && exists $files_dedup{abs_path($file)}){
+ print STDERR "found duplicated entry $file, I will ignore it ... \n";
+ }else{
+ $files_dedup{abs_path($file)}=$file;
+ if ($verbose) {print STDERR "Checking $file... ";}
+ &read_details($file,0);
+ if ($blastmode eq "autoblast") {print STDERR " (".$autoblast_fileis{$file}.") ";}
+ if ($verbose) {print STDERR "ok\n";}
+ }
}
+ @files=values(%files_dedup);
}
sub convertUniprotAndNCBI {
@@ -2868,9 +2926,10 @@ sub convertUniprotAndNCBI {
}
sub read_details {
+ my $file = shift;
+ my $from_synteny_call = shift;
my %ids; # local test for duplicated IDs
my %genes;
- my $file = shift;
my $test = 0;
my $lastgenename="";
my $cur_gene_is_valid=1;
@@ -2882,6 +2941,8 @@ sub read_details {
my %isoform_mapping_ncbiuniprot_correction; # the ncbi uniprot isoforms are not correspoding to the correct ID !
+ my $found_comma_in_file=0;
+
my $did_found_emptyline=0;
if (!-e $file) {&Error("File '$file' not found!");}
open(FASTA,"<$file") || &Error("Could not open '$file': $!");
@@ -2915,7 +2976,6 @@ sub read_details {
$iso =~ s/^>//;
$isoform_mapping{$file." ".$curLine}=$iso;
-#print STDERR "DEBUG::1) ".$file." ".$curLine."\n";
if($debug){print STDERR "found isoform '$file $curLine' => '$iso'\n";}
}
@@ -2951,10 +3011,18 @@ sub read_details {
$curLine =~ s/^>//;
$curLine =~ s/\s.*//;
$lastgenename=$curLine;
- if ($test && $isoform eq "") { # disable for -isoform, since then the
+ if ($test && $isoform eq "") { # disable for -isoform
if (defined($ids{$curLine})) {&Error("Gene ID '$curLine' is defined at least twice in $file");}
$ids{$curLine} = $file;
}
+ if(index($curLine, ",") != -1){
+ # 6.0.32 : check if gene name contains a comma -> this will cause problems with the proteinortho.tsv output (gene cluster speparator)
+ my $file_clean = abs_path $file;
+ $file_clean=~s/(\.[^.]+)$/_clean$1/;
+ $found_comma_in_file=1;
+ print STDERR "\n$ORANGE [WARNING]$NC input '$file' contains a gene-name with a comma, this causes problems with the proteinortho.tsv output, I will clean the file ('$file_clean') and restart the analysis !!!\n";
+ last;
+ }
if ($synteny) {$genes{$curLine} = 1;}
$cur_gene_is_valid=1;
@@ -2990,6 +3058,35 @@ sub read_details {
}
}
}
+ close(FASTA);
+
+ if($found_comma_in_file && !$from_synteny_call){
+ # 6.0.32 : replace the , with ; -> write output in a *_clean.* file
+ my $file_clean = abs_path $file;
+ $file_clean=~s/(\.[^.]+)$/_clean$1/;
+ open(FASTA,"<$file") || &Error("Could not open '$file': $!");
+ open(FASTA_CLEAN,">$file_clean") || &Error("Could not open '$file_clean': $!");
+ while (<FASTA>) {
+ $_=~s/[\n\r]+$//g;
+ if(/^>/){
+ my @arr = split(" ",$_);
+ $arr[0]=~s/,/;/g;
+ print FASTA_CLEAN join(" ",@arr)."\n";
+ }elsif($_ ne ""){
+ print FASTA_CLEAN "$_\n";
+ }
+ }
+ close(FASTA);
+ close(FASTA_CLEAN);
+ if($step==2){
+ print STDERR ("\n$ORANGE [WARNING]$NC Restarting the indices generation.$NC");
+ my @arr=($file_clean);
+ &generate_indices(@arr);
+ }
+ $_ eq $file and $_ = $file_clean for @files;
+ $_ eq $file and $_ = $file_clean for @files_cleanup;
+ goto RESTART_check_files;
+ }
if($isoform eq "uniprot" || $isoform eq "ncbi"){
foreach my $key (keys %isoform_mapping){
@@ -3032,9 +3129,9 @@ sub read_details {
if( exists($blastmode_pendant->{$blastmode}) && $restart_counter==0 && $step <2){ # only for step = 0 and step 1 you can do a rerun else the DB are missing
$blastmode = $blastmode_pendant->{$blastmode};
print STDERR ("\n!!!\n[WARNING]$NC Switching now to $blastmode and restarting...\n");
- print STDERR "\nPress 'strg+c' to prevent me from proceeding or wait 10 seconds to continue...\n!!!\n";
- sleep 10;
- print STDERR "\nWell then, proceeding...\n\n";
+ print STDERR "\nPress 'strg+c' to prevent me from proceeding or wait 10 seconds to continue...\n!!!\n";
+ sleep 10;
+ print STDERR "\nWell then, proceeding...\n\n";
goto RESTART;
}
@@ -3060,7 +3157,6 @@ sub read_details {
&Error("\nThe algorithm (-p=$blastmode) does not support the given input files (use --force to skip this behaviour)...");
}
}
- close(FASTA);
unless ($synteny) {return;}
@@ -3074,17 +3170,26 @@ sub read_details {
# e.g. NC_009925.1 RefSeq CDS 9275 10096 . - 0 ID=cds8;Name=YP_001514414.1;Parent=gene9;Dbxref=Genbank:YP_001514414.1,GeneID:5678848;gbkey=CDS;product=signal peptide peptidase SppA;protein_id=YP_001514414.1;transl_table=11
my @col = split(/\t+/,$_);
if ($col[2] ne "CDS") {next;}
- if ($col[8] =~ /Name=([^;]+)/i && defined($genes{$1})) {
- delete $genes{$1};
-# if (!$test) {$coordinates{$1} = "$col[0]\t$col[6]\t$col[3]";} # store
- if (!$test && $col[6] eq "+") {$coordinates{$1} = "$col[0]\t$col[6]\t$col[3]";} # store
- if (!$test && $col[6] eq "-") {$coordinates{$1} = "$col[0]\t$col[6]\t$col[4]";} # store
+ if ($col[8] =~ /Name=([^;]+)/i) {
+ my $gene_nam=$1;
+ $gene_nam =~ s/,/;/g; # 6.0.32 ,->;
+ if(defined($genes{$gene_nam})){
+ delete $genes{$gene_nam};
+# if (!$test) {$coordinates{$1} = "$col[0]\t$col[6]\t$col[3]";} # store
+ if (!$test && $col[6] eq "+") {$coordinates{$gene_nam} = "$col[0]\t$col[6]\t$col[3]";} # store
+ if (!$test && $col[6] eq "-") {$coordinates{$gene_nam} = "$col[0]\t$col[6]\t$col[4]";} # store
+ next;
+ }
}
- elsif ($col[8] =~ /ID=([^;]+)/i && defined($genes{$1})) {
- delete $genes{$1};
-# if (!$test) {$coordinates{$1} = "$col[0]\t$col[6]\t$col[3]";} # store
- if (!$test && $col[6] eq "+") {$coordinates{$1} = "$col[0]\t$col[6]\t$col[3]";} # store
- if (!$test && $col[6] eq "-") {$coordinates{$1} = "$col[0]\t$col[6]\t$col[4]";} # store
+ if ($col[8] =~ /ID=([^;]+)/i) {
+ my $gene_nam=$1;
+ $gene_nam =~ s/,/;/g; # 6.0.32 ,->;
+ if(defined($genes{$gene_nam})){
+ delete $genes{$gene_nam};
+# if (!$test) {$coordinates{$1} = "$col[0]\t$col[6]\t$col[3]";} # store
+ if (!$test && $col[6] eq "+") {$coordinates{$gene_nam} = "$col[0]\t$col[6]\t$col[3]";} # store
+ if (!$test && $col[6] eq "-") {$coordinates{$gene_nam} = "$col[0]\t$col[6]\t$col[4]";} # store
+ }
}
}
close(GFF);
@@ -3105,7 +3210,7 @@ sub Error {
print STDERR "\n\n$RED"."[Error]$NC $ORANGE ".$_[0]." $NC \n\n";
- if($_[0] ne "I need at least two files to compare something!"){print STDERR "If you cannot solve this error, please send a report to incoming+paulklemm-phd-proteinortho-7278443-issue-\@incoming.gitlab.com including the parameter-vector above or visit https://gitlab.com/paulklemm_PHD/proteinortho/wikis/Error%20Codes for more help.\nFurther more all mails to lechner\@staff.uni-marburg.de are welcome\n\n\n";}
+ if($_[0] ne "I need at least two files to compare something!"){print STDERR "Please visit the proteinortho-wiki, where the most common errors are documented:\nhttps://gitlab.com/paulklemm_PHD/proteinortho/wikis/Error%20Codes\n\nIf you cannot solve this error, please file a report (including the input files, the error code and the above 'Parameter-vector'):\nincoming+paulklemm-phd-proteinortho-7278443-issue-\@incoming.gitlab.com\n\nFurther more all mails to lechner\@staff.uni-marburg.de are welcome.\n\n\n";}
&reset_locale();
if (!$keep && $tmp_path =~ m/\/proteinortho_cache_[^\/]+\d*\/$/ && $step!=1 ){system("rm -r $tmp_path >/dev/null 2>&1");}
@@ -3120,6 +3225,10 @@ sub gff4fasta {
if (-e $gff) {return $gff;}
if (-e $gff."3") {return $gff."3";}
+ $gff =~ s/_clean\.gff$/.gff/;
+ if (-e $gff) {return $gff;}
+ if (-e $gff."3") {return $gff."3";}
+
my $ncbi_gff = $gff;
if ($ncbi_gff =~ s/_cds_from//) {
if (-e $ncbi_gff) {return $ncbi_gff;}
@@ -3154,8 +3263,8 @@ sub get_po_path {
$tmppath[1]="$binpath/";
if($debug){print STDERR "Detected ".$tmppath[1]."\n";}
}else{
- my $p=`whereis proteinortho_clustering`;
- $p=~s/^proteinortho_clustering: *([^ ]+)\/proteinortho_clustering.*$/$1/;
+ my $p=`which proteinortho_clustering`;
+ $p=~s/^([^ ]+)\/proteinortho_clustering.*$/$1/;
chomp($p);
$tmppath[1]=$p;
if($debug){print STDERR "Detected (PATH enviroment variable)\n";}
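Reviewer note (not part of the commit): the comma handling added to read_details above can be summarised by a small standalone sketch. It is written in Python for illustration only; the function names clean_path, clean_header and clean_fasta_lines are hypothetical and do not exist in proteinortho. It assumes, as the hunk suggests, that only the first whitespace-delimited token of a FASTA header is rewritten and that empty lines are dropped.

```python
import os
import re


def clean_path(path):
    """Insert '_clean' before the last extension of the absolute path,
    e.g. C.faa -> /abs/dir/C_clean.faa."""
    return re.sub(r"(\.[^.]+)$", r"_clean\1", os.path.abspath(path))


def clean_header(line):
    """Replace ',' with ';' in the first token (the gene ID) of a header.
    The rest of the description is kept, re-joined with single spaces."""
    fields = line.split()
    fields[0] = fields[0].replace(",", ";")
    return " ".join(fields)


def clean_fasta_lines(lines):
    """Return cleaned FASTA lines: headers sanitised, empty lines dropped."""
    cleaned = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            cleaned.append(clean_header(line))
        elif line != "":
            cleaned.append(line)
    return cleaned


if __name__ == "__main__":
    print(clean_fasta_lines([">C_10,test demo\n", "VVLCRYEIGG\n", "\n"]))
```

The semicolon is presumably chosen because the comma serves as the gene separator inside a proteinortho.tsv cluster cell, as the warning text in the hunk states.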
=====================================
src/.gitlab-ci.yml
=====================================
@@ -13,7 +13,7 @@ gcc-latest-manyoptions-together:
stage: test-precompiled-bins
script:
- echo "installing diamond"
- - wget http://github.com/bbuchfink/diamond/releases/download/v2.0.6/diamond-linux64.tar.gz
+ - wget http://github.com/bbuchfink/diamond/releases/download/v2.0.6/diamond-linux64.tar.gz 2>/dev/null
- tar xzf diamond-linux64.tar.gz
- cp diamond $HOME
- export PATH="$PATH:$HOME"
@@ -27,7 +27,7 @@ gcc-latest-someoptions-one-by-one:
stage: test-precompiled-bins
script:
- echo "installing diamond"
- - wget http://github.com/bbuchfink/diamond/releases/download/v2.0.6/diamond-linux64.tar.gz
+ - wget http://github.com/bbuchfink/diamond/releases/download/v2.0.6/diamond-linux64.tar.gz 2>/dev/null
- tar xzf diamond-linux64.tar.gz
- cp diamond $HOME
- export PATH="$PATH:$HOME"
@@ -61,17 +61,17 @@ gcc-latest-all-p:
script:
- export CWD=$(pwd)
- echo "installing last"
- - wget http://last.cbrc.jp/last-982.zip && unzip last*zip 2>/dev/null && cd last*/ && make && cp src/last* $HOME
+ - wget http://last.cbrc.jp/last-982.zip 2>/dev/null && unzip last*zip 2>/dev/null && cd last*/ && make && cp src/last* $HOME
- cd $CWD && echo "installing usearch"
- curl https://drive5.com/cgi-bin/upload3.py?license=2019070410321731111 --output $HOME/usearch && chmod +x $HOME/usearch
- cd $CWD && echo "installing mmseqs2"
- git clone https://github.com/soedinglab/MMseqs2 && cd MMs* && cmake . && make && cp src/mmseqs $HOME && cd ..
- cd $CWD && echo "installing blat"
- - wget http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64.v369/blat/blat && cp blat $HOME && chmod +x $HOME/blat
+ - wget http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64.v369/blat/blat 2>/dev/null && cp blat $HOME && chmod +x $HOME/blat
- cd $CWD && echo "installing topaz"
- git clone https://github.com/ajm/topaz && cd topaz/src && make && cp topaz $HOME && cd ../..
- cd $CWD && echo "installing diamond"
- - wget http://github.com/bbuchfink/diamond/releases/download/v2.0.6/diamond-linux64.tar.gz && tar xzf diamond-linux64.tar.gz && cp diamond $HOME
+ - wget http://github.com/bbuchfink/diamond/releases/download/v2.0.6/diamond-linux64.tar.gz 2>/dev/null && tar xzf diamond-linux64.tar.gz && cp diamond $HOME
- export PATH="$PATH:$HOME"
- echo "start proteinortho tests"
- gcc --version
@@ -87,7 +87,7 @@ gcc-latest-diamond:
stage: test-precompiled-bins
script:
- echo "installing diamond"
- - wget http://github.com/bbuchfink/diamond/releases/download/v2.0.6/diamond-linux64.tar.gz
+ - wget http://github.com/bbuchfink/diamond/releases/download/v2.0.6/diamond-linux64.tar.gz 2>/dev/null
- tar xzf diamond-linux64.tar.gz
- cp diamond $HOME
- export PATH="$PATH:$HOME"
@@ -103,7 +103,7 @@ nolapack-gcc-latest:
stage: recompile-and-test
script:
- echo "installing diamond"
- - wget http://github.com/bbuchfink/diamond/releases/download/v2.0.6/diamond-linux64.tar.gz
+ - wget http://github.com/bbuchfink/diamond/releases/download/v2.0.6/diamond-linux64.tar.gz 2>/dev/null
- tar xzf diamond-linux64.tar.gz
- cp diamond $HOME
- export PATH="$PATH:$HOME"
@@ -126,7 +126,7 @@ nolapack-gcc-latest:
# - make
# - cp topaz $HOME
# - echo "installing diamond"
-# - wget http://github.com/bbuchfink/diamond/releases/download/v2.0.6/diamond-linux64.tar.gz
+# - wget http://github.com/bbuchfink/diamond/releases/download/v2.0.6/diamond-linux64.tar.gz 2>/dev/null
# - tar xzf diamond-linux64.tar.gz
# - cp diamond $HOME
# - export PATH="$PATH:$HOME"
@@ -149,7 +149,7 @@ ubuntu-latest0:
- cp topaz $HOME
- cd ../..
- echo "installing diamond"
- - wget http://github.com/bbuchfink/diamond/releases/download/v2.0.6/diamond-linux64.tar.gz
+ - wget http://github.com/bbuchfink/diamond/releases/download/v2.0.6/diamond-linux64.tar.gz 2>/dev/null
- tar xzf diamond-linux64.tar.gz
- cp diamond $HOME
- export PATH="$PATH:$HOME"
@@ -171,7 +171,7 @@ ubuntu-latest:
- cp topaz $HOME
- cd ../..
- echo "installing diamond"
- - wget http://github.com/bbuchfink/diamond/releases/download/v2.0.6/diamond-linux64.tar.gz
+ - wget http://github.com/bbuchfink/diamond/releases/download/v2.0.6/diamond-linux64.tar.gz 2>/dev/null
- tar xzf diamond-linux64.tar.gz
- cp diamond $HOME
- export PATH="$PATH:$HOME"
@@ -194,7 +194,7 @@ debian-latest:
- make
- cp topaz $HOME
- echo "installing diamond"
- - wget http://github.com/bbuchfink/diamond/releases/download/v2.0.6/diamond-linux64.tar.gz
+ - wget http://github.com/bbuchfink/diamond/releases/download/v2.0.6/diamond-linux64.tar.gz 2>/dev/null
- tar xzf diamond-linux64.tar.gz
- cp diamond $HOME
- export PATH="$PATH:$HOME"
@@ -224,12 +224,12 @@ debian-latest:
# - yum -y install python
# - yum -y install ncbi-blast+
# - cpan Thread::Queue
-# - wget ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/ncbi-blast*-x64-linux.tar.gz
+# - wget ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/ncbi-blast*-x64-linux.tar.gz 2>/dev/null
# - tar -xzvf ncbi-blast*-x64-linux.tar.gz
# - cp ncbi-blast*/bin/blastp $HOME
# - cp ncbi-blast*/bin/makeblastdb $HOME
# - echo "installing diamond"
-# - wget http://github.com/bbuchfink/diamond/releases/download/v2.0.6/diamond-linux64.tar.gz
+# - wget http://github.com/bbuchfink/diamond/releases/download/v2.0.6/diamond-linux64.tar.gz 2>/dev/null
# - tar xzf diamond-linux64.tar.gz
# - cp diamond $HOME
# - export PATH="$PATH:$HOME"
@@ -247,12 +247,12 @@ debian-latest:
# - yum -y install which
# - yum -y install wget
# - yum -y install gcc-gfortran python3 atlas atlas-devel lapack blas
-# - wget ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/ncbi-blast*-x64-linux.tar.gz
+# - wget ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/ncbi-blast*-x64-linux.tar.gz 2>/dev/null
# - tar -xzvf ncbi-blast*-x64-linux.tar.gz
# - cp ncbi-blast*/bin/blastp $HOME
# - cp ncbi-blast*/bin/makeblastdb $HOME
# - echo "installing diamond"
-# - wget http://github.com/bbuchfink/diamond/releases/download/v2.0.6/diamond-linux64.tar.gz
+# - wget http://github.com/bbuchfink/diamond/releases/download/v2.0.6/diamond-linux64.tar.gz 2>/dev/null
# - tar xzf diamond-linux64.tar.gz
# - cp diamond $HOME
# - export PATH="$PATH:$HOME"
=====================================
src/proteinortho_grab_proteins.pl
=====================================
@@ -27,9 +27,9 @@
#
# @author Paul Klemm
# @email klemmp@staff.uni-marburg.de
-# @company Bioinformatics, University of Leipzig
-# @version 6
-# @date 2-10-2021
+# @company Bioinformatics, University of Marburg
+# @version 7
+# @date 23-12-2021
#
##########################################################################################
@@ -39,6 +39,7 @@ use threads::shared;
use POSIX;
use File::Basename;
use Thread::Queue;
+use Cwd 'abs_path';
our $QUEUE = Thread::Queue->new(); # A new empty queue
my $usage = <<'ENDUSAGE';
@@ -142,7 +143,15 @@ for(my $v = 0 ; $v < scalar @ARGV_copy ; $v++){
elsif($ARGV_copy[$v] =~ m/^--?exact$/){$exact=1;}
elsif($ARGV_copy[$v] =~ m/^-.+/){print $usage; print STDERR "ERROR: invalid option ".$ARGV_copy[$v]."!\n\n";exit(1);}
elsif(!defined($query)){$query = $ARGV_copy[$v];}
- else{$ARGV_copyiddone[$v]=0;$ARGV_copyiddone_counter--;}
+ else{
+ # 6.0.32 replace files if there is a *_clean* present
+ if(-e $ARGV_copy[$v]){
+ my $file_clean = abs_path $ARGV_copy[$v];
+ $file_clean=~s/(\.[^.]+)$/_clean$1/;
+ if(-e $file_clean){$ARGV_copy[$v] = $file_clean}
+ }
+ $ARGV_copyiddone[$v]=0;$ARGV_copyiddone_counter--;
+ }
}
if ($help){
print $usage;
@@ -223,6 +232,10 @@ unless(open(my $FH,'<',$query)) {
if(length($sp[$v])==0 || $sp[$v] eq ""){next;}
$sp[$v]=~s/^\(//;$sp[$v]=~s/\)$//;
+ my @arr = split(" ",$sp[$v]); # 6.0.32 , -> ;
+ $arr[0]=~s/,/;/g;
+ $sp[$v]=join(" ",@arr);
+
$qdata{"STDIN"}{$sp[$v]}=$sp[$v];
$genecounter++;
}
@@ -335,7 +348,11 @@ sub worker {
if(scalar(@arr)>0){$genename=$arr[0];}
$genename=~s/^>//;
- if(exists $qdata{$basename}{$genename}){
+ my @arr = split(" ",$genename); # 6.0.32 , -> ;
+ $arr[0]=~s/,/;/g;
+ my $genename_no_comma=join(" ",@arr);
+
+ if(exists $qdata{$basename}{$genename_no_comma}){
my $headerstr=$curLine;
if($source){$headerstr=$headerstr." ".$basename;}
@@ -355,10 +372,13 @@ sub worker {
if(!defined $qdata{$filename}{$key}){next}
my $regexv = $key;
+
my $curLine_test = $curLine;
-
+
if(!$doregex && !$exact){$regexv=quotemeta($regexv);}
+ $regexv=~s/;/[,;]/g; # 6.0.32 , -> ;
+
my $test_match = 0;
if( !$exact ){
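Reviewer note (not part of the commit): the last hunk above lets a query ID containing ';' also match the original ',' in a FASTA header. A minimal Python sketch of that idea follows; build_pattern is a hypothetical helper, not a proteinortho function. The sketch escapes each ';'-separated part on its own and joins the parts with the character class, so the class itself is never affected by escaping. This is a design choice of the sketch, not a description of the Perl code.

```python
import re


def build_pattern(key):
    """Build a regex that matches `key` literally, except that every ';'
    in the key matches either ',' or ';' in the searched text."""
    return "[,;]".join(re.escape(part) for part in key.split(";"))


if __name__ == "__main__":
    pattern = build_pattern("C_10;test")
    for header in (">C_10,test", ">C_10;test", ">C_10.test"):
        print(header, bool(re.search(pattern, header)))
```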
=====================================
test/C.faa
=====================================
@@ -1,4 +1,4 @@
->C_10
+>C_10,test
VVLCRYEIGGLAQVLDTQFDMYTNCHKMCSADSQVTYKEAANLTARVTTDRQKEPLTGGY
HGAKLGFLGCSLLRSRDYGYPEQNFHAKTDLFALPMGDHYCGDEGSGNAYLCDFDNQYGR
SVRSPLKKLLGFGYNPTYGKSALGDELRLGLVFREEFRKINKALLTGGANVVKAGVSYKD
=====================================
test/C.gff
=====================================
@@ -207,7 +207,7 @@ gi|12345678|ref|NC_012345.1| sim CDS 206 206 . + . ID=C_32;
gi|12345678|ref|NC_012345.1| sim CDS 207 207 . + . ID=C_33;
gi|12345678|ref|NC_012345.1| sim CDS 208 208 . + . ID=C_34;
gi|12345678|ref|NC_012345.1| sim CDS 209 209 . + . ID=C_9;
-gi|12345678|ref|NC_012345.1| sim CDS 210 210 . + . ID=C_10;
+gi|12345678|ref|NC_012345.1| sim CDS 210 210 . + . ID=C_10,test;
gi|12345678|ref|NC_012345.1| sim CDS 211 211 . - . ID=C_50;
gi|12345678|ref|NC_012345.1| sim CDS 212 212 . - . ID=C_61;
gi|12345678|ref|NC_012345.1| sim CDS 213 213 . - . ID=C_221;
=====================================
test/C2.faa
=====================================
@@ -12,4 +12,4 @@ AYRNIKKKGYDGGKAGTLVTLMEFVAQGRVANALFDWGSCNEEGAGLSKQCSETVVGFLQ
QSSDYHRLFPKGYGEVPPRCTLGPFPAFHMLMQAALKGSFRTAQQPSVLFSCKCVKLKYS
SCKYAL
>C_11
-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
+AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAV
View it on GitLab: https://salsa.debian.org/med-team/proteinortho/-/commit/50d748b5c10ee4ec1db1c0ec09e45d0f0dddb9cf