[med-svn] [Git][med-team/python-nanomath][master] 5 commits: New upstream version 1.4.0+ds

Sat Oct 25 11:28:15 BST 2025


Nilesh Patra pushed to branch master at Debian Med / python-nanomath


Commits:
bb79bfa9 by Nilesh Patra at 2025-10-25T15:54:06+05:30
New upstream version 1.4.0+ds
- - - - -
9c387e68 by Nilesh Patra at 2025-10-25T15:54:06+05:30
Update upstream source from tag 'upstream/1.4.0+ds'

Update to upstream version '1.4.0+ds'
with Debian dir 348d87d7eaf23f5b2bc697afa3ec3ea66efe3860
- - - - -
79245108 by Nilesh Patra at 2025-10-25T15:54:08+05:30
Bump Standards-Version to 4.7.2 (no changes needed)

- - - - -
8c73cf97 by Nilesh Patra at 2025-10-25T15:55:05+05:30
Drop Redundant "Rules-Requires-Root: no"

- - - - -
2a4f3f88 by Nilesh Patra at 2025-10-25T15:55:27+05:30
Upload to unstable

- - - - -


8 changed files:

- PKG-INFO
- README.md
- README.rst
- debian/changelog
- debian/control
- nanomath.egg-info/PKG-INFO
- nanomath/nanomath.py
- nanomath/version.py


Changes:

=====================================
PKG-INFO
=====================================
@@ -1,52 +1,12 @@
-Metadata-Version: 1.2
+Metadata-Version: 2.1
 Name: nanomath
-Version: 1.2.1
+Version: 1.4.0
 Summary: A few simple math function for other Oxford Nanopore processing scripts
 Home-page: https://github.com/wdecoster/nanomath
 Author: Wouter De Coster
 Author-email: decosterwouter at gmail.com
 License: GPLv3
-Description: # nanomath
-        This module provides a few simple math and statistics functions for other scripts processing Oxford Nanopore sequencing data
-        
-        [![Twitter URL](https://img.shields.io/twitter/url/https/twitter.com/wouter_decoster.svg?style=social&label=Follow%20%40wouter_decoster)](https://twitter.com/wouter_decoster)
-        [![install with conda](https://anaconda.org/bioconda/nanomath/badges/installer/conda.svg)](https://anaconda.org/bioconda/nanomath)
-        [![install with Debian](https://www.debian.org/logos/button-mini.png)](https://tracker.debian.org/pkg/python-nanomath)
-        [![Build Status](https://travis-ci.org/wdecoster/nanomath.svg?branch=master)](https://travis-ci.org/wdecoster/nanomath)
-        [![Code Health](https://landscape.io/github/wdecoster/nanomath/master/landscape.svg?style=flat)](https://landscape.io/github/wdecoster/nanomath/master)
-        
-        
-        ## FUNCTIONS
-        * Calculate read N50 from a set of lengths `get_N50(readlenghts)`  
-        * Remove extreme length outliers from a dataset `remove_length_outliers(dataframe, columname)`  
-        * Calculate the average Phred quality of a read `ave_qual(qualscores)`  
-        * Write out the statistics report after calling readstats function `write_stats(dataframe, outputname)`  
-        * Compute a number of statistics, return a dictionary `calc_read_stats(dataframe)`  
-        
-        
-        ## INSTALLATION
-        ```bash
-        pip install nanomath
-        ```
-        or  
-        [![install with conda](https://anaconda.org/bioconda/nanomath/badges/installer/conda.svg)](https://anaconda.org/bioconda/nanomath)
-        ```
-        conda install -c bioconda nanomath
-        ```
-        
-        ## STATUS 
-        [![Build Status](https://travis-ci.org/wdecoster/nanomath.svg?branch=master)](https://travis-ci.org/wdecoster/nanomath)
-        
-        
-        ## CONTRIBUTORS
-        [@alexomics](https://github.com/alexomics) for fixing the indentation of the printed stats
-        
-        
-        ## CITATION
-        If you use this tool, please consider citing our [publication](https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/bty149/4934939).
-        
 Keywords: nanopore sequencing plotting quality control
-Platform: UNKNOWN
 Classifier: Development Status :: 4 - Beta
 Classifier: Intended Audience :: Science/Research
 Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
@@ -56,3 +16,46 @@ Classifier: Programming Language :: Python :: 3.3
 Classifier: Programming Language :: Python :: 3.4
 Classifier: Programming Language :: Python :: 3.5
 Requires-Python: >=3
+License-File: LICENSE
+Requires-Dist: pandas
+Requires-Dist: numpy>1.8
+Requires-Dist: Python-Deprecated
+
+# nanomath
+
+This module provides a few simple math and statistics functions for other scripts processing Oxford Nanopore sequencing data
+
+[![Twitter URL](https://img.shields.io/twitter/url/https/twitter.com/wouter_decoster.svg?style=social&label=Follow%20%40wouter_decoster)](https://twitter.com/wouter_decoster)
+[![install with conda](https://anaconda.org/bioconda/nanomath/badges/installer/conda.svg)](https://anaconda.org/bioconda/nanomath)
+[![install with Debian](https://www.debian.org/logos/button-mini.png)](https://tracker.debian.org/pkg/python-nanomath)
+
+## FUNCTIONS
+
+* Calculate read N50 from a set of lengths `get_N50(readlenghts)`  
+* Remove extreme length outliers from a dataset `remove_length_outliers(dataframe, columname)`  
+* Calculate the average Phred quality of a read `ave_qual(qualscores)`  
+* Write out the statistics report after calling readstats function `write_stats(dataframe, outputname)`  
+* Compute a number of statistics, return a dictionary `calc_read_stats(dataframe)`  
+
+As of **v1.3.0**, nanomath calculates the average quality differently, by first converting per-read phred scale averages to error rates, take the average, and converting back ([nanostat#40](<https://github.com/wdecoster/nanostat/issues/40>))
+
+## INSTALLATION
+
+```bash
+pip install nanomath
+```
+
+or  
+[![install with conda](https://anaconda.org/bioconda/nanomath/badges/installer/conda.svg)](https://anaconda.org/bioconda/nanomath)
+
+```
+conda install -c bioconda nanomath
+```
+
+## CONTRIBUTORS
+
+[@alexomics](https://github.com/alexomics) for fixing the indentation of the printed stats
+
+## CITATION
+
+If you use this tool, please consider citing our [publication](https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/bty149/4934939).


=====================================
README.md
=====================================
@@ -1,38 +1,38 @@
 # nanomath
+
 This module provides a few simple math and statistics functions for other scripts processing Oxford Nanopore sequencing data
 
 [![Twitter URL](https://img.shields.io/twitter/url/https/twitter.com/wouter_decoster.svg?style=social&label=Follow%20%40wouter_decoster)](https://twitter.com/wouter_decoster)
 [![install with conda](https://anaconda.org/bioconda/nanomath/badges/installer/conda.svg)](https://anaconda.org/bioconda/nanomath)
 [![install with Debian](https://www.debian.org/logos/button-mini.png)](https://tracker.debian.org/pkg/python-nanomath)
-[![Build Status](https://travis-ci.org/wdecoster/nanomath.svg?branch=master)](https://travis-ci.org/wdecoster/nanomath)
-[![Code Health](https://landscape.io/github/wdecoster/nanomath/master/landscape.svg?style=flat)](https://landscape.io/github/wdecoster/nanomath/master)
-
 
 ## FUNCTIONS
+
 * Calculate read N50 from a set of lengths `get_N50(readlenghts)`  
 * Remove extreme length outliers from a dataset `remove_length_outliers(dataframe, columname)`  
 * Calculate the average Phred quality of a read `ave_qual(qualscores)`  
 * Write out the statistics report after calling readstats function `write_stats(dataframe, outputname)`  
 * Compute a number of statistics, return a dictionary `calc_read_stats(dataframe)`  
 
+As of **v1.3.0**, nanomath calculates the average quality differently, by first converting per-read phred scale averages to error rates, take the average, and converting back ([nanostat#40](<https://github.com/wdecoster/nanostat/issues/40>))
 
 ## INSTALLATION
+
 ```bash
 pip install nanomath
 ```
+
 or  
 [![install with conda](https://anaconda.org/bioconda/nanomath/badges/installer/conda.svg)](https://anaconda.org/bioconda/nanomath)
+
 ```
 conda install -c bioconda nanomath
 ```
 
-## STATUS 
-[![Build Status](https://travis-ci.org/wdecoster/nanomath.svg?branch=master)](https://travis-ci.org/wdecoster/nanomath)
-
-
 ## CONTRIBUTORS
-[@alexomics](https://github.com/alexomics) for fixing the indentation of the printed stats
 
+[@alexomics](https://github.com/alexomics) for fixing the indentation of the printed stats
 
 ## CITATION
+
 If you use this tool, please consider citing our [publication](https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/bty149/4934939).


=====================================
README.rst
=====================================
@@ -4,8 +4,7 @@ nanomath
 This module provides a few simple math and statistics functions for
 other scripts processing Oxford Nanopore sequencing data
 
-|Twitter URL| |install with conda| |install with Debian| |Build Status|
-|Code Health|
+|Twitter URL| |install with conda| |install with Debian|
 
 FUNCTIONS
 ---------
@@ -20,6 +19,11 @@ FUNCTIONS
 -  Compute a number of statistics, return a dictionary
    ``calc_read_stats(dataframe)``
 
+As of **v1.3.0**, nanomath calculates the average quality differently,
+by first converting per-read phred scale averages to error rates, take
+the average, and converting back
+(`nanostat#40 <https://github.com/wdecoster/nanostat/issues/40>`__)
+
 INSTALLATION
 ------------
 
@@ -34,11 +38,6 @@ INSTALLATION
 
    conda install -c bioconda nanomath
 
-STATUS
-------
-
-|Build Status|
-
 CONTRIBUTORS
 ------------
 
@@ -57,7 +56,3 @@ If you use this tool, please consider citing our
    :target: https://anaconda.org/bioconda/nanomath
 .. |install with Debian| image:: https://www.debian.org/logos/button-mini.png
    :target: https://tracker.debian.org/pkg/python-nanomath
-.. |Build Status| image:: https://travis-ci.org/wdecoster/nanomath.svg?branch=master
-   :target: https://travis-ci.org/wdecoster/nanomath
-.. |Code Health| image:: https://landscape.io/github/wdecoster/nanomath/master/landscape.svg?style=flat
-   :target: https://landscape.io/github/wdecoster/nanomath/master


=====================================
debian/changelog
=====================================
@@ -1,3 +1,13 @@
+python-nanomath (1.4.0+ds-1) unstable; urgency=medium
+
+  * Team Upload.
+  * Remove myself from uploaders
+  * New upstream version 1.4.0+ds
+  * Bump Standards-Version to 4.7.2 (no changes needed)
+  * Drop Redundant "Rules-Requires-Root: no"
+
+ -- Nilesh Patra <nilesh at debian.org>  Sat, 25 Oct 2025 15:55:08 +0530
+
 python-nanomath (1.2.1+ds-1) unstable; urgency=medium
 
   * New upstream version 1.2.1+ds


=====================================
debian/control
=====================================
@@ -9,11 +9,10 @@ Build-Depends: debhelper-compat (= 13),
                python3-setuptools,
                python3-deprecated,
                python3-numpy <!nocheck>,
-Standards-Version: 4.5.1
+Standards-Version: 4.7.2
 Vcs-Browser: https://salsa.debian.org/med-team/python-nanomath
 Vcs-Git: https://salsa.debian.org/med-team/python-nanomath.git
 Homepage: https://github.com/wdecoster/nanomath
-Rules-Requires-Root: no
 
 Package: python3-nanomath
 Architecture: all


=====================================
nanomath.egg-info/PKG-INFO
=====================================
@@ -1,52 +1,12 @@
-Metadata-Version: 1.2
+Metadata-Version: 2.1
 Name: nanomath
-Version: 1.2.1
+Version: 1.4.0
 Summary: A few simple math function for other Oxford Nanopore processing scripts
 Home-page: https://github.com/wdecoster/nanomath
 Author: Wouter De Coster
 Author-email: decosterwouter at gmail.com
 License: GPLv3
-Description: # nanomath
-        This module provides a few simple math and statistics functions for other scripts processing Oxford Nanopore sequencing data
-        
-        [![Twitter URL](https://img.shields.io/twitter/url/https/twitter.com/wouter_decoster.svg?style=social&label=Follow%20%40wouter_decoster)](https://twitter.com/wouter_decoster)
-        [![install with conda](https://anaconda.org/bioconda/nanomath/badges/installer/conda.svg)](https://anaconda.org/bioconda/nanomath)
-        [![install with Debian](https://www.debian.org/logos/button-mini.png)](https://tracker.debian.org/pkg/python-nanomath)
-        [![Build Status](https://travis-ci.org/wdecoster/nanomath.svg?branch=master)](https://travis-ci.org/wdecoster/nanomath)
-        [![Code Health](https://landscape.io/github/wdecoster/nanomath/master/landscape.svg?style=flat)](https://landscape.io/github/wdecoster/nanomath/master)
-        
-        
-        ## FUNCTIONS
-        * Calculate read N50 from a set of lengths `get_N50(readlenghts)`  
-        * Remove extreme length outliers from a dataset `remove_length_outliers(dataframe, columname)`  
-        * Calculate the average Phred quality of a read `ave_qual(qualscores)`  
-        * Write out the statistics report after calling readstats function `write_stats(dataframe, outputname)`  
-        * Compute a number of statistics, return a dictionary `calc_read_stats(dataframe)`  
-        
-        
-        ## INSTALLATION
-        ```bash
-        pip install nanomath
-        ```
-        or  
-        [![install with conda](https://anaconda.org/bioconda/nanomath/badges/installer/conda.svg)](https://anaconda.org/bioconda/nanomath)
-        ```
-        conda install -c bioconda nanomath
-        ```
-        
-        ## STATUS 
-        [![Build Status](https://travis-ci.org/wdecoster/nanomath.svg?branch=master)](https://travis-ci.org/wdecoster/nanomath)
-        
-        
-        ## CONTRIBUTORS
-        [@alexomics](https://github.com/alexomics) for fixing the indentation of the printed stats
-        
-        
-        ## CITATION
-        If you use this tool, please consider citing our [publication](https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/bty149/4934939).
-        
 Keywords: nanopore sequencing plotting quality control
-Platform: UNKNOWN
 Classifier: Development Status :: 4 - Beta
 Classifier: Intended Audience :: Science/Research
 Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
@@ -56,3 +16,46 @@ Classifier: Programming Language :: Python :: 3.3
 Classifier: Programming Language :: Python :: 3.4
 Classifier: Programming Language :: Python :: 3.5
 Requires-Python: >=3
+License-File: LICENSE
+Requires-Dist: pandas
+Requires-Dist: numpy>1.8
+Requires-Dist: Python-Deprecated
+
+# nanomath
+
+This module provides a few simple math and statistics functions for other scripts processing Oxford Nanopore sequencing data
+
+[![Twitter URL](https://img.shields.io/twitter/url/https/twitter.com/wouter_decoster.svg?style=social&label=Follow%20%40wouter_decoster)](https://twitter.com/wouter_decoster)
+[![install with conda](https://anaconda.org/bioconda/nanomath/badges/installer/conda.svg)](https://anaconda.org/bioconda/nanomath)
+[![install with Debian](https://www.debian.org/logos/button-mini.png)](https://tracker.debian.org/pkg/python-nanomath)
+
+## FUNCTIONS
+
+* Calculate read N50 from a set of lengths `get_N50(readlenghts)`  
+* Remove extreme length outliers from a dataset `remove_length_outliers(dataframe, columname)`  
+* Calculate the average Phred quality of a read `ave_qual(qualscores)`  
+* Write out the statistics report after calling readstats function `write_stats(dataframe, outputname)`  
+* Compute a number of statistics, return a dictionary `calc_read_stats(dataframe)`  
+
+As of **v1.3.0**, nanomath calculates the average quality differently, by first converting per-read phred scale averages to error rates, take the average, and converting back ([nanostat#40](<https://github.com/wdecoster/nanostat/issues/40>))
+
+## INSTALLATION
+
+```bash
+pip install nanomath
+```
+
+or  
+[![install with conda](https://anaconda.org/bioconda/nanomath/badges/installer/conda.svg)](https://anaconda.org/bioconda/nanomath)
+
+```
+conda install -c bioconda nanomath
+```
+
+## CONTRIBUTORS
+
+[@alexomics](https://github.com/alexomics) for fixing the indentation of the printed stats
+
+## CITATION
+
+If you use this tool, please consider citing our [publication](https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/bty149/4934939).


=====================================
nanomath/nanomath.py
=====================================
@@ -27,12 +27,21 @@ from math import log
 
 class Stats(object):
     def __init__(self, df):
+        if len(df) < 5:
+            sys.stderr.write("\n\nWARNING: less than 5 reads in the dataset!\n")
+            sys.stderr.write("WARNING: some stats might be unexpected or missing\n")
+            sys.stderr.write("WARNING: or a crash might happen, who knows\n")
+            sys.stderr.write(
+                "WARNING: this code is not intended for such small datasets\n\n\n"
+            )
         self.number_of_reads = len(df)
         self.number_of_bases = np.sum(df["lengths"])
         self._with_readIDs = "readIDs" in df
         if "aligned_lengths" in df:
             self.number_of_bases_aligned = np.sum(df["aligned_lengths"])
-            self.fraction_bases_aligned = self.number_of_bases_aligned / self.number_of_bases
+            self.fraction_bases_aligned = (
+                self.number_of_bases_aligned / self.number_of_bases
+            )
         self.median_read_length = np.median(df["lengths"])
         self.mean_read_length = np.mean(df["lengths"])
         self.read_length_stdev = np.std(df["lengths"])
@@ -43,36 +52,52 @@ class Stats(object):
         if "channelIDs" in df:
             self.active_channels = np.unique(df["channelIDs"]).size
         if "quals" in df:
-            self._qualgroups = [5, 7, 10, 12, 15]  # needs 5 elements in current implementation
-            self.mean_qual = np.mean(df["quals"])
+            self._qualgroups = [
+                10,
+                15,
+                20,
+                25,
+                30,
+            ]  # needs 5 elements in current implementation
+            self.mean_qual = ave_qual(df["quals"].astype("int").to_list())
             self.median_qual = np.median(df["quals"])
-            self._top5_lengths = get_top_5(df=df,
-                                           col="lengths",
-                                           values=["lengths", "quals"])
-            self._top5_quals = get_top_5(df=df,
-                                         col="quals",
-                                         values=["quals", "lengths"])
+            self._top5_lengths = get_top_5(
+                df=df, col="lengths", values=["lengths", "quals"]
+            )
+            self._top5_quals = get_top_5(
+                df=df, col="quals", values=["quals", "lengths"]
+            )
             self._reads_above_qual = [reads_above_qual(df, q) for q in self._qualgroups]
         else:
-            self._top5_lengths = get_top_5(df=df,
-                                           col="lengths",
-                                           values=["lengths"],
-                                           fill='quals')
+            self._top5_lengths = get_top_5(
+                df=df, col="lengths", values=["lengths"], fill="quals"
+            )
 
     def long_features_as_string(self):
         """formatting long features to a string to print for legacy stats output"""
         self.top5_lengths = self.long_feature_as_string_top5(self._top5_lengths)
         self.top5_quals = self.long_feature_as_string_top5(self._top5_quals)
-        self.reads_above_qual = self.long_feature_as_string_above_qual(self._reads_above_qual)
+        self.reads_above_qual = self.long_feature_as_string_above_qual(
+            self._reads_above_qual
+        )
 
     def long_feature_as_string_top5(self, field):
         """for legacy stats output"""
         if self._with_readIDs:
-            return [str(round(i, ndigits=1)) + " (" +
-                    str(round(j, ndigits=1)) + "; " + k + ")" for i, j, k in field]
+            return [
+                str(round(i, ndigits=1))
+                + " ("
+                + str(round(j, ndigits=1))
+                + "; "
+                + k
+                + ")"
+                for i, j, k in field
+            ]
         else:
-            return [str(round(i, ndigits=1)) + " (" +
-                    str(round(j, ndigits=1)) + ")" for i, j in field]
+            return [
+                str(round(i, ndigits=1)) + " (" + str(round(j, ndigits=1)) + ")"
+                for i, j in field
+            ]
 
     def long_feature_as_string_above_qual(self, field):
         """for legacy stats output"""
@@ -81,42 +106,49 @@ class Stats(object):
     def format_above_qual_line(self, entry):
         """for legacy stats output"""
         numberAboveQ, megAboveQ = entry
-        return "{} ({}%) {}Mb".format(numberAboveQ,
-                                      round(100 * (numberAboveQ / self.number_of_reads),
-                                            ndigits=1),
-                                      round(megAboveQ, ndigits=1))
+        return "{} ({}%) {}Mb".format(
+            numberAboveQ,
+            round(100 * (numberAboveQ / self.number_of_reads), ndigits=1),
+            round(megAboveQ, ndigits=1),
+        )
 
     def to_dict(self):
         """for tsv stats output"""
         statdict = self.__dict__
         for key, value in statdict.items():
-            if not key.startswith('_'):
+            if not key.startswith("_"):
                 if not isinstance(value, int):
-                    statdict[key] = '{:.1f}'.format(value)
-        self.unwind_long_features_top5(feature='_top5_lengths', name='longest_read_(with_Q)')
-        self.unwind_long_features_top5(feature='_top5_quals', name='highest_Q_read_(with_length)')
-        self.unwind_long_features_above_qual(feature='_reads_above_qual', name='Reads')
-        return {k: v for k, v in statdict.items() if not k.startswith('_')}
+                    statdict[key] = "{:.1f}".format(value)
+        self.unwind_long_features_top5(
+            feature="_top5_lengths", name="longest_read_(with_Q)"
+        )
+        self.unwind_long_features_top5(
+            feature="_top5_quals", name="highest_Q_read_(with_length)"
+        )
+        self.unwind_long_features_above_qual(feature="_reads_above_qual", name="Reads")
+        return {k: v for k, v in statdict.items() if not k.startswith("_")}
 
     def unwind_long_features_top5(self, feature, name):
         """for tsv stats output"""
         if feature not in self.__dict__:
             return
         for entry, label in zip(self.__dict__[feature], range(1, 6)):
-            self.__dict__[name + ':' + str(label)] = '{} ({})'.format(round(entry[0], ndigits=1),
-                                                                      round(entry[1], ndigits=1))
+            self.__dict__[name + ":" + str(label)] = "{} ({})".format(
+                round(entry[0], ndigits=1), round(entry[1], ndigits=1)
+            )
 
     def unwind_long_features_above_qual(self, feature, name):
         """for tsv stats output"""
         if feature not in self.__dict__:
             return
-        for entry, label in zip(self.__dict__[feature],
-                                ['>Q{}:'.format(q) for q in self._qualgroups]):
+        for entry, label in zip(
+            self.__dict__[feature], [">Q{}:".format(q) for q in self._qualgroups]
+        ):
             numberAboveQ, megAboveQ = entry
             percentage = 100 * (numberAboveQ / float(self.number_of_reads))
-            self.__dict__[name + ' ' + label] = "{} ({}%) {}Mb".format(numberAboveQ,
-                                                                       round(percentage, ndigits=1),
-                                                                       round(megAboveQ, ndigits=1))
+            self.__dict__[name + " " + label] = "{} ({}%) {}Mb".format(
+                numberAboveQ, round(percentage, ndigits=1), round(megAboveQ, ndigits=1)
+            )
 
 
 def get_N50(readlengths):
@@ -124,7 +156,9 @@ def get_N50(readlengths):
 
     Based on https://github.com/PapenfussLab/Mungo/blob/master/bin/fasta_stats.py
     """
-    return readlengths[np.where(np.cumsum(readlengths) >= 0.5 * np.sum(readlengths))[0][0]]
+    return readlengths[
+        np.where(np.cumsum(readlengths) >= 0.5 * np.sum(readlengths))[0][0]
+    ]
 
 
 @deprecated
@@ -135,10 +169,9 @@ def remove_length_outliers(df, columnname):
 
 def errs_tab(n):
     """Generate list of error rates for qualities less than equal than n."""
-    return [10**(q / -10) for q in range(n+1)]
+    return [10 ** (q / -10) for q in range(n + 1)]
 
 
-@deprecated
 def ave_qual(quals, qround=False, tab=errs_tab(128)):
     """Calculate average basecall quality of a read.
 
@@ -161,16 +194,20 @@ def get_top_5(df, col, values, fill=False):
     if "readIDs" in df:
         values.append("readIDs")
     if fill:
-        return df.sort_values(col, ascending=False) \
-            .head(5)[values] \
-            .assign(fill=[0]*5) \
-            .reset_index(drop=True) \
+        return (
+            df.sort_values(col, ascending=False)
+            .head(5)[values]
+            .assign(fill=[0] * 5)
+            .reset_index(drop=True)
             .itertuples(index=False, name=None)
+        )
     else:
-        return df.sort_values(col, ascending=False) \
-            .head(5)[values] \
-            .reset_index(drop=True) \
+        return (
+            df.sort_values(col, ascending=False)
+            .head(5)[values]
+            .reset_index(drop=True)
             .itertuples(index=False, name=None)
+        )
 
 
 def reads_above_qual(df, qual):
@@ -185,22 +222,23 @@ def write_stats(datadfs, outputfile, names=[], as_tsv=False):
     This function takes a list of DataFrames,
     and will create a column for each in the tab separated output.
     """
-    if outputfile == 'stdout':
+    if outputfile == "stdout":
         output = sys.stdout
     else:
-        output = open(outputfile, 'wt')
+        output = open(outputfile, "wt")
 
     stats = [Stats(df) for df in datadfs]
 
     if as_tsv:
         import pandas as pd
+
         df = pd.DataFrame([s.to_dict() for s in stats]).transpose()
-        df.index.name = 'Metrics'
+        df.index.name = "Metrics"
         if names:
             df.columns = names
         else:
-            df.columns = ['dataset']
-        output.write(df.to_csv(sep='\t'))
+            df.columns = ["dataset"]
+        output.write(df.to_csv(sep="\t"))
         return df
     else:
         write_stats_legacy(stats, names, output, datadfs)
@@ -228,44 +266,74 @@ def write_stats_legacy(stats, names, output, datadfs):
     }
     max_len = max([len(k) for k in features.keys()])
     try:
-        max_num = max(max([len(str(s.number_of_bases)) for s in stats]),
-                      max([len(str(n)) for n in names])) + 6
+        max_num = (
+            max(
+                max([len(str(s.number_of_bases)) for s in stats]),
+                max([len(str(n)) for n in names]),
+            )
+            + 6
+        )
     except ValueError:
         max_num = max([len(str(s.number_of_bases)) for s in stats]) + 6
-    output.write("{:<{}}{}\n".format('General summary:', max_len,
-                                     " ".join(['{:>{}}'.format(n, max_num) for n in names])))
+    output.write(
+        "{:<{}}{}\n".format(
+            "General summary:",
+            max_len,
+            " ".join(["{:>{}}".format(n, max_num) for n in names]),
+        )
+    )
     for f in sorted(features.keys()):
         try:
-            output.write("{f:{pad}}{v}\n".format(
-                f=f + ':',
-                pad=max_len,
-                v=feature_list(stats, features[f], padding=max_num)))
+            output.write(
+                "{f:{pad}}{v}\n".format(
+                    f=f + ":",
+                    pad=max_len,
+                    v=feature_list(stats, features[f], padding=max_num),
+                )
+            )
         except KeyError:
             pass
     if all(["quals" in df for df in datadfs]):
         for s in stats:
             s.long_features_as_string()
         long_features = {
-            "Top 5 longest reads and their mean basecall quality score":
-            ["top5_lengths", range(1, 6)],
-            "Top 5 highest mean basecall quality scores and their read lengths":
-            ["top5_quals", range(1, 6)],
-            "Number, percentage and megabases of reads above quality cutoffs":
-            ["reads_above_qual", [">Q" + str(q) for q in stats[0]._qualgroups]],
+            "Top 5 longest reads and their mean basecall quality score": [
+                "top5_lengths",
+                range(1, 6),
+            ],
+            "Top 5 highest mean basecall quality scores and their read lengths": [
+                "top5_quals",
+                range(1, 6),
+            ],
+            "Number, percentage and megabases of reads above quality cutoffs": [
+                "reads_above_qual",
+                [">Q" + str(q) for q in stats[0]._qualgroups],
+            ],
         }
         for lf in sorted(long_features.keys()):
             output.write(lf + "\n")
             for index in range(5):
-                output.write("{}:\t{}\n".format(
-                    long_features[lf][1][index], feature_list(stats=stats,
-                                                              feature=long_features[lf][0],
-                                                              index=index)))
+                output.write(
+                    "{}:\t{}\n".format(
+                        long_features[lf][1][index],
+                        feature_list(
+                            stats=stats, feature=long_features[lf][0], index=index
+                        ),
+                    )
+                )
 
 
 def feature_list(stats, feature, index=None, padding=15):
     if index is None:
-        return ' '.join(['{:>{},.1f}'.format(s.__dict__[feature], padding) for s in stats])
+        return " ".join(
+            ["{:>{},.1f}".format(s.__dict__[feature], padding) for s in stats]
+        )
     else:
-        return '\t'.join([str(s.__dict__[feature][index]) if len(s.__dict__[feature]) > index
-                          else "NA"
-                          for s in stats])
+        return "\t".join(
+            [
+                str(s.__dict__[feature][index])
+                if len(s.__dict__[feature]) > index
+                else "NA"
+                for s in stats
+            ]
+        )
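
Reviewer note on the mean_qual change above: the README text added in this commit says that, as of v1.3.0, the average quality is computed by converting Phred scores to error rates, averaging those, and converting back. The snippet below is a minimal standalone sketch of that idea, not the upstream implementation. The function name mean_phred_via_error_rates and the sample values are assumptions made for illustration only.

```python
from math import log10


def mean_phred_via_error_rates(quals):
    """Average Phred scores in error-rate space (hypothetical helper).

    Each score Q maps to an error probability 10 ** (-Q / 10); the
    probabilities are averaged and the mean is mapped back to Phred scale.
    Returns None for an empty input.
    """
    if not quals:
        return None
    error_rates = [10 ** (q / -10) for q in quals]
    mean_error = sum(error_rates) / len(error_rates)
    return -10 * log10(mean_error)


# Illustrative values only: three reads with mean qualities 10, 20 and 30.
sample_quals = [10, 20, 30]
naive_mean = sum(sample_quals) / len(sample_quals)  # plain arithmetic mean
converted_mean = mean_phred_via_error_rates(sample_quals)
print(naive_mean, round(converted_mean, 2))
```

Because low-quality reads dominate the mean error rate, the converted average comes out lower than the plain arithmetic mean of the same scores (about 14.3 versus 20 for the sample above). Reported mean quality can therefore drop after upgrading from 1.2.1 without any change in the data.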


=====================================
nanomath/version.py
=====================================
@@ -1 +1 @@
-__version__ = "1.2.1"
+__version__ = "1.4.0"



View it on GitLab: https://salsa.debian.org/med-team/python-nanomath/-/compare/53f0191e569045329d1e9a0941eaca94670b69bf...2a4f3f8849a15c0484068627448d9b55e6d5210e
