[Pkg-privacy-commits] [Git][pkg-privacy-team/mat2][upstream] New upstream version 0.13.1

Georg Faerber (@georg) georg at debian.org
Sun Jan 8 13:09:19 GMT 2023



Georg Faerber pushed to branch upstream at Privacy Maintainers / mat2


Commits:
3b9e28fc by Georg Faerber at 2023-01-08T12:48:53+00:00
New upstream version 0.13.1
- - - - -


25 changed files:

- .gitlab-ci.yml
- CHANGELOG.md
- INSTALL.md
- README.md
- doc/mat2.1
- libmat2/__init__.py
- libmat2/abstract.py
- libmat2/archive.py
- libmat2/audio.py
- libmat2/bubblewrap.py
- libmat2/epub.py
- libmat2/exiftool.py
- libmat2/harmless.py
- libmat2/images.py
- libmat2/office.py
- libmat2/parser_factory.py
- libmat2/pdf.py
- libmat2/torrent.py
- libmat2/video.py
- libmat2/web.py
- mat2
- − nautilus/README.md
- − nautilus/mat2.py
- setup.py
- tests/test_libmat2.py


Changes:

=====================================
.gitlab-ci.yml
=====================================
@@ -18,7 +18,6 @@ linting:bandit:
   stage: linting
   script:  # TODO: remove B405 and B314
     - bandit ./mat2 --format txt --skip B101
-    - bandit -r ./nautilus/ --format txt --skip B101
     - bandit -r ./libmat2 --format txt --skip B101,B404,B603,B405,B314,B108,B311
 
 linting:codespell:
@@ -35,20 +34,12 @@ linting:pylint:
   stage: linting
   script:
     - pylint --disable=no-else-return,no-else-raise,no-else-continue,unnecessary-comprehension,raise-missing-from,unsubscriptable-object,use-dict-literal,unspecified-encoding,consider-using-f-string,use-list-literal,too-many-statements --extension-pkg-whitelist=cairo,gi ./libmat2 ./mat2
-    # Once nautilus-python is in Debian, decomment it form the line below
-    - pylint --disable=no-else-return,no-else-raise,no-else-continue,unnecessary-comprehension,raise-missing-from,unsubscriptable-object,use-list-literal --extension-pkg-whitelist=Nautilus,GObject,Gtk,Gio,GLib,gi ./nautilus/mat2.py
-
-linting:pyflakes:
-  image: $CONTAINER_REGISTRY:linting
-  stage: linting
-  script:
-    - pyflakes3 ./libmat2 ./mat2 ./tests/ ./nautilus
 
 linting:mypy:
   image: $CONTAINER_REGISTRY:linting
   stage: linting
   script:
-    - mypy --ignore-missing-imports mat2 libmat2/*.py ./nautilus/mat2.py
+    - mypy --ignore-missing-imports mat2 libmat2/*.py
 
 tests:archlinux:
   image: $CONTAINER_REGISTRY:archlinux


=====================================
CHANGELOG.md
=====================================
@@ -1,6 +1,11 @@
+# 0.13.1 - 2023-01-07
+
+- Improve xlsx support
+- Remove the Nautilus extension
+
 # 0.13.0 - 2022-07-06
 
-- Fix an arbitrary file read
+- Fix an arbitrary file read (CVE-2022-35410)
 - Add support for heic files 
 
 # 0.12.4 - 2022-04-30


=====================================
INSTALL.md
=====================================
@@ -18,32 +18,32 @@ installed, mat2 uses it to sandbox any external processes it invokes.
 
 ## Arch Linux
 
-Thanks to [Francois_B](https://www.sciunto.org/), there is an package available on
-[Arch linux's AUR](https://aur.archlinux.org/packages/mat2/).
+Thanks to [kpcyrd](https://archlinux.org/packages/?maintainer=kpcyrd), there is a package available in
+[Arch Linux's official repositories](https://archlinux.org/packages/community/any/mat2/).
 
 ## Debian
 
-There is a package available in [Debian](https://packages.debian.org/search?keywords=mat2&searchon=names&section=all).
+There is a package available in [Debian](https://packages.debian.org/search?keywords=mat2&searchon=names&section=all) and you can install mat2 with:
+
+```
+apt install mat2
+```
 
 ## Fedora
 
 Thanks to [atenart](https://ack.tf/), there is a package available on
 [Fedora's copr](https://copr.fedorainfracloud.org/coprs/atenart/mat2/).
 
-We use copr (cool other packages repo) as the Mat2 Nautilus plugin depends on
-python3-nautilus, which isn't available yet in Fedora (but is distributed
-through this copr).
-
-First you need to enable Mat2's copr:
+First you need to enable mat2's copr:
 
 ```
 dnf -y copr enable atenart/mat2
 ```
 
-Then you can install both the Mat2 command and Nautilus extension:
+Then you can install mat2:
 
 ```
-dnf -y install mat2 mat2-nautilus
+dnf -y install mat2
 ```
 
 ## Gentoo


=====================================
README.md
=====================================
@@ -1,8 +1,8 @@
 ```
  _____ _____ _____ ___
 |     |  _  |_   _|_  |  Keep your data,
-| | | |     | | | |  _|     trash your meta!
-|_|_|_|__|__| |_| |___|
+| | | | |_| | | | |  _|     trash your meta!
+|_|_|_|_| |_| |_| |___|
 
 ```
 
@@ -22,9 +22,14 @@ Maybe you don't want to disclose those information.
 This is precisely the job of mat2: getting rid, as much as possible, of
 metadata.
 
-mat2 provides a command line tool, and graphical user interfaces via a service
-menu for Dolphin, the default file manager of KDE, and an extension for
-Nautilus, the default file manager of GNOME.
+mat2 provides:
+- a library called `libmat2`;
+- a command line tool called `mat2`;
+- a service menu for Dolphin, KDE's default file manager.
+
+If you prefer a regular graphical user interface, you might be interested in
+[Metadata Cleaner](https://metadatacleaner.romainvigier.fr/), which uses
+`mat2` under the hood.
 
 # Requirements
 

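As a quick orientation for the README block above: the library and the command line tool share the same entry points. The sketch below is illustrative only, built from the `parser_factory.get_parser()` / `get_meta()` / `remove_all()` calls that appear elsewhere in this diff; the input filename is hypothetical.

```python
# Minimal sketch: cleaning one file through libmat2 (API as seen in this diff).
from libmat2 import parser_factory

parser, mimetype = parser_factory.get_parser('./report.docx')  # hypothetical file
if parser is None:
    print('Unsupported or invalid file (detected mimetype: %s)' % mimetype)
else:
    print(parser.get_meta())  # dict of metadata found in the file
    parser.remove_all()       # writes a cleaned copy (output name not shown in this diff)
```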

=====================================
doc/mat2.1
=====================================
@@ -1,4 +1,4 @@
-.TH mat2 "1" "July 2022" "mat2 0.13.0" "User Commands"
+.TH mat2 "1" "January 2023" "mat2 0.13.1" "User Commands"
 
 .SH NAME
 mat2 \- the metadata anonymisation toolkit 2


=====================================
libmat2/__init__.py
=====================================
@@ -2,12 +2,11 @@
 
 import enum
 import importlib
-from typing import Dict, Optional, Union
+from typing import Optional, Union
 
 from . import exiftool, video
 
 # make pyflakes happy
-assert Dict
 assert Optional
 assert Union
 
@@ -67,8 +66,8 @@ CMD_DEPENDENCIES = {
     },
 }
 
-def check_dependencies() -> Dict[str, Dict[str, bool]]:
-    ret = dict()  # type: Dict[str, dict]
+def check_dependencies() -> dict[str, dict[str, bool]]:
+    ret = dict()  # type: dict[str, dict]
 
     for key, value in DEPENDENCIES.items():
         ret[key] = {

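A pattern worth noting, since it repeats through almost every file below: the `typing.Dict`/`List`/`Set`/`Tuple` aliases are replaced with the builtin generics standardised by PEP 585, which are usable in annotations on Python 3.9 and later. Restating the signature from the hunk above as a before/after:

```python
# Before 0.13.1: typing aliases
from typing import Dict

def check_dependencies() -> Dict[str, Dict[str, bool]]: ...

# 0.13.1: PEP 585 builtin generics (Python >= 3.9)
def check_dependencies() -> dict[str, dict[str, bool]]: ...
```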

=====================================
libmat2/abstract.py
=====================================
@@ -1,9 +1,7 @@
 import abc
 import os
 import re
-from typing import Set, Dict, Union
-
-assert Set  # make pyflakes happy
+from typing import Union
 
 
 class AbstractParser(abc.ABC):
@@ -11,8 +9,8 @@ class AbstractParser(abc.ABC):
     It might yield `ValueError` on instantiation on invalid files,
     and `RuntimeError` when something went wrong in `remove_all`.
     """
-    meta_list = set()  # type: Set[str]
-    mimetypes = set()  # type: Set[str]
+    meta_list = set()  # type: set[str]
+    mimetypes = set()  # type: set[str]
 
     def __init__(self, filename: str) -> None:
         """
@@ -35,7 +33,7 @@ class AbstractParser(abc.ABC):
         self.sandbox = True
 
     @abc.abstractmethod
-    def get_meta(self) -> Dict[str, Union[str, dict]]:
+    def get_meta(self) -> dict[str, Union[str, dict]]:
         """Return all the metadata of the current file"""
 
     @abc.abstractmethod


=====================================
libmat2/archive.py
=====================================
@@ -7,14 +7,10 @@ import tempfile
 import os
 import logging
 import shutil
-from typing import Dict, Set, Pattern, Union, Any, List
+from typing import Pattern, Union, Any
 
 from . import abstract, UnknownMemberPolicy, parser_factory
 
-# Make pyflakes happy
-assert Set
-assert Pattern
-
 # pylint: disable=not-callable,assignment-from-no-return,too-many-branches
 
 # An ArchiveClass is a class representing an archive,
@@ -53,11 +49,11 @@ class ArchiveBasedAbstractParser(abstract.AbstractParser):
 
         # Those are the files that have a format that _isn't_
         # supported by mat2, but that we want to keep anyway.
-        self.files_to_keep = set()  # type: Set[Pattern]
+        self.files_to_keep = set()  # type: set[Pattern]
 
         # Those are the files that we _do not_ want to keep,
         # no matter if they are supported or not.
-        self.files_to_omit = set()  # type: Set[Pattern]
+        self.files_to_omit = set()  # type: set[Pattern]
 
         # what should the parser do if it encounters an unknown file in
         # the archive?
@@ -73,25 +69,25 @@ class ArchiveBasedAbstractParser(abstract.AbstractParser):
     def _specific_cleanup(self, full_path: str) -> bool:
         """ This method can be used to apply specific treatment
         to files present in the archive."""
-        # pylint: disable=unused-argument,no-self-use
+        # pylint: disable=unused-argument
         return True  # pragma: no cover
 
-    def _specific_get_meta(self, full_path: str, file_path: str) -> Dict[str, Any]:
+    def _specific_get_meta(self, full_path: str, file_path: str) -> dict[str, Any]:
         """ This method can be used to extract specific metadata
         from files present in the archive."""
-        # pylint: disable=unused-argument,no-self-use
+        # pylint: disable=unused-argument
         return {}  # pragma: no cover
 
     def _final_checks(self) -> bool:
         """ This method is invoked after the file has been cleaned,
         allowing to run final verifications.
         """
-        # pylint: disable=unused-argument,no-self-use
+        # pylint: disable=unused-argument
         return True
 
     @staticmethod
     @abc.abstractmethod
-    def _get_all_members(archive: ArchiveClass) -> List[ArchiveMember]:
+    def _get_all_members(archive: ArchiveClass) -> list[ArchiveMember]:
         """Return all the members of the archive."""
 
     @staticmethod
@@ -101,7 +97,7 @@ class ArchiveBasedAbstractParser(abstract.AbstractParser):
 
     @staticmethod
     @abc.abstractmethod
-    def _get_member_meta(member: ArchiveMember) -> Dict[str, str]:
+    def _get_member_meta(member: ArchiveMember) -> dict[str, str]:
         """Return all the metadata of a given member."""
 
     @staticmethod
@@ -132,8 +128,8 @@ class ArchiveBasedAbstractParser(abstract.AbstractParser):
         # pylint: disable=unused-argument
         return member
 
-    def get_meta(self) -> Dict[str, Union[str, dict]]:
-        meta = dict()  # type: Dict[str, Union[str, dict]]
+    def get_meta(self) -> dict[str, Union[str, dict]]:
+        meta = dict()  # type: dict[str, Union[str, dict]]
 
         with self.archive_class(self.filename) as zin:
             temp_folder = tempfile.mkdtemp()
@@ -174,7 +170,7 @@ class ArchiveBasedAbstractParser(abstract.AbstractParser):
 
             # Sort the items to process, to reduce fingerprinting,
             # and keep them in the `items` variable.
-            items = list()  # type: List[ArchiveMember]
+            items = list()  # type: list[ArchiveMember]
             for item in sorted(self._get_all_members(zin), key=self._get_member_name):
                 # Some fileformats do require to have the `mimetype` file
                 # as the first file in the archive.
@@ -340,7 +336,7 @@ class TarParser(ArchiveBasedAbstractParser):
         return member
 
     @staticmethod
-    def _get_member_meta(member: ArchiveMember) -> Dict[str, str]:
+    def _get_member_meta(member: ArchiveMember) -> dict[str, str]:
         assert isinstance(member, tarfile.TarInfo)  # please mypy
         metadata = {}
         if member.mtime != 0:
@@ -362,7 +358,7 @@ class TarParser(ArchiveBasedAbstractParser):
         archive.add(full_path, member.name, filter=TarParser._clean_member)  # type: ignore
 
     @staticmethod
-    def _get_all_members(archive: ArchiveClass) -> List[ArchiveMember]:
+    def _get_all_members(archive: ArchiveClass) -> list[ArchiveMember]:
         assert isinstance(archive, tarfile.TarFile)  # please mypy
         return archive.getmembers()  # type: ignore
 
@@ -416,7 +412,7 @@ class ZipParser(ArchiveBasedAbstractParser):
         return member
 
     @staticmethod
-    def _get_member_meta(member: ArchiveMember) -> Dict[str, str]:
+    def _get_member_meta(member: ArchiveMember) -> dict[str, str]:
         assert isinstance(member, zipfile.ZipInfo)  # please mypy
         metadata = {}
         if member.create_system == 3:  # this is Linux
@@ -443,7 +439,7 @@ class ZipParser(ArchiveBasedAbstractParser):
                              compress_type=member.compress_type)
 
     @staticmethod
-    def _get_all_members(archive: ArchiveClass) -> List[ArchiveMember]:
+    def _get_all_members(archive: ArchiveClass) -> list[ArchiveMember]:
         assert isinstance(archive, zipfile.ZipFile)  # please mypy
         return archive.infolist()  # type: ignore
 


=====================================
libmat2/audio.py
=====================================
@@ -2,7 +2,7 @@ import mimetypes
 import os
 import shutil
 import tempfile
-from typing import Dict, Union
+from typing import Union
 
 import mutagen
 
@@ -18,7 +18,7 @@ class MutagenParser(abstract.AbstractParser):
         except mutagen.MutagenError:
             raise ValueError
 
-    def get_meta(self) -> Dict[str, Union[str, dict]]:
+    def get_meta(self) -> dict[str, Union[str, dict]]:
         f = mutagen.File(self.filename)
         if f.tags:
             return {k:', '.join(map(str, v)) for k, v in f.tags.items()}
@@ -38,8 +38,8 @@ class MutagenParser(abstract.AbstractParser):
 class MP3Parser(MutagenParser):
     mimetypes = {'audio/mpeg', }
 
-    def get_meta(self) -> Dict[str, Union[str, dict]]:
-        metadata = {}  # type: Dict[str, Union[str, dict]]
+    def get_meta(self) -> dict[str, Union[str, dict]]:
+        metadata = {}  # type: dict[str, Union[str, dict]]
         meta = mutagen.File(self.filename).tags
         if not meta:
             return metadata
@@ -68,7 +68,7 @@ class FLACParser(MutagenParser):
         f.save(deleteid3=True)
         return True
 
-    def get_meta(self) -> Dict[str, Union[str, dict]]:
+    def get_meta(self) -> dict[str, Union[str, dict]]:
         meta = super().get_meta()
         for num, picture in enumerate(mutagen.File(self.filename).pictures):
             name = picture.desc if picture.desc else 'Cover %d' % num


=====================================
libmat2/bubblewrap.py
=====================================
@@ -11,7 +11,8 @@ import os
 import shutil
 import subprocess
 import tempfile
-from typing import List, Optional
+import functools
+from typing import Optional
 
 
 __all__ = ['PIPE', 'run', 'CalledProcessError']
@@ -21,6 +22,7 @@ CalledProcessError = subprocess.CalledProcessError
 # pylint: disable=subprocess-run-check
 
 
+@functools.lru_cache
 def _get_bwrap_path() -> str:
     which_path = shutil.which('bwrap')
     if which_path:
@@ -31,7 +33,7 @@ def _get_bwrap_path() -> str:
 
 def _get_bwrap_args(tempdir: str,
                     input_filename: str,
-                    output_filename: Optional[str] = None) -> List[str]:
+                    output_filename: Optional[str] = None) -> list[str]:
     ro_bind_args = []
     cwd = os.getcwd()
 
@@ -76,7 +78,7 @@ def _get_bwrap_args(tempdir: str,
     return args
 
 
-def run(args: List[str],
+def run(args: list[str],
         input_filename: str,
         output_filename: Optional[str] = None,
         **kwargs) -> subprocess.CompletedProcess:

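The newly added `@functools.lru_cache` on `_get_bwrap_path()` means the `shutil.which('bwrap')` lookup is performed once per process and then served from a cache. A self-contained illustration of the pattern follows; the helper name is invented for the example and is not part of mat2.

```python
import functools
import shutil

@functools.lru_cache
def find_tool(name: str) -> str:
    """Resolve an executable on PATH once; repeat calls hit the cache."""
    path = shutil.which(name)
    if path is None:
        raise RuntimeError('Unable to find %s' % name)
    return path

find_tool('bwrap')  # performs the PATH lookup
find_tool('bwrap')  # cache hit: shutil.which() is not called again
```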

=====================================
libmat2/epub.py
=====================================
@@ -3,7 +3,7 @@ import re
 import uuid
 import zipfile
 import xml.etree.ElementTree as ET  # type: ignore
-from typing import Dict, Any
+from typing import Any
 
 from . import archive, office
 
@@ -37,7 +37,7 @@ class EPUBParser(archive.ZipParser):
                 if member_name.endswith('META-INF/encryption.xml'):
                     raise ValueError('the file contains encrypted fonts')
 
-    def _specific_get_meta(self, full_path, file_path) -> Dict[str, Any]:
+    def _specific_get_meta(self, full_path, file_path) -> dict[str, Any]:
         if not file_path.endswith('.opf'):
             return {}
 


=====================================
libmat2/exiftool.py
=====================================
@@ -4,23 +4,20 @@ import logging
 import os
 import shutil
 import subprocess
-from typing import Dict, Union, Set
+from typing import Union
 
 from . import abstract
 from . import bubblewrap
 
-# Make pyflakes happy
-assert Set
-
 
 class ExiftoolParser(abstract.AbstractParser):
     """ Exiftool is often the easiest way to get all the metadata
     from a file, hence why several parsers are re-using its `get_meta`
     method.
     """
-    meta_allowlist = set()  # type: Set[str]
+    meta_allowlist = set()  # type: set[str]
 
-    def get_meta(self) -> Dict[str, Union[str, dict]]:
+    def get_meta(self) -> dict[str, Union[str, dict]]:
         try:
             if self.sandbox:
                 out = bubblewrap.run([_get_exiftool_path(), '-json',
@@ -70,7 +67,7 @@ class ExiftoolParser(abstract.AbstractParser):
             return False
         return True
 
-@functools.lru_cache()
+@functools.lru_cache
 def _get_exiftool_path() -> str:  # pragma: no cover
     which_path = shutil.which('exiftool')
     if which_path:


=====================================
libmat2/harmless.py
=====================================
@@ -1,5 +1,5 @@
 import shutil
-from typing import Dict, Union
+from typing import Union
 from . import abstract
 
 
@@ -7,7 +7,7 @@ class HarmlessParser(abstract.AbstractParser):
     """ This is the parser for filetypes that can not contain metadata. """
     mimetypes = {'text/plain', 'image/x-ms-bmp'}
 
-    def get_meta(self) -> Dict[str, Union[str, dict]]:
+    def get_meta(self) -> dict[str, Union[str, dict]]:
         return dict()
 
     def remove_all(self) -> bool:


=====================================
libmat2/images.py
=====================================
@@ -1,7 +1,7 @@
 import imghdr
 import os
 import re
-from typing import Set, Dict, Union, Any
+from typing import Union, Any
 
 import cairo
 
@@ -13,7 +13,6 @@ from gi.repository import GdkPixbuf, GLib, Rsvg
 from . import exiftool, abstract
 
 # Make pyflakes happy
-assert Set
 assert Any
 
 class SVGParser(exiftool.ExiftoolParser):
@@ -50,7 +49,7 @@ class SVGParser(exiftool.ExiftoolParser):
         surface.finish()
         return True
 
-    def get_meta(self) -> Dict[str, Union[str, dict]]:
+    def get_meta(self) -> dict[str, Union[str, dict]]:
         meta = super().get_meta()
 
         # The namespace is mandatory, but only the …/2000/svg is valid.
@@ -165,8 +164,8 @@ class TiffParser(GdkPixbufAbstractParser):
 class PPMParser(abstract.AbstractParser):
     mimetypes = {'image/x-portable-pixmap'}
 
-    def get_meta(self) -> Dict[str, Union[str, dict]]:
-        meta = {}  # type: Dict[str, Union[str, Dict[Any, Any]]]
+    def get_meta(self) -> dict[str, Union[str, dict]]:
+        meta = {}  # type: dict[str, Union[str, dict[Any, Any]]]
         with open(self.filename) as f:
             for idx, line in enumerate(f):
                 if line.lstrip().startswith('#'):


=====================================
libmat2/office.py
=====================================
@@ -4,7 +4,7 @@ import logging
 import os
 import re
 import zipfile
-from typing import Dict, Set, Pattern, Tuple, Any
+from typing import Pattern, Any
 
 import xml.etree.ElementTree as ET  # type: ignore
 
@@ -12,11 +12,7 @@ from .archive import ZipParser
 
 # pylint: disable=line-too-long
 
-# Make pyflakes happy
-assert Set
-assert Pattern
-
-def _parse_xml(full_path: str) -> Tuple[ET.ElementTree, Dict[str, str]]:
+def _parse_xml(full_path: str) -> tuple[ET.ElementTree, dict[str, str]]:
     """ This function parses XML, with namespace support. """
     namespace_map = dict()
     for _, (key, value) in ET.iterparse(full_path, ("start-ns", )):
@@ -92,6 +88,10 @@ class MSOfficeParser(ZipParser):
             r'^(?:word|ppt|xl)/_rels/document\.xml\.rels$',
             r'^(?:word|ppt|xl)/_rels/footer[0-9]*\.xml\.rels$',
             r'^(?:word|ppt|xl)/_rels/header[0-9]*\.xml\.rels$',
+            r'^(?:word|ppt|xl)/charts/_rels/chart[0-9]+\.xml\.rels$',
+            r'^(?:word|ppt|xl)/charts/colors[0-9]+\.xml$',
+            r'^(?:word|ppt|xl)/charts/style[0-9]+\.xml$',
+            r'^(?:word|ppt|xl)/drawings/_rels/drawing[0-9]+\.xml\.rels$',
             r'^(?:word|ppt|xl)/styles\.xml$',
             # TODO: randomize axId ( https://docs.microsoft.com/en-us/openspecs/office_standards/ms-oi29500/089f849f-fcd6-4fa0-a281-35aa6a432a16 )
             r'^(?:word|ppt|xl)/charts/chart[0-9]*\.xml$',
@@ -148,7 +148,7 @@ class MSOfficeParser(ZipParser):
                 return False
             xml_data = zin.read('[Content_Types].xml')
 
-        self.content_types = dict()  # type: Dict[str, str]
+        self.content_types = dict()  # type: dict[str, str]
         try:
             tree = ET.fromstring(xml_data)
         except ET.ParseError:
@@ -431,7 +431,7 @@ class MSOfficeParser(ZipParser):
 
         return True
 
-    def _specific_get_meta(self, full_path: str, file_path: str) -> Dict[str, Any]:
+    def _specific_get_meta(self, full_path: str, file_path: str) -> dict[str, Any]:
         """
         Yes, I know that parsing xml with regexp ain't pretty,
         be my guest and fix it if you want.
@@ -512,7 +512,7 @@ class LibreOfficeParser(ZipParser):
                 return False
         return True
 
-    def _specific_get_meta(self, full_path: str, file_path: str) -> Dict[str, Any]:
+    def _specific_get_meta(self, full_path: str, file_path: str) -> dict[str, Any]:
         """
         Yes, I know that parsing xml with regexp ain't pretty,
         be my guest and fix it if you want.

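The four new patterns in the first hunk above appear to be the "Improve xlsx support" item from the changelog: chart colour/style parts and their relationship files inside the OOXML archive are now matched explicitly. A quick, illustrative check of what those regexes cover (the member paths are hypothetical examples):

```python
import re

new_patterns = [
    r'^(?:word|ppt|xl)/charts/_rels/chart[0-9]+\.xml\.rels$',
    r'^(?:word|ppt|xl)/charts/colors[0-9]+\.xml$',
    r'^(?:word|ppt|xl)/charts/style[0-9]+\.xml$',
    r'^(?:word|ppt|xl)/drawings/_rels/drawing[0-9]+\.xml\.rels$',
]

for member in ('xl/charts/colors1.xml', 'xl/drawings/_rels/drawing1.xml.rels'):
    print(member, any(re.match(p, member) for p in new_patterns))  # both print True
```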

=====================================
libmat2/parser_factory.py
=====================================
@@ -2,7 +2,7 @@ import glob
 import os
 import mimetypes
 import importlib
-from typing import TypeVar, List, Tuple, Optional
+from typing import TypeVar, Optional
 
 from . import abstract, UNSUPPORTED_EXTENSIONS
 
@@ -34,7 +34,7 @@ def __load_all_parsers():
 __load_all_parsers()
 
 
-def _get_parsers() -> List[T]:
+def _get_parsers() -> list[T]:
     """ Get all our parsers!"""
     def __get_parsers(cls):
         return cls.__subclasses__() + \
@@ -42,7 +42,7 @@ def _get_parsers() -> List[T]:
     return __get_parsers(abstract.AbstractParser)
 
 
-def get_parser(filename: str) -> Tuple[Optional[T], Optional[str]]:
+def get_parser(filename: str) -> tuple[Optional[T], Optional[str]]:
     """ Return the appropriate parser for a given filename.
 
         :raises ValueError: Raised if the instantiation of the parser went wrong.


=====================================
libmat2/pdf.py
=====================================
@@ -7,8 +7,7 @@ import re
 import logging
 import tempfile
 import io
-from typing import Dict, Union
-from distutils.version import LooseVersion
+from typing import Union
 
 import cairo
 import gi
@@ -17,11 +16,6 @@ from gi.repository import Poppler, GLib
 
 from . import abstract
 
-poppler_version = Poppler.get_version()
-if LooseVersion(poppler_version) < LooseVersion('0.46'):  # pragma: no cover
-    raise ValueError("mat2 needs at least Poppler version 0.46 to work. \
-The installed version is %s." % poppler_version)  # pragma: no cover
-
 FIXED_PDF_VERSION = cairo.PDFVersion.VERSION_1_5
 
 class PDFParser(abstract.AbstractParser):
@@ -146,13 +140,13 @@ class PDFParser(abstract.AbstractParser):
         return True
 
     @staticmethod
-    def __parse_metadata_field(data: str) -> Dict[str, str]:
+    def __parse_metadata_field(data: str) -> dict[str, str]:
         metadata = {}
         for (_, key, value) in re.findall(r"<(xmp|pdfx|pdf|xmpMM):(.+)>(.+)</\1:\2>", data, re.I):
             metadata[key] = value
         return metadata
 
-    def get_meta(self) -> Dict[str, Union[str, dict]]:
+    def get_meta(self) -> dict[str, Union[str, dict]]:
         """ Return a dict with all the meta of the file
         """
         metadata = {}


=====================================
libmat2/torrent.py
=====================================
@@ -1,5 +1,5 @@
 import logging
-from typing import Union, Tuple, Dict
+from typing import Union
 
 from . import abstract
 
@@ -15,7 +15,7 @@ class TorrentParser(abstract.AbstractParser):
         if self.dict_repr is None:
             raise ValueError
 
-    def get_meta(self) -> Dict[str, Union[str, dict]]:
+    def get_meta(self) -> dict[str, Union[str, dict]]:
         metadata = {}
         for key, value in self.dict_repr.items():
             if key not in self.allowlist:
@@ -56,7 +56,7 @@ class _BencodeHandler:
         }
 
     @staticmethod
-    def __decode_int(s: bytes) -> Tuple[int, bytes]:
+    def __decode_int(s: bytes) -> tuple[int, bytes]:
         s = s[1:]
         next_idx = s.index(b'e')
         if s.startswith(b'-0'):
@@ -66,7 +66,7 @@ class _BencodeHandler:
         return int(s[:next_idx]), s[next_idx+1:]
 
     @staticmethod
-    def __decode_string(s: bytes) -> Tuple[bytes, bytes]:
+    def __decode_string(s: bytes) -> tuple[bytes, bytes]:
         colon = s.index(b':')
         # FIXME Python3 is broken here, the call to `ord` shouldn't be needed,
         # but apparently it is. This is utterly idiotic.
@@ -76,7 +76,7 @@ class _BencodeHandler:
         s = s[1:]
         return s[colon:colon+str_len], s[colon+str_len:]
 
-    def __decode_list(self, s: bytes) -> Tuple[list, bytes]:
+    def __decode_list(self, s: bytes) -> tuple[list, bytes]:
         ret = list()
         s = s[1:]  # skip leading `l`
         while s[0] != ord('e'):
@@ -84,7 +84,7 @@ class _BencodeHandler:
             ret.append(value)
         return ret, s[1:]
 
-    def __decode_dict(self, s: bytes) -> Tuple[dict, bytes]:
+    def __decode_dict(self, s: bytes) -> tuple[dict, bytes]:
         ret = dict()
         s = s[1:]  # skip leading `d`
         while s[0] != ord(b'e'):

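For context on the private `__decode_*` helpers touched above: they walk the bencode encoding used by torrent files, where integers are written as `i<digits>e`, strings as `<length>:<bytes>`, lists as `l...e` and dictionaries as `d...e`. The mapping below is only an illustration of that format, not code from mat2:

```python
# Bencode samples and the Python values a decoder produces for them:
samples = {
    b'i42e':           42,                  # integer
    b'4:mat2':         b'mat2',             # length-prefixed string
    b'l4:spam4:eggse': [b'spam', b'eggs'],  # list
    b'd3:cow3:mooe':   {b'cow': b'moo'},    # dictionary
}
```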

=====================================
libmat2/video.py
=====================================
@@ -3,7 +3,7 @@ import functools
 import shutil
 import logging
 
-from typing import Dict, Union
+from typing import Union
 
 from . import exiftool
 from . import bubblewrap
@@ -12,7 +12,7 @@ from . import bubblewrap
 class AbstractFFmpegParser(exiftool.ExiftoolParser):
     """ Abstract parser for all FFmpeg-based ones, mainly for video. """
     # Some fileformats have mandatory metadata fields
-    meta_key_value_allowlist = {}  # type: Dict[str, Union[str, int]]
+    meta_key_value_allowlist = {}  # type: dict[str, Union[str, int]]
 
     def remove_all(self) -> bool:
         if self.meta_key_value_allowlist:
@@ -45,10 +45,10 @@ class AbstractFFmpegParser(exiftool.ExiftoolParser):
             return False
         return True
 
-    def get_meta(self) -> Dict[str, Union[str, dict]]:
+    def get_meta(self) -> dict[str, Union[str, dict]]:
         meta = super().get_meta()
 
-        ret = dict()  # type: Dict[str, Union[str, dict]]
+        ret = dict()  # type: dict[str, Union[str, dict]]
         for key, value in meta.items():
             if key in self.meta_key_value_allowlist:
                 if value == self.meta_key_value_allowlist[key]:
@@ -91,11 +91,11 @@ class AVIParser(AbstractFFmpegParser):
                       'VideoFrameRate', 'VideoFrameCount', 'Quality',
                       'SampleSize', 'BMPVersion', 'ImageWidth', 'ImageHeight',
                       'Planes', 'BitDepth', 'Compression', 'ImageLength',
-                      'PixelsPerMeterX', 'PixelsPerMeterY', 'NumColors',
-                      'NumImportantColors', 'NumColors', 'NumImportantColors',
+                      'PixelsPerMeterX', 'PixelsPerMeterY',
+                      'NumImportantColors', 'NumColors',
                       'RedMask', 'GreenMask', 'BlueMask', 'AlphaMask',
                       'ColorSpace', 'AudioCodec', 'AudioCodecRate',
-                      'AudioSampleCount', 'AudioSampleCount',
+                      'AudioSampleCount',
                       'AudioSampleRate', 'Encoding', 'NumChannels',
                       'SampleRate', 'AvgBytesPerSec', 'BitsPerSample',
                       'Duration', 'ImageSize', 'Megapixels'}


=====================================
libmat2/web.py
=====================================
@@ -1,11 +1,10 @@
 from html import parser, escape
-from typing import Dict, Any, List, Tuple, Set, Optional
+from typing import Any, Optional
 import re
 import string
 
 from . import abstract
 
-assert Set
 
 # pylint: disable=too-many-instance-attributes
 
@@ -26,7 +25,7 @@ class CSSParser(abstract.AbstractParser):
             f.write(cleaned)
         return True
 
-    def get_meta(self) -> Dict[str, Any]:
+    def get_meta(self) -> dict[str, Any]:
         metadata = {}
         with open(self.filename, encoding='utf-8') as f:
             try:
@@ -45,10 +44,10 @@ class CSSParser(abstract.AbstractParser):
 
 
 class AbstractHTMLParser(abstract.AbstractParser):
-    tags_blocklist = set()  # type: Set[str]
+    tags_blocklist = set()  # type: set[str]
     # In some html/xml-based formats some tags are mandatory,
     # so we're keeping them, but are discarding their content
-    tags_required_blocklist = set()  # type: Set[str]
+    tags_required_blocklist = set()  # type: set[str]
 
     def __init__(self, filename):
         super().__init__(filename)
@@ -58,7 +57,7 @@ class AbstractHTMLParser(abstract.AbstractParser):
             self.__parser.feed(f.read())
         self.__parser.close()
 
-    def get_meta(self) -> Dict[str, Any]:
+    def get_meta(self) -> dict[str, Any]:
         return self.__parser.get_meta()
 
     def remove_all(self) -> bool:
@@ -92,7 +91,7 @@ class _HTMLParser(parser.HTMLParser):
         self.filename = filename
         self.__textrepr = ''
         self.__meta = {}
-        self.__validation_queue = []  # type: List[str]
+        self.__validation_queue = []  # type: list[str]
 
         # We're using counters instead of booleans, to handle nested tags
         self.__in_dangerous_but_required_tag = 0
@@ -104,7 +103,6 @@ class _HTMLParser(parser.HTMLParser):
         self.tag_required_blocklist = required_blocklisted_tags
         self.tag_blocklist = blocklisted_tags
 
-    # pylint: disable=R0201
     def error(self, message):  # pragma: no cover
         """ Amusingly, Python's documentation doesn't mention that this
         function needs to be implemented in subclasses of the parent class
@@ -114,7 +112,7 @@ class _HTMLParser(parser.HTMLParser):
         """
         raise ValueError(message)
 
-    def handle_starttag(self, tag: str, attrs: List[Tuple[str, Optional[str]]]):
+    def handle_starttag(self, tag: str, attrs: list[tuple[str, Optional[str]]]):
         # Ignore the type, because mypy is too stupid to infer
         # that get_starttag_text() can't return None.
         original_tag = self.get_starttag_text()  # type: ignore
@@ -161,7 +159,7 @@ class _HTMLParser(parser.HTMLParser):
                     self.__textrepr += escape(data)
 
     def handle_startendtag(self, tag: str,
-                           attrs: List[Tuple[str, Optional[str]]]):
+                           attrs: list[tuple[str, Optional[str]]]):
         if tag in self.tag_required_blocklist | self.tag_blocklist:
             meta = {k:v for k, v in attrs}
             name = meta.get('name', 'harmful metadata')
@@ -186,7 +184,7 @@ class _HTMLParser(parser.HTMLParser):
             f.write(self.__textrepr)
         return True
 
-    def get_meta(self) -> Dict[str, Any]:
+    def get_meta(self) -> dict[str, Any]:
         if self.__validation_queue:
             raise ValueError("Some tags (%s) were left unclosed in %s" % (
                 ', '.join(self.__validation_queue),


=====================================
mat2
=====================================
@@ -2,7 +2,6 @@
 
 import os
 import shutil
-from typing import Tuple, List, Union, Set
 import sys
 import mimetypes
 import argparse
@@ -17,12 +16,7 @@ except ValueError as ex:
     print(ex)
     sys.exit(1)
 
-__version__ = '0.13.0'
-
-# Make pyflakes happy
-assert Set
-assert Tuple
-assert Union
+__version__ = '0.13.1'
 
 logging.basicConfig(format='%(levelname)s: %(message)s', level=logging.WARNING)
 
@@ -41,7 +35,7 @@ def __check_file(filename: str, mode: int = os.R_OK) -> bool:
         __print_without_chars("[-] %s is not a regular file." % filename)
         return False
     elif not os.access(filename, mode):
-        mode_str = []  # type: List[str]
+        mode_str = []  # type: list[str]
         if mode & os.R_OK:
             mode_str += 'readable'
         if mode & os.W_OK:
@@ -157,10 +151,10 @@ def clean_meta(filename: str, is_lightweight: bool, inplace: bool, sandbox: bool
 
 def show_parsers():
     print('[+] Supported formats:')
-    formats = set()  # Set[str]
+    formats = set()  # set[str]
     for parser in parser_factory._get_parsers():  # type: ignore
         for mtype in parser.mimetypes:
-            extensions = set()  # Set[str]
+            extensions = set()  # set[str]
             for extension in mimetypes.guess_all_extensions(mtype):
                 if extension not in UNSUPPORTED_EXTENSIONS:
                     extensions.add(extension)
@@ -172,8 +166,8 @@ def show_parsers():
     __print_without_chars('\n'.join(sorted(formats)))
 
 
-def __get_files_recursively(files: List[str]) -> List[str]:
-    ret = set()  # type: Set[str]
+def __get_files_recursively(files: list[str]) -> list[str]:
+    ret = set()  # type: set[str]
     for f in files:
         if os.path.isdir(f):
             for path, _, _files in os.walk(f):


=====================================
nautilus/README.md deleted
=====================================
@@ -1,15 +0,0 @@
-# mat2's Nautilus extension
-
-# Dependencies
-
-- Nautilus (now known as [Files](https://wiki.gnome.org/action/show/Apps/Files))
-- [nautilus-python](https://gitlab.gnome.org/GNOME/nautilus-python) >= 2.10
-
-# Installation
-
-Simply copy the `mat2.py` file to `~/.local/share/nautilus-python/extensions`,
-and launch Nautilus; you should now have a "Remove metadata" item in the
-right-click menu on supported files.
-
-Please note: This is not needed if using a distribution provided package. It
-only applies if installing from source.


=====================================
nautilus/mat2.py deleted
=====================================
@@ -1,247 +0,0 @@
-#!/usr/bin/env python3
-
-"""
-Because writing GUI is non-trivial (cf. https://0xacab.org/jvoisin/mat2/issues/3),
-we decided to write a Nautilus extension instead
-(cf. https://0xacab.org/jvoisin/mat2/issues/2).
-
-The code is a little bit convoluted because Gtk isn't thread-safe,
-so we're not allowed to call anything Gtk-related outside of the main
-thread, so we'll have to resort to using a `queue` to pass "messages" around.
-"""
-
-# pylint: disable=no-name-in-module,unused-argument,no-self-use,import-error
-
-import queue
-import threading
-from typing import Tuple, Optional, List
-from urllib.parse import unquote
-import gettext
-
-import gi
-gi.require_version('Nautilus', '3.0')
-gi.require_version('Gtk', '3.0')
-gi.require_version('GdkPixbuf', '2.0')
-from gi.repository import Nautilus, GObject, Gtk, Gio, GLib, GdkPixbuf
-
-from libmat2 import parser_factory
-
-_ = gettext.gettext
-
-
-def _remove_metadata(fpath) -> Tuple[bool, Optional[str]]:
-    """ This is a simple wrapper around libmat2, because it's
-    easier and cleaner this way.
-    """
-    parser, mtype = parser_factory.get_parser(fpath)
-    if parser is None:
-        return False, mtype
-    return parser.remove_all(), mtype
-
-class Mat2Extension(GObject.GObject, Nautilus.MenuProvider, Nautilus.LocationWidgetProvider):
-    """ This class adds an item to the right-click menu in Nautilus. """
-
-    def __init__(self):
-        super().__init__()
-        self.infobar_hbox = None
-        self.infobar = None
-        self.failed_items = list()
-
-    def __infobar_failure(self):
-        """ Add an hbox to the `infobar` warning about the fact that we didn't
-        manage to remove the metadata from every single file.
-        """
-        self.infobar.set_show_close_button(True)
-        self.infobar_hbox = Gtk.Box(orientation=Gtk.Orientation.HORIZONTAL)
-
-        btn = Gtk.Button(_("Show"))
-        btn.connect("clicked", self.__cb_show_failed)
-        self.infobar_hbox.pack_end(btn, False, False, 0)
-
-        infobar_msg = Gtk.Label(_("Failed to clean some items"))
-        self.infobar_hbox.pack_start(infobar_msg, False, False, 0)
-
-        self.infobar.get_content_area().pack_start(self.infobar_hbox, True, True, 0)
-        self.infobar.show_all()
-
-    def get_widget(self, uri, window) -> Gtk.Widget:
-        """ This is the method that we have to implement (because we're
-        a LocationWidgetProvider) in order to show our infobar.
-        """
-        self.infobar = Gtk.InfoBar()
-        self.infobar.set_message_type(Gtk.MessageType.ERROR)
-        self.infobar.connect("response", self.__cb_infobar_response)
-
-        return self.infobar
-
-    def __cb_infobar_response(self, infobar, response):
-        """ Callback for the infobar close button.
-        """
-        if response == Gtk.ResponseType.CLOSE:
-            self.infobar_hbox.destroy()
-            self.infobar.hide()
-
-    def __cb_show_failed(self, button):
-        """ Callback to show a popup containing a list of files
-        that we didn't manage to clean.
-        """
-
-        # FIXME this should be done only once the window is destroyed
-        self.infobar_hbox.destroy()
-        self.infobar.hide()
-
-        window = Gtk.Window()
-        headerbar = Gtk.HeaderBar()
-        window.set_titlebar(headerbar)
-        headerbar.props.title = _("Metadata removal failed")
-
-        close_buton = Gtk.Button(_("Close"))
-        close_buton.connect("clicked", lambda _: window.close())
-        headerbar.pack_end(close_buton)
-
-        box = Gtk.Box(orientation=Gtk.Orientation.VERTICAL)
-        window.add(box)
-
-        box.add(self.__create_treeview())
-        window.show_all()
-
-    @staticmethod
-    def __validate(fileinfo) -> Tuple[bool, str]:
-        """ Validate if a given file FileInfo `fileinfo` can be processed.
-        Returns a boolean, and a textreason why"""
-        if fileinfo.get_uri_scheme() != "file" or fileinfo.is_directory():
-            return False, _("Not a file")
-        elif not fileinfo.can_write():
-            return False, _("Not writeable")
-        return True, ""
-
-    def __create_treeview(self) -> Gtk.TreeView:
-        liststore = Gtk.ListStore(GdkPixbuf.Pixbuf, str, str)
-        treeview = Gtk.TreeView(model=liststore)
-
-        renderer_pixbuf = Gtk.CellRendererPixbuf()
-        column_pixbuf = Gtk.TreeViewColumn("Icon", renderer_pixbuf, pixbuf=0)
-        treeview.append_column(column_pixbuf)
-
-        for idx, name in enumerate([_('File'), _('Reason')]):
-            renderer_text = Gtk.CellRendererText()
-            column_text = Gtk.TreeViewColumn(name, renderer_text, text=idx+1)
-            treeview.append_column(column_text)
-
-        for (fname, mtype, reason) in self.failed_items:
-            # This part is all about adding mimetype icons to the liststore
-            icon = Gio.content_type_get_icon('text/plain' if not mtype else mtype)
-            # in case we don't have the corresponding icon,
-            # we're adding `text/plain`, because we have this one for sure™
-            names = icon.get_names() + ['text/plain', ]
-            icon_theme = Gtk.IconTheme.get_default()
-            for name in names:
-                try:
-                    img = icon_theme.load_icon(name, Gtk.IconSize.BUTTON, 0)
-                    break
-                except GLib.GError:
-                    pass
-
-            liststore.append([img, fname, reason])
-
-        treeview.show_all()
-        return treeview
-
-    def __create_progressbar(self) -> Gtk.ProgressBar:
-        """ Create the progressbar used to notify that files are currently
-        being processed.
-        """
-        self.infobar.set_show_close_button(False)
-        self.infobar.set_message_type(Gtk.MessageType.INFO)
-        self.infobar_hbox = Gtk.Box(orientation=Gtk.Orientation.HORIZONTAL)
-
-        progressbar = Gtk.ProgressBar()
-        self.infobar_hbox.pack_start(progressbar, True, True, 0)
-        progressbar.set_show_text(True)
-
-        self.infobar.get_content_area().pack_start(self.infobar_hbox, True, True, 0)
-        self.infobar.show_all()
-
-        return progressbar
-
-    def __update_progressbar(self, processing_queue, progressbar) -> bool:
-        """ This method is run via `Glib.add_idle` to update the progressbar."""
-        try:
-            fname = processing_queue.get(block=False)
-        except queue.Empty:
-            return True
-
-        # `None` is the marker put in the queue to signal that every selected
-        # file was processed.
-        if fname is None:
-            self.infobar_hbox.destroy()
-            self.infobar.hide()
-            if self.failed_items:
-                self.__infobar_failure()
-            if not processing_queue.empty():
-                print("Something went wrong, the queue isn't empty :/")
-            return False
-
-        progressbar.pulse()
-        progressbar.set_text(_("Cleaning %s") % fname)
-        progressbar.show_all()
-        self.infobar_hbox.show_all()
-        self.infobar.show_all()
-        return True
-
-    def __clean_files(self, files: list, processing_queue: queue.Queue) -> bool:
-        """ This method is threaded in order to avoid blocking the GUI
-        while cleaning up the files.
-        """
-        for fileinfo in files:
-            fname = fileinfo.get_name()
-            processing_queue.put(fname)
-
-            valid, reason = self.__validate(fileinfo)
-            if not valid:
-                self.failed_items.append((fname, None, reason))
-                continue
-
-            fpath = unquote(fileinfo.get_uri()[7:])  # `len('file://') = 7`
-            success, mtype = _remove_metadata(fpath)
-            if not success:
-                self.failed_items.append((fname, mtype, _('Unsupported/invalid')))
-        processing_queue.put(None)  # signal that we processed all the files
-        return True
-
-    def __cb_menu_activate(self, menu, files):
-        """ This method is called when the user clicked the "clean metadata"
-        menu item.
-        """
-        self.failed_items = list()
-        progressbar = self.__create_progressbar()
-        progressbar.set_pulse_step = 1.0 / len(files)
-        self.infobar.show_all()
-
-        processing_queue = queue.Queue()
-        GLib.idle_add(self.__update_progressbar, processing_queue, progressbar)
-
-        thread = threading.Thread(target=self.__clean_files, args=(files, processing_queue))
-        thread.daemon = True
-        thread.start()
-
-    def get_background_items(self, window, file):
-        """ https://bugzilla.gnome.org/show_bug.cgi?id=784278 """
-        return None
-
-    def get_file_items(self, window, files) -> Optional[List[Nautilus.MenuItem]]:
-        """ This method is the one allowing us to create a menu item.
-        """
-        # Do not show the menu item if not a single file has a chance to be
-        # processed by mat2.
-        if not any((is_valid for (is_valid, _) in map(self.__validate, files))):
-            return None
-
-        item = Nautilus.MenuItem(
-            name="mat2::Remove_metadata",
-            label=_("Remove metadata"),
-            tip=_("Remove metadata")
-        )
-        item.connect('activate', self.__cb_menu_activate, files)
-
-        return [item, ]


=====================================
setup.py
=====================================
@@ -5,7 +5,7 @@ with open("README.md", encoding='utf-8') as fh:
 
 setuptools.setup(
     name="mat2",
-    version='0.13.0',
+    version='0.13.1',
     author="Julien (jvoisin) Voisin",
     author_email="julien.voisin+mat2@dustri.org",
     description="A handy tool to trash your metadata",


=====================================
tests/test_libmat2.py
=====================================
@@ -481,6 +481,8 @@ class TestCleaning(unittest.TestCase):
                 'AverageBitrate': 465641,
                 'BufferSize': 0,
                 'CompatibleBrands': ['isom', 'iso2', 'avc1', 'mp41'],
+                'ColorProfiles': 'nclx',
+                'ColorPrimaries': 'BT.709',
                 'ColorRepresentation': 'nclx 1 1 1',
                 'CompressorID': 'avc1',
                 'GraphicsMode': 'srcCopy',
@@ -488,6 +490,7 @@ class TestCleaning(unittest.TestCase):
                 'HandlerType': 'Metadata',
                 'HandlerVendorID': 'Apple',
                 'MajorBrand': 'Base Media v1 [IS0 14496-12:2003]',
+                'MatrixCoefficients': 'BT.709',
                 'MaxBitrate': 465641,
                 'MediaDataOffset': 48,
                 'MediaDataSize': 379872,
@@ -501,7 +504,9 @@ class TestCleaning(unittest.TestCase):
                 'TimeScale': 1000,
                 'TrackHeaderVersion': 0,
                 'TrackID': 1,
-                'TrackLayer': 0},
+                'TrackLayer': 0,
+                'TransferCharacteristics': 'BT.709',
+            },
         },{
             'name': 'wmv',
             'ffmpeg': 1,



View it on GitLab: https://salsa.debian.org/pkg-privacy-team/mat2/-/commit/3b9e28fc031353f1e86d3174eeba8fb342e19bc6

-- 
You're receiving this email because of your account on salsa.debian.org.

