Bug#884095: Correctly identify Android APK/DEX files

Chris Lamb lamby at debian.org
Mon Apr 13 16:39:05 BST 2020


tags 884095 + pending
thanks

This should be fixed in:


commit f1297ac09c77387568e58c6e8dfb4009a1a7c4cf
Author: Chris Lamb <lamby at debian.org>
Date:   Mon Apr 13 16:30:22 2020 +0100

    Dalvik .dex files can also serve as APK containers so restrict the
    narrower identification of .dex files to files ending with this
    extension, and widen the identification of APK files to when
    file(1) discovers a Dalvik file. (Closes: Debian:#884095,
    reproducible-builds/diffoscope#28)
    
    This is essentially functionally equivalent to (yet superior to) always
    identifying files named *.apk as APK containers, or adding a command-line
    switch to "force" identification.
    
    This is not a bug in file(1) as the files do legitimately differ in their
    magic numbers:
    
        $ xxd HelloWorld.apk | head -n1
        00000000: 504b 0304 0a00 0000 0800 c907 c37d 8dea  PK...........}..
    
        $ xxd helloworld-janus.apk | head -n1
        00000000: 6465 780a 3033 3500 b5f7 4441 c609 399e  dex.035...DA..9.

diff --git a/diffoscope/comparators/apk.py b/diffoscope/comparators/apk.py
index 2c20a35..119cd9b 100644
--- a/diffoscope/comparators/apk.py
+++ b/diffoscope/comparators/apk.py
@@ -188,7 +188,7 @@ class ApkContainer(Archive):
 class ApkFile(File):
     DESCRIPTION = "Android APK files"
     FILE_TYPE_HEADER_PREFIX = b"PK\x03\x04"
-    FILE_TYPE_RE = re.compile(r'^(Java|Zip) archive data.*\b')
+    FILE_TYPE_RE = re.compile(r'^((Java|Zip) archive data|Dalvik dex file)\b')
     FILE_EXTENSION_SUFFIX = '.apk'
     CONTAINER_CLASSES = [ApkContainer, ZipContainer]
 
diff --git a/diffoscope/comparators/dex.py b/diffoscope/comparators/dex.py
index 6a15098..09dca3e 100644
--- a/diffoscope/comparators/dex.py
+++ b/diffoscope/comparators/dex.py
@@ -60,4 +60,5 @@ class DexContainer(Archive):
 class DexFile(File):
     DESCRIPTION = "Dalvik .dex files"
     FILE_TYPE_RE = re.compile(r'^Dalvik dex file .*\b')
+    FILE_EXTENSION_SUFFIX = '.dex'
     CONTAINER_CLASSES = [DexContainer]
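
To illustrate the combined effect of the two hunks above, here is a minimal
sketch in Python. The two regular expressions are copied from the patch. The
identify() helper, the rule that suffix and file(1) description must both
match, and the sample file(1) descriptions are assumptions made for
illustration only. It is not diffoscope's actual recognition code, and it
ignores FILE_TYPE_HEADER_PREFIX entirely.

```python
import re

# Patterns copied verbatim from the patch above.
APK_FILE_TYPE_RE = re.compile(r'^((Java|Zip) archive data|Dalvik dex file)\b')
DEX_FILE_TYPE_RE = re.compile(r'^Dalvik dex file .*\b')


def identify(name, file_output):
    """Hypothetical helper, not part of diffoscope.

    Assumes a comparator applies only when BOTH the filename suffix and
    the file(1) description match.
    """
    if name.endswith('.apk') and APK_FILE_TYPE_RE.search(file_output):
        return 'apk'
    if name.endswith('.dex') and DEX_FILE_TYPE_RE.search(file_output):
        return 'dex'
    return None


# Illustrative file(1) descriptions (assumed, not taken from the bug report).
print(identify('HelloWorld.apk', 'Zip archive data, at least v2.0 to extract'))
print(identify('helloworld-janus.apk', 'Dalvik dex file version 035'))
print(identify('classes.dex', 'Dalvik dex file version 035'))
```

Under these assumptions, an .apk that file(1) reports as a Dalvik file is
still treated as an APK, while the narrower .dex handling only applies to
files actually named *.dex.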


Regards,

-- 
      ,''`.
     : :'  :     Chris Lamb
     `. `'`      lamby at debian.org / chris-lamb.co.uk
       `-


