evil RAR files

Simon McVittie smcv at debian.org
Thu Apr 30 15:09:53 UTC 2015


On 30/04/15 14:50, Alexandre Detiste wrote:
> I had this working, but didn't push it...
> Wouldn't it be considered outright evil
> to accept all "unknown", "non-official" RAR files?

I would rather not do this, but perhaps not for the reasons you think.

I don't want to unpack unknown files (those without a known-safe
cryptographic hash) in general, for a few reasons:

* they might contain directory traversal attacks ("../../../payload")
* they might contain exploits for other bugs in decompressors/unpackers,
  e.g. buffer overflows
* they might simply contain a lot of unwanted stuff (e.g. Windows
  executables that we will replace with a DFSG engine, or content
  from a different game entirely) that we don't want to spend time
  unpacking if we don't have to
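
To make the first point concrete, here is a minimal sketch of how a
member name such as "../../../payload" could be recognised before
anything is written to disk. The function name escapes_target is
hypothetical, and the check is illustrative rather than a complete
defence (it does not consider symlinks, for instance).

```python
import posixpath


def escapes_target(member_name: str) -> bool:
    """Return True if an archive member name would land outside the
    directory it is being unpacked into. Hypothetical helper."""
    # Treat Windows-style separators like "/" so "..\\.." is caught too.
    name = member_name.replace("\\", "/")
    # Absolute paths and drive-letter paths ignore the target directory.
    if name.startswith("/") or (len(name) > 1 and name[1] == ":"):
        return True
    # After normalisation, any leading ".." means we climbed out.
    return posixpath.normpath(name).split("/")[0] == ".."
```

A name like "a/../b" normalises to "b" and is accepted, while
"a/../../b" normalises to "../b" and is rejected.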

Zip files:

* can be read with the Python stdlib, which is (presumably)
  high-quality and actively-maintained; this mitigates the points
  about directory traversal and buffer overflows
* are seekable, so you don't have to spend time decompressing
  unwanted files like you would with .tar.whatever; this mitigates
  the point about unwanted stuff
* have a built-in index which is understood by Python, so you can
  make a shortlist of potentially relevant files by name/size,
  and only spend time checksumming things that are on the shortlist;
  this mitigates the points about directory traversal and unwanted stuff
* have a Python API that can pick out individual files, without
  necessarily using the intended filename; this mitigates the point
  about directory traversal

which is why I didn't object to you adding this code for them. However,
rar files don't have all of those advantages.
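
The zip properties listed above can be sketched with the stdlib zipfile
module roughly as follows. This is only an illustration, not
game-data-packager's actual code: the function extract_known and the
"wanted" table (member name mapped to size, SHA-256 and output filename)
are assumptions made up for the example.

```python
import hashlib
import os
import zipfile


def extract_known(zip_path, wanted, out_dir):
    """Copy known-good members of zip_path into out_dir; return paths written.

    wanted is a hypothetical table mapping a lower-cased member name to
    (expected size, expected SHA-256 hex digest, output filename).
    """
    written = []
    with zipfile.ZipFile(zip_path) as zf:
        # infolist() only reads the index; nothing is decompressed yet.
        for info in zf.infolist():
            entry = wanted.get(info.filename.lower())
            if entry is None or info.file_size != entry[0]:
                continue  # not on the shortlist, so never decompressed
            size, sha256, out_name = entry
            # Decompress this one member into memory (fine for a sketch;
            # large files would be streamed in chunks instead).
            with zf.open(info) as member:
                data = member.read(size + 1)
            if len(data) != size or hashlib.sha256(data).hexdigest() != sha256:
                continue  # matched the index, but is not the known file
            # The destination name comes from our table, never the archive.
            dest = os.path.join(out_dir, out_name)
            with open(dest, "wb") as out:
                out.write(data)
            written.append(dest)
    return written
```

Because the output name is taken from the caller's table, a member
called "../../../payload" is simply never looked at, and a member with
the right name and size but the wrong hash is dropped after checksumming.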

The fact that rar files are (for some reason that I've never understood)
strongly correlated with copyright infringement is not actually such a
concern for me. If our users want to copy games illegally, they'll do
it, with or without our help. We're not writing DRM software here :-)
However, if they accidentally run malware as a result of unpacking
warez'd game data that contains a successful exploit for the unpacker, I
want it to be unambiguously not our fault.

Regarding the performance point specifically, here is something to bear
in mind. One day, I would like to be able to do something like

    game-data-packager anything ~/Downloads ~/GameMedia

or the GUI equivalent, and have it produce any sensible .deb(s) that it
possibly can from those inputs.

Recursing into zip files that were explicitly specified is not a problem
for that, because recursing into an unwanted zip file only takes as long
as determining that we don't want any of its contents. Even recursing
into zip files that were merely found in the directory is not a big
problem, because again, they're seekable and indexed. But if we recurse
into containers that don't have an index (like .tar.gz), or have an
index but we don't know how to read it natively (like .rar?), or are
just really slow (anything LZMA-based), then we'd potentially waste a
lot of time analyzing the data files of games that g-d-p doesn't
support (e.g. those without Free engines).
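
As a rough sketch of why the zip case stays cheap, the following walks a
directory and consults only each zip's index to decide whether it is
worth opening at all. The function zips_worth_opening and the
known_sizes table (base filename mapped to expected size) are
hypothetical names for this example, not part of g-d-p.

```python
import os
import posixpath
import zipfile


def zips_worth_opening(directory, known_sizes):
    """Yield (zip path, matching member names) for zips under directory.

    known_sizes is a hypothetical table mapping a lower-cased base
    filename to the size we expect that file to have.
    """
    for root, _dirs, files in os.walk(directory):
        for name in sorted(files):
            path = os.path.join(root, name)
            if not zipfile.is_zipfile(path):
                continue  # .tar.gz, .rar etc. are skipped, not unpacked
            try:
                with zipfile.ZipFile(path) as zf:
                    infos = zf.infolist()  # index only, no decompression
            except (zipfile.BadZipFile, OSError):
                continue
            hits = [
                info.filename
                for info in infos
                if known_sizes.get(posixpath.basename(info.filename).lower())
                == info.file_size
            ]
            if hits:
                yield path, hits
```

An unwanted zip costs one index read and a few dictionary lookups; only
the members reported as hits would go on to be decompressed and
checksummed.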

    S




More information about the Pkg-games-devel mailing list