[Python-modules-team] Bug#640303: pdfminer: please provide a python3 version

Julian Gilbey jdg at debian.org
Sun Sep 18 10:32:35 UTC 2011


On Sun, Sep 11, 2011 at 05:51:46PM +0200, Daniele Tricoli wrote:
> Hello Julian,
> 
> On Sunday 04 September 2011 10:34:41 you wrote:
> > I have successfully run 2to3 over the source code, and there does not
> > seem to be any obvious reason why this would not be possible.  The
> > [...]
> Did you try, after the conversion, the test suite? Right now the Debian 
> package for pdfminer is only for python 2.x because upstream declares that 
> python 3 is not supported.
> 
> I will try to build a python3 package using 2to3 soon. Thanks for the 
> report.

After digging deeper, it seems that the conversion is going to be
significantly harder due to the bytes/text distinction in Python 3.
The main sticking point so far appears to be in psparser.py, which
reads files and then tries to interpret them.  So for example, the
compiled regexes starting on line 128 need to be converted into bytes
patterns, replacing r'...' by br'...', and similarly with
ESC_STRING.  I'm not sure I currently know enough Python to be able to
perform the necessary conversions reliably :-(
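
To illustrate the bytes/text problem, here is a minimal sketch under
Python 3.  The patterns, the sample data and the find_number helper are
made up for illustration; they are not taken from psparser.py.  It only
shows that a str pattern cannot be applied to data read in binary mode,
while the br'...' form can.

```python
import re

# Hypothetical patterns, NOT copied from psparser.py.
TEXT_NUMBER = re.compile(r'[0-9]+')    # str pattern, as 2to3 leaves it
BYTES_NUMBER = re.compile(br'[0-9]+')  # bytes pattern, for binary data

# Stand-in for what open(path, 'rb').read() would return.
data = b'/Length 42 >>'

def find_number(pattern, buf):
    """Return the first match, or None if nothing matches or the
    pattern type (str vs bytes) does not fit the data."""
    try:
        m = pattern.search(buf)
    except TypeError:
        # Python 3 refuses to apply a str pattern to a bytes object.
        return None
    return m.group(0) if m else None

text_result = find_number(TEXT_NUMBER, data)
bytes_result = find_number(BYTES_NUMBER, data)
print(text_result, bytes_result)
```

Note that the match from the bytes pattern is itself a bytes object, so
any code comparing it against str literals would presumably need the
same treatment.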

   Julian




