Bug#1077836: latexdiff: processing 13-page document takes indefinite time (11 hours so far)

Frederik Tilmann tilmann at gfz-potsdam.de
Sun Aug 4 06:50:50 BST 2024


Dear reporter,

Without access to the report_old.tex and report_new.tex files, this bug
report is impossible to address. I have not encountered this behaviour
before, nor have I seen it reported elsewhere.

Even better would be if you could whittle the files down to the
paragraph that causes the problem. Most likely a regex has tripped on a
single expression.

A 13-page document should not take more than 10 seconds or so if there
is a moderate amount of changes. Unless you are processing a whole book,
I would not expect a running time of more than a minute.
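
For narrowing things down, a small wrapper that gives up after a time
limit may save some waiting. The following is only a sketch and not part
of latexdiff: the helper name finishes_within and the 60-second limit
are my own choices for illustration. The idea is to make reduced copies
of both files (for example, with half of the body removed) and check
which reduced pair still fails to finish.

```python
import subprocess
import sys


def finishes_within(cmd, limit_s):
    """Run cmd and report whether it exits within limit_s seconds.

    subprocess.run kills the child process once the timeout expires,
    so a hanging run does not stay behind and keep the CPU busy.
    """
    try:
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=limit_s,
        )
        return True
    except subprocess.TimeoutExpired:
        return False


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python3 this_script.py old_reduced.tex new_reduced.tex
    old_file, new_file = sys.argv[1], sys.argv[2]
    if finishes_within(["latexdiff", old_file, new_file], 60):
        print("finished within a minute; the problem is in the removed part")
    else:
        print("still running after a minute; the problem is in this part")
```

Repeating this on ever smaller pieces should usually lead to the
offending paragraph in a handful of steps.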

The suggested work-around (periodic progress checks) is unfortunately
impossible to implement. In pre- and post-processing, many small tasks
are done in sequence on the whole document; most of these are complex
regular-expression substitutions, so the hard work is done by Perl's
regex engine, and this cannot be micro-managed.
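
To illustrate how a single expression can trip up a backtracking regex
engine, here is a small demonstration. It uses Python's re module rather
than Perl, and the pattern is a textbook pathological case, not one
taken from latexdiff; whether the present bug has this cause is only a
guess until the input files are available. The names PATTERN and
match_seconds are made up for the example.

```python
import re
import time

# Deliberately pathological: a quantified group inside another
# quantifier, followed by an end anchor. On a failing input the engine
# tries every way of splitting the run of 'a's between the two loops.
PATTERN = re.compile(r"(a+)+$")


def match_seconds(n):
    """Time a failing match against n 'a' characters followed by 'b'."""
    text = "a" * n + "b"
    start = time.perf_counter()
    result = PATTERN.match(text)
    elapsed = time.perf_counter() - start
    assert result is None  # the trailing 'b' makes a match impossible
    return elapsed


if __name__ == "__main__":
    # The time grows roughly exponentially with n, so keep n small.
    for n in (10, 14, 18, 22):
        print(n, round(match_seconds(n), 4))
```

The whole match is one call into the engine, so there is no point inside
it at which the calling program could report progress or check a time
budget.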

Frederik


On 03/08/2024 09:17, Manny wrote:
> Package: latexdiff
> Version: 1.3.2-1
> Severity: important
> Tags: upstream
> X-Debbugs-Cc: frederik.tilmann at gfz-potsdam.de, debbug.latexdiff at sideload.33mail.com
> 
> This was executed:
> 
>    $ latexdiff report_old.tex report_new.tex > report_diff.tex
> 
> After 11 hours the process is still running hard with CPU pegged
> around 99% according to /top/. CPU fan is running which also indicates
> hard work is being done. There is no output to indicate how much
> progress has been made.
> 
> When compiled, the document yields 13 pages in PDF form. I do not
> imagine that 11+ hours is reasonable for that volume. Bug fixes and
> enhancements are needed.
> 
>   ① There is likely some kind of faulty logic such as an endless loop
>   ② A progress indicator is needed
>   ③ A detailed debug log is needed
>   ④ Periodic assessments should be made throughout the processing as to
>      whether reasonable progress is being made. If an hour is spent on a
>      normal sized paragraph, the tool should abort and perhaps give an
>      indication of which segment of text is exceeding time
>      thresholds. This should be configurable but many users don’t know
>      what to expect so there should be a reasonable default.
> 
> I’ve seen latexdiff take forever in past executions and had to give up
> and kill it. The document latexdiff struggles with at the moment is a
> bilingual document that uses parcolumns to produce a left and right
> column.
> 
> -- System Information:
> Debian Release: 12.5
>    APT prefers stable-updates
>    APT policy: (990, 'stable-updates'), (990, 'stable-security'), (990, 'stable'), (500, 'oldstable')
> Architecture: amd64 (x86_64)
> Foreign Architectures: i386
> 
> Kernel: Linux 5.10.0-28-amd64 (SMP w/2 CPU threads)
> Kernel taint flags: TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
> Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE not set
> Shell: /bin/sh linked to /usr/bin/dash
> Init: systemd (via /run/systemd/system)
> LSM: AppArmor: enabled
> 
> Versions of packages latexdiff depends on:
> ii  perl  5.36.0-7+deb12u1
> 
> Versions of packages latexdiff recommends:
> ii  texlive-latex-base         2022.20230122-3
> ii  texlive-latex-extra        2022.20230122-4
> ii  texlive-latex-recommended  2022.20230122-3
> ii  texlive-plain-generic      2022.20230122-4
> 
> Versions of packages latexdiff suggests:
> ii  git  1:2.39.2-1.1
> 
> -- no debconf information


