PDF document takes 3 minutes to process

Started by felixge, May 06, 2013, 05:21:45 AM

felixge

System: Ubuntu 12.04
Exiftool: 9.28
Command line: exiftool -v 34749_ePrint_2.pdf
Output: see output.txt

Exiftool takes a very long time to process the attached PDF. I suspect this is because the PDF contains a lot of history events and similar data. I don't need this data and tried to exclude it via '--History*' and '-x History*', but that only seems to remove the history events from the output; all the processing still happens.

Many thanks in advance for any advice!

Phil Harvey

Thanks for the bug report and sample.  I can reproduce this on my system here.

The problem is that this document contains a very large encrypted metadata stream.  Unfortunately, the entire stream must be decrypted to extract any of the information.  Complain to Adobe about this (and about storing editing information inside the metadata, which is stupid).  I have had to implement the AES decryption myself (in Perl, which is slow) because I couldn't find a standard library to do this.  From my AES module documentation:

        BUGS

        This code is blindingly slow.  But in truth, slowing down processing is the
        main purpose of encryption, so this really can't be considered a bug.
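
To illustrate why excluding tags does not shorten the run, here is a toy sketch in Python. It is not ExifTool's code: the XOR "cipher" merely stands in for AES, and the `Name=Value` stream format, the key, and the function names are all assumptions made up for the example. The point it tries to show is only that the decryption cost depends on the stream size, while exclusion patterns act on the tags afterwards.

```python
from fnmatch import fnmatchcase

KEY = 0x5A  # hypothetical one-byte key for the toy cipher


def toy_decrypt(data: bytes, key: int = KEY) -> bytes:
    """Stand-in for AES: a single-byte XOR. NOT real cryptography."""
    return bytes(b ^ key for b in data)


def extract_tags(encrypted: bytes, exclude=()):
    """Return (reported_tags, bytes_decrypted) for an encrypted stream."""
    # Step 1: the whole stream is decrypted, whatever tags were requested.
    plain = toy_decrypt(encrypted).decode("utf-8")
    bytes_decrypted = len(encrypted)

    # Step 2: parse "Name=Value" lines into tags (hypothetical format).
    tags = dict(line.split("=", 1) for line in plain.splitlines() if "=" in line)

    # Step 3: exclusion patterns only filter what gets reported.
    reported = {
        name: value
        for name, value in tags.items()
        if not any(fnmatchcase(name, pattern) for pattern in exclude)
    }
    return reported, bytes_decrypted


# Build a sample stream: one useful tag plus many history entries.
lines = ["Title=ePrint"] + [f"HistoryAction{i}=saved" for i in range(1000)]
stream = toy_decrypt("\n".join(lines).encode("utf-8"))  # XOR is its own inverse

all_tags, cost_all = extract_tags(stream)
few_tags, cost_few = extract_tags(stream, exclude=["History*"])
print(len(all_tags), len(few_tags), cost_all == cost_few)
```

With the history tags excluded, only the title is reported, yet the number of bytes decrypted is identical in both calls.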


I have no idea why PDF information is encrypted like this when there is no password protection.  Again, stupid Adobe.

So the bottom line is that I don't think there is anything I can do about this.

- Phil