EXIF tool takes loads of time to extract metadata for particular large EPS files

Started by sapandaga, April 16, 2013, 03:17:49 AM

sapandaga

Hello,

I am using ExifTool to extract metadata from images.

I am using the following command for extraction:

exiftool.exe -G "D:\Test Data\C1.eps"

When I run it, I get the output, but it takes almost 7 minutes to extract the metadata.
The file is an EPS file of 213 MB.

ExifTool version: 8.29

The extracted output is:


[ExifTool]      ExifTool Version Number         : 8.29
[File]          File Name                       : C1.eps
[File]          Directory                       : D:/Test Data/
[File]          File Size                       : 214 MB
[File]          File Modification Date/Time     : 2013:02:06 16:28:46+05:30
[File]          File Permissions                : rw-rw-rw-
[File]          File Type                       : EPS
[File]          MIME Type                       : application/postscript
[PostScript]    Creator                         : Adobe Illustrator(R) 13.0
[PostScript]    For                             : xyz
[PostScript]    Create Date                     : 3/23/10
[PostScript]    Bounding Box                    : 0 0 2792 6722
[PostScript]    Pages                           : 1
[PostScript]    Version                         : 1.0 0
[PostScript]    Copyright                       : Copyright(C)2000-2006 Adobe Systems, Inc. All Rights Reserved.
[XMP]           XMP Toolkit                     : Adobe XMP Core 4.1-c036 46.277092, Fri Feb 23 2007 14:16:18
[XMP]           Format                          : application/postscript
[XMP]           Title                           : Print
[XMP]           Creator Tool                    : Adobe Illustrator CS3
[XMP]           Modify Date                     : 2010:03:23 14:59:56-04:00
[XMP]           Metadata Date                   : 2010:03:23 14:59:56-04:00
[XMP]           Thumbnail Width                 : 108
[XMP]           Thumbnail Height                : 256
[XMP]           Thumbnail Format                : JPEG
[XMP]           Thumbnail Image                 : (Binary data 8459 bytes, use -b option to extract)
[XMP]           Document ID                     : uuid:7A3F01EA9F32DF11A28DB1B51907C548
[XMP]           Instance ID                     : uuid:1DE45E3D1E38DF119067A61C03564095
[XMP]           Derived From Instance ID        : uuid:35ECBFB3832FDF119770FC5A252EAE0A
[XMP]           Derived From Document ID        : uuid:34ECBFB3832FDF119770FC5A252EAE0A
[XMP]           Manifest Link Form              : EmbedByReference
[XMP]           Manifest Reference File Path    : /Users/xyz/Documents/3842SP POS Booth Graphics ¦Æ/APP band 1 flat.psd
[XMP]           Manifest Reference Instance ID  : uuid:14A0201E1738DF11BF3DF4D6C849A7D0
[XMP]           Manifest Reference Document ID  : uuid:13A0201E1738DF11BF3DF4D6C849A7D0
[XMP]           Startup Profile                 : Print
[XMP]           Max Page Size W                 : 36.000000
[XMP]           Max Page Size H                 : 96.000000
[XMP]           Max Page Size Unit              : Inches
[XMP]           N Pages                         : 1
[XMP]           Has Visible Transparency        : False
[XMP]           Has Visible Overprint           : False
[XMP]           Font Name                       : HelveticaNeue-ThinExt, HelveticaNeue-LightExt
[XMP]           Font Family                     : Helvetica Neue, Helvetica Neue
[XMP]           Font Face                       : 33 Thin Extended, 43 Light Extended
[XMP]           Font Type                       : Type 1, Type 1
[XMP]           Font Version                    : 001.000, 001.000
[XMP]           Font Composite                  : False, False
[XMP]           Font File Name                  : HelveNeuThiExt; HelveticaNeue ThinExt.sc, HelveNeuLigExt; HelveticaNeue LightExt.sc
[XMP]           Plate Names                     : Cyan, Magenta, Yellow, Black
[XMP]           Swatch Groups Colorants Tint    : 100.000000
[XMP]           Swatch Groups Colorants Cyan    : 80.000000
[XMP]           Swatch Groups Colorants Magenta : 5.000001
[XMP]           Swatch Groups Colorants Yellow  : 10.000002
[XMP]           Swatch Groups Colorants Black   : 0.000000
[XMP]           Swatch Groups Group Name        : Grayscale
[XMP]           Swatch Groups Group Type        : 1
[XMP]           Swatch Groups Colorants Swatch Name: K=5
[XMP]           Swatch Groups Colorants Mode    : GRAY
[XMP]           Swatch Groups Colorants Type    : PROCESS
[XMP]           Swatch Groups Colorants Gray    : 12
[Composite]     Image Height                    : 6722
[Composite]     Image Width                     : 2792
[Composite]     Image Size                      : 2792x6722


The file can be downloaded from https://mvdevasia.blob.core.windows.net/testdata/C1.eps

Let me know if there is something wrong with what I am doing, with the command, or with the file itself.
How can I improve the performance?

Thanks and Regards,
Swapnil
Senior Software Developer

Phil Harvey

Hi Swapnil,

Interesting.  I downloaded your file and ran this command on my 2.7 GHz Intel Core i5 Mac, and it took 5.6 seconds.

What type of computer are you using?  I'll see if I can find a Windows system to try this on.

- Phil

Edit:  Oh, wait.  You're using ExifTool 8.29.  I'll wait until you try this with a more recent version.  That's almost 100 updates ago... The current version is 9.27.
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

sapandaga

Thanks Phil for the response.

I will try it out on the latest version. Will update you with the results.

In the meantime, here are my machine details:
OS: Windows 7 64-bit
CPU: 3.4 GHz Core i7
RAM: 16 GB

Regards,
Swapnil

sapandaga

Hi Phil,

Even with 9.27, it is taking almost 7 minutes.
I have attached the verbose output of the command to the thread.
Let me know what you find.

Regards,
Swapnil

Phil Harvey

I haven't found a PC to run this on yet, but I've analyzed the EPS file and the problem is surely due to the memory requirements.  The Windows version of ExifTool seems constrained to about 200 MB of RAM for some reason, and going over this causes big slow-downs.  ExifTool reads EPS files one line at a time, but this EPS file changes newline characters midway through, so close to half of the file (the second half, about 100 MB) is loaded into memory at once.  This could definitely cause the problem you are seeing.  I'll see if there is anything I can do to reduce the memory requirements for EPS files like this, and post back here with any developments.
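
To illustrate the effect described above, here is a minimal Python sketch. It is not ExifTool's actual reader (ExifTool is written in Perl), and the function name, the toy data, and the choice of LF followed by CR are assumptions made purely for illustration. It shows how a reader that splits on one fixed newline character ends up holding everything after the switch as a single huge "line":

```python
import io

def read_records(stream, sep=b"\n", chunk_size=4096):
    """Yield records split on one fixed separator, like a simple line reader.

    Hypothetical helper for illustration only; not ExifTool code.
    """
    buf = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf += chunk
        # Everything before the last separator is complete; the tail stays buffered.
        *complete, buf = buf.split(sep)
        yield from complete
    if buf:
        yield buf

# Toy "file": the first half ends lines with LF, the second half switches to CR.
first_half = b"".join(b"line %d\n" % i for i in range(1000))
second_half = b"".join(b"line %d\r" % i for i in range(1000))
data = first_half + second_half

sizes = [len(record) for record in read_records(io.BytesIO(data))]
largest = max(sizes)

# The reader never sees another LF after the switch, so the whole second
# half accumulates in memory and comes back as one record.
print("records:", len(sizes))
print("largest record:", largest, "bytes; second half:", len(second_half), "bytes")
```

Scaled up to a 213 MB file, the buffered tail would be on the order of the 100 MB mentioned above.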

- Phil

Phil Harvey

I've looked at this problem in some detail now.  I already have a Windows-specific patch in the PostScript reader to handle this particular problem... The thing is that it must still read ~100MB of data into memory before it can determine that the linefeeds have changed.  Changing this to read a bit at a time would severely impact performance for the typical case.

I did manage to reproduce the extremely long processing times that you observed for your file when running the Windows EXE version of ExifTool.  Interestingly, there is no problem running on the same platform using ActivePerl and the Perl version of ExifTool.  This points to a memory limitation specific to the PAR-packaged EXE version.  So a workaround (although, I agree, not a very satisfying one) is to install ActivePerl and run the Perl version of ExifTool.

But I'll keep thinking about this in hopes that I might come up with a better idea.
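
For anyone calling ExifTool from an application, the workaround above might be wired in roughly as follows. This is only a sketch: the helper names and the default paths (`exiftool.exe`, `exiftool`, `perl`) are assumptions and must be adjusted to wherever ActivePerl and the Perl distribution of ExifTool are actually installed. Only the `-G` option from the original command is used.

```python
import subprocess
import time

def build_command(image_path, use_perl=False,
                  exe_path="exiftool.exe",   # assumed location of the Windows EXE
                  script_path="exiftool",    # assumed location of the Perl script
                  perl_path="perl"):         # assumed to be on PATH (e.g. ActivePerl)
    """Return the argument list for either the EXE or the Perl version."""
    if use_perl:
        return [perl_path, script_path, "-G", image_path]
    return [exe_path, "-G", image_path]

def timed_run(cmd):
    """Run a command and return (stdout, elapsed seconds)."""
    start = time.perf_counter()
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.stdout, time.perf_counter() - start
```

Calling `timed_run(build_command(r"D:\Test Data\C1.eps", use_perl=True))` and comparing the elapsed time with the `use_perl=False` variant would show whether the Perl version avoids the slowdown on a given machine.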

- Phil

Edit: Until I can come up with a solution for this, I have added it to the list of known problems.

sapandaga

Thanks Phil!
I will check whether it is feasible for me to incorporate the Perl version into my application.

Let me know if you find a fix for this issue.

Regards,
Swapnil