ExifTool Forum

ExifTool => Bug Reports / Feature Requests => Topic started by: TSM on May 30, 2013, 06:28:37 AM

Title: Very slow reading of files with crs raw metadata
Post by: TSM on May 30, 2013, 06:28:37 AM
We are finding that some files we receive, which have CRS RAW metadata in the XMP section, are very slow to read. Below is an example file; we had a folder with 30 of these, and each took about 0.5-1 s to read versus 0.1-0.2 s normally.
We use batching to speed up bulk reads, but it brings no real added benefit for these pictures.
In the end I just stripped the XMP section, after which the files were read very quickly.

Version: 9.02 (production) but also tested on latest 9.30
OS: CentOS 6.3
CMD: exiftool -use MWG -fast2 -q -g -j

Source: https://docs.google.com/file/d/0B7Gftc42CL6WUTFiUzFicU0wZlk/edit?usp=sharing
Output: attached

Title: Re: Very slow reading of files with crs raw metadata
Post by: Phil Harvey on May 30, 2013, 07:35:34 AM
It certainly looks like the bulk of the processing time is spent parsing XMP.  I get this:

> time exiftool allpix_0019148_0001.jpg -use MWG -fast2 -q -g -j > t1
0.417u 0.011s 0:00.43 97.6% 0+0k 0+3io 0pf+0w
> exiftool allpix_0019148_0001.jpg -xmp -b -a > out.xmp
> exiftool allpix_0019148_0001.jpg -xmp:all=
    1 image files updated
> time exiftool allpix_0019148_0001.jpg -use MWG -fast2 -q -g -j > t1
0.136u 0.009s 0:00.14 92.8% 0+0k 0+4io 0pf+0w
> time exiftool out.xmp -use MWG -fast2 -q -g -j > t1
0.370u 0.008s 0:00.38 97.3% 0+0k 0+0io 0pf+0w


So that's 0.43 seconds parsing the original file, 0.14 seconds without XMP, and 0.38 seconds parsing the (110 kB of) XMP alone.  (And it looks like about 0.09 seconds overhead just to load ExifTool and its XMP library.)
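
The arithmetic behind those figures can be laid out as a small sketch. This is one plausible reading of how the 0.09 second overhead is obtained, not something stated explicitly in the post, and the variable names are made up for illustration:

```python
# Wall-clock times in seconds, copied from the runs above.
full_jpeg = 0.43     # original JPEG, XMP included
jpeg_no_xmp = 0.14   # same JPEG after -xmp:all=
xmp_alone = 0.38     # the extracted out.xmp on its own

# Time attributable to XMP inside the JPEG: the difference between the two JPEG runs.
xmp_parse = round(full_jpeg - jpeg_no_xmp, 2)

# What is left of the XMP-only run once that parse time is subtracted
# (assumed here to be the fixed cost of loading ExifTool and its XMP library).
load_overhead = round(xmp_alone - xmp_parse, 2)

# Fraction of the original run spent on XMP.
xmp_share = round(xmp_parse / full_jpeg, 2)

print(xmp_parse, load_overhead, xmp_share)
```

On these numbers, roughly two thirds of the time for the original file goes to XMP.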

Parsing string-based data is time consuming because the entire string must be scanned for matching patterns, and Perl isn't the fastest of languages.  I have always strongly disagreed with Adobe's strategy of mixing image editing data with the metadata in XMP, and this is one reason why.  I don't know if there is much I can do about this.  I have complained to Adobe, but that didn't help.

You can save a bit of time (0.04 seconds) by not outputting the XMP-crs, but the effect isn't large since this doesn't stop ExifTool from parsing it:

> time exiftool ../testpics/xmp/allpix_0019148_0001.jpg -use MWG -fast2 -q -g -j --xmp-crs:all > t1
0.378u 0.009s 0:00.39 94.8% 0+0k 0+0io 0pf+0w
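
To repeat this comparison on other files, a minimal timing sketch could look like the following. It assumes `exiftool` is on the PATH; the helper names `best_wall_time` and `build_variants` are hypothetical, and the options are the ones used in the commands above:

```python
import subprocess
import time

def best_wall_time(cmd, repeats=5):
    """Return the fastest wall-clock time in seconds over several runs of cmd."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        # Output is discarded; only the elapsed time matters here.
        subprocess.run(cmd, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        best = min(best, time.perf_counter() - start)
    return best

def build_variants(image):
    """Build the two command lines being compared for one image."""
    base = ["exiftool", image, "-use", "MWG", "-fast2", "-q", "-g", "-j"]
    return {
        "all tags": base,
        # --xmp-crs:all suppresses the output but not the parsing.
        "without XMP-crs output": base + ["--xmp-crs:all"],
    }

def compare(image, repeats=5):
    """Return {label: best time in seconds} for each variant."""
    return {label: best_wall_time(cmd, repeats)
            for label, cmd in build_variants(image).items()}
```

Calling `compare("allpix_0019148_0001.jpg")` returns the best time for each variant. Taking the minimum over several runs reduces noise from disk caching, which matters when the difference is only a few hundredths of a second.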


- Phil

Title: Re: Very slow reading of files with crs raw metadata
Post by: TSM on May 30, 2013, 01:02:30 PM
Hmmmm, annoying.

Thanks anyway for a brilliant program. We use it to process about 10k images a day, mostly through a custom PHP class (ZF1) that reads your JSON output in batches to get around loading times.
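
For readers who want to try the same batching approach, here is a rough sketch in Python rather than PHP. It is not the class mentioned above: the function names and the batch size of 50 are assumptions for illustration, and it assumes `exiftool` is on the PATH. The options are those from the first post.

```python
import json
import subprocess

# Options copied from the command line in the first post.
EXIFTOOL_OPTS = ["-use", "MWG", "-fast2", "-q", "-g", "-j"]

def chunks(paths, size):
    """Yield successive lists of at most `size` paths."""
    for i in range(0, len(paths), size):
        yield paths[i:i + size]

def parse_batch(json_text):
    """Map each SourceFile to its metadata; -j prints one JSON array per run."""
    return {entry["SourceFile"]: entry for entry in json.loads(json_text)}

def read_metadata(paths, batch_size=50):
    """Run exiftool once per batch so its startup cost is shared by the batch."""
    result = {}
    for batch in chunks(list(paths), batch_size):
        proc = subprocess.run(["exiftool", *EXIFTOOL_OPTS, *batch],
                              capture_output=True, text=True, check=False)
        if proc.stdout.strip():
            result.update(parse_batch(proc.stdout))
    return result
```

Batching of this kind only spreads the load-time overhead across files. The per-file cost of parsing a large XMP block is still paid for every image, which matches what was observed in this thread.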