daemon mode for exiftool

Started by Archive, May 12, 2010, 08:54:06 AM

Archive

[Originally posted by mjc on 2007-07-13 13:09:20-07]

Hi all,

I want to call exiftool from a C image viewer program (gThumb) to get the metadata for hundreds of images (as part of the thumbnailing process).

exiftool works extremely well, except that it takes ~0.3s per file. I think that most of the time is spent initializing the program and Perl, rather than actually generating the metadata.

Phil, have you considered adding a "daemon mode" to exiftool, so that it could be launched once and run persistently? Has anyone experimented with that?

- Mike

Archive

[Originally posted by andyarmstrong on 2007-07-13 13:11:51-07]

How about embedding Perl in gThumb?

Failing that I'm sure an exif extraction server could be written in Perl.

Archive

[Originally posted by exiftool on 2007-07-13 13:19:41-07]

Hi Mike,

The exiftool script does have powerful multi-file processing abilities.
A daemon would only be useful if you want interactive processing
of multiple files.  Right now to do this you would need to write your
own script using calls to the ExifTool functions.  It wouldn't be too
difficult to set this up if you know a bit of Perl, and would definitely
avoid the start-up cost of loading Perl, ExifTool and all the associated
libraries.  But in the end I'm not sure how much this will speed things
up.  You could run some tests on your system by processing a large
number of files in a single directory to see if the speed benefits
will be worth it.

Exiftool used to be a lot quicker, but for each new piece of information
that it extracts, it slows down just a little bit more.  And now the
amount of information extracted from some images is really crazy.

- Phil

Archive

[Originally posted by mjc on 2007-07-13 13:22:40-07]

Sure, embedding Perl in gThumb is an option, but it seems a bit hackish and fraught with peril (i.e., packaging and maintenance problems).

A Perl server calling the exiftool libs would work, but would duplicate a lot of code in the CLI tool, causing maintenance/tracking pain. But at least the C and Perl would be cleanly separated.

Adding it into the standard exiftool CLI tool seems to be the most elegant (add a --daemon option). Maybe it introduces cross-platform issues, I don't know. It would let other similar programs call exiftool in a speedy manner, though.

-Mike

Archive

[Originally posted by exiftool on 2007-07-13 13:35:41-07]

Hi Mike,

Ah, so you are using the stand-alone Windows version.  Right.

Try running the tests I suggested, and if you see enough speed benefit then
I will look into adding a CLI option for you.

- Phil

Archive

[Originally posted by mjc on 2007-07-13 14:00:13-07]

Using a directory of mixed files (jpgs, tiffs, RAWs, other), I get these benchmarks:

ls -1 | xargs -n1 exiftool -S -a -e -G1

~ 35 seconds

ls -1 | xargs -n100 exiftool -S -a -e -G1

~ 13 seconds

Calling exiftool once per file is 2-3 times slower than passing a list of 100 files. So the initialization overhead is pretty severe.

-Mike
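
For anyone who wants to repeat this comparison from a script rather than with xargs, here is a minimal sketch in Python. It only reuses the flags from the commands above; the helper names (batched_commands, time_commands) are made up for illustration, and it assumes exiftool is on the PATH.

```python
import subprocess
import time
from typing import List, Sequence

# Flags taken from the benchmark commands in this thread.
FLAGS = ["-S", "-a", "-e", "-G1"]


def batched_commands(files: Sequence[str], batch_size: int,
                     exe: str = "exiftool") -> List[List[str]]:
    """Build one argv list per batch, like `xargs -n<batch_size>`."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        [exe, *FLAGS, *files[i:i + batch_size]]
        for i in range(0, len(files), batch_size)
    ]


def time_commands(commands: List[List[str]]) -> float:
    """Run each command in turn and return the total wall-clock seconds."""
    start = time.perf_counter()
    for argv in commands:
        # Output is discarded; only the elapsed time matters here.
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
    return time.perf_counter() - start
```

Calling time_commands(batched_commands(files, 1)) and time_commands(batched_commands(files, 100)) should roughly correspond to the -n1 and -n100 runs. Because each file name is passed as its own argument, names containing spaces are not split the way they would be by a plain ls | xargs pipeline.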

Archive

[Originally posted by exiftool on 2007-07-13 14:32:23-07]

Hi Mike,

Thanks.  I'll let you know what I come up with.
(It may be a few days, as I'm quite busy at the moment.)

- Phil

Archive

[Originally posted by mjc on 2007-07-13 14:47:13-07]

Hmm, I did another benchmark on a more typical folder of jpgs and avis from a digital camera (217 files, 2007-06 folder) and the benchmarks were:

one at a time: 1m 30s

batch mode: 1m 10s

which suggests there is not that much to be gained by a daemon mode. I'm a little puzzled why this folder shows much less improvement than the other one, which had a strange mix of raw/tiff/jpg/pdf/other files... I guess more research is needed before bothering with daemons.

-Mike

Archive

[Originally posted by mjc on 2007-07-13 14:54:55-07]

One last insight... the two benchmarks may differ because of the processing speed with unknown file types.

For instance, if I run exiftool one-at-a-time on 47 *.eml (Thunderbird email) files, it takes 50 seconds. In batch mode, it takes 8 seconds!

Food for thought...

-Mike
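
The timings in the last two posts can be turned into a rough per-invocation start-up cost. The sketch below assumes a very simple model (a fixed start-up time per exiftool call plus a fixed time per file) and assumes that "batch mode" meant a single invocation for the whole folder; neither assumption is confirmed in the thread, and the function name is hypothetical.

```python
def per_call_overhead(t_single: float, t_batch: float,
                      n_files: int, n_batch_calls: int = 1) -> float:
    """Estimate start-up seconds per exiftool invocation.

    Model (an assumption): every invocation costs a fixed start-up time s
    and every file a fixed processing time p, so
        t_single = n_files * (s + p)
        t_batch  = n_batch_calls * s + n_files * p
    Subtracting gives s = (t_single - t_batch) / (n_files - n_batch_calls).
    """
    if n_files <= n_batch_calls:
        raise ValueError("need more files than batch invocations")
    return (t_single - t_batch) / (n_files - n_batch_calls)


# Numbers quoted in the posts above.
camera = per_call_overhead(90.0, 70.0, 217)  # 217 jpg/avi files
eml = per_call_overhead(50.0, 8.0, 47)       # 47 unrecognized .eml files

print(f"camera folder: {camera:.2f} s per call")
print(f"eml files:     {eml:.2f} s per call")
```

Under these assumptions the camera folder works out to about 0.09 s of start-up cost per call, and the .eml files to about 0.91 s per call, which is consistent with the idea that unrecognized files are what make one-at-a-time calls so expensive.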

Archive

[Originally posted by exiftool on 2007-07-13 15:03:53-07]

I should have thought of this.

When exiftool finds an unknown file, it must load all the modules one-by-one
until it discovers what the file format is.  And if it isn't a supported format,
it has to load ALL modules.  This gives you the highest possible initial overhead.

- Phil

Archive

[Originally posted by exiftool on 2007-07-13 15:13:04-07]

(Of course, when processing a directory, exiftool avoids this overhead by
only processing recognized file extensions unless you specifically tell it
to process another type of file.)

Archive

[Originally posted by mjc on 2007-07-13 15:26:47-07]

Hmm. Interesting. gThumb already knows the mime type of the file when it wants metadata. Could the known mime type be supplied to exiftool as a parameter to speed up processing (ideally exiftool would quickly quit if it didn't recognize the specified mime type)?

I could add an array of mime types that exiftool is known to support, but then I'd have to manually update the program each time new file support was added.

We'd have to agree on mime type names for RAW files - I'm not sure if they are all standardized (some are).

- Mike

Archive

[Originally posted by exiftool on 2007-07-13 15:45:01-07]

You can get a list of supported extensions by typing

Code:
exiftool -listf

I don't give a list of MIME types, but there would be problems with
doing this:

ExifTool reports all RAW image formats as "image/x-raw" MIME type.  If there
are accepted MIME types that are more specific than this, perhaps I
should be using them...

- Phil
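
One way a caller could use the -listf output is to filter file names by extension before handing them to exiftool, so it never has to probe an unsupported file. This is only a sketch: it assumes the output is a heading line ending in a colon followed by lines of space-separated extensions, and the function names are invented for illustration.

```python
import os
import subprocess
from typing import Iterable, List, Set


def parse_listf(text: str) -> Set[str]:
    """Collect upper-case extensions from `exiftool -listf` output.

    Assumed layout: a heading line ending in ':' followed by lines of
    space-separated extensions.
    """
    exts: Set[str] = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.endswith(":"):
            continue  # skip blank lines and the heading
        exts.update(word.upper() for word in line.split())
    return exts


def supported_extensions(exe: str = "exiftool") -> Set[str]:
    """Ask the installed exiftool which extensions it supports."""
    out = subprocess.run([exe, "-listf"], capture_output=True,
                         text=True, check=True).stdout
    return parse_listf(out)


def filter_supported(paths: Iterable[str], exts: Set[str]) -> List[str]:
    """Keep only the paths whose extension appears in exts."""
    keep = []
    for path in paths:
        ext = os.path.splitext(path)[1].lstrip(".").upper()
        if ext in exts:
            keep.append(path)
    return keep
```

Querying the list once at start-up, rather than hard-coding it, would avoid having to update the calling program each time exiftool gains support for a new format.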

Archive

[Originally posted by mjc on 2007-07-13 18:08:10-07]

Thanks for all the comments, Phil!

I can gain speed by being more careful not to feed exiftool unsupported formats. The only formats that gThumb supports that exiftool doesn't are the high dynamic range types (OpenEXR and Radiance RGBE - any plans for those? I don't even know if they carry metadata...).

The daemon idea probably wouldn't gain much speed after all.

- Mike