daemon mode for exiftool

Started by Archive, May 12, 2010, 08:54:06 AM

Archive

[Originally posted by mjc on 2007-07-13 13:09:20-07]

Hi all,

I want to call exiftool from a C image viewer program (gThumb) to get the metadata for hundreds of images (as part of the thumbnailing process).

exiftool works extremely well, except that it takes ~0.3s per file. I think most of that time is spent initializing the program and Perl, rather than generating the metadata.

Phil, have you considered adding a "daemon mode" to exiftool, so that it could be launched once and run persistently? Has anyone experimented with that?

- Mike

Archive

[Originally posted by andyarmstrong on 2007-07-13 13:11:51-07]

How about embedding Perl in gThumb?

Failing that I'm sure an exif extraction server could be written in Perl.

Archive

[Originally posted by exiftool on 2007-07-13 13:19:41-07]

Hi Mike,

The exiftool script does have powerful multi-file processing abilities.
A daemon would only be useful if you want interactive processing
of multiple files.  Right now to do this you would need to write your
own script using calls to the ExifTool functions.  It wouldn't be too
difficult to set this up if you know a bit of Perl, and would definitely
avoid the start-up cost of loading Perl, ExifTool and all the associated
libraries.  But in the end I'm not sure how much this will speed things
up.  You could run some tests on your system by processing a large
number of files in a single directory to see if the speed benefits
will be worth it.

Exiftool used to be a lot quicker, but for each new piece of information
that it extracts, it slows down just a little bit more.  And now the
amount of information extracted from some images is really crazy.

- Phil

Archive

[Originally posted by mjc on 2007-07-13 13:22:40-07]

Sure, embedding Perl in gThumb is an option, but it seems a bit hackish and fraught with peril (i.e., packaging and maintenance problems).

A Perl server calling the exiftool libs would work, but would duplicate a lot of code in the CLI tool, causing maintenance/tracking pain. But at least the C and Perl would be cleanly separated.

Adding it into the standard exiftool CLI tool seems to be the most elegant (add a --daemon option). Maybe it introduces cross-platform issues, I don't know. It would let other similar programs call exiftool in a speedy manner, though.

-Mike

Archive

[Originally posted by exiftool on 2007-07-13 13:35:41-07]

Hi Mike,

Ah, so you are using the stand-alone Windows version.  Right.

Try running the tests I suggested, and if you see enough speed benefit then
I will look into adding a CLI option for you.

- Phil

Archive

[Originally posted by mjc on 2007-07-13 14:00:13-07]

Using a directory of mixed files (jpgs, tiffs, RAWs, other), I get these benchmarks:

ls -1 | xargs -n1 exiftool -S -a -e -G1

~ 35 seconds

ls -1 | xargs -n100 exiftool -S -a -e -G1

~ 13 seconds

Calling exiftool once per file is 2-3 times slower than passing a list of 100 files. So the initialization overhead is pretty severe.

-Mike
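Editor's note: the two xargs runs above can be reproduced from a script. The following is a minimal sketch in Python, not anything from the thread; the function name `run_in_batches` is made up for illustration. It launches one process per batch of files, which is what `xargs -n1` and `xargs -n100` do, and reports the elapsed time and the number of launches.

```python
import subprocess
import time


def run_in_batches(cmd_prefix, files, batch_size):
    """Run cmd_prefix on files in groups of batch_size.

    Returns (elapsed_seconds, number_of_process_launches).
    """
    calls = 0
    start = time.perf_counter()
    for i in range(0, len(files), batch_size):
        batch = files[i:i + batch_size]
        # One process launch per batch, like `xargs -n<batch_size>`.
        subprocess.run(cmd_prefix + batch,
                       stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL,
                       check=False)
        calls += 1
    return time.perf_counter() - start, calls
```

Assuming exiftool is on the PATH, calling `run_in_batches(["exiftool", "-S", "-a", "-e", "-G1"], files, 1)` and then again with a batch size of 100 should give figures comparable to the ones Mike quotes, though the numbers will depend on the machine and the mix of files.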

Archive

[Originally posted by exiftool on 2007-07-13 14:32:23-07]

Hi Mike,

Thanks.  I'll let you know what I come up with.
(it may be a few days as I'm quite busy at the
moment.)

- Phil

Archive

[Originally posted by mjc on 2007-07-13 14:47:13-07]

Hmm, I did another benchmark on a more typical folder of jpgs and avis from a digital camera (217 files, 2007-06 folder) and the benchmarks were:

one at a time: 1m 30s

batch mode: 1m 10s

which suggests there is not that much to be gained by a daemon mode. I'm a little puzzled why this folder shows much less improvement than the other one, which had a strange mix of raw/tiff/jpg/pdf/other files... I guess more research is needed before bothering with daemons.

-Mike

Archive

[Originally posted by mjc on 2007-07-13 14:54:55-07]

One last insight... the two benchmarks may differ because of the processing speed with unknown file types.

For instance, if I run exiftool one-at-a-time on 47 *.eml (Thunderbird email) files, it takes 50 seconds. In batch mode, it takes 8 seconds!

Food for thought...

-Mike

Archive

[Originally posted by exiftool on 2007-07-13 15:03:53-07]

I should have thought of this.

When exiftool finds an unknown file, it must load all the modules one-by-one
until it discovers what the file format is.  And if it isn't a supported format,
it has to load ALL modules.  This gives you the highest possible initial overhead.

- Phil

Archive

[Originally posted by exiftool on 2007-07-13 15:13:04-07]

(of course, exiftool avoids this overhead by only processing recognized
file extensions in a directory unless you specifically tell it to process
another type of file.)

Archive

[Originally posted by mjc on 2007-07-13 15:26:47-07]

Hmm. Interesting. gThumb already knows the mime type of the file when it wants metadata. Could the known mime type be supplied to exiftool as a parameter to speed up processing (ideally exiftool would quickly quit if it didn't recognize the specified mime type)?

I could add an array of mime types that exiftool is known to support, but then I'd have to manually update the program each time new file support was added.

We'd have to agree on mime type names for RAW files - I'm not sure if they are all standardized (some are).

- Mike

Archive

[Originally posted by exiftool on 2007-07-13 15:45:01-07]

You can get a list of supported extensions by typing

Code:
exiftool -listf

I don't give a list of MIME types, but there would be problems with
doing this:

ExifTool reports all RAW image formats as "image/x-raw" MIME type.  If there
are accepted MIME types that are more specific than this, perhaps I
should be using them...

- Phil
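Editor's note: a caller could use the `-listf` output to skip unsupported files without hard-coding a list. The sketch below is in Python and rests on an assumption about the output layout: a heading line ending in a colon, followed by lines of whitespace-separated extensions. The function names are made up for illustration; check the layout against your exiftool version before relying on it.

```python
import os
import subprocess


def parse_listf(text):
    """Return the set of upper-case extensions found in `exiftool -listf` output."""
    exts = set()
    for line in text.splitlines():
        line = line.strip()
        # Assumption: heading lines end with a colon; every other non-empty
        # line is a run of whitespace-separated extensions.
        if not line or line.endswith(":"):
            continue
        exts.update(word.upper() for word in line.split())
    return exts


def supported_extensions(exiftool="exiftool"):
    """Ask the installed exiftool for its extension list (one process launch)."""
    out = subprocess.run([exiftool, "-listf"], capture_output=True,
                         text=True, check=True).stdout
    return parse_listf(out)


def filter_supported(paths, exts):
    """Keep only the paths whose extension (case-insensitive) is in exts."""
    keep = []
    for p in paths:
        ext = os.path.splitext(p)[1].lstrip(".").upper()
        if ext in exts:
            keep.append(p)
    return keep
```

Calling `supported_extensions()` once at start-up and passing the result to `filter_supported` keeps the list in step with whatever exiftool version is installed, at the cost of one extra launch.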

Archive

[Originally posted by mjc on 2007-07-13 18:08:10-07]

Thanks for all the comments, Phil!

I can gain speed by being more careful about not feeding exiftool unsupported formats. The only formats that gThumb supports that exiftool doesn't are the high dynamic range types (OpenEXR and Radiance rgbe - any plans for those? I don't even know if they carry metadata...).

The daemon idea probably wouldn't gain much speed after all.

- Mike

Archive

[Originally posted by exiftool on 2007-07-13 19:21:59-07]

Hi Mike,

Great.  Glad this helps.

I've never heard of those formats, and haven't had any requests to support
them.  But unless they contain metadata, there isn't much reason to do
it (except maybe to extract the image dimensions or something like that).

- Phil

PH Edit: Read support was added for OpenEXR and Radiance images in ExifTool 8.73 on 2011-12-16.

Archive

[Originally posted by exiftool on 2007-07-13 19:36:28-07]

You got me thinking so I ran a quick test on my system here.
I timed the following two commands:

Code:
exiftool -listf
exiftool -listg

The way ExifTool is implemented, the first command doesn't need to load
any of the format-specific modules, and it takes 0.100 sec on my
system.  The second command loads all modules to determine
the full group list, and takes 0.677 sec here.  The difference is mainly
due to the time required to load all the modules, but there is a bit
more CPU work done by -listg, so I hacked the code to remove this
and the time dropped to 0.622 sec.  So on my system (a 1.83 GHz Intel
Core Duo), the time to load all modules is 0.522 seconds.  That's pretty
hefty. (...and you want me to add more?... hehe)

- Phil

Archive

[Originally posted by mjc on 2007-07-13 19:50:08-07]

OpenEXR does have metadata, but I don't know anyone who uses the format. So it's just a curiosity...

- Mike

Archive

[Originally posted by metadatacrucher on 2009-01-23 21:35:16-08]

I do see the point of an ExifTool daemon. Imagine a non-Perl web application that wants to extract metadata from uploaded pictures. Currently you have to call/exec exiftool for every uploaded picture, which adds overhead, especially if you want to redirect the user to the results right after the upload. A daemon could improve performance in this case.

Can we expect any progress on this issue in the near future?

Archive

[Originally posted by exiftool on 2009-01-24 00:14:06-08]

The short answer is:  No

Phil Harvey

Update: 2010-10-30 - ExifTool 8.36 implemented a -stay_open option which effectively gives this "daemon mode" functionality.

- Phil

Edit: As of 2013-12-01 a C++ Interface for ExifTool is available.  It puts an object-oriented C++ interface around the ExifTool -stay_open feature.
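
Editor's note: for callers in other languages, the -stay_open feature can be driven over a pipe. The following Python sketch is an illustration, not Phil's code; the class name `ExifToolSession` and the helper names are made up. It assumes the documented behaviour: exiftool started with `-stay_open True -@ -` reads one argument per line from stdin, runs the command when it sees `-execute`, and prints `{ready}` when the output is complete.

```python
import subprocess

READY = "{ready}"


def build_request(args):
    """Format one command: one argument per line, terminated by -execute."""
    return "".join(f"{a}\n" for a in args) + "-execute\n"


def read_until_ready(stream):
    """Collect lines from stream up to (not including) the {ready} marker."""
    lines = []
    for line in iter(stream.readline, ""):
        if line.strip() == READY:
            break
        lines.append(line)
    return "".join(lines)


class ExifToolSession:
    """Hypothetical wrapper around one long-lived exiftool process."""

    def __init__(self, exiftool="exiftool"):
        # Perl and ExifTool are loaded once here, not once per file.
        self.proc = subprocess.Popen(
            [exiftool, "-stay_open", "True", "-@", "-"],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)

    def run(self, *args):
        """Send one command and return everything printed before {ready}."""
        self.proc.stdin.write(build_request(args))
        self.proc.stdin.flush()
        return read_until_ready(self.proc.stdout)

    def close(self):
        """Tell exiftool to stop reading arguments and wait for it to exit."""
        self.proc.stdin.write("-stay_open\nFalse\n")
        self.proc.stdin.flush()
        self.proc.wait()
```

With exiftool installed, `session = ExifToolSession()` followed by repeated `session.run("-S", "-a", "-e", "-G1", path)` calls and a final `session.close()` would pay the start-up cost once. Error output is not captured in this sketch.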