Help with speeding up ExifTool

Started by mball, May 16, 2012, 10:18:38 AM


mball

I am new to ExifTool and trying to get the fastest response from within a C# application. I only want to extract the creation date and subject tags from a file. I have used the following code, but I am sure there must be a way of optimizing this. I hope someone can help.

This is the command line I am using: "exiftool.exe -fast2 -G -t -m -q FILENAME". I am using a class, exiftoolwrapper, to do the work.

Then I use this code to get the tags. Is there a better way of extracting just the XMP group and Subject tag, and if so, would that increase speed?

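    // Scan the tags returned by the wrapper for the XMP Subject list.
    foreach (ExifTagItem tag in exiftool.AsEnumerable())
    {
        if (tag.group == "XMP" && tag.name == "Subject")
        {
            // ExifTool joins list items with ", " by default.
            keytags = tag.value.Split(new[] { ", " }, StringSplitOptions.None);
            break;
        }
    }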

Phil Harvey

The biggest gains (roughly 60x) would come if you can pre-launch exiftool.exe with the -stay_open option instead of executing exiftool for each file.

- Phil

...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

mball

Yes, I saw that in the documentation, but I couldn't work out how to do that.

BogdanH

Hi,
I don't know C# at all, but maybe I can still give you a hint...
Not knowing how "exiftoolwrapper" actually works, I assume ExifTool is called once for each file from which you wish to get metadata. I assume that because otherwise you wouldn't have mentioned slow speed. If I am correct, then in terms of speed, that's the worst way of using ExifTool from within your application.
It's hard to say what the best solution is for you without knowing more about your application (and how important a part ExifTool plays in it). But in general, there are a few ways to increase speed:

1. When ExifTool is called to return metadata, specify which tag values you need, e.g.:
ExifTool -Exif:CreateDate -Xmp:Subject FILENAME
If tags aren't specified, ExifTool extracts all metadata from each file (which takes some time).

2. Try not to call ExifTool once per file; specify several files in one call instead, e.g.:
ExifTool -Exif:CreateDate -Xmp:Subject FILENAME_1 FILENAME_2 ...
or a group of files:
ExifTool -Exif:CreateDate -Xmp:Subject *.JPG
That way, ExifTool is called only once (for all specified files), which is a big gain in speed.
Of course, you need to know how to capture the metadata values of all specified files and how to process them in your app.

3. Try using an "args" file. It is a normal text file, which can contain all ExifTool options and all filenames you wish ExifTool to process. Once that file is prepared (within your app), you call ExifTool by:
ExifTool -@ MyArgs.args
Again, however, you need to know how to read the output given by ExifTool (I don't know, maybe "exiftoolwrapper" can handle this).

4. If ExifTool is called from your app many times, then you should consider using ExifTool's -stay_open option. With this option, and if done properly, ExifTool will fly at the speed of light. Just think of this option as if ExifTool were a DLL (if you're using Windows).
There are two ways to use the -stay_open option:
A - from within args files. That's the easier way, once you figure out how to fiddle with args files.
B - you communicate with ExifTool "directly" via StdIn/StdOut pipes. That would require modifying your exiftoolwrapper, I assume (or even rewriting it from scratch).
I haven't tested it, but I assume the speed difference between A and B is very small. Anyway, I prefer option B.

Bogdan
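
To illustrate the point about capturing the output of a single multi-file call (points 2 and 3 above), here is a minimal C# sketch. It assumes the call uses -G -t without -q, so that ExifTool prints a "======== FILENAME" header before each file's tags when more than one file is processed, and that each tag line has the form Group<TAB>Tag name<TAB>Value. The class ExifOutputParser and its method are hypothetical names made up for this example; they are not part of ExifTool or of exiftoolwrapper.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper (illustration only, not an ExifTool or wrapper API).
public static class ExifOutputParser
{
    // Parses the text printed by a call such as
    //   exiftool -G -t -Exif:CreateDate -Xmp:Subject a.jpg b.jpg
    // into a map: file name -> list of { group, tag name, value }.
    public static Dictionary<string, List<string[]>> Parse(string output)
    {
        const string header = "======== ";   // assumed per-file header prefix
        var result = new Dictionary<string, List<string[]>>();
        string current = "";                 // key used if no header was seen (single file)

        using (var reader = new StringReader(output))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (line.StartsWith(header, StringComparison.Ordinal))
                {
                    // A new file starts here; remember it even if it has no tags.
                    current = line.Substring(header.Length);
                    if (!result.ContainsKey(current))
                        result[current] = new List<string[]>();
                    continue;
                }

                // Tag lines have exactly three tab-separated columns.
                string[] parts = line.Split(new[] { '\t' }, 3);
                if (parts.Length != 3)
                    continue;                // skip blank lines and summary lines

                if (!result.ContainsKey(current))
                    result[current] = new List<string[]>();
                result[current].Add(parts);
            }
        }
        return result;
    }
}
```

Each file's Subject value could then be looked up by file name and split on ", " as in the original loop.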

mball

Thanks Bogdan. I had dabbled with the -Exif flag, but it didn't seem to help. My app builds a list of pictures from file; the pictures could be JPG but could also be a specific camera raw file. For raw files I use DCRAW to convert the file to a JPG, but before I do that I extract the date and subject tags from the RAW file.

It's not exceedingly slow, but I am sure it could be better. I will look into using the -stay_open flag and an args file.


mball

Sorry, could you just give me a quick summary of how to use -stay_open? I am a bit confused as to how I can call exiftool for each of my pictures. At the moment I have a loop in my program that reads through the files one at a time and calls exiftool, but presumably I won't be able to do that with -stay_open.

Phil Harvey

Please let me know if I can clarify the exiftool documentation:

Quote:

1) Execute "exiftool -stay_open True -@ ARGFILE", where ARGFILE is the name of an existing (possibly empty) argument file or "-" to pipe arguments from the standard input.

2) Write exiftool command-line arguments to ARGFILE, one argument per line (see the -@ option for details).

3) Write "-execute\n" to ARGFILE, where "\n" represents a newline sequence.  (Note: You may need to flush your write buffers here if using buffered output.)  Exiftool will then execute the command with the arguments received up to this point, send a "{ready}" message to stdout when done (unless the -q option is used), and continue trying to read arguments for the next command from ARGFILE.  To aid in command/response synchronization, any number appended to the "-execute" option is echoed in the "{ready}" message.  For example, "-execute613" results in "{ready613}".

4) Repeat steps 2 and 3 for each command.

5) Write "-stay_open\nFalse\n" to ARGFILE when done.  This will cause exiftool to process any remaining command-line arguments then exit normally.

- Phil

mball

Thanks, I had read that, but I am still baffled by the arg file.
Do I create the arg file while looping through my pictures, adding the args and filename to it, and then call exiftool?

Sorry, but I am not used to dealing with command-line stuff.

Phil Harvey

To take your specific example and combine all of your common arguments into the initial command, you could do this:

1) create an empty argument file /temp/my.args

2) execute this command from your application (in another thread so it stays running):

exiftool -stay_open True -@ /temp/my.args -common_args -fast2 -G -t -m -q

3) Write the following to /temp/my.args for each file you want processed:

FILENAME
-execute


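4) Process the stdout from exiftool normally for each file up to the "{ready}" marker. (As the documentation quoted above notes, "{ready}" is not sent when -q is used, so drop -q from the command in step 2 if you rely on the marker.)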

- Phil

P.S. Note that the -stay_open feature is only useful if you don't know the names of all files beforehand. If you do know all of the names, then simply write them all to /temp/my.args before you start, and drop the -stay_open and -execute options.
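
For readers driving the steps above from C#, the following is a rough sketch of the same idea using pipes instead of a file on disk ("-@ -" reads the arguments from standard input, as the quoted documentation describes). The class StayOpenSession and its members are hypothetical names invented for this example. The sketch deliberately leaves out -q, because the documentation states that "{ready}" is not sent when -q is used. Error handling, stderr and file name encoding are not covered.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;

// Hypothetical wrapper (illustration only): keeps one exiftool process alive
// and sends it one command per file.
public sealed class StayOpenSession
{
    private readonly TextWriter _toExifTool;    // exiftool's stdin
    private readonly TextReader _fromExifTool;  // exiftool's stdout

    public StayOpenSession(TextWriter toExifTool, TextReader fromExifTool)
    {
        _toExifTool = toExifTool;
        _fromExifTool = fromExifTool;
    }

    // Starts exiftool once; "-@ -" makes it read its arguments from stdin.
    public static StayOpenSession Start(string exePath, out Process process)
    {
        var info = new ProcessStartInfo(exePath,
            "-stay_open True -@ - -common_args -fast2 -G -t -m")
        {
            UseShellExecute = false,
            RedirectStandardInput = true,
            RedirectStandardOutput = true,
            CreateNoWindow = true
        };
        process = Process.Start(info);
        return new StayOpenSession(process.StandardInput, process.StandardOutput);
    }

    // Sends one command (one argument per line) and returns the output
    // lines printed before the "{ready}" marker.
    public List<string> Execute(params string[] args)
    {
        foreach (string arg in args)
            _toExifTool.Write(arg + "\n");
        _toExifTool.Write("-execute\n");
        _toExifTool.Flush();                    // flush so exiftool sees the command now

        var lines = new List<string>();
        string line;
        while ((line = _fromExifTool.ReadLine()) != null && line != "{ready}")
            lines.Add(line);
        return lines;
    }

    // Tells exiftool to leave the stay-open loop and exit.
    public void Close()
    {
        _toExifTool.Write("-stay_open\nFalse\n");
        _toExifTool.Flush();
    }
}
```

A loop over the pictures would call Start once, then Execute("-Exif:CreateDate", "-Xmp:Subject", fileName) for each file, and finally Close followed by process.WaitForExit().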

mball

Thanks for that, it's much clearer now. I do know all the filenames beforehand, so would I just create the args file with a filename on each line, then run exiftool with the arguments and point it to the arg file, i.e.:

exiftool -@ /temp/my.args -common_args -fast2 -G -t -m -q
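
When all file names are known up front, the approach described here could look roughly like the following in C#. This is only a sketch: ArgFileBatch and its methods are hypothetical names, and in this variant the options are written into the argument file ahead of the file names (one argument per line) rather than passed on the command line. File names containing spaces need no quoting inside an argument file, since each line is one argument.

```csharp
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;

// Hypothetical helper (illustration only): one exiftool run for a known list of files.
public static class ArgFileBatch
{
    // Writes one argument per line: the options first, then every file name.
    public static void WriteArgFile(string argFilePath,
                                    IEnumerable<string> options,
                                    IEnumerable<string> fileNames)
    {
        var lines = new List<string>(options);
        lines.AddRange(fileNames);
        File.WriteAllLines(argFilePath, lines);
    }

    // Runs exiftool once over the whole argument file and returns its stdout.
    public static string Run(string exePath, string argFilePath)
    {
        var info = new ProcessStartInfo(exePath, "-@ \"" + argFilePath + "\"")
        {
            UseShellExecute = false,
            RedirectStandardOutput = true,
            CreateNoWindow = true
        };
        using (Process process = Process.Start(info))
        {
            // Read everything before waiting, so a full pipe cannot block exiftool.
            string output = process.StandardOutput.ReadToEnd();
            process.WaitForExit();
            return output;
        }
    }
}
```

For example, WriteArgFile(path, new[] { "-fast2", "-G", "-t", "-m", "-Exif:CreateDate", "-Xmp:Subject" }, fileNames) followed by Run("exiftool.exe", path) processes every picture with a single launch of exiftool.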

BogdanH

Hi Phil,
The documentation above is good enough... I mean, -stay_open isn't easy to explain and it took a while before I "got it"  :)
I started by running ExifTool from a console window. That's the best way to understand -stay_open, because results (and mistakes) can be seen immediately. Once I understood it, I only needed to transfer that knowledge into my app, and if the results weren't as expected, I knew that only my app was responsible.
If I recall correctly, it goes like this:

1. Create an empty text file, for example MyArgs.txt, and save it.
2. Open a console window and execute:
exiftool -stay_open True -@ MyArgs.txt
...and nothing will happen.
It can't, because ExifTool has only been loaded into memory and is waiting to execute a command, which it expects to find inside the MyArgs.txt file.
3. While the console window is open, open the MyArgs.txt file and write inside:
-ver
-execute

...and re-save the file. The moment you do this, ExifTool will execute the commands in that file and write the result to the console window. ExifTool then keeps waiting for the next command(s) in the MyArgs file.
4. Open the MyArgs.txt file again and add* the following after the existing lines (for example):
-Exif:CreateDate
MyPhoto.jpg
-execute

...and re-save the file again. And again, you'll get the requested result from ExifTool. Repeat that as long as you wish: ExifTool is just waiting for your commands  :)
When you're done, open the MyArgs file for the last time and add* after the existing lines:
-stay_open
False

...and re-save the file. Now ExifTool will leave its "waiting state" and you'll get the prompt in the console window again.

That's it (I hope I didn't make any mistakes above). Now it's up to the programmer to "transfer" all that into their own application. That is, don't assume how -stay_open works; try it out in the console first.

Bogdan
* Corrected (see Phil's post below) to avoid confusion.

mball

Thanks, that is a great explanation. I will be looking into this over the next couple of days and will let you know how I get on.

Phil Harvey

Bogdan,

Just one correction: Don't delete the existing lines for each new command.  ExifTool continues reading the ARGFILE from the last location, so your new lines will get missed if you delete the earlier ones.

But it sounds like mball doesn't need the -stay_open option anyway.  In fact, he probably doesn't even need the -@ option if all of the file names will fit on a command line (there is a command-line length limitation in Windows).

- Phil
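
Phil's correction translates directly into code: when an application feeds a running -stay_open session through an argument file, each new command should be appended to the end of the file rather than replacing its contents. A minimal C# sketch, assuming exiftool's open handle on the file allows another process to append to it (worth confirming in a console test first); ArgFileAppender is a hypothetical name:

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical helper (illustration only).
public static class ArgFileAppender
{
    // Appends one command to the end of the argument file, leaving all
    // earlier lines in place, and terminates it with -execute.
    public static void AppendCommand(string argFilePath, params string[] args)
    {
        var lines = new List<string>(args) { "-execute" };
        File.AppendAllLines(argFilePath, lines);
    }
}
```

Calling AppendCommand("MyArgs.txt", "-Exif:CreateDate", "MyPhoto.jpg") adds three lines after whatever the file already contains.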

BogdanH

Thank you for correcting me, Phil. As I wrote: try it out in a console window first. I really should follow my own advice  :)

Bogdan

mball

I have got my app producing a list of file names in a text file called Exif.args.

I now want to parse each file in the same loop that creates the entry in Exif.args.
After opening the Exif.args file, I call ExifTool with "exiftool.exe Exif.args -common_args -fast2 -G -t -m -q -Exif:DateTimeOriginal -Xmp:Subject".
Then I loop through the file names and add each file name to the Exif.args file.

What command do I then use to get the data for the current filename so I can parse it?