Batch processing

Started by Skids, May 24, 2022, 07:55:11 AM

Skids

Hi,
I am developing a utility that extracts words from filenames  and uses exiftool to write the words as keywords into dng files.  The utility is for my own use and is working but it is slow, no I mean really slow, probably due to the utility starting exiftool for every dng file.  So I am investigating how to speed things up.

Reading other posts on this site it seems that there are two possible ways of achieving faster processing.  The first is to only invoke exiftool once and keep it open with the -stay_open option between calls.  The second is to use an arguments file or possibly data.

At the moment I have very little idea how to use either method but I would like confirmation that one or both methods allow different commands to be actioned on each file name as each file in the list will have different keywords assigned.  I ask because so far most of the posts describe reading data from image files whereas I want to write to them.

best wishes

Simon

StarGeek

Quote from: Skids on May 24, 2022, 07:55:11 AM
is working but it is slow, no I mean really slow, probably due to the utility starting exiftool for every dng file.

Yes, this is Common Mistake #3

Quote
The first is to only invoke exiftool once and keep it open with the -stay_open option between calls.  The second is to use an arguments file or possibly data.

I'm not sure what you mean by "possibly data", but the -stay_open option requires use of the -@ (Argfile) option.  Otherwise, you could build up the entire list of commands, write them into a text file, and then call the text file with the -@ option by itself.
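
To make the -stay_open route concrete, here is a minimal sketch of the text a utility would send. It assumes exiftool was started once with exiftool -stay_open True -@ - so that it reads arguments from STDIN. The function names emit_block and emit_shutdown and the files a.dng and b.dng are made up for illustration. Each block can carry different tags for a different file, and -execute tells exiftool to run what it has received so far.

```shell
#!/bin/sh
# Sketch only: prints the argument stream a long-running utility would feed to
#   exiftool -stay_open True -@ -
# The keywords and file names below are placeholders, not real data.

# Print one command block: one argument per line, then -execute.
# $1 = file to write to, remaining arguments = keywords for that file.
emit_block() {
    file=$1
    shift
    printf '%s\n' '-overwrite_original'
    for kw in "$@"; do
        printf '%s\n' "-xmp-dc:subject+=$kw"
    done
    printf '%s\n' "$file" '-execute'
}

# Print the two lines that tell a -stay_open exiftool process to exit.
emit_shutdown() {
    printf '%s\n' '-stay_open' 'False'
}

stream=$(
    emit_block 'a.dng' Girton SailingClub
    emit_block 'b.dng' Boats
    emit_shutdown
)
printf '%s\n' "$stream"
```

In a real utility these lines would be written to exiftool's input one block at a time as files are processed, and exiftool prints {ready} after each -execute. If the whole list is known up front, a plain -@ ArgFile without -stay_open does the same job.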

But if all you're doing is extracting words from the filename, then you should be able to do this in batch with just exiftool alone.  If you can figure out a regex that pulls the words from the filename, you could then copy that directly into the keywords tag.  If you can give an example filename, I can help create a command to do that.  Then you can just pass the directory to exiftool.  Or once that is created, you can just pass a list of all the files to process directly to exiftool, including piping the list from another command (see this post).

Quote
At the moment I have very little idea how to use either method but I would like confirmation that one or both methods allow different commands to be actioned on each file name as each file in the list will have different keywords assigned.  I ask because so far most of the posts describe reading data from image files whereas I want to write to them.

Yes, it can be done.  As an example, photools.com's IMatch runs four instances of exiftool in the background to read and write files.
"It didn't work" isn't helpful. What was the exact command used and the output.
Read FAQ #3 and use that cmd
Please use the Code button for exiftool output

Please include your OS/Exiftool version/filetype

Skids

Hi StarGeek and others,

Sorry, I should have been clearer: I am writing keywords, or rather -xmp-dc:subject+=Keyword, to dng files.

I have generated an output and can create a text file of commands, which is working with a test batch of files.  I am uncertain if I am doing things correctly, so here is an extract from my ArgsFile:
Quote
-overwrite_original
-xmp-dc:subject+=MFA
-xmp-dc:subject+=People
-xmp-dc:subject+=Girton
-xmp-dc:subject+=SailingClub
-xmp-dc:subject+=ShortTelephotoLens
/Volumes/Image_Library_SSD/Test-Data-FileTags-To-Keywords copy/Parent/2008-05-04/2008-05-04-125324-DSC-1310-MFA-People-Girton-SailingClub-NIKOND70-123mm-ShortTelephotoLens-.DNG
-execute

-overwrite_original
-xmp-dc:subject+=Boats
-xmp-dc:subject+=Girton
-xmp-dc:subject+=Places
-xmp-dc:subject+=Sailing
-xmp-dc:subject+=Transport
-xmp-dc:subject+=SailingClub
-xmp-dc:subject+=NormalTelephotoLens
/Volumes/Image_Library_SSD/Test-Data-FileTags-To-Keywords copy/Parent/2008-05-04/2008-05-04-124830-DSC-1278-Boats-Girton-SC-Places-Sailing-Transport-SailingClub-NIKOND70-158mm-NormalTelephotoLens-.DNG
-execute

-overwrite_original
-xmp-dc:subject+=Girton
-xmp-dc:subject+=SailingClub
-xmp-dc:subject+=NormalTelephotoLens
/Volumes/Image_Library_SSD/Test-Data-FileTags-To-Keywords copy/Parent/2008-05-04/2008-05-04-123211-DSC-1216-Girton-SailingClub-NIKOND70-300mm-NormalTelephotoLens-.DNG
-execute

I have saved the commands to a text file, which covers 171 dng files.  I run the batch from Apple Terminal with the following command:

exiftool -@ /Users/skids/Pictures/Test-SidecarSync/LargeBatchB.txt

It appears to work but I do have some questions.  The first is what is the difference between an option and an argument ?  I ask because I have read that the file should have a single argument per line but that options should be placed on a single line.

Each of the -execute lines causes the previous code block to execute and a result is passed back to the Terminal window.  It seems that this printing to screen slows things down so I wonder if there is an option to suppress the reporting.

Calling exiftool 171 times, once per image, took 89 seconds.  Running this batch file took approximately 14 seconds.

Lastly is there a way of passing the text of the file straight to exiftool without having to create a file ?



Thanks
Simon

StarGeek

Quote from: Skids on May 24, 2022, 11:14:47 AM
The first is what is the difference between an option and an argument ?  I ask because I have read that the file should have a single argument per line but that options should be placed on a single line.

An argument is a term used in describing the command line, any command line, not just Windows.  Anything that appears after the program name is an argument.  Arguments are separated by spaces, so quotes or escape characters (for example, backslashes on Mac/Linux) are needed to include a space in an argument.

Options are part of exiftool.  Anything listed in the Documentation is an option.  An option may require more than one argument.  For example, the -sep option requires a second argument consisting of the separator string.

So -sep ", " is one option consisting of two arguments, -sep and ", ", which need to be on separate lines in an ArgFile.  "Option" is exiftool's term, while "argument" is a term from the underlying command line.
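
As an illustration of that layout, here is a small sketch that writes an ArgFile where -sep and its value sit on separate lines. The file name example.dng and the tag value are placeholders. When writing, -sep splits the assigned string into separate list items at each occurrence of the separator, and no shell quoting is needed inside the ArgFile because the shell never sees these lines.

```shell
#!/bin/sh
# Sketch: build an ArgFile with one argument per line.
# example.dng and the keyword string are placeholders.
argfile=$(mktemp)

# Command-line equivalent:
#   exiftool -sep ", " "-xmp-dc:subject=Girton, SailingClub" example.dng
printf '%s\n' \
    '-sep' \
    ', ' \
    '-xmp-dc:subject=Girton, SailingClub' \
    'example.dng' > "$argfile"

# Show what was written: four lines, the second being a comma plus a space.
cat "$argfile"

# To run it (assumes exiftool is installed and example.dng exists):
#   exiftool -@ "$argfile"
```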

Quote
Each of the -execute lines causes the previous code block to execute and a result is passed back to the Terminal window.  It seems that this printing to screen slows things down so I wonder if there is an option to suppress the reporting.

Try the -q (quiet) option.
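
One thing to watch, shown here as a hedged sketch: each -execute block is run as its own command, so a -q placed once at the top of the ArgFile only applies to the first block. The -common_args option makes every argument after it apply to all -execute blocks, and it has to be given on the command line rather than inside the ArgFile. The helper name run_batch is made up for illustration.

```shell
#!/bin/sh
# Sketch: run an ArgFile of -execute blocks with shared options.
# $1 = path to the ArgFile (for example LargeBatchB.txt).
run_batch() {
    # Everything after -common_args is applied to every -execute block,
    # so -q and -overwrite_original need not be repeated in the file.
    exiftool -@ "$1" -common_args -q -overwrite_original
}

# Example call (assumes exiftool is installed and the file exists):
#   run_batch /Users/skids/Pictures/Test-SidecarSync/LargeBatchB.txt
```

With this, the repeated -overwrite_original lines in the ArgFile become optional.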

Quote
Lastly is there a way of passing the text of the file straight to exiftool without having to create a file ?

The "see this post" link I gave above shows you how.

Skids

Thanks for the explanation of Args and Options.

I did follow the link but I'm afraid I'm not sure what it means:
Quote
@StarGeek: Good idea.  You could do it without a temporary file, like this:

find . -iname "*.jpg" -newermt 2021-10-31 | exiftool -globaltimeshift -1 '-filemodifydate<datetimeoriginal' -alldates -@ -

So is the find command searching for all .jpg files newer than 2021-10-31?  I guess it creates a list of files which is then passed to the -@ - and is actioned ?

best wishes
Simon

StarGeek

Quote from: Skids on May 24, 2022, 12:49:05 PM
So is the find command searching for all .jpg files newer than 2021-10-31?  I guess it creates a list of files which is then passed to the -@ - and is actioned ?

Yes, exactly that.  It uses the Linux command find to create a list of files, and that list is piped directly into exiftool with -@ -.  This avoids the step of creating a temp file.  The hyphen by itself represents reading from STDIN.
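
The same pipe can carry whole command blocks, not just file names, which is one way to give each file different keywords without a temp file. The sketch below converts lines of the form path<TAB>keyword,keyword into ArgFile-style blocks. That input format and the name emit_blocks are assumptions for illustration, and keywords are assumed to contain no commas.

```shell
#!/bin/sh
# Sketch: turn "path<TAB>kw1,kw2,..." lines (a hypothetical format) into
# exiftool command blocks on standard output.
tab=$(printf '\t')

emit_blocks() {
    # Split each input line at the tab only, so paths may contain spaces.
    while IFS=$tab read -r path keywords; do
        printf '%s\n' '-overwrite_original'
        # One -xmp-dc:subject+= argument per comma-separated keyword.
        printf '%s\n' "$keywords" | tr ',' '\n' | sed 's/^/-xmp-dc:subject+=/'
        printf '%s\n' "$path" '-execute'
    done
}

# Demo with two made-up entries; nothing is written to any image here.
demo=$(printf '%s\t%s\n' \
    'My Photos/a.DNG' 'Girton,SailingClub' \
    'My Photos/b.DNG' 'Boats' | emit_blocks)
printf '%s\n' "$demo"
```

A utility that prints such lines could then be chained as your_utility | emit_blocks | exiftool -@ - where your_utility stands for whatever program produces the list.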

Using find is a common way on Linux/Mac to create a list of files and execute a command on them, but most of the time people will loop through the list and call exiftool once for each file.  The above example is much quicker.

See Piping examples for similar examples.  I regularly use the cURL example there to check metadata on files directly from websites, avoiding the step of downloading the files, saving to disk, running exiftool on it, and then deleting the file afterwards.