ExifTool Forum

ExifTool => Newbies => Topic started by: Iwonder on June 15, 2024, 06:43:23 AM

Title: -execute and 2 different format files and 2 output files ?
Post by: Iwonder on June 15, 2024, 06:43:23 AM
Hello,
is it possible to run exiftool  only once with two format files and one output for each ?
instead of this :
exiftool -T -L -p print01_fmt.txt *.jpg > output01.txt
exiftool -T -L -p print02_fmt.txt *.jpg > output02.txt
Title: Re: -execute and 2 different format files and 2 output files ?
Post by: StarGeek on June 15, 2024, 12:01:38 PM
Using the -execute option (https://exiftool.org/exiftool_pod.html#execute-NUM) on only two commands like this won't save much time. Exiftool will still have to process all the files twice. The time saved over running two separate commands will be less than a second.

The main use of -execute is to keep exiftool running when you have a lot of individual commands to run on a lot of different files.

Otherwise, the command to combine them might be something like this.  It would require the -w (-TextOut) option (https://exiftool.org/exiftool_pod.html#w-EXT-or-FMT--textOut) instead of the > file redirection, because redirection is handled by the command line, not exiftool, and will not work with two separate files.
exiftool -p print01_fmt.txt -w+ output01.txt -execute -p print02_fmt.txt -w+ output02.txt -common_args -T -L *.jpg

Note I have not tested this, so there might be an error.
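
A minimal sketch of why > cannot be shared between the two halves of an -execute command, assuming a POSIX shell: the shell removes the redirection before the program starts, so the program only ever has one output stream. show_args is a hypothetical stand-in for exiftool that just prints the arguments it receives; the format file names are the ones from the question.

```shell
#!/bin/sh
# Hypothetical stand-in for exiftool: print each received argument on its own line.
show_args() { printf '%s\n' "$@"; }

out=$(mktemp)

# The shell consumes "> $out" itself; the program never sees it as an argument.
show_args -p print01_fmt.txt -execute -p print02_fmt.txt > "$out"

received=$(cat "$out")
arg_count=$(grep -c '' "$out")
printf '%s\n' "$received"
echo "arguments seen by the program: $arg_count"
rm -f "$out"
```

Everything printed ends up in the one redirected file, which is why an option handled inside exiftool (such as -w) is needed to get a separate output per -execute section.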
Title: Re: -execute and 2 different format files and 2 output files ?
Post by: Iwonder on June 20, 2024, 01:00:59 AM
Thank you StarGeek
With each post I know a little more how Exiftool works.
I noticed earlier, on the forum, command lines without the > redirection, but I didn't understand them  :)
Talking about time :
My friend runs a command to extract tags to a text output, acting on a folder containing about 20.000 photos, and he says it takes about 3 hours to finish the job. Is that OK, or is it too long? (I suggested that he use -fast1 and -fast2 but it didn't make it faster)
Title: Re: -execute and 2 different format files and 2 output files ?
Post by: StarGeek on June 20, 2024, 01:50:56 AM
Is it a single command, or are they running it once on each file?  If the latter, then that is Common Mistake #3 (https://exiftool.org/mistakes.html#M3).

I have a bat file that I picked up somewhere that will list the time it takes for a command to run. I just ran a basic exiftool command (FAQ #3 command) to list all the data in 28K+ files and the result was this, with the last line being the output from the bat.
   36 directories scanned
28383 image files read
command took 0:18:8.37 (1088.37s total)

So to me, it sounds like they're running it once per file. In your case, you're only running two commands, which is why I say it's not worth it to make a complex command that uses execute. But running exiftool 20k+ times, the startup time adds up.

Post from Phil (https://exiftool.org/forum/index.php?msg=6121) on the subject from the early days of this forum. And the blog post mentioned (https://web.archive.org/web/20120223091305/http://www.christian-etter.de/?p=458) via Archive.org
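
A minimal sketch of the difference behind Common Mistake #3, assuming a POSIX shell. The tool function is a hypothetical stand-in for exiftool that only logs each launch, and the three file names are placeholders; the point is just to count program start-ups for each pattern.

```shell
#!/bin/sh
# Hypothetical stand-in for exiftool: log one line per launch instead of reading files.
log=$(mktemp)
tool() { echo "launched with $# file(s)" >> "$log"; }

# Three placeholder file names; no real files are needed for the count.
set -- a.jpg b.jpg c.jpg

# Pattern 1: one launch per file (the slow pattern).
for f in "$@"; do
    tool "$f"
done
per_file_launches=$(grep -c '' "$log")

# Pattern 2: one launch for the whole list.
: > "$log"
tool "$@"
single_launches=$(grep -c '' "$log")

echo "per-file loop: $per_file_launches launches, single command: $single_launches launch"
rm -f "$log"
```

With 20k+ files, the first pattern pays the start-up cost 20k+ times, while the second pays it once.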
Title: Re: -execute and 2 different format files and 2 output files ?
Post by: Iwonder on June 23, 2024, 04:19:47 AM
@StarGeek
Thanks for your answer !

I was not talking about running those 2 commands on 20.000 files. We only have to run them on a few hundred files, and that is OK and fast enough.

It was about another exiftool command (executing only once), extracting less than 10 tags, with a format file, from 20.000 photos, to a text file.
I attached batch and format files, if you want to have a look.
Does three hours seem too long to you ?


(Sorry, Phil's post is too complicated for me to understand, but it seems to concern -execute)
Title: Re: -execute and 2 different format files and 2 output files ?
Post by: StarGeek on June 23, 2024, 10:56:12 AM
Is this on a local drive or over a network? Because that will affect the speed.

When editing, Exiftool should take only a little more time than it would take to copy all the data. Just listing data, as your bat file does, should take even less time, as shown by my 18 minute result. So a 3 hour result indicates to me that this is over a network, as my drives are all slow (5400 rpm, WD Blues/Seagate BarraCudas) and took significantly less time in my test.

I'm setting up a test directory with the 28k files and copying them with Teracopy with verification on (so it will reread each file after the copy). That will take less than an hour. Then I'll set up the files with random data in the tags used in your format file and time that. Finally, I'll run your bat file and time that.  I'll let you know the results when done.
Title: Re: -execute and 2 different format files and 2 output files ?
Post by: StarGeek on June 24, 2024, 10:56:21 AM
Ok, I stand corrected. Using your FMT file, it took most of the day to run the single command on my test setup. I have slow drives, so that didn't help.

I'm not sure why it took so long. Maybe the large FMT file. I didn't time individual points but it seems the first 1,000 didn't take too long, but it seemed to slow down as time went on.
Title: Re: -execute and 2 different format files and 2 output files ?
Post by: Phil Harvey on June 24, 2024, 11:25:16 AM
If it got slower as more files are processed, then memory is likely an issue because the Perl memory garbage cleanup is fairly time consuming.  Optimizing memory usage should help.  You can do this by adding -api ignoretags=all to the command so only the tags in the .fmt file are extracted.  From the last paragraph in the -p option documentation:

            Note that the API RequestTags option is automatically set for all
            tags used in the FMTFILE or STR.  This allows all other tags to be
            ignored using -API IgnoreTags=all, resulting in reduced memory
            usage and increased speed.

Also, the -fast1 option is doing nothing since you are also specifying -fast2.

- Phil
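
A minimal sketch of how the suggested option could be kept with the other shared options, assuming a POSIX shell. It uses an exiftool argument file, which holds one argument per line (so -api and its value go on separate lines). The name common.args is hypothetical; print03_fmt.txt and tout01.txt are the file names that appear later in this thread.

```shell
#!/bin/sh
# Write the shared options to an argument file, one argument per line.
# "common.args" is a hypothetical name chosen for this sketch.
argfile=common.args
cat > "$argfile" <<'EOF'
-api
ignoretags=all
-fast2
-p
print03_fmt.txt
EOF

# Run only if exiftool and the format file are actually available here.
if command -v exiftool >/dev/null 2>&1 && [ -f print03_fmt.txt ]; then
    exiftool -@ "$argfile" -ext jpg -r . > tout01.txt
fi
```

-fast1 is left out because, as noted above, it does nothing when -fast2 is also given.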
Title: Re: -execute and 2 different format files and 2 output files ?
Post by: StarGeek on June 24, 2024, 12:32:09 PM
I was going to check with the -api IgnoreTags option (https://exiftool.org/ExifTool.html#IgnoreTags) when I got a chance, but I didn't want to possibly raise expectations too much beforehand.
Title: Re: -execute and 2 different format files and 2 output files ?
Post by: Iwonder on June 27, 2024, 12:39:55 PM
Hello
@StarGeek, @Phil
Quote: it seems the first 1,000 didn't take too long, but it seemed to slow down as time went on
Yes ! My friend noticed that too !
It's running on a local PC

Thank you for your api option Phil and StarGeek for your prompt answer and your test !
I'll give you feedback about this !
Title: Re: -execute and 2 different format files and 2 output files ?
Post by: StarGeek on June 27, 2024, 12:53:46 PM
I'm assuming that you have some ported Linux programs installed due to the use of sed/sort/uniq.  You might check to see if you have the split program. You could then dump all the filenames to process into a text file, use split to separate that into, say, 1,000 line batches, and then use the -@ (Argfile) option (https://exiftool.org/exiftool_pod.html#ARGFILE) to process each batch:
exiftool -T -L -m -progress -ext jpg -r -sep ## -p print03_fmt.txt -@ Splitfile1.txt >> tout01.txt
Here I used >> to append the redirected text instead of overwriting it.
Title: Re: -execute and 2 different format files and 2 output files ?
Post by: Iwonder on June 28, 2024, 10:09:50 AM
1/@StarGeek :
Sorry but I don't understand what Splitfile1 should contain. I don't know yet how the split unix command works.
And I don't see where you are using it...

2/ @StarGeek @Phil
testing the -api ignoretags=all :

3h30min to complete the 20.000 files :

After running 1 hour : 12 000 files were treated
After 13.000 files processed, processing is severely slower
After 14.000 files processed, about one file/second is processed
After 16.000 files processed, slowing-down is higher

2.5 hours to process 18.000 files
1 more hour to process the last 3.000 files

Then it's not better  :(



Title: Re: -execute and 2 different format files and 2 output files ?
Post by: StarGeek on June 28, 2024, 01:38:45 PM
Quote from: Iwonder on June 28, 2024, 10:09:50 AM
1/@StarGeek :
Sorry but I don't understand what Splitfile1 should contain. I don't know yet how the split unix command works.
And I don't see where you are using it...

I didn't give an actual example, it was just an idea. I don't know the actual commands well enough to give an exact process (actually, using ChatGPT, I figured this out). But my basic thought would be

1. Use find (unix version) to get a text file with all the filenames. ChatGPT came up with this
find /path/to/files/ -type f \( -iname "*.jpg" -o -iname "*.jpeg" \) >temp.txt
This took less than a second. Exiftool was significantly slower for just listing all the files.

I'm not sure what the proper command for the Windows version of find would be. And Windows will use that by default.

2. Use split to split the output into 1,000 line batches. Again, ChatGPT and some other searches
split --additional-suffix=.txt -d temp.txt OutputList
This requires split  8.16+.  I'm using the MSys2 ports (https://www.msys2.org/) and that has version 8.32

I was able to combine them with a pipe
find /path/to/files/ -type f \( -iname "*.jpg" -o -iname "*.jpeg" \) | split --additional-suffix=.txt -d - OutputList
This resulted in 29 text files, OutputList00.txt to OutputList28.txt

3. Next, run exiftool on each of these output files. I really dislike trying to figure out Windows BAT file looping, and since I'm already using the Linux find command, this can be done with the -exec option

find . -maxdepth 1 -type f -name "OutputList*.txt" -exec exiftool -T -L -m -progress -ext jpg -r -sep ## -p print03_fmt.txt -@ {} ; >> tout01.txt

I used -maxdepth 1 to prevent find from looking for more OutputList files in the subdirectories. Probably not needed, but might as well be safe about it.

I just ran this sequence and it took 42 min, 42 seconds for 28.3K files

After adding some more pipes, combining the sed commands, and using sort's -u instead of uniq, here is the BAT file I ended up with:

find ./speedtest/ -type f \( -iname "*.jpg" -o -iname "*.jpeg" \) | split --additional-suffix=.txt -d - OutputList
find . -maxdepth 1 -type f -name "OutputList*.txt" -exec exiftool -T -L -m -progress -ext jpg -r -sep ## -p print03_fmt.txt -@ {} ; >> tout01.txt
sed -e "/^-/d" -e "s/\#\#/, /g"  tout01.txt | sort -u >tout05.txt
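
A minimal sketch of the same three steps as functions for a POSIX shell rather than a Windows BAT file. The function names, the batch_ prefix, and the paths in the example call are all hypothetical. In a POSIX shell the ## separator has to be quoted, because an unquoted word starting with # begins a comment.

```shell
#!/bin/sh
# make_batches DIR WORKDIR [SIZE]: list the JPEGs under DIR and cut the list
# into SIZE-line files named WORKDIR/batch_aa, WORKDIR/batch_ab, ...
make_batches() {
    find "$1" -type f \( -iname '*.jpg' -o -iname '*.jpeg' \) |
        split -l "${3:-1000}" - "$2/batch_"
}

# run_batches WORKDIR FMTFILE OUTFILE: one exiftool run per batch list.
# -@ makes exiftool read the file names from the list, one per line.
run_batches() {
    for list in "$1"/batch_*; do
        [ -e "$list" ] || continue   # nothing to do if no files were found
        exiftool -T -L -m -sep '##' -p "$2" -@ "$list" >> "$3"
    done
}

# Example call with placeholder paths (requires exiftool to be installed):
#   work=$(mktemp -d)
#   make_batches ./speedtest "$work"
#   run_batches "$work" print03_fmt.txt tout01.txt
```

The shell loop replaces the second find with -exec; each pass still starts exiftool once per 1,000-file list, which is the part that made the difference in the timings above.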
Title: Re: -execute and 2 different format files and 2 output files ?
Post by: Iwonder on June 28, 2024, 03:46:45 PM
waouh !!! this sounds great !
I'll have a try !

(I didn't think about ChatGPT, because I was very very very disappointed in the past with a specific question about a .dotm file : it never succeeded...)
Title: Re: -execute and 2 different format files and 2 output files ?
Post by: StarGeek on June 28, 2024, 04:55:15 PM
Yeah, ChatGPT is hit or miss, and you often have to test and double check. But for basic Linux commands, it does pretty well.  And I've had good results with ffmpeg commands as well.

Actually, it also did well creating a simple GreaseMonkey script for me, as well as a simple AutoHotkey script.
Title: Re: -execute and 2 different format files and 2 output files ?
Post by: Iwonder on July 03, 2024, 11:32:46 AM
@StarGeek
Hello
coming back with the results !
Although I didn't manage to run all this in only 3 lines in my environment (but that's not important), the whole process only lasted... 15 minutes instead of 3.5 hours !!!!!
This is amazing !!!
Thank you so much !
One more question about this command line : Could you explain this -@ {} ?
In the help file I can only see that -@ is used for introducing an argument file, but there is no ARG file here...

Title: Re: -execute and 2 different format files and 2 output files ?
Post by: StarGeek on July 03, 2024, 11:51:38 AM
Hmmm... looking at it now, I think that I made a mistake in adding the -@. For some reason, I was thinking that find would be piping the data. The -@ option can read data from a pipe, redirection, or STDIN, but that would have to be -@ - and I think find is directly providing the file list, which is inserted by find at the {}.

As a result, I think your processed list is one file short, as the -@ option would try to read the first file as an ARGS file.  Try running it again and dropping the -@.


Never mind, see below
Title: Re: -execute and 2 different format files and 2 output files ?
Post by: Iwonder on July 03, 2024, 01:24:35 PM
in fact I tried omitting -@ {}
but it doesn't work, giving me a help page for ExifTool instead
but I don't know how to comment this
Title: Re: -execute and 2 different format files and 2 output files ?
Post by: StarGeek on July 03, 2024, 02:01:55 PM
Yeah, it's needed.

Find is looking for the 1,000 line file list text files and exiftool is using -@ to read each of those for the list of files to process. I was taking a nap and remembering this woke me up to come make this post :D
Title: Re: -execute and 2 different format files and 2 output files ?
Post by: Iwonder on July 03, 2024, 02:27:32 PM
thank you !
Hope you went back to your bed, with a rested mind ^^
Title: Re: -execute and 2 different format files and 2 output files ?
Post by: Phil Harvey on July 03, 2024, 03:32:55 PM
Yes, -@ shouldn't be there.

Also, your "find" is looking for .txt files, but your exiftool command processes only .jpg files.

And I needed quotes around the semicolon because "find" needs it to terminate the arguments, and without quotes it was eaten by the shell.

Other than that, the command worked for me.

- Phil
Title: Re: -execute and 2 different format files and 2 output files ?
Post by: StarGeek on July 03, 2024, 04:51:19 PM
No, -@ does need to be there. Find isn't feeding file names directly to exiftool. It's giving exiftool text files with 1,000 filepaths per text file.

Back in this post (https://exiftool.org/forum/index.php?msg=86897), I used find to gather all the filenames to be processed. It takes exiftool several minutes to generate a list for 28,000+ files, even with -fast5, while find was able to generate the list in a couple seconds.

Then split is used to split the results of that find into separate files with 1,000 filepaths per file named "OutputList##.txt"

find is used again to gather the names of each of these "OutputList##.txt" files and that is what is passed to exiftool with the -@, running exiftool once per "OutputList##.txt" file.

I don't know where the slow down is, maybe memory management like you said, but my 28K test directory was running for over 6 hours before I shut it down, while running it in 1,000 file batches only took 42 minutes.  And @Iwonder says splitting like this takes only 15 minutes instead of 3.5 hours.
Title: Re: -execute and 2 different format files and 2 output files ?
Post by: Phil Harvey on July 03, 2024, 09:25:19 PM
Quote from: StarGeek on July 03, 2024, 04:51:19 PM
No, -@ does need to be there. Find isn't feeding file names directly to exiftool. It's giving exiftool text files with 1,000 filepaths per text file.

Ah, sorry.  I missed that.  I didn't read the whole thread.

Quote: @Iwonder says splitting like this takes only 15 minutes instead of 3.5 hours.

When I get a chance I should look into this.

- Phil
Title: Re: -execute and 2 different format files and 2 output files ?
Post by: StarGeek on July 03, 2024, 10:38:37 PM
Quote from: Phil Harvey on July 03, 2024, 09:25:19 PM
When I get a chance I should look into this.

My first thought is that it's a problem with the Windows version, but I just realized I'm not sure what OS @Iwonder is using. For some reason I was thinking Windows, but Linux commands are listed.
Title: Re: -execute and 2 different format files and 2 output files ?
Post by: Iwonder on July 04, 2024, 09:44:48 AM
@all

Yes I'm using Windows 10, with some .exe coming from UnixUtils for Windows :)
Title: Re: -execute and 2 different format files and 2 output files ?
Post by: StarGeek on July 04, 2024, 11:04:17 AM
Regarding UnixUtils, you might want to take a look at MSYS2 (https://www.msys2.org/). MSYS2 versions are more up to date than UnixUtils.
Title: Re: -execute and 2 different format files and 2 output files ?
Post by: Iwonder on July 06, 2024, 04:09:52 AM
@StarGeek
thank you for this suggestion.
If I had known this at the beginning of this project, I surely would have used it.
have a nice day !  8)