ExifTool Forum

ExifTool => Developers => Topic started by: sebutzu on June 17, 2025, 04:05:24 PM

Title: Performance question
Post by: sebutzu on June 17, 2025, 04:05:24 PM
I am comparing the performance of exiftool (with ImageDataHash computation) against a brute-force approach that reads the whole file and computes its MD5 (done in C#). In my case exiftool is at least 5 times slower, on multiple machines. I assume exiftool usually reads only the bytes it needs, at the cost of more disk seeks. For ImageDataHash, would it help to read the entire file into a buffer and do the rest of the processing on that buffer instead?
Also, another question: I am running this on Windows. Would it run much faster on Linux?
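
For reference, the brute-force baseline described above can be sketched roughly as follows. This is a Python stand-in for the C# code, which is not shown in the post; the function name md5_of_file and the 1 MiB chunk size are assumptions made for illustration. Note that a whole-file MD5 covers the metadata too, while ImageDataHash covers only the image data, so the two values are expected to differ.

```python
import hashlib

def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the whole file, read in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"" at EOF.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading sequentially in large chunks keeps disk seeks to a minimum, which is the access pattern the question is comparing against.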
Title: Re: Performance question
Post by: StarGeek on June 17, 2025, 05:34:28 PM
What is the command you are using? Make sure you're not looping exiftool, running it once per file. See Common Mistake #3, "Over-scripting" (https://exiftool.org/mistakes.html#M3).
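
As a rough sketch of what avoiding over-scripting looks like from a calling program: pass every file to a single exiftool process instead of starting one process per file. The helper names build_command and read_hashes are hypothetical, and the sketch assumes exiftool is on the PATH; -j and -ImageDataHash are the options that appear later in this thread.

```python
import json
import subprocess

def build_command(files, exiftool="exiftool"):
    """Build one argv that covers every file, so ExifTool starts only once."""
    return [exiftool, "-j", "-ImageDataHash", *files]

def read_hashes(files, exiftool="exiftool"):
    """Run ExifTool once and map each SourceFile to its ImageDataHash (or None)."""
    # check=False: ExifTool may exit non-zero if any single file has a problem,
    # while still printing JSON for the files it could read.
    result = subprocess.run(build_command(files, exiftool),
                            capture_output=True, encoding="utf-8", check=False)
    records = json.loads(result.stdout) if result.stdout.strip() else []
    return {r["SourceFile"]: r.get("ImageDataHash") for r in records}
```

With -j, ExifTool prints one JSON array containing one object per file, so a single call returns results for the whole batch.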
Title: Re: Performance question
Post by: Phil Harvey on June 17, 2025, 09:31:01 PM
ExifTool is extracting all the rest of the metadata as well. Adding -api ignoretags=all will help.

- Phil
Title: Re: Performance question
Post by: sebutzu on June 18, 2025, 04:45:14 PM
I use a command like:
-struct -m -q -q -charset filename=UTF8 -d "%Y.%m.%d %H:%M:%S" -c "%.8f" -a -use mwg -api largefilesupport=1 -s -P -api structformat=JSONQ -j -G:0:1 -all -ImageDataHash -n  "C:\Work\test photos\2011-12-06 11.08.37 0001.jpg" "C:\Work\test photos\2014-02-13 18.26.34 0001.mp4" "C:\Work\test photos\2022-03-05 12.16.14 0001.jpg" "C:\Work\test photos\2022-03-05 12.16.16 0001.jpg" "C:\Work\test photos\2022-03-05 12.16.18 0001.jpg" "C:\Work\test photos\2022-03-05 12.16.19 0001.jpg" "C:\Work\test photos\2022-03-05 12.16.28 0001.jpg" "C:\Work\test photos\2022-03-05 18.25.58 0001.jpg" "C:\Work\test photos\2022-03-05 18.25.59 0001.jpg" "C:\Work\test photos\2022-03-05 18.26.03 0001.jpg"

usually with 125 files or so.

I do need to extract all the other tags as well, so I don't want to use -api ignoretags=all.

I can imagine that computing the ImageDataHash is slow (because it needs almost all of the file content), but I did not expect it to be about 5 times slower than a full MD5 hash.

Am I doing something wrong here?

Does exiftool use multiple threads?
Is there a way to speed this up?
Title: Re: Performance question
Post by: Phil Harvey on June 18, 2025, 09:09:21 PM
You can run as many instances of ExifTool as you want simultaneously.

- Phil
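
One way to act on this from a calling program, sketched under assumptions: split the file list into a few batches and run one exiftool process per batch at the same time. The names split_batches, run_exiftool and run_parallel are hypothetical, the worker count of 4 is arbitrary, and exiftool is assumed to be on the PATH. Whether this helps depends on whether the run is limited by CPU or by disk.

```python
import json
import subprocess
from concurrent.futures import ThreadPoolExecutor

def split_batches(files, workers):
    """Deal files round-robin into at most `workers` non-empty batches."""
    batches = [files[i::workers] for i in range(workers)]
    return [b for b in batches if b]

def run_exiftool(batch, exiftool="exiftool"):
    """Run one ExifTool process over a batch and return its parsed JSON records."""
    result = subprocess.run([exiftool, "-j", "-ImageDataHash", *batch],
                            capture_output=True, encoding="utf-8", check=False)
    return json.loads(result.stdout) if result.stdout.strip() else []

def run_parallel(files, workers=4):
    """Run one ExifTool instance per batch concurrently and merge the records."""
    batches = split_batches(files, workers)
    if not batches:
        return []
    # Threads are enough here: each one only waits on its own child process.
    with ThreadPoolExecutor(max_workers=len(batches)) as pool:
        results = pool.map(run_exiftool, batches)
        return [record for records in results for record in records]
```

Each batch still pays ExifTool's startup cost once, so a small number of large batches is likely a better trade-off than many tiny ones.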