Adding new tags to jpegs

Started by Skippy, October 06, 2015, 06:58:38 AM


Skippy

I have been adding ImageUniqueID tags to JPEG images that do not already have them.  It seems to me that when ExifTool updates a JPEG, it reads the original EXIF data into its internal data structures, then rewrites the EXIF data from those structures.  The camera manufacturer's EXIF data block is replaced with ExifTool's version, which lacks the empty spaces that were in the original.  As the whole image file is rewritten, the writing process is quite slow.

My question is: are there any (fixed-length?) tags that can be written without rewriting the entire file?  I am not sure how file systems work in fine detail, but database files can certainly be edited a little bit at a time without the need to process the whole file.

I ask the question as I am looking for performance gains. 

Phil Harvey

#1
Yes, ExifTool does rewrite the metadata of images, but this isn't the reason why it is slow.

On my system, ExifTool writes images at the same speed that I can copy them using the system itself.

So given that ExifTool rewrites the entire file, the performance is limited only by the system I/O bandwidth.  And if you are adding a tag, you must (in general) rewrite the entire file.  (The only way around this would be to utilize some of the empty space you mention, but in most cases there wouldn't be enough empty space to store an ImageUniqueID value, so this is not a viable solution.)

This is assuming that you avoid the startup overhead of launching ExifTool by either running in batch mode or using the -stay_open feature.  If you launch ExifTool for each processed file, the startup overhead is significant and performance will suffer.
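
For anyone who wants to try the -stay_open route from a script, here is a minimal Python sketch of the idea: one ExifTool process is started, and every file is sent to it as a separate -execute block over stdin. The function names (build_args, run_batch) are hypothetical, and generating each ID as a random 32-character hex string is an assumption, since the thread does not say how the IDs are produced. Note that ExifTool keeps a backup copy of each file it writes unless told otherwise.

```python
import subprocess
import uuid


def build_args(paths):
    """Return ExifTool argument lines: one -execute block per file."""
    lines = []
    for path in paths:
        # Assumption: a random 32-character hex string is an acceptable ID.
        unique_id = uuid.uuid4().hex
        lines += [
            f"-ImageUniqueID={unique_id}",  # tag to write
            path,                           # file to update
            "-execute",                     # run the command built so far
        ]
    return lines


def run_batch(paths, exiftool="exiftool"):
    """Write the tags using a single ExifTool process (no per-file launch)."""
    proc = subprocess.Popen(
        [exiftool, "-stay_open", "True", "-@", "-"],  # read arguments from stdin
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    # Queue every command, then tell ExifTool to exit once they are done.
    commands = build_args(paths) + ["-stay_open", "False"]
    output, _ = proc.communicate("\n".join(commands) + "\n")
    return output
```

Calling run_batch with a list of JPEG paths should pay the startup cost once for the whole list rather than once per file.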

- Phil

Edit:  I just ran a speed test on 664 JPEG images from a wide variety of Canon cameras.  Here are the results:

19.11 seconds to copy the files directly using the system

19.51 seconds to rewrite the files with ExifTool, adding ImageUniqueID to each

That's an average of about 34 files/sec.  On my system, the startup overhead for launching ExifTool is about 0.25 seconds, which accounts for most of the difference between these two times.
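
The arithmetic behind those figures can be checked with a few lines of Python. The first three numbers come straight from the test above. The last line is a hypothetical extrapolation: it assumes the 0.25 second startup cost would be paid once per file if ExifTool were launched separately for each image.

```python
FILES = 664
COPY_SECONDS = 19.11       # plain system copy
EXIFTOOL_SECONDS = 19.51   # ExifTool rewrite, adding ImageUniqueID
STARTUP_SECONDS = 0.25     # one ExifTool launch

# Throughput of the ExifTool run.
files_per_second = FILES / EXIFTOOL_SECONDS

# Extra time ExifTool needed compared with a plain copy.
extra_seconds = EXIFTOOL_SECONDS - COPY_SECONDS

# Hypothetical: total startup overhead if ExifTool were launched per file.
per_file_launch_overhead = FILES * STARTUP_SECONDS

print(f"{files_per_second:.1f} files/sec")
print(f"{extra_seconds:.2f} s slower than a plain copy")
print(f"{per_file_launch_overhead:.0f} s of startup overhead if launched per file")
```

Under that assumption, launching ExifTool once per file would add far more time than the whole batch run took, which is why batch mode or -stay_open matters.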

Skippy

#2
Thanks Phil,

This basic information is really good to know.  I watch my system performance through Task Manager and, for some reason, it is not stellar.  I did see exiftool moving data to and from my Core i7 laptop's internal hard drive at a peak rate of about 15 MB/s, which is OK but not great by modern standards.  When writing to an SD card it was about 3.5 MB/s.  When backing up to external hard drives I get up to about 30 MB/s, but when exiftool reads lists of files it manages only about 7 MB/s.  So it seems that disk access is the main constraint.

I sort the file list that I send to exiftool by folder and then by file name, since the data would originally have been written to disk in that order.  When I run the exiftool batch process, I don't feel the drive heads moving around a great deal, so I think that helps.
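
In case it is useful to anyone, the sort I describe could look roughly like this in Python. This is only a sketch: the function name sort_for_batch is made up, it assumes Windows-style paths, and comparing names case-insensitively is my own choice. Whether this order really matches the layout on disk depends on how the files were written in the first place.

```python
from pathlib import PureWindowsPath


def sort_for_batch(paths):
    """Order paths by folder first, then by file name within each folder."""
    def key(path):
        p = PureWindowsPath(path)
        # Assumption: compare case-insensitively, as Windows file names are.
        return (str(p.parent).lower(), p.name.lower())

    return sorted(paths, key=key)
```

The sorted list can then be written one path per line to a text file and handed to exiftool as an argument file.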