Image Unique Identifier

Started by Skippy, August 13, 2015, 08:00:39 PM

Phil Harvey

1 per second seems really slow.  On my system I get about 20x this speed.

I get the same speed when I do a straight copy of the files.

So the speed of ExifTool when writing is limited only by the I/O bandwidth of my system.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

Skippy

Confirming the above point, exiftool writes about 10 images per second on my laptop.  The bottleneck was in MS Access and was caused by using FindFirst on a large recordset.  I switched to an SQL query instead of FindFirst and got a massive jump in performance.

wayn0i

Quote from: Phil Harvey on August 13, 2015, 09:09:19 PM
I have suggested something like this in the past:

exiftool FILE -imageuniqueid=`exiftool FILE -all= -o - | md5`

This will work on Mac/Linux to add an MD5 checksum that depends only on the image.  I'm not sure how to accomplish this in Windows.

- Phil


Hi Phil,

I have replaced FILE with FOLDER and tried to add individual md5's to imageuniqueid tag for each contained image. It seems to hash the entire folder and add the same hash to each image.

Can you assist?

Wayne


Phil Harvey

Hi Wayne,

Sorry for the delay in responding, I've been away on vacation.

The command I gave will work only for one file at a time.  You would need to write a script to automate this for a whole folder.  Either that, or create a CSV file containing the MD5 for all files in the folder then use the exiftool -csv option to read the values from this file.  See the -csv option documentation for details.
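
A minimal sketch of the CSV route, not taken from the original post. It assumes a Linux system with md5sum, that DIR contains .jpg files, and that file names contain no commas, quotes or newlines. The function name make_csv and the file name md5.csv are made up for illustration.

```shell
#!/bin/sh
# Sketch only. Assumptions: md5sum is installed, images are DIR/*.jpg,
# and file names are CSV-safe (no commas, quotes or newlines).

# Print a CSV with one row per image: the file path, then the MD5 of the
# file with all metadata removed (exiftool FILE -all= -o - writes to stdout).
make_csv() {
    printf 'SourceFile,ImageUniqueID\n'
    for f in "$1"/*.jpg; do
        [ -f "$f" ] || continue   # skip the literal pattern if nothing matched
        hash=$(exiftool "$f" -all= -o - | md5sum | cut -d ' ' -f 1)
        printf '%s,%s\n' "$f" "$hash"
    done
}

# Usage (left commented out so nothing is written by accident):
# make_csv DIR > md5.csv
# exiftool -csv=md5.csv DIR
```

The SourceFile column has to match the path exiftool sees on the command line, which is why the same DIR is used for both steps.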

- Phil

Hayo Baan

Quote from: wayn0i on September 09, 2018, 12:54:28 PM
Hi Phil,

I have replaced FILE with FOLDER and tried to add individual md5's to imageuniqueid tag for each contained image. It seems to hash the entire folder and add the same hash to each image.

Can you assist?

Wayne

Assuming you're on Linux/Mac, to perform this on every file in a folder is simply a matter of a for loop and/or a smart find command:

If FOLDER contains all files you want to run the command on:
for f in FOLDER/*; do exiftool "$f" -imageuniqueid="$(exiftool "$f" -all= -o - | md5)"; done

If the folder (also) contains subfolders you can use find:
find FOLDER -type f -exec perl -e 'system(qq(exiftool "$ARGV[0]" -imageuniqueid=`exiftool "$ARGV[0]" -all= -o - | md5`));' {} \;
Hayo Baan – Photography
Web: www.hayobaan.nl
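
One portability note on the commands in this thread: md5 is the macOS/BSD tool, while most Linux systems ship md5sum, which prints the hash followed by "  -" when reading stdin. The sketch below is an illustration rather than part of the original posts; the helper names md5_hex and tag_file are made up. It wraps the difference so only the bare hash reaches the tag.

```shell
#!/bin/sh
# Sketch only: md5_hex and tag_file are hypothetical helper names.

# Print just the hex MD5 of stdin, using whichever tool is installed.
md5_hex() {
    if command -v md5sum >/dev/null 2>&1; then
        md5sum | cut -d ' ' -f 1   # GNU md5sum prints "<hash>  -" for stdin
    else
        md5                        # macOS/BSD md5 prints the bare hash for stdin
    fi
}

# Hash one file with its metadata stripped, then write that hash to
# ImageUniqueID. Quoting "$1" keeps file names with spaces intact.
tag_file() {
    id=$(exiftool "$1" -all= -o - | md5_hex)
    exiftool "$1" -imageuniqueid="$id"
}

# Usage:
# for f in FOLDER/*; do [ -f "$f" ] && tag_file "$f"; done
```

Note that this does not check whether the first exiftool call succeeded, so a file exiftool cannot read would receive the hash of empty input.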

BC

Quote from: Phil Harvey on August 13, 2015, 09:09:19 PM
exiftool FILE -imageuniqueid=`exiftool FILE -all= -o - | md5`

I'd like to accomplish this using the Image::ExifTool module.  I am guessing that I could set all of the metadata to an empty hash and write to a temp file, and then read in and hash the temp file to get the desired value.  But it would be far better to just hash the image data while I have the file loaded in memory.  All of the documentation is about dealing with tags (of course) but I can't figure out how to access just the binary image associated with the object.

Phil Harvey

You can write to memory using Image::ExifTool, then use Digest::MD5 to get the MD5 of the image in memory:

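use Image::ExifTool;
use Digest::MD5 qw(md5_hex);
my $exifTool = Image::ExifTool->new;
$exifTool->SetNewValue('*');             # delete all metadata (like -all=)
$exifTool->WriteInfo($file, \my $buff);  # write the stripped image to memory
my $md5 = md5_hex($buff);                # hex digest, same form as the md5 command prints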


If you want to write the MD5 to the file, then read the file into memory first, then write it twice -- once to clear the metadata, and a second time back to disk with the MD5 embedded.

- Phil