passing filename to piped command | md5sum

Started by luusac, April 23, 2023, 02:43:17 PM

luusac

Is it possible for exiftool to pass the image filename to an external program - in this case md5sum?

So, what I want to do is something like:

exiftool -all= -o - image.jpg | md5sum > filename.md5
so I would end up with image.md5 containing the hash of image.jpg (where the hash doesn't contain metadata of image.jpg).

Even better would be to get exiftool to recurse through directories and write md5 hashes for all matching files, in a -ext jpg -r . kind of way, as separate .md5 files where each .md5 filename matches the image filename.  I also tried exiftool -all= -o - image.jpg | md5sum > %%f.md5, but that results in a file literally called %%f.md5 containing the hash.
thanks

Phil Harvey

What about using the new ImageDataMD5 tag?:

exiftool -imagedatamd5 -s3 -w %d%f.md5 DIR

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

luusac

That's great, I hadn't seen that, thank you.  This looks like it will achieve exactly what I was looking for!  Is there any further mention in the documentation of this feature other than the change logs and the Extra Tags page?

As a more general point, without this recent addition to ExifTool, would it have been possible to pass (forgive me if I am using the wrong terms) ExifTool placeholders/variables like file and directory names, tag contents etc. to an external command after a pipe or redirect, as I was trying to do above from within an ExifTool command line, rather than resorting to e.g. dynamically building a command line in a shell script?

Phil Harvey

The Extra tags page is the documentation for this.

Quote from: luusac on April 26, 2023, 07:32:26 AM
would it have been possible to pass (forgive if I am using the wrong terms) ExifTool placeholders/variables like file and directory names, tag contents etc to an external command after a pipe or redirect

There are many ways to do this.  One cool way would be to use the ExifTool -p option to build up the arguments for a command.  Here is an over-simplified example:

md5 `exiftool -p '$filename' a.jpg`
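For the goal in the first post (a hash of the image with its metadata stripped, written to one .md5 file per image), a plain shell loop is another route that needs no ExifTool placeholders after the pipe. This is only a minimal sketch, assuming a POSIX shell with find and md5sum available. The function name hash_stripped is made up for illustration, the -name '*.jpg' match is case-sensitive, and there is no error handling. Note that this hashes the whole stripped file, so the result is not expected to match ImageDataMD5.

```shell
# Sketch: write NAME.md5 next to each NAME.jpg, hashing the image
# with all metadata removed. hash_stripped is a hypothetical name.
hash_stripped() {
    dir=${1:-.}
    find "$dir" -type f -name '*.jpg' | while IFS= read -r img; do
        # -all= strips metadata; -o - sends the stripped copy to stdout.
        # </dev/null keeps exiftool away from the loop's input.
        sum=$(exiftool -all= -o - "$img" </dev/null | md5sum | cut -d' ' -f1)
        # Traditional md5sum layout: hash, two spaces, file name
        printf '%s  %s\n' "$sum" "$(basename "$img")" > "${img%.*}.md5"
    done
}
```

Calling hash_stripped . would walk the current directory recursively; the original images are not modified because the stripped copy only goes to the pipe.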

- Phil

luusac

Thanks for your help Phil.  Once I used -imagedatamd5 the hash is displayed, but afterwards it doesn't show up with -all, so I am assuming that the hash isn't ever stored in the metadata once computed.  I see here that I could store the hash in the metadata with or without a config file.  But as the hash generation is now incorporated into exiftool (I am assuming the code for this is in exiftool itself, rather than relying on an external md5 hashing implementation), I wondered if there was also an 'ExifTool standard' (as opposed to Exif standard) way of storing the hash in the metadata, rather than writing it to a file or computing it every time it is required?

StarGeek

Quote from: luusac on April 28, 2023, 06:16:02 AM
Once I used -imagedatamd5 the hash is displayed, but afterwards it doesn't show up with -all, so I am assuming that the hash isn't ever stored in the metadata once computed.

From the notes under the ImageDataMD5 tag:
    Generated only if specifically requested

If you wish to store the MD5 in the file, you can use the OriginalImageMD5 tag
exiftool "-OriginalImageMD5<ImageDataMD5" /path/to/files/

The OriginalImageMD5 tag is exiftool-specific and probably won't show up in other programs such as Lightroom unless you view the raw XMP.
"It didn't work" isn't helpful. What was the exact command used and the output.
Read FAQ #3 and use that cmd
Please use the Code button for exiftool output

Please include your OS/Exiftool version/filetype

Phil Harvey

It takes time to calculate the hash, so it isn't generated by default because that would slow things down for everyone else.

You can add -api requesttags=imagedatamd5 to request that it be extracted with the other tags.

- Phil

luusac

I have only tried this out on a couple of jpegs so far due to lack of time, but one thing that I noticed is that there is an inconsistency in the amount the file increases in size when the hash is stored in exif.  Storing the hash in one image increased the size by 3kb (which I thought rather a lot assuming all that is being stored is the hash and tag name), but on the other image only increased the size by 1kb.

To try and figure out whether there was anything else being stored, and why there was a difference in file size I took an unimportant jpg image (122kb) and stored the hash, which increased the image size to 123kb. Then took the jpg_original, removed tags with -all= (file size now 88kb) and stored a new hash increasing the size to 91kb.  Is the 3kb increase in size just overhead, caused by creating the exif structures to store the tags/hash in?  I think when I noticed this the other day the files I was trialling this with did have (very limited) exif data in already, excluding the File System info (date/times etc) there were probably only 10-15 other tags, and this was the image where I noticed a 3kb increase.

The filesize increase isn't important, I was just wondering where the inconsistency is coming from and why storing a simple hash would take up 3kb?


StarGeek

The OriginalImageMD5 tag is not an EXIF tag, it is an XMP tag (All EXIF data is metadata, but not all metadata is EXIF data).

I'm guessing that your original file didn't have any XMP tags.  By default, exiftool adds 2k of padding to XMP as is suggested by the specs.  You can change this with the -api Compact option.  Specifically, -api Compact=NoPadding
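As a rough illustration of the option just mentioned, the write command from earlier in the thread can be combined with -api Compact=NoPadding. This is a sketch for a POSIX shell (hence the single quotes around the argument containing "<"); the wrapper name store_md5_nopad is hypothetical and only there to keep the example self-contained.

```shell
# Sketch: copy ImageDataMD5 into XMP OriginalImageMD5 while asking
# exiftool not to add the default XMP padding.
# store_md5_nopad is a made-up wrapper name.
store_md5_nopad() {
    exiftool -api Compact=NoPadding '-OriginalImageMD5<ImageDataMD5' "$@"
}
```

It would be called as store_md5_nopad image.jpg or with a directory. Exiftool keeps a backup of each file with an _original suffix unless -overwrite_original is added.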

There are other options for -api Compact that can reduce this even further but some badly written programs, especially older ones, may not be able to read the more compact structures.

luusac

Thanks StarGeek.  As I said, the extra 2kb are not relevant for me; I was just curious as to why the file size differed after the same operation was applied to two different files, and your explanation is succinct and informative!  I wasn't aware that the terms exif and metadata in the context of image files are not synonymous.  Interesting.  Thank you.

StarGeek

Metadata is complex.  If you look at the Tag Table Index, each of those links is a different group of metadata, and each may have dozens or even hundreds of individual tags as part of that group.  In total, there are over 26,000 different tags at the moment.

As you can see, EXIF data is only one of these groups.

EXIF data gets thrown around as a synonym for metadata and for people who aren't looking at the details, that's good enough.  But when you do get down to editing the individual tags, then it's better to be very clear on the group.

luusac

I am failing to store the md5 hash in the XMP tag and as a hash file in the same operation.  I have tried commands like

exiftool "-OriginalImageMD5<ImageDataMD5" -ImageDataMD5 -s3 -w %d%f.md5 .\IMG_0836.jpg
and
exiftool "-OriginalImageMD5<ImageDataMD5" -OriginalImageDataMD5 -s3 -w %d%f.md5 .\IMG_0836.jpg

but either get a hashfile or the hash in the Original Image Data XMP tag, but not both in a single operation.  I have tried -execute, but can't get the syntax right.  What would be correct way to store the hash in OriginalImageData then read it (so as not to generate the hash again) and store it in a hashfile (same name as image, but with .md5 extension) in the same commandline? Many thanks

StarGeek

Quote from: luusac on May 15, 2023, 01:19:19 PM
I am failing to store the md5 hash in the XMP tag and as a hash file in the same operation.  I have tried commands like
<snip>
but not both in a single operation.

You can't write to two different files with the same command.  It will have to be two separate commands.

Quote: I have tried -execute, but can't get the syntax right.

Using the -execute option will not save you any time.  Exiftool will still make two separate passes over the files, the same as running two separate commands.  Using -execute adds needless complexity and makes troubleshooting more difficult.

Quote: What would be correct way to store the hash in OriginalImageData then read it (so as not to generate the hash again) and store it in a hashfile (same name as image, but with .md5 extension) in the same commandline?

exiftool "-OriginalImageMD5<ImageDataMD5" .\IMG_0836.jpg
exiftool -OriginalImageMD5 -w %d%f.md5 .\IMG_0836.jpg

Example:
C:\>exiftool -G1 -a -s -ImageDataMD5 Y:\!temp\aaaa
======== Y:/!temp/aaaa/IMG_0836.jpg
[File]          ImageDataMD5                    : 91dfe82cc394fdf70c3d0a2b882bea34
======== Y:/!temp/aaaa/IMG_0837.jpg
[File]          ImageDataMD5                    : 2f7764d40595719f7d14f7a911d00844
    1 directories scanned
    2 image files read

C:\>exiftool -P -overwrite_original "-OriginalImageMD5<ImageDataMD5" Y:\!temp\aaaa
    1 directories scanned
    2 image files updated

C:\>exiftool -OriginalImageMD5 -w %d%f.md5 Y:\!temp\aaaa
    1 directories scanned
    2 image files read
    2 output files created

C:\>type Y:\!temp\aaaa\IMG_0836.md5 Y:\!temp\aaaa\IMG_0837.md5

Y:\!temp\aaaa\IMG_0836.md5

Original Image MD5              : 91dfe82cc394fdf70c3d0a2b882bea34

Y:\!temp\aaaa\IMG_0837.md5

Original Image MD5              : 2f7764d40595719f7d14f7a911d00844

luusac

OK, noted, thank you.  How can I get exiftool to generate a 'traditional' md5 file?
What I am getting from using
exiftool -ext jpg -r -ImageDataMD5 -w %d%f.md5 .
is a file containing something like:
Image Data MD5                  : 77275908c85fbdexf360b9f3d6ca5b02
whereas a traditional .md5 file may look like:
e05dbdfac0826532a60b429a602e1100  image.jpg
I have succeeded in finding syntax for -p that works.  This is my attempt:
exiftool -ext jpg -r -p "${ImageDataMD5}  ${Filename}" -w %d%f.md5 .
Is there a better/more correct way of doing this?

How would I go about getting exiftool to generate one file per directory containing all of the hashes of the matching files (e.g. -ext jpg) in that directory?

StarGeek

Quote from: luusac on May 17, 2023, 09:56:05 AM
I have succeeded in finding syntax for -p that works.  This is my attempt:
exiftool -ext jpg -r -p "${ImageDataMD5}  ${Filename}" -w %d%f.md5 .
Is there a better/more correct way of doing this?

This is what I would have suggested.

Quote: How would I go about getting exiftool to generate one file per directory containing all of the hashes of the matching files (e.g. -ext jpg) in that directory?

Per directory?  I don't think so. Note number 2 under the -w (-TextOut) option shows how to make a single file out of multiple ones, but I don't recall offhand any way of making a single file per directory.
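Outside of exiftool, a per-directory file can be approximated by running exiftool once per directory from a shell loop and redirecting its -p output. The following is a minimal sketch for a POSIX shell, not a built-in exiftool feature. The function name write_dir_hashes and the output name checksums.md5 are made up for illustration.

```shell
# Sketch: one checksums.md5 per directory, listing "hash  name" for
# every JPEG directly inside it. Both names here are hypothetical.
write_dir_hashes() {
    root=${1:-.}
    find "$root" -type d | while IFS= read -r dir; do
        # Without -r, exiftool only processes files directly in "$dir".
        # -q suppresses the summary lines; warnings on stderr are dropped.
        out=$(exiftool -q -ext jpg -p '$ImageDataMD5  $FileName' "$dir" \
              2>/dev/null </dev/null)
        # Only create the file when the directory produced any hashes
        if [ -n "$out" ]; then
            printf '%s\n' "$out" > "$dir/checksums.md5"
        fi
    done
}
```

Running write_dir_hashes . would visit every directory under the current one. Since the value is ImageDataMD5 rather than a hash of the whole file, the resulting list is not expected to pass md5sum -c against the images.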