ExifTool Forum

ExifTool => The "exiftool" Application => Topic started by: dbqandersons on November 05, 2024, 08:43:26 PM

Title: Building a Flat File with Exiftool Data and Other Data
Post by: dbqandersons on November 05, 2024, 08:43:26 PM
So, I've got another script (bash running on Ubuntu) that dumps image data into a flat file for comparison against image data stored in a database: File Path, File Name, Unique Image ID, and File md5sum.

As you can see from my loop structure, I'm calling exiftool once per image. Icky.

for FILE in `cat ${LIST_FILE}`
do
  echo -n $FILE | rev | cut -d "/" -f2- | rev | tr '\n' ' ' | sed -e 's/ $//' >> $OUTPUT_FILE
  echo -n "|" >> $OUTPUT_FILE
  echo -n $FILE | grep -o '[^\/]*$' | tr -d '\n'>> $OUTPUT_FILE
  echo -n "|" >> $OUTPUT_FILE
  echo -n "`exiftool -S -s -imageuniqueid $FILE`" >> $OUTPUT_FILE
  echo -n "|" >> $OUTPUT_FILE
  echo "`md5sum $FILE | awk '{print $1}'`" >> $OUTPUT_FILE
done


Looking to squeeze some more performance out of this thing. Any suggestions to keep exiftool open through the run of the script so it doesn't have to load once per image?  I know I can get exiftool to spit out the file name as well as the Unique Image ID, but I'm not sure how to grab the file path and md5sum and munge everything into a single line per image while not doing it image by image.

Cheers,

Bill
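
Editor's note: the rev/cut/grep pipeline in the loop above splits each path into its directory and its file name. A minimal sketch of the same split using shell parameter expansion, with no external commands, might look like the following. The helper names split_path and dump_prefixes are hypothetical and not part of the original script.

```shell
#!/usr/bin/env bash
# split_path (hypothetical helper): print "DIRECTORY|FILENAME" for one path.
split_path() {
  local path=$1
  local dir=${path%/*}     # drop the shortest trailing "/..." piece -> directory
  local name=${path##*/}   # drop the longest leading ".../" piece  -> file name
  printf '%s|%s\n' "$dir" "$name"
}

# dump_prefixes (hypothetical helper): read a list file line by line, so paths
# containing spaces are kept whole, and print the split form of each path.
dump_prefixes() {
  local list_file=$1
  while IFS= read -r file; do
    split_path "$file"
  done < "$list_file"
}
```

This only replaces the path handling; the loop would still need exiftool and md5sum for the last two fields.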
Title: Re: Building a Flat File with Exiftool Data and Other Data
Post by: Phil Harvey on November 05, 2024, 08:58:21 PM
Hi Bill,

I'm thinking this may all be done in a single command with the -p option of ExifTool :) (https://exiftool.org/#awk).

The command will likely be something like this:

exiftool -@ $LIST_FILE -p fmt.txt >> $OUTPUT_FILE

I can't say exactly what your fmt.txt file will be because I don't have time to figure out what all of your awk/sed/grep shenanigans are doing, but as an example, the md5sum part could be based on something like this (and you could add appropriate regular expression substitutions to match your awk command):

${filepath;$_=`md5sum "$_"`}

- Phil
Title: Re: Building a Flat File with Exiftool Data and Other Data
Post by: dbqandersons on November 06, 2024, 09:09:17 AM
Thanks Phil; I'll read up on the -p and fmt.txt usage and give that a try.

As for my awk/sed/grep shenanigans, no worries; I can't figure them out half of the time myself! 

Thanks!

Bill
Title: Re: Building a Flat File with Exiftool Data and Other Data
Post by: StarGeek on November 06, 2024, 10:01:18 AM
I don't know if this might interest you, but exiftool can do an md5 hash of just the image data with the ImageDataHash tag. This is different from running md5sum on the whole file: any edit of the metadata will change the file's md5 hash, but the ImageDataHash will stay the same.
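
Editor's note: a rough sketch of the two kinds of hash being compared here. The function names file_md5 and image_data_hash are hypothetical. The second one assumes an ExifTool version recent enough to provide the ImageDataHash tag and is only defined here, not run.

```shell
#!/usr/bin/env bash
# file_md5 (hypothetical helper): MD5 of the whole file, metadata included.
# Reading from stdin keeps the file name out of md5sum's output.
file_md5() {
  md5sum < "$1" | cut -d ' ' -f1
}

# image_data_hash (hypothetical helper): hash of the image data only, as
# reported by ExifTool. -s3 prints just the tag value.
image_data_hash() {
  exiftool -s3 -ImageDataHash "$1"
}
```

After a metadata-only edit, file_md5 would be expected to change while image_data_hash stays the same, which is the distinction described above.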
Title: Re: Building a Flat File with Exiftool Data and Other Data
Post by: dbqandersons on November 06, 2024, 10:05:01 AM
I did see that as I was sifting through the documentation. Not sure if it'll help me in my use case, but I'll keep it in the back of my mind.

Thanks,

Bill
Title: Re: Building a Flat File with Exiftool Data and Other Data
Post by: dbqandersons on November 06, 2024, 06:23:12 PM
So, here's what I came up with that seems to be working pretty well.

exiftool -q -q -p ./exiftool-daily-dump-format.txt `cat /tmp/list-of-images.txt`

Here's the content of the format file.

${filepath;$_=`echo -n "$_" | rev | cut -d / -f2- | rev | tr '\n' ' ' | sed -e 's/ \$//'`}|${filename}|${imageuniqueid}|${filepath;$_=`md5sum "$_" | head -c 32`}

I do have one more question. I noticed that if the path to a file goes through a symlink, then -filepath is written as the "true" path.

In this example, /var/www/html/dev/photos/ is a symbolic link to /sftpjail/dev/photos/:

$ exiftool -filepath /var/www/html/dev/photos/bill1.jpg
File Path                       : /sftpjail/dev/photos/bill1.jpg
$


Is it possible to change this behavior with a flag or something? If so, I haven't found it yet.

Thanks,

Bill
Title: Re: Building a Flat File with Exiftool Data and Other Data
Post by: dbqandersons on November 06, 2024, 06:50:45 PM
NVM, I added a little more insanity to my formatting file.

${filepath;$_=`echo -n "$_" | rev | cut -d / -f2- | rev | tr '\n' ' ' | sed -e 's/ \$//' | sed -e 's/sftpjail/var\\/www\\/html/'`}|${filename}|${imageuniqueid}|${filepath;$_=`md5sum "$_" | head -c 32`}

Thanks for the help, gents. I appreciate it.

Cheers,

Bill
Title: Re: Building a Flat File with Exiftool Data and Other Data
Post by: Phil Harvey on November 06, 2024, 08:25:04 PM
You can use Directory and FileName instead of FilePath if you don't want the full path name.

Also, I would suggest using Perl regular expressions instead of running so many external commands.  But hey, I know you're more familiar with the shell tools.  Generally, I would avoid running external commands unless absolutely necessary.

- Phil
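
Editor's note: as a sketch of the Perl-regex approach suggested above, the rev/cut/sed chain from the earlier format file could probably be replaced by two substitutions inside the tag expression. This was not checked against a live ExifTool run, and the substitution here is anchored to the start of the path, which is slightly stricter than the original sed. The shell block only writes the format file; the temporary file name is arbitrary.

```shell
#!/usr/bin/env bash
# Write a one-line format file. The quoted 'EOF' stops the shell from
# expanding ${...} or running the backticks; ExifTool interprets them later.
fmt_file=$(mktemp)
cat > "$fmt_file" <<'EOF'
${filepath;s/\/[^\/]*$//;s/^\/sftpjail/\/var\/www\/html/}|${filename}|${imageuniqueid}|${filepath;$_=`md5sum "$_" | head -c 32`}
EOF
# First field: strip the trailing "/name" to get the directory, then map the
# real /sftpjail prefix back to /var/www/html. Only md5sum remains external.
```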
Title: Re: Building a Flat File with Exiftool Data and Other Data
Post by: dbqandersons on November 07, 2024, 10:53:53 AM
So yeah.

Duh on my part for not knowing about and/or finding on my own the -Directory tag/field. Working in IT (primarily UNIX) for 25+ years, I'm very much an RTFM kind of guy and in this case I didn't RTFM enough (or not the right parts of the M, anyway)!  :-[

New formatting file.

${directory}|${filename}|${imageuniqueid}|${filepath;$_=`md5sum "$_" | head -c 32`}

I did consider using the -imagedatahash, but decided to take the overhead hit and get the full file md5sum (at least for now).  Either way, my performance has definitely improved from where I was before.

Thanks as always,

Bill
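
Editor's note: putting the thread together, a minimal sketch of the final setup might look like this. It uses the -@ option from Phil's first reply to read the file list, instead of `cat` in backticks, so paths with spaces are not split by the shell. The function names write_format and dump_images are hypothetical, and the script assumes exiftool and md5sum are on the PATH.

```shell
#!/usr/bin/env bash
# write_format (hypothetical helper): create the one-line format file from the
# last post. The quoted 'EOF' keeps the shell from touching ${...} or backticks.
write_format() {
  cat > "$1" <<'EOF'
${directory}|${filename}|${imageuniqueid}|${filepath;$_=`md5sum "$_" | head -c 32`}
EOF
}

# dump_images (hypothetical helper): one exiftool run for the whole list.
#   $1 = list file (one image path per line), $2 = format file, $3 = output file
dump_images() {
  exiftool -q -q -@ "$1" -p "$2" >> "$3"
}
```

A run would then be something like: write_format fmt.txt, followed by dump_images "$LIST_FILE" fmt.txt "$OUTPUT_FILE".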