Compare the Metadata of two Files

Started by JanK, May 01, 2011, 07:22:22 AM


JanK

Is there an option to compare the Metadata of two files (jpg)?

When I edit a file, for example with Adobe Bridge, a lot of metadata is added to it. To see what has changed, I would like to compare the original file and the edited file with ExifTool. I hope that ExifTool can list the differences as "deleted", "modified" and "added" data.
-Mac OSX Mountain Lion-

Phil Harvey

I do this all the time, but I use the "diff" utility (standard on Mac/Linux) to do this:

exiftool a.jpg b.jpg -a -G1 -w txt
diff a.txt b.txt


- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

No fine pix with my FinePix

Hello, I had some problems with differing metadata, and because I did not find anything suitable, I wrote a small batch file (runs on XP and Vista). Perhaps you will find it helpful.


@echo off
:: Compare_EXIF.cmd Compare EXIF-Meta-Data V1.6
setlocal enableextensions enabledelayedexpansion &  if errorlevel 1 ( echo Can not set local & pause & exit/b 99 )

:: ARGUMENTS
:: either two dirs with same named pics, Wildcards are not possible, choose mask for dir comparison
set WILD=*.jpg
:: or two pics

:: NEEDS
:: exiftool http://www.exiftool.org/
set EF="G:\Programme\GeoSetter\tools\exiftool.exe"
:: Winmerge http://winmerge.org
set WM="G:\Programme\PROGRAMME (USB)\WinMergePortable\WinMergePortable.exe"

:: for DEBUGGING uncomment
:: set DEBCMD=echo ~

:: START BATCH

:: Functions
set EXITE=echo. ^& pause ^& exit/b 99

:: Initialization
cls
if not defined DEBCMD ( set DEBCMD=rem ) else %DEBCMD% Using debug command: %DEBCMD%

:: exiftool commandline
if not exist %EF% ( echo ExifTool %EF% not found & %EXITE% )
:: Instead of excluding here, may use LineFilter in Winmerge
set EF=%EF% -groupHeadings --FileModifyDate
%DEBCMD% exiftool (found) %EF%

:: Winmerge commandline
if not exist %WM% ( echo WinMerge %WM% not found & %EXITE% )
:: /s single instance
:: /e for end with escape
:: /x for message and end if equal
:: /ul /ur don't add to history left & right
:: /wl /wr read only left & right
set WM=%WM% /s /e /x /ul /ur /wl /wr
%DEBCMD% Winmerge (found) %WM%

:: Check Argument/1
set D1=%~dp1
set N1=%~nx1
set ISDIR1=%~a1
set ISDIR1=%ISDIR1:~0,1%
if ~%ISDIR1%~==~d~ (
set D1=%1
set N1=%WILD%
)
%DEBCMD% Arg/1 %1%
%DEBCMD% Path/1 %D1% & %DEBCMD% Name/1 %N1% & %DEBCMD% Attributes/1 %ISDIR1%
if not exist %1 ( echo 1st File/s %1? & %EXITE% )

:: Check Argument/2
set D2=%~dp2
set N2=%~nx2
set ISDIR2=%~a2
set ISDIR2=%ISDIR2:~0,1%
if ~%ISDIR2%~==~d~ (
set D2=%2
set N2=%WILD%
)
%DEBCMD% Arg/2 %2%
%DEBCMD% Path/2 %D2% & %DEBCMD% Name/2 %N2% & %DEBCMD% Attributes/2 %ISDIR2%
if not exist %2 ( echo 2nd File/s %2? & %EXITE% )

:: No Argument/3
if not ~%3~==~~ ( echo Only 2 args, either 2 files or 2 dirs, NO WILDCARDS &  %EXITE% )

:: Check consistency of arguments
%DEBCMD% Matched Attributes %ISDIR1%%ISDIR2%
if ~%ISDIR1%%ISDIR2%~==~dd~ ( echo Comparing two directories & goto compare )
if ~%ISDIR1%%ISDIR2%~==~--~ ( echo Comparing two files & goto compare )
echo Can only compare two dirs or two files & %EXITE%

:compare
:: Temporary Files
if not defined TEMP (echo TEMP-Directory not defined & %EXITE%) else %DEBCMD% Using TEMP: %TEMP%
set T1="%TEMP%\%~n0_1.tmp"
set T2="%TEMP%\%~n0_2.tmp"
%DEBCMD% Temp 1: %T1% & %DEBCMD% Temp 2: %T2%

:: Run first arg
echo.
echo exiftool/1
echo File/s: %N1%
echo in: %D1%
cd /d %D1%
%DEBCMD% "%EF% %N1% > %T1%"
%EF% %N1% > %T1%
if errorlevel 1 ( echo ExifTool/1 Error on %1 & %EXITE% )
if not exist %T1% ( echo Not found: %T1% & %EXITE% )

:: Run second arg
echo.
echo exiftool/2
echo File/s: %N2%
echo in: %D2%
cd /d %D2%
%DEBCMD% "%EF% %N2% > %T2%"
%EF% %N2% > %T2%
if errorlevel 1 ( echo ExifTool/2 Error on %2 & %EXITE% )
if not exist %T2% ( echo Not found: %T2% & %EXITE% )

:: /dl /dr Description left & right
%DEBCMD% start "WinMerge" /b %WM% /dl %1 /dr %2 %T1% %T2%
start "WinMerge" /b %WM% /dl %1 /dr %2 %T1% %T2%

Alan Clifford

Quote from: Phil Harvey on May 01, 2011, 08:24:52 PM
exiftool a.jpg b.jpg -a -G1 -w txt
diff a.txt b.txt


- Phil

That looked interesting but I didn't like the thought of typing in the file names twice. So, still rather unrefined,


#!/bin/sh
# Usage: exifdiff.sh path/file1.ext path/file2.ext parameters
# eg:    exifdiff.sh dsc_7811.nef ./jpegs/dsc_7811.jpg -a -G1 -datetimeoriginal

# Quote the arguments so that paths containing spaces survive
B1=$(basename "$1")
B2=$(basename "$2")

exiftool "$@" -w! /tmp/%f.%e.txt \
  && diff "/tmp/${B1}.txt" "/tmp/${B2}.txt" | less -i


I don't know if the /tmp directory is cleared out automagically on my mac. We shall see.

Alan
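
On the open question of whether /tmp gets cleaned up: one way to avoid relying on that is to create a private temporary directory and remove it afterwards. This is a hedged sketch rather than a drop-in replacement for the script above: the function name exifdiff and the variable names are illustrative, mktemp -d is assumed to be available (it is on Mac and Linux, but it is not part of POSIX), and exiftool is run once per file with its output redirected instead of using -w.

```shell
#!/bin/sh
# Sketch only: dump each file into a private temp directory, diff the two
# dumps, then delete the directory again.
# Usage (after sourcing this file): exifdiff FILE1 FILE2 [exiftool options...]
exifdiff() {
  if [ "$#" -lt 2 ]; then
    echo "usage: exifdiff FILE1 FILE2 [exiftool options...]" >&2
    return 2
  fi
  file1=$1
  file2=$2
  shift 2                      # whatever is left is passed on to exiftool

  tmpdir=$(mktemp -d) || return 1

  # One exiftool run per file, so two inputs with the same base name
  # (for example orig/a.jpg and edited/a.jpg) cannot overwrite each other.
  if exiftool "$@" "$file1" > "$tmpdir/1.txt" &&
     exiftool "$@" "$file2" > "$tmpdir/2.txt"; then
    diff "$tmpdir/1.txt" "$tmpdir/2.txt"
    status=$?                  # 0 = identical, 1 = differences found
  else
    status=2                   # exiftool reported an error
  fi

  rm -rf "$tmpdir"
  return "$status"
}
```

For example: exifdiff dsc_7811.nef ./jpegs/dsc_7811.jpg -a -G1 | less -i. The output can be piped to less or swapped to diff -y as preferred; nothing is left behind in /tmp either way.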

Alan Clifford

I've been looking at diff and side-by-side looks better.  It needs a wide window.

#!/bin/sh
# Usage: exifdiff.sh path/file1.ext path/file2.ext parameters
# eg:    exifdiff.sh dsc_7811.nef ./jpegs/dsc_7811.jpg
# eg:    exifdiff.sh dsc_7811.nef ./jpegs/dsc_7811.jpg -a -G1 -datetimeoriginal

# Quote the arguments so that paths containing spaces survive
B1=$(basename "$1")
B2=$(basename "$2")

exiftool "$@" -w! /tmp/%f.%e.txt \
  && diff -d -y --left-column "/tmp/${B1}.txt" "/tmp/${B2}.txt" | less -i



simonmcnair

Hi all,
I naively thought that importing photos from my camera in Windows would leave the images in their unaltered state.  It seems now that this is not the case.  I _think_ that Windows gallery/photo/import/whateveritscalledthisweek adds a load of schema junk to my files.  I have two files which look identical, and have identical EXIF and IPTC info, but one is 3,065,614 bytes and the other is 3,641,151 bytes.  I originally thought that (tin foil hat time) a virus was storing itself in my files in a steganography type style.  After some research I suspect this is not the case, but I cannot fathom what Microsoft could store that would increase the file size by nearly 600K.

Are you aware of a tool that could point out to me what the data is (decode it preferably) so I can just treat it as a dupe and delete the larger file?

As an aside, is there any mileage in batch converting all my JPGs to PNGs or an alternative format which is lossless and 'better'?

tia :-)

Simon

Phil Harvey

If you send me both images I will analyze them to see what the difference is.  I don't know of any tool that is as good as me. ;)

My email is philharvey66 at gmail.com

I don't recommend converting the JPEG format unless you plan to edit the files.  For intermediate files in an editing workflow, I suggest using the native format of the editor (PSD is what I use since I edit with Photoshop).

- Phil

Phil Harvey

Simon,

I got the samples, thanks.  Below is a diff of the metadata.  Basically, here is what Windows has done:

1) Deleted the JPEG MPImage trailer containing the PreviewImage

2) Corrupted the MakerNotes offsets

3) Added some XMP

4) Re-generated the ThumbnailImage

5) Changed the EXIF byte order <-- THIS IS A REAL NO-NO!!

6) Added EXIF OffsetSchema and XPKeywords tags

The bottom line is that Microsoft has made a real mess of the EXIF and deleted the PreviewImage completely.  Nasty.  There is no way to completely recover from this.

- Phil

diff tmp/2011-01-08_14-49-59_1.txt tmp/2011-01-08_14-49-59.txt
2c2,4
< [System]        File Name                       : 2011-01-08_14-49-59_1.jpg
---
> [ExifTool]      Warning                         : [minor] Possibly incorrect maker notes offsets (fix by 3768?)
> [ExifTool]      Warning                         : [minor] Suspicious MakerNotes offset for DataDump
> [System]        File Name                       : 2011-01-08_14-49-59.jpg
4,5c6,7
< [System]        File Size                       : 4.4 MB
< [System]        File Modification Date/Time     : 2011:01:08 08:49:58-05:00
---
> [System]        File Size                       : 3.8 MB
> [System]        File Modification Date/Time     : 2011:01:08 09:49:59-05:00
9c11
< [File]          Exif Byte Order                 : Little-endian (Intel, II)
---
> [File]          Exif Byte Order                 : Big-endian (Motorola, MM)
24a27,28
> [IFD0]          XP Keywords                     : Misc
> [IFD0]          Padding                         : (Binary data 2060 bytes, use -b option to extract)
56a61,62
> [ExifIFD]       Padding                         : (Binary data 2060 bytes, use -b option to extract)
> [ExifIFD]       Offset Schema                   : 3768
66d71
< [Panasonic]     Data Dump                       : (Binary data 8200 bytes, use -b option to extract)
70c75
< [Panasonic]     Internal Serial Number          : (F52) 2009:11:26 no. 0059
---
> [Panasonic]     Internal Serial Number          :
84c89
< [Panasonic]     Baby Age                        : (not set)
---
> [Panasonic]     Baby Age                        :
96c101
< [Panasonic]     AF Point Position               : 0.7 0.5
---
> [Panasonic]     AF Point Position               : 0 0
115c120
< [Panasonic]     Baby Age                        : (not set)
---
> [Panasonic]     Baby Age                        :
117d121
< [InteropIFD]    Interoperability Index          : R98 - DCF basic file (sRGB)
139,140c143,144
< [IFD1]          Thumbnail Offset                : 10752
< [IFD1]          Thumbnail Length                : 4424
---
> [IFD1]          Thumbnail Offset                : 14670
> [IFD1]          Thumbnail Length                : 3809
155c159
< [MPImage2]      MP Image Start                  : 4014592
---
> [MPImage2]      MP Image Start                  : 4001936
157a162,165
> [XMP-rdf]       About                           : uuid:faf5bdd5-ba3d-11da-ad31-d33d75182f1b
> [XMP-microsoft] Date Acquired                   : 2011:01:10 10:51:36.348
> [XMP-microsoft] Last Keyword XMP                : Misc
> [XMP-dc]        Subject                         : Misc
159c167
< [Composite]     Base Name                       : 2011-01-08_14-49-59_1
---
> [Composite]     Base Name                       : 2011-01-08_14-49-59
169c177
< [Composite]     Thumbnail Image                 : (Binary data 4424 bytes, use -b option to extract)
---
> [Composite]     Thumbnail Image                 : (Binary data 3809 bytes, use -b option to extract)

simonmcnair

Phil, you're a gem.  So the bottom line is to copy from the SD card, read-only the sucker and then go from there?

simonmcnair

Phil,
Sorry, one last thing.  Could you please give me a copy of the script you used to generate that?  It'd be really useful for me to understand what's changed in my metadata :-)

Phil Harvey

Hi Simon,

Yes.  Windows has a nasty habit of changing image metadata when you aren't expecting it.

The "diff" utility I used is a standard Unix utility.  I'm sure there is a free version for Windows somewhere that you could download.

- Phil

simonmcnair

Hi Phil,
I meant to ask at the time, but forgot until now.  Is there any way (and I know this is asking a lot) to get exiftool to identify each Microsoft-edited picture and rename it (for the sake of argument) with the suffix '-Microsoft', so that I can see if I have two copies of a file and delete the Microsoft-wrecked version?

tia
Simon

Phil Harvey

Hi Simon,

You can do this if you can identify a tag that is added when you edit with Microsoft software.  One possibility is OffsetSchema, which should be added only by Microsoft if your image contained maker notes:

exiftool -if "defined $offsetschema" "-filename=%f-Microsoft.%e" DIR

where DIR is the name of the directory containing the images.  Add -r to also process images in sub-directories.

- Phil
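
For Mac/Linux shells the same command needs single quotes, so that the shell does not try to expand $offsetschema. A cautious sketch of how this might be wrapped follows; mark_microsoft is a made-up helper name, and the dry run relies on ExifTool's TestName tag, which prints the planned new names without renaming anything. Only the "apply" mode writes FileName as in the command above.

```shell
#!/bin/sh
# Sketch: preview (default) or apply the "-Microsoft" rename.
# Usage (after sourcing this file): mark_microsoft DIR [apply]
mark_microsoft() {
  if [ "$#" -lt 1 ]; then
    echo "usage: mark_microsoft DIR [apply]" >&2
    return 2
  fi
  dir=$1
  if [ "${2:-}" = "apply" ]; then
    target=filename            # really rename the files
  else
    target=testname            # dry run: only report the planned names
  fi
  # Single quotes keep $offsetschema away from the shell; %f and %e are
  # filled in by exiftool (base name and extension), not by the shell.
  exiftool -if 'defined $offsetschema' "-$target=%f-Microsoft.%e" "$dir"
}
```

Running mark_microsoft DIR first and checking the listed names before repeating with mark_microsoft DIR apply keeps the rename reversible until the last step. OffsetSchema is only a heuristic, as noted above, so files without maker notes will not be caught.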

ruth

Quote from: Phil Harvey on May 01, 2011, 08:24:52 PM
I do this all the time, but I use the "diff" utility (standard on Mac/Linux) to do this:

exiftool a.jpg b.jpg -a -G1 -w txt
diff a.txt b.txt


- Phil

Sorry for replying to such an old thread...

Is it also possible to compare the metadata between 2 (or more) files by exporting all the metadata fields into a csv file? Would there be any problems with fields in the two files possibly not being the same (in number of fields, location, name, etc)?

Thanks, Ruth (total newbie)

Phil Harvey

Hi Ruth,

The -csv option handles the case of different tags from each file.

- Phil
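
As a rough illustration of the CSV route: with -csv, ExifTool writes one row per file, and a tag that exists in only one file simply leaves an empty cell in the other row. The wrapper below is a sketch with an invented name (exif_csv); the output file name and any extra options are up to the caller.

```shell
#!/bin/sh
# Sketch: export the metadata of several files into one CSV file.
# Usage (after sourcing this file): exif_csv OUT.csv FILE_OR_OPTION...
exif_csv() {
  if [ "$#" -lt 2 ]; then
    echo "usage: exif_csv OUT.csv FILE_OR_OPTION..." >&2
    return 2
  fi
  out=$1
  shift
  # exiftool prints the CSV on standard output; redirect it to the file.
  exiftool -csv "$@" > "$out"
}
```

For example, exif_csv compare.csv a.jpg b.jpg produces a file that can be opened in a spreadsheet, where differing cells in the two rows can be spotted column by column.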