Hi
I'm pretty new to ExifTool and I have run into a problem with my image database: some image files have messed-up metadata in which multiple XMP records contain duplicate data.
Now I need a way of running a command to check which of my files have this problem.
I've tried to attach an example file with this problem, but for some reason the forum is not letting me.
Many thanks for your help
The attachment may have been too big. If that's the case perhaps you can upload it somewhere and provide us with the link so we can have a look.
Here it is
I had a look at the metadata with exiftool and see quite a number of repeats of e.g., Description. Mostly this is to be expected and good as there could be copies in IPTC and XMP. But in your case the data is indeed repeated (a lot) more than expected.
First of all, the caption is repeated in the User Comment (which may be what you like), but it is also repeatedly present in multiple XMP blocks, which may not be what you want. Deciding which of these you want to remove could be tricky, as these blocks sometimes contain other useful info that may or may not be repeated.
One odd thing I noticed is that the file contains two XMP:xmp-dc blocks, and one says it is a tiff format while the file clearly is a jpg... Perhaps you can use this to determine which of your files have this problem? The difficulty is that a condition such as -if '$XMP:xmp-dc:format eq "image/tiff"' doesn't work, because $XMP:xmp-dc:format resolves to the jpg version of the record.
This clearly needs more thought...
Perhaps you can find a way yourself by looking at the output of exiftool -a -G0:1 FILE. This will show you the complete output, including duplicates and tag groups (add -s to get the tag name instead of its short description).
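As a rough illustration of how that listing could be checked automatically, here is a minimal Python sketch that reads the text printed by exiftool -a -G0:1 -s and reports every tag name that shows up more than once under the same group. The function name find_repeated_tags and the sample listing are made up for this example, and the parsing assumes the usual "[Group] TagName : value" layout. A repeat within one group is only a hint that something may be wrong, not proof.

```python
import re
from collections import Counter

# Matches lines such as "[XMP:XMP-dc]    Format    : image/tiff"
LINE_RE = re.compile(r"^\[([^\]]+)\]\s+(.*?)\s*: ?(.*)$")


def find_repeated_tags(listing: str) -> dict:
    """Return {(group, tag): count} for tags listed more than once in one group."""
    counts = Counter()
    for line in listing.splitlines():
        match = LINE_RE.match(line)
        if match:
            group, tag, _value = match.groups()
            counts[(group, tag)] += 1
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical listing, invented for illustration (not taken from the attached file).
SAMPLE = """\
[EXIF:IFD0]     ImageDescription                : Harbour at dusk
[IPTC]          Caption-Abstract                : Harbour at dusk
[XMP:XMP-dc]    Description                     : Harbour at dusk
[XMP:XMP-dc]    Format                          : image/jpeg
[XMP:XMP-dc]    Description                     : Harbour at dusk
[XMP:XMP-dc]    Format                          : image/tiff
"""

for (group, tag), n in sorted(find_repeated_tags(SAMPLE).items()):
    print(f"{group}:{tag} appears {n} times")
```

Copies of the same caption in EXIF, IPTC and XMP are not flagged, since those sit in different groups; only the doubled entries inside XMP:XMP-dc are reported.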
Many thanks for your prompt reply Hayo Baan.
Let me give you the whole story. I'm using the DAM software IMatch to organise my image files, so it is IMatch that is writing the metadata via ExifTool.
I'm not sure why some of my files have erroneous metadata in them, but it seems to affect only older files.
The duplication of metadata is causing problems and confuses IMatch when it displays each of the metadata fields, hence I need to get rid of these duplicates. The solution we've found so far is to run ExifTool with the options "-overwrite_original_in_place -xmp=" on such files and then use the metadata still held within the IMatch database to write back the correct metadata.
I've tried that solution on a number of files and it cures my problem nicely.
The reason I came to this forum is that I now need a way to detect all the files which have duplicates. This would let me do it quickly rather than having to browse through the thousands of files in my database.
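For reference, a minimal Python sketch of how that cleanup command could be assembled for a batch of files. The helper name build_strip_xmp_command is hypothetical; the two options are the ones quoted above. Because -overwrite_original_in_place does not leave an "_original" backup copy behind, it is probably wise to try this on copies first.

```python
def build_strip_xmp_command(paths: list) -> list:
    """Build an argument list that deletes the XMP block from each given file.

    Hypothetical helper: it only assembles the arguments and runs nothing.
    """
    if not paths:
        raise ValueError("no files given")
    # "-xmp=" assigns an empty value to the XMP block, which removes it.
    return ["exiftool", "-overwrite_original_in_place", "-xmp="] + list(paths)
```

The resulting list can be handed to subprocess.run once the affected files have been identified.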
Regarding your comment about the XMP:xmp-dc blocks: within IMatch it is possible to define "versions" of "master" files and configure it such that versions of a master automatically get the metadata copied from the master. In my case, the jpg file I've attached here, which has the problem, is indeed the "master" of a "version", and the version file is a TIFF file.
Does this make any sense?
So, based on all this information, what ExifTool command could I use on all my files so that it lists only those with that problem?
It took a bit of puzzling to get to the second definition of the XMP-dc:Format tag, but I think I have found a way. This -if '${Format; $_ = $self->GetValue("Format (1)")} eq "image/tiff"' tests for the second format tag to be image/tiff. If all your "mangled" files are mangled in the same way, using this condition should find them. Test it out on a number of known good and bad files and see if it works.
exiftool -fileName -if '${Format; $_ = $self->GetValue("Format (1)")} eq "image/tiff"' -ext JPG DIRorFILE
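The single quotes in the command above suit Unix-like shells; other shells (Windows cmd, for instance) quote differently. One way around quoting issues when scanning a large collection is to call ExifTool from a script and pass the arguments as a list. The following is only a sketch under that assumption: the names build_scan_command and run_scan are made up here, and the condition string is copied from the post above.

```python
import shutil
import subprocess

# Condition copied from the post above: true when the second Format tag is image/tiff.
TIFF_FORMAT_CONDITION = '${Format; $_ = $self->GetValue("Format (1)")} eq "image/tiff"'


def build_scan_command(target: str, ext: str = "JPG") -> list:
    """Build the argument list for the scan; no shell quoting is involved."""
    return ["exiftool", "-fileName", "-if", TIFF_FORMAT_CONDITION, "-ext", ext, target]


def run_scan(target: str) -> str:
    """Run the scan and return ExifTool's text output (needs exiftool on PATH)."""
    if shutil.which("exiftool") is None:
        raise RuntimeError("exiftool was not found on PATH")
    # check=False: ExifTool may exit with a nonzero status when no file matches,
    # so the exit status alone is not treated as an error here.
    completed = subprocess.run(
        build_scan_command(target), capture_output=True, text=True, check=False
    )
    return completed.stdout
```

Calling run_scan on a directory returns whatever ExifTool prints for the files that pass the condition. To descend into subfolders, ExifTool's -r option could be added to the list.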
That's great. I'll try it out tomorrow.
Many thanks for your time Hayo, this is very much appreciated.