Identifying and correcting XMP with strange namespace prefixes

Started by José Oliver-Didier, April 09, 2021, 01:07:57 PM


José Oliver-Didier

A long-running issue (I have found posts from 2007 on this) affects certain Microsoft applications (for example PhotoInfo and Windows Photo Gallery) and some other applications that use the Windows Imaging Component. Instead of using the standard namespace prefixes, they write the XMP metadata back in a seemingly valid but convoluted way, using namespace prefixes made of the word "prefix" plus an incrementing number (e.g. prefix0, prefix1, prefix2...).

The issue is described in this post from 2010, which I quote: https://exiftool.org/forum/index.php?topic=1675.msg7327#msg7327

Quote: although this is technically allowed by the XMP specification,
the namespace prefixes they use for Ipct4xmpCore are "prefix0" and
"prefix1" instead of the recommended "Iptc4xmpCore".

This leads to quite a few problems, which I emailed Phil about quite some time ago. Some applications ignore these tags, which can cause duplicate entries in the XMP; others, such as Digikam, are unable to read the XMP metadata at all. It has also caused crashes in some applications, as described in this exiv2 bug entry, which also has an XMP sample - https://dev.exiv2.org/issues/0001284

The latest versions of Exiftool handle this issue quite well. I am able to correct the metadata by overwriting the XMP with the following command:

exiftool *.jpg -xmp:all= -tagsfromfile @ -xmp:all -overwrite_original

Having used applications that cause this condition for quite some time, I have a large number of files with this issue. As such, they are not properly read by Digikam. I am tempted to run this command on all my images in an effort to fix them, but that may be overkill. Is there a way, using exiftool, to identify only the images with this condition (prefixN) and then reconstruct the XMP in order to correct them?
blog: http://jmoliver.wordpress.com
flickr:  http://flickr.com/jmoliver

Phil Harvey

There is no way to detect this problem with ExifTool.  No warnings are issued because this is perfectly valid according to the XMP specification.

The only thing I can think of is to do this:  grep -l prefix *.jpg

...which will give you a list of candidate files.
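
A plain search for "prefix" can also match unrelated bytes, so a narrower test may help. The following is only a rough sketch, not an ExifTool feature: it assumes the XMP packet is stored as plain UTF-8 text inside the file (as it normally is in JPEGs) and looks for an actual xmlns:prefixN declaration. The function names are made up for illustration.

```python
import re
from pathlib import Path

# Matches a namespace declaration such as xmlns:prefix0="..." in raw file bytes.
PREFIX_DECL = re.compile(rb'xmlns:prefix\d+\s*=')

def has_generic_prefix(path):
    """Return True if the file contains an xmlns:prefixN declaration."""
    return PREFIX_DECL.search(Path(path).read_bytes()) is not None

def find_candidates(directory, pattern="*.jpg"):
    """List files under `directory` (recursively) that declare a prefixN namespace."""
    return sorted(p for p in Path(directory).rglob(pattern) if has_generic_prefix(p))
```

Calling find_candidates(".") would return the candidate paths below the current folder. Each file is read into memory in full, which is fine for typical JPEGs but wasteful for very large files.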

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

Phil Harvey

Quote from: José Oliver-Didier on April 09, 2021, 01:07:57 PM
exiftool *.jpg -xmp:all= -tagsfromfile @ -xmp:all -overwrite_original

You shouldn't use this command because it won't necessarily preserve the namespace of same-named tags (since the destination group is the same as the source, XMP, which allows exiftool to choose the preferred namespace within XMP).  Instead, do this:

exiftool *.jpg -xmp:all= -tagsfromfile @ "-all:all<xmp:all" -overwrite_original

The destination group of "All" in the command above preserves the specific location (namespace) of the tag.
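
To run this only on a list of candidate files rather than on *.jpg, something like the Python sketch below could be used. It passes the same arguments as the command above as a list, so no shell quoting is involved. The helper names are hypothetical, and it assumes exiftool is on the PATH.

```python
import subprocess

def build_rebuild_command(paths):
    """Build the exiftool argument list that rewrites XMP for the given files."""
    return [
        "exiftool",
        *paths,
        "-xmp:all=",            # delete the existing XMP
        "-tagsfromfile", "@",   # copy tags back from the same file
        "-all:all<xmp:all",     # destination group "All" keeps each tag's namespace
        "-overwrite_original",
    ]

def rebuild_xmp(paths):
    """Run exiftool on the given files and return its exit code (0 if no files)."""
    if not paths:
        return 0
    return subprocess.run(build_rebuild_command([str(p) for p in paths])).returncode
```

Since -overwrite_original keeps no backup copies, it is worth trying this on copies of a few files first.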

- Phil

José Oliver-Didier

Quite helpful! Being on Windows, I used findstr instead of grep to generate a file listing the images that contain "prefix":

findstr /m /s prefix *.jpg > XMPPrefixImages.txt

However, I am noticing a particular block in the xmp which gets deleted when the xmp is rebuilt:

                <rdf:Description rdf:about="" xmlns:prefix0="MSImagingV1">
                        <prefix0:Brightness>0.000000</prefix0:Brightness>
                        <prefix0:CameraModelID/>
                        <prefix0:Contrast>14.284375</prefix0:Contrast>
                        <prefix0:ExposureCompensation>0.142875</prefix0:ExposureCompensation>
                        <prefix0:ISO>100</prefix0:ISO>
                        <prefix0:PipelineVersion>01.00</prefix0:PipelineVersion>
                        <prefix0:StreamType>3</prefix0:StreamType>
                        <prefix0:WhiteBalance0>2.589193</prefix0:WhiteBalance0>
                        <prefix0:WhiteBalance1>1.000000</prefix0:WhiteBalance1>
                        <prefix0:WhiteBalance2>1.020229</prefix0:WhiteBalance2>
                </rdf:Description>


I assume there is no better-known namespace prefix for this namespace?

José Oliver-Didier

From looking at samples around the web and in my own collection, it seems that the preferred namespace prefix is indeed "prefix0".

I also noticed the following comment in Exiftool's git repo:

# Microsoft Photo 1.1 schema properties (MP1 - written as 'prefix0' by MSPhoto) (ref PH)

The "MSImagingV1" namespace is also odd.


Phil Harvey

The namespace prefix of "prefix0" is fine, but the namespace URI of "MSImagingV1" is certainly wrong.  It is the URI that is significant, and ExifTool doesn't support writing "MSImagingV1", which isn't even a properly formed URI.
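
As an illustration of the prefix/URI distinction, here is a small Python sketch that lists the namespace declarations in a piece of XMP and flags URIs with no scheme. It is not how ExifTool works internally. It assumes double-quoted xmlns attributes, and "has a scheme" is only a rough stand-in for "properly formed URI".

```python
import re
from urllib.parse import urlparse

# Captures the prefix and the URI of each xmlns:PREFIX="URI" declaration.
NS_DECL = re.compile(r'xmlns:([A-Za-z_][\w.-]*)\s*=\s*"([^"]*)"')

def namespace_declarations(xmp_text):
    """Return (prefix, uri) pairs for every xmlns declaration in the XMP text."""
    return NS_DECL.findall(xmp_text)

def has_scheme(uri):
    """Rough heuristic: an absolute URI starts with a scheme such as 'http:'."""
    return bool(urlparse(uri).scheme)

sample = '<rdf:Description rdf:about="" xmlns:prefix0="MSImagingV1">'
for prefix, uri in namespace_declarations(sample):
    print(prefix, uri, "scheme present" if has_scheme(uri) else "no scheme")
# prints: prefix0 MSImagingV1 no scheme
```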

- Phil