XMLSchema#base64Binary for EXIF:Copyright tag?

Started by Mac2, July 25, 2013, 08:39:05 AM


Mac2

Windows platform, ExifTool command line app, latest version.

An image file contains the text © SomeName in the EXIF:Copyright tag. When I do a

exiftool -G -exif:copyright <file_name>

I get

[EXIF]          Copyright                       : ® SomeName

but when I use -X to produce XML output, I get:

<IFD0:Copyright rdf:datatype='http://www.w3.org/2001/XMLSchema#base64Binary'>qSB....HQ=</IFD0:Copyright>

(I changed the original copyright string / base64 code  for privacy reasons).

Since the XML output is in UTF-8 which should handle the © character just fine, why does ExifTool fall back to using Base64-encoding in this case?

(the XMP-dc:rights tag contains the © SomeName string without Base64-encoding).

I played with -charset exif= but saw no change in the result for XMP (when I use -charset exif=Arabic the normal output changes).

The metadata in the file was apparently produced by ExifToolGui.


Phil Harvey

I think I'll need to see a sample (you can email me: philharvey66 at gmail.com).  I get this (on a UTF-8 console):

> exiftool -ver
9.33

> exiftool a.jpg -copyright="© SomeName"
    1 image files updated

> exiftool a.jpg -copyright
Copyright                       : © SomeName

> exiftool a.jpg -copyright -X
<?xml version='1.0' encoding='UTF-8'?>
<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>

<rdf:Description rdf:about='a.jpg'
  xmlns:et='http://ns.exiftool.ca/1.0/' et:toolkit='Image::ExifTool 9.33'
  xmlns:IFD0='http://ns.exiftool.ca/EXIF/IFD0/1.0/'>
<IFD0:Copyright>© SomeName</IFD0:Copyright>
</rdf:Description>
</rdf:RDF>


- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

Mac2

I sent you an email with a JPEG sample file and the ARGs file I use to import the data on Windows.
So far this worked with a wide range of files. This is the first file reported with this specific issue.

Phil Harvey

I got the sample, thanks.

The problem is that the copyright symbol is not UTF-8.  It is Latin1.  By default, ExifTool assumes UTF-8 for EXIF:Copyright (actually, by the EXIF spec it should be ASCII, but the MWG suggests UTF-8).  Since the stored value is not valid UTF-8, it is converted to base64 in the XML.  Specifying -charset exif=latin1 should fix this (because the symbol is Windows Latin1 encoded).  It did for me.  But you say you tried this?

- Phil
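
The explanation above can be illustrated with a short Python sketch. It is not ExifTool's code: the function name decode_exif_string and the fallback logic are assumptions made for illustration, and the sample bytes assume the tag holds the Latin1 encoding of "© SomeName". It only shows why a Latin1 © (byte 0xA9) cannot be read as UTF-8, and that its base64 form begins with "qSB" like the value in the first post.

```python
import base64

# Assumed sample value: "(c) SomeName" with the copyright sign stored as the
# single Latin1 byte 0xA9 (valid UTF-8 would need the two bytes 0xC2 0xA9).
raw = b"\xa9 SomeName"

def decode_exif_string(data: bytes, charset: str = "utf-8"):
    """Hypothetical helper: return ("text", str) if data decodes in the
    given charset, otherwise ("base64", str) as a binary-safe fallback."""
    try:
        return "text", data.decode(charset)
    except UnicodeDecodeError:
        return "base64", base64.b64encode(data).decode("ascii")

as_utf8 = decode_exif_string(raw)                # UTF-8 assumed: decoding fails
as_latin1 = decode_exif_string(raw, "latin-1")   # Latin1 assumed: decodes fine

print(as_utf8[0], as_utf8[1])
```

With the default charset the helper falls back to base64, and with "latin-1" it returns the readable string, which mirrors the effect of -charset exif=latin1 described above.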

Mac2

My software has so far always used the default character set for EXIF. For IPTC I had added a way for the user to configure the character set to use for reading/writing. I have now added an identical option for configuring the EXIF character set to assume for import and export. When specifying Latin(1) for reading, the data is properly converted :D and the users can handle their old files.
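
A configurable EXIF character set like the one described above could be passed through to the command line roughly as follows. This is only a sketch: the function build_exiftool_args and its parameters are hypothetical, and the only ExifTool options used are the ones that appear in this thread (-charset exif=... and -X).

```python
from typing import List, Optional

def build_exiftool_args(path: str, exif_charset: Optional[str] = None) -> List[str]:
    """Hypothetical helper: build an argument list for XML output,
    optionally telling ExifTool which charset to assume for EXIF strings."""
    args = ["exiftool"]
    if exif_charset:
        # User-configured charset, e.g. "latin1" for old files
        args += ["-charset", f"exif={exif_charset}"]
    args += ["-X", path]
    return args

# Default behaviour (no charset option) versus a user-selected Latin1 setting
default_args = build_exiftool_args("a.jpg")
latin1_args = build_exiftool_args("a.jpg", "latin1")
print(" ".join(latin1_args))
```

The resulting list can be handed to whatever process-launching mechanism the application already uses; leaving exif_charset unset keeps ExifTool's default assumption.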
