Windows platform, ExifTool command line app, latest version.
An image file contains the text © SomeName in the EXIF:Copyright tag. When I run
exiftool -G -exif:copyright <file_name>
I get
[EXIF] Copyright : ® SomeName
but when I use -X to produce XML output, I get:
<IFD0:Copyright rdf:datatype='http://www.w3.org/2001/XMLSchema#base64Binary'>qSB....HQ=</IFD0:Copyright>
(I changed the original copyright string / base64 code for privacy reasons).
Since the XML output is in UTF-8 which should handle the © character just fine, why does ExifTool fall back to using Base64-encoding in this case?
(the XMP-dc:rights tag contains the © SomeName string without Base64-encoding).
I played with -charset exif= but saw no change in the result for XMP (when I use -charset exif=Arabic the normal output changes).
The metadata in the file was apparently produced by ExifToolGui.
I think I'll need to see a sample (you can email me: philharvey66 at gmail.com). I get this (on a UTF-8 console):
> exiftool -ver
9.33
> exiftool a.jpg -copyright="© SomeName"
1 image files updated
> exiftool a.jpg -copyright
Copyright : © SomeName
> exiftool a.jpg -copyright -X
<?xml version='1.0' encoding='UTF-8'?>
<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
<rdf:Description rdf:about='a.jpg'
xmlns:et='http://ns.exiftool.ca/1.0/' et:toolkit='Image::ExifTool 9.33'
xmlns:IFD0='http://ns.exiftool.ca/EXIF/IFD0/1.0/'>
<IFD0:Copyright>© SomeName</IFD0:Copyright>
</rdf:Description>
</rdf:RDF>
- Phil
I sent you an email with a JPEG sample file and the ARGs file I use to import the data on Windows.
So far this worked with a wide range of files. This is the first file reported with this specific issue.
I got the sample, thanks.
The problem is that the copyright symbol is not UTF-8. It is Latin1. By default, ExifTool assumes UTF-8 for EXIF:Copyright (actually, by the EXIF spec it should be ASCII, but the MWG suggests UTF-8). Because the stored bytes are not valid UTF-8, the value is converted to base64 in the XML. Specifying -charset exif=latin1 should fix this (because the symbol is Windows Latin1 encoded). It did for me. But you say you tried this?
- Phil
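The encoding mismatch described above can be reproduced without ExifTool. The following is a minimal Python sketch, not ExifTool's actual code: the byte string is a hypothetical stand-in for the stored tag value, and the cp850 console code page is only an assumption about the Windows console involved.

```python
import base64

# Hypothetical stand-in for the raw bytes stored in EXIF:Copyright:
# the copyright sign written as the single Latin1 byte 0xA9.
raw = "© SomeName".encode("latin-1")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# A lone 0xA9 byte is a UTF-8 continuation byte, so decoding fails.
utf8_ok = is_valid_utf8(raw)

# Interpreted as Latin1, the same bytes give the intended text.
as_latin1 = raw.decode("latin-1")

# Assumption: a console using code page 850 shows byte 0xA9 as a
# different symbol, which would explain the non-XML output.
as_cp850 = raw.decode("cp850")

# Base64 of the raw bytes, similar in shape to the XML value.
b64 = base64.b64encode(raw).decode("ascii")

print(utf8_ok, as_latin1, as_cp850, b64)
```

The base64 string produced here begins with qSB, matching the start of the value in the XML output quoted earlier in the thread.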
My software so far always used the default character set for EXIF. For IPTC I had added a way for the user to configure the character set to use for reading/writing. I have now added an identical option for configuring the EXIF character set to assume for import and export. When specifying Latin(1) for reading, the data is properly converted :D and the users can handle their old files.
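For anyone wiring up a similar setting, here is a rough Python sketch of passing a user-configured EXIF character set through to the exiftool command line. The function names and the shape of the setting are hypothetical and are not taken from my software; only the -X and -charset exif=... options come from this thread.

```python
import subprocess


def build_exiftool_args(path, exif_charset=None):
    """Build an exiftool command line (hypothetical helper).

    exif_charset is the user-configured charset name, e.g. "Latin";
    None keeps ExifTool's default assumption for EXIF strings.
    """
    args = ["exiftool", "-X"]
    if exif_charset:
        args += ["-charset", f"exif={exif_charset}"]
    args += ["-EXIF:Copyright", path]
    return args


def read_copyright_xml(path, exif_charset=None):
    """Run exiftool and return its XML output as text.

    Assumes exiftool is installed and on the PATH.
    """
    result = subprocess.run(
        build_exiftool_args(path, exif_charset),
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8")


# Show the command that would be run for a file with Latin1 strings.
print(build_exiftool_args("a.jpg", "Latin"))
```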
Sounds good.
- Phil