[Originally posted by djeyewater on 2009-05-13 15:37:34-07]
I'm using exiftool with the -json output option, but it isn't encoding an EXIF value that contains the © character as UTF-8, so decoding the JSON fails (I'm using the PHP json_decode function). I also tried the -EscapeHTML option, but the copyright symbol is still output as a raw © rather than as an HTML entity.
I also tried adding some Chinese characters to the EXIF, and exiftool output them as UTF-8 when extracting; only the copyright symbol is affected.
Any ideas on what's causing this/how to fix it?
Thanks
Dave
[Originally posted by exiftool on 2009-05-13 15:53:58-07]
Hi Dave,
The EXIF copyright tag isn't translated since the encoding is
not specified by the EXIF spec. However, this should work if you
write the string in UTF-8:
exiftool a.jpg -copyright="\302\251"
1 image files updated
exiftool a.jpg -copyright
Copyright : ©
exiftool a.jpg -copyright -json
[{
"SourceFile": "a.jpg",
"Copyright": "©"
}]
exiftool a.jpg -copyright -json -escapehtml
[{
"SourceFile": "a.jpg",
"Copyright": "©"
}]
- Phil
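To make the byte-level point concrete, here is a minimal Python sketch (not part of exiftool) showing that the octal sequence "\302\251" used above is the two-byte UTF-8 encoding of ©, while the same character stored as a single Latin-1 byte is not valid UTF-8. The JSON payloads are hypothetical, shaped like the output above, and `is_valid_utf8` is an illustrative helper name.

```python
import json

# "\302\251" is octal notation for the two bytes 0xC2 0xA9,
# which is how UTF-8 encodes U+00A9 (the copyright sign).
utf8_copyright = bytes([0o302, 0o251])
assert utf8_copyright == "\u00a9".encode("utf-8")

# The same character stored as a single Latin-1 byte (0xA9).
latin1_copyright = "\u00a9".encode("latin-1")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical payloads modelled on the exiftool -json output shown above.
good_json = b'[{"SourceFile": "a.jpg", "Copyright": "' + utf8_copyright + b'"}]'
bad_json = b'[{"SourceFile": "a.jpg", "Copyright": "' + latin1_copyright + b'"}]'

print(is_valid_utf8(good_json))
print(is_valid_utf8(bad_json))
print(json.loads(good_json.decode("utf-8"))[0]["Copyright"])
```

A strict JSON decoder expects UTF-8 text, so the second payload is the kind of input that would make decoding fail.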
[Originally posted by djeyewater on 2009-05-14 10:48:54-07]
Thanks, I didn't realise the problem was the value not being encoded in Unicode.
Unfortunately I can't ensure that the EXIF values will be encoded in Unicode, so I guess what I can do is extract the metadata using -json -escapehtml, and then utf8_encode the JSON string in PHP before decoding it.
As a suggestion for a possible future feature, I'd find it quite useful if ExifTool had an option to convert all non-UTF-8 strings to UTF-8.
Regards
Dave
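A rough Python analogue of this workaround, as a sketch only: PHP's utf8_encode treats its input as ISO-8859-1, so applying it to output that is already UTF-8 (such as the Chinese characters mentioned earlier) would double-encode it. The version below tries UTF-8 first and falls back to Latin-1 only when that fails. The function name `decode_exiftool_json` and the Latin-1 fallback are assumptions for illustration, not anything exiftool provides.

```python
import json


def decode_exiftool_json(raw: bytes, fallback: str = "latin-1"):
    """Decode JSON bytes, assuming UTF-8 first and `fallback` otherwise.

    Limitation: the choice is made for the whole payload, so output that
    mixes UTF-8 values with values in another encoding will be decoded
    entirely with the fallback.
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is Latin-1.
        text = raw.decode(fallback)
    return json.loads(text)


# Hypothetical payloads: one already UTF-8, one with a Latin-1 copyright byte.
utf8_payload = '[{"Copyright": "\u7248\u6743"}]'.encode("utf-8")
latin1_payload = b'[{"Copyright": "\xa9 Dave"}]'

print(decode_exiftool_json(utf8_payload)[0]["Copyright"])
print(decode_exiftool_json(latin1_payload)[0]["Copyright"])
```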
[Originally posted by exiftool on 2009-05-14 11:15:22-07]
Hi Dave,
Note that some EXIF values are stored as UCS-2, and these are converted
to UTF-8. See the EXIF description in FAQ number 10 for details.
It is not possible to reliably convert other values because
there is no way to determine the original encoding (the only option
would be to ask the user to provide these details).
- Phil
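The ambiguity can be illustrated with a short Python sketch: the same stored bytes are valid in several single-byte encodings and decode to different text in each, so nothing in the bytes themselves identifies the right one. The sample value and the helper `to_utf8` are hypothetical; the encoding has to be supplied by whoever knows how the value was written.

```python
# A hypothetical stored value: "Caf" followed by the single byte 0xE9.
raw = b"Caf\xe9"

# Each of these codecs accepts the bytes without error but gives different text.
candidates = ["latin-1", "cp1251", "mac_roman"]
decoded = {name: raw.decode(name) for name in candidates}

for name, text in decoded.items():
    print(f"{name:10} -> {text!r}")


def to_utf8(value: bytes, source_encoding: str) -> bytes:
    """Convert to UTF-8 given a caller-supplied source encoding (hypothetical helper)."""
    return value.decode(source_encoding).encode("utf-8")


# The conversion is only correct if the caller names the true encoding.
print(to_utf8(raw, "latin-1"))
```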