[Originally posted by exiftool on 2009-08-25 00:23:23-07]
Interesting, but I think there is some misunderstanding:
You wrote:
$exifTool->Options(
'Unknown' => 1,
);
According to the docs, this should cause the default UTF8 charset to be used.
I wonder how you got this impression. The docs state:
Unknown
Flag to get the values of unknown tags. If set to 1, unknown
tags are extracted from EXIF (or other tagged-format)
directories.
The setting of the Charset option is entirely application dependent.
If your application interprets the tag values as UTF8, then it should
be set to the default "UTF8". But if you want special characters
translated to Windows Latin1, then set it to "Latin".
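To make the distinction concrete, here is a minimal sketch of setting the two options separately through the Image::ExifTool Perl API. The file name 'image.jpg' is only a placeholder, and whether "Latin" or "UTF8" is the right value depends on your application, as described above.

```perl
use strict;
use warnings;
use Image::ExifTool;

my $exifTool = Image::ExifTool->new;

# 'Unknown' only controls whether unknown tags are extracted;
# it has nothing to do with character sets
$exifTool->Options(Unknown => 1);

# 'Charset' is the option that controls how special characters
# in the returned tag values are translated
$exifTool->Options(Charset => 'Latin');

# 'image.jpg' is a placeholder file name for illustration
my $info = $exifTool->ImageInfo('image.jpg');
print "$_: $info->{$_}\n" for sort keys %$info;
```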
But this all assumes that the IPTC is properly encoded to begin
with (which is unlikely). Historically, applications have written
IPTC using whatever local character set the computer was using,
and there is no way to tell what this character set was. Blame
Adobe -- they are responsible for this mess because Photoshop
set the standard.
For these historic encodings, setting the ExifTool Charset to
"Latin" effectively disables translation of IPTC and the characters
are passed without translation. This may be what you want,
I don't know. You should read FAQ number 10 for more details
about the character handling.
The only real solution for this is to allow the user to specify
which character set to assume when the file doesn't specify one, then
do the translations for that character set. But ExifTool will not do these
translations for you (there are just too many character sets,
and I don't want to implement them all). So this solution is
not easy.
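As a rough sketch of what the application side of this could look like, the following uses Perl's core Encode module (not anything in ExifTool) to convert raw IPTC bytes from a user-chosen character set to UTF-8. The helper name iptc_to_utf8 and the 'cp1252' fallback are assumptions made for illustration only; the raw bytes would come from ExifTool with the Charset option set to "Latin" so that they are passed through untranslated.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: convert raw IPTC bytes to UTF-8 using a
# character set chosen by the user.
sub iptc_to_utf8 {
    my ($raw, $charset) = @_;
    $charset = 'cp1252' unless defined $charset;  # assumed fallback, not an ExifTool default
    my $text = decode($charset, $raw);            # raw bytes -> Perl characters
    return encode('UTF-8', $text);                # Perl characters -> UTF-8 bytes
}

# "Café" as a Windows Latin1 application would have written it
my $raw  = "Caf\xE9";
my $utf8 = iptc_to_utf8($raw, 'cp1252');
```

The hard part remains knowing which character set to pass in, since the file itself usually doesn't say.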
- Phil