[Originally posted by eric80 on 2009-05-18 15:46:21-07]
Hello,
I am exporting image descriptions from a (MySQL) database and would like to set them as the IPTC caption in the JPG.
During generation, I decode the HTML entities using the html_entity_decode() PHP function, tested with 'UTF-8', 'cp1252' and 'ISO-8859-15'.
Once generated, the script file displays the correct characters (â, ü, etc.) in Notepad or vim (XP SP3 US) for any encoding chosen, e.g. one line of my script is:
exiftool -IPTC:ObjectName="vue sur le chateau" -IPTC:Caption-Abstract="Au dessus des rues exigües avec les maisons anciennes, le château surplombe la ville." -IPTC:Writer-Editor="Eric" 73_vue_sur_le_chateau.jpg
However, when I launch the script, the characters are displayed incorrectly in the console and the caption is not fully written: it stops at the "ü" character, and I get the warning "Malformed UTF-8 character(s)".
If I copy/paste the sentence to e.g. ExifToolGUI, the file is properly updated.
So it seems to be more a problem with the DOS command line than with ExifTool, but I don't know what I should do. Any ideas?
I also had difficulty writing these characters in this forum! I needed to re-encode them as HTML...
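One plausible reading of the "Malformed UTF-8 character(s)" warning is that the "ü" reaches ExifTool as a single code-page byte rather than as a UTF-8 sequence. The short Python sketch below only illustrates that byte-level difference; the helper name is_valid_utf8 is made up for this example and nothing here comes from ExifTool itself.

```python
# Same word, two encodings. Names are illustrative only.
text = "exigües"

as_cp1252 = text.encode("cp1252")  # 'ü' becomes the single byte 0xFC
as_utf8 = text.encode("utf-8")     # 'ü' becomes the two bytes 0xC3 0xBC


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The cp1252 bytes are not valid UTF-8; the UTF-8 bytes are.
print(as_cp1252, is_valid_utf8(as_cp1252))
print(as_utf8, is_valid_utf8(as_utf8))
```

A lone 0xFC byte can never start a valid UTF-8 sequence, which would be consistent with the caption being cut off exactly at the "ü".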
[Originally posted by exiftool on 2009-05-18 22:29:04-07]
It really isn't easy in Windows to get the character encoding
correct. Maybe reading FAQ number 10 and FAQ number 18 will help.
Try that first and let me know if you have any questions afterward.
- Phil
[Originally posted by eric80 on 2009-05-19 14:08:25-07]
Sorry, I had not read these points in the FAQ before.
So, I've found a solution, but actually not the one described in the FAQ!
1) The default code page on my computer is 850. It causes the problem described before.
2) If I set chcp 65001, the console does not accept any commands anymore, and I do not know why. I've also tried /u, with no success. Somebody else seems to have encountered this problem too, in this blog:
http://blogs.msdn.com/michkap/archive/2006/03/06/544251.aspx
3) With chcp 1252, it works! Now I have 2 combinations that work when I set this Latin-1 code page:
a) text file in UTF-8 and direct use of exiftool
b) text file in ANSI and "exiftool -L"
The other cross combinations I tested produce wrong characters. In case a), wrong characters are displayed in the console, but this does not seem to cause any further problems.
Now, the final question: how can I know which encoding is used when I look at the IPTC fields? I prefer to use a), but is my IPTC caption really in UTF-8?
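A possible explanation for case a), sketched in Python: if the console merely displays the UTF-8 bytes using code page 1252, the text looks wrong on screen, yet the underlying bytes are unchanged. This is an assumption about what happens on the way to ExifTool, not something confirmed in this thread.

```python
# Caption taken from the script line quoted earlier in the thread.
caption = "Au dessus des rues exigües avec les maisons anciennes, le château surplombe la ville."

utf8_bytes = caption.encode("utf-8")

# What a cp1252 display would show for those UTF-8 bytes:
# each accented letter turns into two unrelated characters.
shown_in_console = utf8_bytes.decode("cp1252")

# The display is wrong...
assert shown_in_console != caption
# ...'ü' (bytes 0xC3 0xBC) is rendered as 'Ã' followed by '¼'...
assert "ü".encode("utf-8").decode("cp1252") == "\u00c3\u00bc"
# ...but converting back through cp1252 returns the original UTF-8 bytes.
assert shown_in_console.encode("cp1252") == utf8_bytes
```

Under that assumption, the garbled console output in case a) would be purely cosmetic.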
[Originally posted by exiftool on 2009-05-19 14:17:40-07]
I'm glad you worked this out.
The character encoding is problematic in IPTC. Other software
often writes using the local codepage, and there is no way to tell
what codepage was used. If special characters exist in IPTC, they
are only reliable if IPTC:CodedCharacterSet is "UTF8". Otherwise,
they could be any encoding.
The current Metadata Working Group guidelines address this problem, and
suggest using UTF-8 in IPTC for this reason.
- Phil
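One way to act on this while keeping the console code page out of the picture is to put the arguments in a UTF-8 text file and pass it with ExifTool's -@ ARGFILE option. The Python sketch below only writes such a file. It assumes that ExifTool reads the values as UTF-8 by default and that setting IPTC:CodedCharacterSet to UTF8 marks the IPTC strings accordingly. The function name write_argfile and the file name caption.args are invented for this example, and none of this has been checked against the 2009 version discussed in the thread.

```python
import os
import tempfile


def write_argfile(path, image, caption):
    """Write one exiftool argument per line, as UTF-8 without a BOM."""
    lines = [
        "-IPTC:CodedCharacterSet=UTF8",       # flag the IPTC strings as UTF-8
        f"-IPTC:Caption-Abstract={caption}",  # no shell quoting inside an argfile
        image,                                # file to update
    ]
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        f.write("\n".join(lines) + "\n")


# Hypothetical location for the argument file.
argfile = os.path.join(tempfile.mkdtemp(), "caption.args")
write_argfile(
    argfile,
    "73_vue_sur_le_chateau.jpg",
    "Au dessus des rues exigües avec les maisons anciennes, "
    "le château surplombe la ville.",
)

# Afterwards, from the console:  exiftool -@ caption.args
```

Because the caption never passes through the command line, the chcp setting should no longer matter for the written value, though the console may still display it incorrectly.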
[Originally posted by exiftool on 2009-05-19 14:33:49-07]
One question: Does "chcp 437" have a similar effect for you
as "chcp 1252"?
- Phil
[Originally posted by exiftool on 2009-05-21 12:54:39-07]
Ah, I see you wanted to solve this problem for your Export
Image Metadata Piwigo extension. Cool.
Glad you found a solution.
- Phil