ExifTool Forum

ExifTool => The "exiftool" Application => Topic started by: BogdanH on November 13, 2011, 05:21:59 AM

Title: Question about -Lang option
Post by: BogdanH on November 13, 2011, 05:21:59 AM
Hi Phil,
I'm experimenting a bit with the -Lang option in my test (stay_open) application. I don't use the various "use codepage..characterset" ExifTool options. I just take care that a correctly encoded string is used when writing/reading data with ExifTool.
-btw. this approach works very well in GUI.

Let's say I execute:
exiftool -exif:artist=Günther photo.jpg
-here, as mentioned above, an Ansi string is sent to ExifTool.
Now, if I execute:
exiftool -exif:artist photo.jpg
-I get the correct (Ansi) tag name and its value, as expected.

But, if I execute:
exiftool -lang de -exif:artist photo.jpg
-then I get:
KĂĽnstler : Günther
-here, I would expect to get Künstler : Günther.

If I treat the whole KĂĽnstler : Günther line as Utf8, I get the tag name right, but then the converted tag value is wrong.
I assume that in ExifTool, all translations are in Utf8. That's ok when dealing with Xmp, because the whole output line is in Utf8 anyway. But for Exif, decoding is quite difficult, because in the same line the tag name is in Utf8 while the tag value is in Ansi.
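
[Editor's sketch] The mixed-encoding line described above can be reproduced in a few lines of Python. This is only an illustration, not ExifTool's own logic: it assumes the Ansi code page is cp1250, that the name and value are separated by an ASCII " : ", and decode_mixed is a hypothetical helper.

```python
ANSI = "cp1250"  # assumption: the Windows Ansi code page in use

# Simulate the output line: translated tag name in UTF-8, value in Ansi.
name_bytes = "Künstler".encode("utf-8")
value_bytes = "Günther".encode(ANSI)
line = name_bytes + b" : " + value_bytes

# Decoding the whole line as Ansi garbles the tag name.
as_ansi = line.decode(ANSI)

# Decoding the whole line as UTF-8 breaks the value instead:
# the lone 0xFC byte is not valid UTF-8 and becomes U+FFFD.
as_utf8 = line.decode("utf-8", errors="replace")


def decode_mixed(raw: bytes, value_encoding: str) -> str:
    """Hypothetical helper: decode the name as UTF-8, the value separately."""
    name, sep, value = raw.partition(b" : ")
    return name.decode("utf-8") + sep.decode("ascii") + value.decode(value_encoding)


fixed = decode_mixed(line, ANSI)
print(as_ansi)
print(as_utf8)
print(fixed)
```

Splitting the line like this only works if the separator is known and the value encoding is known in advance, which is why it is awkward in practice.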
Maybe I am missing something, or my character decoding process is wrong...
Any ideas?

Bogdan
PS: I've looked into ..Image/ExifTool/Lang/de.pm file and it is saved in Utf8. I've converted the file into Ansi, and the (Exif) problem is gone. But in this case, as expected, the problem appears for Xmp. If only Exif would allow Utf8...
Title: Re: Question about -Lang option
Post by: Phil Harvey on November 13, 2011, 05:57:46 AM
Hi Bogdan,

By default, exiftool does not encode/decode EXIF "string" values.  However, the MWG suggests writing UTF-8 for these.  I have just added a feature in ExifTool 8.69 which allows the internal encoding of EXIF strings to be specified.  [Isn't it great how I can anticipate these problems and provide a solution before they occur? ;) ]  See the updated FAQ number 10 (https://exiftool.org/faq.html#Q10) for details.

So now ExifTool will recode EXIF "string" values however you want.  If you want to store EXIF as UTF-8 (recommended), then you should do something like this when writing:

exiftool -charset exif=utf8 -charset CHARSET -exif:artist=Günther photo.jpg

where CHARSET is the character encoding you are using for input.  (Or -L is shorter if you are using -charset latin.)
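
[Editor's sketch] For an application that builds its own argument list, the write command above might be assembled as follows. build_write_args is a hypothetical helper, and "latin" is just the example charset name mentioned above. How the argument bytes actually reach exiftool depends on the platform and on how the process is started, so the -charset value has to match whatever encoding is really used.

```python
def build_write_args(artist: str, path: str, input_charset: str) -> list[str]:
    """Hypothetical helper mirroring the write command shown above."""
    return [
        "exiftool",
        "-charset", "exif=utf8",     # store EXIF strings internally as UTF-8
        "-charset", input_charset,   # encoding of the values being passed in
        f"-exif:artist={artist}",
        path,
    ]


args = build_write_args("Günther", "photo.jpg", "latin")
print(args)

# To actually run it (needs exiftool 8.69 or later on PATH):
#   import subprocess
#   subprocess.run(args, check=True)
```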

With the command you gave, the EXIF "string" value will be stored using your local character encoding.  Use this command to read values like this:

exiftool -charset exif=CHARSET -lang de -exif:artist photo.jpg
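
[Editor's sketch] Conceptually, telling ExifTool the EXIF charset lets it recode the stored value so the whole output line ends up in one encoding. The Python below only models that idea, it is not ExifTool's implementation. It assumes cp1250 is the Python codec matching the CHARSET used when writing, and that the output encoding is UTF-8.

```python
EXIF_CHARSET = "cp1250"  # assumption: codec matching the CHARSET used when writing

stored_value = "Günther".encode(EXIF_CHARSET)  # raw bytes as stored in EXIF
tag_name = "Künstler".encode("utf-8")          # translated tag name, already UTF-8

# Without recoding, the output line mixes two encodings and is not valid UTF-8.
mixed = tag_name + b" : " + stored_value

# With recoding, the value is read in the EXIF charset and re-emitted as UTF-8,
# so the whole line can be decoded in a single step.
recoded = tag_name + b" : " + stored_value.decode(EXIF_CHARSET).encode("utf-8")

print(recoded.decode("utf-8"))
```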

- Phil
Title: Re: Question about -Lang option
Post by: BogdanH on November 13, 2011, 06:16:07 AM
My eyeballs just popped out! -that's all I can say.

Bogdan