Question about -Lang option

Started by BogdanH, November 13, 2011, 05:21:59 AM


BogdanH

Hi Phil,
I'm experimenting a bit with the -Lang option in my test (stay_open) application. I don't use any of ExifTool's various "use codepage..characterset" options. I just make sure that a correctly encoded string is used when writing/reading data with ExifTool, i.e.:

  • for Exif, Ascii (Ansi, actually) encoded characters are used,
  • for Xmp, Utf8 encoded characters are used.
-btw. this approach works very well in GUI.

Let's say I execute:
exiftool -exif:artist=Günther photo.jpg
-here, as mentioned above, an Ansi string is sent to ExifTool.
Now, if I execute:
exiftool -exif:artist photo.jpg
-I get the correct (Ansi) tag name and its value, as expected.

But, if I execute:
exiftool -lang de -exif:artist photo.jpg
-then I get:
KĂĽnstler : Günther
-here, I would expect to get Künstler : Günther.

If I decode the whole KĂĽnstler : Günther line as Utf8, I get the tag name right, but then the converted tag value is wrong.
I assume that all translations in ExifTool are in Utf8. That's ok when dealing with Xmp, because the whole output line is in Utf8 anyway. But for Exif, decoding is quite difficult, because in the same line the tag name is in Utf8 while the tag value is in Ansi.
Maybe I am missing something, or my character decoding process is wrong...
Any ideas?

Bogdan
PS: I've looked into the ..Image/ExifTool/Lang/de.pm file and it is saved in Utf8. I've converted the file into Ansi, and the (Exif) problem is gone. But in this case, as expected, the problem appears for Xmp. If only Exif would allow Utf8...
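
The mixed-encoding line described in this post can be reproduced in a few lines of Python. This is only a sketch: it assumes "Ansi" means Windows-1250 (a guess based on how the garbled tag name looks; the post does not name the code page), and it builds the output line by hand instead of calling ExifTool.

```python
# Sketch of the mixed-encoding output line described in the post.
# Assumption: "Ansi" means Windows-1250 here; the post does not name the code page.
ANSI = "cp1250"

# The translated tag name comes from the UTF-8 language file, while the EXIF
# value is passed through unchanged, so it is still in the Ansi code page.
name_bytes = "Künstler".encode("utf-8")
value_bytes = "Günther".encode(ANSI)
line = name_bytes + b" : " + value_bytes

# Decoding the whole line as Ansi garbles the tag name but keeps the value.
as_ansi = line.decode(ANSI)

# Decoding the whole line as UTF-8 keeps the tag name but breaks the value,
# because the lone byte 0xFC is not valid UTF-8.
as_utf8 = line.decode("utf-8", errors="replace")

# Decoding each part with its own encoding recovers both.
fixed = name_bytes.decode("utf-8") + " : " + value_bytes.decode(ANSI)

print(as_ansi)
print(as_utf8)
print(fixed)
```

No single decoding of the whole line gets both halves right, which is why the fix has to happen before the line is produced, not after.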

Phil Harvey

Hi Bogdan,

By default, exiftool does not encode/decode EXIF "string" values.  However, the MWG suggests writing UTF-8 for these.  I have just added a feature in ExifTool 8.69 which allows the internal encoding of EXIF strings to be specified.  [Isn't it great how I can anticipate these problems and provide a solution before they occur? ;) ]  See the updated FAQ number 10 for details.

So now ExifTool will recode EXIF "string" values however you want.  If you want to store EXIF as UTF-8 (recommended), then you should do something like this when writing:

exiftool -charset exif=utf8 -charset CHARSET -exif:artist=Günther photo.jpg

where CHARSET is the character encoding you are using for input.  (Or -L is shorter if you are using -charset latin.)

With the command you gave, the EXIF "string" value will be stored using your local character encoding.  Use this command to read values like this:

exiftool -charset exif=CHARSET -lang de -exif:artist photo.jpg
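
For anyone driving ExifTool from a script, the two commands above can be assembled as argument lists. A minimal sketch in Python: the helper names write_artist_args and read_artist_args are made up for illustration, and only the options quoted in this post are used.

```python
# Sketch: build argument lists for the write and read commands quoted above.
# The helper names are hypothetical; the options are the ones shown in the post.

def write_artist_args(artist, path, input_charset):
    """Store EXIF strings as UTF-8, with the input given in input_charset."""
    return [
        "exiftool",
        "-charset", "exif=utf8",
        "-charset", input_charset,
        f"-exif:artist={artist}",
        path,
    ]


def read_artist_args(path, exif_charset, lang="de"):
    """Read EXIF strings that were stored in exif_charset, with translated tag names."""
    return [
        "exiftool",
        "-charset", f"exif={exif_charset}",
        "-lang", lang,
        "-exif:artist",
        path,
    ]


if __name__ == "__main__":
    print(write_artist_args("Günther", "photo.jpg", "latin"))
    print(read_artist_args("photo.jpg", "latin"))
```

The lists could be passed to subprocess.run(). Whether a non-ASCII value actually reaches ExifTool in the charset you name depends on the platform, so treat this as a starting point only.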

- Phil

BogdanH

My eyeballs just popped out! -that's all I can say.

Bogdan