ExifTool Forum

ExifTool => Newbies => Topic started by: edison23 on November 20, 2022, 06:25:34 AM

Title: Convert EXIF tags character set from Latin2 to UTF-8
Post by: edison23 on November 20, 2022, 06:25:34 AM
Hi, I've tried searching and experimenting but I'm unable to get the result I need. Sorry if I'm missing something obvious.

I have images whose Image Description EXIF field was populated by some crappy old software that didn't use UTF-8. The descriptions are in Czech, which means they contain characters like ř, š, ě, ...

When I do exiftool -exif:'*Description*' sample.jpg, I get
Image Description               : ��rka na cest� do B
When I do exiftool -exif:'*Description*' -charset exif=latin2 sample.jpg, I get the description correctly:
Image Description               : Šárka na cestě do B
This is causing issues in other software, for example, Piwigo, that can't read the descriptions (and tags that are coded similarly) properly.
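
For anyone wondering where the � characters come from, the symptom can be reproduced without ExifTool. This is a minimal Python sketch, assuming the old software stored the text as ISO 8859-2 (Latin2) bytes and that the reading side decodes them as UTF-8. The variable names are illustrative only.

```python
# Assumption: the old software wrote the description as ISO 8859-2 (Latin2) bytes.
stored = "Šárka na cestě do B".encode("iso8859_2")

# A UTF-8 reader meets bytes that are not valid UTF-8 and substitutes U+FFFD.
misread = stored.decode("utf-8", errors="replace")

# Naming the real charset (the role of -charset exif=latin2) recovers the text.
recovered = stored.decode("iso8859_2")

print(misread)
print(recovered)
```

The accented letters are single bytes in Latin2, which is why each one turns into a single replacement character when read as UTF-8.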

My goal is to overwrite the images with correctly encoded EXIF data. I've tried
exiftool -tagsfromfile sample.jpg -exif:all -charset exif=latin2 sample.jpg
but the character set stays the same.

Could you please advise? Thank you very much!

P.S.: Yes, I don't know how to select a tag whose name contains spaces other than by using a wildcard.
Title: Re: Convert EXIF tags character set from Latin2 to UTF-8
Post by: Phil Harvey on November 20, 2022, 02:40:09 PM
Unfortunately the -charset exif option applies when both reading and writing, so ImageDescription will be written back with the same encoding.  Instead, you could extract to an XMP sidecar, then write it back from there:

1. exiftool "-description<imagedescription" -charset exif=latin2 -o %d%f.xmp -ext jpg DIR

2. exiftool -tagsfromfile %d%f.xmp "-imagedescription<description" -ext jpg DIR

You can add other tags you want to convert to these commands, but I wouldn't suggest using a blunt instrument like -exif:all.
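
Why the detour through an XMP sidecar helps can be sketched at the byte level. This is only an illustration in Python, not ExifTool's actual implementation. It assumes the EXIF string holds Latin2 bytes, that the sidecar stores the text as UTF-8, and that step 2 runs without a -charset exif option so the string is written back as UTF-8. The function names are hypothetical.

```python
def same_charset_round_trip(raw: bytes, charset: str = "iso8859_2") -> bytes:
    """Read and write with the same charset: the bytes come back unchanged."""
    text = raw.decode(charset)   # reading with -charset exif=latin2
    return text.encode(charset)  # writing with the same setting


def via_utf8_sidecar(raw: bytes, charset: str = "iso8859_2") -> bytes:
    """Step 1 decodes Latin2 into the sidecar; step 2 writes plain UTF-8."""
    sidecar_text = raw.decode(charset)   # step 1: Latin2 bytes -> Unicode text
    return sidecar_text.encode("utf-8")  # step 2: no charset override on write


latin2_bytes = "Šárka na cestě do B".encode("iso8859_2")
print(same_charset_round_trip(latin2_bytes) == latin2_bytes)
print(via_utf8_sidecar(latin2_bytes).decode("utf-8"))
```

The first function mirrors the single-command attempt, where the same charset setting is used for both reading and writing, so nothing changes. The second mirrors the two-step approach, where the charset setting is applied only on the read side.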

- Phil

P.S. Tag names don't contain spaces.  See FAQ 2 (https://exiftool.org/faq.html#Q2).
Title: Re: Convert EXIF tags character set from Latin2 to UTF-8
Post by: edison23 on November 22, 2022, 04:55:58 AM
Thanks, Phil, for the solution and explanation! :) It works great on the description field.

There is another field, `usercomment`, that ExifTool reads correctly (the diacritics display well) but Piwigo doesn't, even after I try to convert it using your method. But I guess that doesn't have anything to do with ExifTool.
Title: Re: Convert EXIF tags character set from Latin2 to UTF-8
Post by: Phil Harvey on November 22, 2022, 07:01:54 AM
This only works for "string" format tags (see the EXIF Tag Name documentation (https://exiftool.org/TagNames/EXIF.html)).  UserComment is "undef", and includes an encoding parameter, so the encoding is known and shouldn't be an issue.  It is a bug in Piwigo if it can't interpret the encoding parameter properly.
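
The encoding parameter mentioned here is the 8-byte character code that the EXIF specification places at the start of UserComment. A rough Python sketch of how a reader might honor it follows. The function name, the handling of only the ASCII and UNICODE codes, and the UTF-16 byte order passed as a parameter are assumptions made for illustration, not a description of what ExifTool or Piwigo actually do.

```python
def decode_user_comment(value: bytes, utf16_order: str = "utf-16-be") -> str:
    """Decode an EXIF UserComment using its 8-byte character-code prefix."""
    code, payload = value[:8], value[8:]
    if code == b"ASCII\x00\x00\x00":
        return payload.decode("ascii")
    if code == b"UNICODE\x00":
        # The byte order is an assumption here; real files vary.
        return payload.decode(utf16_order)
    # Other codes (JIS, undefined) are left out of this sketch.
    raise ValueError(f"unhandled character code: {code!r}")


comment = b"UNICODE\x00" + "Šárka".encode("utf-16-be")
print(decode_user_comment(comment))
```

A reader that skips the prefix and treats the whole value as one plain string would mangle the diacritics even though the stored data is well formed.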

- Phil