Convert EXIF tags character set from Latin2 to UTF-8

Started by edison23, November 20, 2022, 06:25:34 AM

edison23

Hi, I've tried searching and experimenting, but I can't get the result I need. Sorry if I'm missing something obvious.

I have images whose Image Description EXIF field was populated by crappy old software that didn't use UTF-8. The descriptions are in Czech, which means they contain characters like ř, š, ě, ...

When I do exiftool -exif:'*Description*' sample.jpg, I get:
Image Description               : ��rka na cest� do B
When I do exiftool -exif:'*Description*' -charset exif=latin2 sample.jpg, I get the description correctly:
Image Description               : Šárka na cestě do B
This is causing issues in other software such as Piwigo, which can't read the descriptions (or tags encoded the same way) properly.
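
For anyone wondering where the replacement characters come from, here is a minimal Python sketch. It assumes the description was stored as ISO 8859-2 (Latin2) bytes, which is what the working -charset exif=latin2 result suggests. It only illustrates the decoding mismatch and does not read the image itself.

```python
# Assumption: the old software stored the text as ISO 8859-2 (Latin2) bytes.
stored = "Šárka na cestě do B".encode("iso8859_2")

# A reader that expects UTF-8 finds invalid byte sequences and substitutes
# U+FFFD (the replacement character) for each of them.
misread = stored.decode("utf-8", errors="replace")

# Decoding with the correct character set recovers the text, which can then
# be re-encoded as UTF-8 for software that expects UTF-8.
fixed = stored.decode("iso8859_2")
utf8_bytes = fixed.encode("utf-8")

print(misread)
print(fixed)
```

In Latin2, Š, á and ě are the single bytes 0xA9, 0xE1 and 0xEC. None of them is a valid UTF-8 sequence on its own, so each one turns into a replacement character while the plain ASCII letters survive.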

My goal is to overwrite the images with correctly encoded EXIF data. I've tried
exiftool -tagsfromfile sample.jpg -exif:all -charset exif=latin2 sample.jpg
but the character set stays the same.

Could you please advise? Thank you very much!

P.S.: Yes, I don't know how to select a tag whose name contains spaces other than by using the wildcard.

Phil Harvey

Unfortunately the -charset exif option applies when both reading and writing, so ImageDescription will be written back with the same encoding.  Instead, you could extract to an XMP sidecar, then write it back from there:

1. exiftool "-description<imagedescription" -charset exif=latin2 -o %d%f.xmp -ext jpg DIR

2. exiftool -tagsfromfile %d%f.xmp "-imagedescription<description" -ext jpg DIR

You can add other tags you want to convert to these commands, but I wouldn't suggest using a blunt instrument like -exif:all.

- Phil

P.S. Tag names don't contain spaces.  See FAQ 2.
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).
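
If the two passes need to be scripted, one possible sketch in Python follows. The helper names build_commands and convert, and the tag_map parameter, are hypothetical and not part of ExifTool. The only mapping taken from this thread is imagedescription to description; any further pairs are for the user to supply. The second pass here targets the JPG files (-ext jpg), on the assumption that the images rather than the sidecars are what should be updated.

```python
import subprocess
from typing import Dict, List, Optional


def build_commands(directory: str,
                   tag_map: Optional[Dict[str, str]] = None,
                   charset: str = "latin2") -> List[List[str]]:
    """Return the argument lists for the two exiftool passes.

    tag_map maps an EXIF tag name to the XMP tag used as a temporary holder.
    """
    if tag_map is None:
        # The only pair confirmed in the thread.
        tag_map = {"imagedescription": "description"}

    to_xmp = [f"-{xmp}<{exif}" for exif, xmp in tag_map.items()]
    to_exif = [f"-{exif}<{xmp}" for exif, xmp in tag_map.items()]

    # Pass 1: read EXIF as the given charset and write XMP sidecars.
    step1 = ["exiftool", *to_xmp, "-charset", f"exif={charset}",
             "-o", "%d%f.xmp", "-ext", "jpg", directory]
    # Pass 2: copy the values from each sidecar back into its JPG.
    step2 = ["exiftool", "-tagsfromfile", "%d%f.xmp", *to_exif,
             "-ext", "jpg", directory]
    return [step1, step2]


def convert(directory: str) -> None:
    """Run both passes; raises CalledProcessError if exiftool reports failure."""
    for argv in build_commands(directory):
        subprocess.run(argv, check=True)
```

Passing the arguments as a list sidesteps shell quoting, so the "<" in the copy arguments needs no quotes. Calling convert("DIR") requires exiftool on the PATH; it is worth trying it on a copy of a few images first.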

edison23

Thanks, Phil, for the solution and explanation! :) It works great on the description field.

There is another field, `usercomment`, that ExifTool reads correctly (the diacritics display fine) but Piwigo doesn't, even after I try to convert it using your method. But I guess that doesn't have anything to do with ExifTool.

Phil Harvey

This only works for "string" format tags (see the EXIF Tag Name documentation).  UserComment is "undef", and includes an encoding parameter, so the encoding is known and shouldn't be an issue.  It is a bug in Piwigo if it can't interpret the encoding parameter properly.

- Phil
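
To illustrate the encoding parameter mentioned above: in the EXIF specification, a UserComment value begins with an 8-byte character code, followed by the text. The Python sketch below shows how a reader could honour it. The function name decode_user_comment and its parameters are hypothetical, the byte order for the UNICODE case is left to the caller because writers differ in practice, and JIS is deliberately not handled.

```python
# 8-byte character codes defined for EXIF UserComment.
ASCII_ID = b"ASCII\x00\x00\x00"
UNICODE_ID = b"UNICODE\x00"
UNDEFINED_ID = b"\x00" * 8


def decode_user_comment(value: bytes,
                        big_endian: bool = False,
                        fallback: str = "utf-8") -> str:
    """Decode a raw UserComment value according to its character code."""
    if len(value) < 8:
        raise ValueError("UserComment is shorter than its 8-byte character code")

    code, payload = value[:8], value[8:]
    if code == ASCII_ID:
        text = payload.decode("ascii", errors="replace")
    elif code == UNICODE_ID:
        # Assumption: UTF-16 in the byte order chosen by the caller.
        codec = "utf-16-be" if big_endian else "utf-16-le"
        text = payload.decode(codec, errors="replace")
    elif code == UNDEFINED_ID:
        # The encoding is not declared, so the caller's guess is used.
        text = payload.decode(fallback, errors="replace")
    else:
        raise ValueError(f"unsupported character code: {code!r}")

    # Writers often pad the value with NULs or spaces.
    return text.rstrip("\x00 ")


comment = UNICODE_ID + "Šárka".encode("utf-16-le")
print(decode_user_comment(comment))
```

A reader that skips the first 8 bytes, or decodes the payload without looking at them, will mangle non-ASCII text even though the file itself is fine.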