ExifTool Forum

ExifTool => The "exiftool" Application => Topic started by: jean on May 07, 2010, 04:48:35 AM

Title: Reading Unicode metadata
Post by: jean on May 07, 2010, 04:48:35 AM
Hello
I have JPEGs with Unicode metadata (Cyrillic).
How can I display this metadata?
I use exiftool.exe on Windows XP.
Title: Re: Reading Unicode metadata
Post by: Phil Harvey on May 07, 2010, 06:58:51 AM
I think that reading FAQ 18 (https://exiftool.org/faq.html#Q18) and maybe FAQ 10 (https://exiftool.org/faq.html#Q10) too may help.

- Phil
Title: Re: Reading Unicode metadata
Post by: jean on May 07, 2010, 08:18:11 AM
I created a process that reads the IPTC metadata from a JPEG.
The metadata is written to a text file, and then I read the text file to parse the information.
I use the following command line:

exiftool -IPTC:all -L file.jpg >report.txt

In report.txt the information is not correct:

[IPTC] Keywords: Àíäðåé, Èâàí, Òàðàñ
[IPTC] By-line: Ïåòðîâ Ïåòð
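
A note on the garbled output above: the characters are consistent with Windows-1251 (Cyrillic) bytes being displayed as Latin-1, rather than with damaged UTF-8. The short Python sketch below checks that reading. It is only a diagnostic under that assumption; the function name `recover_cp1251` is hypothetical and not part of ExifTool.

```python
def recover_cp1251(garbled: str) -> str:
    """Undo a Latin-1 display of bytes that were really Windows-1251.

    Assumption: each garbled character maps back to exactly one original byte.
    """
    raw = garbled.encode("latin-1")   # recover the original byte values
    return raw.decode("cp1251")       # reinterpret them as Cyrillic


print(recover_cp1251("Àíäðåé, Èâàí, Òàðàñ"))  # Андрей, Иван, Тарас
print(recover_cp1251("Ïåòðîâ Ïåòð"))          # Петров Петр
```

If the decoded text is readable, the values were probably stored in a local code page rather than as UTF-8.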
Title: Re: Reading Unicode metadata
Post by: Christian Etter on May 08, 2010, 04:09:00 PM
Hi,
two suggestions: first, do not use the -L option. Cyrillic letters cannot be represented in Latin1 encoding. Second, make sure your text editor a) supports UTF-8 and b) will open the file as UTF-8. I would recommend Notepad++ for this purpose.

You can find more information about ExifTool and Unicode on my web page: http://www.christian-etter.de/?tag=exiftool

Christian
Title: Re: Reading Unicode metadata
Post by: jean on May 09, 2010, 02:31:01 AM
Hi
I removed the -L option.
Notepad usually displays the Cyrillic metadata correctly. I'm going to your blog, thanks for the tip  :)
Title: Re: Reading Unicode metadata
Post by: jean on May 13, 2010, 05:06:24 AM
I did not find anything that lets me get correct Unicode text.  :'(
Has anyone succeeded?
Title: Re: Reading Unicode metadata
Post by: Christian Etter on May 20, 2010, 04:37:11 AM
Please make sure the files you are reading contain 100% correct Unicode metadata. Chances are that incorrect or non-Unicode metadata was written, in which case it is impossible to get correct Unicode back.
If you are writing cyrillic characters, only use a UTF-8 encoded argfile with the -@ option. (There is a way to use UTF-8 in the command line, but it is a bit confusing at first).
When reading, write the output to a text file, do not use the -L option.
To get more help, you should post all commands you are executing.

Chris
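
The argfile approach can be sketched in Python as follows. This is a minimal sketch, not Christian's exact procedure: the helper names `write_argfile` and `run_exiftool`, the file names, and the sample tag value are hypothetical, and it assumes `exiftool` is on the PATH. Each line of the argfile holds one argument, and the file is saved as UTF-8 without a byte order mark.

```python
import subprocess
from pathlib import Path


def write_argfile(path, args):
    """Write one exiftool argument per line as UTF-8 (no BOM)."""
    Path(path).write_bytes(("\n".join(args) + "\n").encode("utf-8"))


def run_exiftool(argfile, image):
    """Run exiftool with arguments taken from the argfile (-@ option).

    Output is captured as bytes and decoded as UTF-8, with no -L option.
    """
    result = subprocess.run(
        ["exiftool", "-@", str(argfile), str(image)],
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8")


# Hypothetical usage: mark the IPTC text as UTF-8, write a Cyrillic by-line,
# then read the IPTC group back.
# write_argfile("write.args", ["-IPTC:CodedCharacterSet=UTF8",
#                              "-IPTC:By-line=Петров Петр"])
# run_exiftool("write.args", "file.jpg")
# write_argfile("read.args", ["-IPTC:all"])
# print(run_exiftool("read.args", "file.jpg"))
```

Keeping the Cyrillic text in a file rather than on the command line avoids the Windows console code page, which is the usual source of trouble here.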
Title: Re: Reading Unicode metadata
Post by: jean on May 20, 2010, 11:24:11 AM
I have attached the file (it's a small one).
It has Cyrillic EXIF (Image Description) and IPTC (Caption-Abstract, some keywords) metadata.
This metadata is not read correctly by ExifTool  :'(
Title: Re: Reading Unicode metadata
Post by: Christian Etter on May 20, 2010, 03:46:17 PM
Text is in UTF-8 format, it reads out fine. Try:

exiftool -location IMG_0650_web.jpg>location.txt

Then open the text file with a UTF-8 aware editor; it should show "St. Isaac's Cathedral (Исаакиевский Собор)". This should solve the problem.

Christian
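
To check the redirected file without relying on an editor's encoding guess, the bytes can be decoded explicitly. A small Python sketch, assuming the file was produced by a command like the one above; `read_utf8_report` is a hypothetical helper name.

```python
from pathlib import Path


def read_utf8_report(path):
    """Return the report text, decoding strictly as UTF-8.

    Raises UnicodeDecodeError if the file is not valid UTF-8, which may mean
    the output was produced with -L or the metadata uses another character set.
    """
    data = Path(path).read_bytes()
    # "utf-8-sig" also accepts a leading byte order mark if an editor added one.
    return data.decode("utf-8-sig")
```

For example, `read_utf8_report("location.txt")` either returns the text with the Cyrillic intact or fails loudly instead of silently showing garbled characters.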
Title: Re: Reading Unicode metadata
Post by: Phil Harvey on May 20, 2010, 05:44:26 PM
I'm glad Christian is here to help you with Windows quirks.  Thanks Christian.

Here is the information as extracted in a Mac terminal:

(https://exiftool.org/~phil/img/screen.png)

- Phil