Reading Unicode metadata

Started by jean, May 07, 2010, 04:48:35 AM


jean

Hello,
I have JPEGs with Unicode (Cyrillic) metadata.
How can I display this metadata?
I use exiftool.exe on Windows XP.

Phil Harvey

I think reading FAQ 18, and maybe FAQ 10 too, may help.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

jean

I created a process that reads the IPTC metadata from a JPEG.
The metadata is written to a text file, then I read the text file to parse the information.
I use the following command line:

exiftool -IPTC:all -L file.jpg >report.txt

In report.txt the information is not correct:

[IPTC] Keywords: Àíäðåé, Èâàí, Òàðàñ
[IPTC] By-line: Ïåòðîâ Ïåòð

Christian Etter

Hi,
two suggestions: first, do not use the -L option. Cyrillic letters cannot be represented in Latin1 encoding. Second, make sure your text editor a) supports UTF-8 and b) will open the file as UTF-8. I would recommend Notepad++ for this purpose.

You can find more information about ExifTool and Unicode on my web page: http://www.christian-etter.de/?tag=exiftool

Christian
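
The garbled report lines above are consistent with Cyrillic text stored as Windows-1251 bytes and then displayed as Latin1. That is an assumption drawn only from the two sample lines, not something confirmed in this thread. A minimal Python sketch of the round trip (the function name is hypothetical):

```python
def undo_latin1_mojibake(garbled, actual_encoding="cp1251"):
    """Re-read text that was shown as Latin1 but is really in another 8-bit encoding.

    Assumption: every character in `garbled` is in the Latin1 range, so it
    maps back to exactly one original byte.
    """
    raw = garbled.encode("latin-1")      # recover the original bytes
    return raw.decode(actual_encoding)   # interpret them as Cyrillic


# The two values from report.txt:
keywords = undo_latin1_mojibake("Àíäðåé, Èâàí, Òàðàñ")  # "Андрей, Иван, Тарас"
byline = undo_latin1_mojibake("Ïåòðîâ Ïåòð")            # "Петров Петр"
```

If the result reads as sensible Russian, the stored bytes were probably never UTF-8, which would explain why dropping -L alone does not fix the display.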

jean

Hi,
I removed the -L option.
Notepad usually displays Cyrillic metadata correctly. I'm going to your blog, thanks for the tip  :)

jean

I did not find anything that lets me get correct Unicode text.  :'(
Has anyone succeeded?

Christian Etter

Please make sure the files you are reading contain 100% correct Unicode metadata. Chances are that you wrote incorrect or non-Unicode metadata, in which case it is impossible to get correct Unicode back.
If you are writing Cyrillic characters, only use a UTF-8 encoded argfile with the -@ option. (There is a way to use UTF-8 on the command line, but it is a bit confusing at first.)
When reading, write the output to a text file and do not use the -L option.
To get more help, you should post all the commands you are executing.

Chris
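
One way to produce the UTF-8 argfile mentioned above is to generate it from a script rather than save it from an editor. The sketch below is only an illustration: the function name, the tag values, and the file names are placeholders, and the CodedCharacterSet line is an assumption about how to mark IPTC text as UTF-8 (check FAQ 10 for your ExifTool version).

```python
def write_utf8_argfile(path, args):
    """Write one exiftool argument per line, encoded as UTF-8 without a BOM."""
    # newline="\n" keeps the line endings predictable on Windows as well.
    with open(path, "w", encoding="utf-8", newline="\n") as fh:
        for arg in args:
            fh.write(arg + "\n")


# Placeholder arguments; each line of the argfile is one complete argument,
# so values containing spaces need no quoting.
example_args = [
    "-IPTC:CodedCharacterSet=UTF8",  # assumption: declares the IPTC text as UTF-8
    "-IPTC:Keywords=Андрей",
    "-IPTC:By-line=Петров Петр",
    "file.jpg",
]

# write_utf8_argfile("args.txt", example_args)
# Then, on the command line:  exiftool -@ args.txt
```

Writing the file from code avoids depending on the console code page or on an editor's default encoding.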

jean

I have attached the file (it's a small one).
It has Cyrillic EXIF (Image Description) and IPTC (Caption-Abstract, some keywords) metadata.
This metadata is not read correctly by ExifTool  :'(

Christian Etter

The text is in UTF-8 format and reads out fine. Try:

exiftool -location IMG_0650_web.jpg>location.txt

then open the text file with a UTF-8 aware editor. You should see "St. Isaac's Cathedral (Исаакиевский Собор)". This should solve the problem.

Christian
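
For a process that parses the redirected text file, the same rule applies: read it explicitly as UTF-8 instead of relying on the system default encoding. This is a rough sketch, assuming the file holds one "Name : value" pair per line as written by a plain cmd.exe redirect; the function name is hypothetical.

```python
def parse_report(path):
    """Read an exiftool text report as UTF-8 and return {name: value}."""
    fields = {}
    # "utf-8-sig" reads plain UTF-8 and also skips a BOM if one is present.
    with open(path, encoding="utf-8-sig") as fh:
        for line in fh:
            # Split on the first colon only, so colons inside values survive.
            name, sep, value = line.partition(":")
            if sep:
                fields[name.strip()] = value.strip()
    return fields


# Example use after running:  exiftool -location IMG_0650_web.jpg>location.txt
# info = parse_report("location.txt")
# info.get("Location")
```

A dictionary keeps only the last value when a name repeats, so collect values into a list if the report can contain the same tag more than once.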

Phil Harvey

I'm glad Christian is here to help you with Windows quirks.  Thanks Christian.

Here is the information as extracted in a Mac terminal:



- Phil