HOW TO WRITE SPECIAL CHARACTERS (unicode) with exiftool

Started by pauloHess, April 30, 2013, 01:45:22 PM

pauloHess

Hi

My metadata has some special characters such as © or é, but after running exiftool they appear as "?".

I tried (based on FAQ doc) the following with no results:

exiftool -charset="utf-8" -title="© 2007 Tim Hawkinson" FILE
or
exiftool -charset=utf-8 -title="© 2007 Tim Hawkinson" FILE
(Warning: Tag 'charset' does not exist)

exiftool -charset  XMP="utf" -title="© 2007 Tim Hawkinson" FILE
(Unknown type for -charset option: XMP)


how can I make these chars show up?

Thanks
P.H.


Phil Harvey

The syntax is:

exiftool -charset utf8 ...

... but this will only work if your console uses UTF8.  What system are you running?

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).
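
A rough way to check what the console is expected to use, as a minimal sketch. It assumes a POSIX shell with the `locale` and `od` utilities; `locale charmap` only reports the locale setting, which a terminal emulator may or may not follow, so treat the result as a hint rather than proof.

```shell
# Report the character encoding of the current locale.
# Falls back to "unknown" if the locale utility is not available.
charmap=$(locale charmap 2>/dev/null || echo unknown)
echo "locale charmap: $charmap"

# The copyright sign as UTF-8, written with octal escapes so that this
# script itself stays pure ASCII. A UTF-8 console sends these two bytes.
copyright_hex=$(printf '\302\251' | od -An -tx1 | tr -d ' \n')
echo "UTF-8 bytes for the copyright sign: $copyright_hex"
```

If the first line does not report UTF-8, the characters typed on the command line are probably reaching exiftool in some other encoding.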

pauloHess

Thanks,

My system is UNIX and it is set to UTF-8. I can see special characters when I display them on the screen, but once I pass them to exiftool they come out as "?".
For example:

exiftool -charset utf8 -rights="©copyrighted document " File

This says "1 file updated ...", but the output is:

rights=?copyrighted document

Phil Harvey

Very odd.  I don't think your console is set to UTF8, because this should work (without the "-charset UTF8" even, since this is the default).

If you can't get this to work with UTF-8, then try HTML-encoding:

exiftool -h -rights="&copy;copyrighted document" FILE

- Phil

pauloHess

Please note the word "Saint-Rémy" (the é displays fine in the UNIX terminal):

teamsdev(test)$ exiftool -charset utf-8 -description="Irises Saint-Rémy France Europe"  gm_1.jpg

teamsdev(test)$ exiftool -all -g  gm_1.jpg

output from exiftool :
....
---- XMP ----
XMP Toolkit                     : Image::ExifTool 8.80
Description                     : Irises Saint-R?my France Europe
.....
I get exactly the same behavior with Windows "cmd".
Is there any solution to this?

Thanks a lot.


Phil Harvey

The Windows cmd shell would be expected to have this behaviour since most systems use Windows Latin1 encoding by default.  For this, you would need to specify -charset Latin to get it to work.  As I said, specifying -charset UTF-8 has no effect since UTF-8 is already the default.  Try -charset Latin (or, equivalently, -L).

- Phil

pauloHess

Thank you.

I have a couple of quick questions:

1- When I write the -Copyright tag and then look at the metadata in Photoshop, the field shows up when Photoshop is running on Windows, but on a Mac the Copyright field is blank. Is there something special I have to do to make it show up on the Mac too? Here is my command:

exiftool    -Copyright="© 2007 Tim Hawkinson"  File

2- The FAQ has a list of languages supported for -charset (see below). Do you have support for French, Dutch and German? Thanks

Valid CHARSET values are (with aliases given in brackets):

UTF8      (cp65001, UTF‑8)      Thai          (cp874)
Latin     (cp1252, Latin1)      MacRoman      (cp10000, Mac, Roman)
Latin2    (cp1250)              MacLatin2     (cp10029)
Cyrillic  (cp1251, Russian)     MacCyrillic   (cp10007)
Greek     (cp1253)              MacGreek      (cp10006)
Turkish   (cp1254)              MacTurkish    (cp10081)
Hebrew    (cp1255)              MacRomanian   (cp10010)
Arabic    (cp1256)              MacIceland    (cp10079)
Baltic    (cp1257)              MacCroatian   (cp10082)

Phil Harvey

Quote from: pauloHess on May 03, 2013, 04:38:11 PM
1- When I write the -Copyright tag and then look at the metadata in Photoshop, the field shows up when Photoshop is running on Windows, but on a Mac the Copyright field is blank. Is there something special I have to do to make it show up on the Mac too? Here is my command:

FAQ 3 gives help with this.

Quote
2- The FAQ has a list of languages supported for -charset (see below). Do you have support for French, Dutch and German? Thanks

These are character sets, not languages.  The -lang option has support for French (fr).

- Phil

pauloHess

Thank you so much for all the help.

Unless I am using -lang wrong, when I use it with a simple French sentence some characters appear correctly and some do not. Example:

C:\exiftool>exiftool -lang  -title=" décision vient à l'improviste à" FILE

I get this:
---- XMP ----
XMP Toolkit                     : Image::ExifTool 9.28
Title                           :  d?cision vient ? l'improviste ?

I also tried -latin, with no success.
Can we make all the characters appear correctly?

One thing that puzzles me is that when I use (just for the sake of testing) -lang -copyright=" décision vient à l'improviste à"
it works perfectly.

---- EXIF ----
X Resolution                    : 72
Y Resolution                    : 72
Resolution Unit                 : inches
Y Cb Cr Positioning             : Centered
Copyright                       : décision vient à l'improviste à
-----------------------------------------------
But again, I need this in the title, not the copyright.

Thanks again.

Phil Harvey

OK.  First, you don't want to use the -lang option.  That is for languages, and you are having character set problems.  There is no French character set.

The entire problem is that we don't know the character set that your shell is using.  Apparently it isn't Latin1, because you say that -charset latin didn't work.  Did you try this?:

exiftool -charset latin -title="décision vient à l'improviste à" FILE

then extract with

exiftool -charset latin FILE

If that doesn't work, then try -charset latin2 maybe.

I am a bit surprised that your character set isn't UTF8, in which case this would work without the -charset option.

Try this:

echo "décision vient à l'improviste à" > out.txt

Then attach "out.txt" in this forum so I can take a look at the file.  I have attached the output on my system (which is UTF-8).  A hex dump of this file gives:

> hexdump -C out.txt
00000000  64 c3 a9 63 69 73 69 6f  6e 20 76 69 65 6e 74 20  |d..cision vient |
00000010  c3 a0 20 6c 27 69 6d 70  72 6f 76 69 73 74 65 20  |.. l'improviste |
00000020  c3 a0 0a                                          |...|
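
For comparison, here is a minimal sketch of how the first word of that phrase differs between a UTF-8 console and a Latin1 console. It assumes a POSIX shell with `od` and an `iconv` that knows ISO-8859-1 (used here as a stand-in for Windows Latin1, which encodes é identically); the variable names are only illustrative.

```shell
# "décision" as UTF-8, written with octal escapes so the script is pure ASCII.
utf8_hex=$(printf 'd\303\251cision' | od -An -tx1 | tr -d ' \n')

# The same word converted to ISO-8859-1, where é is the single byte e9.
latin1_hex=$(printf 'd\303\251cision' | iconv -f UTF-8 -t ISO-8859-1 | od -An -tx1 | tr -d ' \n')

echo "UTF-8 : $utf8_hex"
echo "Latin1: $latin1_hex"
```

A dump that shows c3 a9 for the é matches the UTF-8 output above; a lone e9 in that position points to a Latin1-style encoding.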


The Copyright works because ExifTool does not translate the encoding of EXIF ASCII strings.  But there is little chance that another system will display this information correctly the way you have written it.

- Phil

pauloHess

Hi and thanks again.

I am attaching "out.txt" from my system.

By the way, I did use -lang :

exiftool -lang  -title=" décision vient à l'improviste à" FILE

I get this:
---- XMP ----
XMP Toolkit                     : Image::ExifTool 9.28
Title                           :  d?cision vient ? l'improviste ?


One sloppy solution is to write the value to the -copyright tag (-lang -copyright=" décision vient à l'improviste à"), since we now know that "ExifTool does not translate the encoding of EXIF ASCII strings", and then copy the value of the -copyright tag to -title.

How do you copy the value of one tag to another?

Phil Harvey

First, the -lang option requires an argument (i.e. -lang fr), so what you are doing won't work.

Second, -lang has nothing to do with your character set problem.

Third, your "out.txt" file is Windows Latin1, so you should use -charset latin or -L when reading and writing with ExifTool, just like I suggested in my last post.

- Phil
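
One way to tell whether a file like "out.txt" is UTF-8 at all, as a minimal sketch: ask `iconv` to read the bytes as UTF-8 and see whether it complains. This assumes an `iconv` that exits with a non-zero status on invalid input (GNU iconv does); `is_utf8` is a hypothetical helper name, and a failed check only says "not UTF-8", not which single-byte character set was used.

```shell
# Hypothetical helper: succeed if the bytes on stdin are valid UTF-8.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# "décision" with é as the two UTF-8 bytes c3 a9: should pass the check.
if printf 'd\303\251cision' | is_utf8; then
    echo "first sample: valid UTF-8"
fi

# "décision" with é as the single Latin1 byte e9: should fail the check.
if ! printf 'd\351cision' | is_utf8; then
    echo "second sample: not UTF-8 (possibly Latin1)"
fi
```

To test a real file, feed it to the helper on standard input, for example `is_utf8 < out.txt`.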

pauloHess

Hi Phil

If you remember, we had a discussion about embedding strings with special characters. Your recommendation was to use -charset latin, and it has worked fine in all cases so far.

At the time I was running this on a UNIX machine.
Recently I moved to a Linux box and started running the same script, which executes some exiftool commands. I am facing the same old problem:
all metadata is processed correctly except special characters, but this time -charset latin doesn't help.

What is the problem here? How can I fix this?
Thanks




Phil Harvey

It seems that your terminal character set has changed.  You need to figure out what it is, and use the corresponding -charset in exiftool.

- Phil
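
A minimal sketch of that lookup, assuming a POSIX shell with the `locale` utility. `exiftool_charset_for` is a hypothetical helper, not part of ExifTool, and it covers only a few names from the CHARSET list quoted earlier in this thread; mapping ISO-8859-1 to Latin is an assumption that holds for ordinary accented letters but not for every byte of cp1252.

```shell
# Hypothetical helper: translate a locale charmap name into a value
# for exiftool's -charset option. Unrecognised names give "unknown".
exiftool_charset_for() {
    case "$1" in
        UTF-8|UTF8|utf8)    echo UTF8 ;;
        ISO-8859-1|CP1252)  echo Latin ;;
        CP1250)             echo Latin2 ;;
        *)                  echo unknown ;;
    esac
}

charset=$(exiftool_charset_for "$(locale charmap 2>/dev/null)")
if [ "$charset" = unknown ]; then
    echo "no suggestion: check the terminal settings by hand"
else
    echo "suggested option: -charset $charset"
fi
```

The suggestion is only as good as the locale setting; if the terminal emulator is configured differently from the locale, the terminal's own setting is the one that matters.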