Hi,
I'm not getting something about the use of -charset. Can you see what I'm doing wrong here?
I'm running Windows 7
c:\dev\palaso\DistFiles>chcp 65001
Active code page: 65001
c:\dev\palaso\DistFiles>exiftool -ver
8.96
c:\dev\palaso\DistFiles>exiftool -use mwg -codedCharacterSet=utf8 -charset exif=utf8 -charset iptc=utf8 -copyright="Copyright ŋoŋ" "test.png"
1 image files updated
c:\dev\palaso\DistFiles>exiftool -copyright -charset exif=utf8 -charset iptc=utf8 "test.png"
Copyright : Copyright ?o?
c:\dev\palaso\DistFiles>
Notice that the engs (ŋ) come out as '?'s.
The console is using Lucida Console, so I don't think it's a font issue. I've tried -wmp, xmp fields etc, but never managed to round-trip utf8.
Thanks so much
jh
The problem is your console character encoding. Try adding -charset Latin when extracting, or change your console to UTF-8 (see FAQ 18 (https://exiftool.org/faq.html#Q18)).
Also, specifying -charset iptc=utf8 is not necessary. CodedCharacterSet takes priority if it exists.
- Phil
Phil, thanks for the help.
Not only does -charset Latin not make sense to me (I'm trying to support all of Unicode), it also didn't help:
c:\dev\palaso\DistFiles>chcp 65001
Active code page: 65001
c:\dev\palaso\DistFiles>exiftool -ver
8.96
c:\dev\palaso\DistFiles>exiftool -use mwg -codedCharacterSet=utf8 -charset exif=utf8 -copyright="Copyright ŋoŋ" "test.png"
1 image files updated
c:\dev\palaso\DistFiles>exiftool -charset Latin -copyright -charset exif=utf8 "test.png"
Copyright : Copyright ?o?
c:\dev\palaso\DistFiles>
As you can see from the first line, I begin by changing my console to 65001, as per the FAQ. So I'm still stuck. Could you post a similar example demonstrating it round-tripping? At least then I'd know for sure that the problem was my computer, and not my use of exiftool.
What I'm actually doing is open-source literacy software which is used with many scripts. The program uses exiftool to embed copyright/license on the illustrations to protect the indigenous artist's rights. So I need a solution that doesn't require, for example, permanently changing the user's default code page (I call exiftool from code).
thanks
jh
If you set your console to UTF-8, you shouldn't use -charset Latin.
But even if your console is UTF-8, it won't help if the console font doesn't contain the special characters. (What console font are you using?)
On my Mac here (with a UTF-8 console and a big font character set):
> exiftool -use mwg -codedCharacterSet=utf8 -charset exif=utf8 -copyright="Copyright ŋoŋ" "test.png"
1 image files updated
> exiftool -a -G1 -copyright "test.png"
[PNG] Copyright : Copyright ŋoŋ
[IFD0] Copyright : Copyright ŋoŋ
- Phil
Phil, thanks for checking on your mac.
I'm using Lucida Console, but it's not a font issue for me... the input text shows fine, so the font can't be the cause of the output showing as question marks.
Maybe chcp 65001 isn't quite enough? Maybe there is still some non-utf8 code page in the mix somewhere in the process? If any Windows users are reading this and could try out the test, your results would be appreciated.
thanks
jh
As a further data point, the latest exiftoolgui with the latest exiftool also cannot round-trip Unicode. E.g., you can paste in "ᴓ", but once you click "Save", it reverts to '?'.
Interesting... In the GUI, I can put this character into Exif by using ExifTool direct, but not via Workspace (in both cases UTF-8 is used). However, I can't store this (3-byte?) character into XMP, neither with ETdirect nor via Workspace.
I'll play later with this...
Correction:
GUI stores the "ᴓ" character correctly into all (Exif, XMP and IPTC) metadata sections if ExifTool direct is used. In Workspace, however, this doesn't work. I couldn't figure out yet what makes this difference in GUI. By the way, in Workspace mode ExifTool isn't called the same way as in ETdirect mode, but the parameters should be the same, which obviously isn't the case. I'll try to solve that "mystery" as soon as possible.
Bogdan
I'm also having the problem that exiftool.exe writes 3f instead of the data it's given (c2a9).
Please reply with solution!
Thanks!
fileName=foo.jpg
description=©
exiftool -overwrite_original -charset UTF8 -XMP-dc:Description="$description" $fileName
1 image files updated
exiv2 -gXmp.dc.description -PXv $fileName|xxd -g1
0000000: 6c 61 6e 67 3d 22 78 2d 64 65 66 61 75 6c 74 22 lang="x-default"
0000010: 20 3f 0a ?.
exiv2 -M"set Xmp.dc.description $description" $fileName
exiv2 -gXmp.dc.description -PXv $fileName|xxd -g1
0000000: 6c 61 6e 67 3d 22 78 2d 64 65 66 61 75 6c 74 22 lang="x-default"
0000010: 20 c2 a9 0a ...
exiftool -ver
8.97
exiv2 -V
exiv2 0.21.1
uname -a
CYGWIN_NT-6.1-WOW64 PC 1.7.15(0.260/5/3) 2012-05-09 10:25 i686 Cygwin
mintty -V
mintty svn-svn-r1275
Same result when running bash.exe through cmd.exe
bash -c ./foo.sh
...
bash --version
GNU bash, version 4.1.10(4)-release (i686-pc-cygwin)
ver
Microsoft Windows [Version 6.1.7601]
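For anyone reading the hex dumps above: c2 a9 is the UTF-8 encoding of ©, and 3f is simply an ASCII question mark. Here is a small Python sketch of how that kind of substitution can arise when text is converted to a character set that lacks the character. Using ASCII as the lossy target is only an assumption for illustration; it does not show where in the Cygwin/exiftool pipeline the conversion actually happens.

```python
# Illustration only: what the bytes in the xxd dumps correspond to.
symbol = "\u00a9"  # the copyright sign used in the test script

# Correct UTF-8 encoding, as written by exiv2 in the dump above.
utf8_bytes = symbol.encode("utf-8")
print(utf8_bytes.hex())  # c2a9

# Lossy conversion to a charset without the character (ASCII is an
# assumed stand-in here): the encoder substitutes a question mark.
lossy_bytes = symbol.encode("ascii", errors="replace")
print(lossy_bytes.hex())  # 3f
```

The same check with ŋ (U+014B) gives c5 8b in UTF-8, so a dump of the written tag shows quickly whether the character survived or was replaced before it reached the file.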
Happily, I chanced upon this blog (http://www.christian-etter.de/?p=33). It lists several ways to get Unicode into exiftool. The one I'm using now is where you add "-E" to the command line, then encode any non-ASCII characters as HTML character entities. For example,
Instead of
exiftool -E -copyright="Copyright ŋoŋ"
I use
exiftool -E -copyright="Copyright &#331;o&#331;"
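In case it helps anyone calling exiftool from code, here is a rough Python sketch of the entity-encoding step. The function names (to_html_entities, build_copyright_args) are made up for this example, and it assumes that -E accepts decimal numeric character references such as &#331; for ŋ (U+014B). A literal ampersand is escaped as well, since with -E it would otherwise be read as the start of an entity.

```python
from typing import List


def to_html_entities(text: str) -> str:
    """Replace '&' and every non-ASCII character with an HTML entity."""
    out = []
    for ch in text:
        if ch == "&":
            out.append("&amp;")          # keep literal ampersands intact
        elif ord(ch) < 128:
            out.append(ch)               # plain ASCII passes through
        else:
            out.append(f"&#{ord(ch)};")  # decimal numeric reference
    return "".join(out)


def build_copyright_args(image_path: str, notice: str) -> List[str]:
    """Hypothetical helper: build an ASCII-only argument list for exiftool."""
    return ["exiftool", "-E", f"-copyright={to_html_entities(notice)}", image_path]


print(to_html_entities("Copyright \u014bo\u014b"))  # Copyright &#331;o&#331;
```

The resulting list contains only ASCII, so it can be handed to something like subprocess.run without depending on the console code page.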
Also, in case it helps someone... if you have non-ascii characters in your filenames, in many cases you can instead pass the 8.3 version of the filename, which strips these characters. The Windows GetShortPathName() function generates this for you. So:
exiftool -copyright "C:\Users\John\AppData\Local\Temp\ффPalasoMetadataTest\teффst.png"
becomes
exiftool -copyright "C:\Users\John\AppData\Local\Temp\ALASO~1\TEST~1.PNG"
When writing to a file, this method leads to an error. I fixed this by adding "-overwrite_original_in_place".
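For those not working in C#, the short-path lookup can be sketched in Python through ctypes. This is only a sketch: short_path is a name invented here, the call works only on Windows, the file has to exist already, and 8.3 name generation can be disabled on a volume, in which case no short name is available. On other platforms the function just returns the path unchanged.

```python
import ctypes
import sys


def short_path(long_path: str) -> str:
    """Return the 8.3 form of an existing path via GetShortPathNameW."""
    if sys.platform != "win32":
        return long_path  # no 8.3 names outside Windows

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    get_short = kernel32.GetShortPathNameW
    get_short.argtypes = [ctypes.c_wchar_p, ctypes.c_wchar_p, ctypes.c_uint]
    get_short.restype = ctypes.c_uint

    # First call with no buffer asks for the required size (including NUL).
    needed = get_short(long_path, None, 0)
    if needed == 0:
        raise ctypes.WinError(ctypes.get_last_error())

    buf = ctypes.create_unicode_buffer(needed)
    if get_short(long_path, buf, needed) == 0:
        raise ctypes.WinError(ctypes.get_last_error())
    return buf.value
```

The returned string can then be passed to exiftool in place of the original filename, together with -overwrite_original_in_place when writing, as described above.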
If anyone chances upon this and is working in C#, you're welcome to our open-source code for working with metadata using exiftool. See http://projects.palaso.org/projects/palaso . The relevant code is under the PalasoUIWindowsForms project, "ClearShare" namespace.
John Hatton
SIL International