[Originally posted by charliekk on 2008-03-12 12:34:02-07]Hi,
I'm trying to set the EXIF UserComment. The software I use the image files with (e.g. JAlbum under Linux, MS Windows Explorer, ...) seems to expect it in UTF-16LE (little-endian). Unfortunately, the "exiftool" command-line application writes it as UTF-16BE (big-endian), and I can find no way to change this. A small script (see below) doesn't work either: the resulting image file still contains the comment as UTF-16BE. According to the documentation, the code line marked with the # sign should set little-endian? It says "Note that if EXIF information already exists, the existing order is maintained", so my first idea was that there might already be UTF-16BE data in the file. But I removed all EXIF first before running exiftool and I still only get BE.
So my question is: how can I change the byte order, both in the "exiftool" application and programmatically when using the Image::ExifTool classes?
BTW I ran my test script in the debugger and know which variable to change to get the desired result (breakpoint in the function "Charset2Unicode", set $fmt to "v" there and continue the program). But how can I get this via regular options?
Here's the small test script I used:
#!/usr/bin/perl
use strict;
use warnings;
use Image::ExifTool qw(:Public);
my ($imgFile, $commentStr) = @ARGV;
my $exifTool = Image::ExifTool->new;
$exifTool->Options(ByteOrder => 'II'); # This should set "Intel" byte-order (= little endian)
$exifTool->SetNewValue(UserComment => $commentStr);
$exifTool->WriteInfo($imgFile, "$imgFile.xxx");
[Originally posted by exiftool on 2008-03-12 12:51:19-07]Unfortunately, the EXIF specification is very vague about the byte ordering
of Unicode text fields (also very vague about the codepage, but that is
another problem). I have seen that some utilities write only little-endian
Unicode, which in my opinion is wrong in a big-endian file. Currently,
the API ByteOrder option (and the EXIFByteOrder tag, which can be used
from the command line) sets only the byte ordering of the EXIF when creating
a new EXIF segment. But since you are writing to existing EXIF, the EXIF byte
order is always used. This is the first complaint I have received about an
application that has had problems with this.
The proper place to change this in the ExifTool code is in the EncodeExifText()
subroutine of lib/Image/ExifTool/WriteExif.pl. If you change the call to
Charset2Unicode() as follows, then EXIF Unicode text will always be written
little-endian:
return "UNICODE\0" . $exifTool->Charset2Unicode($val, 'II');
I know this isn't a good solution for you because you will have to make this
modification each time you update ExifTool. I have done some research in the
past to see what the accepted practice is, but I will look into this again.
- Phil
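To make the difference concrete, here is a minimal sketch of what the two byte orders look like for a UserComment value. It is not ExifTool's code: user_comment_bytes is a hypothetical helper that uses the core Encode module, and it only assumes the layout shown in the patch above (the "UNICODE\0" prefix followed by UTF-16 text).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);

# "B\x{e4}r" is "Bär"; the escape avoids depending on the source file encoding.
my $text = "B\x{e4}r";

# Hypothetical helper (not ExifTool code): build a UserComment value from the
# 8-byte "UNICODE\0" prefix followed by UTF-16 text in the requested order.
sub user_comment_bytes {
    my ($str, $order) = @_;                       # $order is 'II' or 'MM'
    my $enc = $order eq 'II' ? 'UTF-16LE' : 'UTF-16BE';
    return "UNICODE\0" . encode($enc, $str);
}

for my $order (qw(II MM)) {
    my $bytes = user_comment_bytes($text, $order);
    # Show only the text bytes that follow the 8-byte prefix.
    printf "%s: %s\n", $order, join ' ', unpack('(H2)*', substr($bytes, 8));
}
# II: 42 00 e4 00 72 00
# MM: 00 42 00 e4 00 72
```

These are the byte patterns to look for in a hex editor when checking which order a given file actually contains.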
[Originally posted by charliekk on 2008-03-12 13:05:08-07]
Ok, it seems to work with this patch, thank you VERY much, I can live with this "quick'n dirty" solution.
I also found no better information on which encoding to use in "well-formed" EXIF, and I don't know what is commonly used either. At least I think that for a standard to be useful, the picture files have to be readable on every kind of hardware, regardless of its CPU using BE or LE. But that would be in an ideal world...
[Originally posted by exiftool on 2008-03-12 14:49:35-07]I found some posts of other people with this problem:
post 1,
post 2.
In the first post, the person has the opposite complaint to you. He complains
that applications don't read the little-endian UserComment written by Windows.
A Microsoft employee answers by saying "this is how Photoshop does it".
Then in the second post, someone complains that Photoshop changes the
byte order of the UserComment.
So people can't even seem to agree on the current behaviour of Photoshop,
let alone agree on a standard byte ordering strategy for EXIF Unicode.
To me, the only reasonable strategy is to store the text in the same byte order
as the rest of the file. This should be the default unless stated otherwise
by the EXIF specification, which it was not.
So unless someone can convince me otherwise, I don't think default behaviour
of exiftool should be changed. Perhaps, though, it may be reasonable to add
an option to allow the behaviour to be specified.
- Phil
[Originally posted by charliekk on 2008-03-12 18:51:52-07]
I agree, the default behaviour absolutely makes sense. A command-line switch for "exiftool" and an API property to change it would be very helpful though.
BTW, maybe a stupid question: What does it mean at all to say "same byte order as the rest of the file"? Is there a "byte order notion" for JPEG at all? Is a JPEG generated on a Motorola not readable on an Intel or Vax machine?
[Originally posted by exiftool on 2008-03-12 19:44:51-07]
If you were talking about TIFF, then "same byte order as the rest of
the file" is meaningful. For JPEG, I would rephrase it to be
"same byte order as the rest of the EXIF information". The
JPEG format is actually big endian, but it contains EXIF information
which may be big or little endian.
Any computer can read any file generated in any byte order.
The only drawback is that the bytes will have to be swapped
in software if the byte order is not native to the machine.
- Phil
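As a small illustration of "swapped in software", the sketch below converts 16-bit units from one byte order to the other with Perl's pack templates: 'n' is a big-endian 16-bit value and 'v' a little-endian one (the same "v" format mentioned in the debugger experiment earlier in this thread). swap16 is a hypothetical helper name, not part of ExifTool.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: reverse the byte order of every 16-bit unit by reading
# the data as big-endian shorts ('n') and writing little-endian shorts ('v').
# A trailing odd byte, if any, is dropped by unpack.
sub swap16 {
    my ($bytes) = @_;
    return pack('v*', unpack('n*', $bytes));
}

my $be = "\x00\x42\x00\xe4\x00\x72";    # "Bär" as UTF-16BE
my $le = swap16($be);
print unpack('H*', $le), "\n";          # prints 4200e4007200 (UTF-16LE)
```

Because the operation only exchanges the two bytes of each unit, applying it twice returns the original data, so the same routine converts in either direction.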
[Originally posted by exiftool on 2008-03-13 14:11:10-07]After looking into this in more detail, I am a bit embarrassed to say
that exiftool has not been behaving as designed, and has been writing
EXIF Unicode text in big-endian byte order regardless of the EXIF byte
ordering. I have fixed this so the EXIF byte order is used by default as
it should have been, and have added the ability to force a specific
byte order by setting the value of the new ExifUnicodeByteOrder tag.
This feature will appear in exiftool 7.22 when released, and a
pre-release is available here for testing.
- Phil
[Originally posted by charliekk on 2008-03-14 13:56:24-07]First of all I thank you so much for digging into this so quickly!!
I installed 7.22p and tested:
exiftool -ExifUnicodeByteOrder="II" -Exif:UserComment="Bär" 100_0365.jpg
Using khexedit I can confirm that "Bär" (BTW can I embed an
ä in a <code> section?) is now in UTF-16LE as intended. I'm only surprised about the following:
[~/pics]$ exiftool 100_0365.jpg | grep -i byte
Exif Byte Order : Big-endian (Motorola, MM)
exiftool thinks the LE comment it just created is BE?
For the second change you did - use LE as default if the initial EXIF IS already LE - I could not test directly. Seems all my JPEGs (I have no TIFF) are BE??? So what I did is delete all EXIF from a file using
exiftool -all= 100_0365.jpg
and then add a comment - now without the new command line switch:
exiftool -Exif:UserComment="Bär" 100_0365.jpg
The resulting comment is now BE, as indicated by khexedit. I conclude from this that "BE" is a kind of "master default", meaning that if there is no initial EXIF at all, BE is used? This is OK for me.
[Originally posted by exiftool on 2008-03-14 14:24:51-07]You are getting confused between the EXIF byte order, and
the byte order of the Unicode comments inside the EXIF (which
can be different). ExifTool writes EXIF in BE by default, but this
can be set with the ExifByteOrder tag. ExifTool does not report the
byte order of the Unicode text. The ExifByteOrder tag reports
the overall byte order for the EXIF.
So with version 7.22, if you do this:
exiftool -all= 100_0365.jpg
exiftool -Exif:UserComment="Bär" -exifbyteorder=Little 100_0365.jpg
Then you will get a little-endian exif byte order and a little-endian
Unicode comment.
If you want a big-endian comment in little-endian EXIF, you would
have to do this:
exiftool -all= 100_0365.jpg
exiftool -Exif:UserComment="Bär" -exifbyteorder=Little -exifunicodebyteorder=Big 100_0365.jpg
I hope this clears things up.
For anyone else reading this: In general, I would suggest not setting
either byteOrder tag, and letting exiftool use its defaults unless you
have a specific reason for doing otherwise.
- Phil
(I don't know how to get special characters in the <code> block either)
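For the programmatic side of the original question, the following is a rough sketch of how the same thing might look through the Perl API. It assumes ExifTool 7.22 or later (where ExifUnicodeByteOrder was added) and that the tag accepts 'II' or 'MM' through SetNewValue() as it does on the command line. write_user_comment is a hypothetical helper name, and the module is loaded lazily only so the sub can be defined without it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: write UserComment to a copy of $src, optionally forcing
# the byte order of the Unicode text ('II' = little-endian, 'MM' = big-endian).
sub write_user_comment {
    my ($src, $dst, $comment, $unicode_order) = @_;
    require Image::ExifTool;
    my $et = Image::ExifTool->new;
    $et->SetNewValue('EXIF:UserComment' => $comment);
    # Assumption: the write-only ExifUnicodeByteOrder tag is set like any other tag.
    $et->SetNewValue(ExifUnicodeByteOrder => $unicode_order)
        if defined $unicode_order;
    my $ok = $et->WriteInfo($src, $dst);    # 0 means the write failed
    warn $et->GetValue('Error'), "\n" unless $ok;
    return $ok;
}

# Usage: script.pl in.jpg out.jpg "comment" [II|MM]
write_user_comment(@ARGV) if @ARGV >= 3;
```

Leaving the fourth argument off keeps ExifTool's default, which follows the advice above not to set either byte order tag without a specific reason.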
[Originally posted by charliekk on 2008-03-14 14:27:40-07]
Ah, after thinking about it, I believe I understand the logic now. The new command-line switch ExifUnicodeByteOrder is intended to "fool" the tool into believing that the EXIF segment was in a byte order different from what it might otherwise conclude from an existing "byte order marker" or magic reasoning from existing tags, NOT to introduce a new EXIF tag into it. So I now understand the behaviour described above ("...I'm only surprised about the following:...").
[Originally posted by exiftool on 2008-03-14 14:32:02-07]Correct. Except there is no magic, just the EXIF byte order mark.

- Phil