Hi,
In my PHP script I use this command:
$meta_data = '/usr/bin/exiftool -exif:all -iptc:all -a -g -j -struct -c "%s" -fast -charset iptc=UTF-8 '.$image_path;
and then JSON-decode the output for processing.
However, I am having trouble reading UTF-8 characters properly. For some images this works perfectly, while for others it does not.
I have noticed that there is also a difference between reading images from a PC and from a Mac.
Is there any way I can detect which source an image comes from and adjust the script accordingly? Or can I convert the IPTC data of both PC and Mac images to UTF-8 and then extract it?
Your help is much appreciated.
Thanks,
Sourav
Assuming IPTC is stored as UTF-8 is an assumption that will often be wrong. I would think the ExifTool default would give more reliable results (it assumes Latin1 unless UTF-8 is specified by the IPTC CodedCharacterSet). In the past, you might have been able to tell Windows and Mac images apart by looking at the byte order (Windows always uses little-endian), but Macs now use Intel CPUs, so this difference may be disappearing. You could also look at the Software tag to see whether it gives any hint about the platform.
- Phil
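A minimal PHP sketch of the suggestion above, for illustration only. It drops the `-charset iptc=UTF-8` option from the original command so that ExifTool falls back to its default behaviour, then pulls CodedCharacterSet and Software out of the decoded JSON as platform hints. The function names `buildExiftoolCommand` and `platformHints` are hypothetical, and the "IPTC" and "EXIF" keys assume the grouped JSON layout produced by `-g -j`; the sample values in the usage comment are illustrative rather than guaranteed output.

```php
<?php
// Hypothetical helper: the original command minus "-charset iptc=UTF-8",
// so ExifTool applies its default IPTC handling. The path is shell-escaped.
function buildExiftoolCommand(string $imagePath): string
{
    return '/usr/bin/exiftool -exif:all -iptc:all -a -g -j -struct -c "%s" -fast '
        . escapeshellarg($imagePath);
}

// Hypothetical helper: extracts the two tags mentioned in the reply.
// Assumes the -g -j output is a JSON array whose first element groups
// tags under "IPTC" and "EXIF"; missing tags come back as null.
function platformHints(string $json): array
{
    $decoded = json_decode($json, true);
    $first = (is_array($decoded) && isset($decoded[0]) && is_array($decoded[0]))
        ? $decoded[0]
        : [];

    return [
        'coded_character_set' => $first['IPTC']['CodedCharacterSet'] ?? null,
        'software'            => $first['EXIF']['Software'] ?? null,
    ];
}

// Usage (requires exiftool to be installed at the path above):
//   $json  = (string) shell_exec(buildExiftoolCommand($image_path));
//   $hints = platformHints($json);
```

The hints are only that: a missing CodedCharacterSet or an unfamiliar Software string says nothing definite about the encoding.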
Hi Phil,
Thanks for your answer. However, is there any way in ExifTool to convert all IPTC data in an image to UTF-8 internally before extraction, regardless of the actual character encoding and regardless of whether the image came from a PC or a Mac?
Thanks,
Sourav
If you have a heuristic that you can apply to decide what encoding to use, then one option would be to assume UTF-8 for IPTC as you were doing, which effectively disables conversion, then do the conversion yourself.
But there is no 100% reliable method to determine the encoding of IPTC. This is one reason why this information type lost favour.
- Phil
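One possible shape for the "do the conversion yourself" step, as a sketch under stated assumptions. The heuristic here is simply "if the bytes are already valid UTF-8, keep them, otherwise treat them as a fallback single-byte encoding". The function name `toUtf8` and the `Windows-1252` fallback are assumptions chosen for illustration, not something the thread prescribes, and the code needs PHP's mbstring extension. It also assumes you already hold the raw, unconverted bytes of the IPTC value.

```php
<?php
// Hypothetical helper: best-effort conversion of a raw IPTC string to UTF-8.
// Heuristic: valid UTF-8 is returned untouched; anything else is assumed to
// be in $fallback (Windows-1252 by default, which is only a guess).
function toUtf8(string $raw, string $fallback = 'Windows-1252'): string
{
    if (mb_check_encoding($raw, 'UTF-8')) {
        return $raw;
    }

    return mb_convert_encoding($raw, 'UTF-8', $fallback);
}

// Usage:
//   $caption = toUtf8($rawCaptionBytes);
```

As the reply says, no such check is fully reliable: a Latin1 string whose bytes happen to form valid UTF-8 would be left unconverted, and text from a Mac-specific encoding would need a different fallback.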
Hi Phil,
I have one more question. I am trying to change the 'ExifUnicodeByteOrder' setting with this command:
exec("/usr/bin/exiftool -ExifUnicodeByteOrder='MM' ".$imagePath);
and then extract the image data, so that all images are in a common format before extraction. But that isn't changing the byte order.
Could you please tell me if I have missed something?
Many thanks,
Sourav
The ExifUnicodeByteOrder option specifies the byte order when writing, not when reading. When reading, ExifTool always uses a heuristic to determine the actual byte order used.
I don't understand how you expected to influence the byte order of the images. If ExifTool reads the Unicode in the wrong byte order, all you would get is garbage.
- Phil
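To see why reading Unicode in the wrong byte order yields garbage, here is a small standalone PHP illustration. It does not involve ExifTool at all; it only encodes a string as UTF-16 little-endian and then decodes those same bytes with each byte order. The function name `decodeUtf16` is hypothetical, and the mbstring extension is assumed.

```php
<?php
// Hypothetical helper: decode UTF-16 bytes to UTF-8 using the given byte
// order, 'LE' (little-endian) or 'BE' (big-endian).
function decodeUtf16(string $bytes, string $order): string
{
    return mb_convert_encoding($bytes, 'UTF-8', 'UTF-16' . $order);
}

// "Photo" stored as UTF-16 little-endian: each letter is followed by a zero byte.
$stored = mb_convert_encoding('Photo', 'UTF-16LE', 'UTF-8');

$right = decodeUtf16($stored, 'LE'); // round-trips back to "Photo"
$wrong = decodeUtf16($stored, 'BE'); // byte pairs swapped: unrelated characters
```

With the wrong order every two-byte unit is read back to front, so each Latin letter turns into an unrelated code point.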