Fragility of metadata

Started by mikmach, February 15, 2018, 09:18:27 AM


mikmach

A cautionary tale about how fragile the metadata chunk in photos is when files are bounced between various applications.

When processing photos into the archive I use exiftool to fix and insert metadata, and ImageMagick and Photoshop for editing. For later management and retrieval I use FotoStation.

I recently noticed that Unicode-encoded characters are mangled in FotoStation. While investigating I found that, of my whole suite, only exiftool had been updated; the rest of the programs were still at their old versions.

There was also something strange in the workflow. Only files that were last processed by ImageMagick were affected. If they were 'touched' later by Photoshop, everything was OK; PS was somehow fixing them. The last piece: the encoding was broken only in FotoStation. All other programs show the metadata as they should (from Adobe products to system properties dialogs). So the problem lies somewhere between ImageMagick mangling the metadata and FotoStation displaying it (even though only exiftool was updated).

This is not a bug report, just a cautionary tale (and a cry of frustration) about the fragility of metadata handling and interpretation.

Phil Harvey

What tags are having Unicode character problems?  Is it EXIF:UserComment?  If so, then byte ordering may be the problem.  Try running this command on a problem file:

exiftool -validate -warning -a FILE

If you get this warning message:

Warning   : Wrong byte order for EXIF UserComment UNICODE text

then you can fix it with the following command:

exiftool -tagsfromfile @ -usercomment FILE

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).
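For readers wondering why byte order matters for UserComment: the value starts with an 8-byte character-code header (`UNICODE\0` for Unicode text) followed by 16-bit code units, so a reader that assumes the wrong byte order sees entirely different characters. Below is a minimal Python sketch of that effect only. It is not what exiftool does internally, and the sample string is made up for illustration.

```python
# Sketch: why a wrong byte order mangles EXIF UserComment UNICODE text.
# The sample text is an assumption chosen for illustration.

PREFIX = b"UNICODE\x00"  # 8-byte character-code header of a Unicode UserComment

text = "Zażółć"

# Writer stores the text as big-endian 16-bit code units.
payload = PREFIX + text.encode("utf-16-be")

# Reader strips the header, then has to guess or know the byte order.
body = payload[len(PREFIX):]
right = body.decode("utf-16-be")  # same order as the writer
wrong = body.decode("utf-16-le")  # opposite order: every code unit is byte-swapped

print(repr(right))
print(repr(wrong))
```

The wrongly decoded string has the same number of characters as the original, but each one is a different code point, which is the kind of damage the warning above is about.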

mikmach

No, the problem is with Source and Copyright. When I run validate it shows a different warning (for reference, the (p) file is OK).


[/j/validate]$ exiftool -validate -warning -a 157953*
======== 157953(1).jpg
Validate                        : 1 Warning
Warning                         : Bad IPTC data tag (marker 0x0)
======== 157953(2).jpg
Validate                        : 1 Warning
Warning                         : Bad IPTC data tag (marker 0x0)
======== 157953(p).jpg
Validate                        : OK
    3 image files read


The master tif files show the same warning, but their metadata is displayed correctly; only jpegs produced with ImageMagick are affected.

So the workflow looks like this:

acquisition (tif) -> exiftool -> imagemagick -> tif OK display but validate returns warning
acquisition (tif) -> exiftool -> imagemagick -> tif -> imagemagick -> jpg BAD display and validate returns warning
acquisition (tif) -> exiftool -> imagemagick -> tif -> imagemagick -> jpg -> photoshop -> imagemagick -> jpg OK display and validate returns OK
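When checking many files at each step of a workflow like this, the `-validate -warning -a` output quoted above can be summarised per file. Here is a rough Python sketch that parses the multi-file output format shown in this post. The function name `parse_validate_output` is hypothetical, and the sketch assumes every file block starts with a `======== name` line, which is not the case when exiftool is run on a single file.

```python
# Sketch: group exiftool "-validate -warning -a" warnings by file.
# Assumes the multi-file output layout quoted in this thread.

SAMPLE = """\
======== 157953(1).jpg
Validate                        : 1 Warning
Warning                         : Bad IPTC data tag (marker 0x0)
======== 157953(2).jpg
Validate                        : 1 Warning
Warning                         : Bad IPTC data tag (marker 0x0)
======== 157953(p).jpg
Validate                        : OK
    3 image files read
"""


def parse_validate_output(output):
    """Return {filename: [warning, ...]} for each file block in the output."""
    results = {}
    current = None
    for line in output.splitlines():
        if line.startswith("======== "):
            # A header line opens a new per-file block.
            current = line[len("======== "):].strip()
            results[current] = []
            continue
        name, sep, value = line.partition(":")
        if sep and current is not None and name.strip() == "Warning":
            results[current].append(value.strip())
    return results


warnings_by_file = parse_validate_output(SAMPLE)
for filename, warnings in warnings_by_file.items():
    print(filename, "->", warnings if warnings else "OK")
```

Running this over the output of each workflow step would make it easier to see at which step the warning first appears or disappears.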

Phil Harvey

I assume you mean IPTC:Source and EXIF:Copyright.  The former should be OK if you set CodedCharacterSet properly.  The latter is technically supposed to be straight ASCII, but most software should also accept UTF-8.  See FAQ 10 for more details.

But it would be interesting to know what is causing FotoStation to display things incorrectly.

- Phil
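To illustrate why CodedCharacterSet matters: IPTC marks UTF-8 text with the escape sequence ESC % G in CodedCharacterSet, and a reader that misses or ignores that marker typically falls back to a single-byte encoding. The Python sketch below shows the resulting mangling. The function `decode_iptc_string` and the Latin-1 fallback are assumptions for illustration; real applications differ in which fallback they use, and nothing here is known to be FotoStation's actual behaviour.

```python
# Sketch: the same IPTC bytes decoded with and without the UTF-8 marker.
# decode_iptc_string and the Latin-1 fallback are illustrative assumptions.

UTF8_MARKER = b"\x1b%G"  # ESC % G, the CodedCharacterSet value meaning UTF-8


def decode_iptc_string(raw, coded_character_set=None):
    """Decode raw IPTC string bytes according to CodedCharacterSet."""
    if coded_character_set == UTF8_MARKER:
        return raw.decode("utf-8")
    # Assumed fallback when the marker is missing or not honoured.
    return raw.decode("latin-1")


raw_source = "Zażółć".encode("utf-8")  # made-up sample value

good = decode_iptc_string(raw_source, UTF8_MARKER)
bad = decode_iptc_string(raw_source)  # marker lost or ignored

print(repr(good))
print(repr(bad))
```

Each non-ASCII character turns into two unrelated characters in the fallback case, so the stored bytes can be perfectly valid while one particular reader still displays them mangled.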

mikmach

All images have CodedCharacterSet set properly, and this had been working for at least two years.

Interesting fact: it is not FotoStation itself that is buggy. When displaying photos from disk it shows them properly; the metadata is broken only when the IndexManager metadata harvesting service is used.