Does embedding metadata corrupt image files?

Started by emccainaz, January 07, 2013, 02:54:14 PM

emccainaz

Hi All,

I'm working at the Center for Creative Photography now and have been advocating the use of embedded metadata in image files, especially those that are being shared with outside vendors or other outside users. The question has come up as to whether or not the embedding of metadata can corrupt image files. Specifically, someone seems to remember that long text strings or just a certain large amount of text in an IPTC field will cause image corruption. Can someone clarify this issue for me? Is there particular data collected about this somewhere so I can document this and share it with my work group at the CCP?

Thanks,

Edward McCain
Center for Creative Photography

Phil Harvey

Hi Edward,

Are you talking about JPEG images?  If so, the metadata is well separated from the image so (at least using ExifTool) there should be no risk of image corruption.
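To illustrate that separation, here is a minimal editorial sketch in Python. It is not ExifTool's code, and the names walk_segments, image_digest, and METADATA_MARKERS are made up for this example. It assumes a well-formed JPEG with no padding bytes between segments, treats APP1 (Exif/XMP), APP13 (Photoshop/IPTC), and COM segments as metadata, and treats everything from the first SOS marker onward as image data. Each segment carries its own 16-bit length, so a segment holds at most 65,533 bytes of payload; the sketch hashes only the non-metadata segments, so the digest should be identical before and after metadata is embedded.

```python
import hashlib
import struct

# Assumption for this sketch: these marker codes are treated as "metadata".
# 0xE1 = APP1 (Exif/XMP), 0xED = APP13 (Photoshop/IPTC), 0xFE = COM (comment)
METADATA_MARKERS = {0xE1, 0xED, 0xFE}


def walk_segments(data):
    """Yield (marker, payload) for each segment up to and including the scan data."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xDA:
            # SOS: treat everything from here to the end of the file as image data.
            yield marker, data[pos:]
            return
        # The 2-byte big-endian length counts itself but not the marker bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        yield marker, data[pos + 4:pos + 2 + length]
        pos += 2 + length


def image_digest(data):
    """SHA-256 over every segment that is not in METADATA_MARKERS."""
    h = hashlib.sha256()
    for marker, payload in walk_segments(data):
        if marker not in METADATA_MARKERS:
            h.update(bytes([marker]))
            h.update(payload)
    return h.hexdigest()
```

To use it, read the file with open(path, "rb").read() before and after embedding and compare the two image_digest values. Note that the digest deliberately still covers segments such as APP2 (ICC profile), since those can change how the image is displayed.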

With some TIFF-format images, if you write certain tags you could potentially affect the way the image is displayed (e.g. ICC_Profile and some EXIF tags); however, the image data should never be corrupted.  IPTC and XMP aren't among the tags that would affect the image display.

If ExifTool ever corrupts an image, please tell me and I'll fix it immediately.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

emccainaz

Quote from: Phil Harvey on January 07, 2013, 02:58:28 PM
Are you talking about JPEG images?  If so, the metadata is well separated from the image so (at least using ExifTool) there should be no risk of image corruption.

With some TIFF-format images, if you write certain tags you could potentially affect the way the image is displayed (e.g. ICC_Profile and some EXIF tags); however, the image data should never be corrupted.  IPTC and XMP aren't among the tags that would affect the image display.

The CCP is using the TIFF file format for the primary archival source of each image. Right now these are all high resolution scans from the original fine art print or sometimes from a negative. The derivative files are either TIFF or JPEG formats.

I think some of the concern came from one of the staff who had heard that JPEG2000 was more "archival" than JPEG because - according to one source - the possibility of corruption from embedding metadata in the JPEG format was supposedly greater than possible metadata corruption of JPEG2000. I am not exactly sure where this belief originally came from, but I want to follow due diligence so that our collection of hundreds of thousands of image files is not in jeopardy of corruption from my efforts to embed IPTC metadata.

Thanks,

Edward

Phil Harvey

Hi Edward,

Quote from: emccainaz on January 07, 2013, 03:43:45 PM
one of the staff who had heard that JPEG2000 was more "archival" than JPEG because - according to one source - the possibility of corruption from embedding metadata in the JPEG format was supposedly greater than possible metadata corruption of JPEG2000.

I can't lend any credibility to this claim.

However, whatever you decide, I would suggest that you use the newer IPTC Core (XMP) tags rather than the old IPTC IIM format, simply because XMP is extensible and likely to be better supported in the future.

- Phil

emccainaz

Thanks, Phil. That confirms my understanding of the current state of embedded metadata, but I didn't want to take any unnecessary chances with the CCP collections.

Edward