Exif corruption when copying tags from JPG to TIFF

Started by mariposa, March 18, 2012, 11:32:46 AM


mariposa

Hello all,

I am in the process of creating backup copies of my important images for long-term storage. To that end, I am converting my existing JPGs to uncompressed TIFFs with ImageMagick's "convert old.JPG new.TIFF".

Unfortunately, in the process a lot of the existing Exif tags are stripped out, like maker notes, aperture, flash, etc. Only a few remain in the new TIFF file.

But when I use
exiftool -tagsfromfile old.JPG new.TIFF
in order to save my tags to the TIFF file for future generations, some appropriate tags (like maker notes) are copied into the new TIFF image, while others (like aperture, shutter speed, flash) are not copied over.

Furthermore, I end up with the following
Warning: ExifIFD pointer references previous IPTC directory

I'm unsure where it all goes wrong. Can Exif information like shutter speed not be copied into a TIFF image? And why should the tags that exiftool copies correctly leave the Exif information corrupted?

Thanks for all pointers.

Phil Harvey

There is something funny happening here.  Would it be possible to email me the images in question? My email is philharvey66 at gmail.com

It should be possible to do what you want, but it looks like there is something odd with one or other of the images.

Also, what version of ExifTool are you using?

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

Phil Harvey

Thanks for the samples.  This is very interesting.  I haven't seen this before...

There are 2 rather unique things about your images which are causing this problem:

1) There is an IPTC directory inside the Photoshop information of your original TIFF image (as written by ImageMagick).  Technically, this is wrong, because the IPTC should be stored separately in IFD0 of a TIFF image.

2) The length of the IPTC in both the original TIFF image and the JPG image is zero.  This is odd and unusual, but technically allowed.

The problem occurs because ExifTool validates the addresses of the various types of metadata in the image to avoid recursively processing the same metadata, and raises a warning if two addresses are the same.

In this case, and quite by chance, the IPTC and the ExifIFD in the resulting TIFF have the same starting address, so ExifTool raises a warning.  But since the length of the IPTC directory is zero, the two types of metadata don't actually overlap.
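The difference between an address-only check and a length-aware check can be sketched in a few lines of Python. This is only an illustration of the idea described above, not ExifTool's actual code (ExifTool is written in Perl); the function names and the offsets are made up for the example.

```python
def same_start(start_a: int, start_b: int) -> bool:
    """Address-only test: flags two directories that begin at the same offset."""
    return start_a == start_b


def ranges_overlap(start_a: int, len_a: int, start_b: int, len_b: int) -> bool:
    """Length-aware test: True only if the two byte ranges share at least one byte."""
    if len_a == 0 or len_b == 0:
        return False  # an empty directory occupies no bytes, so it cannot overlap
    return start_a < start_b + len_b and start_b < start_a + len_a


# Hypothetical offsets: an empty IPTC block and an ExifIFD starting at the same address.
iptc_start, iptc_len = 4096, 0
exif_start, exif_len = 4096, 210

print(same_start(iptc_start, exif_start))                          # True
print(ranges_overlap(iptc_start, iptc_len, exif_start, exif_len))  # False
```

With the address-only test the two directories look like a conflict; once the zero length is taken into account they do not.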

Crazy.

I never would have been able to figure this out without the samples you sent.  Thanks for these.

I will add a length test to exiftool to avoid the warning in cases like this.  This update will appear in ExifTool 8.85.

However, in your case the problem is solved by deleting the empty IPTC directory in your TIFF image (either before, during or after you copy the tags from the JPG image).  You could just delete the IPTC, but I recommend deleting the containing Photoshop information as well (since it contains nothing but the IPTC anyway):

exiftool -photoshop:all= new.TIFF

This will fix the problem that is generating the ExifTool warning, and also improve the consistency of the metadata in this image by deleting the out-of-place (and empty) IPTC information.
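For anyone scripting this over many files, here is a minimal Python sketch that wraps the two commands from this thread. The helper names (delete_photoshop_args, copy_tags_args, fix_and_copy) are hypothetical, and the order (delete first, then copy) is just one of the options mentioned above. It assumes exiftool is installed and on the PATH.

```python
import shutil
import subprocess


def delete_photoshop_args(tiff_path):
    """Command that deletes the Photoshop block (and the empty IPTC inside it)."""
    return ["exiftool", "-photoshop:all=", tiff_path]


def copy_tags_args(jpg_path, tiff_path):
    """Command that copies the tags from the JPG into the TIFF."""
    return ["exiftool", "-tagsfromfile", jpg_path, tiff_path]


def fix_and_copy(jpg_path, tiff_path):
    """Run both commands in sequence; stops at the first one that fails."""
    if shutil.which("exiftool") is None:
        raise RuntimeError("exiftool was not found on the PATH")
    for args in (delete_photoshop_args(tiff_path), copy_tags_args(jpg_path, tiff_path)):
        subprocess.run(args, check=True)
```

Calling fix_and_copy("old.JPG", "new.TIFF") would run the same two commands shown in this thread, one after the other.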

- Phil

mariposa

Works!

Thank you very much. (Also glad I could help a little...  :) )

pb

I am curious why you are archiving jpegs as uncompressed tiffs?

--peter

mariposa

Sorry Peter, I thought I'd have more time to respond to your question, but it just didn't work out, so here is a short reply:

The reason for going for an uncompressed file format for long-term storage is that compressed images are susceptible to loss, once the compression information is corrupted. I've had that happen to me a couple of times. In those cases, almost all of the image information can still be recovered, however, the picture cannot be decoded any more because the crucial info about *how* the picture was encoded is not recoverable. Unfortunately, I can't express it in more technical terms, because I don't really understand the technology behind image compression. Maybe someone else can elaborate on that.
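A rough way to see the effect is to damage one byte in raw data and one byte in a compressed copy of the same data. The sketch below uses Python's zlib as a stand-in, which is an assumption for illustration only: JPEG uses a different scheme and may fail more gracefully than this. The helper names and the sample data are made up.

```python
import zlib


def flip_byte(data: bytes, index: int) -> bytes:
    """Return a copy of data with every bit of one byte inverted."""
    damaged = bytearray(data)
    damaged[index] ^= 0xFF
    return bytes(damaged)


def count_differences(a: bytes, b: bytes) -> int:
    """Number of positions where a and b differ, plus any length difference."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


def damage_after_flip(original: bytes, compressed: bool):
    """Wrong bytes after one damaged byte, or None if the data cannot be decoded."""
    if not compressed:
        return count_differences(original, flip_byte(original, len(original) // 2))
    packed = zlib.compress(original)
    try:
        restored = zlib.decompress(flip_byte(packed, len(packed) // 2))
    except zlib.error:
        return None  # the whole stream is rejected
    return count_differences(original, restored)


pixels = bytes(range(256)) * 64  # stand-in for uncompressed image data
print(damage_after_flip(pixels, compressed=False))  # 1
print(damage_after_flip(pixels, compressed=True))
```

In the raw case exactly one byte is wrong. In the compressed case the decoder typically refuses the stream altogether, which is the kind of total loss described above.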

This post on the blog "Scan Your Entire Life" addresses a very similar point under the sub-heading "What I really think of compression".
http://www.scanyourentirelife.com/2011/tiff-vs-png-file-format-hurt-save-scanned-photos/

However, what this guy describes as a theoretical possibility has actually happened to me a couple of times, as a result of which I have lost photos.

Hope this clears up matters somewhat.

pb

That's a good reason.  However, I think similar things can happen to tiff if some header information is corrupted that tells how the tiff is organized.  Also, depending on exactly what jpeg encoding you're using, the data loss of a bad bit or byte might not be very severe.  It's actually a tradeoff -- because a tiff file is a lot larger, there is a higher probability that it will suffer corruption, but that corruption might be less objectionable depending on where it occurs in the tiff file.  In a jpeg file, you might only lose one frequency coefficient in an 8x8 block, which might be no more noticeable than one pixel in a tiff. 

It's not clear to me which has the higher probability of really screwing up your image -- corruption of a jpeg header, or a tiff header.  That will also depend, btw, on what kind of tiff file it is.

If we consider only the data, not header, part of the file, then an argument can be made that corruption will affect either one equally.  The argument goes like this:  the amount of redundancy in the tiff is equal to the compression ratio of the jpeg (for a given perceptual quality).  Call this ratio C for compression.  That means that a bad byte in a jpeg is C times worse on average than a bad byte in an (uncompressed) tiff.

But meanwhile, the size of the tiff is also C times larger than the jpeg, which means that any corruption that occurs on the disk has a C times greater probability of corrupting that file.  So the average expected perceptual corruption of a jpeg relative to a tiff is C * (1/C).  In other words, the same.
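The arithmetic of that argument can be written out as a small Python sketch. All the numbers (C, the file size, the per-byte error rate, the damage unit) are invented for illustration, and the model is only the simple one described above: expected damage = chance of a bad byte per byte x file size x damage per bad byte.

```python
from fractions import Fraction


def expected_damage(file_size, damage_per_bad_byte, byte_error_rate):
    """Expected perceptual damage under the simple model from the post above."""
    return byte_error_rate * file_size * damage_per_bad_byte


C = 10                               # hypothetical compression ratio
jpeg_size = 2_000_000                # hypothetical jpeg size in bytes
error_rate = Fraction(1, 10**12)     # hypothetical chance that any given byte goes bad
unit_damage = Fraction(1)            # damage of one bad byte in the uncompressed tiff

# A bad jpeg byte is C times worse, but the tiff is C times larger.
jpeg = expected_damage(jpeg_size, C * unit_damage, error_rate)
tiff = expected_damage(C * jpeg_size, unit_damage, error_rate)

print(jpeg == tiff)  # True
```

The factor C appears once on each side, so it cancels, whatever value is chosen for it.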

I just made all that up without checking the literature, so if you look for real papers on it, you might find a different take.

On a slightly different subject, also, be careful to note that tiff format actually also includes multiple different compression options, so you should be sure you are not using any of them if you want to avoid compression.
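One way to check this is to read the Compression tag (tag 259) from the first IFD of the TIFF; a value of 1 means uncompressed, while values such as 5 (LZW), 8 (Deflate) or 32773 (PackBits) mean the data is compressed. The following is a minimal Python sketch, not a full TIFF parser: it only handles classic TIFF (not BigTIFF), only looks at the first IFD, and the function name is made up.

```python
import struct

COMPRESSION_TAG = 259  # TIFF "Compression" tag; a value of 1 means no compression


def read_tiff_compression(data: bytes):
    """Return the Compression value from the first IFD, or None if the tag is absent."""
    order = data[:2]
    if order == b"II":
        endian = "<"  # little-endian ("Intel") byte order
    elif order == b"MM":
        endian = ">"  # big-endian ("Motorola") byte order
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file")
    (entry_count,) = struct.unpack(endian + "H", data[ifd_offset:ifd_offset + 2])
    for i in range(entry_count):
        start = ifd_offset + 2 + 12 * i  # each IFD entry is 12 bytes long
        tag, field_type, count = struct.unpack(endian + "HHI", data[start:start + 8])
        if tag == COMPRESSION_TAG:
            # The tag is a SHORT, stored in the first two bytes of the value field.
            (value,) = struct.unpack(endian + "H", data[start + 8:start + 10])
            return value
    return None  # tag missing; the TIFF default in that case is 1 (uncompressed)
```

Usage would be something like read_tiff_compression(open("new.TIFF", "rb").read()). Note that this reads the whole file into memory, which is fine for a spot check but wasteful for large batches.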

And finally, the best insurance against corruption is multiple conscientious backups and periodic data verification, no matter what encoding you use.
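Periodic verification can be as simple as recording a checksum for every file and comparing again later. Here is a minimal Python sketch using SHA-256; the function names are made up, and in practice the manifest would be saved somewhere (and backed up) rather than kept in memory.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large TIFFs do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to folder) to its checksum."""
    root = Path(folder)
    return {
        str(path.relative_to(root)): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def find_changed(folder, manifest):
    """Names from the manifest whose file is now missing or has different content."""
    current = build_manifest(folder)
    return sorted(name for name, digest in manifest.items() if current.get(name) != digest)
```

Build the manifest once when the archive is created, then run find_changed against each backup copy from time to time; any name it returns should be restored from a copy that still matches.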

MOL

Quote from: pb on April 29, 2012, 03:28:20 PM
In other words, the same.

You forget that there is an additional step involved when saving the JPEG: the compression. In a non-compressed TIF that is (obviously) not the case, so in the end you will have to ask yourself whether an increased file size is riskier than compressing AND saving the file.

Uwe

pb

I don't believe mariposa was talking about a jpeg file that he is compressing, but rather a jpeg file that came that way from whatever the source is.  If you have a file that has not ever been compressed, then there is good reason not to compress it for archival purposes.  But he was talking about converting jpeg to tiff.

However, maybe you meant to make the opposite point.  Namely, if the eventual use of the file is going to be as a jpeg, then saving it as tiff will LATER require compressing it to jpeg, and that will generally result in quality degradation compared to the jpeg you started with, although depending on the quality of the first and second jpegs, it might or might not be noticeable.

MOL

Sorry, I misunderstood. I thought you were saying that saving an image as JPEG and saving an image as an uncompressed TIF would be equally risky.

Uwe