Image data corruption on MacOS

Started by MichaelKnight, March 15, 2023, 02:45:12 PM


MichaelKnight

Just wanted to report that I just had the same corruption problem happen to me recently, on Mac OS. It happened when shifting the dates of just over 1000 images/videos at once. So far, I've found 3 of the photos to have been corrupted in their image data.

I wanted to post in this thread because it's very recent and what happened to my photos might be connected. Not sure how to post the files here, but I have the original photo, the modified corrupted photo, and the modified good photo if anyone would like to compare them. (Running the same exiftool command on the same, singular photo later resulted in no corruption.)

Running version 12.50 from homebrew

Phil Harvey

Hi Michael,

If you could upload the files somewhere (google drive or some file sharing service) and send me a link I'll take a look.  My email is philharvey66 at gmail.com

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).


Phil Harvey

#3
I got the files, thanks.

Your file is corrupted in a very different way (different from the original thread that you posted to before I split this topic):  There is a block of 832 bytes in the middle of the JPEG image data that is basically all corrupted.  The bad block ends exactly on a 256-byte boundary, which I don't think is a coincidence, and I think points to a disk problem because any problem in the computer RAM wouldn't align with a boundary like this in a JPEG file (the start of the JPEG image data isn't even byte aligned in this file).
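For anyone wanting to repeat this kind of check on their own files, here is a minimal Python sketch of locating a corrupted block and testing its alignment. It assumes the good and corrupted copies are the same length with the image data at the same offsets (as with two outputs of the same command). The function name `corrupted_blocks`, the `merge_gap` parameter, and the offsets in the synthetic demo are hypothetical; only the 832-byte size and the 256-byte boundary come from the post above.

```python
def corrupted_blocks(good, bad, merge_gap=16):
    """Return (start, end) byte ranges (end exclusive) where two files differ.

    Differences separated by at most `merge_gap` matching bytes are merged,
    because a few bytes inside a bad block can match the original by chance.
    """
    blocks = []
    for i in range(min(len(good), len(bad))):
        if good[i] == bad[i]:
            continue
        if blocks and i - blocks[-1][1] <= merge_gap:
            blocks[-1][1] = i + 1          # extend the current block
        else:
            blocks.append([i, i + 1])      # start a new block
    return [(start, end) for start, end in blocks]


# Synthetic demo (offsets are made up): an 832-byte bad block ending at 1024.
good = bytes(2048)
bad = bytearray(good)
bad[192:1024] = b"\xff" * 832

for start, end in corrupted_blocks(good, bad):
    # prints: 192 1024 832 True
    print(start, end, end - start, end % 256 == 0)
```

With real files, `good` and `bad` would be the contents read with `open(path, "rb").read()`.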
 
I looked at the corrupted data and it did look similar to JPEG-compressed data in general appearance.  I tried scanning for some of the bad byte sequences in the good file to see if perhaps they were transposed from another sector in the same file, but they weren't.
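The scan for transposed data can be approximated like this. It is only a sketch of the idea, not the method actually used above: it cuts the bad block into fixed-size chunks and looks for each chunk anywhere in the good file. The function name, the 16-byte chunk size, and the random test data are all assumptions made for illustration.

```python
import random


def find_transposed_chunks(good, bad_block, chunk_size=16):
    """Return (offset_in_bad_block, offset_in_good_file) for every chunk of
    the bad block that also occurs somewhere in the good file."""
    hits = []
    for off in range(0, len(bad_block) - chunk_size + 1, chunk_size):
        pos = good.find(bad_block[off:off + chunk_size])
        if pos != -1:
            hits.append((off, pos))
    return hits


# Synthetic demo: random bytes stand in for JPEG-compressed data.
rng = random.Random(0)
good_data = bytes(rng.getrandbits(8) for _ in range(4096))

moved = good_data[1024:1056]   # a block copied from elsewhere in the file
unrelated = bytes(32)          # a block that does not occur in the file

print(find_transposed_chunks(good_data, moved))
print(find_transposed_chunks(good_data, unrelated))
```

An empty result, as in the second call, matches what was observed here: the bad bytes did not come from elsewhere in the same file.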

My conclusion here is that this problem has symptoms very different from the other problem in this thread, yet again I can't see how this could be caused by software.  :(

In either case, the new feature provides a way to check for errors like this:

1. Add an argument like "-xmp:identifier<imagedatamd5" when modifying the file with ExifTool.  (Here I'm repurposing XMP:Identifier to use it to store the image data MD5 digest.)

2. After modifying all files, you can identify any ones with corrupted image data using a command like this:

exiftool -p "This file is bad: $filename" -if "$xmp:identifier ne $imagedatamd5" DIR

(use single quotes instead of double if you are on Mac or Linux)

This command will print the names of any bad files, while good files will give no output because they will fail the -if condition.
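If you would rather drive the two steps from a script, here is a rough Python sketch. The exiftool arguments are the ones given above; the helper names and the `-AllDates+=1` edit in the usage note are just assumed examples. Because the arguments are passed as a list with no shell involved, the dollar signs need no quoting at all.

```python
import subprocess

BAD_PREFIX = "This file is bad: "


def store_digest_args(edit_args, directory):
    """Step 1: the caller's own edits plus a copy of the image data MD5
    into XMP:Identifier."""
    return ["exiftool", *edit_args, "-xmp:identifier<imagedatamd5", directory]


def check_digest_args(directory):
    """Step 2: print a line for each file whose stored digest no longer
    matches the current image data."""
    return ["exiftool", "-p", BAD_PREFIX + "$filename",
            "-if", "$xmp:identifier ne $imagedatamd5", directory]


def find_bad_files(directory):
    """Run step 2 and return the names of the files reported as bad.

    Requires exiftool on the PATH. The exit status is deliberately not
    checked, since no output is the expected result for a clean batch.
    """
    result = subprocess.run(check_digest_args(directory),
                            capture_output=True, text=True)
    return [line[len(BAD_PREFIX):] for line in result.stdout.splitlines()
            if line.startswith(BAD_PREFIX)]
```

For example, `subprocess.run(store_digest_args(["-AllDates+=1"], "DIR"))` would apply a date shift and store the digest in one pass, and `find_bad_files("DIR")` would then list any files that fail the comparison.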

- Phil

NOTE: ImageDataMD5 values will change for some JPEG images in ExifTool version 12.59 and later (these versions will also add RST segments to the MD5)


MichaelKnight

#4
Thanks Phil for looking into it. I ran some tests. If I run exiftool on files on my Mac's internal SSD, I don't seem to get the corruption. But if I run exiftool on files on an external SSD, I do get the corruption (about 10 out of every 2000 images edited become corrupted). But the corruption only happens when using exiftool.

Never mind, it does happen on the internal SSD as well. Just seemingly less often than on the external SSD (~3 corrupted out of every 2000 changes instead of 10). I doubt my internal and external SSDs are both failing, so maybe it's a bug with the file operations that exiftool uses?

This is really bad, I've edited tens of thousands of files with exiftool in the last few weeks. I can recover the originals for the JPEGs by using something like Bad Peggy to identify the corrupted JPEGs, but the videos... the videos are being changed too. But I can't tell which of them are corrupted because they don't go bad like JPEGs immediately do when something is changed. So I'd have to start over with everything...
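To put rough numbers on the scale, the observed rates can be extrapolated as below. The 20,000-file total is purely a hypothetical stand-in for "tens of thousands", and the calculation assumes the rate stays constant across batches, which may well not hold.

```python
def expected_corrupted(total_edited, corrupted_seen, sample_size):
    """Scale an observed corruption count up to a larger batch."""
    return total_edited * corrupted_seen / sample_size


# Observed rates from the tests above, as percentages.
external_rate = 100 * 10 / 2000   # 0.5 % on the external SSD
internal_rate = 100 * 3 / 2000    # 0.15 % on the internal SSD

# Hypothetical batch size standing in for "tens of thousands".
total = 20000
print(expected_corrupted(total, 10, 2000))  # external rate: 100.0
print(expected_corrupted(total, 3, 2000))   # internal rate: 30.0
```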

Phil Harvey

#5
This is a problem that I take VERY seriously.  If other people experienced this problem I would need to disable the write capability of ExifTool completely unless a solution was found.  I am running MacOS myself and use ExifTool extensively, often on large batches of images, and have never experienced a problem like this.  But I'll run some dedicated tests using the new ImageDataMD5 feature to stress this as much as possible and see if I can reproduce the issue.

- Phil

Phil Harvey

I only have about 20 GB of free disk space on my system here, so I ran a test where I edited almost 7000 Nikon NEF and JPG files with a total size of 17 GB.  The files were read from a USB thumb drive and written to the local SSD on my 2014 MacBook Pro running MacOS 10.14.6.

I ran the test 2 times.  Zero image corruption issues (as verified by comparing ImageDataMD5 with the originals).

Then I wrote a script to edit a bunch of tags in the same 44 MB Nikon D810 NEF file 1000 times, checking the ImageDataMD5 and erasing the edited file after each write.  I only needed 44 MB of free space for this test, but it effectively wrote 44 GB worth of files.  This time the source file was on the SSD (vs. the thumb drive in the first test).

I ran this test 2 times as well.  Zero image corruption issues.
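A repeated-write test along these lines could be sketched in Python as follows. This is not the actual script, just an approximation under stated assumptions: the working copy is made with `shutil.copyfile`, the edit writes a single `Artist` tag as a stand-in for "a bunch of tags", and the digest is read back with `-s3 -ImageDataMD5`. The `edit` and `digest` callables can be swapped out, so the loop itself does not depend on exiftool being installed.

```python
import os
import shutil
import subprocess
import tempfile


def exiftool_edit(path):
    """Assumed example edit: write one tag in place (needs exiftool)."""
    subprocess.run(["exiftool", "-overwrite_original",
                    "-Artist=stress test", path],
                   check=True, capture_output=True)


def exiftool_image_md5(path):
    """Read the ImageDataMD5 value as a bare string (needs exiftool)."""
    out = subprocess.run(["exiftool", "-s3", "-ImageDataMD5", path],
                         check=True, capture_output=True, text=True)
    return out.stdout.strip()


def stress_test(source, iterations,
                edit=exiftool_edit, digest=exiftool_image_md5):
    """Edit a fresh copy of `source` repeatedly and return the indices of
    the iterations whose image data digest differs from the original."""
    expected = digest(source)
    failures = []
    with tempfile.TemporaryDirectory() as tmp:
        work = os.path.join(tmp, os.path.basename(source))
        for i in range(iterations):
            shutil.copyfile(source, work)   # fresh copy for each write
            edit(work)
            if digest(work) != expected:
                failures.append(i)
            os.remove(work)                 # keep disk usage to one file
    return failures
```

Calling `stress_test("some.nef", 1000)` (the file name is a placeholder) returns an empty list when no write corrupted the image data.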

To summarize:  I used ExifTool to write a number of tags to almost 16000 Nikon NEF and JPEG files totalling 122 GB on my MacOS system with zero image data corruptions.
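For reference, the totals follow from the two tests like this (taking "almost 7000" as 7000, so the file count is an upper bound, and using decimal units where 1 GB = 1000 MB):

```python
# Batch test: almost 7000 files, 17 GB, run twice.
batch_files, batch_mb, batch_runs = 7000, 17_000, 2

# Repeated-write test: one 44 MB file written 1000 times, run twice.
repeat_writes, repeat_file_mb, repeat_runs = 1000, 44, 2

total_files = batch_runs * batch_files + repeat_runs * repeat_writes
total_gb = (batch_runs * batch_mb
            + repeat_runs * repeat_writes * repeat_file_mb) // 1000

print(total_files, total_gb)  # prints: 16000 122
```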

- Phil

MichaelKnight

Thanks Phil, I think this is actually a bug in recent versions of macOS itself. Finder seems to randomly corrupt files when they're copied to/from an external drive. I started using command line "cp" to copy files instead of Finder, and the corruption stopped. I think Bad Peggy is inconsistent in how it finds corruptions as well, so it looked like an exiftool problem when it likely isn't.

More details on this Mac bug.
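Whatever tool does the copying, one way to catch this kind of silent corruption is to compare checksums of the source and the copy afterwards. The sketch below does that in Python rather than with "cp"; the function names are made up for illustration. One caveat: reading the copy back immediately may be served from the operating system's cache rather than from the drive, so a mismatch proves corruption but a match is not an absolute guarantee.

```python
import hashlib
import shutil


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large videos do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_verified(src, dst):
    """Copy src to dst and raise OSError if the checksums differ."""
    shutil.copyfile(src, dst)
    if file_sha256(src) != file_sha256(dst):
        raise OSError(f"copy of {src} to {dst} does not match the source")
    return dst
```

Since this hashes the whole file, it works for videos as well as JPEGs, which helps where a JPEG-only checker like Bad Peggy cannot.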

Phil Harvey

A problem like this could easily be caused by using cheap USB cables.

- Phil