exiftool extracting JPG from RAW, simultaneously, causing corruption

Started by jchin, July 22, 2018, 08:56:37 PM


jchin

Help!  Is this a bug that I stumbled upon?

I have a batch file (Windows) that runs from the Command Prompt to extract JPG from my RAW files.
I have never had a problem until today.
Normally I run the batch file against one set of files.

Today, I had 4 Command Prompt windows open, each in their own folder of RAW files, about 2000-2500 images per folder.
I ran the batch file in each Command Prompt on their own set of RAW files.
There was random corruption where one JPG image was used as the extracted JPG for multiple (usually consecutive) output files.
The output filenames were different, but the rest was identical.
Sometimes the corrupted JPG image was from another directory.


The batch file does this:

1. extracts the JPG into a sub-folder called "extracted-JPG" using:
set EXTEN=-ext .CR2 -ext .CRW -ext .DNG -ext .NEF -ext .RAW -ext .TIF -ext .TIFF
exiftool.exe -progress -if $previewimage -b -previewimage -w%W% extracted-JPG/%%f_%%ue%THM%.jpg %EXTEN% .

2. removes some EXIF data and updates the timestamp of the extracted JPG:
exiftool.exe -progress -tagsfromfile @ -all "-EXIF:Orientation=" "-DateTimeOriginal>FileModifyDate" -srcfile extracted-JPG/%%f_%%ue%THM%.jpg -overwrite_original --ext JPG .


Does exiftool use a common "temp" folder?  Could that be the cause of the corruption?  Can we specify in the command-line where to place the "temp" files to avoid this?

My workaround for now is a batch file that walks through the folders one after another and runs the extract-JPG batch file in each, so I can start it and walk away for an hour or two while it is busy.
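For readers who want to script the same workaround, here is a minimal Python sketch of the idea: run one command per folder, strictly one at a time. The function name `run_in_each_folder`, the script name `run_batches.py`, and the batch file name `extract-jpg.bat` are hypothetical; the actual batch file from this post is not reproduced here.

```python
import subprocess
from pathlib import Path


def run_in_each_folder(command, folders):
    """Run `command` once per folder, one at a time; return {folder: exit code}."""
    results = {}
    for folder in folders:
        # subprocess.run blocks until the command exits, so runs never overlap.
        completed = subprocess.run(command, cwd=folder)
        results[str(folder)] = completed.returncode
    return results


if __name__ == "__main__":
    import sys

    # Hypothetical usage (batch file name and folders are placeholders):
    #   python run_batches.py C:\tools\extract-jpg.bat D:\shoot1 D:\shoot2
    if len(sys.argv) > 2:
        batch_file = str(Path(sys.argv[1]).resolve())
        command = ["cmd", "/c", batch_file]
        for folder, code in run_in_each_folder(command, sys.argv[2:]).items():
            print(folder, "exit code", code)
```

The batch file path is resolved to an absolute path because the working directory changes for each run.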

Phil Harvey

There is no temporary folder used by ExifTool.  The -w option (1st command) doesn't write a temporary file, and when writing (the 2nd command) a temporary file is created in the same directory as the original, with the same name but "_exiftool_tmp" added to the extension.  So there is no way that your commands should interfere.  There must be something else happening.
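As a quick check related to the temporary-file naming described above, a small Python sketch can list any files whose names still end in "_exiftool_tmp", which might point to a write that was interrupted. The function name `find_leftover_temp_files` is hypothetical, and the assumption that such a leftover indicates an interrupted write is mine, not something stated in this thread.

```python
from pathlib import Path


def find_leftover_temp_files(folder, suffix="_exiftool_tmp"):
    """Return files under `folder` (recursively) whose names end with `suffix`."""
    return sorted(p for p in Path(folder).rglob("*" + suffix) if p.is_file())


if __name__ == "__main__":
    import sys

    for folder in sys.argv[1:]:
        for path in find_leftover_temp_files(folder):
            print(path)
```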

If you re-run the command that generated the bad files without running the other 3 exiftools, do you see the same problem?  If not, can you reproduce the problem when running 4 exiftools?  Are the same files affected each time?  Are you running any A/V software which may be interfering with ExifTool? (Although this would usually cause read/write errors and not corruption.)

I realize this may be a pain to track down since you are working with so many files, but I need your help on this one.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

jchin

I will re-run those 4 batches in a test group with the same original RAW (Canon and Nikon) files tomorrow evening and see what happens.
I need a quick dump of the preview images for an editor to look over tomorrow.

I know this is a crazy one and it will take time to debug.
Thank you for listening to the problem.



jchin

Sorry.  Forgot to answer your questions.

Yes.  I ran the 4 jobs, one after the other, and so far so good.  No corrupted files that I can see.  Then again, we are talking about almost 7000 files, so I cannot be 100% sure there is no corruption but so far I don't see any obvious series of duplicates like I did before.
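Checking thousands of files by eye is error-prone, so a content hash can help. The following is a minimal Python sketch, not part of the original batch file, that groups files with byte-identical content across one or more folders. The function name `find_duplicate_files` is hypothetical. It is only meaningful on the output of step 1 (the raw preview extraction): after step 2 rewrites metadata, two files holding the same picture may no longer be byte-identical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicate_files(folders, pattern="*.jpg"):
    """Group files with byte-identical content; return only groups of 2 or more."""
    by_digest = defaultdict(list)
    for folder in folders:
        for path in sorted(Path(folder).rglob(pattern)):
            if path.is_file():
                # Hash the full file content; identical digests mean identical bytes.
                digest = hashlib.sha256(path.read_bytes()).hexdigest()
                by_digest[digest].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]


if __name__ == "__main__":
    import sys

    # Pass every output folder at once so cross-folder duplicates show up too.
    for group in find_duplicate_files(sys.argv[1:]):
        print("identical content:")
        for path in group:
            print("   ", path)
```

Passing all four output folders in one call matters here, because the duplicated image sometimes came from a different directory.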

And yes, there is Avast anti-virus running on the computer.  I also had a web browser open at the time.  I believe Lightroom was open too.  If I recall, the CPU (2nd gen Core-i7) and RAM were not totally taxed, but usage was high (CPU maybe above 60%, and about 10GB of 16GB RAM in use).  Running Windows 7 x64.

Phil Harvey

The other apps shouldn't be a problem, but AV software has a record of trying to read files as ExifTool is writing them, which can cause problems.

- Phil

jchin

I ran the simultaneous jobs today and again there is corruption.
Different images were affected this time, but again some images repeatedly replaced others.

I ran them simultaneously again, each Command Prompt isolated to its own two CPU cores. 
Same thing.  Corruption, different files.

Phil Harvey


jchin

I tried it with Avast excluding the folders.  Same thing happens.
I then tried again with Avast completely disabled until reboot, same thing happens.

The odd thing is that the corruption sometimes crosses batches and sometimes it does not.
The repeated images are not always replacing consecutive images either.

See attached screen grabs of the resulting output.
Each camera was its own batch.
As you can see, the corruption crosses the batches; meaning the resulting corrupt image may not be from its own batch.

Filenames prefix 5D2 is from Canon 5D2.
Filenames prefix DSC is from Nikon D4S.

jchin

PROBLEM RESOLVED !!!

It must have been some Windows Updates.
I rebooted and the system finished installing some updates.
Tested again, still a problem.

I rolled back the updates using System Restore, to July 8th (before the July 10th patch-Tuesday updates).
Rebooted again just to be sure.
Ran the 4 batches again, simultaneously, with Avast running.
This time they were all correct, no overwritten files.

I have no idea how the recent patch-Tuesday stuff affects things, but this was something a colleague at work (non-photographer, but IT geek just like me) suggested because he experienced odd issues after this recent batch of Microsoft patches also.

jchin

And just to be doubly sure, I ran the 4 batches again.
All is good now.

Thank you for entertaining my post.
Next time I will reboot my computer and try again, before freaking out.
Sorry for the bug scare.

Phil Harvey

Actually, I'm more scared now.  If this update affected ExifTool like this then there would be a good chance that it was randomly scrambling all of the files written to your disk!

Do you know exactly what update this was?

- Phil