Mac OS X: -overwrite_original_in_place -preserve fail to reliably preserve dates

Started by houdah, March 13, 2014, 07:08:16 AM

Previous topic - Next topic

houdah

Hi Phil,

Combining -overwrite_original_in_place -preserve should preserve modification dates. On Mac OS X this sometimes works and sometimes doesn't. I can see no pattern. The same file, the same call to exiftool can work as expected once and fail the next time.

I have done a lot of testing on this. exiftool seems to work exactly like it should. setFileTime is called with the correct values each time. It manages to set the times. When exiftool exists all looks good when checked with Finder or ls -l.

However 10 to 20 seconds later some files see their modification date updated to the current time.

Using -overwrite_original_in -preserve does preserve dates, but loses other metadata like the Mavericks Finder tags.

I am calling exiftool from my application (HoudahGeo). I tried restoring the dates there after the call to exiftool. Made no difference. Some 10 seconds later the dates might get updated. If I remove the call to exiftool, the files remain unchanged and dates don't get updated.

I figure exiftool is somehow triggering an OS bug or odd behavior.
I wonder if there could be a delay in the file being closed after exiftool finished writing. exiftool does call close(ORIG_FILE).
Or maybe the OS is updating the file in an attempt to restore metadata like Mavericks Finder tags.
I have disabled Spotlight and Time Machine on the directory. Made no difference.

Do you have any idea what could be the problem or where else to look?
I noticed the workaround for a Mac OS X 10.6 bug. Is this the same problem?

I have been testing with exiftool 9.53 on Mac OS X 10.9.2.
The command line call is of this form:

exiftool -m -a -f -overwrite_original_in_place -P -j=IMG_1399.json IMG_1399.CR2

Best,
Pierre Bernard
Houdah Software s.à r.l.

Phil Harvey

Hi Pierre,

I think this is the same problem notice in 10.6 that I tried to patch by writing the date/time twice.  It seems that if I close the file then set the file modify date immediately afterwards, the new time occasionally won't take.  It must be some sort of race condition in the way the O/S closes a file, but don't have any more ideas about how it could be fixed.  If it was reproducible then I could submit a bug report to Apple.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

houdah

Hi Phil,

I wonder of Mac OS X somehow delays the closing of the file. Dates are correct after exiftool exists. Having exited, exiftool should not have any open files anymore. Yet some 10-20 seconds later some files see their dates change.

Unfortunately this is not fully reproducible. When I tag some 300 files, I see it happen for about half of them.

I wonder if there is a way to check if the file is really closed or to force the system to close it for good. I'll try copying the file. That may force the OS to purge queues.

Pierre

Phil Harvey

Hi Pierre,

I certainly do everything I can to close the file at my end.  If I remember correctly I think I even tried adding a flush before closing, but without success.  If you manage to find a pattern with the affected files, or any correlation with other system settings, please let me know.  I didn't turn up any similar problems with a Google search either, although I found it difficult to search for this exact problem.

It is interesting that you say 10-20 seconds later, implying that sometimes it takes more than 10 seconds.  I hadn't noticed that there could be such a large time lag.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

houdah

Hi Phil,

I have opened a bug report with Apple. They have requested additional information from fs_usage. I have collected and submitted that, but am not sure that is going to help much.

So I tried to build a sample script to recreate the issue. Unfortunately I was not yet able to trigger the bug with my script.
If you wish to have a go, it is at https://dl.dropboxusercontent.com/u/2381634/RADAR/TestModificationDates.zip

The ZIP contains 3 files.
- A sample image file
- A perl script which copies the sample image into another file and the restores the modification date on that file
- A bash script which creates a directory in ~/Downloads, creates files there and passes those to the perl script

After processing a file, the bash script verifies that the file has the expected modification date. Then it re-verifies modification dates of all files in the test directory. I hope to catch a file whose modification dates changes after the fact.

The bash script tries the above procedure for 1000 files. It should exit on first error. Has not happened here. I tried to model the file open, copy, close, date setting procedure after exiftool. Maybe I missed something here.

Pierre

houdah

From the data I sent to Apple:

"I ran a test where I processed a folder with 384 files. After processing a file I reset its modification dates to their original values. Yet after completing the test, 116 of the files showed new modification dates. One of the files is IMG_1369.CR2. The perl script was done with the file at 10:12:55. Yet stat reports a modification time of 10:13:51."

10:12:55.054587    WrData[A]       D=0x52600908  B=0xfe000  /dev/disk0s2  /Users/pierre/Downloads/HoudahGeo TEST/IMG_1369.CR2                                                                                                 0.031924 W perl5.16.2997221

...

10:13:51.097537    WrData[AT3]     D=0x52607108  B=0x2000   /dev/disk0s2  /Users/pierre/Downloads/HoudahGeo TEST/IMG_1369.CR2                                                                                                 0.000089 W launchd.250

...

10:13:51.117658    WrData[AT3]     D=0x52630d70  B=0x12000  /dev/disk0s2  /Users/pierre/Downloads/HoudahGeo TEST/IMG_1369.CR2                                                                                                 0.000222 W launchd.250

Phil Harvey

Hi Pierre,

Thanks for your work on this problem.

I spent a few hours this morning trying some tests of my own.  I can't yet trigger the bug with my test script (attached), but I did notice that using exiftool the bug tends to happen with the same files each time, and that most of these files have a resource fork.  If I delete the resource fork with -rsrc:all= when using exiftool, then I am no longer able to trigger the bug.

I was able to reproduce these results consistently iterating back and forth between the two commands 3 times in a row.  There were 8000 files in my test set, usually about 6 of them would trigger the bug, but only when I didn't use -rsrc:all=.  (Note that I disabled the PerserveTime patch in exiftool for the purposes of this test.)

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

Phil Harvey

Hi again Pierre,

My previous tests were not using the -overwrite_original_in_place option.

When overwriting in place, I am able to trigger the bug with the attached script.  This script fails more often than my previous exiftool tests, and on files without a resource fork.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

houdah

Hi Phil,

So far I have not been able to trigger the bug using my scripts or either of yours.
The following does trigger the problem:

exiftool -overwrite_original_in_place -P -gpslatitude=1 *.*


Will dig further.

Pierre

Phil Harvey

Interesting.  The last script I posted (bug_test2) triggers the problem with a fairly high rate for me (maybe 100 files or so out of 8000) on OS X 10.8.5.

Thanks again for your work on this.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

houdah

Hi Phil

Thank you for all your work!

I was able to trigger the problem using bug_test2. It seems I need to keep the file system otherwise busy to trigger the bug. E.g. by running several instances of the script over several copies of the test directory concurrently. On a Mac with a hard disk drive I was also able to trigger the bug by extracting a large zip file while the script was running.

I have updated my bug report with Apple.

Pierre