Some files may be bloated with over 50 MB of errant XMP-photoshop:DocumentAncestors data.
When processing, ExifTool reports "Warning: [Minor] Extracted only 1000 photoshop:DocumentAncestors items. Ignore minor errors to extract all".
I was wondering if there is a way to report on files that contain this tag, and perhaps to selectively list/sort/filter or otherwise weed out files that have "excessive" amounts of this data. So rather than simply removing this entry from all files, I would only remove it from files that have more than 100 items, or more than 1 MB of data, or I would output a CSV file listing the size of the tags, etc.
Hi Stephen,
You could try this to delete DocumentAncestors if it has more than 100 items:
exiftool -if "$documentancestors and (()=$documentancestors =~ /, /g) > 100" -documentancestors= DIR
Here I have used a little Perl trick to count the number of items in the string.
- Phil
Edit: This command actually works for > 101 items, because it is counting the number of separators, and the number of items will be one greater than the number of separators.
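To make the separator-versus-item arithmetic concrete, here is a minimal shell sketch that counts ", " separators in a string outside of ExifTool. The function names `count_separators` and `count_items` and the sample values are hypothetical, and it assumes that no single item contains ", " itself.

```shell
# Count the ", " separators in a list string (hypothetical helper).
# gsub() replaces each match with itself and returns how many it found.
count_separators() {
    printf '%s\n' "$1" | awk '{ n += gsub(/, /, "&") } END { print n + 0 }'
}

# The number of items is one more than the number of separators,
# except for an empty string, which has no items at all.
count_items() {
    if [ -z "$1" ]; then
        echo 0
    else
        echo $(( $(count_separators "$1") + 1 ))
    fi
}

count_separators 'id1, id2, id3'   # prints 2
count_items 'id1, id2, id3'        # prints 3
```

So a test of "separators > 100" is first satisfied at 101 separators, which is 102 items.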
Fantastic, thanks Phil!
There is no point "throwing the baby out with the bathwater", so rather than indiscriminately deleting this data I thought that it would be wise to do it for "excessive" items... Now all I need to do is figure out what "excessive" means!
Quote from: Phil Harvey on April 20, 2017, 07:47:50 AM
You could try this to delete DocumentAncestors if it has more than 100 items:
exiftool -if "$documentancestors and (()=$documentancestors =~ /, /g) > 100" -documentancestors= DIR
Shouldn't, then, the following command return a list of what those images are?
exiftool -filename -r -if "$documentancestors and (()=$documentancestors =~ /, /g) > 100" DIR
exiftool is not returning any:
1 directories scanned
67 files failed condition
0 image files read
even though I have confirmed there is a PNG file in the DIR folder matching that condition by extracting the XMP metadata with
exiftool -b -XMP image.png >out.xmp
which returns:
Warning: [Minor] Extracted only 1000 photoshop:DocumentAncestors items. Ignore minor errors to extract all - image.png
You need to use single quotes on Mac/Linux.
- Phil
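The difference between the two quoting styles can be seen without running ExifTool at all. This is only a sketch: it sets a shell variable named `documentancestors` to an empty string to mimic the usual case where the shell has no such variable, then shows what each style would hand to the `-if` option.

```shell
# Mimic a shell in which no variable called documentancestors has a value.
documentancestors=""

# Double quotes: the shell substitutes $documentancestors before exiftool runs.
double_quoted="$documentancestors and (()=$documentancestors =~ /, /g) > 100"

# Single quotes: the text is passed through untouched, so exiftool sees the tag name.
single_quoted='$documentancestors and (()=$documentancestors =~ /, /g) > 100'

printf 'double: [%s]\n' "$double_quoted"
printf 'single: [%s]\n' "$single_quoted"
# double: [ and (()= =~ /, /g) > 100]
# single: [$documentancestors and (()=$documentancestors =~ /, /g) > 100]
```

With double quotes on Mac/Linux, the condition that reaches ExifTool no longer mentions the tag, which is consistent with every file failing the condition.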
Ouch! Thanks!
Now, how come extracting the XMP metadata reports "Extracted only 1000 photoshop:DocumentAncestors items [...]", but checking whether there are more than 999 items returns false?
$ exiftool -filename -r -if '$documentancestors and (()=$documentancestors =~ /, /g) > 999' image.png
1 files failed condition
Note that checking for "only" more than 100 does return true.
You're counting the number of ", " in the string, which is one less than the number of items. So "> 998" should be true.
- Phil
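One way to avoid the off-by-one is to generate the condition from the item count you actually mean. The helper below is a hypothetical sketch (the name `build_condition` is not part of ExifTool): "more than N items" is the same as "more than N-1 separators", so it simply subtracts one before writing the number into the expression used earlier in this thread.

```shell
# Print an -if condition that is true when a file has MORE THAN $1 items.
# More than N items means more than N-1 ", " separators.
build_condition() {
    min_items=$1
    # Single quotes keep $documentancestors literal; only %d is filled in.
    printf '$documentancestors and (()=$documentancestors =~ /, /g) > %d\n' \
        "$(( min_items - 1 ))"
}

build_condition 100
# $documentancestors and (()=$documentancestors =~ /, /g) > 99
build_condition 999
# $documentancestors and (()=$documentancestors =~ /, /g) > 998
```

It could then be used as `exiftool -filename -r -if "$(build_condition 100)" DIR`. The double quotes are safe there because the shell does not re-expand the output of a command substitution. Going by the warning text, only 1000 items are extracted unless minor errors are ignored, so without that the count seen by the condition would presumably never exceed 999 separators.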