ExifTool Forum

ExifTool => The "exiftool" Application => Topic started by: Stephen Marsh on April 20, 2017, 05:48:44 AM

Title: Size of Tag?
Post by: Stephen Marsh on April 20, 2017, 05:48:44 AM
Some files may be bloated with over 50 MB of errant XMP-photoshop:DocumentAncestors data.

When processing, ExifTool reports "Warning: [Minor] Extracted only 1000 photoshop:DocumentAncestors items. Ignore minor errors to extract all".

I was wondering if there is a way to report on files that contain this tag, and perhaps to selectively list, sort, filter or otherwise weed out files that have "excessive" amounts of this data. So rather than simply removing this entry from all files, only remove it from files that have more than 100 items or more than 1 MB of data, or output a CSV file listing the size of the tag in each file, etc.
Title: Re: Size of Tag?
Post by: Phil Harvey on April 20, 2017, 07:47:50 AM
Hi Stephen,

You could try this to delete DocumentAncestors if it has more than 100 items:

exiftool -if "$documentancestors and (()=$documentancestors =~ /, /g) > 100" -documentancestors= DIR

Here I have used a little Perl trick to count the number of items in the string.

- Phil

Edit: This command actually matches only files with more than 101 items, because it counts the number of separators, and the number of items is one greater.
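
The counting idea can be sketched outside ExifTool. What follows is only an illustration in shell, not part of ExifTool: count_separators is a hypothetical helper and the sample value is made up. It counts the ", " separators, as the -if condition does, then adds one to get the number of items.

```shell
# Hypothetical helper (not an ExifTool feature): print how many times
# the two-character separator ", " occurs in the given string.
count_separators() {
    printf '%s\n' "$1" | awk -F', ' '{ print (NF > 0 ? NF - 1 : 0) }'
}

# Made-up sample value with three items, joined by ", ".
value='id-1, id-2, id-3'

separators=$(count_separators "$value")
items=$((separators + 1))

echo "separators=$separators items=$items"
```

By the same arithmetic, a condition of "> 100" separators matches files with 102 or more items, and "> 99" would match files with more than 100 items. This assumes no individual item contains ", " itself.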
Title: Re: Size of Tag?
Post by: Stephen Marsh on April 20, 2017, 04:28:58 PM
Fantastic, thanks Phil!

There is no point "throwing the baby out with the bathwater", so rather than indiscriminately deleting this data I thought that it would be wise to do it for "excessive" items... Now all I need to do is figure out what "excessive" means!
Title: Re: Size of Tag?
Post by: elmimmo on October 16, 2017, 09:06:28 AM
Quote from: Phil Harvey on April 20, 2017, 07:47:50 AM
You could try this to delete DocumentAncestors if it has more than 100 items:

exiftool -if "$documentancestors and (()=$documentancestors =~ /, /g) > 100" -documentancestors= DIR

Shouldn't, then, the following command return a list of what those images are?

exiftool -filename -r -if "$documentancestors and (()=$documentancestors =~ /, /g) > 100" DIR

exiftool is not returning any:

    1 directories scanned
   67 files failed condition
    0 image files read


even though I have confirmed there is a PNG file in the DIR folder matching that condition by extracting the XMP metadata with

exiftool -b -XMP image.png >out.xmp

which returns:

Warning: [Minor] Extracted only 1000 photoshop:DocumentAncestors items. Ignore minor errors to extract all - image.png
Title: Re: Size of Tag?
Post by: Phil Harvey on October 16, 2017, 09:21:40 AM
You need to use single quotes on Mac/Linux.

- Phil
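
The reason can be seen without ExifTool at all. In this rough sketch, show_arg is a hypothetical stand-in that just prints the argument it receives, and the shell variable is set to empty here (an unset variable would expand the same way in a default shell). Inside double quotes the shell replaces $documentancestors before the command ever sees it, which would explain why every file failed the condition.

```shell
# Hypothetical stand-in for exiftool: print the first argument exactly as received.
show_arg() { printf '%s\n' "$1"; }

# Empty here; an unset shell variable expands to nothing in the same way.
documentancestors=''

# Double quotes: the shell expands $documentancestors (to nothing) first.
double=$(show_arg "$documentancestors and (()=$documentancestors =~ /, /g) > 100")

# Single quotes: the text is passed through untouched.
single=$(show_arg '$documentancestors and (()=$documentancestors =~ /, /g) > 100')

printf 'double quotes: [%s]\n' "$double"
printf 'single quotes: [%s]\n' "$single"
```

With double quotes the tag name is gone from the expression by the time it arrives; with single quotes it arrives intact.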
Title: Re: Size of Tag?
Post by: elmimmo on October 17, 2017, 08:03:27 AM
Ouch! Thanks!

Now, why does extracting the XMP metadata report "Extracted only 1000 photoshop:DocumentAncestors items [...]", while checking for more than 999 items returns false?

$ exiftool -filename -r -if '$documentancestors and (()=$documentancestors =~ /, /g) > 999' image.png
    1 files failed condition


Note that checking for "only" more than 100 does return true.
Title: Re: Size of Tag?
Post by: Phil Harvey on October 17, 2017, 08:09:18 AM
You're counting the number of ", " in the string, which is one less than the number of items.  So "> 998" should be true.

- Phil
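
A quick way to check this arithmetic, assuming the value seen by the -if condition holds exactly the 1000 extracted items (the item names below are made up for illustration):

```shell
# Build a made-up list of exactly 1000 items joined by ", ".
value=$(awk 'BEGIN {
    for (i = 1; i <= 1000; i++) printf "%sitem%d", (i > 1 ? ", " : ""), i
    print ""
}')

# Count the ", " separators by splitting on them: number of fields minus one.
separators=$(printf '%s\n' "$value" | awk -F', ' '{ print NF - 1 }')

echo "separators: $separators"
if [ "$separators" -gt 999 ]; then echo "> 999 is true"; else echo "> 999 is false"; fi
if [ "$separators" -gt 998 ]; then echo "> 998 is true"; else echo "> 998 is false"; fi
```

1000 items give 999 separators, so the "> 999" test fails while "> 998" passes.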