New XMP Compact settings

Started by Phil Harvey, June 18, 2019, 11:49:11 AM

Phil Harvey

Here is a demonstration of the new settings for the API Compact option (available with ExifTool 11.52 and later) and their effect on the file size:

First, here are the commands I used to create the test files:

% exiftool t/images/Writer.jpg -tagsfromfile t/images/XMP.xmp -all:all -o tmp/0.jpg
    1 image files created
% exiftool t/images/Writer.jpg -tagsfromfile t/images/XMP.xmp -all:all -o tmp/1.jpg -api compact=1
    1 image files created
% exiftool t/images/Writer.jpg -tagsfromfile t/images/XMP.xmp -all:all -o tmp/2.jpg -api compact=2
    1 image files created
% exiftool t/images/Writer.jpg -tagsfromfile t/images/XMP.xmp -all:all -o tmp/3.jpg -api compact=3
    1 image files created
% exiftool t/images/Writer.jpg -tagsfromfile t/images/XMP.xmp -all:all -o tmp/4.jpg -api compact=4
    1 image files created
% exiftool t/images/Writer.jpg -tagsfromfile t/images/XMP.xmp -all:all -o tmp/5.jpg -api compact=5
    1 image files created


And here are the sizes of the files for each setting of the Compact option, from 0 to 5:

% exiftool tmp -p '$filesize# bytes for Compact=$basename'
7878 bytes for Compact=0
5454 bytes for Compact=1
5215 bytes for Compact=2
4960 bytes for Compact=3
4868 bytes for Compact=4
3791 bytes for Compact=5
    1 directories scanned
    6 image files read
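
To put the numbers above in perspective, here is a small Python sketch that computes the bytes and percentage saved at each Compact setting relative to Compact=0. The sizes are copied from the listing above; the function name `savings` is just an illustrative helper, not part of ExifTool.

```python
# File sizes in bytes reported above for each Compact setting (0 to 5).
sizes = {0: 7878, 1: 5454, 2: 5215, 3: 4960, 4: 4868, 5: 3791}

def savings(sizes, baseline=0):
    """Return {setting: (bytes_saved, percent_saved)} relative to the baseline."""
    base = sizes[baseline]
    return {
        level: (base - size, 100.0 * (base - size) / base)
        for level, size in sizes.items()
    }

for level, (saved, pct) in sorted(savings(sizes).items()):
    print(f"Compact={level}: {saved} bytes saved ({pct:.1f}%)")
```

For this particular test file, the two largest single steps are from 0 to 1 (2424 bytes) and from 4 to 5 (1077 bytes); other files will give different numbers.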


- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

StarGeek

And because I'll forget how to find it later

API Compact listing
"It didn't work" isn't helpful. What was the exact command used and the output.
Read FAQ #3 and use that cmd
Please use the Code button for exiftool output

Please include your OS/Exiftool version/filetype

Phil Harvey

Good idea, thanks.  I've edited my post to add these links there too.

- Phil

herb

Hello,

thanks for the new version of Exiftool and the enhancement of option compact=...

But please allow the following question:
Compact and XMPShorthand define the formatting of the XMP block.
To me, "OneRdfDescription" is similar to "XMPShorthand", which does more than just define the text formatting.
Should "OneRdfDescription" also be independent of "NoPadding", "NoIndentation" and "NoNewlines", as XMPShorthand is?

Best regards
Herb

Phil Harvey

Hi Herb,

Sure.  How about if I expand XMPShorthand so a value of 2 also compacts to one rdf:Description?

This is starting to get a bit complicated, but I wanted to avoid creating a bunch of new options for each of these settings (I think they will be rarely used anyway).  I also thought about making this a bitmask to allow one to pick and choose which features they wanted, but I thought that probably nobody would ever take advantage of this.

- Phil


Full Disclosure:

The single-rdf:Description option is actually a bit of a problem for ExifTool's extended-XMP logic.  If the XMP for a JPEG file doesn't fit in a single segment, ExifTool splits it into separate rdf:Description chunks and puts the largest of these into the extended XMP segment(s).  If you force a single rdf:Description, then it is possible that all the XMP will go into the extended segment.  This could be a problem if you use older software that doesn't recognize extended XMP.

Also note that if you are writing a JPEG image, certain XMP properties are forced into a separate rdf:Description in anticipation of placing them in the extended segment.  If you are writing any of these properties to a JPEG file you will get multiple rdf:Descriptions regardless of the Compact setting.  Currently, the properties that will be placed in a separate rdf:Description are:

  XMP-photoshop:History
  XMP-xmp:Thumbnails
  XMP-crs:all
  XMP-crss:all

As well, any property with a value larger than 10 kB is placed in a separate rdf:Description.
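
As a rough illustration of the rule just described, the Python sketch below decides whether a property would end up in its own rdf:Description when writing a JPEG. It is only a model of the description in this post, not ExifTool's actual code: the function name `needs_own_description` is hypothetical, "10 kB" is assumed to mean 10240 bytes of UTF-8, and the tag names in the usage lines are only examples.

```python
# Hypothetical model of the rule described above; not ExifTool's actual code.
FORCED_TAGS = {"xmp-photoshop:history", "xmp-xmp:thumbnails"}
FORCED_GROUPS = {"xmp-crs", "xmp-crss"}  # models "XMP-crs:all" and "XMP-crss:all"
SIZE_LIMIT = 10 * 1024  # assumption: "10 kB" taken as 10240 bytes

def needs_own_description(tag, value):
    """Return True if this property would get a separate rdf:Description in a JPEG."""
    tag = tag.lower()                 # compare tag names case-insensitively
    group = tag.split(":", 1)[0]      # e.g. "xmp-crs" from "xmp-crs:exposure"
    if tag in FORCED_TAGS or group in FORCED_GROUPS:
        return True
    # Otherwise only the size of the value matters.
    return len(value.encode("utf-8")) > SIZE_LIMIT

print(needs_own_description("XMP-crs:Exposure", "0.5"))
print(needs_own_description("XMP-dc:Title", "A short title"))
```

Under these assumptions the first call reports a separate rdf:Description (the property is in the XMP-crs group) and the second does not.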

I haven't documented this anywhere because it would just confuse most people and add to the already-too-long documentation, but you should be aware of this side-effect of the single-rdf:Description feature.

herb

Hello Phil,

I agree that these options will not be used very often, and I also agree that it makes little sense to introduce a bunch of options.

So I think the best would be to use
- Compact for text formatting only (no padding, no newlines, no indentation) and
- XMPShorthand for "logical" formatting: XMP shorthand format and also one rdf:Description (as given in your details)

Best regards
Herb

Phil Harvey

Hi Herb,

That makes sense, and I'm not too far away from this if I add a XMPShorthand=2 setting.  So I think I'll do that.

- Phil

herb

Hello Phil,

thanks for the new version of Exiftool and the enhancement of option XMPShorthand.

Please bear with me as I bring up the Compact option again:
To me, the mix of "text formatting" and features that belong to XMPShorthand is not "ExifTool-like". (Sorry to say.)
Please remove compact=5 and also compact=3, because these features should be controlled by XMPShorthand only.

As this enhancement is not released in any production-release of Exiftool I see no problems to change it again.

Thanks and best regards
Herb

Phil Harvey

Hi Herb,

Thanks for your feedback.

Actually, if I'm going to change things I would prefer getting rid of the XMPShorthand option and changing the Compact option to be more flexible.  How about a single Compact option that takes a comma-delimited string of settings?  The settings could be "Padding", "Indent", "Newlines", "Shorthand" and "OneDesc".  So

-Compact=Padding would be equivalent to the old Compact option.

and

-Compact=Shorthand would be equivalent to the old XMPShorthand option.

I know this combines the logical and text formatting, but I like the idea of having a single option for all of this.

I could also add an "All" setting to impose them all.  And maybe an "AllFormat" and "AllSpace" for the logical and text formatting separately.

I can make this backward compatible to all previous ExifTool versions.

What do you think?

- Phil

herb

Hello Phil,

thanks for your reply.
I am a little surprised that you propose -compact=Padding.
I had expected -compact=NoPadding, -compact=NoNewline or -compact=NoIndent, because the default is Padding etc.

I hope I understand your proposal correctly:
-compact=NoPadding,NoIndent is possible.
So I agree to your solution.

Thanks and best regards
Herb

Phil Harvey

Hi Herb,

I see how this is confusing, but I was thinking that -compact=padding would enable compaction of padding (i.e. no padding).  I agree that NoPadding is more intuitive, so I will use that instead.

Thanks.

- Phil

Phil Harvey

Here is a draft of what the new documentation could look like:

Compact

Comma-delimited list of settings for writing compact XMP.  Below is a list
of available settings.  Note that 'NoPadding' affects only embedded XMP
since padding is never written for stand-alone XMP files.  Case is not
significant.  Default is undef.

  NoPadding - avoid 2 kB of recommended padding at end of XMP
  NoIndent  - avoid spaces to indent lines for readability
  NoNewline - avoid unnecessary newlines
  Shorthand - use XMP Shorthand format
  OneDesc   - combine XMP properties into a single rdf:Description
  AllSpace  - equivalent to 'NoPadding,NoIndent,NoNewline'
  AllFormat - equivalent to 'Shorthand,OneDesc'
  All       - equivalent to 'AllSpace,AllFormat'
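
To make the alias expansion in this draft concrete, here is a minimal Python sketch of how such a comma-delimited string could be resolved into the five basic settings. It is only a model of the draft text above, not ExifTool's implementation: the function name `expand_compact` is hypothetical, and raising an error on an unknown setting is an assumption, since the draft doesn't say what happens in that case.

```python
# Illustrative model of the draft documentation above; not ExifTool's code.
BASIC = ("NoPadding", "NoIndent", "NoNewline", "Shorthand", "OneDesc")
ALIASES = {
    "AllSpace": ("NoPadding", "NoIndent", "NoNewline"),
    "AllFormat": ("Shorthand", "OneDesc"),
    "All": ("AllSpace", "AllFormat"),
}
# Lower-cased lookup table, since the draft says case is not significant.
CANONICAL = {name.lower(): name for name in (*BASIC, *ALIASES)}

def expand_compact(value):
    """Expand a comma-delimited Compact string into a set of basic settings."""
    enabled = set()
    if not value:  # None or "" models the "undef" default: nothing enabled
        return enabled
    pending = [part.strip() for part in value.split(",") if part.strip()]
    while pending:
        key = pending.pop().lower()
        if key not in CANONICAL:
            # Assumption: unknown names are rejected; the draft is silent on this.
            raise ValueError(f"unknown Compact setting: {key!r}")
        name = CANONICAL[key]
        if name in ALIASES:
            pending.extend(ALIASES[name])  # aliases may refer to other aliases
        else:
            enabled.add(name)
    return enabled

print(sorted(expand_compact("nopadding,NoIndent")))
print(sorted(expand_compact("All")))
```

With this model, "NoPadding,NoIndent" enables just those two settings, while "All" expands through AllSpace and AllFormat to all five.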


- Phil