pdf:keywords not being cleared when clearing mwg:keywords

Started by Joanna Carter, June 07, 2021, 09:25:27 AM


Joanna Carter

I am writing mwg:keywords to an xmp file and it seems to be adding pdf:keywords as well.

All well and good, but when I issue mwg:keywords= to clear them down, pdf:keywords remains.


<rdf:Description rdf:about=''
  xmlns:pdf='http://ns.adobe.com/pdf/1.3/'>
  <pdf:Keywords>Nounours, Didier, Joanna</pdf:Keywords>
</rdf:Description>


It is not listed as a derived tag in the MWG list - see the attached screenshot

Any ideas?

Joanna Carter

In fact, if I ask for -pdf:keywords with exiftool, I get nothing back, but if I ask for -keywords, I get the keywords I have added with mwg:keywords.

Just which tag should I be setting/clearing to ensure that every trace of keywords disappears from an XMP file?

Phil Harvey

Hi Joanna,

There are a number of things wrong with this post.

1. PDF:Keywords doesn't exist in an XMP file.  You mean XMP-pdf:Keywords.

2. ExifTool will not add XMP-pdf:Keywords to an XMP file when writing MWG:Keywords.

3. If you write MWG:Keywords, you should read back MWG:Keywords if you want to see it.  Also, consult the MWG tags documentation to understand more what is happening here.

Give me a specific example that I can reproduce and I'll tell you what is happening.

- Phil

Joanna Carter

Quote from: Phil Harvey on June 08, 2021, 08:29:06 AM
There are a number of things wrong with this post.

You're telling me  ;)

Quote from: Phil Harvey on June 08, 2021, 08:29:06 AM
1. PDF:Keywords doesn't exist in an XMP file.  You mean XMP-pdf:Keywords.

2. ExifTool will not add XMP-pdf:Keywords to an XMP file when writing MWG:Keywords.

That might be so, but it was there in plain sight:

<rdf:Description rdf:about=''
  xmlns:pdf='http://ns.adobe.com/pdf/1.3/'>
  <pdf:Keywords>Nounours, Didier</pdf:Keywords>
</rdf:Description>

As to how that came about, it gets complicated, "when you are a bear of little brain" as Winnie the Pooh would say.

If I write only to an XMP file, I send the following list of arguments...

    // clear out everything from the existing XMP file in case the user has two RAW files with the same name, but different extensions,
    // in the same directory and are now setting keywords on a different image to the one they've already worked on (unlikely but it could happen)

  - 0 : "-all="
  - 1 : "--xmp:all" // except for the xmp data

    // copy all tags from the RAW file

  - 2 : "-tagsFromFile"
  - 3 : "/Users/joannacarter/Pictures/DxO Beta/_JNA0004.NEF"
  - 4 : "-all"

    // I need the SidecarForExtension and format tags to identify which make of RAW file this was created from,
    // for when I do a search using Spotlight and it finds the XMP file, so I can get back to the correct RAW file

  - 5 : "-filetype>SidecarForExtension"
  - 6 : "-mimetype>format"

    // finally write the keywords and hierarchy to the XMP file

  - 7 : "-mwg:keywords=Nounours"
  - 8 : "-mwg:keywords=Didier"
  - 9 : "-xmp-mwg-kw:hierarchicalkeywords={keyword=Nounours}"
  - 10 : "-xmp-mwg-kw:hierarchicalkeywords={keyword=Nounours,children={keyword=Didier}}"
  - 11 : "-xmp-lr:hierarchicalsubject=Nounours"
  - 12 : "-xmp-lr:hierarchicalsubject=Nounours|Didier"
  - 13 : "/Users/joannacarter/Pictures/DxO Beta/_JNA0004.xmp"

And all that works fine.

When I want to write to both a RAW and an XMP file, I send the following arguments...

    // write to the RAW file

  - 0 : "-preserve"
  - 1 : "-ignoreMinorErrors"
  - 2 : "-overwrite_original_in_place"
  - 3 : "-mwg:keywords=Nounours"
  - 4 : "-mwg:keywords=Didier"
  - 5 : "-xmp-mwg-kw:hierarchicalkeywords={keyword=Nounours}"
  - 6 : "-xmp-mwg-kw:hierarchicalkeywords={keyword=Nounours,children={keyword=Didier}}"
  - 7 : "-xmp-lr:hierarchicalsubject=Nounours"
  - 8 : "-xmp-lr:hierarchicalsubject=Nounours|Didier"
  - 9 : "/Users/joannacarter/Pictures/DxO Beta/_JNA0004.NEF"

    // write to the XMP file

  - 10 : "-all="
  - 11 : "--xmp:all"
  - 12 : "-tagsFromFile"
  - 13 : "/Users/joannacarter/Pictures/DxO Beta/_JNA0004.NEF"
  - 14 : "-all"
  - 15 : "-filetype>SidecarForExtension"
  - 16 : "-mimetype>format"
  - 17 : "-mwg:keywords=Nounours"
  - 18 : "-mwg:keywords=Didier"
  - 19 : "-xmp-mwg-kw:hierarchicalkeywords={keyword=Nounours}"
  - 20 : "-xmp-mwg-kw:hierarchicalkeywords={keyword=Nounours,children={keyword=Didier}}"
  - 21 : "-xmp-lr:hierarchicalsubject=Nounours"
  - 22 : "-xmp-lr:hierarchicalsubject=Nounours|Didier"
  - 23 : "/Users/joannacarter/Pictures/DxO Beta/_JNA0004.xmp"

And now, that works fine.

What seemed to be causing the pdf entry to appear was putting an -execute argument between the two "sections" of arguments.

So, I am sorry to have troubled you when it seems it was I who had made a mistake. Nonetheless, such mistakes all add to the wealth of "don't do this again" knowledge for future enquirers.

Quote from: Phil Harvey on June 08, 2021, 08:29:06 AM
3. If you write MWG:Keywords, you should read back MWG:Keywords if you want to see it.  Also, consult the MWG tags documentation to understand more what is happening here.

If only that were possible.

I am writing MWG because that seems to cover the most bases when it comes to people working with and searching for keywords in all sorts of different software.

But my software doesn't actually read MWG tags, because it uses Apple's metadata and Spotlight frameworks to carry out searches. It is all neatly indexed for a blazingly fast response without having to enumerate files manually. And that uses dc:subject for reading and Apple's kMDItemKeywords metadata tag for searching.

If framework writers could agree on standard constants, the world would be a lot easier place to code in.

Once again, your rubber duck consulting services have saved the day  ;D

Joanna Carter

I shouldn't have celebrated too soon.

It seems the pdf tags appear in the XMP file when I delete keywords from both the RAW and XMP files.

Here are the arguments for that...

    // clear out the RAW file

  - 0 : "-preserve"
  - 1 : "-ignoreMinorErrors"
  - 2 : "-overwrite_original_in_place"
  - 3 : "-mwg:keywords="
  - 4 : "-xmp-mwg-kw:hierarchicalkeywords="
  - 5 : "-xmp-lr:hierarchicalsubject="
  - 6 : "/Users/joannacarter/Pictures/DxO Beta/IMG_1855.CR2"

    // clear out the XMP file

  - 7 : "-all="
  - 8 : "--xmp:all"
  - 9 : "-tagsFromFile"
  - 10 : "/Users/joannacarter/Pictures/DxO Beta/IMG_1855.CR2"
  - 11 : "-all"
  - 12 : "-filetype>SidecarForExtension"
  - 13 : "-mimetype>format"
  - 14 : "-mwg:keywords="
  - 15 : "-xmp-mwg-kw:hierarchicalkeywords="
  - 16 : "-xmp-lr:hierarchicalsubject="
  - 17 : "/Users/joannacarter/Pictures/DxO Beta/IMG_1855.xmp"


The excess tags cause a problem because, unlike for an image file, the Spotlight search predicate for an XMP file has to be based on its text content, not any metadata tags.

So, pending any answer as to why these tags are appearing, with the "deleted" keywords in them, I have coded around it by including the tag start and end in the predicate text thus...

"kMDItemTextContent = \"<rdf:li>Didier</rdf:li>\"cdw"

Nonetheless, I'd be interested to find out why this is happening
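
For anyone wanting to reproduce the workaround, the predicate text above can be built by a small helper. This is only a sketch in Python rather than the language of my app: the function name rdf_li_predicate is made up for illustration, and it assumes the keyword is stored in the sidecar exactly as typed, so keywords containing quotes or XML-special characters are rejected rather than guessed at.

```python
def rdf_li_predicate(keyword):
    """Build Spotlight predicate text matching one <rdf:li> keyword element.

    Hypothetical helper: it only formats the string shown in the post above.
    """
    # Refuse characters whose escaped form in the XMP text is not known here.
    forbidden = set('<>&"\'\\')
    if any(ch in forbidden for ch in keyword):
        raise ValueError(f"unsupported character in keyword: {keyword!r}")
    element = f"<rdf:li>{keyword}</rdf:li>"
    # "cdw" are the comparison modifiers used in the original predicate.
    return f'kMDItemTextContent = "{element}"cdw'


print(rdf_li_predicate("Didier"))
# prints: kMDItemTextContent = "<rdf:li>Didier</rdf:li>"cdw
```

Wrapping the keyword in its rdf:li tags means the comma-separated text inside pdf:Keywords no longer matches.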

StarGeek

Quote from: Joanna Carter on June 08, 2021, 01:14:18 PM
    // clear out the XMP file

  - 7 : "-all="
  - 8 : "--xmp:all"
  - 9 : "-tagsFromFile"
  - 10 : "/Users/joannacarter/Pictures/DxO Beta/IMG_1855.CR2"
  - 11 : "-all"

From the docs on the -TagsFromFile option:
    If no tags are specified, then all possible tags (see note 1 below) from the source file are copied to same-named tags in the preferred location of the output file (the same as specifying -all).

If IPTC:Keywords exists in the file and you use -TagsFromFile -All, exiftool will copy Keywords to the preferred destination in the target file.  And because it can't copy Keywords to EXIF or IPTC in an XMP file, the only available target is XMP-pdf:Keywords.

You can see a similar operation happening in the example below with the XMP-tiff tags, which are copied from the File group:

C:\>exiftool -g1 -a -s -exif:all -Iptc:all -xmp:all y:\!temp\Test4.jpg
---- IPTC ----
Keywords                        : Test
ApplicationRecordVersion        : 4

C:\>exiftool -P -overwrite_original -TagsFromFile y:\!temp\Test4.jpg -all y:\!temp\Test4.xmp
    1 image files updated

C:\>exiftool -g1 -a -s -exif:all -Iptc:all -xmp:all y:\!temp\Test4.xmp
---- XMP-x ----
XMPToolkit                      : Image::ExifTool 12.28
---- XMP-pdf ----
Keywords                        : Test
---- XMP-tiff ----
BitsPerSample                   : 8
ImageHeight                     : 1205
ImageWidth                      : 1749
YCbCrSubSampling                : YCbCr4:2:0 (2 2)
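
If you want to check a sidecar for stray XMP-pdf entries from a script, output in the -g1 -s layout shown above could be parsed roughly like this. It is a minimal sketch: parse_g1_output is a hypothetical helper, not part of exiftool, and the sample text is just a shortened copy of the output above.

```python
def parse_g1_output(text):
    """Group `exiftool -g1 -s` style output lines under their group headers."""
    groups = {}
    current = None
    for line in text.splitlines():
        line = line.rstrip()
        if line.startswith("---- ") and line.endswith(" ----"):
            # Header line such as "---- XMP-pdf ----"
            current = line[5:-5]
            groups.setdefault(current, {})
        elif current is not None and ":" in line:
            # Split on the first colon only, since values may contain colons.
            tag, _, value = line.partition(":")
            # Duplicate tag names within a group would overwrite each other.
            groups[current][tag.strip()] = value.strip()
    return groups


sample = """\
---- XMP-x ----
XMPToolkit                      : Image::ExifTool 12.28
---- XMP-pdf ----
Keywords                        : Test
---- XMP-tiff ----
BitsPerSample                   : 8
"""

groups = parse_g1_output(sample)
print(groups.get("XMP-pdf", {}))
# prints: {'Keywords': 'Test'}
```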


When copying to an XMP file, you should probably use -XMP:All instead of -All.  Alternatively, use the exif2xmp.args and iptc2xmp.args files, available on GitHub, to copy from EXIF/IPTC to the corresponding tags in XMP.
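
As a rough illustration of the first suggestion, the sidecar argument list from the quoted post could be assembled like this. It is a sketch only and has not been run against exiftool: the function name sidecar_refresh_args and the file names are hypothetical, the hierarchical keyword arguments are left out for brevity, and the only intended change from the original list is -xmp:all in place of -all.

```python
def sidecar_refresh_args(raw_path, xmp_path, keywords):
    """Build an exiftool argument list for rewriting an XMP sidecar.

    Hypothetical helper mirroring the argument list quoted in this thread,
    with "-xmp:all" substituted for "-all" after -tagsFromFile.
    """
    args = [
        "-all=",
        "--xmp:all",
        "-tagsFromFile",
        raw_path,
        "-xmp:all",  # was "-all" in the original list
        "-filetype>SidecarForExtension",
        "-mimetype>format",
    ]
    if keywords:
        args += [f"-mwg:keywords={kw}" for kw in keywords]
    else:
        # An empty assignment clears the keywords, as in the original list.
        args.append("-mwg:keywords=")
    args.append(xmp_path)
    return args


# Example with made-up file names: clear the keywords in the sidecar.
for arg in sidecar_refresh_args("IMG_0001.CR2", "IMG_0001.xmp", []):
    print(arg)
```

Whether -xmp:all copies everything your app needs from the RAW file is something to confirm on your own files.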