Excire Keywords (XMP-excire:HierarchicalSubject) to standard XMP Tags

Started by PhotoManiax, May 01, 2023, 10:13:29 AM


PhotoManiax

Hello,

Thank you so much, @Phil, for this amazing tool. It has become the gold standard for manipulating metadata.
I have used it a lot to clean up mistakes I made over almost 20 years of using various kinds of photo management software.
So I am OK with extended command lines and with mapping values from IPTC to XMP, etc.

I ended up using digiKam as my main DAM tool, but I also use Excire since it does a relatively good job when it comes to auto tagging based upon their picture analysis and categorisation algorithms. That saves a tremendous amount of time compared to manually tagging thousands of pictures.

Unfortunately they don't use the standard XMP namespace but "invented" their own, see snippet below.

I am looking for a way to take Excire's auto-categorisation (keyword assignment based upon picture content) and "translate" it into hierarchical tags in digiKam.
digiKam is configured according to the guide Setup of digiKam for Windows compatibility, even though I use Windows, Linux and Mac.

The challenge is to append the Excire XMP structure (XMP-excire:HierarchicalSubject) to:

-XMP-dc:Subject
-XMP-lr:HierarchicalSubject
-XMP-microsoft:LastKeywordXMP

and replace the separator "|" with "/".

The Excire XMP structure looks as follows:
<XMP-excire:HierarchicalSubject>
  <rdf:Bag>
   <rdf:li>excire|content|vehicle|car</rdf:li>
   <rdf:li>excire|content|vehicle</rdf:li>
   <rdf:li>excire|content|vehicle|motorcycle</rdf:li>
  </rdf:Bag>
 </XMP-excire:HierarchicalSubject>

It should not only be written to -XMP-dc:Subject, -XMP-lr:HierarchicalSubject and -XMP-microsoft:LastKeywordXMP but also the separators need to be replaced from "|" to "/".

This command here appends the Excire keywords to the respective structures but it retains the "|" separator:
exiftool "-XMP-dc:Subject+<XMP-excire:HierarchicalSubject" "-XMP-microsoft:LastKeywordXMP+<XMP-excire:HierarchicalSubject" "-XMP-lr:HierarchicalSubject+<XMP-excire:HierarchicalSubject" test2.jpg

Result:
<XMP-lr:HierarchicalSubject>
  <rdf:Bag>
   <rdf:li>Places/NameOfTheCountry/NameofTheState/NameOfTheCity/NameOfTheSublocation</rdf:li>
   <rdf:li>excire|content|vehicle|car</rdf:li>
   <rdf:li>excire|content|vehicle</rdf:li>
   <rdf:li>excire|content|vehicle|motorcycle</rdf:li>
  </rdf:Bag>
 </XMP-lr:HierarchicalSubject>
 <XMP-dc:Subject>
  <rdf:Bag>
   <rdf:li>Places/NameOfTheCountry/NameofTheState/NameOfTheCity/NameOfTheSublocation</rdf:li>
   <rdf:li>excire|content|vehicle|car</rdf:li>
   <rdf:li>excire|content|vehicle</rdf:li>
   <rdf:li>excire|content|vehicle|motorcycle</rdf:li>
  </rdf:Bag>
 </XMP-dc:Subject>
 <XMP-microsoft:LastKeywordXMP>
  <rdf:Bag>
   <rdf:li>Places/NameOfTheCountry/NameofTheState/NameOfTheCity/NameOfTheSublocation</rdf:li>
   <rdf:li>excire|content|vehicle|car</rdf:li>
   <rdf:li>excire|content|vehicle</rdf:li>
   <rdf:li>excire|content|vehicle|motorcycle</rdf:li>
  </rdf:Bag>
 </XMP-microsoft:LastKeywordXMP>

Excire assigns all keywords within the hierarchy to the metadata. Ideally only the "longest" hierarchy path should be written and the word "excire" eliminated from the tree, so that it looks like this in the "HierarchicalSubject", "LastKeywordXMP" and "Subject":

<XMP-lr:HierarchicalSubject>
  <rdf:Bag>
   <rdf:li>Places/NameOfTheCountry/NameofTheState/NameOfTheCity/NameOfTheSublocation</rdf:li>
   <rdf:li>Content/vehicle/car</rdf:li>
   <rdf:li>Content/vehicle/motorcycle</rdf:li>
  </rdf:Bag>
 </XMP-lr:HierarchicalSubject>
 <XMP-dc:Subject>
  <rdf:Bag>
   <rdf:li>Places/NameOfTheCountry/NameofTheState/NameOfTheCity/NameOfTheSublocation</rdf:li>
   <rdf:li>Content/vehicle/car</rdf:li>
   <rdf:li>Content/vehicle/motorcycle</rdf:li>
  </rdf:Bag>
 </XMP-dc:Subject>
 <XMP-microsoft:LastKeywordXMP>
  <rdf:Bag>
   <rdf:li>Places/NameOfTheCountry/NameofTheState/NameOfTheCity/NameOfTheSublocation</rdf:li>
   <rdf:li>Content/vehicle/car</rdf:li>
   <rdf:li>Content/vehicle/motorcycle</rdf:li>
  </rdf:Bag>
 </XMP-microsoft:LastKeywordXMP>

However, I'd be fine if the data could simply be "translated" to the target XMP namespaces with the separators changed from "|" to "/".

Thanks in advance for any helpful responses.

StarGeek

Removing the first word and switching the separator character is simple:
exiftool -api "Filter=s(^excire\|)()i;s(\|)(/)g" "-XMP-dc:Subject+<XMP-excire:HierarchicalSubject" "-XMP-microsoft:LastKeywordXMP+<XMP-excire:HierarchicalSubject" "-XMP-lr:HierarchicalSubject+<XMP-excire:HierarchicalSubject" test2.jpg

Though I can't remember if an -AddTagsFromFile @ is needed in this case; I can never keep that straight.
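
For anyone unsure what the two substitutions in that Filter do to each list item, here is a rough Python equivalent. It only illustrates the regex logic and is not something exiftool itself runs; the function name `apply_filter` is made up for this sketch.

```python
import re

def apply_filter(value: str) -> str:
    """Mimic the two Perl substitutions from the Filter on one list item."""
    # s(^excire\|)()i : drop a leading "excire|" (case-insensitive)
    value = re.sub(r"^excire\|", "", value, flags=re.IGNORECASE)
    # s(\|)(/)g : replace every remaining "|" with "/"
    return value.replace("|", "/")

items = [
    "excire|content|vehicle|car",
    "excire|content|vehicle",
    "excire|content|vehicle|motorcycle",
]
converted = [apply_filter(v) for v in items]
print(converted)
```

Values that do not start with "excire|" and contain no "|" pass through unchanged.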

Removing the shorter duplicate paths would be difficult and probably require a user-defined tag.
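
To make the "shorter duplicate paths" problem concrete, here is a minimal Python sketch of the pruning logic: keep a path only if no other path in the list extends it. This is just the idea, under the assumption that paths are already "/"-separated; in exiftool it would have to be written in Perl inside a user-defined tag, and the name `drop_parent_paths` is hypothetical.

```python
def drop_parent_paths(paths, sep="/"):
    """Keep only paths that are not a strict ancestor of another path."""
    kept = []
    for p in paths:
        # p is an ancestor if some other path starts with p followed by the separator
        is_parent = any(q != p and q.startswith(p + sep) for q in paths)
        if not is_parent and p not in kept:
            kept.append(p)
    return kept

paths = ["content/vehicle/car", "content/vehicle", "content/vehicle/motorcycle"]
leaves = drop_parent_paths(paths)
print(leaves)
```

Comparing against `p + sep` rather than `p` alone avoids treating "content/veh" as a parent of "content/vehicle".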

Quote from: PhotoManiax on May 01, 2023, 10:13:29 AM
Unfortunately they don't use the standard XMP namespace but "invented" their own, see snippet below.

Phil's previous post on excire's new tags. Related: ACDSee special tags.

If you need it, there is also a config file for editing excire tags.

edit: fixed bad regex, and more tweaks to avoid leaning toothpicks

PhotoManiax

Thank you StarGeek for the blazing fast reply.

It works great.

Would it be too much to ask for an extension of that expression which capitalises the words?

So from
excire|content|vehicle|car
to
Content/Vehicle/Car
instead of
content/vehicle/car

Thank you!!!

I am aware of the issues with non-standard XMP metadata and I read through the articles you mentioned. The only reason I am using Excire is the auto-categorisation mentioned above. Would I like to use DAM software that has that feature and uses standards? Absolutely, but I have not come across any so far. That's why I am stuck with Excire for the moment, with digiKam as the primary DAM tool.

PhotoManiax

Using my search engine of choice, I found the following approach, which seems to work:

exiftool -m -r -overwrite_original -api "Filter=s(^excire\|)()i;s(\|)(/)g;s/([\w']+)/\u\L$1/g" "-XMP-dc:Subject+<XMP-excire:HierarchicalSubject" "-XMP-microsoft:LastKeywordXMP+<XMP-excire:HierarchicalSubject" "-XMP-lr:HierarchicalSubject+<XMP-excire:HierarchicalSubject" test2.jpg

This converts
<XMP-excire:HierarchicalSubject>
  <rdf:Bag>
   <rdf:li>excire|content|vehicle|car</rdf:li>
   <rdf:li>excire|content|vehicle</rdf:li>
   <rdf:li>excire|content|vehicle|motorcycle</rdf:li>
  </rdf:Bag>
 </XMP-excire:HierarchicalSubject>
to
<XMP-lr:HierarchicalSubject>
  <rdf:Bag>
   <rdf:li>Content/Vehicle/Car</rdf:li>
   <rdf:li>Content/Vehicle</rdf:li>
   <rdf:li>Content/Vehicle/Motorcycle</rdf:li>
  </rdf:Bag>
 </XMP-lr:HierarchicalSubject>
 <XMP-microsoft:LastKeywordXMP>
  <rdf:Bag>
   <rdf:li>Content/Vehicle/Car</rdf:li>
   <rdf:li>Content/Vehicle</rdf:li>
   <rdf:li>Content/Vehicle/Motorcycle</rdf:li>
  </rdf:Bag>
 </XMP-microsoft:LastKeywordXMP>
 <XMP-dc:Subject>
  <rdf:Bag>
   <rdf:li>Content/Vehicle/Car</rdf:li>
   <rdf:li>Content/Vehicle</rdf:li>
   <rdf:li>Content/Vehicle/Motorcycle</rdf:li>
  </rdf:Bag>
 </XMP-dc:Subject>
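
For reference, the added substitution `s/([\w']+)/\u\L$1/g` lowercases each word and then uppercases its first character. A rough Python equivalent of just that step is sketched below; it is only an illustration of the regex, and the function name `title_case_words` is made up here.

```python
import re

def title_case_words(value: str) -> str:
    """Mimic Perl's s/([\\w']+)/\\u\\L$1/g on one list item."""
    def fix(match):
        word = match.group(0)
        # \L lowercases the whole word, \u then uppercases its first character
        return word[:1].upper() + word[1:].lower()
    # [\w']+ treats letters, digits, underscores and apostrophes as part of a word
    return re.sub(r"[\w']+", fix, value)

print(title_case_words("content/vehicle/car"))
```

Because the whole word is lowercased first, a mixed-case item such as NameOfTheCountry would come out as Nameofthecountry if it passed through this step. Also, when running the exiftool command on Linux or macOS, the `$1` sits inside double quotes, where the shell expands it before exiftool sees it, so it most likely needs protecting there (for example by writing `\$1`).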

So thank you again for your help @StarGeek.