Subgroups tags separator

Started by ArchZu, September 08, 2024, 06:29:18 PM

ArchZu

Hello, I would like to know whether exiftool can change the separator used between subtags. Some tags use "/" and others use "|"; is it possible to make them all use, for example, "|", to separate the subtags? Or can this be changed through a configuration file?

It's like this:
-XMP-digiKam:TagsList='Character/Ainz Ooal Gown'
-XMP-lr:hierarchicalSubject='Character|Ainz Ooal Gown'


I want it like this:
-XMP-digiKam:TagsList='Character|Ainz Ooal Gown'
-XMP-lr:hierarchicalSubject='Character|Ainz Ooal Gown'



If that is possible, I would also like to know whether I can use a literal "/" inside a tag name, for example in "-XMP-digiKam:TagsList", without having to escape it as "\/" (I haven't tested this).
Like this:
-XMP-digiKam:TagsList='Character|Ainz/Ooal/Gown'
-XMP-lr:hierarchicalSubject='Character|Ainz/Ooal/Gown'


I know I could write a script that changes "/" to "|", but then every "/" I type would be changed to "|", and I don't really like that behavior in Python, bash, or similar scripts.
Arch Linux (KDE Plasma)

StarGeek

Quote from: ArchZu on September 08, 2024, 06:29:18 PM
is it possible to make them all use, for example, "|", to separate the subtags?

It is possible to change them. For example, changing a slash to a pipe can be done like this:
exiftool -api "Filter=s(/)(|)g" -TagsFromFile @ -HierarchicalSubject -TagsList /path/to/files/

The -api Filter option is global and will affect every tag, so you don't want to use it if you are copying other data at the same time. The reason I like using it is that it will affect every entry of a list type tag separately, removing the need to deal with the -sep option.
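For readers who want the per-entry behavior spelled out, here is a minimal Python sketch. It is not exiftool code: it assumes a list-type tag is held as a Python list, and `filter_each` is a hypothetical helper name used only for this illustration.

```python
import re

def filter_each(values, pattern, replacement):
    """Apply one substitution to every list entry separately,
    roughly mimicking a filter that sees each list item on its own."""
    return [re.sub(pattern, replacement, v) for v in values]

# Example entries taken from this thread, using "/" as the hierarchy separator.
tags_list = ["Character/Ainz Ooal Gown", "loose_tag1"]

converted = filter_each(tags_list, r"/", "|")
print(converted)
```

Because each entry is handled on its own, the list never has to be joined into one string and split again, which is the job the -sep option would otherwise do.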

Quote
If that is possible, I would also like to know whether I can use a literal "/" inside a tag name, for example in "-XMP-digiKam:TagsList", without having to escape it as "\/" (I haven't tested this).

Perl is flexible when it comes to RegEx, so you can use other delimiters to avoid leaning toothpick syndrome. In the example I use parentheses instead of the usual s/search/replace/ structure.

Quote
Like this:
-XMP-digiKam:TagsList='Character|Ainz/Ooal/Gown'
-XMP-lr:hierarchicalSubject='Character|Ainz/Ooal/Gown'

Now I'm not sure what you want. Do you want to only replace the first / with a | and replace the spaces with slashes?

The important thing to take note of is that changing the hierarchy separator does not mean that the program you are using will understand it. If Digikam uses a slash by default and doesn't have an option to change it to a pipe, then instead of having a hierarchy, you will have a single entry with pipes. You cannot force some program to change the way it reads the data.

IMO, the only time this should be a problem is when you are copying from one tag to another tag that uses a different separator. Let your programs read the data the way they write it.
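The copying case mentioned above can be sketched in Python. This is an illustration only, not something exiftool provides: it assumes the Lightroom-style tag uses "|" and the digiKam-style tag uses "/", as in the examples in this thread, and `convert_hierarchy` is a hypothetical helper name.

```python
def convert_hierarchy(value, src_sep, dst_sep):
    """Convert one hierarchical keyword from src_sep to dst_sep.

    Raises ValueError if a level already contains dst_sep, because the
    reading program would then see an extra hierarchy level.
    """
    levels = value.split(src_sep)
    for level in levels:
        if dst_sep in level:
            raise ValueError(
                f"{level!r} contains the target separator {dst_sep!r}"
            )
    return dst_sep.join(levels)

print(convert_hierarchy("Character|Ainz Ooal Gown", "|", "/"))

try:
    convert_hierarchy("Character|Ainz/Ooal/Gown", "|", "/")
except ValueError as err:
    print("cannot convert:", err)
```

The second call fails on purpose: a leaf such as "Ainz/Ooal/Gown" cannot be stored in a "/"-separated tag without being read back as three levels.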
* Did you read FAQ #3 and use the command listed there?
* Please use the Code button for exiftool code/output.
 
* Please include your OS, Exiftool version, and type of file you're processing (MP4, JPG, etc).

ArchZu

Quote from: StarGeek on September 08, 2024, 07:32:30 PM
Now I'm not sure what you want. Do you want to only replace the first / with a | and replace the spaces with slashes?

exiftool -overwrite_original -charset "UTF8" -sep ", " \
    -XMP-digiKam:TagsList+='Copyright/fate∕grand order' \
    -XMP-digiKam:TagsList+='Copyright/fate∕stay night' \
    -XMP-digiKam:TagsList+='loose_tag1' \
    -XMP-digiKam:TagsList+='loose_tag2' \
    -XMP-lr:hierarchicalSubject+='Copyright|fate∕grand order' \
    -XMP-lr:hierarchicalSubject+='Copyright|fate∕stay night' \
    -XMP-lr:hierarchicalSubject+='loose_tag1' \
    -XMP-lr:hierarchicalSubject+='loose_tag2' \
    -XMP-dc:subject+='fate/grand order, fate/stay night, loose_tag1, loose_tag2' \
    image.png
It should result in this:
Copyright
    fate∕grand order
Copyright
    fate∕stay night
loose_tag1
loose_tag2


Usually I use a Python or bash script that replaces "/" (U+002F) with another character, such as "∕" (U+2215).
script.sh --tags 'Copyright|fate{U+2215}grand order': I type a normal slash and the script replaces it with the other character so that it doesn't cause problems. This works, but I don't know if it is the best way to do it.

script.sh --tags 'Copyright|fate/grand order, Copyright|fate/stay night, loose_tag1, loose_tag2' image.png
Copyright
    fate∕grand order
Copyright
    fate∕stay night
loose_tag1
loose_tag2

[XMP:XMP-dc:Image] Subject                      : fate/grand order, fate/stay night, loose_tag1, loose_tag2
[XMP:XMP-digiKam:Image] TagsList                : Copyright/fate∕grand order, Copyright/fate∕stay night, loose_tag1, loose_tag2
[XMP:XMP-lr:Image] HierarchicalSubject          : Copyright|fate∕grand order, Copyright|fate∕stay night, loose_tag1, loose_tag2
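A minimal Python sketch of the replacement described in this post. It is not the actual script.sh, which isn't shown here; the function names are hypothetical, and it assumes that "|" is typed as the hierarchy separator and that every "/" in the input is meant literally.

```python
DIVISION_SLASH = "\u2215"  # "∕", looks like "/" (U+002F) but is a different character

def protect_slashes(tag):
    """Replace every "/" typed by the user with U+2215."""
    return tag.replace("/", DIVISION_SLASH)

def to_lightroom(tag):
    """Lightroom-style value: keep "|" as the hierarchy separator."""
    return protect_slashes(tag)

def to_digikam(tag):
    """digiKam-style value: protect literal slashes first,
    then turn the "|" hierarchy separator into "/"."""
    return protect_slashes(tag).replace("|", "/")

typed = "Copyright|fate/grand order"
print(to_digikam(typed))
print(to_lightroom(typed))
```

The order matters in `to_digikam`: the literal slashes have to be replaced before "|" becomes "/", otherwise the two kinds of slash can no longer be told apart.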

StarGeek

Quote from: ArchZu on September 08, 2024, 08:08:27 PM
It should result in this:
Copyright
    fate∕grand order
Copyright
    fate∕stay night
loose_tag1
loose_tag2

If DigiKam doesn't have a setting to allow the use of the pipe character as a separator, then using a slash as part of a leaf keyword will not be possible. There isn't anything that exiftool does that can change how another program reads the data. This is something you need to check on the DigiKam side or switch to another separator for things like "fate∕grand order".

Quote
Usually I use a Python or bash script that replaces "/" (U+002F) with another character, such as "∕" (U+2215).
script.sh --tags 'Copyright|fate{U+2215}grand order': I type a normal slash and the script replaces it with the other character so that it doesn't cause problems. This works, but I don't know if it is the best way to do it.

I can't help on the Python side of things, but forcing a change of a character is a really bad idea. Sounds like something else might be happening. It actually sounds like something a word processor or Google Docs would do to "help" you. Those programs automatically do that by changing regular quotes to Smart/Fancy quotes.

ArchZu

Quote from: StarGeek on September 09, 2024, 11:38:03 AM
If DigiKam doesn't have a setting to allow the use of the pipe character as a separator, then using a slash as part of a leaf keyword will not be possible. There isn't anything that exiftool does that can change how another program reads the data. This is something you need to check on the DigiKam side or switch to another separator for things like "fate∕grand order".

I'm just going to change the characters, to avoid problems.


Quote from: StarGeek on September 09, 2024, 11:38:03 AM
I can't help on the Python side of things, but forcing a change of a character is a really bad idea. Sounds like something else might be happening. It actually sounds like something a word processor or Google Docs would do to "help" you. Those programs automatically do that by changing regular quotes to Smart/Fancy quotes.

I intend to do something like this.

exiftool -charset UTF8 -all= -tagsFromFile @ -XMP-lr:HierarchicalSubject 2.png
exiftool -charset UTF8 -api 'Filter=s(/)(\xE2\x88\x95)g' -XMP-lr:HierarchicalSubject="A-b-C|a/B/b|C_c_C" -XMP-lr:HierarchicalSubject="Z-z-Z|x/X/x|W_w_W" "-XMP-mediapro:CatalogSets<XMP-lr:HierarchicalSubject" -CodedCharacterSet="UTF8" 2.png
exiftool -charset UTF8 -api 'Filter=s(\|)(/)g' '-XMP-digiKam:TagsList<XMP-lr:HierarchicalSubject' '-XMP-microsoft:LastKeywordXMP<XMP-lr:HierarchicalSubject' 2.png
exiftool -charset UTF8 -api 'Filter=s/.*\|([^|]*)/$1/g' -sep ', ' '-XMP-dc:subject<XMP-lr:HierarchicalSubject' 2.png
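To make the three substitutions easier to follow, here is a rough Python equivalent of what each Filter expression does to the values. It is a sketch of the string handling only and does not read or write any file; the variable names are made up, and the way the final list is joined with ", " is an assumption based on the -sep option in the last command.

```python
import re

DIVISION_SLASH = "\u2215"  # same character as the UTF-8 bytes E2 88 95

# The two values typed once on the command line.
typed = ["A-b-C|a/B/b|C_c_C", "Z-z-Z|x/X/x|W_w_W"]

# s(/)(\xE2\x88\x95)g : protect literal slashes inside the names.
hierarchical = [re.sub(r"/", DIVISION_SLASH, v) for v in typed]

# s(\|)(/)g : build the "/"-separated copy for the digiKam-style tag.
tags_list = [re.sub(r"\|", "/", v) for v in hierarchical]

# s/.*\|([^|]*)/$1/g : keep only the last level for the flat keyword tag.
leaves = [re.sub(r".*\|([^|]*)", r"\1", v) for v in hierarchical]

print(hierarchical)
print(tags_list)
print(", ".join(leaves))
```

The greedy `.*` in the last pattern runs up to the final "|", so only the deepest level survives; a value without any "|" is left unchanged.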

I'll type the tag just once and have it copied to the other tags. Do you know if it is possible to do this with a configuration file ("-config tags.config")? I only know how to display values from the config file, not how to copy one value to another.
'Image::ExifTool::XMP::xmp' => {
  Lower => { # defines the composite tag HierarchicalSubject
    Require => { # tags that must exist for this tag to be created
      0 => 'XMP-lr:HierarchicalSubject',
    },
    Desire => { # tags that should be used to create the composite tag
      1 => 'XMP-mediapro:CatalogSets',
      2 => 'XMP-digiKam:TagsList',
      3 => 'XMP-microsoft:LastKeywordXMP',
      4 => 'XMP-dc:subject',
    },
  },
};

I need to know the characters that it is not advisable to use in a tag.

This is used to separate child tags: "/" and "|"
this is used to separate parent tags: ","

In that case, could it be a problem to use something like a "<>"? Or single quote, double quote, other character?

StarGeek

Quote from: ArchZu on September 10, 2024, 08:43:27 PM
I need to know the characters that it is not advisable to use in a tag.

This is used to separate child tags: "/" and "|"
this is used to separate parent tags: ","

In that case, could it be a problem to use something like a "<>"? Or single quote, double quote, other character?

This is entirely dependent upon the programs you use. You would have to test each program.

As an example, in the old Picasa program, if you had a comma in a keyword, e.g. "Smith, John" or "Doe, Jane", it would automatically split that single keyword into two separate ones, e.g. "Smith" and "John", or "Doe" and "Jane".
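Since the answer depends on the programs involved, one way to test is to flag keywords that contain characters you already suspect. The Python sketch below is only an illustration: the set of characters is an assumption collected from this thread (separators, "<>", quotes), not a list defined by exiftool or by any particular program, and `find_risky_chars` is a hypothetical helper name.

```python
# Characters mentioned in this thread as separators or as possibly troublesome.
# Assumption for illustration only; adjust it to the programs you actually use.
RISKY = set('/|,<>\'"')

def find_risky_chars(keyword, risky=RISKY):
    """Return the sorted risky characters that occur in one keyword."""
    return sorted(set(keyword) & risky)

for kw in ["Smith, John", "fate/grand order", "loose_tag1"]:
    print(repr(kw), "->", find_risky_chars(kw))
```

A keyword that comes back with an empty list is not guaranteed to be safe everywhere; it only means none of the characters in the chosen set are present.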

ArchZu

Quote from: StarGeek on September 11, 2024, 10:26:33 AM
This is entirely dependent upon the programs you use. You would have to test each program.

I thought about it: I'll replace the single quote, double quote, and acute accent, because they can cause problems with scripts, and I'll also replace the slash in names. It's easier to adapt the tags only for the programs I use; if I start using other software later, I can adapt the tags with a script. 👍