ExifTool Forum

ExifTool => The "exiftool" Application => Topic started by: Mac2 on June 06, 2023, 04:27:06 PM

Title: Merge Keywords
Post by: Mac2 on June 06, 2023, 04:27:06 PM
I want to:
a) copy the values of xmp-lr:hierarchicalSubject and xmp-dc:subject from file A to B
b) avoid duplicate keywords in B

I've read the FAQ (17) and the info on NoDups, and I tried several things, all with different results, all wrong.
My basic args file looks like this:

-tagsfromfile
A.jpg
-xmp-dc:subject
-xmp-lr:hierarchicalSubject
B.jpg

This copies the keywords from A to B, wiping existing keywords in B. Not what I want.

-tagsfromfile
A.jpg
-sep "##"
-xmp-dc:subject+<xmp-dc:subject
-xmp-lr:hierarchicalSubject+<xmp-lr:hierarchicalSubject
B.jpg

This appends the keywords from A to B. But if B already has some of the keywords of A, those keywords are duplicated in B.

-xmp-dc:subject-<xmp-dc:subject
-xmp-dc:subject+<xmp-dc:subject
-xmp-lr:hierarchicalSubject-<xmp-lr:hierarchicalSubject
-xmp-lr:hierarchicalSubject+<xmp-lr:hierarchicalSubject

Same effect as above. Keywords that already exist in B are duplicated.

-xmp-dc:subject+<${xmp-dc:subject;NoDups}
-xmp-lr:hierarchicalSubject+<${xmp-lr:hierarchicalSubject;NoDups}

Combines keywords from A into a comma-separated string ("alpha, beta") and writes them to B. Running the same args file again duplicates the combined string => alpha, beta; alpha, beta

-sep "##"
-xmp-dc:subject+<xmp-dc:subject
-xmp-lr:hierarchicalSubject+<xmp-lr:hierarchicalSubject

-tagsfromfile
@
-sep "##"
-xmp-dc:subject<${xmp-dc:subject;NoDups}
-xmp-lr:hierarchicalSubject<${xmp-lr:hierarchicalSubject;NoDups}

Does not copy keywords from A but merges the existing keywords of B into a comma-separated string.

Either I'm too stupid or this is not something that can be done?
Any suggestions would be appreciated.
Title: Re: Merge Keywords
Post by: Phil Harvey on June 06, 2023, 05:26:27 PM
When using -tagsfromfile, copy assignments override previous arguments for the same tag by default.  You must use -addTagsFromFile (the old method), or use the "+" prefix to add to previous operations for the same tag (the newer, more flexible, but also a bit confusing, technique).

-addtagsfromfile
A.jpg
-xmp-dc:subject-<xmp-dc:subject
-xmp-dc:subject+<xmp-dc:subject
-xmp-lr:hierarchicalSubject-<xmp-lr:hierarchicalSubject
-xmp-lr:hierarchicalSubject+<xmp-lr:hierarchicalSubject
B.jpg

or

-tagsfromfile
A.jpg
-xmp-dc:subject-<xmp-dc:subject
-+xmp-dc:subject+<xmp-dc:subject
-xmp-lr:hierarchicalSubject-<xmp-lr:hierarchicalSubject
-+xmp-lr:hierarchicalSubject+<xmp-lr:hierarchicalSubject
B.jpg

- Phil
Title: Re: Merge Keywords
Post by: Mac2 on June 07, 2023, 02:52:01 AM
Marvelous! This works beautifully. Thank you!

I've never used +- before in combination with < so it would have taken me forever to figure this out.
Thanks again!
Title: Re: Merge Keywords
Post by: Mac2 on June 08, 2023, 09:15:31 AM
The suggestion above works great when used in isolation, but not for the case I'm now trying to cover:

1. I want to copy XMP metadata from A to B
2. I want XMP data not in A to be removed from B
3. I want to merge the XMP keywords from A with the keywords in B without duplication

-xmp:all=
-tagsfromfile
@
-xmp-dc:subject
-xmp-lr:hierarchicalSubject

-tagsfromfile
A.jpg
-xmp:all
-sep "##"
-xmp-dc:subject-<xmp-dc:subject
-+xmp-dc:subject+<xmp-dc:subject
-xmp-lr:hierarchicalSubject-<xmp-lr:hierarchicalSubject
-+xmp-lr:hierarchicalSubject+<xmp-lr:hierarchicalSubject
B.jpg

These arguments:
a) Copy XMP data from A to B
b) Delete XMP data that is in B but not in A from B
c) Merge XMP keywords from A into B

But when I run it twice, the keywords of A appear twice in B! That is my problem.
Title: Re: Merge Keywords
Post by: StarGeek on June 08, 2023, 01:27:13 PM
What version of exiftool are you using?  I think the -+ construct requires version 12.59+.

Your argument file seems to work correctly here with version 12.62.  I just used Subject and not HierarchicalSubject, though.
C:\>type temp.txt
-tagsfromfile
y:\!temp\Test3.jpg
-xmp:all
-sep "##"
-xmp-dc:subject-<xmp-dc:subject
-+xmp-dc:subject+<xmp-dc:subject
-xmp-lr:hierarchicalSubject-<xmp-lr:hierarchicalSubject
-+xmp-lr:hierarchicalSubject+<xmp-lr:hierarchicalSubject
y:\!temp\Test4.jpg
C:\>exiftool -G1 -a -s -subject y:\!temp\Test3.jpg y:\!temp\Test4.jpg
======== y:/!temp/Test3.jpg
[XMP-dc]        Subject                        : Keyword 1, Keyword 2
======== y:/!temp/Test4.jpg
[XMP-dc]        Subject                        : Keyword 3, Keyword 2
    2 image files read

C:\>exiftool -P -overwrite_original  -@ temp.txt
    1 image files updated

C:\>exiftool -G1 -a -s -subject y:\!temp\Test3.jpg y:\!temp\Test4.jpg
======== y:/!temp/Test3.jpg
[XMP-dc]        Subject                        : Keyword 1, Keyword 2
======== y:/!temp/Test4.jpg
[XMP-dc]        Subject                        : Keyword 3, Keyword 1, Keyword 2
    2 image files read
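
For readers following along, the result above is consistent with reading the pair of assignments as "remove A's items from B's list, then append A's items". The Python below is only a toy model of that reading for a single list-type tag. The function name is made up for illustration, it is not ExifTool's implementation, and it does not model the -xmp:all= and -tagsfromfile @ steps from the larger argfile.

```python
def remove_then_add(target, source):
    """Toy model of '-TAG-<TAG' followed by '-+TAG+<TAG' for one list tag.

    Hypothetical helper for illustration only; not how ExifTool is implemented.
    """
    # Step 1: drop every target item that also appears in the source list.
    kept = [item for item in target if item not in source]
    # Step 2: append the source items after the surviving target items.
    return kept + list(source)


a = ["Keyword 1", "Keyword 2"]   # Subject of Test3.jpg (file A)
b = ["Keyword 3", "Keyword 2"]   # Subject of Test4.jpg (file B)

merged = remove_then_add(b, a)
print(merged)                               # ['Keyword 3', 'Keyword 1', 'Keyword 2']
print(remove_then_add(merged, a) == merged) # True: a second pass changes nothing in this model
```

In this model a second pass leaves the list unchanged, which fits the report that the pair works when used in isolation. It says nothing about what happens once other operations queue the same keywords again.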
Title: Re: Merge Keywords
Post by: Phil Harvey on June 08, 2023, 01:33:23 PM
I think the duplication occurs when you run the same argfile a second time.

- Phil
Title: Re: Merge Keywords
Post by: Phil Harvey on June 08, 2023, 03:10:37 PM
I've thought about this a bit.

The problem is that the -< operation removes existing items, but if you run the command twice you are adding new items twice to the list (since they already exist in both files).  Also, the -sep option is unnecessary in your command because you are never dealing with the stringified version of a list, plus you have used quotes and put two arguments on the same line, which won't work in an argfile.

I think the easiest thing to do is to remove the duplicates with a second command because I don't see an easy way to remove duplicates from the queued values.  So your argfile would be:

-xmp:all=
-tagsfromfile
@
-xmp-dc:subject
-xmp-lr:hierarchicalSubject

-tagsfromfile
A.jpg
-+xmp:all
-execute
-sep
 ##
-xmp-dc:subject<${xmp-dc:subject;NoDups(1)}
-xmp-lr:hierarchicalSubject<${xmp-lr:hierarchicalSubject;NoDups(1)}

and your command:

exiftool -@ argfile -common_args B.jpg

Note that I've put a space before "##" because lines that start with "#" are ignored in an argfile, but leading spaces are removed so this works.

Also note that an argfile can't contain the -common_args option, so it must go on the command line.
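
Since the argfiles in this thread are generated by software, here is a minimal Python sketch of the formatting rules mentioned above: one argument per line, no quotes, and a leading space in front of any argument that starts with "#" so it is not read as a comment. The function name is hypothetical and the sketch only builds the text; it does not call exiftool.

```python
def argfile_lines(args):
    """Return argfile text with one argument per line (hypothetical helper)."""
    lines = []
    for arg in args:
        # A line starting with '#' would be ignored as a comment, so prefix a
        # space; per the note above, leading spaces are removed on reading.
        lines.append(" " + arg if arg.startswith("#") else arg)
    return "\n".join(lines) + "\n"


text = argfile_lines([
    "-sep", "##",   # option and value go on separate lines, unquoted
    "-xmp-dc:subject<${xmp-dc:subject;NoDups(1)}",
])
print(text, end="")
```

The -common_args option is deliberately not written by this helper, since it has to stay on the command line.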

- Phil
Title: Re: Merge Keywords
Post by: Mac2 on June 11, 2023, 07:22:35 AM
Thanks, Phil!

The -sep issue was just me making some experiments. My application creates arg files dynamically, based on user options etc., which makes my use case a bit more complex. The args I'm looking for are used when metadata is copied in a controlled way from one file to one or more other files. Many other arguments, tag groups etc. might come into play here.

Merging keywords without duplication seems to be a lot more complex than I thought.

Probably it would be best to check this in advance (my software has all keywords from all involved files in its database) to see if there are duplicates. If there are none, I can run the args file from my last test (which worked in general but caused duplicates). Otherwise, I copy everything except keywords in one command and then run separate commands to copy the non-duplicate keywords into each target file.

Need to think about this for a bit.

Just noticed that you have released 12.63 with support for GUANO metadata. This will make some scientists happy :)
Title: Re: Merge Keywords
Post by: Mac2 on June 11, 2023, 12:35:32 PM
-sep
 ##
-xmp-dc:subject<${xmp-dc:subject;NoDups(1)}
-xmp-lr:hierarchicalSubject<${xmp-lr:hierarchicalSubject;NoDups(1)}
-m
A.jpg

Running it once removes duplicate keywords from A.jpg.
Running it again wipes all keywords.
Am I missing a FAQ (again) or...?
Title: Re: Merge Keywords
Post by: Mac2 on June 12, 2023, 02:54:50 AM
I have solved this in code now.
My software knows the keywords of the source and target files, and just tells ExifTool the keywords to add to the target in the args file. Works well.
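
The code itself was not posted, so the following Python is only a guess at the general idea: compare the source and target keyword lists in the calling application and emit one -TAG+=value line per keyword that is missing from the target. The function name and the sample keywords are made up for illustration; the +=  syntax is the same one used elsewhere in this thread.

```python
def keyword_add_args(tag, source_keywords, target_keywords):
    """Build '-TAG+=keyword' arguments for keywords missing from the target.

    Hypothetical helper; assumes the caller already knows both keyword lists.
    """
    existing = set(target_keywords)
    args = []
    for keyword in source_keywords:
        if keyword not in existing:
            args.append(f"-{tag}+={keyword}")
            existing.add(keyword)  # also skips repeats within the source list
    return args


args = keyword_add_args(
    "xmp-dc:subject",
    ["alpha", "beta", "alpha"],   # keywords in file A
    ["beta", "gamma"],            # keywords already in file B
)
print("\n".join(args + ["B.jpg"]))
# -xmp-dc:subject+=alpha
# B.jpg
```

Because only missing keywords are ever added, running the generated argfile a second time with refreshed keyword lists would add nothing.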
Title: Re: Merge Keywords
Post by: Phil Harvey on June 12, 2023, 06:48:10 AM
Quote from: Mac2 on June 11, 2023, 12:35:32 PM
Running it once removes duplicate keywords from A.jpg.
Running it again wipes all keywords.

I get this, which works as expected (the 2nd pass shouldn't touch the file):

> cat a.args
-sep
 ##
-xmp-dc:subject<${xmp-dc:subject;NoDups(1)}
-xmp-lr:hierarchicalSubject<${xmp-lr:hierarchicalSubject;NoDups(1)}
> exiftool a.jpg -subject
Subject                         : a, b, c, b
> exiftool a.jpg -@ a.args
Warning: [minor] Tag 'xmp-lr:hierarchicalSubject' not defined - a.jpg
    1 image files updated
> exiftool a.jpg -subject
Subject                         : a, b, c
> exiftool a.jpg -@ a.args
Warning: [minor] Advanced formatting expression returned undef for 'xmp-dc:subject' - a.jpg
Warning: [minor] Tag 'xmp-lr:hierarchicalSubject' not defined - a.jpg
Warning: No writable tags set from a.jpg
    0 image files updated
    1 image files unchanged
> exiftool a.jpg -subject
Subject                         : a, b, c

- Phil
Title: Re: Merge Keywords
Post by: Mac2 on June 12, 2023, 07:40:22 AM
Strange. I can reproduce this here:

exiftool -xmp-lr:all "c:\temp\test.jpg"
Hierarchical Subject            : alpha, alpha, beta, beta, alpha

exiftool -@ test.args

exiftool -xmp-lr:all "c:\temp\test.jpg"
Hierarchical Subject            : alpha, beta

exiftool -@ test.args

exiftool -xmp-lr:all "c:\temp\test.jpg"
Hierarchical Subject            :

My test.args:

-overwrite_original_in_place
-charset
FILENAME=UTF8
-sep
 ##
-xmp-dc:subject<${xmp-dc:subject;NoDups(1)}
-xmp-lr:hierarchicalSubject<${xmp-lr:hierarchicalSubject;NoDups(1)}
-m
c:\temp\test.jpg

I see no difference to your test arguments, except for the overwrite_original_in_place and the charset.
I've created a fresh test.jpg in an image editor for this test.

To add the keywords I've used

exiftool -overwrite_original_in_place -xmp-lr:hierarchicalSubject+="alpha" "c:\temp\test.jpg"
several times to add the duplicate keywords.
Title: Re: Merge Keywords
Post by: Phil Harvey on June 12, 2023, 11:54:22 AM
Right.  -m is the difference since this causes an empty value to be inserted for undefined tags, and "NoDups(1)" returns an undefined value if nothing changed.  So you should use "NoDups()" instead.  This should behave as you expected, but will rewrite the file even if nothing changed:

-overwrite_original_in_place
-charset
FILENAME=UTF8
-sep
 ##
-xmp-dc:subject<${xmp-dc:subject;NoDups()}
-xmp-lr:hierarchicalSubject<${xmp-lr:hierarchicalSubject;NoDups()}
-m
c:\temp\test.jpg

(An interesting interaction between the options, which could be hard to anticipate.)
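
For anyone trying to picture the interaction, here is a small Python toy model of it. Both functions are hypothetical stand-ins written for this illustration, not ExifTool code: None plays the role of the undefined value, and the ignore_minor flag plays the role of -m.

```python
def no_dups(items, only_if_changed=False):
    """Toy stand-in for NoDups() and NoDups(1) on a list of keywords."""
    unique = list(dict.fromkeys(items))  # keep first occurrences, in order
    if only_if_changed and unique == list(items):
        return None  # models the undefined value when nothing changed
    return unique


def value_to_write(result, ignore_minor):
    """Return the list to write, or None to leave the tag untouched."""
    if result is None:
        # With -m the undefined value becomes an empty value, wiping the list.
        return [] if ignore_minor else None
    return result


start = ["alpha", "alpha", "beta", "beta", "alpha"]
first = value_to_write(no_dups(start, only_if_changed=True), ignore_minor=True)
second = value_to_write(no_dups(first, only_if_changed=True), ignore_minor=True)
print(first)   # ['alpha', 'beta']
print(second)  # []  (the second pass wipes the keywords)

# Same second pass without -m: nothing is written.
print(value_to_write(no_dups(first, only_if_changed=True), ignore_minor=False))  # None

# Same second pass with NoDups() and -m: the list is rewritten unchanged.
print(value_to_write(no_dups(first), ignore_minor=True))  # ['alpha', 'beta']
```

The three cases line up with the outputs posted earlier in the thread: wiped with NoDups(1) plus -m, untouched with NoDups(1) alone, and preserved with NoDups() plus -m.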

- Phil
Title: Re: Merge Keywords
Post by: Mac2 on June 12, 2023, 12:47:53 PM
Interesting. Thanks for clarifying this.

My code emits the -m to clean up noise from the ExifTool output before parsing it for real warnings and errors, and to inform the user when needed.

I understood the NoDups(1) behavior from the documentation, but I never thought about -m causing any side effect with these special constructs. Learn something new every day.

My code now emits the keywords to add to the target files and I no longer need these special NoDups constructs. Works well, including the decision whether or not to also update legacy IPTC keywords. My software maintains existing legacy IPTC metadata but no longer creates it.

ExifTool still does most of the work, though :D
Title: Re: Merge Keywords
Post by: Phil Harvey on June 12, 2023, 01:44:40 PM
Great.  This discussion has pointed out a deficiency in ExifTool:  There is no way to guard against duplicate items being added to the queued values for a list.  I'll think about this -- I may be able to add an option to address this situation.

- Phil

Edit:  It turns out that it was fairly easy to add this feature.  An API "NoDups" option will appear in version 12.64 that will eliminate duplicate items from queued values for List-type tags.
Title: Re: Merge Keywords
Post by: Mac2 on June 13, 2023, 05:44:01 AM
Awesome! :) :)