Copy RegionPersonDisplayName to Keywords preserve Keywords and avoid duplicates

Started by grasdk, November 02, 2023, 11:27:04 AM


grasdk

Hi

Thanks for a great tool. The world is a better place because of it ;)

I have this scenario that I hope someone can help me solve:
I have a huge collection of photos with keyword-tags, including people, but also descriptive keywords like "Holiday". Recently I began using the face recognition features of digiKam (open source photo management and more) and another "freemium" tool. I now ended up with the face recognition discovering people I missed in the keyword-tagging. Due to tool configuration and other considerations, these face tags have not all been "mirrored" to the keywords-list. So I want to remedy that.


  • So I need to be able to copy RegionPersonDisplayName to Keywords, but avoid duplicates (if name is already present) and preserve descriptive keywords like "Holiday". I have no guarantee on order of either Keywords or RegionPersonDisplayName.
  • Secondarily, I also need to find the photos where a RegionPersonDisplayName exists that is not in Keywords. This should avoid unnecessary updates of pictures where everything is already in order.

1)
I found this topic to get me started on the first requirement, but I ran into the problem shown below:

Start:
$ exiftool -RegionPersonDisplayName -keywords 2023-08-27-120000.jpg
Region Person Display Name      : Person Carrying ChildInBlueShirt, Person Carrying ChildInYellowDress, RedHairedPerson SittingOnBench
Keywords                        : Holiday, RedHairedPerson SittingOnBench, Person Carrying ChildInYellowDress

As you can see, Person Carrying ChildInBlueShirt is missing from keywords, and keywords and region person display name are in different orders.

So I want to add to Keywords (dynamically). Remove first, then add:
$ exiftool "-Keywords-<RegionPersonDisplayName" "-Keywords+<RegionPersonDisplayName" 2023-08-27-120000.jpg

Result is unfortunately with duplicates.

$ exiftool -RegionPersonDisplayName -keywords 2023-08-27-120000.jpg
Region Person Display Name      : Person Carrying ChildInBlueShirt, Person Carrying ChildInYellowDress, RedHairedPerson SittingOnBench
Keywords                        : Holiday, RedHairedPerson SittingOnBench, Person Carrying ChildInYellowDress, Person Carrying ChildInBlueShirt, Person Carrying ChildInYellowDress, RedHairedPerson SittingOnBench

Is it possible to adjust the above command to fix this?



exiftool version 12.69 on WSL2 - Ubuntu 22.04.03 under Windows 10 Pro 22H2. Exiftool installed with sudo make install

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.3 LTS
Release:        22.04
Codename:       jammy
$ exiftool -ver
12.69


I get the same results with exiftool 11.40 on my Synology NAS and on exiftool 12.69 for Windows. I also tried the full tag naming:
$ exiftool "-IPTC:Keywords-<XMP:regionpersondisplayname" "-IPTC:Keywords+<XMP:regionpersondisplayname" 2023-08-27-120000.jpg

Same result.



Attached a wikimedia commons picture that I custom tagged for test-purposes.



2)
Is there a way to use something similar to "-RegionPersonDisplayName-<Keywords is not empty" as an exiftool if-condition? Subtracting Keywords from RegionPersonDisplayName not being empty would mean that a face exists that is not also a keyword-tag. But I need to know this without altering the photo.



Thank you very much for any help. Alternate solutions are welcome, maybe I'm on the completely wrong track? :)

Kind regards.

StarGeek

I don't have time to go over this much atm, but removing and adding a list type tag to another has some subtleties to be accounted for.

The quick version is to put a leading plus sign + in front of the tag name.
exiftool "-IPTC:Keywords-<XMP:regionpersondisplayname" "-+IPTC:Keywords+<XMP:regionpersondisplayname" 2023-08-27-120000.jpg

See the last few paragraphs of FAQ #17, the part starting "Note there is a complication when copying multiple tags to a single list tag"
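
The intent of the remove-then-add pair can be modelled in a few lines of Python. This is only a sketch of the list semantics the command is aiming for, not a description of how exiftool works internally; `remove_then_add` is a made-up helper name.

```python
def remove_then_add(keywords, names):
    """Model of '-Keywords-<NAMES' followed by '-+Keywords+<NAMES' (illustrative only)."""
    # Step 1: drop every existing keyword that is also one of the names to copy.
    kept = [k for k in keywords if k not in names]
    # Step 2: append the names, so each copied name ends up in the list exactly once.
    return kept + list(names)


keywords = ["Holiday", "RedHairedPerson SittingOnBench",
            "Person Carrying ChildInYellowDress"]
faces = ["Person Carrying ChildInBlueShirt",
         "Person Carrying ChildInYellowDress",
         "RedHairedPerson SittingOnBench"]

result = remove_then_add(keywords, faces)
print(result)
```

Descriptive keywords such as "Holiday" survive because they never match a name, and running the function again on its own output leaves the list unchanged.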

As for part two, it would be difficult to do on the command line.  A user-defined tag would be better in this case.  I think I have one that could be adapted to this, as it compares Subject to HierarchicalSubject.  I should be able to pull that out later today unless someone comes up with something first.
"It didn't work" isn't helpful. What was the exact command used and the output.
Read FAQ #3 and use that cmd
Please use the Code button for exiftool output

Please include your OS/Exiftool version/filetype

grasdk

Quote from: StarGeek on November 02, 2023, 12:22:34 PM
The quick version is to put a leading plus sign + in front of the tag name.
exiftool "-IPTC:Keywords-<XMP:regionpersondisplayname" "-+IPTC:Keywords+<XMP:regionpersondisplayname" 2023-08-27-120000.jpg

See the last few paragraphs of FAQ #17, the part starting "Note there is a complication when copying multiple tags to a single list tag"

1)
Thank you so much! I actually did read that, but didn't understand it. And I must have messed up my attempt to use it, before reverting to basics. This works! :)

2)
For the record, I did a quick test of the idempotency of this command, and it is absolutely harmless to photos that are already in order. In other words, you can run it again and again and the result will be the same. So I could just run it on my full collection. However it would be wasteful, since I have around 100k photos, and I expect less than 5% to need correction. Reading metadata on 100k and writing only 5k is definitely better than reading AND writing 100k photo's metadata.

So anything goes... Thank you in advance, if you look something up StarGeek. I'm not in a hurry, so no rush :)

I am also considering combining exiftool with some output files and using a programming language like Python with support for set calculus:
if ((regionpersondisplayname MINUS keywords) IS NOT EMPTY) print filename
I'll post it as a response if I end up resorting to that.
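
A minimal sketch of that set-based idea in Python, assuming the metadata has first been dumped with something like `exiftool -json -r -RegionPersonDisplayName -Keywords DIR > tags.json`. The helper names and the file name `already-ok.jpg` are hypothetical, and the sample data is inlined so the snippet runs without exiftool.

```python
import json


def as_list(value):
    """exiftool -json gives a bare value for one item and a list for several."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def files_needing_update(records):
    """Map each file to the face names that are missing from its Keywords."""
    todo = {}
    for rec in records:
        # str() guards against names that the JSON output renders as numbers.
        faces = {str(v) for v in as_list(rec.get("RegionPersonDisplayName"))}
        keywords = {str(v) for v in as_list(rec.get("Keywords"))}
        missing = faces - keywords  # regionpersondisplayname MINUS keywords
        if missing:
            todo[rec["SourceFile"]] = sorted(missing)
    return todo


# Inlined sample in the shape of exiftool's JSON output (second record is made up).
sample = json.loads("""
[
  {"SourceFile": "2023-08-27-120000.jpg",
   "RegionPersonDisplayName": ["Person Carrying ChildInBlueShirt",
                               "Person Carrying ChildInYellowDress",
                               "RedHairedPerson SittingOnBench"],
   "Keywords": ["Holiday", "RedHairedPerson SittingOnBench",
                "Person Carrying ChildInYellowDress"]},
  {"SourceFile": "already-ok.jpg",
   "RegionPersonDisplayName": "Indiana Jones",
   "Keywords": "Indiana Jones"}
]
""")

todo = files_needing_update(sample)
for path, names in todo.items():
    print(path, names)
```

Only files with a non-empty difference are reported, so the list could be fed back to exiftool to limit writing to the photos that need it.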

StarGeek

Turns out I already had a tag which looked for region names that weren't in Keywords or Subject.  I took some time to improve it and had to change from RegionName (MWG regions) to RegionPersonDisplayName (Microsoft regions).

Additionally, you can avoid dealing with -< +< because this tag will only list names that aren't already in the keywords.

Here's a complete config file.  If you already have a default config file, you can cut the tag section to add to your own config file, but you must also copy the two subroutines that are at the top.


# Removes items in array 1 from array 2,
# takes two array refs as input, returns new array
sub Diff_Array{
    my @InputArr = @{+shift};
    my @MainArr = @{+shift};
    my %hash;

    #initialize hash index
    @hash{@InputArr} = undef;
   
    @MainArr = grep {not exists $hash{$_}} @MainArr;
    return (@MainArr ? \@MainArr : undef);
}

# Removes duplicates from an array
sub uniq {
    my %seen;
    grep !$seen{$_}++, @_;
}

%Image::ExifTool::UserDefined = (
    'Image::ExifTool::Composite' => {
        ################# Cut here #################
        FacesNotInKeywords => {
        # Lists the faces (RegionPersonDisplayName) that are missing from the combined
        # Keywords/Subject list, i.e. faces that are not duplicated to keywords
            Require => {
                0 => 'RegionPersonDisplayName',
            },
            Desire => {
                1 => 'Keywords',
                2 => 'Subject',
            },
            ValueConv => q{
                # Region names (single value or list)
                my @Faces        = ref $val[0] eq 'ARRAY' ? @{$val[0]} : ($val[0]);
                # Combine Keywords and Subject
                my @list  = (ref $val[1] eq 'ARRAY' ? @{$val[1]} : (defined $val[1] ?( $val[1] ):()), ref $val[2] eq 'ARRAY' ? @{$val[2]} : (defined $val[2] ?( $val[2] ):()) );
                # remove duplicates from both lists
                @list = uniq(@list);
                @Faces= uniq(@Faces);
                # Diff_Array returns undef when every face is already a keyword,
                # so guard the dereference
                my $diff = Diff_Array(\@list,\@Faces);
                my @NewList = $diff ? @{$diff} : ();
                # remove blank lines
                @NewList = grep(/\S/, @NewList);
                return @NewList ? \@NewList : undef;
            }
        },
        ################# Cut here #################
    },
);
#------------------------------------------------------------------------------
1;  #end

Example output, the Shift value warning can be ignored or you can specify XMP-dc:Subject to avoid it.
C:\>exiftool -config test4.config -G1 -a -s -Subject -RegionPersonDisplayName -FacesNotInKeywords Y:\!temp\x\y\2023-08-27-120000.jpg
[XMP-dc]        Subject                        : Holiday, Person Carrying ChildInYellowDress, RedHairedPerson SittingOnBench
[XMP-MP]        RegionPersonDisplayName        : Person Carrying ChildInYellowDress, RedHairedPerson SittingOnBench, yet another person, Indiana Jones
[Composite]    FacesNotInKeywords              : yet another person, Indiana Jones

C:\>exiftool -config test4.config -P -overwrite_original "-Subject+<FacesNotInKeywords" Y:\!temp\x\y\2023-08-27-120000.jpg
Warning: Shift value for XMP-pdf:Subject is not a number - Y:/!temp/x/y/2023-08-27-120000.jpg
    1 image files updated

C:\>exiftool -config test4.config -G1 -a -s -Subject -RegionPersonDisplayName -FacesNotInKeywords Y:\!temp\x\y\2023-08-27-120000.jpg
[XMP-dc]        Subject                        : Holiday, Person Carrying ChildInYellowDress, RedHairedPerson SittingOnBench, yet another person, Indiana Jones
[XMP-MP]        RegionPersonDisplayName        : Person Carrying ChildInYellowDress, RedHairedPerson SittingOnBench, yet another person, Indiana Jones

And since the tag will be undefined if there is nothing to copy, you can run it again on the file and it will be skipped
C:\>exiftool -config test4.config -P -overwrite_original "-Subject+<FacesNotInKeywords" Y:\!temp\x\y\2023-08-27-120000.jpg
Warning: No writable tags set from Y:/!temp/x/y/2023-08-27-120000.jpg
    0 image files updated
    1 image files unchanged

If you are trying to add/remove additional keywords in the same command as this, you will need to use -+Subject+<FacesNotInKeywords.

grasdk

Thanks a lot. I realize that I didn't provide a good example in the attachment, because Subject wasn't "cleaned" for the example.

I got the config working with a proper example. Thanks!

grasdk

Hmm, coming back again with a question.

I'm not sure how to debug the config-file contents... but it appears that when the custom tag list is empty "undef" is not returned.

Starting off with an already fixed file (using the command in the original answer, with extra arguments to fix the subject tag):

$ exiftool -config ../facesnotinkeywords.config -r -Subject -Keywords -RegionPersonDisplayName -FacesNotInKeywords 2023-08-27-120000-fixed.jpg
I do not get any output from -FacesNotInKeywords:
$ exiftool -config ../facesnotinkeywords.config -r -Subject -Keywords -RegionPersonDisplayName -FacesNotInKeywords 2023-08-27-120000-fixed.jpg
Subject                         : Holiday, Person Carrying ChildInBlueShirt, Person Carrying ChildInYellowDress, RedHairedPerson SittingOnBench
Keywords                        : Holiday, Person Carrying ChildInBlueShirt, Person Carrying ChildInYellowDress, RedHairedPerson SittingOnBench
Region Person Display Name      : Person Carrying ChildInBlueShirt, Person Carrying ChildInYellowDress, RedHairedPerson SittingOnBench

This leads me to believe that I could list the files in need of change using the conditional that FacesNotInKeywords is not defined:
$ exiftool -config ../facesnotinkeywords.config -r -Subject -Keywords -RegionPersonDisplayName -FacesNotInKeywords -if 'undef FacesNotInKeywords' 2023-08-27-120000-fixed.jpg

However I get
    1 files failed condition instead

I did a bit of experimenting, altering this line
return @NewList ? \@NewList : undef;
to
return 'X';
which should always return 'X', but it doesn't when using the if conditional on a file where there are faces that are not also in keywords / subject.

No idea yet, what goes wrong - assuming some kind of exception on array operations. I'm a total newbie in this, but I will try and experiment more. Unless anyone reading here will be faster than me ;)

Thanks for all the help so far. Really appreciated.

Kind regards, Chris


PS: My experiment:

Extra functions in the config file:
        ################# Cut here #################
        FacesNotInKeywords => {
        # Takes combined keywords/subject, removes any of them that also are faces, leaving faces that
        # are not duplicated to keywords
            Require => {
                0 => 'RegionPersonDisplayName',
            },
            Desire => {
                1 => 'Keywords',
                2 => 'Subject',
            },
            ValueConv => q{
                # combine Keywords and Subject and remove duplicates
                my @Faces        = ref $val[0] eq 'ARRAY' ? @{$val[0]} : ($val[0]);
                # Combine Keywords and Subject
                my @list  = (ref $val[1] eq 'ARRAY' ? @{$val[1]} : (defined $val[1] ?( $val[1] ):()), ref $val[2] eq 'ARRAY' ? @{$val[2]} : (defined $val[2] ?( $val[2] ):()) );
                @list = uniq(@list);
                @Faces= uniq(@Faces);
                my @NewList = @{Diff_Array(\@list,\@Faces)} ;
                # remove blank lines
                @NewList = grep(/\S/, @NewList);
                return @NewList ? \@NewList : undef;
            }
        },
FacesNotInKeywordsX => {
        # Takes combined keywords/subject, removes any of them that also are faces, leaving faces that
        # are not duplicated to keywords
            Require => {
                0 => 'RegionPersonDisplayName',
            },
            Desire => {
                1 => 'Keywords',
                2 => 'Subject',
            },
            ValueConv => q{
                # combine Keywords and Subject and remove duplicates
                my @Faces        = ref $val[0] eq 'ARRAY' ? @{$val[0]} : ($val[0]);
                # Combine Keywords and Subject
                my @list  = (ref $val[1] eq 'ARRAY' ? @{$val[1]} : (defined $val[1] ?( $val[1] ):()), ref $val[2] eq 'ARRAY' ? @{$val[2]} : (defined $val[2] ?( $val[2] ):()) );
                @list = uniq(@list);
                @Faces= uniq(@Faces);
                my @NewList = @{Diff_Array(\@list,\@Faces)} ;
                # remove blank lines
                @NewList = grep(/\S/, @NewList);
                return 'X';
            }
        },
FacesNotInKeywordsY => {
        # Takes combined keywords/subject, removes any of them that also are faces, leaving faces that
        # are not duplicated to keywords
            Require => {
                0 => 'RegionPersonDisplayName',
            },
            Desire => {
                1 => 'Keywords',
                2 => 'Subject',
            },
            ValueConv => q{
                return 'Y';
            }
        },
        ################# Cut here #################

Added a Y-version that always returns Y, and it does! :)

$ exiftool -config ../facesnotinkeywords.config -r -Subject -Keywords -RegionPersonDisplayName -FacesNotInKeywordsX -FacesNotInKeywordsY 2023-08-27-120000-fixed.jpg
Subject                         : Holiday, Person Carrying ChildInBlueShirt, Person Carrying ChildInYellowDress, RedHairedPerson SittingOnBench
Keywords                        : Holiday, Person Carrying ChildInBlueShirt, Person Carrying ChildInYellowDress, RedHairedPerson SittingOnBench
Region Person Display Name      : Person Carrying ChildInBlueShirt, Person Carrying ChildInYellowDress, RedHairedPerson SittingOnBench
Faces Not In Keywords Y         : Y

StarGeek

Quote from: grasdk on November 03, 2023, 03:22:05 PM
I'm not sure how to debug the config-file contents... but it appears that when the custom tag list is empty "undef" is not returned.

Starting off with an already fixed file (using the command in the original answer, with extra arguments to fix the subject tag):

$ exiftool -config ../facesnotinkeywords.config -r -Subject -Keywords -RegionPersonDisplayName -FacesNotInKeywords 2023-08-27-120000-fixed.jpg
I do not get any output from -FacesNotInKeywords:

That is as intended.  All the region names have matches in Keywords and/or Subject, so the result is an empty set. This is when undef is the returned value.

Quote
This leads me to believe that I could list the files in need of change using the conditional that FacesNotInKeywords is not defined:
$ exiftool -config ../facesnotinkeywords.config -r -Subject -Keywords -RegionPersonDisplayName -FacesNotInKeywords -if 'undef FacesNotInKeywords' 2023-08-27-120000-fixed.jpg

However I get
    1 files failed condition

This will always fail as undef FacesNotInKeywords (should be undef $FacesNotInKeywords) is setting the value of $FacesNotInKeywords to be undefined.  This is why your return 'X'; fails.

The proper -if would be
-if "defined $FacesNotInKeywords"
but this can be simplified to just
-if "$FacesNotInKeywords"
for 99.999+% of all cases (use single quotes instead of double quotes on Linux/Mac, so the shell does not expand the $).  The only time (I think) this would fail would be if there is only one region name and it is equal to "0".
C:\>exiftool -config test4.config -G1 -a -s -Subject -RegionPersonDisplayName -FacesNotInKeywords -sep ## Y:\!temp\x\y\2023-08-27-120000.jpg
[XMP-dc]        Subject                         : Holiday##Person Carrying ChildInBlueShirt##Person Carrying ChildInYellowDress##RedHairedPerson SittingOnBench
[XMP-MP]        RegionPersonDisplayName         : 0
[Composite]     FacesNotInKeywords              : 0

C:\>exiftool  -config test4.config -G1 -a -s -if "$FacesNotInKeywords"  -Subject -keywords -RegionPersonDisplayName -FacesNotInKeywords -sep ## Y:\!temp\x\y\2023-08-27-120000.jpg
    1 files failed condition

Technically, there are three times when an -if "$TAG" would fail.  When the tag does not exist (is undefined), when it is equal to 0, or when it is an empty, 0-length string.  The last case cannot happen in this tag because any empty or all white space string is stripped out by the grep line just before the return.
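
The three failing cases can be illustrated with a small Python model of the truth test. This is a sketch under the assumption that the tag value reaches the condition as a string, with `None` standing in for an undefined tag; `perl_truthy` and `perl_defined` are made-up names.

```python
def perl_defined(value):
    """Model of -if 'defined $TAG': true whenever the tag exists at all."""
    return value is not None


def perl_truthy(value):
    """Model of -if '$TAG' for a string value (None = tag does not exist)."""
    if value is None:
        return False                 # case 1: tag is undefined
    return value not in ("0", "")    # case 2: equal to 0, case 3: empty string


for candidate in (None, "0", "", "Indiana Jones"):
    print(repr(candidate), perl_defined(candidate), perl_truthy(candidate))
```

The two tests only disagree for "0" and the empty string, which is why the shorter condition is safe for ordinary names.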

The one other possible point of failure would be if Keywords and Subject were not in sync but together still contained all the region names. FacesNotInKeywords is then undefined even though each tag on its own is missing a name:
C:\>exiftool  -config test4.config -G1 -a -s -Subject -keywords -RegionPersonDisplayName -FacesNotInKeywords -sep ## Y:\!temp\x\y\2023-08-27-120000.jpg
[XMP-dc]        Subject                         : Person Carrying ChildInBlueShirt
[IPTC]          Keywords                        : Person Carrying ChildInYellowDress
[XMP-MP]        RegionPersonDisplayName         : Person Carrying ChildInYellowDress##Person Carrying ChildInBlueShirt
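
One way to catch the out-of-sync case is to compare the region names against Keywords and Subject separately as well as combined. The Python sketch below uses the values from the last example above; `missing_per_tag` is a hypothetical helper, not part of the config file.

```python
def missing_per_tag(faces, keywords, subject):
    """List the faces missing from the combined list and from each tag on its own."""
    combined = set(keywords) | set(subject)
    return {
        "combined": [f for f in faces if f not in combined],
        "Keywords": [f for f in faces if f not in keywords],
        "Subject": [f for f in faces if f not in subject],
    }


faces = ["Person Carrying ChildInYellowDress", "Person Carrying ChildInBlueShirt"]
keywords = ["Person Carrying ChildInYellowDress"]
subject = ["Person Carrying ChildInBlueShirt"]

report = missing_per_tag(faces, keywords, subject)
for tag, names in report.items():
    print(tag, names)
```

The combined check comes back empty, which matches the absent FacesNotInKeywords line, while the per-tag checks show that each tag still lacks one name.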

grasdk