cannot update certain keyword tags

Started by outdoormagic, October 11, 2022, 12:39:49 PM


outdoormagic

This rookie (me) is clearly missing the trivially obvious and needs a little help, please. It looks to me like SetNewValue is setting the tags, so I need to find a way to force WriteInfo to write them. Thank you in advance for any pointers.

I have a file (in this case a JPEG) whose keyword tags are full of duplicate keywords.

exiftool -a -G1 -s  -"*Subject" -"*keywords*" FILE.JPG

shows

[XMP-dc]        Subject                         : <comma-separated list with duplicate keywords>
[XMP-dc]        Subject                         : <another comma-separated list with duplicate keywords>
[XMP-lr]        HierarchicalSubject             : <yet another comma-separated list with duplicate keywords>
[IPTC]          Keywords                        : <final comma-separated list with duplicate keywords>

My code goes like this (I stripped away calls to Dumper, other clean up steps, and such for readability):

use Image::ExifTool ':Public';
use List::Util qw(uniq);                                                # provides uniq() used below
our $exifTool = new Image::ExifTool;                                    # open global instance
$exifTool->Options(Composite=>0);                                       # don't load non-writeable Composite tags

my @tags        = qw( Keywords Subject HierarchicalSubject );           # adding groups as group:tag didn't work
my @tagsCopy    = @tags;
my $info        = $exifTool->ImageInfo($file, \@tagsCopy);              # open file in exifTool
$exifTool->SetNewValue();                                               # reset change log

foreach my $tag ( @tags ){
    my $group   = $exifTool->GetGroup($tag);
    my @keywords = split ',', $$info{$tag};                             # convert to list
    @keywords   = uniq @keywords;                                       # remove duplicates

    my @setResult = $exifTool->SetNewValue( $tag => \@keywords, Group=>$group, Replace => 1 );   # update tags
}
my $result = $exifTool->WriteInfo( $file, $updatedFile );  # all tags have been set; write the changes to file


Code runs with WriteInfo returning with $result = 1 and no errors/warnings.

Result:
Tag: Keywords
Group: IPTC
Unique keywords in list: 72
Returned from SetNewValue: # tags set: 72

Tag: Subject
Tag: Subject (1)
Group: XMP
Unique keywords in list: 72
Returned from SetNewValue: # tags set: 145

Tag: HierarchicalSubject
Group: XMP
Unique keywords in list: 27
Returned from SetNewValue: # tags set: 27

The above looks like SetNewValue did its job. (I'm guessing the number of tags set for "Subject" is the total for both instances of the tag, +1 for a keyword that is only spaces and gets removed.)

exiftool -a -G1 -s  -"*Subject" -"*keywords*" UPDATED_FILE.JPG

shows that those changes didn't make it into the updated file.

[XMP-dc]        Subject                         : <comma-separated list with UPDATED keywords>
[XMP-dc]        Subject                         : <UNMODIFIED comma-separated list with duplicate keywords>
[XMP-lr]        HierarchicalSubject             : <UNMODIFIED comma-separated list with duplicate keywords>
[IPTC]          Keywords                        : <comma-separated list with UPDATED keywords>

So Keywords and one of the Subject entries get updated, but the second Subject and HierarchicalSubject don't.

What I tried:
  • Different group syntaxes: '$group:$tag' syntax, 'Group=>$group' syntax, and leaving out the Group altogether in SetNewValue().
  • Replace => 1 and Replace => 0.
  • Writing to tag 'Subject (1)' (i.e., explicitly adding the ' (1)').
  • Deleting the duplicate tag with 'Subject (1)' => undef and 'Subject' => undef. That didn't work, but setting 'Keywords' => undef will delete the 'Keywords' tag completely.
  • Resetting tags with $exifTool->SetNewValue( $tag ) followed by WriteInfo. Again, only deletes 'Keywords'.
  • Deleting all the tags with $tag => undef and calling WriteInfo, then setting the tags to their values prior to another call to WriteInfo. Again, only 'Keywords' get deleted.
  • Ran with @tags as only HierarchicalSubject and WriteInfo returns with Code 2, so no changes made to the file, which is correct: it didn't update the tag.
  • I have no idea why HierarchicalSubject doesn't update at all. It is as if it is not writable, but I don't get any error or warning.
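
For the silent failures listed above, one thing that may help is printing the WriteInfo return code together with whatever ExifTool stored in its Error and Warning tags. The following is only a sketch: report_write is a hypothetical helper name, and $et is assumed to be the Image::ExifTool object that made the WriteInfo call. The return-code meanings are the ones given in the ExifTool API documentation.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a WriteInfo() return code, plus any Error or
# Warning tag set on the object, into a one-line report.
# $et is expected to be an Image::ExifTool object.
sub report_write {
    my ($et, $code) = @_;
    my %meaning = (
        0 => 'write error',
        1 => 'file written',
        2 => 'file written, but no changes made',
    );
    my $msg  = exists $meaning{$code} ? $meaning{$code}
                                      : "unexpected return code $code";
    my $err  = $et->GetValue('Error');      # undef if no error was set
    my $warn = $et->GetValue('Warning');    # undef if no warning was set
    $msg .= "; error: $err"    if defined $err;
    $msg .= "; warning: $warn" if defined $warn;
    return $msg;
}

# Usage (not run here):
#   my $result = $exifTool->WriteInfo($file, $updatedFile);
#   print report_write($exifTool, $result), "\n";
```

Note that GetValue('Warning') returns a single warning, so the command-line check with -a may still show more.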

Most relevant posts that I found and tried to implement:
  • Checked to see that tags are valid tags (b/c I was a little surprised to see XMP tags in a JPEG): https://exiftool.org/TagNames/index.html
  • Found chat about duplicates and Replace options: https://exiftool.org/forum/index.php?topic=7173.0
  • Checked options and return values from SetNewValue: https://exiftool.org/ExifTool.html#SetNewValue
  • Deleting duplicate tags: https://exiftool.org/forum/index.php?topic=4710.0

Phil Harvey

There are problems with your starting file.  Try running this command:

exiftool -validate -warning -a FILE

I think you have XMP stored in a non-standard location somewhere that ExifTool isn't able to write.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

outdoormagic

Hi Phil,
Thanks for the quick reply.

I don't understand most of this, but the '1000 dc:subject items' warning is part of the issue I am trying to fix. Somehow, sometime, I ended up with files with keywords being duplicated dozens and dozens of times (!).

exiftool -validate -warning -a FILE

Validate                        : 10 Warnings (9 minor)
Warning                         : [minor] Non-standard ExifIFD tag 0x882a TimeZoneOffset
Warning                         : [minor] Undefined value for MakerNotes:ClearRetouchValue
Warning                         : [minor] Non-standard IFD0 tag 0xc6d2 PanasonicTitle
Warning                         : [minor] Non-standard IFD0 tag 0xc6d3 PanasonicTitle2
Warning                         : [minor] Boolean value for XMP-xmpRights:Marked should be capitalized
Warning                         : [minor] Boolean value for XMP-photomech:Tagged should be capitalized
Warning                         : [minor] IPTC TimeCreated too short (6 bytes; should be 11)
Warning                         : IPTCDigest is not current. XMP may be out of sync
Warning                         : [Minor] Extracted only 1000 dc:subject items. Ignore minor errors to extract all
Warning                         : [Minor] Extracted only 1000 lr:hierarchicalSubject items. Ignore minor errors to extract all


StarGeek

Quote from: outdoormagic on October 11, 2022, 02:45:06 PM
I don't understand most of this, but the '1000 dc:subject items' warning is part of the issue I am trying to fix. Somehow, sometime, I ended up with files with keywords being duplicated dozens and dozens of times (!).

Wow, that's a lot of keywords.

Exiftool doesn't do any bookkeeping for you and if you tell it to add the same keyword multiple times, it will do so.  On the command line, you can prevent duplicates by structuring your commands as shown in FAQ #17 where it says "To prevent duplication".

Using the Perl API, you would make sure to remove duplicates before writing.
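
A minimal sketch of that de-duplication step in plain Perl follows. dedupe_keywords is a hypothetical helper name, and the sample string is made up. It assumes the keywords arrive as one comma-separated string, as they do when a list-type tag is read with the default options, so each item is also trimmed and empty items are dropped before duplicates are removed.

```perl
use strict;
use warnings;

# Hypothetical helper: split a comma-separated keyword string, trim
# surrounding whitespace, drop empty entries, and remove duplicates.
# The first occurrence of each keyword is kept, in original order.
sub dedupe_keywords {
    my ($joined) = @_;
    my %seen;
    my @unique;
    for my $item (split /,/, $joined) {
        $item =~ s/^\s+|\s+$//g;          # trim both ends
        next unless length $item;         # skip whitespace-only keywords
        push @unique, $item unless $seen{$item}++;
    }
    return @unique;
}

# Made-up sample input with repeats and a whitespace-only entry.
my @clean = dedupe_keywords("coast, Oregon,coast,  , Thor's Well, Oregon");
print join('|', @clean), "\n";
```

Trimming matters here because splitting on a bare comma leaves a leading space on most items, and " Oregon" and "Oregon" would otherwise count as different keywords.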

On the command line, you can use the NoDups helper function to correct the files.
"It didn't work" isn't helpful. What was the exact command used and the output.
Read FAQ #3 and use that cmd
Please use the Code button for exiftool output

Please include your OS/Exiftool version/filetype

outdoormagic

Quote from: StarGeek on October 11, 2022, 07:09:31 PM
Wow, that's a lot of keywords.

Yep. I usually have at most 2 dozen, including names and qualifiers. This file, for some reason, ended up with over 6 thousand! Hence my desperate need for a clean up. My problem seems to be that I cannot overwrite the crazy long erroneous list with the corrected, trimmed down list.

I usually use PhotoMechanic and HoudahGeo (including sometimes alpha/beta releases), but I'm tempted to open the offending photos in Lightroom or Bridge and then overwrite the data. Hopefully the XMP creators will know how to fix it.  ;)

Thank you for the command line examples. I am considering calling them via `exiftool ...` in Perl, but if, as Phil suggested earlier, it is the file itself that is non-standard or corrupted (my words), then I wouldn't expect the command line invocation of exiftool to work any differently than Perl, as the problem isn't exiftool itself.

Phil Harvey

We're getting further away from solving things for you.  The 1000+ Subject items is a whole different problem.  Even with 1000+ items, ExifTool shouldn't have a problem overwriting the tag as long as it writes XMP in this location.  But you need to figure out where the 1000 items came from.  WriteInfo will overwrite existing entries in a file unless you use the AddValue option (which you weren't).  So this means that you must have queued up thousands of items in your SetNewValue calls (i.e., called SetNewValue repeatedly for XMP:Subject without ever setting the Replace option to 1).

But I don't see a warning about non-standard XMP, which I would have expected since XMP:Subject is duplicated somehow.

I think you need to start from a known, good file, then add the keywords then check the file at each stage using the exiftool command-line app.

Or if you want to figure out what is wrong with the existing file and how to fix it, upload it somewhere so we can take a look.

- Phil


outdoormagic

Hi Phil,

This image was tagged in 2019, so I honestly don't remember how the duplicates got in there. I was trying out many DAM options at the time, across platforms, as well as discovering exiftool and perl. It is safe to assume it was user error on my part—either misusing exiftool or being stupid with test software found in cyberspace.

I have put one of the most corrupted images in Dropbox. It is a JPEG from a Lumix G9.

https://www.dropbox.com/s/irdfih7a2nlkych/Thor%27s%20Well_20191015_Thor%E2%80%99s%20Well%20and%20Foreground_6243.JPG?dl=0

Thank you.

- Paul

Phil Harvey

Hi Paul,

For that particular image, the problem is that XMP:Subject is duplicated, which causes a "Minor" warning and prevents any XMP tags from being written without the command-line -m option (or API IgnoreMinorErrors option):

> exiftool "Thor's Well_20191015_Thor's Well and Foreground_6243.JPG" -subject=test
Warning: [minor] Excessive number of items for dc:subject. Processing may be slow - Thor's Well_20191015_Thor's Well and Foreground_6243.JPG
Warning: [minor] Excessive number of items for lr:hierarchicalSubject. Processing may be slow - Thor's Well_20191015_Thor's Well and Foreground_6243.JPG
Warning: [Minor] Duplicate XMP property: dc:subject/rdf:Bag/rdf:li 270 - Thor's Well_20191015_Thor's Well and Foreground_6243.JPG
    0 image files updated
    1 image files unchanged
> exiftool "Thor's Well_20191015_Thor's Well and Foreground_6243.JPG" -subject=test -m
    1 image files updated
> exiftool "Thor's Well_20191015_Thor's Well and Foreground_6243.JPG" -subject
Subject                         : test

Writing IPTC Keywords works fine for this file.

- Phil

outdoormagic

Thanks for checking it out, Phil. I had also noted that IPTC was being updated fine.

I'll try adding $exifTool->Options(IgnoreMinorErrors => 1); to my code and see how that works out. Else, back to a clean version and build up again.
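
For reference, a rough sketch of what that change might look like. rewrite_tags is a hypothetical helper name, and the tag and value in the usage part are placeholders rather than my real keywords. It assumes the destination file does not exist yet, since WriteInfo returns an error instead of overwriting an existing output file.

```perl
use strict;
use warnings;

# Hypothetical helper: queue new values for list-type tags and write them
# from $src to $dst with minor errors ignored (the API counterpart of the
# command-line -m option). $et is an Image::ExifTool object and %$new maps
# a tag name to an array reference of values.
sub rewrite_tags {
    my ($et, $src, $dst, $new) = @_;
    $et->Options(IgnoreMinorErrors => 1);
    $et->SetNewValue();                         # clear anything queued earlier
    for my $tag (sort keys %$new) {
        $et->SetNewValue($tag, $new->{$tag});   # array ref = list of values
    }
    return $et->WriteInfo($src, $dst);
}

# Runs only when a source and a destination file name are given.
if (@ARGV == 2) {
    require Image::ExifTool;
    my $et = Image::ExifTool->new;
    my $rc = rewrite_tags($et, @ARGV, { 'XMP-dc:Subject' => ['test'] });
    print "WriteInfo returned $rc\n";
    my $warn = $et->GetValue('Warning');
    print "Warning: $warn\n" if defined $warn;
}
```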

Again, thank you for your time!