Remove duplicate [XMP-mwg-rs] Face tag names; keep only "smallest" one

Started by josjonkeren, July 27, 2020, 05:27:23 AM

josjonkeren

Hello.

I have tagged my pictures with different software programs over the years. Those programs wrote the face tags to the XMP-mwg-rs group inside the picture file, which is good.
At this time I use Picasa to tag all faces correctly. Although it is old software, I find Picasa to have the easiest and fastest user interface for doing this.
I have also tried Digikam and IMatch.

What I find now is that after confirming faces in Picasa, many of the 15,000+ files have duplicate face tags assigned to them.

Oftentimes, one face tag is displayed at the correct position in the picture, and another at a completely wrong location (a part of the picture where the face does not appear).
The name in the duplicate tag is correct, but its location in the picture is wrong.

Or, both face tags are set at the correct location (over the "face" part of the picture), but one of the rectangles is a bit larger than the other.

One example:
[XMP-mwg-rs]    RegionAppliedToDimensionsW      : 3264
[XMP-mwg-rs]    RegionAppliedToDimensionsH      : 2448
[XMP-mwg-rs]    RegionAppliedToDimensionsUnit   : pixel
[XMP-mwg-rs]    RegionName                      : Ulrike Nagel, Lotta Jonkeren, Ulrike Nagel
[XMP-mwg-rs]    RegionType                      : Face, Face, Face
[XMP-mwg-rs]    RegionAreaX                     : 0.872702, 0.206036, 0.857843
[XMP-mwg-rs]    RegionAreaY                     : 0.839665, 0.696691, 0.860294
[XMP-mwg-rs]    RegionAreaW                     : 0.0719975, 0.0818015, 0.10049
[XMP-mwg-rs]    RegionAreaH                     : 0.0959967, 0.109069, 0.160948
[XMP-mwg-rs]    RegionAreaUnit                  : normalized, normalized, normalized
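
To read this output: exiftool flattens the region list, so the i-th entry on each line belongs to the same region (here the first and third entries are both "Ulrike Nagel"). A minimal Python sketch of regrouping the pasted text into one record per region, with W*H as the box size, could look like this. The function name parse_regions and the SAMPLE text are only for illustration, and it assumes that no name contains ", ", which is the separator in the flattened lists.

```python
# Sketch: regroup exiftool's flattened region lists into one record per region.
# Assumption: names never contain ", " (the separator used in the flattened output).

SAMPLE = """\
[XMP-mwg-rs]    RegionName                      : Ulrike Nagel, Lotta Jonkeren, Ulrike Nagel
[XMP-mwg-rs]    RegionAreaW                     : 0.0719975, 0.0818015, 0.10049
[XMP-mwg-rs]    RegionAreaH                     : 0.0959967, 0.109069, 0.160948
"""


def parse_regions(text):
    """Return a list of dicts (name, w, h, area), one per region."""
    fields = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        left, _, right = line.partition(":")
        tag = left.split()[-1]  # e.g. "RegionName"
        fields[tag] = [value.strip() for value in right.split(",")]

    names = fields["RegionName"]
    widths = [float(v) for v in fields["RegionAreaW"]]
    heights = [float(v) for v in fields["RegionAreaH"]]
    return [
        {"name": n, "w": w, "h": h, "area": w * h}
        for n, w, h in zip(names, widths, heights)
    ]


if __name__ == "__main__":
    for region in parse_regions(SAMPLE):
        print(f'{region["name"]}: area {region["area"]:.5f}')
```

The areas are fractions of the whole picture because RegionAreaUnit is "normalized".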



Is there a way to:
1. find duplicate XMP-mwg-rs face tags; and
2. "magically" know which one of the duplicates is the "wrong" one; and
3. delete the duplicate face tags, keeping only the correct ones?

I understand that maybe number 2. above might be not easy :)
Then, might we, of all duplicate face tags, find the duplicates, and delete the "largest dimension" face tag boxes, so that only the "smallest in dimension" face tags are kept?
Reason is, I find many of the pictures with duplicate face tags have one tag that is correct, and another face tag is defined to a much too large area (maybe half or more of the picture).
If we can delete the largest duplicate face tag, then I think the correct (smaller) ones remain.
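
To make the rule concrete, here is a rough Python sketch of the selection step only: for each name, keep the region with the smallest W*H and drop the others. It is not an exiftool command and it does not write anything back to the file. The helper name keep_smallest and the dict layout are made up for illustration. It also assumes that the same name appearing twice in one picture is always a duplicate, which would be wrong for two different people with the same name.

```python
# Sketch: per name, keep only the region with the smallest area (w * h).
# Assumption: the same name twice in one picture always means a duplicate.


def keep_smallest(regions):
    """Return the regions in original order, at most one per name."""
    best = {}  # name -> (area, index of the smallest region seen so far)
    for index, region in enumerate(regions):
        area = region["w"] * region["h"]
        current = best.get(region["name"])
        if current is None or area < current[0]:
            best[region["name"]] = (area, index)

    keep = {index for _, index in best.values()}
    return [r for i, r in enumerate(regions) if i in keep]


if __name__ == "__main__":
    # Values copied from the example output above.
    example = [
        {"name": "Ulrike Nagel", "x": 0.872702, "y": 0.839665, "w": 0.0719975, "h": 0.0959967},
        {"name": "Lotta Jonkeren", "x": 0.206036, "y": 0.696691, "w": 0.0818015, "h": 0.109069},
        {"name": "Ulrike Nagel", "x": 0.857843, "y": 0.860294, "w": 0.10049, "h": 0.160948},
    ]
    for region in keep_smallest(example):
        print(region["name"], region["w"], region["h"])
```

For the example picture this keeps the first "Ulrike Nagel" box and drops the third, larger one. Writing the reduced list back into the RegionInfo structure is a separate step that is not covered here.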

Has anyone maybe already written a nice exiftool command or script for this?
I have found a "face tag deduplication script" in this forum somewhere, but that only deletes the duplicates, without having any control over which of the duplicates get removed.

Thank you.



Phil Harvey

Where is the "face tag deduplication script" you found?  I don't see it when I search for that phrase.  It could be a good starting point.

What you want is possible, but it will take me some time to explain how to do it, and it may help if some of the work has already been done in the other script.

- Phil

StarGeek

This is a config file I made a few years ago. It simply removes exact duplicates.

Quote from: jonkeren1 on July 27, 2020, 05:27:23 AM
Is there a way to:
1. find duplicate XMP-mwg-rs face tags; and
2. "magically" know which one of the duplicates is the "wrong" one; and
3. delete the duplicate face tags, keeping only the correct ones?

I understand that maybe number 2. above might be not easy :)

I would say number 2 is impossible without actually running face recognition to see whether each region contains a face or not.  I do know that in IMatch5, such a region would show up with a small circle at the top (see "True Manual Face Annotations" on this IMatch help page), but I don't know if it's possible to filter on those faces (I guess it isn't).

Quote: Then, might we, of all duplicate face tags, find the duplicates, and delete the "largest dimension" face tag boxes, so that only the "smallest in dimension" face tags are kept?

While that would be possible, it's not a task I'm willing to invest in.  I've long since forgotten everything I learned about dealing with regions.

Quote: Reason is, I find many of the pictures with duplicate face tags have one tag that is correct, and another face tag is defined to a much too large area (maybe half or more of the picture).

As I'm just now moving from Picasa to IMatch for facial recognition, I will say this.  IMatch has much, much better identification ability (and if this post is to be believed, DigiKam may be even better).  Additionally, its regions are much smaller than Picasa's.  Picasa is almost certainly the source of those very large regions.



josjonkeren

Hi,

Quote: While that would be possible, it's not a task I'm willing to invest in.  I've long since forgotten everything I learned about dealing with regions.
I understand.

Quote: IMatch has much, much better identification ability (and if this post is to be believed, DigiKam may be even better)
Yes, I tried IMatch about a month ago. It works, but the user interface is much slower than Picasa's. IMatch is generally veeerrryyy slow....
Also, I find the user interface (the way that IMatch works with faces) not as easy to use as Picasa's. It is nice that IMatch stores people including birth dates etc., but it is slow and it takes me many more clicks to tag all faces correctly than with Picasa, so I'm staying with the software from the stone age :-)

Thanks.