How to merge keywords of duplicate photos?

Started by Koala, November 22, 2016, 07:05:51 AM


Koala

Hello forum,

I have successfully transferred my photo collection to the Piwigo photo gallery. There I found about 1000 photos that have one or more duplicates: the image is the same, but the keywords differ.
I have added the keyword "duplicate" as EXIF-data to them so that I can identify them again.

What I would like to do next:

  • synchronize keywords between two (or more) duplicate photos, so no keywords are lost
  • as IPTC-subject is sometimes different from EXIF-keywords I would like to synchronize (merge) them as well
  • keep one photo of each duplicate group and delete the rest of the group

Is there a way to perform this with exiftool?

Thanks for your support and regards

Koala

Hayo Baan

You won't be able to do this without some (probably a lot of) scripting, but it is certainly possible.

To begin, how do you identify the duplicate files? Do you have e.g. a file that contains a list of duplicates for each file?
Hayo Baan – Photography
Web: www.hayobaan.nl

Koala

Hello Hayo,

thanks for your reply.
Quote from: Hayo Baan on November 22, 2016, 07:12:01 AM
To begin, how do you identify the duplicate files? Do you have e.g. a file that contains a list of duplicates for each file?
Yes, with dupeGuru https://www.hardcoded.net/dupeguru/ I can generate a CSV file that contains all relevant information,
e.g.:
Group ID,file name,folder,size (KB),resolution,matching %
0,P1050179.JPG,/home/koala/Bilder/Der unendliche Garten/Der_unendliche_Garten,500,2048 x 1536,100
0,P1050179.JPG,/home/koala/Bilder/Der unendliche Garten,500,2048 x 1536,100
1,P1050181.JPG,/home/koala/Bilder/Der unendliche Garten/Der_unendliche_Garten,630,2048 x 1536,100
1,P1050181.JPG,/home/koala/Bilder/Der unendliche Garten,629,2048 x 1536,100
[...]


All files with the same Group ID are duplicates of each other.

Now, the script has to digest this information.
Although I have a rough idea of the concept (some sort of loop inside a loop, some sort of regex, ...), I am unfortunately not familiar enough with scripting (Linux Bash) to manage this on my own.
The full script probably goes beyond the scope of this forum, but could you provide the key lines where exiftool extracts the keywords from all duplicates and writes them into the first file?

That would be a big step forward for me.

Thanks a lot and kind regards

Koala
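
One possible way to digest the dupeGuru CSV, offered as a minimal sketch rather than a finished script: the hypothetical shell function `list_groups` below reads the CSV on standard input and prints one tab-separated line per file with its Group ID, a role, and the full path. It assumes the column order shown in the sample above, no quoted fields, and no commas inside file or folder names. Treating the first file of each group as the one to keep is also only an assumption.

```shell
#!/bin/sh
# Sketch: list every file of every duplicate group from a dupeGuru-style CSV.
# Assumptions: columns are "Group ID,file name,folder,...", fields are not
# quoted, and names contain no commas.
list_groups() {
    awk -F',' '
        # Only rows whose first field is a numeric Group ID are data rows;
        # this skips the header line and anything else.
        $1 ~ /^[0-9]+$/ {
            # The first file seen in a group is kept, the rest are merged into it.
            role = ($1 in seen) ? "merge-then-delete" : "keep"
            seen[$1] = 1
            printf "%s\t%s\t%s/%s\n", $1, role, $3, $2
        }'
}
```

It could be called as `list_groups < duplicates.csv`, where `duplicates.csv` is a placeholder name for the exported file. The output only lists the files; nothing is modified.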

Hayo Baan

What you want to do isn't a straightforward task, but here are some starting points (using Mac/Linux shell quoting, i.e. single quotes):

To add all keywords from another file:
exiftool -addtagsfromfile OTHERFILE -keywords'+<keywords' FILE

To remove duplicates:
exiftool -keywords'<${keywords; my (@a,%h);$h{lc $_} or push(@a,$_),$h{lc $_}=1 foreach split /, /;$_=join ", ",@a}' FILE
Hmm, this doesn't work, because it does not keep the keywords as separate list items. Phil, I'm sure there is a way to have it interpret the comma-separated values as a list again...

StarGeek

Here's my inline remove duplicate command:
exiftool -sep "##" "-keywords<${keywords;my %seen; my $new=join('##',grep { ! $seen{ $_ }++ } split /##/);$_ = $_ ne $new ? $new : undef}" DirOrFile

It throws a minor error "No writable tags" when there are no duplicate keywords, so that might cause confusion.

This variant does the same thing without throwing the error, but it rewrites every file even if there aren't any duplicate keywords:
exiftool -sep "##" "-keywords<${keywords;my %seen;$_=join('##',grep { ! $seen{ $_ }++ } split /##/)}" DirOrFile


"It didn't work" isn't helpful. What was the exact command used and the output.
Read FAQ #3 and use that cmd
Please use the Code button for exiftool output

Please include your OS/Exiftool version/filetype

Phil Harvey

@Hayo:  I think all you needed to do was add -sep ", "

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).
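
To make the list handling in the commands above easier to follow, here is a small stand-alone illustration in plain shell and awk. It is not ExifTool code. The hypothetical function `dedupe_keywords` splits a joined keyword string on a separator, keeps the first occurrence of each keyword, and joins the rest again, which is roughly what the inline Perl expressions do between reading and writing the tag. Like Hayo's expression it compares case-insensitively (his uses `lc`); StarGeek's versions compare the keywords exactly as written.

```shell
#!/bin/sh
# Illustration only: order-preserving, case-insensitive removal of
# duplicate keywords from a separator-joined string.
# $1 = joined keywords, $2 = separator (defaults to ", ").
# Note: awk treats a multi-character separator as a regular expression,
# so separators containing regex metacharacters need escaping.
dedupe_keywords() {
    printf '%s\n' "$1" | awk -v sep="${2:-, }" '{
        n = split($0, kw, sep)
        out = ""
        kept = 0
        for (i = 1; i <= n; i++) {
            key = tolower(kw[i])          # compare without regard to case
            if (key in seen) continue     # already kept an equal keyword
            seen[key] = 1
            out = (kept++ ? out sep : "") kw[i]
        }
        print out
    }'
}
```

For example, `dedupe_keywords 'cat, Dog, cat, dog, bird'` prints `cat, Dog, bird`. The same separator has to be used for splitting and joining, which is why the exiftool commands need the `-sep` option with the separator that the expression itself uses.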

Hayo Baan

Quote from: Phil Harvey on November 23, 2016, 08:04:16 PM
@Hayo:  I think all you needed to do was add -sep ", "

Doh! :o I thought I had tried that, but I guess not...

Anyway, @Koala, do you need more help, or do you think you can work with the steps we gave you so far?

Phil Harvey

This question has come up a few times.  ExifTool 10.51 will add a "NoDups" utility function to make this a bit simpler:

exiftool -sep "##" "-keywords<${keywords;NoDups}" DirOrFile

- Phil
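
Putting the steps from this thread together for a single duplicate group might look like the dry run below. It is a sketch under stated assumptions, not a tested workflow: the function names `sq` and `plan_merge` are hypothetical, the first argument is assumed to be the file to keep, and the script only prints commands instead of running them. The `-addtagsfromfile` line is the one from Hayo's post and the `NoDups` line is Phil's, which needs ExifTool 10.51 or later.

```shell
#!/bin/sh
# Dry run: print the commands that would merge the keywords of one
# duplicate group into the first file and then remove the other files.
# Nothing is executed; review the output before running any of it.

# Wrap a string in single quotes so the printed command can be pasted
# into a shell even when the path contains spaces or quotes.
sq() {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# $1 = file to keep (assumption: the first file of the group),
# remaining arguments = its duplicates.
# The body runs in a subshell so its variables stay local.
plan_merge() (
    keeper=$1
    shift
    for dup in "$@"; do
        # Add the keywords of each duplicate to the keeper's keyword list.
        printf "exiftool -addtagsfromfile %s '-keywords+<keywords' %s\n" \
            "$(sq "$dup")" "$(sq "$keeper")"
    done
    # Remove keywords that now appear more than once.
    printf "exiftool -sep '##' %s %s\n" \
        "'-keywords<\${keywords;NoDups}'" "$(sq "$keeper")"
    # Only after the merge has been checked: delete the duplicates.
    for dup in "$@"; do
        printf 'rm -- %s\n' "$(sq "$dup")"
    done
)
```

Called as `plan_merge KEEPER DUPLICATE...`, it prints one `-addtagsfromfile` command per duplicate, one `NoDups` command for the kept file, and one `rm` line per duplicate. Merging IPTC subject and EXIF keywords, as asked in the first post, is not covered by this sketch.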