Hello forum,
I have successfully transferred my photo collection to the Piwigo photo gallery. There I found ~1000 photos that have one or more duplicates: the images are identical, but their keywords differ.
I have added keyword "duplicate" as EXIF-data to them, to identify them again.
What I would like to do next:
- synchronize keywords between two (or more) duplicate photos, so no keywords are lost
- as IPTC-subject is sometimes different from EXIF-keywords I would like to synchronize (merge) them as well
- keep one photo of each duplicate group and delete the rest of the group
Is there a way to perform this with exiftool?
Thanks for your support and regards
Koala
You won't be able to do this without some (probably a lot of) scripting, but it is certainly possible.
To begin, how do you identify the duplicate files? Do you have e.g. a file that contains a list of duplicates for each file?
Hello Hayo,
thanks for your reply.
Quote from: Hayo Baan on November 22, 2016, 07:12:01 AM
To begin, how do you identify the duplicate files? Do you have e.g. a file that contains a list of duplicates for each file?
Yes, with dupeGuru (https://www.hardcoded.net/dupeguru/) I can generate a CSV file that contains all the relevant information,
e.g.:
Group ID,file name,folder,size (KB),resolution,matching %
0,P1050179.JPG,/home/koala/Bilder/Der unendliche Garten/Der_unendliche_Garten,500,2048 x 1536,100
0,P1050179.JPG,/home/koala/Bilder/Der unendliche Garten,500,2048 x 1536,100
1,P1050181.JPG,/home/koala/Bilder/Der unendliche Garten/Der_unendliche_Garten,630,2048 x 1536,100
1,P1050181.JPG,/home/koala/Bilder/Der unendliche Garten,629,2048 x 1536,100
[...]
All files with the same Group ID are duplicates of each other.
Now, the script has to digest this information.
Although I have a rough idea of the concept (some sort of loop inside a loop, some sort of regex, ...), I am unfortunately not familiar enough with scripting (Linux Bash) to succeed in this challenge.
The full script is probably beyond the scope of this forum, but could you provide the key lines where exiftool extracts the keywords from all duplicates and writes them into the first file?
That would take me a big step forward.
Thanks a lot and kind regards
Koala
What you want to do isn't a straightforward task, but here are some starting points (using Mac/Linux shell quoting):
To add all keywords from another file:
exiftool -addtagsfromfile OTHERFILE -keywords'+<keywords' FILE
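Applied to the dupeGuru CSV shown earlier in the thread, that command could be generated once per duplicate roughly as follows. This is only a sketch under assumptions: the column names are taken from the sample header, the first file of each group is arbitrarily treated as the one to keep, and the function names `group_duplicates` and `merge_commands` are made up for illustration. It only builds and prints the commands, so nothing is modified until you run them yourself.

```python
import csv
import io
import os
from collections import defaultdict

def group_duplicates(csv_text):
    """Map each dupeGuru Group ID to the full paths of its files, in CSV order."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[row["Group ID"]].append(os.path.join(row["folder"], row["file name"]))
    return dict(groups)

def merge_commands(groups):
    """Build one exiftool argument list per duplicate in each group.

    Assumption: the first file listed in a group is the "keeper" that
    receives the keywords of all other files in that group.
    """
    commands = []
    for paths in groups.values():
        keeper, others = paths[0], paths[1:]
        for other in others:
            # No shell quoting needed: each list element is passed as one argument.
            commands.append(
                ["exiftool", "-addtagsfromfile", other, "-keywords+<keywords", keeper]
            )
    return commands

# Two rows copied from the sample CSV; a real run would read the exported file
# instead, e.g. open(path, newline="").read().
SAMPLE = """Group ID,file name,folder,size (KB),resolution,matching %
0,P1050179.JPG,/home/koala/Bilder/Der unendliche Garten/Der_unendliche_Garten,500,2048 x 1536,100
0,P1050179.JPG,/home/koala/Bilder/Der unendliche Garten,500,2048 x 1536,100
"""

if __name__ == "__main__":
    for cmd in merge_commands(group_duplicates(SAMPLE)):
        print(cmd)  # review first; subprocess.run(cmd, check=True) would execute it
```

Passing the arguments as a list avoids quoting problems with the spaces in folder names such as "Der unendliche Garten".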
To remove duplicates:
exiftool -keywords'<${keywords; my (@a,%h);$h{lc $_} or push(@a,$_),$h{lc $_}=1 foreach split /, /;$_=join ", ",@a}'
Hmm, this doesn't work, as it will not keep the keywords as separate keywords. Phil, I'm sure there is a way to have it interpret the comma-separated values as a list again...
Here's my inline remove duplicate command:
exiftool -sep "##" "-keywords<${keywords;my %seen; my $new=join('##',grep { ! $seen{ $_ }++ } split /##/);$_ = $_ ne $new ? $new : undef}" DirOrFile
It throws a minor error "No writable tags" when there are no duplicate keywords, so that might cause confusion.
This version does the same thing without throwing an error, but it rewrites everything even if there aren't any duplicate keywords:
exiftool -sep "##" "-keywords<${keywords;my %seen;$_=join('##',grep { ! $seen{ $_ }++ } split /##/)}" DirOrFile
@Hayo: I think all you needed to do was add -sep ", "
- Phil
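For anyone who doesn't read Perl: the expressions above split the joined keyword string on the separator, keep only the first occurrence of each keyword, and join the result again. A rough Python equivalent of that logic, purely as an illustration (the function names are made up and this is not part of exiftool):

```python
def no_dups(keywords):
    """Keep the first occurrence of each keyword, preserving order (case-sensitive)."""
    seen = set()
    result = []
    for kw in keywords:
        if kw not in seen:
            seen.add(kw)
            result.append(kw)
    return result

def no_dups_ignore_case(keywords):
    """Same idea, but compare lower-cased keywords, like the `lc $_` variant."""
    seen = set()
    result = []
    for kw in keywords:
        key = kw.lower()
        if key not in seen:
            seen.add(key)
            result.append(kw)  # keep the spelling of the first occurrence
    return result

joined = "garden##Koala##garden##koala"
print(no_dups(joined.split("##")))              # ['garden', 'Koala', 'koala']
print(no_dups_ignore_case(joined.split("##")))  # ['garden', 'Koala']
```

The case-sensitive version corresponds to the `$seen{ $_ }++` commands, the case-insensitive one to the earlier attempt that uses `lc $_`.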
Quote from: Phil Harvey on November 23, 2016, 08:04:16 PM
@Hayo: I think all you needed to do was add -sep ", "
Doh! :o I thought I had tried that, but I guess not...
Anyway, @Koala, do you need more help, or do you think you can work with the steps we gave you so far?
This question has come up a few times. ExifTool 10.51 will add a "NoDups" utility function to make this a bit simpler:
exiftool -sep "##" "-keywords<${keywords;NoDups}" DirOrFile
- Phil