How to remove all tags which do not contain the word 'Top'?

Started by moth3r, July 21, 2013, 07:15:05 AM

moth3r

Hi,

I have a pretty big database of images sorted into subfolders. I would like to remove all tags (keywords) from all of them, through all subfolders, with a rule to keep the keywords that contain the string or word 'Top'.

I have everything tagged that way ("Top-Portraits", "Top-Clothes", "Top-Acts", etc.), so I'd guess I could keep my private tags and automatically clean out all the other, custom ones. I am also using my 'Top' tags to have my albums uploaded automatically via Picasa to Google+ Photos.

[EDITED] I should probably clarify that I would like to nuke only the specific tags which appear as 'keywords' inside Bridge, Lightroom and Picasa. Picasa is probably just writing IPTC Information Interchange Model (IIM) keyword data to JPEG files, but I'm not really sure what the case is with Bridge and Lightroom. [EDITED]

I would really appreciate your help!
Thanks in advance!

Ivan

moth3r

I managed to find a way to clean my keywords, even though some are still left behind.

exiftool -keywords= -iptc:keywords= -exif:xpkeywords= -xmp= -overwrite_original -r "c:\my folder"

I am still looking for the answer to my first post. I'd guess it would take just a minute for someone more experienced with ExifTool, but it would be such a great help for me!

Thanks!

Phil Harvey

I would have suggested

exiftool -keywords= -subject= -xpkeywords= -hierarchicalkeywords= ...

But read FAQs 2 and 3 to learn how to determine the actual tag names for the keywords in your images.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

moth3r

Okay,

If I wanted to nuke all the 'keywords', then this would be perfect, as you've suggested:

exiftool -keywords= -subject= -xpkeywords= -hierarchicalkeywords= -overwrite_original -r "c:\my folder"

but I still don't have the slightest idea how to exclude a specific tag or keyword, in my example the word "Top":

exiftool -keywords= -subject= -xpkeywords= -hierarchicalkeywords= -tagsFromFile @ -title=Top* -caption=Top* -keywords=Top* -overwrite_original -r "c:\my folder"

or

exiftool -keywords= -subject= -xpkeywords= -hierarchicalkeywords=  --subject=Top* --xpkeywords=Top* --hierarchicalkeywords=Top* -overwrite_original -r "c:\my folder"

In any case, I would like to nuke all the keywords, and only the keywords, that do not contain the string or word "Top". Sorry for my ignorance, but I am not that familiar with coding standards.

Many thanks,
Ivan

Phil Harvey

Hi Ivan,

Sorry, I missed that.

To keep the ones with "Top", you will need separate commands:

exiftool -keywords= -if "$keywords !~ /Top/" -overwrite_original -r "c:\my folder"

The -if expression returns true if Keywords does not contain the string "Top" (case sensitive).  You will have to repeat a command like this for each tag you want to delete.
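
Note that the condition is evaluated once per file, against all of that file's keywords joined into a single string. A file with at least one "Top" keyword therefore keeps every keyword, and a file with none loses them all. The Python sketch below only models that behaviour on plain lists, next to a stricter per-keyword filter for comparison. The function names are hypothetical, and nothing here calls ExifTool.

```python
import re

def file_level_result(keywords, pattern="Top"):
    """Model of `-keywords= -if "$keywords !~ /Top/"` for one file.

    The condition sees the keywords as one joined string, so the outcome
    is all-or-nothing: keep every keyword, or delete every keyword.
    """
    joined = ", ".join(keywords)
    return list(keywords) if re.search(pattern, joined) else []

def per_keyword_result(keywords, pattern="Top"):
    """Stricter alternative: keep only the keywords that match."""
    return [k for k in keywords if re.search(pattern, k)]

mixed = ["Top-Portraits", "holiday", "beach"]
print(file_level_result(mixed))                 # ['Top-Portraits', 'holiday', 'beach']
print(per_keyword_result(mixed))                # ['Top-Portraits']
print(file_level_result(["holiday", "beach"]))  # []
```

If the goal is the stricter behaviour, the commands in this thread will leave non-"Top" keywords behind in any file that also carries a "Top" keyword.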

- Phil

moth3r

Phil,

THANK YOU SO MUCH :)

I've put it down in a .cmd file:

exiftool -keywords= -if "$keywords !~ /Top/" -overwrite_original -r "c:\Alien"
exiftool -subject= -if "$subject !~ /Top/" -overwrite_original -r "c:\Alien"
exiftool -xpkeywords= -if "$xpkeywords !~ /Top/" -overwrite_original -r "c:\Alien"
exiftool -hierarchicalkeywords= -if "$hierarchicalkeywords !~ /Top/" -overwrite_original -r "c:\Alien"


Alien folder was a perfect test subject :)

Btw, just a quick question, as I've noticed ExifTool reporting some jpg files to actually be PNGs and vice versa. Is there a flag for ExifTool to automatically set the right extension if my image file was wrongly saved or renamed?

I also have a problem with ExifTool not finding some files due to naming conventions, so if there is also a flag for that, please let me know!

Report for '333のコピー': 'File not found: c:/Alien/333????.jpg'

In any case, that was tremendous help already!

Thank you for the help and ExifTool as well!

Ivan

Phil Harvey

Hi Ivan,

Quote from: moth3r on July 22, 2013, 09:14:47 AM
Btw, just a quick question, as I've noticed ExifTool reporting some jpg files to actually be PNGs and vice versa. Is there a flag for ExifTool to automatically set the right extension if my image file was wrongly saved or renamed?

You could do this:

exiftool -filename=%f.png -if "$filetype eq 'PNG'" "c:\Alien"
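
The command relies on ExifTool identifying the file type from the file's contents rather than from its name. As a rough illustration of that idea (not ExifTool's actual detection code), the Python sketch below checks the well-known PNG and JPEG signatures at the start of a file and proposes a corrected name. `sniff_type` and `corrected_name` are hypothetical helpers, and treating `.jpeg` as an acceptable JPEG extension is a choice made for this sketch.

```python
from pathlib import PurePath

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"   # 8-byte PNG signature
JPEG_MAGIC = b"\xff\xd8\xff"        # JPEG start-of-image marker

ACCEPTED = {"PNG": {".png"}, "JPEG": {".jpg", ".jpeg"}}
PREFERRED = {"PNG": ".png", "JPEG": ".jpg"}

def sniff_type(header: bytes):
    """Return 'PNG', 'JPEG' or None from the first bytes of a file."""
    if header.startswith(PNG_MAGIC):
        return "PNG"
    if header.startswith(JPEG_MAGIC):
        return "JPEG"
    return None

def corrected_name(filename: str, header: bytes):
    """Return a new file name if the extension disagrees with the contents,
    otherwise None (also None for types this sketch does not recognise)."""
    kind = sniff_type(header)
    if kind is None:
        return None
    path = PurePath(filename)
    if path.suffix.lower() in ACCEPTED[kind]:
        return None  # extension already matches the contents
    return path.with_suffix(PREFERRED[kind]).name

print(corrected_name("alien.jpg", PNG_MAGIC + b"\x00\x00"))  # alien.png
print(corrected_name("alien.PNG", PNG_MAGIC))                # None
```

In real use the header would come from reading the first few bytes of each file; the sketch takes it as an argument so it stays free of file access.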

Quote: I also have a problem with ExifTool not finding some files due to naming conventions, so if there is also a flag for that, please let me know!

Report for '333のコピー': 'File not found: c:/Alien/333????.jpg'

Sorry, this is a known problem with no easy solution.

- Phil

moth3r

Just a quick follow-up. Thanks to you, Phil, ExifTool is now my favorite and only way to successfully clean my image database, which gets updated on a daily basis.

This code is from the batch file I now run at least once a week:

exiftool -keywords= -if "$keywords !~ /Top/" -overwrite_original -r "c:\Pictures"
exiftool -subject= -if "$subject !~ /Top/" -overwrite_original -r "c:\Pictures"
exiftool -xpkeywords= -if "$xpkeywords !~ /Top/" -overwrite_original -r "c:\Pictures"
exiftool -hierarchicalkeywords= -if "$hierarchicalkeywords !~ /Top/" -overwrite_original -r "c:\Pictures"
exiftool -filename=%f.png -if "$filetype eq 'PNG'" -r "c:\Pictures"
exiftool -filename=%f.jpg -if "$filetype eq 'JPG'" -r "c:\Pictures"


Considering that each command in the sequence makes ExifTool go through 250,000 files every time, I would just ask: is there a better way to automate this process even more?

Thanks!

Phil Harvey

Wow, that's a lot of processing!

Is there any way to run these commands only on the files before they are added to the database?

The only way to combine all of these commands into one would be to write a custom script and use the ExifTool API.

I think your last command is missing an "E" in "JPEG":

exiftool -filename=%f.jpg -if "$filetype eq 'JPEG'" -r "c:\Pictures"

- Phil

moth3r

Quote: Wow, that's a lot of processing!
So true :) I think it takes about two hours :)

Quote: The only way to combine all of these commands into one would be to write a custom script and use the ExifTool API.
I lack the knowledge to write it, but I would really appreciate it if you could provide the code. I'd guess it would be a great reference, so other people with the same problem could edit it for their purposes as well. Maybe just for two passes, cleaning and renaming, if that makes it any easier. In any case, that's just a wish. I don't want to push it, as I am already glad to have something that actually works!

Quote: I think your last command is missing an "E" in "JPEG":
Corrected, thanks Phil!

Ivan

Phil Harvey

Hi Ivan,

I normally don't have time to write scripts like this, but I had a bit of free time and was feeling generous:

#!/usr/bin/perl -w

use Image::ExifTool;

sub ProcessFile($);

my $exifTool = new Image::ExifTool;
my $count = 0;
my $total = 0;

sub ProcessFile($)
{
    local $_;
    my $file = shift;
    $file =~ s/\\/\//g; # convert all backslashes to forward slashes
    if (-d $file) {
        opendir DIR, $file or die;
        my @files = readdir DIR;
        closedir DIR;
        foreach (@files) {
            # process all files unless they start with a "."
            ProcessFile("$file/$_") unless /^\./;
        }
    } else {
        my $write;
        my @tags = ( 'Keywords', 'Subject', 'XPKeywords', 'HierarchicalKeywords');
        $exifTool->SetNewValue();   # reset previous new values
        my $info = $exifTool->ImageInfo($file, @tags, 'FileType');
        ++$total;
        foreach my $tag (@tags) {
            next unless defined $$info{$tag} and $$info{$tag} !~ /Top/;
            $exifTool->SetNewValue($tag);   # delete this tag
        }
        if ($$info{FileType} and $file =~ /^.*?([^\/]*?)(\.[^.\/]*)?$/) {
            my $name = $1;
            # test the extension at the end of the name, ignoring case,
            # so that e.g. "x.JPG" is not renamed and "x.jpg.bak" is
            if ($$info{FileType} eq 'JPEG' and $file !~ /\.jpg$/i) {
                $exifTool->SetNewValue(FileName => "$name.jpg", Protected => 1);
            } elsif ($$info{FileType} eq 'PNG' and $file !~ /\.png$/i) {
                $exifTool->SetNewValue(FileName => "$name.png", Protected => 1);
            }
        }
        if ($exifTool->CountNewValues()) {
            ++$count if $exifTool->WriteInfo($file) == 1;
        }
    }
}

ProcessFile($_) foreach @ARGV;

printf("%5d total files scanned\n", $total);
printf("%5d files updated\n", $count);

# end
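
For readers who do not use Perl, the per-file decision made by the script above can be sketched in Python as a pure function. It takes the values already read for one file and returns the tags to delete and the new file name, if any. It does not read or write images, and `plan_changes` and its dict layout are assumptions made for illustration. The extension check here ignores case, which is a choice of this sketch.

```python
import os
import re

KEYWORD_TAGS = ("Keywords", "Subject", "XPKeywords", "HierarchicalKeywords")
EXT_FOR_TYPE = {"JPEG": ".jpg", "PNG": ".png"}

def plan_changes(path, info, keep=r"Top"):
    """Return (tags_to_delete, new_name) for one file.

    `info` maps tag names to their values as joined strings, plus an
    optional 'FileType' entry. new_name is None when no rename is needed.
    """
    # Delete a keyword tag only if it exists and does not mention `keep`.
    delete = [tag for tag in KEYWORD_TAGS
              if info.get(tag) is not None and not re.search(keep, info[tag])]

    new_name = None
    wanted = EXT_FOR_TYPE.get(info.get("FileType"))
    if wanted:
        # Normalise backslashes first, as the Perl script does.
        base = os.path.basename(path.replace("\\", "/"))
        stem, ext = os.path.splitext(base)
        if ext.lower() != wanted:
            new_name = stem + wanted
    return delete, new_name

info = {"Keywords": "holiday, beach", "Subject": "Top-Acts, beach", "FileType": "PNG"}
print(plan_changes("c:\\Pictures\\a.jpg", info))  # (['Keywords'], 'a.png')
```

Keeping the decision separate from the reading and writing makes it easy to dry-run the rules on a few sample values before touching 250,000 files.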


- Phil