ExifTool Forum

ExifTool => The "exiftool" Application => Topic started by: Kugelblitz on June 18, 2018, 05:26:20 AM

Title: Search and remove Tags based on Regular Expressions
Post by: Kugelblitz on June 18, 2018, 05:26:20 AM
Hello,

this is my first post and I am new to ExifTool.

I have a lot of travel photography images in various formats, mainly JPG, CR2, DNG and PNG, and I have tracked the tours with a GPS logger.
I had used the software https://www.geosetter.de to match the images to the locations from the GPS logger and to save all kinds of location-based metadata for the GPS coordinates in the EXIF / IPTC tags, like Country, City, District and so on.
It also added the geotags for Flickr that I was using back then (geotagged; geo:lat=xx.xxxxxxxx; geo:lon=xx.xxxxxxxx;)

(https://kisd.de/~martinb/foren/exiftool/tags.png)

Now I have about 25,000 geotagged images, and all those geo:lat and geo:lon tags mess up browsing and editing tags in other programs like Lightroom, Picasa or Diffractor. There are just too many of them.
The only issue I have is that it adds the tags geo:lat, geo:lon and geotagged, which seem redundant because the GPS information is already in the files. And I do not need the Flickr geotags any more as I no longer use Flickr.
(https://kisd.de/~martinb/foren/exiftool/picasa.png)

So I would like to get rid of the geotags in the Keywords section of the EXIF/IPTC tags with exiftool.

Something like
Code:
exiftool -keywords-="geotagged" -xmp:subject-=geotagged -xmp:subject-=geo:lat= -xmp:subject-=geo:lon=  d:\pictures

I have sort of figured out the regular expressions to match the tags
Code:
geo\:lat\=[0-9]{1,2}\.[0-9]{1,8};geo\:lon\=[0-9]{1,2}\.[0-9]{1,8};geo\:lat\=\-[0-9]{1,2}\.[0-9]{1,8};geo\:lon\=\-[0-9]{1,2}\.[0-9]{1,8};geotagged;
But I have no clue how to write that so it works in exiftool as a batch search and remove for all files in all subfolders.

Thank you for reading this post and thank you for your help.
Title: Re: Search and remove Tags based on Regular Expressions
Post by: Phil Harvey on June 18, 2018, 07:25:35 AM
That's a long explanation.

To remove the "geo:lat=...", "geo:lon=..." and "geotagged" from the XMP subject, you could do this:

exiftool -sep xxx "-subject<${subject;s/(^|xxx)(geo:lat=|geo:lon=|geotagged).*?(xxx|$)/xxx/;s/(^xxx|xxx$)//}" DIR

- Phil
Title: Re: Search and remove Tags based on Regular Expressions
Post by: StarGeek on June 18, 2018, 10:41:11 AM
This would work as well, would it not (with ver 10.87+)?

exiftool -sep xxx "-subject<${subject@;$_=undef if /^(geo:lat=|geo:lon=|geotagged).*?/}" DIR

I'm just trying to get the hang of the @ option.
Title: Re: Search and remove Tags based on Regular Expressions
Post by: Phil Harvey on June 18, 2018, 10:55:58 AM
Yes.  That's a better solution.  It was a bit fidgety taking care of the edge cases when processed as a single string.  You can even simplify a bit further:

exiftool -sep xxx "-subject<${subject@;$_=undef if /^(geo:lat=|geo:lon=|geotagged)/}" DIR

(the ".*?" was not needed)
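For anyone following along: with the @ suffix the expression is evaluated once per list item, and setting $_ to undef drops that item. Below is a rough shell analogy of that filter, not exiftool itself. The function name is made up, and the input is assumed to be one keyword per line.

```shell
# filter_geo_keywords: read one keyword per line on stdin and drop every
# keyword that starts with geo:lat=, geo:lon= or geotagged.
# (Hypothetical helper; it only mimics the regex used in the command above.)
filter_geo_keywords() {
  grep -Ev '^(geo:lat=|geo:lon=|geotagged)'
}

# Illustrative keyword list, one item per line.
printf '%s\n' 'Deutschland' 'geo:lat=50.32272323' 'geo:lon=6.93208420' \
  'geotagged' 'Rheinland-Pfalz' | filter_geo_keywords
# prints Deutschland and Rheinland-Pfalz, one per line
```

Because the pattern is anchored with ^, a keyword that merely contains "geotagged" somewhere in the middle would be kept.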

- Phil
Title: Re: Search and remove Tags based on Regular Expressions
Post by: Kugelblitz on June 18, 2018, 10:57:13 AM
Hello Phil,
thank you very much for your reply.

I have tried the code you provided and it did something - looks like it has rearranged the geo:lat geo:lon and geotagged Tags but not removed them.

I have added a sample image to this reply. Maybe you can see it for yourself and get the right code quicker than if I write back and forth.

Thank You very much Phil

(https://kisd.de/~martinb/foren/exiftool/CIMG0461.jpg)
Title: Re: Search and remove Tags based on Regular Expressions
Post by: Phil Harvey on June 18, 2018, 11:03:06 AM
Code:
% exiftool CIMG0461.jpg -subject
Subject                         : Deutschland, geo:lat=50.32272323, geo:lon=6.93208420, geotagged, Müllenbach, Rheinland-Pfalz
% exiftool CIMG0461.jpg -sep xxx '-subject<${subject@;$_=undef if /^(geo:lat=|geo:lon=|geotagged)/}'
    1 image files updated
% exiftool CIMG0461.jpg -subject
Subject                         : Deutschland, Müllenbach, Rheinland-Pfalz

(I'm on Mac, so I use single quotes)
Title: Re: Search and remove Tags based on Regular Expressions
Post by: StarGeek on June 18, 2018, 11:28:52 AM
Quote
I have tried the code you provided and it did something - looks like it has rearranged the geo:lat geo:lon and geotagged Tags but not removed them.

Phil's command only removed the Subject keywords.  Your file also has them in the Keywords tag, so that needs to be added.

exiftool -sep xxx "-subject<${subject@;$_=undef if /^(geo:lat=|geo:lon=|geotagged)/}" "-Keywords<${Keywords@;$_=undef if /^(geo:lat=|geo:lon=|geotagged)/}" DIR

It didn't rearrange the tags, the Keywords tag had them in a different order than in the Subject tag.  So whatever program you used to look at them ended up reading them a different way.
Title: Re: Search and remove Tags based on Regular Expressions
Post by: Kugelblitz on June 18, 2018, 02:03:53 PM
Hello Phil,
Hello StarGeek,

Thank you for your replies.

I have tried the code from StarGeek and that worked perfectly.
Thank you very much.

I decided just to remove the geo lat and geo lon tags and keep the geotagged tag so I can filter all Images with GPS coordinates.

How can I use it on a folder with all Subfolders that contain the pictures?  "-r"  If I recall it right?
Code:
exiftool -sep xxx "-subject<${subject@;$_=undef if /^(geo:lat=|geo:lon=)/}" "-Keywords<${Keywords@;$_=undef if /^(geo:lat=|geo:lon=)/}" -r DIR
I noticed the "original" jpg are still there called "CIMG0461.jpg_original"
How can that be automatically deleted too?

Code:
exiftool -sep xxx "-subject<${subject@;$_=undef if /^(geo:lat=|geo:lon=)/}" "-Keywords<${Keywords@;$_=undef if /^(geo:lat=|geo:lon=)/}" -r -overwrite_original d:\geotagged
Thank you again, that was really helpful of you guys.

Cheers

Title: Re: Search and remove Tags based on Regular Expressions
Post by: Phil Harvey on June 18, 2018, 02:15:56 PM
Quote
How can I use it on a folder with all Subfolders that contain the pictures?  "-r"  If I recall it right?

Yes.

Quote
I noticed the "original" jpg are still there called "CIMG0461.jpg_original"
How can that be automatically deleted too?

-overwrite_original
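Note that -overwrite_original only stops new backups from being written. As far as I know, "_original" copies left by earlier runs (like CIMG0461.jpg_original) stay on disk. A small sketch to list them with plain find, where the function name and the demo paths are made up:

```shell
# list_backups DIR
# Print the "_original" backup copies found anywhere under DIR.
list_backups() {
  find "$1" -type f -name '*_original'
}

# Demo on a throwaway directory (all paths here are hypothetical).
demo=$(mktemp -d)
mkdir "$demo/sub"
touch "$demo/CIMG0461.jpg" "$demo/sub/CIMG0461.jpg_original"
list_backups "$demo"   # prints only the path ending in CIMG0461.jpg_original
rm -r "$demo"
```

Once you have checked the rewritten files, the listed backups can be removed by hand. The exiftool documentation also describes a -delete_original option for this; read its description before relying on it.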

- Phil
Title: Re: Search and remove Tags based on Regular Expressions
Post by: Kugelblitz on June 20, 2018, 05:36:03 PM
SUCCESS

OK, the process took a little more than two days, but it is finished now, without any crashes or such.

 5549 directories scanned
95869 image files updated
98375 image files unchanged
  215 files weren't updated due to errors

I am not sure about the 215 files with the errors. I was not watching all the time and I did not "log" the whole process. But I have copied some Warning Messages when I saw them:

Warning: Invalid PrintIM header - DIR

Warning: [minor] Error reading PreviewImage from file - DIR

Warning: [Minor] IPTC:Keywords exceeds length limit (truncated) - DIR

Warning: [minor] Fixed incorrect URI for xmlns:MicrosoftPhoto - DIR

Warning: [minor] Advanced formatting expression returned undef for 'subject' - DIR

Warning: Bad NikonScanIFD SubDirectory start - DIR

Warning: Can't read MakerNotes data. Ignored. - DIR

Error: [minor] Bad MakerNotes offset for NEFBitDepth - DIR

Warning: [minor] Tag 'subject' not defined - DIR

I guess that one just means there are no geotags in the Subject (Tags) metadata of the picture.

Is there a way to log the error messages apart from the "[minor] Tag 'subject' not defined"? Or maybe it is easier to log everything and then just delete all lines with the "'subject' not defined" error.

Thank you for your help.
Cheers
 


Title: Re: Search and remove Tags based on Regular Expressions
Post by: Phil Harvey on June 21, 2018, 07:09:27 AM
The files that had errors will be the ones which didn't get their file modification date/time updated to the time when you ran the command.
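One way to list files by modification time is find with a reference file. This is only a sketch: the function name is made up, and it assumes you have (or create) a marker file whose timestamp is the start of the run. Files that exiftool reported as unchanged would presumably also keep their old timestamp, so the list may contain more than just the files with errors.

```shell
# list_not_updated DIR MARKER
# Print regular files under DIR whose modification time is not newer
# than that of the MARKER file.
list_not_updated() {
  find "$1" -type f ! -newer "$2"
}

# Demo on a throwaway directory (all paths and times here are made up).
demo=$(mktemp -d)
mkdir "$demo/pics"
touch -t 201806180500 "$demo/pics/skipped.jpg"    # older than the run
touch -t 201806180526 "$demo/marker"              # when the run started
touch -t 201806200000 "$demo/pics/rewritten.jpg"  # rewritten during the run
list_not_updated "$demo/pics" "$demo/marker"      # prints only the skipped.jpg path
rm -r "$demo"
```

Keep the marker file outside the scanned directory, otherwise it shows up in the list itself.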

To see only the errors and suppress all other output, you could add -q -q to the command.

To log the warnings/errors, you may be able to add 2>error_log.txt to the end of the command, depending on what command shell you are using.
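If everything is logged that way, the "Tag 'subject' not defined" lines can be filtered out afterwards. A minimal sketch with grep, assuming a log file with one message per line (the function name is made up, and the demo uses a temporary file in place of error_log.txt):

```shell
# filter_log LOGFILE
# Print every line of LOGFILE except the "Tag 'subject' not defined" warnings.
filter_log() {
  grep -v "Tag 'subject' not defined" "$1"
}

# Demo with two messages copied from this thread.
log=$(mktemp)
printf '%s\n' \
  "Warning: [minor] Tag 'subject' not defined - DIR" \
  "Warning: [Minor] IPTC:Keywords exceeds length limit (truncated) - DIR" > "$log"
filter_log "$log"   # prints only the IPTC:Keywords warning
rm "$log"
```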

- Phil
Title: Re: Search and remove Tags based on Regular Expressions
Post by: StarGeek on June 21, 2018, 01:54:14 PM
Quote
Warning: [minor] Fixed incorrect URI for xmlns:MicrosoftPhoto - DIR

This can be safely ignored.  Microsoft is inconsistent with their own standard and exiftool will fix this if the tag is rewritten.

Quote
Warning: [minor] Advanced formatting expression returned undef for 'subject' - DIR
Warning: [minor] Tag 'subject' not defined - DIR

These are probably cases where, as you guessed, there weren't geo keywords to change.

Quote
Warning: [Minor] IPTC:Keywords exceeds length limit (truncated) - DIR

This one might require some fixing.  The IPTC:Keywords tag has a limited length according to the specs, but it is pretty much ignored by most software.  In this case it got truncated.  You might notice in one of your Digital asset management (DAM) programs where a long keyword exists that there is now an additional truncated version of it.
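To spot keywords that could be affected, you could check their length. This sketch assumes the limit is 64 bytes (which is what I remember from the IPTC IIM spec, so double-check it) and that you have the keywords as plain text, one per line. The function name is made up.

```shell
# find_overlong_keywords: read one keyword per line on stdin and print
# those longer than 64 bytes (assumed IPTC:Keywords limit).
find_overlong_keywords() {
  # LC_ALL=C makes awk count bytes rather than multi-byte characters.
  LC_ALL=C awk 'length($0) > 64'
}

# Demo: one short keyword and one 65-byte keyword made of zeros.
long=$(printf '%065d' 0)
printf '%s\n' 'Rheinland-Pfalz' "$long" | find_overlong_keywords
# prints only the 65-byte line
```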

Quote
Warning: Bad NikonScanIFD SubDirectory start - DIR
Warning: Can't read MakerNotes data. Ignored. - DIR
Error: [minor] Bad MakerNotes offset for NEFBitDepth - DIR

There are cases where the MakerNotes might have been messed up in some way.  Picasa, for example, tends to treat Nikon MakerNotes badly, though it usually just deletes them in my experience.