Author Topic: Search and remove Tags based on Regular Expressions  (Read 1866 times)

Kugelblitz

  • Full Member
  • ***
  • Posts: 42
Search and remove Tags based on Regular Expressions
« on: June 18, 2018, 05:26:20 AM »
Hello,

this is my first post and I am new to ExifTool.

I have a lot of travel photography images in various formats, mainly JPG, CR2, DNG and PNG, and I tracked the tours with a GPS logger.
I used GeoSetter (https://www.geosetter.de) to match the images to the locations from the GPS logger and to save all kinds of location-based metadata in the EXIF/IPTC tags, such as country, city and district.
It also wrote the geotags for Flickr, which I was using back then (geotagged; geo:lat=xx.xxxxxxxx; geo:lon=xx.xxxxxxxx;).



Now I have about 25,000 geotagged images, and all those geo:lat and geo:lon keywords make browsing and editing tags a mess in other programs like Lightroom, Picasa or Diffractor. There are just too many of them.
The geo:lat, geo:lon and geotagged keywords are also redundant because the GPS information is already in the files, and I no longer need the Flickr geotags as I do not use Flickr anymore.

So I would like to get rid of the geotags in the keywords section of the EXIF/IPTC tags with ExifTool.

Something like
Code: [Select]
exiftool -keywords-="geotagged" -xmp:subject-=geotagged -xmp:subject-=geo:lat= -xmp:subject-=geo:lon=  d:\pictures

I sort of have figured out the regular expression to get the Tags
Code: [Select]
geo\:lat\=[0-9]{1,2}\.[0-9]{1,8};geo\:lon\=[0-9]{1,2}\.[0-9]{1,8};geo\:lat\=\-[0-9]{1,2}\.[0-9]{1,8};geo\:lon\=\-[0-9]{1,2}\.[0-9]{1,8};geotagged;
But I have no clue how to write that so it works in exiftool as a batch search and remove for all files in all subfolders.

Thank you for reading this post and thank you for your help.

Phil Harvey

  • ExifTool Author
  • Administrator
  • ExifTool Freak
  • *****
  • Posts: 17004
    • ExifTool Home Page
Re: Search and remove Tags based on Regular Expressions
« Reply #1 on: June 18, 2018, 07:25:35 AM »
That's a long explanation.

To remove the "geo:lat=...", "geo:lon=..." and "geotagged" from the XMP subject, you could do this:

exiftool -sep xxx "-subject<${subject;s/(^|xxx)(geo:lat=|geo:lon=|geotagged).*?(xxx|$)/xxx/;s/(^xxx|xxx$)//}" DIR

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

StarGeek

  • Global Moderator
  • ExifTool Freak
  • *****
  • Posts: 4051
Re: Search and remove Tags based on Regular Expressions
« Reply #2 on: June 18, 2018, 10:41:11 AM »
This would work as well, would it not (with ver 10.87+)?

exiftool -sep xxx "-subject<${subject@;$_=undef if /^(geo:lat=|geo:lon=|geotagged).*?/}" DIR

I'm just trying to get the hang of the @ option.
Troubleshooting hints:
* When posting, include your OS, Exiftool version, and type of file you're processing (MP4, JPG, etc).
* Double all percent signs (%) in a Windows batch file.
* If your GPS coords are negative, make sure and set the GpsLatitudeRef and GpsLongitudeRef tags correctly.

Phil Harvey

Re: Search and remove Tags based on Regular Expressions
« Reply #3 on: June 18, 2018, 10:55:58 AM »
Yes.  That's a better solution.  It was a bit fidgety taking care of the edge cases when processed as a single string.  You can even simplify a bit further:

exiftool -sep xxx "-subject<${subject@;$_=undef if /^(geo:lat=|geo:lon=|geotagged)/}" DIR

(the ".*?" was not needed)
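As a rough illustration of why the list-based form is simpler: with the @ modifier the expression runs once per list item, so a single anchored match decides whether that item is kept. The sketch below mimics that idea outside ExifTool with plain grep on a made-up keyword list. The variable names are only for illustration, and this is not how ExifTool implements it.

```shell
# One keyword per line, standing in for the items of the Subject list.
keywords='Deutschland
geo:lat=50.32272323
geo:lon=6.93208420
geotagged
Rheinland-Pfalz'

# Drop every item that starts with one of the unwanted prefixes.
# -E: extended regex, -v: print only the lines that do NOT match.
kept=$(printf '%s\n' "$keywords" | grep -Ev '^(geo:lat=|geo:lon=|geotagged)')

printf '%s\n' "$kept"
```

Because the match is anchored with ^ and only tests the prefix, nothing after the prefix needs to be matched, which is why the trailing ".*?" adds nothing.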

- Phil

Kugelblitz

Re: Search and remove Tags based on Regular Expressions
« Reply #4 on: June 18, 2018, 10:57:13 AM »
Hello Phil,
thank you very much for your reply.

I have tried the code you provided and it did something - looks like it has rearranged the geo:lat geo:lon and geotagged Tags but not removed them.

I have added a sample image to this reply. Maybe you can see it for yourself and get to the right code quicker than if we write back and forth.

Thank You very much Phil


Phil Harvey

Re: Search and remove Tags based on Regular Expressions
« Reply #5 on: June 18, 2018, 11:03:06 AM »
Code: [Select]
% exiftool CIMG0461.jpg -subject
Subject                         : Deutschland, geo:lat=50.32272323, geo:lon=6.93208420, geotagged, Müllenbach, Rheinland-Pfalz
% exiftool CIMG0461.jpg -sep xxx '-subject<${subject@;$_=undef if /^(geo:lat=|geo:lon=|geotagged)/}'
    1 image files updated
% exiftool CIMG0461.jpg -subject
Subject                         : Deutschland, Müllenbach, Rheinland-Pfalz

(I'm on Mac, so I use single quotes)

StarGeek

Re: Search and remove Tags based on Regular Expressions
« Reply #6 on: June 18, 2018, 11:28:52 AM »
Quote
I have tried the code you provided and it did something - looks like it has rearranged the geo:lat geo:lon and geotagged Tags but not removed them.

Phil's command only removed the Subject keywords.  Your file also has them in the Keywords tag, so that needs to be added.

exiftool -sep xxx "-subject<${subject@;$_=undef if /^(geo:lat=|geo:lon=|geotagged)/}" "-Keywords<${Keywords@;$_=undef if /^(geo:lat=|geo:lon=|geotagged)/}" DIR

It didn't rearrange the tags.  The Keywords tag had them in a different order than the Subject tag, so whatever program you used to look at them ended up showing them differently.

Kugelblitz

Re: Search and remove Tags based on Regular Expressions
« Reply #7 on: June 18, 2018, 02:03:53 PM »
Hello Phil,
Hello StarGeek,

Thank you for your replies.

Have tried the code from StarGeek and that worked perfectly.
Thank you very much.

I decided just to remove the geo lat and geo lon tags and keep the geotagged tag so I can filter all Images with GPS coordinates.

How can I use it on a folder with all Subfolders that contain the pictures?  "-r"  If I recall it right?
Code: [Select]
exiftool -sep xxx "-subject<${subject@;$_=undef if /^(geo:lat=|geo:lon=)/}" "-Keywords<${Keywords@;$_=undef if /^(geo:lat=|geo:lon=)/}" -r DIR
I noticed the "original" jpg are still there called "CIMG0461.jpg_original"
How can that be automatically deleted too?

Code: [Select]
exiftool -sep xxx "-subject<${subject@;$_=undef if /^(geo:lat=|geo:lon=)/}" "-Keywords<${Keywords@;$_=undef if /^(geo:lat=|geo:lon=)/}" -r -overwrite_original d:\geotagged
Thank you again that was really helpful from you guys.

Cheers

« Last Edit: June 18, 2018, 02:14:07 PM by Kugelblitz »

Phil Harvey

Re: Search and remove Tags based on Regular Expressions
« Reply #8 on: June 18, 2018, 02:15:56 PM »
Quote
How can I use it on a folder with all Subfolders that contain the pictures?  "-r"  If I recall it right?

Yes.

Quote
I noticed the "original" jpg are still there called "CIMG0461.jpg_original"
How can that be automatically deleted too?

-overwrite_original

- Phil

Kugelblitz

Re: Search and remove Tags based on Regular Expressions
« Reply #9 on: June 20, 2018, 05:36:03 PM »
SUCCESS

OK, the process took a little more than two days, but it has finished now without any crashes.

 5549 directories scanned
95869 image files updated
98375 image files unchanged
  215 files weren't updated due to errors

I am not sure about the 215 files with the errors. I was not watching all the time and I did not "log" the whole process. But I have copied some Warning Messages when I saw them:

Warning: Invalid PrintIM header - DIR

Warning: [minor] Error reading PreviewImage from file - DIR

Warning: [Minor] IPTC:Keywords exceeds length limit (truncated) - DIR

Warning: [minor] Fixed incorrect URI for xmlns:MicrosoftPhoto - DIR

Warning: [minor] Advanced formatting expression returned undef for 'subject' - DIR

Warning: Bad NikonScanIFD SubDirectory start - DIR

Warning: Can't read MakerNotes data. Ignored. - DIR

Error: [minor] Bad MakerNotes offset for NEFBitDepth - DIR

Warning: [minor] Tag 'subject' not defined - DIR

I guess that one just means there are no geotags in the Subject (tags) metadata of the picture.

Is there a way to log the error messages apart from the "[minor] Tag 'subject' not defined" ones? Or maybe it is easier to log everything and then just delete all lines with the "Tag 'subject' not defined" warning.

Thank you for your help.
Cheers
 



Phil Harvey

Re: Search and remove Tags based on Regular Expressions
« Reply #10 on: June 21, 2018, 07:09:27 AM »
The files that had errors will be the ones which didn't get their file modification date/time updated to the time when you ran the command.
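One way to act on this, sketched with standard tools rather than any ExifTool feature: create a marker file just before the run, then list the image files that are not newer than the marker. The directory and file names below are hypothetical, and touch -t is used only to simulate the timestamps. Files that were reported as unchanged would presumably show up in this list too, so it narrows the search rather than pinpointing the errors.

```shell
# Throwaway directory standing in for the picture folder.
workdir=$(mktemp -d)

# Two images with an old modification time.
touch -t 202001010000 "$workdir/a.jpg" "$workdir/b.jpg"

# Marker file created just before the batch run.
touch -t 202001020000 "$workdir/marker"

# Simulate the run rewriting a.jpg but leaving b.jpg alone.
touch -t 202001030000 "$workdir/a.jpg"

# Files that are not newer than the marker were not rewritten.
not_updated=$(find "$workdir" -type f -name '*.jpg' ! -newer "$workdir/marker")

printf '%s\n' "$not_updated"
```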

To see only the errors and suppress all other output, you could add -q -q to the command.

To log the warnings/errors, you may be able to add 2>error_log.txt to the end of the command, depending on what command shell you are using.
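A minimal sketch of the logging idea, assuming a POSIX-style shell. The fake_run function is a hypothetical stand-in for the long exiftool command so that the example runs anywhere, and its messages are modelled on the warnings quoted in this thread. The log is written with 2> and then filtered with grep to drop the harmless "not defined" lines.

```shell
# Stand-in for a long exiftool run: summary on stdout,
# warnings and errors on stderr.
fake_run() {
    echo "    2 image files updated"
    echo "Warning: [minor] Tag 'subject' not defined - a.jpg" >&2
    echo "Warning: Invalid PrintIM header - b.jpg" >&2
    echo "Error: [minor] Bad MakerNotes offset for NEFBitDepth - c.nef" >&2
}

logdir=$(mktemp -d)

# 2> sends stderr to the log file; stdout is discarded here.
fake_run > /dev/null 2> "$logdir/error_log.txt"

# Keep everything except the "Tag 'subject' not defined" warnings.
grep -v "Tag 'subject' not defined" "$logdir/error_log.txt" > "$logdir/filtered_log.txt"

cat "$logdir/filtered_log.txt"
```

Filtering afterwards keeps the full log intact, so nothing is lost if one of the discarded warnings later turns out to matter.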

- Phil

StarGeek

Re: Search and remove Tags based on Regular Expressions
« Reply #11 on: June 21, 2018, 01:54:14 PM »
Quote
Warning: [minor] Fixed incorrect URI for xmlns:MicrosoftPhoto - DIR

This can be safely ignored.  Microsoft is inconsistent with their own standard and exiftool will fix this if the tag is rewritten.

Quote
Warning: [minor] Advanced formatting expression returned undef for 'subject' - DIR
Warning: [minor] Tag 'subject' not defined - DIR

These are probably cases where, as you guessed, there weren't geo keywords to change.

Quote
Warning: [Minor] IPTC:Keywords exceeds length limit (truncated) - DIR

This one might require some fixing.  The IPTC:Keywords tag has a limited length according to the specs (64 bytes per keyword), but the limit is pretty much ignored by most software.  In this case the keyword got truncated.  In one of your digital asset management (DAM) programs you might notice that, where a long keyword exists, there is now an additional truncated version of it.
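A small sketch for spotting keywords at risk of truncation, taking the IPTC limit as 64 bytes per keyword and assuming the keywords have already been exported one per line to a text file. The file name and the sample keywords are hypothetical.

```shell
tmp=$(mktemp -d)

# Made-up keyword list, one keyword per line.
cat > "$tmp/keywords.txt" <<'EOF'
Deutschland
Rheinland-Pfalz
a deliberately long made-up keyword that runs well past the sixty-four byte limit
EOF

# Collect each keyword whose length in bytes exceeds 64.
too_long=$(while IFS= read -r kw; do
    bytes=$(printf '%s' "$kw" | wc -c | tr -d ' ')
    if [ "$bytes" -gt 64 ]; then
        printf '%s\n' "$kw"
    fi
done < "$tmp/keywords.txt")

printf '%s\n' "$too_long"
```

The count is in bytes rather than characters, so keywords with umlauts or other non-ASCII characters reach the limit sooner than their visible length suggests.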

Quote
Warning: Bad NikonScanIFD SubDirectory start - DIR
Warning: Can't read MakerNotes data. Ignored. - DIR
Error: [minor] Bad MakerNotes offset for NEFBitDepth - DIR

These are cases where the MakerNotes might have been messed up in some way.  Picasa, for example, tends to treat Nikon MakerNotes badly, though it usually just deletes them in my experience.