How to remove everything after specific characters

Started by bartek, April 11, 2024, 06:32:54 AM


bartek

Hi! I hope you can help me with this. I have a list of .jpg files, each with a unique "Description" field. The Description always contains a specific group of characters, and I would like to remove that group and everything that comes after it. For example:

Lorem ipsum dolor :-lorem ipsum 
I'd like to remove everything after :- so the result would be:
Lorem ipsum dolor


Another thing: in some files there's a URL at the beginning of the Description field that I'd also like to remove. For example (the URL may change, but it always contains https://):

https://xyz.abc/qwe Lorem ipsum dolor :-lorem ipsum 
I'd like to remove the URL at the beginning and everything after :- so the result would be:
Lorem ipsum dolor
Is it possible?

Phil Harvey

exiftool "-description<${description;s/(http\S+ )?(.*?) :-/$1/ or $_ = undef}" DIR

This will remove any string starting with "http" and its trailing space at the start of the line, and " :-" and anything afterward from the end of the line.  The "or $_ = undef" has the effect of not rewriting the description if nothing changed.

Google for "regular expressions" if you want to learn about this magic.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

bartek

Hello Phil! Thanks for your reply. However, the result was the opposite of what I intended: the command you proposed removed everything between the URL and the :- characters, while the URL and what comes after the :- were left  ;) Can you help me fix that?
 

greybeard

How about this:

exiftool "-description<${description;s/(http\S+)?(.*)( :-)(.*$)/$2/ or $_ = undef}" DIR
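To check the substitution without touching any files, here is a minimal sketch that runs the same pattern against the two sample descriptions from the first post. It uses Python's re module as a stand-in for the Perl s/// inside the exiftool argument (the two engines should agree for these simple patterns, but that is an assumption). The names POSTED, VARIANT, apply_posted and apply_variant are made up for this illustration, and VARIANT is a hypothetical tweak that was not posted in this thread: it also consumes the whitespace after the URL, since the posted pattern appears to leave that space at the start of the result.

```python
import re

# Pattern from the command above; the replacement keeps group 2
POSTED = re.compile(r"(http\S+)?(.*)( :-)(.*$)")

# Hypothetical variant (not from the thread): anchored at the start, and the
# optional URL group also swallows the whitespace that follows the URL
VARIANT = re.compile(r"^(?:http\S+\s+)?(.*?) :-.*$")


def apply_posted(text):
    # Mirrors s/.../$2/ : replace the first match with capture group 2
    return POSTED.sub(r"\2", text, count=1)


def apply_variant(text):
    # Mirrors s/.../$1/ : replace the first match with capture group 1
    return VARIANT.sub(r"\1", text, count=1)


samples = [
    "Lorem ipsum dolor :-lorem ipsum ",
    "https://xyz.abc/qwe Lorem ipsum dolor :-lorem ipsum ",
]

for s in samples:
    # repr() makes any leading or trailing space visible
    print(repr(apply_posted(s)), repr(apply_variant(s)))
```

If the leading space matters, the variant could be tried in the exiftool argument in place of the posted pattern (with $1 as the replacement), ideally on copies of a few files first.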

bartek