I need to remove selected keywords from image EXIF data: some of my own that I no longer use, and some on images I borrowed from others for my website.
Is there an easy way? Or a hard way?
To remove keywords "one" and "two" from all images in directory "DIR":
exiftool -keywords-=one -keywords-=two DIR
(that is assuming that they are stored in the IPTC keywords. Read FAQ 2 and 3 (https://exiftool.org/faq.html) to figure out the tag name you should actually use.)
- Phil
Thanks for the tip. I presume the entire keyword needs to be entered in the command line. Mine are dotted as in "pinacate.volcano.lava flow.biggest" Would just "-keywords-=pinacate" work?
I am re-introduced to the utility of batch files.
The keyword needs to be an exact match.
- Phil
I tried with one keyword and was informed that exiftool looked at 92 files but declined to change any of them. Where is this documented? I can try to figure it out myself and quit bugging you.
This is documented in the application documentation (https://exiftool.org/exiftool_pod.html) (which you see when you run exiftool with no arguments). Look at the section under -TAG[+-]=[VALUE]. Also, see FAQ number 17 (https://exiftool.org/faq.html#Q17).
If it didn't change any files, then there were no keywords that matched the ones you tried to remove. Also see FAQ number 3 if you are still having no luck with this.
- Phil
This makes no sense to me. I am using Friedemann Schmidt's Geosetter as a GUI. I brought up the edit data window for an image, copied the keyword from the keywords box and pasted it into my bat builder (Notepad++). Other images in the directory have the same keywords (added in a batch). The bat file looks like this:
ECHO OFF
CD /D H:\images\TecoRestored
exiftool -keywords-=Pinacate.Monogenetic.Fresh%20cones.Tecolote.Cone H:\images\TecoRestored
PAUSE
There is a space in one part (Fresh cones) so I put in %20 for the space character. Keywords are separated with semi-colons but leaving it or removing it makes no difference.
The result is:
1 directories scanned
0 image files updated
92 image files unchanged
I can't figure it out.
The problem I see is that you don't have keywords, you have a single keyword which encompasses all of "pinacate.volcano.lava flow.biggest". Unless all the files in the directory have the exact same keyword, it wouldn't be easy to remove just "pinacate" as a single command.
If you want to convert the single grouped keyword into individual keywords, you can use the -sep option as mentioned in the FAQ that Phil pointed to. In this case
exiftool -sep "." -tagsfromfile @ -keywords H:\images\TecoRestored
That will separate the keywords for you. You can then use
exiftool -keywords-=pinacate H:\images\TecoRestored
to remove pinacate from the keywords.
If you want to remove a keyword with a space, such as "lava flow", you can put quotes around the word
exiftool -keywords-="lava flow" H:\images\TecoRestored
I would suggest making sure you have a backup of your images before trying full directory operations just in case you remove data that you want to keep. Also, separating the keywords like that might not be readable in the original program that put in the keywords that way.
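As a rough illustration of why the exact-match rule matters here, the following Python sketch models the keyword tag as a plain list of strings. It is not ExifTool code: remove_keyword and split_grouped are hypothetical helper names, and the plain string comparison is a simplifying assumption about how values are matched.

```python
def remove_keyword(keywords, unwanted):
    """Drop only items equal to `unwanted` (models -keywords-=VALUE)."""
    return [k for k in keywords if k != unwanted]


def split_grouped(keywords, sep="."):
    """Split each item at `sep` (models rewriting the tag with -sep ".")."""
    return [part for k in keywords for part in k.split(sep)]


grouped = ["pinacate.volcano.lava flow.biggest"]

# One grouped keyword: "pinacate" is only a substring, so nothing is removed.
print(remove_keyword(grouped, "pinacate"))

# After splitting, each part is its own keyword and can be matched exactly.
individual = split_grouped(grouped)
print(remove_keyword(individual, "pinacate"))
print(remove_keyword(individual, "lava flow"))
```

The first call leaves the list untouched, which corresponds to the "92 image files unchanged" result reported earlier in the thread.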
It occurs to me that, in this case, I might be ahead by cleaning out all the keywords and starting from an empty object. Can I do that?
To remove all IPTC keywords, use -keywords= on the command line. To remove XMP keywords, use -xmp:subject=. I'm not sure where your keywords are stored. You can remove both at the same time if you want.
- Phil
Quote: The keyword needs to be an exact match.
- Phil
Is this statement still valid ?
I would like to be able to filter the "Hierarchical Subject".
---- XMP-lr ----
Hierarchical Subject : gens|famille|famille Beatrice|Mike, places|Brasil|Minas Gerais|Itabira
in two different ways:
1. remove every keyword starting with "gens" (people).
2. remove the first level of the keyword everywhere else (in some tools this level is considered as category) because they have no interest as keyword. in the above example "places" should be removed and we would keep only "Brasil|Minas Gerais|Itabira".
What do you think ?
Thanks, Philippe
Hi Philippe,
Since this original post there is a new advanced formatting feature that allows you to do arbitrary filtering of tag values. Try this:
exiftool "-HierarchicalSubject<${HierarchicalSubject;s/^gens\|[^,]+(, )?//;s/(^|, )[^\|]+\|/$1/}" FILE
- Phil
There is one thing to watch for, though. Programs like Lightroom will also write the base keywords to iptc keywords and XMP subject. So in the case of the gens keyword Mike, it would have to be removed separately from keywords and subject.
Also, I realized that I needed to add a -sep ", " to the command so the list items get separated back properly again.
- Phil
Thank you very much Phil !
It works perfectly with the first example.
When I add more tags, it doesn't do the whole job.
Before:
Hierarchical Subject : construction|bâtiment|fenêtre, gens|famille|famille Beatrice|Mike, places|Brasil|Minas Gerais|Ipoema, style|portrait
After:
Hierarchical Subject : bâtiment|fenêtre, gens|famille|famille Beatrice|Mike, places|Brasil|Minas Gerais|Ipoema, style|portrait
Only Construction has gone.
Is there a place where I can find the syntax ? (My neurones will make some knots ... :-[)
I've not understood the place / the need of -sep ", " ...
@StarGeek, yes, that will be the next topic. And remove all "people" related simple keywords may be a challenge...
Unless it was possible to recreate them from the Hierarchical Subject after cleaning ... ;)
You're right, my expression doesn't work as intended. It is complicated by the fact that the list items are joined into a single string. But you are in luck: I have added a new (as-yet undocumented) feature to ExifTool 10.87 which allows the expression to work on individual list items by adding a "@" after the tag name. So the command may be simplified to this with ExifTool 10.87:
exiftool "-Hierarchicalsubject<${Hierarchicalsubject@;/^gens\|.*/ ? $_=undef : s/[^\|]+\|//}" -sep ", " FILE
The regular expression syntax is explained in a number of places (here is one (https://perldoc.perl.org/perlre.html)), but it is very powerful, so the documentation is very lengthy.
StarGeek often recommends Regular-Expressions.info (http://www.regular-expressions.info/) as a site to learn about regular expressions. And Regex101.com (https://regex101.com/) is a great site where you can test out your regex.
You can test your regular expressions in ExifTool using the -p option before actually rewriting the file:
exiftool -p "-${Hierarchicalsubject@;/^gens\|.*/ ? $_=undef : s/[^\|]+\|//}" FILE
- Phil
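For readers who want to see the difference between the two expressions without touching a file, here is a small Python sketch. It only imitates the regular expressions with Python's re module; filter_joined and filter_items are hypothetical names, and the sample value is the "Before" value posted above.

```python
import re

items = [
    "construction|bâtiment|fenêtre",
    "gens|famille|famille Beatrice|Mike",
    "places|Brasil|Minas Gerais|Ipoema",
    "style|portrait",
]


def filter_joined(value):
    """Model of the first expression, run on the joined string."""
    # "^gens" only matches when the gens item happens to come first.
    value = re.sub(r"^gens\|[^,]+(, )?", "", value, count=1)
    # Without a global flag, only the first match is replaced.
    return re.sub(r"(^|, )[^|]+\|", r"\1", value, count=1)


def filter_items(values):
    """Model of the "@" form: the expression runs once per list item."""
    kept = []
    for item in values:
        if re.search(r"^gens\|", item):
            continue  # corresponds to $_=undef: the item is dropped
        kept.append(re.sub(r"[^|]+\|", "", item, count=1))
    return kept


# Joined string: only "construction|" is stripped, as reported above.
print(filter_joined(", ".join(items)))

# Per item: gens is dropped and every first level is stripped.
print(", ".join(filter_items(items)))
```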
Your formula works great, even with version 10.86 !
I get now the desired result:
Hierarchical Subject : bâtiment|fenêtre, Brasil|Minas Gerais|Ipoema, portrait
I'll now try to do the same for Subject.
---- XMP-dc ----
Subject : Brasil, Mike, Ipoema, Minas Gerais, bâtiment, construction, famille, famille Beatrice, fenêtre, gens, places, portrait, style
Do you think that is possible to reuse the output of Hierarchical Subject ?
That would be the most generic way.
I should not be alone to feel lucky to be able use your tool ! Impressive tool indeed !
Thanks
You're right. This feature was actually added in version 10.53, but it didn't work with the -p option until 10.87.
After you have edited HierarchicalSubject, you can write the components back to Subject like this:
exiftool "-subject<hierarchicalsubject" -sep "|" FILE
But this will have to be done in a separate command.
- Phil
Sorry, I haven't checked everything.
After cleaning, ExifView shows this (which seems perfect):
Hierarchical Subject : bâtiment|fenêtre, Brasil|Minas Gerais|Ipoema, portrait
(I don't see difference using -sep ", " or not)
But if I look at xmp data from xnview I see different lines before but only one line after (see attached files).
So I guess something is not perfect yet.
I'm using an argument file (also attached).
There is a difference if you don't add the -sep option. Read FAQ 17 (https://exiftool.org/faq.html#Q17) for details. You need the -sep option to write it properly.
- Phil
I understand that -sep is needed but I would like to see the effect :)
Directly on command line:
"C:\Program Files (x86)\Exiftool\exiftool.exe" -P -overwrite_original "-Hierarchicalsubject<${Hierarchicalsubject@;/^gens\|.*/ ? $_=undef : s/[^\|]+\|//}" -sep ", "
that works as I can see (as on attached file) the 3 records instead of the unique one.
But the same command in the argument file (attached in the previous post) produces a single line, as if -sep ", " was ignored (a single record, as shown in the previous post's snapshot).
I've tried to put -sep ", " before or after the command but that doesn't change anything.
-sep ", "
-XMP:Hierarchicalsubject<${Hierarchicalsubject@;/^gens\|.*/ ? $_=undef : s/[^\|]+\|//}
or
-XMP:Hierarchicalsubject<${Hierarchicalsubject@;/^gens\|.*/ ? $_=undef : s/[^\|]+\|//}
-sep ", "
The corresponding command line with argument file is:
"C:\Program Files (x86)\Exiftool\exiftool.exe" -k -@ "c:\Documents\Darktable\ExifWeb.txt"
Where is my mistake ?
From the docs on arg files
"The file contains one argument per line (NOT one option per line -- some options require additional arguments, and all arguments must be placed on separate lines). "
Move the CommaSpace to a separate line and remove the double quotes.
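To make the "one argument per line" rule concrete, here is a minimal Python sketch that renders a list of arguments as arg-file text. to_argfile is a hypothetical helper, not part of ExifTool; it simply writes each argument literally on its own line, with no shell quoting.

```python
def to_argfile(args):
    """Render arguments as arg-file text: one per line, no quoting."""
    return "\n".join(args) + "\n"


args = [
    "-sep",
    ", ",  # the separator is its own argument, so it gets its own line
    r"-XMP:Hierarchicalsubject<${Hierarchicalsubject@;/^gens\|.*/ ? $_=undef : s/[^\|]+\|//}",
]

print(to_argfile(args), end="")
```

Note that the second line of the output is a comma followed by a space. The trailing space is easy to lose in an editor that trims whitespace at the end of lines.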
Thank you StarGeek, I'd seen that but not understood properly :-[
Works great now.
Philippe
Just a quick update. Thanks to the interesting links you've shared, I've understood that I could not iterate to get Subject from HierarchicalSubject in one pass.
As I wanted to use ExifTool integrated with darktable export, I found it simpler to use Lua to prepare the data. ExifTool then writes it to the exported images.
Works great !
Thank you again
New update
Based on the first formula you gave me I've succeeded in getting, not only the HierarchicalSubject, but also the Subject.
Here are the formulas:
-XMP:Subject<${HierarchicalSubject;s/((gens|piwigo)\|[^,]+, |, (gens|piwigo)\|[^,]+|$)//g;s/(^|, )[^\|]+\|/$1/g;s/\|/, /g;NoDups}
-XMP:HierarchicalSubject<${HierarchicalSubject;s/((gens|piwigo)\|[^,]+, |, (gens|piwigo)\|[^,]+|$)//g;s/(^|, )[^\|]+\|/$1/g}
That removes the HierarchicalSubject starting by gens or piwigo.
That removes the first level of HierarchicalSubject like places, ...
That transforms the HierarchicalSubject in Subject.
And last that removes the duplicates.
What a show !
Thanks again for the tool and your quick support.
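The two formulas can be tried out on plain strings with a short Python sketch. This is only an imitation using Python's re module, under the assumption that the list items are joined with ", "; clean_hierarchy and to_subject are hypothetical names, and dict.fromkeys stands in for the NoDups helper.

```python
import re


def clean_hierarchy(joined):
    """Drop gens/piwigo items, then strip the first level of the rest."""
    joined = re.sub(
        r"((gens|piwigo)\|[^,]+, |, (gens|piwigo)\|[^,]+|$)", "", joined
    )
    return re.sub(r"(^|, )[^|]+\|", r"\1", joined)


def to_subject(joined):
    """Flatten the cleaned hierarchy and remove duplicates, keeping order."""
    flat = clean_hierarchy(joined).replace("|", ", ")
    return ", ".join(dict.fromkeys(flat.split(", ")))


before = (
    "construction|bâtiment|fenêtre, gens|famille|famille Beatrice|Mike, "
    "places|Brasil|Minas Gerais|Ipoema, style|portrait"
)

print(clean_hierarchy(before))
print(to_subject(before))
```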
Hi Phil,
Going forward I've a new question.
I'm setting some IPTC this way:
-IPTC:Country-PrimaryLocationName < ${HierarchicalSubject;s/^.*places\|([^\|,]*).*/$1/ or $_=undef}
-IPTC:Province-State < ${HierarchicalSubject;s/^.*places\|([^\|,]*)\|([^\|,]*).*/$2/ or $_=undef}
-IPTC:City < ${HierarchicalSubject;s/^.*places\|([^\|,]*)\|([^\|,]*)\|([^\|,]*).*/$3/ or $_=undef}
-IPTC:Sub-location < ${HierarchicalSubject;s/^.*places\|([^\|,]*)\|([^\|,]*)\|([^\|,]*)\|([^\|,]*).*/$4/ or $_=undef}
But the 4th level is rarely present and I get an empty tag:
City : Ipoema
Sub-location :
Province-State : Minas Gerais
Country-Primary Location Name : Brasil
Is there a way not to create the tag in case of an undef value?
Thanks
Philippe
Hi Philippe,
Setting $_=undef will cause the tag not to be written. Somehow your Sub-location pattern must be matching with $4 being an empty string for this to get set to an empty string. This works for me on MacOS:
> exiftool a.jpg -hierarchicalsubject
Hierarchical Subject : construction|bâtiment|fenêtre, gens|famille|famille Beatrice|Mike, places|Brasil|Minas Gerais|Ipoema, style|portrait
> exiftool a.jpg '-iptc:sub-location<${HierarchicalSubject;s/^.*places\|([^\|,]*)\|([^\|,]*)\|([^\|,]*)\|([^\|,]*).*/$4/ or $_=undef}'
Warning: [minor] Advanced formatting expression returned undef for 'HierarchicalSubject' - a.jpg
Warning: No writable tags set from a.jpg
0 image files updated
1 image files unchanged
- Phil
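The effect of "or $_=undef" can be imitated in Python to check a pattern before writing anything. In this sketch, place_level is a hypothetical helper and None plays the role of undef; the pattern is built to match the four expressions posted above.

```python
import re


def place_level(value, level):
    """Return the `level`-th entry below "places", or None if absent."""
    pattern = r"^.*places" + r"\|([^|,]*)" * level + r".*"
    result, count = re.subn(
        pattern, lambda m: m.group(level), value, count=1
    )
    # In Perl, s/// is false when nothing matched, which triggers
    # "or $_=undef"; None plays that role here.
    return result if count else None


value = (
    "construction|bâtiment|fenêtre, gens|famille|famille Beatrice|Mike, "
    "places|Brasil|Minas Gerais|Ipoema, style|portrait"
)

for level in (1, 2, 3, 4):
    print(level, repr(place_level(value, level)))

# A trailing "|" after the third level lets the pattern match with an
# empty fourth group, which would give an empty string rather than None.
print(repr(place_level("places|a|b|c|", 4)))
```

The last call shows one way an empty Sub-location could arise: the pattern matches, but the fourth group captures nothing.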
Thank you Phil for the answer.
The simulator (the site you sent me the reference) doesn't return anything, but I'll continue to investigate.
I'm using windows 10 64bit.
I've made a try with:
-IPTC:Sub-location < ${HierarchicalSubject;s/^.*places\|([^\|,]*)\|([^\|,]*)\|([^\|,]*)\|([^\|,]*).*/$4/}
the result is, instead of empty:
Sub-location : construction|bâtiment|fenêtre, gens|famille|famille Beatrice|Mike, piwigo|2010s|2018|03-10 Ipoema, places|Brasil|Minas Gerais|Ipoema
If I understand properly that means the match hasn't been found...
Correct. So when you add " or $_=undef" when setting IPTC:Sub-location, then it shouldn't get written.
So I don't understand the problem. Everything seems to be working properly.
- Phil
Hi Phil,
I'm trying to find a way to add the IPTC Location tag on the 4th level of HierarchicalSubject below "places".
The following code returns a syntax error. I haven't found any example of such embedded {} but it's worth the try :)
exiftool -p "${HierarchicalSubject;s/(places\|[^\|,]*\|[^\|,]*\|[^\|,]*)/$1\|${XMP-iptcCore:Location}/}" D:\Documents\Images\Photos\2010\2017\20171118_Tiradentes\20171119_Tiradentes_004.xmp
Warning: syntax error for 'HierarchicalSubject' - D:/Documents/Images/Photos/2010/2017/20171118_Tiradentes/20171119_Tiradentes_004.xmp
Is there a way to achieve this ?
Thank you
Sorry for the delay in responding.
You can't use other tags by name in these expressions. Instead, you would need to do something like this:
exiftool -p "${HierarchicalSubject;my $loc=$self->GetValue('Location');s/(places\|[^\|,]*\|[^\|,]*\|[^\|,]*)/$1\|$loc/}"
But I don't have time to try this out to see if it would work. Also, it would get more complicated if there is more than one type of Location tag.
- Phil
Hi Phil,
This works great straight ahead !! :)
New things to learn ...
I've just to treat the case when Location is not present.
exiftool -p "${HierarchicalSubject;my $loc=$self->GetValue('Location');s/(places\|[^\|,]*\|[^\|,]*\|[^\|,]*)/$1\|$loc/}" D:\Documents\Images\Photos\2010\2017\20171118_Tiradentes\20171119_Tiradentes_004.xmp
genre|graine, piwigo|2010s|2017|11-18 Tiradentes, places|Brasil|Minas Gerais|Tiradentes|Pousada Alma Serra
Thank you so much.
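One possible way to handle a missing Location is sketched below in Python, purely as a model of the substitution. append_location is a hypothetical name, the "before" value is inferred from the output shown above, and returning the value unchanged when there is no location is just one choice among several.

```python
import re


def append_location(hierarchical, location):
    """Append `location` as a 4th level under places|a|b|c, if given."""
    if not location:
        return hierarchical  # assumption: leave the value as it is
    pattern = r"(places\|[^|,]*\|[^|,]*\|[^|,]*)"
    # A function replacement avoids treating backslashes in `location`
    # as escape sequences.
    return re.sub(
        pattern, lambda m: m.group(1) + "|" + location, hierarchical, count=1
    )


before = (
    "genre|graine, piwigo|2010s|2017|11-18 Tiradentes, "
    "places|Brasil|Minas Gerais|Tiradentes"
)

print(append_location(before, "Pousada Alma Serra"))
print(append_location(before, None))
```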
Hi Phil,
Thanks to your guidance, the following line adds XMP-iptcCore:Location to HierarchicalSubject if Location exists. Otherwise it leaves the file unchanged. Great!
exiftool -overwrite_original -tagsFromFile @ -m -if "$XMP-iptcCore:Location" "-HierarchicalSubject<${HierarchicalSubject;my $loc=$self->GetValue('Location');s/(places\|[^\|,]*\|[^\|,]*\|[^\|,]*)\|.*|$/$1\|$loc/}" -sep ", " D:\Documents\Images\Photos\2010\2017\20171118_Tiradentes\*.xmp
Works also inside argument file.
However, if I add the command -P (command line or argument file) the substitution doesn't happen. Is there a reason for that ?
Thank you
Adding -P shouldn't affect things unless you put it in between another option and its argument. What was the command you used?
- Phil
I put the command -P as the first command.
I've not tried in other position. If it behaves another way on other position I'll report it.
Philippe
Oh. That's the problem. You used -p, not -P. It must be capitalized.
Or, wait. What are you trying to do? -P preserves the file modification date/time when writing. -p specifies a print format string.
(The documentation explains all this.)
- Phil
Yes, I use -P (capitalized) to preserve the modification time.
For the other cases you helped me to solve, it was not an issue.
But in the first case below, file is not updated (location not written) while the second case works fine:
C:\Users\philippe>exiftool -P -overwrite_original -tagsFromFile @ -m -if "$XMP-iptcCore:Location" "-HierarchicalSubject<${HierarchicalSubject;my $loc=$self->GetValue('Location');s/(places\|[^\|,]*\|[^\|,]*\|[^\|,]*)\|.*|$/$1\|$loc/}" -sep ", " D:\Documents\Images\Photos\2010\2017\20171118_Tiradentes\*.xmp
1 files failed condition
2 image files updated
C:\Users\philippe>exiftool -overwrite_original -tagsFromFile @ -m -if "$XMP-iptcCore:Location" "-HierarchicalSubject<${HierarchicalSubject;my $loc=$self->GetValue('Location');s/(places\|[^\|,]*\|[^\|,]*\|[^\|,]*)\|.*|$/$1\|$loc/}" -sep ", " D:\Documents\Images\Photos\2010\2017\20171118_Tiradentes\*.xmp
1 files failed condition
2 image files updated
Could you add a -v2 to each command so I can see the differences?
Thanks.
- Phil
Hi Phil,
The 2 cases return the same data with -v2.
I've just understood my mistake. :-[
I thought the file was not updated because Notepad++ did not prompt me for change with -P (of course because the modification date is preserved !)
If I reload from disk manually the file is correctly modified.
Sorry for inconvenience...
Thank you again.
Great. Glad you figured it out
- Phil
Quote from: phweyland on April 08, 2018, 09:14:54 AM
I thought the file was not updated because Notepad++ did not prompt me for change with -P (of course because the modification date is preserved !)
I've had that happen with Notepad++ on occasion. I find that if I'm expecting a change and don't get the prompt, clicking something outside of Notepad++ so it isn't the active window, and then reselecting it will get it to prompt for reload.
I'm guessing that won't work in this case since it seems as if Notepad is checking the file modification date/time to decide whether or not to reload.
- Phil
Yep, it looks like. I had assumed that it was using some feature of the NTFS that can notify a program when there's a change in the file. I don't know too much about the details but I remember seeing some other program mention it.
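The confusion above is easy to reproduce. The Python sketch below rewrites a file and then restores its timestamps, which is roughly the effect -P has; a tool that decides whether to reload by comparing modification times would then see no change. The file name is made up, and rewrite_preserving_mtime is a hypothetical helper, not how ExifTool itself is implemented.

```python
import os
import tempfile


def rewrite_preserving_mtime(path, new_text):
    """Rewrite `path`, then restore its original timestamps."""
    before = os.stat(path)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(new_text)
    os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))


with tempfile.TemporaryDirectory() as folder:
    sidecar = os.path.join(folder, "example.xmp")  # hypothetical file name
    with open(sidecar, "w", encoding="utf-8") as handle:
        handle.write("old")
    mtime_before = os.stat(sidecar).st_mtime_ns

    rewrite_preserving_mtime(sidecar, "new")

    with open(sidecar, encoding="utf-8") as handle:
        content_changed = handle.read() == "new"
    mtime_unchanged = os.stat(sidecar).st_mtime_ns == mtime_before

    # The content differs, but a check based on mtime alone cannot tell.
    print(content_changed, mtime_unchanged)
```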
On MacOS.
I'm trying to do your example below with about 60 more specific keywords to remove, but keep the good ones.
The individual keyword removal example below works fine. I'd like to run exiftool once per file and remove all unwanted keywords. There are also duplicate keywords. The application used to write the IPTC data to the original jpg and tif files seemed to randomly add words from the Captions field to the Keywords field in the original. Many, not all. Trying to clean that up and leave the correct keywords.
I read the documentation -TAG[+-]<=DATFILE Write tag value from contents of file
Tried that with a list of the unwanted keywords in DATFILE, one per line.
exiftool '-keywords-<=keywords.txt' /Users/Neil/Desktop/CT02_58_60009\ copy.jpg
This only seems to work if there is only one keyword in keywords.txt, removing that one keyword.
Is the approach correct?
What if any is the format for multiple keywords in the contents of DATFILE (keywords.txt).
What is the correct approach?
exiftool -s /Users/Neil/Desktop/CT02_58_60009\ copy.jpg | grep Keywords
shows the following output:
Keywords : Grandma Hannie Hanson, Erling Warren, Hannie family, Erling, Erling
Advice would be appreciated. Thank you.
Quote from: Phil Harvey on December 15, 2012, 08:52:52 PM
To remove keywords "one" and "two" from all images in directory "DIR":
exiftool -keywords-=one -keywords-=two DIR
(that is assuming that they are stored in the IPTC keywords. Read FAQ 2 and 3 (https://exiftool.org/faq.html) to figure out the tag name you should actually use.)
- Phil
Quote from: Neil on June 13, 2022, 01:44:49 AM
I read the documentation -TAG[+-]<=DATFILE Write tag value from contents of file
Tried that with a list of the unwanted keywords in DATFILE, one per line.
exiftool '-keywords-<=keywords.txt' /Users/Neil/Desktop/CT02_58_60009\ copy.jpg
This only seems to work if there is only one keyword in keywords.txt, removing that one keyword.
Is the approach correct?
No. Doing that would tell exiftool to look for and remove a single tag with the entire contents of the file. There is no parsing done on a DATFILE. It is treated as a single value, as is.
What you want to do is create an ARGFILE and use the -@ (Argfile) option (https://exiftool.org/exiftool_pod.html#ARGFILE). Using your example keywords, if you wanted to remove "Erling Warren" and "Erling", your ARGFILE would be
-Keywords-=Erling Warren
-Keywords-=Erling
And then you would run the command
exiftool -@ keywords.txt /path/to/files/
Note that even though there's a space in "Erling Warren", you do not use quotes around it. The quotes are needed on the command line, but not used in an ARGFILE.
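With dozens of unwanted keywords, typing the ARGFILE by hand gets tedious. A small Python sketch like the following could generate the lines from a list. removal_args is a hypothetical helper, and the optional tags parameter is an assumption to cover keywords stored under more than one tag name.

```python
def removal_args(unwanted, tags=("Keywords",)):
    """Build one "-TAG-=VALUE" arg-file line per keyword and tag."""
    return [f"-{tag}-={word}" for word in unwanted for tag in tags]


unwanted = ["Erling Warren", "Erling"]

# Values are written as-is: no quotes, even when they contain spaces.
print("\n".join(removal_args(unwanted)))
```

Saving the printed lines to keywords.txt gives the same ARGFILE as the one shown above.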
Thanks.
Will this also eliminate the duplicate keywords?
It's pretty random which words got added how many times to which files.
Thanks again.
Quote from: Neil on June 13, 2022, 02:48:49 PM
Will this also eliminate the duplicate keywords?
It will remove all keywords that are an exact match. If there are duplicates of keywords that you are keeping, they will remain. FAQ #17 (https://exiftool.org/faq.html#Q17) may be useful in preventing duplicates in the first place.
Or you could be like me and remove duplicates when I'm done with that set of pictures and use the NoDups helper function (https://exiftool.org//exiftool_pod.html#Helper-functions).
It should also be noted that keywords might be in the XMP:Subject tag, depending upon the software you're using. That would require a similar operation.
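The difference between removing exact matches and removing duplicates can be seen in a few lines of Python. This is only a model of the behaviour described above, using the keyword list posted earlier in the thread; remove_exact and no_dups are hypothetical names.

```python
def remove_exact(keywords, unwanted):
    """Drop every item that exactly matches one of the unwanted values."""
    unwanted = set(unwanted)
    return [k for k in keywords if k not in unwanted]


def no_dups(keywords):
    """Keep the first copy of each keyword, preserving order."""
    return list(dict.fromkeys(keywords))


keywords = [
    "Grandma Hannie Hanson",
    "Erling Warren",
    "Hannie family",
    "Erling",
    "Erling",
]

after_removal = remove_exact(keywords, ["Erling Warren"])
print(after_removal)           # both copies of "Erling" are still present
print(no_dups(after_removal))  # a single "Erling" remains
```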
Thank you StarGeek
Excellent information and guidance.
The tip about the IPTC:Keywords AND don't forget the XMP:Subject tags was spot on. They were both there, both the same.
I've ended up with 71 wrong keywords so far, and I haven't scanned the whole library yet. That's a lot of cleaning up.
NoDups I will look into. I'm more concerned about the wrong keywords than duplicates of the correct ones.
I suspect the duplicate keywords shouldn't impact storage, function or performance that much.
Again, thank you.
Quote from: Neil on June 13, 2022, 10:25:26 PM
NoDups I will look into. I'm more concerned about the wrong keywords than duplicates of the correct ones.
I suspect the duplicate keywords shouldn't impact storage, function or performance that much.
I haven't done much testing with various programs, but I agree the impact should be minimal.
In the actual file, it will only add a few bytes. In a simple image viewer, should be no difference. Maybe a bit of impact in a Digital Asset Management (DAM) program. It would be hoped for that when a DAM program re-writes data to the file it would consolidate the keywords, but if I recall correctly, I checked once using Adobe Bridge and it did not do so.
Quote from: Phil Harvey on December 15, 2012, 08:52:52 PM
To remove keywords "one" and "two" from all images in directory "DIR":
exiftool -keywords-=one -keywords-=two DIR
(that is assuming that they are stored in the IPTC keywords. Read FAQ 2 and 3 (https://exiftool.org/faq.html) to figure out the tag name you should actually use.)
- Phil
If I specify a folder, images, but there are subfolders, will this command continue recursively through the folder hierarchy or do I need to add something to achieve this, thanks
You have to add the -r (-recurse) option (https://exiftool.org/exiftool_pod.html#r-.--recurse).