Hi there, I am using exiftool to recursively go through my directories and photo files and set the subject tag based on the words found in the directory name and set the person in image tag based on the words found in the photo filename.
The command is:
exiftool -r -m -addtagsfromfile @ '-subject<${directory;s(.*Photo.*?/)()}' '-personinimage<${filename;s(\S*\..*)()}' -api listsplit='[ /]' -overwrite_original DIR
This command will yield the keywords Holidays, Egypt, 1997 and the person in image Pyramids, Me, Wife for photo file /Users/me/Photos/Holidays/Egypt 1997/Pyramids Me Wife IMG_1099.JPG
(see my earlier post: https://exiftool.org/forum/index.php?topic=12435.msg67286#msg67286)
I now would like to enhance this command to have it match words between brackets as keywords, while still matching words that are not between brackets as person in image.
So the first step to achieve this is to have the regex in the command ignore any word between brackets (that is just before the original filename) but at the same time have it take the other words into account.
I have rewritten my command line to this:
exiftool -r -m -addtagsfromfile @ '-subject<${directory;s(.*Photo.*?/)()}' '-personinimage<${filename;s(\s*(\(.*\))?\s*\S*\..*)()}' -api listsplit='[ /]' -overwrite_original DIR
This command should yield the keywords: Holidays, Egypt, 1997; and the persons in image: Me, Wife for photo file /Users/me/Photos/Holidays/Egypt 1997/Me Wife (Pyramids) IMG_1099.JPG
(note that, once this works, I will have to further enhance the command to have it match the words between brackets once more to add them to the keywords too)
I have tested the regex on https://www.regexplanet.com/advanced/perl/index.html and it seems to work correctly: I see that the words Me and Wife are extracted to $.
However, it does not work in exiftool. When I use this new command, it still extracts the keywords correctly but the personinimage tag remains empty.
I must have made a mistake somewhere, but where?
What is wrong about the regex used for the personinimage tag?
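For anyone who wants to try the two expressions outside exiftool, here is a minimal sketch in Python. It assumes Python's re module treats these particular patterns the same way Perl does (nothing Perl-specific is used in them), and the helper name strip_with is made up for illustration. count=1 stands in for a Perl s/// without the g modifier.

```python
import re

# Hypothetical helper: apply one Perl-style s(pattern)() to a filename.
# count=1 mirrors s/// without /g (only the first match is replaced).
def strip_with(pattern, filename):
    return re.sub(pattern, "", filename, count=1)

OLD = r"\S*\..*"                  # expression from the first command
NEW = r"\s*(\(.*\))?\s*\S*\..*"   # expression from the rewritten command

# repr() makes leading/trailing spaces visible.
print(repr(strip_with(OLD, "Pyramids Me Wife IMG_1099.JPG")))
print(repr(strip_with(NEW, "Me Wife (Pyramids) IMG_1099.JPG")))
```

If the second line prints 'Me Wife', the expression itself does what was intended, and the empty tag would have to come from something else in the setup.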
It seems to work for me
C:\>exiftool -P -overwrite_original -api listsplit="[ /]" -addtagsfromfile @ "-subject<${directory;s(.*Photo.*?/)()}" "-personinimage<${filename;s(\s*(\(.*\))?\s*\S*\..*)()}" "Y:\!temp\aaaa\Photos\Holidays\Egypt 1997\Me Wife (Pyramids) IMG_1099.JPG"
1 image files updated
C:\>exiftool -G1 -a -s -xmp:all "Y:\!temp\aaaa\Photos\Holidays\Egypt 1997\Me Wife (Pyramids) IMG_1099.JPG"
[XMP-x] XMPToolkit : Image::ExifTool 12.25
[XMP-iptcExt] PersonInImage : Me, Wife
[XMP-dc] Subject : Holidays, Egypt, 1997
The only change I would make would be to maybe use BaseName instead of Filename if you're using ver 12.22+. That would simplify the regex, since you would no longer need to worry about the extension. Saving the text between the parentheses will require a separate tag copy. Something like
'-Subject<${BaseName;m/\((.*?)\)/;$_=$1}'
C:\>exiftool -P -overwrite_original -api listsplit="[ /]" -addtagsfromfile @ "-subject<${directory;s(.*Photo.*?/)()}" "-personinimage<${Basename;s(\s*(\(.*\))?\s*\S*$)()}" "-Subject<${BaseName;m/\((.*?)\)/;$_=$1}" "Y:\!temp\aaaa\Photos\Holidays\Egypt 1997\Me Wife (Pyramids) IMG_1099.JPG"
1 image files updated
C:\>exiftool -G1 -a -s -xmp:all "Y:\!temp\aaaa\Photos\Holidays\Egypt 1997\Me Wife (Pyramids) IMG_1099.JPG"
[XMP-x] XMPToolkit : Image::ExifTool 12.25
[XMP-iptcExt] PersonInImage : Me, Wife
[XMP-dc] Subject : Holidays, Egypt, 1997, Pyramids
Hi Stargeek, thanks for your answer. On my iMac, it really does not work. But when I tried out the same command on my MacBook, indeed it worked. Both iMac and MacBook are using the same exiftool version (12.24). At this point I am not sure why there is a difference in outcome between the two. Is that a known issue? Anyway, for now I can proceed on my MacBook and will also take your other comments into account in further enhancing my command.
Could it be the perl version? I have compared both of my computers, since they are on different versions of Mac OSX too.
iMac --> v5.18.4 (OSX: Mojave)
MacBook --> 5.28.2 (OSX: Big Sur)
I would think that regex would be in the Perl core and shouldn't change so much between versions.
Try comparing the output of
exiftool -ver -v
Beyond that, there's not much more I can help with as I don't use a Mac.
I did that and found that the same modules are installed but all versions are different between the two.
A quick check in the perl version history tells me that in perl 5.26 the following was changed in relation to regex: New regular expression modifiers and capture groups
I think that the parentheses ( and ) are part of the capture groups and may indeed influence the outcome of this regex on my iMac which has perl 5.18.
Yeah, this part (\(.*\))? does capture (Pyramids), but that shouldn't change anything, since you don't do anything with the capture.
Maybe change it to a non-capture grouping
(?:\(.*\))?
and see if that helps.
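As a quick sanity check of the idea that the capture should not matter, here is a small Python sketch (Python's re standing in for Perl, which is an assumption; a non-capturing group is written (?:...) in both). It runs the same substitution with a capturing and a non-capturing group on the sample filename.

```python
import re

NAME = "Me Wife (Pyramids) IMG_1099.JPG"

capturing = r"\s*(\(.*\))?\s*\S*\..*"        # group captures "(Pyramids)"
non_capturing = r"\s*(?:\(.*\))?\s*\S*\..*"  # same match, nothing captured

# count=1 mirrors s/// without /g.
with_capture = re.sub(capturing, "", NAME, count=1)
without_capture = re.sub(non_capturing, "", NAME, count=1)

print(repr(with_capture), repr(without_capture))
```

Both results should be identical, since the replacement never refers to the captured text.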
This was very difficult to understand, because people keep saying 'keywords' instead of $Subject, which is confusing to me.
Also, I don't know anything about the newer regex version being used, because I'm still using exiftool version 12.11.
So now I think I finally understand that you sometimes want to match (words) inside $Filename for $Subject?
But in all of the commands, $Subject only takes words from $Directory, so (Pyramids) could never be matched.
Also, I'm guessing that $PersonInImage should always exclude (words) inside $Filename?
If those guesses are right, here are some regexes to experiment with that still work for me on the older version...
-Subject'<${Directory;s|.*Photo.*?/||}${Filename;s|^[^()]*$||;s|[^()]*?\((.+?)\)[^()]*| $1|g}'
-PersonInImage'<${Filename;s| *\(.*?\) *| |g;s/ \S*\.[^.]*$//}'
@Luuk2005 Initially I indeed meant "Keywords"; however, Phil pointed out that Subject is more often used nowadays, so I switched to that. My intention, in any case, is to extract the words from the folder path, use those words as (ahem) keywords and put them in the Subject XMP tag. The words found in the filename should go into the PersonInImage XMP tag. But at the same time, any word between parentheses in the filename should be added as a keyword in the Subject XMP tag again.
For example:
- /Holidays/Egypt 1997/Me Wife IMG_1099.JPG --> must yield Subject: Holidays, Egypt, 1997; and PersonInImage: Me, Wife
- /Holidays/Egypt 1997/Me Wife (Pyramids) IMG_1099.JPG --> must yield Subject: Holidays, Egypt, 1997, Pyramids; and PersonInImage: Me, Wife
Furthermore, I got exiftool running with perl version 5.32.1 now so I can work around the regex problems.
@Stargeek Thank you for proposing an additional tag copy. It works, but only if the photo file does have a word between parentheses. If it does not, it copies the text from the additional tag copy command literally to the Subject XMP tag. Like this:
- /Holidays/Egypt 1997/Me Wife IMG_1099.JPG --> will yield Subject: Holidays, Egypt, 1997, m, \((.*?)\), ;$_=$1; and PersonInImage: Me, Wife
- /Holidays/Egypt 1997/Me Wife (Pyramids) IMG_1099.JPG --> will yield Subject: Holidays, Egypt, 1997, Pyramids; and PersonInImage: Me, Wife
I am trying a lot of different things, but cannot work around this problem.. Do you have any suggestion?
Quote from: Luuk2005 on May 14, 2021, 02:32:45 PM
This was very difficult to understand, because people keep saying 'keywords' instead of $Subject, which is confusing to me
There's the more generic term "Keywords" or "Tags" which is what programs like Lightroom or Windows would display when entering data. I tend to call this a Property, which I take from the fact that image metadata can be found under the Properties window.
Then there are the actual tags, Keywords and Subject. I'll always use the teletype (monospace) formatting to mark these so they're more distinct.
Most people who don't get into the minute details of metadata are just going to use "Keywords", as that is what it will say on the program they use to input the data.
Quote from: brightwolf on May 18, 2021, 03:57:14 PM
@Stargeek Thank you for proposing an additional tag copy. It works, but only if the photo file does have a word between parentheses. If it does not, it copies the text from the additional tag copy command literally to the Subject XMP tag. Like this:
Ooops, you did make the parenthesis enclosed word optional in your original post, but I forgot about it.
Offhand I can offer this
'-Subject<${BaseName;$_=(m/\((.*?)\)/) ? $1 :undef}'
You will get a minor warning, "Advanced formatting expression returned undef", for files that don't have the parentheses, but that can be ignored or you can suppress it with the -m (-ignoreMinorErrors) option (https://exiftool.org/exiftool_pod.html#m--ignoreMinorErrors).
C:\>exiftool -P -overwrite_original -subject= "-Subject<${BaseName;$_=(m/\((.*?)\)/) ? $1 :undef}" Y:\!temp\aaa
Warning: [minor] Advanced formatting expression returned undef for 'BaseName' - Y:/!temp/aaa/Me Wife IMG_1099.jpg
Warning: No writable tags set from Y:/!temp/aaa/Me Wife IMG_1099.jpg
1 directories scanned
1 image files updated
1 image files unchanged
C:\>exiftool -g1 -a -s -subject Y:/!temp/aaa/
======== Y:/!temp/aaa/Me Wife (Pyramids) IMG_1099.JPG
---- XMP-dc ----
Subject : Pyramids
======== Y:/!temp/aaa/Me Wife IMG_1099.jpg
1 directories scanned
2 image files read
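The logic of that conditional expression can be sketched in Python as follows. This is only an illustration of the "return the captured word or nothing" idea; the function name parenthesized_word is hypothetical, and None plays the role of Perl's undef.

```python
import re

# Hypothetical helper mirroring  $_ = (m/\((.*?)\)/) ? $1 : undef
def parenthesized_word(basename):
    match = re.search(r"\((.*?)\)", basename)
    # Return the captured text only when the pattern matched at all.
    return match.group(1) if match else None

print(parenthesized_word("Me Wife (Pyramids) IMG_1099"))
print(parenthesized_word("Me Wife IMG_1099"))
```

The point of the explicit test is that a file without parentheses yields "no value" instead of leftover text, which is what lets exiftool skip the copy for that file.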
Greetings everyone! All I can say is that with exiftool v12.11, the two regexes I gave behave as described below.
Unfortunately, I don't want to update, because I'm afraid the new regex version might break some of my expressions.
With v12.11, this was my command line (except using double-quotes for Windows)
exiftool -r -m -overwrite_original -AddTagsFromFile @ -api listsplit='[ /]' -if '$Directory=~/Photo/' -Subject'<${Directory;s|.*Photo.*?/||}${Filename;s|^[^()]*$||;s|[^()]*?\((.+?)\)[^()]*| $1|g}' -PersonInImage'<${Filename;s| *\(.*?\) *| |g;s/ \S*\.[^.]*$//}' '.'
So for pathnames ending like...
Photos/Holidays/Egypt 1997/Me Wife IMG_1099.jpg
Subject: Holidays, Egypt, 1997
PersonInImage: Me, Wife
Photos/Holidays/Egypt 1997/Me Wife (Pyramids) IMG_1099.jpg
Subject: Holidays, Egypt, 1997, Pyramids
PersonInImage: Me, Wife
Photos/Holidays/Egypt 1997/Me (reading) Wife (eating) Joe (Pyramids) IMG_1099.jpg
Subject: Holidays, Egypt, 1997, reading, eating, Pyramids
PersonInImage: Me, Wife, Joe
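To follow what the two expressions do step by step, here is a rough Python re-creation. It is a sketch under assumptions: Python's re is used in place of Perl, the function names are invented, and empty items are dropped after splitting for readability (how exiftool's listsplit treats empty items was not checked here).

```python
import re

def subject_words(directory, filename):
    d = re.sub(r".*Photo.*?/", "", directory, count=1)   # s|.*Photo.*?/||
    f = re.sub(r"^[^()]*$", "", filename, count=1)       # no parentheses: drop whole name
    f = re.sub(r"[^()]*?\((.+?)\)[^()]*", r" \1", f)     # keep only the (words)
    # listsplit='[ /]' : split on spaces and slashes
    return [w for w in re.split(r"[ /]", d + f) if w]

def person_words(filename):
    f = re.sub(r" *\(.*?\) *", " ", filename)            # remove every (word)
    f = re.sub(r" \S*\.[^.]*$", "", f, count=1)          # remove " IMG_1099.jpg"
    return [w for w in re.split(r"[ /]", f) if w]

name = "Me (reading) Wife (eating) Joe (Pyramids) IMG_1099.jpg"
print(subject_words("Photos/Holidays/Egypt 1997", name))
print(person_words(name))
```

The first substitution on the filename is what keeps a name without any parentheses from leaking into Subject.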
Hi Luuk, thanks for your contribution! And indeed, your regex works real well and offers the added benefit that all words between parentheses are added as subject fields (keywords), no matter their position in the photo filename. With my solution, the words between parentheses had to be just before the basename; if they were not, other words following after them would be omitted.
With regard to the perl version, I did not update the version but installed a later version alongside the default version. Then, I start up exiftool as follows:
perl /usr/local/bin/perl5.32.1 exiftool ...
@Luuk2005 I am testing the regex in the exiftool command a bit more extensively now, and have come across a problem. For the personinimage tag, if the filename contains a person's name (for example: Me Wife IMG_1099.JPG) then it works. But if there's no such name in the filename (for example: IMG_1099.JPG) then the filename (IMG_1099.JPG) gets added as the person in the image. How could I extend the regex to solve that problem? It should match optionally: if there's no name, it should not yield anything. Any help appreciated!
Yes, I was going to suggest changing Space into Space* to make the space optional, but \s* also works, nice work!
If you get more trouble from $PersonInImage, you can always use the -p option for troubleshooting, like...
exiftool -p '$Filename ------- ${Filename;s| *\(.*?\) *| |g;s/ *\S*\.[^.]*$//}' DIR
For example, your filenames might contain "bad words" like Pyramids or Egypt outside parentheses, which should never go into $PersonInImage.
You could create a 'bad-words list' and then test $PersonInImage like...
exiftool -p '$Filename ------- ${Filename; s| *\(.*?\) *| |g; s/ *(Pyramids|Egypt|MoreBadWords) */ /g; s/ *\S*\.[^.]*$//; s/(^ *| *$)//}' DIR
It might be better to create a "good words" list instead, but really I'm just trying to show how I usually do the troubleshooting.
Usually I'm not smart enough to foresee the problems ahead of time, but the -p option reveals them for me.
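The same kind of side-by-side check can be done without touching any files. Below is a small Python sketch (an analogue only, with made-up names) that compares the expression requiring a space before the camera filename with the one where the space is optional.

```python
import re

REQUIRED_SPACE = r" \S*\.[^.]*$"    # needs a space before "IMG_1099.JPG"
OPTIONAL_SPACE = r" *\S*\.[^.]*$"   # the space may be absent

def person_part(filename, tail):
    s = re.sub(r" *\(.*?\) *", " ", filename)   # remove (words) first
    return re.sub(tail, "", s, count=1)         # then remove the camera name

for name in ("Me Wife IMG_1099.JPG", "IMG_1099.JPG"):
    print(name, "-------",
          repr(person_part(name, REQUIRED_SPACE)),
          repr(person_part(name, OPTIONAL_SPACE)))
```

With the required space, a bare IMG_1099.JPG has nothing to match and is passed through unchanged; with the optional space it is reduced to an empty string.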
Hi Luuk, actually my solution was not really working performance-wise and I am back at your solution. There's only one problem left and that's when there's no additional word in the filename, in that case the filename gets copied to the personinimage tag.
For example:
Holidays/Egypt 1997/Me Wife IMG_1099.JPG --> subject=Holidays, Egypt, 1997 personinimage=Me, Wife
Holidays/Egypt 1997/Me Wife (Pyramids) IMG_1099.JPG --> subject=Holidays, Egypt, 1997, Pyramids personinimage=Me, Wife
Holidays/Egypt 1997/IMG_1099.JPG --> subject=Holidays, Egypt, 1997 personinimage=IMG_1099.JPG
When run from my iMac it works with the \s+ addition, but that's veeeeeeery slow since the files are on my NAS.
When run from the NAS it works the way I explained in the example.
How could I work around this problem?
[EDIT] Regex is driving me crazy. It *does* seem to work, also on my NAS. I must have made some other (unintended) change.
So this is the final, working, command (credits and big thanks to luuk2005): exiftool -r -m -overwrite_original -AddTagsFromFile @ -api listsplit='[ /]' -if '$Directory=~/Photo/' -Subject'<${Directory;s|.*Photo.*?/||}${Filename;s|^[^()]*$||;s|[^()]*?\((.+?)\)[^()]*| $1|g}' -PersonInImage'<${Filename;s| *\(.*?\) *| |g;s/\s*\S*\.[^.]*$//}' DIR
Also, I wanted to ask: where can I find more information about this regex formatting? I understand parts of it, but still cannot grasp the use of s| and |g; and //. No clue. Much appreciated!
Yes, I was going to say that your regex with \s* works perfectly to make the space optional.
So I'm guessing that PersonInImage=IMG_1099.JPG was left over from another experiment.
I don't know where the regex documentation is, but the Perl documentation probably gives the best explanations.
For regex substitutions, these are some of the formats that exiftool seems to accept...
s(match)(replace)modifiers;
s/match/replace/modifiers;
s|match|replace|modifiers; (| can be replaced by many other non-alphanumeric characters!)
I learned most of my regex from experimenting with sed.exe, so I always prefer the s///; format.
But sometimes I use s|||; when I need to put / in my match, because it's easier on the eyes than \/.
The 'g' modifier means "global", i.e. replace all matches instead of only the first one.
There is also 'i' for "case-insensitive" and 'r' for non-destructive substitution (it returns the result and doesn't modify $_).
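For readers more at home in Python, the modifiers map roughly as shown below. This is only an analogy, not exiftool or Perl code: count=1 plays the role of a substitution without g, leaving count out behaves like g, and re.IGNORECASE behaves like i. Python strings are immutable, so re.sub always returns a new string, which is close in spirit to r.

```python
import re

text = "egypt Egypt egypt"

first_only = re.sub(r"egypt", "X", text, count=1)             # like s/egypt/X/
all_matches = re.sub(r"egypt", "X", text)                     # like s/egypt/X/g
any_case = re.sub(r"egypt", "X", text, flags=re.IGNORECASE)   # like s/egypt/X/gi

print(first_only)
print(all_matches)
print(any_case)
print(text)  # the original string is never modified
```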
Regarding this command and its regex' I have one more question: how could I enhance it to ignore words between brackets?
For example:
/Users/me/Photos/Holidays/Egypt 1997 [comment]/Me Wife (Pyramids) IMG_1099.JPG
Would yield Subject: Holidays, Egypt, 1997, Pyramids; and PersonInImage: Me, Wife; But not: comment
I have tried various versions of the command, including s|\[.*/]*| |; and s|^[^\[\]]*$| |; and s|\[.*\]| $1|g; but I cannot get it to work: the comment keeps on appearing as a keyword.
Any suggestions to point me in the right direction?
Your first expression was good except for the typo of / instead of \ right before ], so otherwise it would probably work perfectly!
To avoid matching [words] in $Directory, I would make the second s///; like... s/ *\[.*?\] */ /g;
This replaces each [word] with a single space.
But with [word1] [word2] or a trailing [lastword], you can add some final s///;'s like... s/ +/ /g; s/ $//g;
These collapse runs of spaces into one space and remove the trailing space if $Directory ends with a [lastword].
To preview the 'keywords' coming from $Directory, you could experiment like...
-p '$Directory --- ${Directory; s|.*Photo.*?/||; s/ *\[.*?\] */ /g; s|[/ ]+| |g; s/ $//g}'
(The third s||| also replaces "/" with a space for readability)
If that looks ok, then try a command like...
exiftool -r -m -overwrite_original -AddTagsFromFile @ -api listsplit=' ' -if '$Directory=~/Photo/'
-Subject'<${Directory;s|.*Photo.*?/||;tr|/| |;s/ *\[.*?\] */ /g;s/ +/ /g;s/ $//}${Filename;s|^[^()]*$||;s|[^()]*?\((.+?)\)[^()]*| $1|g}'
-PersonInImage'<${Filename;s| *\(.*?\) *| |g;s/ *\S*\.[^.]*$//}' DIR
Photos/Holidays/[xxx]Egypt[xxx]1997 [xx1] [xx2]/Deep/Deepest/abc.jpg
Subject: Holidays, Egypt, 1997, Deep, Deepest
PersonInImage:
Photos/Holidays/[xxx]Egypt[xxx]1997 [xx1] [xx2]/Me (eating) Wife (reading) Joe (Pyramids) IMG_1099.jpg
Subject: Holidays, Egypt, 1997, eating, reading, Pyramids
PersonInImage: Me, Wife, Joe
Photos/Holidays/[xxx]Egypt[xxx]1997 [xx1] [xx2]/Me Wife (Pyramids) IMG_1099.jpg
Subject: Holidays, Egypt, 1997, Pyramids
PersonInImage: Me, Wife
Photos/Holidays/[xxx]Egypt[xxx]1997 [xx1] [xx2]/Pyramids Me Wife IMG_1099.jpg
Subject: Holidays, Egypt, 1997
PersonInImage: Pyramids, Me, Wife
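The chain of substitutions on $Directory can be traced with a short Python sketch. Again this is an analogue under assumptions (Python's re instead of Perl, an invented function name), meant only to show what each step leaves behind.

```python
import re

def directory_words(directory):
    s = re.sub(r".*Photo.*?/", "", directory, count=1)  # s|.*Photo.*?/||
    s = s.replace("/", " ")                              # tr|/| |
    s = re.sub(r" *\[.*?\] *", " ", s)                   # s/ *\[.*?\] */ /g
    s = re.sub(r" +", " ", s)                            # s/ +/ /g
    s = re.sub(r" $", "", s, count=1)                    # s/ $//
    # listsplit=' ' : only the space is needed after tr
    return s.split(" ") if s else []

print(directory_words("Photos/Holidays/[xxx]Egypt[xxx]1997 [xx1] [xx2]"))
print(directory_words("Photos/Holidays/[xxx]Egypt[xxx]1997 [xx1] [xx2]/Deep/Deepest"))
```

Replacing each [word] with a space rather than with nothing is what keeps Egypt and 1997 apart in a folder name like [xxx]Egypt[xxx]1997.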
I forgot to include an s/// for $PersonInImage to remove 'BadWords' in the filename (like 'Pyramids' in the last example).
The first post has "Pyramids Me Wife IMG_1099.JPG", so I'm guessing that $PersonInImage should never match Pyramids?
If needed, this is one way to remove 'BadWords' from the filename for $PersonInImage...
exiftool -r -m -overwrite_original -AddTagsFromFile @ -api listsplit=' ' -if '$Directory=~/Photo/'
-Subject'<${Directory;s|.*Photo.*?/||;tr|/| |;s/ *\[.*?\] */ /g;s/ +/ /g;s/ $//}${Filename;s|^[^()]*$||;s|[^()]*?\((.+?)\)[^()]*| $1|g}'
-PersonInImage'<${Filename;s| *\(.*?\) *| |g;s/\b(Pyramids|Egypt|BadWords)\b//g;s/ +/ /g;s/(^ | $)//g;s/ *\S*\.[^.]*$//}' DIR
The results then look like...
Photos/Holidays/[xxx]Egypt[xxx]1997 [xx1] [xx2]/Deep/Deeper/Deepest/aaaa.jpg
Subject: Holidays, Egypt, 1997, Deep, Deeper, Deepest
PersonInImage:
Photos/Holidays/[xxx]Egypt[xxx]1997 [xx1] [xx2]/Me (eating) Wife (reading) Joe (Pyramids) IMG_1099.jpg
Subject: Holidays, Egypt, 1997, eating, reading, Pyramids
PersonInImage: Me, Wife, Joe
Photos/Holidays/[xxx]Egypt[xxx]1997 [xx1] [xx2]/Me Wife (Pyramids) IMG_1099.jpg
Subject: Holidays, Egypt, 1997, Pyramids
PersonInImage: Me, Wife
Photos/Holidays/[xxx]Egypt[xxx]1997 [xx1] [xx2]/Pyramids Me Wife IMG_1099.jpg
Subject: Holidays, Egypt, 1997
PersonInImage: Me, Wife
There might be much better ways to do this using other Perl commands, but all I really know is s/// and tr///.
I know there are Perl functions like split(), but I'm not good enough to rely on them, so I always use s/// instead.
This uses a whole lot of s///; so I think there might be better ways, especially for readability.
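A 'bad-words list' of this kind can be tried out in isolation with the following Python sketch. The word list and function name are examples only, and Python's re is assumed to behave like Perl for these patterns.

```python
import re

BAD_WORDS = ("Pyramids", "Egypt")  # example list, extend as needed

def person_in_image(filename):
    s = re.sub(r" *\(.*?\) *", " ", filename)                 # remove (words)
    s = re.sub(r"\b(" + "|".join(BAD_WORDS) + r")\b", "", s)  # remove bad words
    s = re.sub(r" +", " ", s)                                 # collapse spaces
    s = re.sub(r"(^ | $)", "", s)                             # trim the ends
    return re.sub(r" *\S*\.[^.]*$", "", s, count=1)           # remove camera name

for name in ("Pyramids Me Wife IMG_1099.jpg",
             "Me Wife (Pyramids) IMG_1099.jpg",
             "aaaa.jpg"):
    print(name, "-------", repr(person_in_image(name)))
```

The \b anchors keep the list from eating parts of longer words, so a name that merely contains one of the listed words is left alone.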
Thanks very much, Luuk2005! Your suggestion works like a charm.
My final command is now:
exiftool -r -m -overwrite_original -AddTagsFromFile @ -api listsplit='[ /]' -if '$Directory=~/Photos/' -Subject'<${Directory;s|.*Photos.*?/||;tr|/| |;s/ *\[.*?\] */ /g;s/ +/ /g;s/ $//}${Filename;s|^[^()]*$||;s|[^()]*?\((.+?)\)[^()]*| $1|g}' -PersonInImage'<${Filename;s| *\(.*?\) *| |g;s/ *\[.*?\] */ /g;s/\s*\S*\.[^.]*$//}' DIR
Nice work! I didn't even realize that you needed to remove [words] along with (words) inside the filename for $PersonInImage.
And my last -PersonInImage could never remove trailing spaces with $, because I forgot about the file extension!
So really, with [words] also removed for $PersonInImage, the -PersonInImage should change like...
exiftool -r -m -overwrite_original -AddTagsFromFile @ -api listsplit=' ' -if '$Directory=~/Photo/'
-Subject'<${Directory;s|.*Photo.*?/||;tr|/| |;s/ *\[.*?\] */ /g;s/ +/ /g;s/ $//}${Filename;s|^[^()]*$||;s|[^()]*?\((.+?)\)[^()]*| $1|g}'
-PersonInImage'<${Filename;s/\.[^.]*$//;s| *\(.*?\) *| |g;s/ *\[.*?\] */ /g;s/\b(Pyramids|Egypt|BadWords)\b//g;s/ +/ /g;s/(^ | $)//g;s/ *\S*$//}' DIR
The first s/// removes the extension, which lets $ work properly, and the last s/// no longer has to worry about the extension.
The s/ +/ /g; and s/(^ | $)//g; steps fix any problems coming from "[word1] [word2]" or [words] at the beginning or end of a filename.
So, for example, a filename like "IMG_1099(xxx).JPG" could never set a 'keyword' for $PersonInImage.
Also I forgot to mention that tr|/| |; converts every "/" to a space, so listsplit only needs the space.
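Why the position of the extension removal matters can be seen in a small Python comparison. This is a sketch with invented function names, using Python's re as a stand-in for Perl; the shared cleanup steps are the same in both variants and only the handling of the extension differs.

```python
import re

BAD = r"\b(Pyramids|Egypt)\b"  # example bad-words list

def cleanup(s):
    s = re.sub(r" *\(.*?\) *", " ", s)   # remove (words)
    s = re.sub(r" *\[.*?\] *", " ", s)   # remove [words]
    s = re.sub(BAD, "", s)               # remove bad words
    s = re.sub(r" +", " ", s)            # collapse spaces
    return re.sub(r"(^ | $)", "", s)     # trim the ends

def extension_last(filename):
    # Extension still attached during cleanup, so " $" cannot trim before it.
    return re.sub(r" *\S*\.[^.]*$", "", cleanup(filename), count=1)

def extension_first(filename):
    stem = re.sub(r"\.[^.]*$", "", filename, count=1)   # s/\.[^.]*$//
    return re.sub(r" *\S*$", "", cleanup(stem), count=1)

for name in ("IMG_1099(xxx).JPG", "Me Wife [note] IMG_1099.jpg"):
    print(name, "-------", repr(extension_last(name)), repr(extension_first(name)))
```

For IMG_1099(xxx).JPG the extension-last variant leaves IMG_1099 behind, because the space created by removing (xxx) sits in front of .JPG instead of at the end of the string.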