Regex substitution and IPTC truncation

Started by chuft-captain, December 06, 2017, 12:05:38 PM


chuft-captain

Got it!

... and confirmed:
exiftool -if "$By-Line=~/AUTHOR*/i" .
and
exiftool -if "$By-Line=~/AUTHOR/i" .
give the same result.

exiftool -if "$By-Line=/AUTHOR/i" .
does not give a match.

My knowledge of Perl syntax would probably be enhanced if I was Linux based instead of Windows based.
(That's my excuse anyway!  8))

Thanks for your patience.
Cheers.
EXIFTOOL Documentation: https://exiftool.org/exiftool_pod.html

StarGeek

Quote from: chuft-captain on December 09, 2017, 10:34:12 PM
I'm assuming the use of single quotes is a Linux syntax, and Windows uses double quotes instead.

Yes, reverse single and double quotes on a Windows system.  So -if '$make eq "Canon" ' would be -if "$make eq 'Canon' " on Windows.  If you need to use double quotes as part of the statement inside of double quotes, you can try either (backslash)(double quote) \" or three double quotes """.  I've had success with a limited number of these but at a certain point Windows will choke on too many of them and you have to find some other way of doing it.
"It didn't work" isn't helpful. What was the exact command used and the output.
Read FAQ #3 and use that cmd
Please use the Code button for exiftool output

Please include your OS/Exiftool version/filetype

StarGeek

Quote from: chuft-captain on December 09, 2017, 11:05:23 PM
exiftool -if "$By-Line=~/AUTHOR*/i" .
and
exiftool -if "$By-Line=~/AUTHOR/i" .
give the same result.

Here you have to be careful.  The asterisk has a special meaning in Regex and isn't a wildcard as you would think of in Windows.  It says that the previous character will appear 0 or more times.  So that first listing will match "AUTHOR", but it will also match "AUTHO" (match previous character R zero times) and "AUTHORRRRRRRRR" (or more).  If you use any of these characters .^$*+?()[{\| in a regex and you want an exact match, you must escape them by placing a backslash \ in front of them.
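
To make the asterisk behaviour concrete, here is a minimal sketch in Python rather than Perl. It assumes Python's re module treats these simple patterns the same way Perl does (which holds for * and for backslash escaping), and the sample values are made up for illustration, not taken from anyone's files. It shows why the two commands can give the same result on a given set of images: an unanchored AUTHOR* matches everything AUTHOR matches, plus values such as "AUTHO".

```python
import re

# Hypothetical By-Line values, chosen only to exercise the patterns.
values = ["AUTHOR", "AUTHO", "AUTHORRRRRRRRR", "AUTH"]

star = re.compile(r"AUTHOR*", re.IGNORECASE)   # "AUTHO" followed by zero or more "R"
plain = re.compile(r"AUTHOR", re.IGNORECASE)   # the literal text "AUTHOR"

for v in values:
    # search() looks for the pattern anywhere in the value, like =~ in Perl.
    print(f"{v!r}: AUTHOR* -> {bool(star.search(v))}, AUTHOR -> {bool(plain.search(v))}")

# Escaping the asterisk makes it a literal character instead of a quantifier.
literal_star = re.compile(re.escape("AUTHOR*"))
print(bool(literal_star.search("AUTHOR*")))  # literal asterisk present
print(bool(literal_star.search("AUTHOR")))   # no literal asterisk
```

If none of your files happens to contain "AUTHO" without the final R, both patterns select exactly the same files, which is what the earlier test showed.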

chuft-captain

Good point.

I do understand very well the difference between * and + in regex, as I often do quite complex "search and destroy/replace" missions using regex in Notepad++, but thanks for the reminder.  :)

e.g. AUTHOR+ would match "AUTHOR" but not "AUTHO"
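
A quick way to check that claim, sketched in Python on the assumption that its re module handles + the same way Perl does for a pattern this simple (the test strings are illustrative only):

```python
import re

# + requires the previous character (R) at least once.
plus = re.compile(r"AUTHOR+")

for v in ["AUTHOR", "AUTHO", "AUTHORRR"]:
    print(f"{v!r}: AUTHOR+ -> {bool(plus.search(v))}")
```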

CC

chuft-captain

Just a quick question about your comment regarding -P above.

Because I don't like to mess with the sort order of my images (i.e. I like to sort by date last modified), I tend to leave all the "safety" switches like -P on my exiftool commands, even when they're read commands. I typically use a read command first to check that the search criteria select the right set of images (as in the example we've been discussing). Once satisfied, I convert the read command to a write command, like so...
exiftool -P -if "$By-Line=~/AUTHOR_BYLINE/i" -api "Filter=s/AUTHOR_BYLINE/<something else>/" *.jpg

The reason to leave that switch on the read command is simply to avoid accidentally forgetting it when I do the write operation (which would mess with my file dates) ... especially as I also use the -overwrite_original switch!
It seems to me that although obviously unnecessary for read ops, a -P has no adverse effect if you leave it on.

CC

StarGeek

Yes, you're correct.  It's perfectly fine to leave it there if that's how your workflow... works.

chuft-captain

One weird thing just happened when I tried this just now:

READ:
exiftool -P -m -overwite_original -By-Line -if "$By-Line=~/AUTHOR_BYLINE/" .
    1 directories scanned
  352 files failed condition
    1 image files read

Now I'll try to selectively update ONLY the (single) image from above which satisfies the -IF condition...
WRITE:
exiftool -P -m -overwite_original -By-Line -if "$By-Line=~/AUTHOR_BYLINE/" -api "Filter=s/AUTHOR_BYLINE/<something else>/" -tagsfromfile @ .
Ignored superfluous tag names or invalid options: -overwite_original ...
    1 directories scanned
  353 files failed condition
    0 image files read

Criteria exactly the same, but this doesn't work. (When write actions added, 0 image files read, no change to the tag value.)

Any ideas what I'm doing wrong?

If I remove the -IF condition, it appears to do the substitution correctly on the 1 image, but pointlessly updates all 353 images, instead of just the one.
exiftool -P -m -overwite_original -By-Line -api "Filter=s/AUTHOR_BYLINE/<something else>/" -tagsfromfile @ .
    1 directories scanned
  353 image files updated

Not sure why this message occurs either....but maybe doesn't matter:
Ignored superfluous tag names or invalid options: -overwite_original ...

PS.
Just noticed I'm now a "Jr. Member?". Yesterday I'm sure I was a "Newbie".
Why the sudden promotion? ... is it based on #posts?

StarGeek

Quote from: chuft-captain on December 10, 2017, 12:28:58 AM
Now I'll try to selectively update ONLY the (single) image from above which satisfies the -IF condition...
WRITE:
exiftool -P -m -overwite_original -By-Line -if "$By-Line=~/AUTHOR_BYLINE/" -api "Filter=s/AUTHOR_BYLINE/<something else>/" -tagsfromfile @ .
Ignored superfluous tag names or invalid options: -overwite_original ...
    1 directories scanned
  353 files failed condition
    0 image files read

Criteria exactly the same, but this doesn't work. (When write actions added, 0 image files read, no change to the tag value.)

Because you used the -api Filter option, the -if condition is no longer satisfied.  By the time By-Line is checked against AUTHOR_BYLINE, the filter has already changed its value to <something else>.  Filter is a very powerful tool and affects tags globally.  If you wish to target a tag individually, it's best to process just that tag.  For example, copy the tag to itself with the changes: "-By-Line<${By-Line;s/AUTHOR_BYLINE/<something else>/}" or, if you use the filter, list the tag after the copy option: -TagsFromFile @ -By-Line

When you use Filter (or -d or similar options) and need to check against the original value, add a hash (#) to the end of the tag name (see the -n option).  This disables the conversion for that tag.  So you could use -if "$By-Line#=~/AUTHOR_BYLINE/" or my favorite way of doing things with the Filter option, -if "$By-Line# ne $By-Line".  That last one compares the original non-converted version of the tag to the converted version.  This can be useful in cases where your regex gets extremely complex or if you throw some straight Perl into the Filter.
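
A rough model of that ordering, sketched in Python. This is not exiftool's code: apply_filter is a hypothetical stand-in for the Filter option, and the single stored value is made up. It only illustrates why a condition tested against the converted value fails while the two # variants pass.

```python
import re

def apply_filter(value: str) -> str:
    """Hypothetical stand-in for -api "Filter=s/AUTHOR_BYLINE/<something else>/"."""
    # Like Perl's s/// without the g flag, replace only the first occurrence.
    return re.sub(r"AUTHOR_BYLINE", "<something else>", value, count=1)

raw = "AUTHOR_BYLINE"          # value stored in the file (what $By-Line# refers to)
converted = apply_filter(raw)  # value after the filter (what $By-Line refers to)

# -if "$By-Line=~/AUTHOR_BYLINE/" tests the converted value, so it fails.
cond_converted = re.search(r"AUTHOR_BYLINE", converted) is not None

# -if "$By-Line#=~/AUTHOR_BYLINE/" tests the raw value, so it passes.
cond_raw = re.search(r"AUTHOR_BYLINE", raw) is not None

# -if "$By-Line# ne $By-Line" passes whenever the filter changed the value.
cond_changed = raw != converted

print(cond_converted, cond_raw, cond_changed)
```

The last comparison never repeats the regex, which is why it stays manageable when the filter expression gets complicated.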

Quote: If I remove the -IF condition, it appears to do the substitution correctly on the 1 image, but pointlessly updates all 353 images, instead of just the one.
exiftool -P -m -overwite_original -By-Line -api "Filter=s/AUTHOR_BYLINE/<something else>/" -tagsfromfile @ .
    1 directories scanned
  353 image files updated

Now here's where you need to be very careful.  As I said, Filter affects tags globally.  It would have updated By-Line correctly, but what if you had AUTHOR_BYLINE in another tag, say in Keywords or Description?  Additionally, since you didn't list a specific tag after -TagsFromFile, it would copy the change for all the tags that it found AUTHOR_BYLINE in.  Which is fine if that's what you want, but it's something to keep in mind.

And that's not all.  When you don't list a tag after -TagsFromFile, exiftool assumes you mean to copy -All.  -All will not directly copy tag to tag.  It copies each tag to its preferred location (EXIF, IPTC, and then XMP).  So if you have a tag in one group, it might end up in a different group if the tags have the same name.  As an example, if you had data in XMP:City, after such a copy, it would end up in IPTC:City, which may not be what you want.  To copy tags to the same location, you should use -All:All, though I feel that targeting a specific tag is best if that's what you are trying to change.



Quote: is it based on #posts?

I believe so, there's details around here somewhere.  Under "Help" I think.

chuft-captain

Quote from: StarGeek on December 10, 2017, 02:12:20 AM
Quote from: chuft-captain on December 10, 2017, 12:28:58 AM
Now I'll try to selectively update ONLY the (single) image from above which satisfies the -IF condition...
WRITE:
exiftool -P -m -overwite_original -By-Line -if "$By-Line=~/AUTHOR_BYLINE/" -api "Filter=s/AUTHOR_BYLINE/<something else>/" -tagsfromfile @ .
Ignored superfluous tag names or invalid options: -overwite_original ...
    1 directories scanned
  353 files failed condition
    0 image files read

Criteria exactly the same, but this doesn't work. (When write actions added, 0 image files read, no change to the tag value.)

Because you used the -api Filter option, the if condition is no longer satisfied.  When you check to see if By-Line matches AUTHOR_BYLINE, it has already been changed to <something else>.  Filter is a very powerful tool and affects tags globally.  If you wish to target a tag individually, it's best if you just process that tag. For example, copy the tag to itself with the changes: "-By-Line<${By-Line;s/AUTHOR_BYLINE/<something else>/}" or if you use the filter, -TagsFromFile @ -By-Line
So basically what  you're saying is that to restrict it to the single field, I can either:
1. get rid of the "-IF" and just copy the tag to itself,
(I did this:
exiftool -P -m -overwite_original "-By-Line<${By-Line;s/AUTHOR_BYLINE/something else/}"
It made the change OK, but it still "updated" all 353 images -- I guess it's copying 352 unchanged images, and 1 changed image, then deleting temp copies.)
2. get rid of the "-IF" and continue to use the filter but restrict it to the field by appending -"By-Line" after the @. (I did have "-By-Line" earlier in the command which I thought would restrict it, but you're saying this has to come after the @ ?)

One other thing is a little confusing... you said: "it has already been changed to <something else>"(by the filter), which interfered with the "-IF" condition, but in this case I would still expect it to have changed, but I checked afterward, and it had not changed. Do you mean it gets changed "in memory" by the filter, the "IF" then fails, so the change made by the filter gets rolled back because of that?

Thanks for all the additional comments. It will take me a while to absorb all that, but will be very helpful I'm sure.


Thanks a lot!!

StarGeek

Quote from: chuft-captain on December 10, 2017, 03:00:09 AM
So basically what  you're saying is that to restrict it to the single field, I can either:
1. get rid of the "-IF" and just copy the tag to itself,
(I did this:
exiftool -P -m -overwite_original "-By-Line<${By-Line;s/AUTHOR_BYLINE/something else/}"
It made the change OK, but it still "updated" all 353 images -- I guess it's copying 352 unchanged images, and 1 changed image, then deleting temp copies.)

No, keep the -IF.  Without it, it's re-writing all those images that aren't changed by the regex.

Quote: 2. get rid of the "-IF" and continue to use the filter but restrict it to the field by appending -"By-Line" after the @. (I did have "-By-Line" earlier in the command which I thought would restrict it, but you're saying this has to come after the @ ?)

Again, keep the -IF, it helps prevent re-writing of images that don't need it.  And yes, directly from the second line of the -TagsFromFile documentation:
"Tag names on the command line after this option specify the tags to be copied"

Quote: Do you mean it gets changed "in memory" by the filter, the "IF" then fails, so the change made by the filter gets rolled back because of that?

Almost.  By-Line gets changed, the -IF fails because it no longer matches, so the file doesn't get changed.

Remember, "Filter" isn't just for changing tags for copying.  You can use it to change the output when just reading tags.  For example, maybe a tag has trailing spaces but you don't realize it at first.  You could then use something like exiftool -api "filter=s/ /_/g" -Description to detect them.

Example output:
C:\>exiftool -g1 -a -s -api "filter=s/ /_/g" -Description# -Description y:\!temp\Test3.jpg
---- XMP-xmp ----
Description                     : There are trailing spaces
Description                     : There_are_trailing_spaces___________
---- XMP-dc ----
Description                     : There are trailing spaces
Description                     : There_are_trailing_spaces___________
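
The same space-to-underscore substitution can be sketched outside exiftool. This is a small Python illustration, assuming a made-up description with three trailing spaces; it mimics what the filter s/ /_/g does to a value and is not how exiftool implements it.

```python
import re

# Hypothetical tag value with three trailing spaces that are invisible when printed.
description = "There are trailing spaces" + " " * 3

# Equivalent of s/ /_/g: replace every space, so trailing ones become visible.
visible = re.sub(r" ", "_", description)

print(repr(description))  # repr() is another way to expose the trailing spaces
print(visible)
```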

chuft-captain

Quote from: StarGeek on December 10, 2017, 03:46:09 AM
Quote from: chuft-captain on December 10, 2017, 03:00:09 AM
So basically what  you're saying is that to restrict it to the single field, I can either:
1. get rid of the "-IF" and just copy the tag to itself,
(I did this:
exiftool -P -m -overwite_original "-By-Line<${By-Line;s/AUTHOR_BYLINE/something else/}"
It made the change OK, but it still "updated" all 353 images -- I guess it's copying 352 unchanged images, and 1 changed image, then deleting temp copies.)

No, keep the -IF.  Without it, it's re-writing all those images that aren't changed by the regex.
OK,
Changed it back with:
exiftool -P -m -overwite_original -if "$By-Line=~/something else/" "-By-Line<${By-Line;s/something else/AUTHOR_BYLINE/}" .
and this time, it only updated the 1 file:
    1 directories scanned
  352 files failed condition
    1 image files updated

Thanks for all your help. Don't think I'd get far without it!!  :D

CC

Ninja Edit:
I've found out why the message:
Ignored superfluous tag name or invalid option: -overwite_original
is happening. (I couldn't work out why this command was insisting on creating .jpg_original files.)
It seems that "overwite" should be spelt "overwrite".  ;D :P :-\
I wonder if a future version of EXIFTOOL might allow this to be abbreviated to perhaps "-o"  (It appears that "-o" has not been used yet) ... and I guess "-oip" for -overwrite_original_in_place.

Phil Harvey

-o is used to specify an output file name.  But I would prefer not to have a short form for a potentially dangerous option like -overwrite_original.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

chuft-captain

Quote from: Phil Harvey on December 10, 2017, 08:02:14 AM
-o is used to specify an output file name.  But I would prefer not to have a short form for a potentially dangerous option like -overwrite_original.
Yep,

I 99% expected that response, and actually I totally agree.
I think I was just getting a bit sick of typing it (and mis-typing  :-\ it), so I thought I would throw that suggestion out there.
I'll get over it, and actually, I think your reticence is well justified.

:)

CC

PS. That reddit comment: "... it is total fucking gibberish to me." is totally fucking priceless!  ;D

StarGeek

Quote from: chuft-captain on December 10, 2017, 12:42:19 PM
I think I was just getting a bit sick of typing it (and mis-typing  :-\ it)

There are two things I use to deal with this.  One is a text expansion program (see this old Lifehacker article).  Even something like AutoHotKey or Autoit (if on Windows) can be used to reduce errors.  Another is a clipboard history program.  Anything recent I've copied into the clipboard, I can re-paste quickly.  Both can be especially useful when dealing with long, complex commands.

Phil Harvey

Quote from: chuft-captain on December 10, 2017, 12:42:19 PM
PS. That reddit comment: "... it is total fucking gibberish to me." is totally fucking priceless!  ;D

Yeah.  I really liked that comment too. :)

- Phil