Extracting part of a Windows JPG filename and copying into an Exif tag

Started by mpegleg, June 14, 2019, 01:48:36 AM


mpegleg

Hello.

Long time no see.  :) I hope you are all well since I last visited.

Unfortunately I'm a bit rusty in the ol' Regex department.  :-\

I was wondering if someone could please advise me how to modify this code to achieve the following:


An example Windows filename: 1963-09-01 [1965-08 S#01] [PIX945034] test1.jpg

This code:
-comment<${filename;m/(\[(.+?)\])/;$_=$1}

will result in the comment tag containing [1965-08 S#01] 


I would like to only have everything between the first square brackets like so... 1965-08 S#01

ie. with no square brackets included in the result.


Cheers,
-Paul  :)

OS: Windows 11 Pro

StarGeek

Try this
-comment<${filename;m/\[([^]]+)\]/;$_=$1}

The breakdown
Match the first bracket \[
Start a capture group (
Start a character set [
Negate this character set, in other words, match any character not in this set ^
Close bracket character as part of the set (may or may not need a backslash escape) ] or maybe \]
Close bracket which closes the character set ]
Match one or more of the previous character, which was any character that is not a close bracket +
Close the capture group )
The final matching bracket isn't necessary, but it can be left in if it's easier to read that way.
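
The breakdown above can be tried outside ExifTool. What follows is only a minimal sketch in Python, used as a stand-in for the Perl expression inside ${filename;...}. The helper name first_bracketed is hypothetical, and the sketch assumes Python's re module handles this particular pattern the same way Perl does.

```python
import re

# Example filename from the first post.
FILENAME = "1963-09-01 [1965-08 S#01] [PIX945034] test1.jpg"

# Original pattern: group 1 includes the brackets, group 2 does not.
ORIGINAL = re.compile(r"(\[(.+?)\])")

# Suggested pattern: a literal [, then one or more non-] characters captured.
SUGGESTED = re.compile(r"\[([^]]+)\]")


def first_bracketed(name):
    # Hypothetical helper: return the text inside the first [...] pair,
    # or None when the name contains no bracketed part.
    match = SUGGESTED.search(name)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(ORIGINAL.search(FILENAME).group(1))  # with brackets
    print(first_bracketed(FILENAME))           # without brackets
```

With the example filename, the original pattern's first group keeps the brackets, while first_bracketed(FILENAME) gives 1965-08 S#01.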

Example on Regex101
"It didn't work" isn't helpful. What was the exact command used and the output.
Read FAQ #3 and use that cmd
Please use the Code button for exiftool output

Please include your OS/Exiftool version/filetype

mpegleg

Thanks StarGeek. That did the trick.  :)

.. and thanks also for the excellent breakdown!

mpegleg

... and if I wanted to substitute the # for a SPACE character?

ie. 1965-08 S#01

becomes: 1965-08 S 01

Phil Harvey

Quote from: StarGeek on January 30, 2020, 12:26:06 AM
Close bracket character as part of the set (may or may not need a backslash escape) ] or maybe \]

I would definitely use \] here.  But if it works without the \, then I have learned something.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

Phil Harvey

Quote from: mpegleg on January 30, 2020, 12:54:52 AM
... and if I wanted to substitute the # for a SPACE character?

Try this:

-comment<${filename;m/\[([^\]]+)\]/;$_=$1;tr/#/ /}

- Phil
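
For readers who want to see the effect of the added tr/#/ / step in isolation, here is a rough Python sketch. It is a stand-in for the Perl expression, not ExifTool itself, and the helper name bracket_text_with_spaces is hypothetical.

```python
import re

# Example filename from the first post.
FILENAME = "1963-09-01 [1965-08 S#01] [PIX945034] test1.jpg"

# One-for-one character mapping, similar in spirit to Perl's tr/#/ /.
HASH_TO_SPACE = str.maketrans("#", " ")


def bracket_text_with_spaces(name):
    # Hypothetical helper: take the text inside the first [...] pair
    # and replace every # in it with a space.
    match = re.search(r"\[([^\]]+)\]", name)
    if match is None:
        return None
    return match.group(1).translate(HASH_TO_SPACE)


if __name__ == "__main__":
    print(bracket_text_with_spaces(FILENAME))
```

For the example filename this yields 1965-08 S 01.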

mpegleg

Quote from: Phil Harvey on January 30, 2020, 07:14:58 AM

I would definitely use \] here.  But if it works without the \, then I have learned something.

- Phil

Yes, I just tried it both ways, and it does work with or without the \ escape.


Quote from: Phil Harvey on January 30, 2020, 07:16:42 AM

Try this:

-comment<${filename;m/\[([^\]]+)\]/;$_=$1;tr/#/ /}

- Phil

Thanks Phil. Perfect  :)

mpegleg

... and finally, while I'm on it... what if I wanted to remove the # so that:

ie. 1965-08 S#01

becomes: 1965-08 S01  <-- ie. the # just gets removed with no space nor character substitution.
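
As an illustration of the removal being asked about, here is a small Python sketch, again only a stand-in for the Perl expression. The helper name bracket_text_without_hash is hypothetical. In Perl itself, tr accepts a d modifier (tr/#//d) that deletes the listed characters, so that may be the natural counterpart; treat that as an assumption rather than something taken from this thread.

```python
import re

# Example filename from the first post.
FILENAME = "1963-09-01 [1965-08 S#01] [PIX945034] test1.jpg"

# Third argument to str.maketrans lists characters to delete outright.
DELETE_HASH = str.maketrans("", "", "#")


def bracket_text_without_hash(name):
    # Hypothetical helper: take the text inside the first [...] pair
    # and drop every # from it, with no replacement character.
    match = re.search(r"\[([^\]]+)\]", name)
    if match is None:
        return None
    return match.group(1).translate(DELETE_HASH)


if __name__ == "__main__":
    print(bracket_text_without_hash(FILENAME))
```

For the example filename this yields 1965-08 S01.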

Phil Harvey


mpegleg


StarGeek

Quote from: Phil Harvey on January 30, 2020, 07:14:58 AM
I would definitely use \] here.  But if it works without the \, then I have learned something.

Yeah, I would tend to add the escape as well but it didn't make it into my notes when I was first researching what characters needed escaping. I wasn't sure if it might have been dependent upon the language involved or not.  It's not a character I use outside of its regex usage.

Phil Harvey

From the Perl documentation:

A ] is normally either the end of a POSIX character class (see POSIX Character Classes below), or it signals the end of the bracketed character class. If you want to include a ] in the set of characters, you must generally escape it.

However, if the ] is the first (or the second if the first character is a caret) character of a bracketed character class, it does not denote the end of the class (as you cannot have an empty class) and is considered part of the set of characters that can be matched without escaping.


Learn something new every day. :)

- Phil
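
The rule quoted from the Perl documentation can be checked with a quick comparison. This sketch uses Python's re module as a stand-in, on the assumption (true for this case) that it also treats a ] placed directly after [^ as a literal member of the set. The function name captures is hypothetical.

```python
import re

# Example filename from the first post.
FILENAME = "1963-09-01 [1965-08 S#01] [PIX945034] test1.jpg"

# The ] directly after [^ is taken as a literal member of the set.
UNESCAPED = re.compile(r"\[([^]]+)\]")

# The same set with the ] escaped explicitly.
ESCAPED = re.compile(r"\[([^\]]+)\]")


def captures(name):
    # Hypothetical helper: return what each variant captures, as a pair.
    return (
        UNESCAPED.search(name).group(1),
        ESCAPED.search(name).group(1),
    )


if __name__ == "__main__":
    unescaped, escaped = captures(FILENAME)
    print(unescaped == escaped)
```

Both variants capture the same text for the example filename, which matches what was reported earlier in the thread.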