Using Regular Expressions in TAG names

Started by amw, March 11, 2024, 08:34:29 AM

amw

I've got loads of GoPro videos and the date/time often appears to be wrong (perhaps my fault). I have found a way of using the GPS date/time recorded in the EXIF data to correct various TAGs so I need to extract certain ones.

If I use:
EXIFTOOL -ALL
then I get all the TAGs to process in the correct order.  This takes far too long to get to the ones I actually need as there are hundreds of TAGs and hundreds of files.

If I use:
EXIFTOOL -GPSMeasureMode -GPSDateTime
then I get just the two TAGs I need but they are in the wrong order.  I get dozens of the first TAG and then an equivalent number of the second TAG so I lose the pairing.

So I thought, let's try a REGEX. Just to add, this is on Windows hence the escape characters.

If I use:
EXIFTOOL -(GPSMeasureMode^|GPSDateTime)
then I get the following error message:
Invalid TAG name: "(GPSMeasureMode|GPSDateTime)"
However, EXIFTOOL appears to correct this (or ignores the error) and it then lists the two TAGs I require plus they are still in the correct order. That is to say alternating. One GPSMeasureMode followed by one GPSDateTime repeated many times.

I need the value of the first TAG to indicate that the second TAG is a usable date/time.
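
For readers who want to do this pairing in a script, here is a minimal Python sketch. It assumes the output has already been captured as text lines in the `-s` layout ("TagName : value") and that the two tags alternate as described above. The function name `pair_tags`, the sample lines, and the "3-Dimensional" test for a usable fix are illustrative assumptions, not something taken from ExifTool or from the real files.

```python
import re

# One line of `exiftool -s` style output: "TagName        : value"
LINE_RE = re.compile(r"^(\w+)\s*: (.*)$")


def pair_tags(lines, first="GPSMeasureMode", second="GPSDateTime"):
    """Yield (first_value, second_value) for each `first` line that is
    followed by a `second` line. Assumes the tags alternate in the input."""
    pending = None
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if not m:
            continue  # skip messages such as 'Invalid TAG name: ...'
        tag, value = m.groups()
        if tag == first:
            pending = value
        elif tag == second and pending is not None:
            yield pending, value
            pending = None


# Made-up sample lines for illustration only (not real GoPro data).
sample = [
    'Invalid TAG name: "(GPSMeasureMode|GPSDateTime)"',
    "GPSMeasureMode                  : 2-Dimensional Measurement",
    "GPSDateTime                     : 2024:03:02 10:15:30.000",
    "GPSMeasureMode                  : 3-Dimensional Measurement",
    "GPSDateTime                     : 2024:03:02 10:15:31.000",
]

pairs = list(pair_tags(sample))

# Assumption: a value starting with "3" marks a usable date/time.
usable = [when for mode, when in pairs if mode.startswith("3")]
```

In practice the lines would come from a file or pipe holding the captured output, and the test on the first tag's value would be adjusted to whatever the files actually report.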

So this is really a REGEX question. Seeing as EXIFTOOL appears to have corrected my REGEX and given me the results I need, what should I have put for the REGEX for there to be no error message. I'm obviously happy it is working but am curious as to what I have wrongly coded and, in any case, it would be nice to avoid the error message.

By the way, an actual command issued by my CMD file is as follows but I simplified it above merely to explain the issue:

C:\#Links\ExifTool\exiftool -a -s -m -ee -api quicktimeutc "D:\Photos\GoPro10\2024\2024_03_02\GX010070.MP4" -(GPSMeasureMode^|GPSDateTime)

StarGeek

Honestly, I'm not sure what's happening. The -TAG option allows the use of the asterisk * and question mark ? as wildcards, but doesn't say anything about RegEx.  The error response shows that it should be treated as a tag with the name "(GPSMeasureMode|GPSDateTime)" and it shouldn't proceed beyond that.  It even seems to allow using RegEx to write multiple tags.

I've dug through the docs and searched the forum and can't find anything about this.  So this will require Phil's input.

As for removing the error, it doesn't look like it can be removed.  I tried the -m (-ignoreMinorErrors) option, the -q (quiet) option, and the -api NoWarning option, and none of them would suppress the error.

Normally, I would have suggested the -p (-printFormat) option
exiftool -a -s -m -ee -api QuickTimeUTC -p "$GPSMeasureMode$/$GPSDateTime" /path/to/files/
"It didn't work" isn't helpful. What was the exact command used and the output.
Read FAQ #3 and use that cmd
Please use the Code button for exiftool output

Please include your OS/Exiftool version/filetype

Phil Harvey

Quote from: amw on March 11, 2024, 08:34:29 AM
EXIFTOOL -(GPSMeasureMode^|GPSDateTime)

The behaviour of this command is entirely shell dependent.  Put the argument in quotes and you'll get the "Invalid TAG name" warning from ExifTool.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

StarGeek

It's just really strange that it seems to work on Windows, which, AFAIK, doesn't do any regex on the command line.

amw

Thanks for the replies.  I really don't see how it can be shell dependent seeing as Windows has absolutely no support for REGEX.  It must be getting into the PERL somewhere along the line. That's why I tried the REGEX seeing as I know PERL.

Anyway, if there is no way of getting rid of the error message then I'll just have to live with it.

I tried putting the TAG names in quotes and nothing changed as far as the error message was concerned.  However, I no longer needed the Windows escape characters so that's nice.  Especially since in the CMD file I actually needed two lots (^^^|) to get them through (plus single escape characters on the parentheses as well).  Windows for you.

Phil Harvey

From a bit of googling:

The Windows command-line interpreter uses a caret character (^) to escape reserved characters that have special meanings (in particular: &, |, (, ), <, >, ^)

It is not a Perl difference.

I don't have time to fire up the Windows virtual machine now to sort this out, but there must be a simple explanation.

- Phil

amw

I'm not too worried about the escape character on Windows.  I can live with that issue. It was really about how to use regular expressions on the command line correctly so as to extract the TAGs I require without losing the order they had when using -ALL (or leaving it to default).

I did the following simple test:

C:\#Links\ExifTool\exiftool -a -s -m -ee "D:\Photos\GoPro10\2022\2022_01_13\GOPR0030.JPG" "-(g"
Invalid TAG name: "(g"
Unmatched ( in regex; marked by <-- HERE in m/^( <-- HERE g( \(|$)/ at Image/ExifTool.pm line 4701.

So it seems that as soon as one includes a parenthesis then it becomes a regular expression (maybe there are other characters of course). This is an extremely useful, but seemingly undocumented, feature. At least I have not found it yet.

Phil Harvey

That is not a regular expression.  It is an invalid tag name (brackets are not allowed in tag names).  The regular expression can't be used in that context.

- Phil

amw

Phil,

You say a REGEX cannot be used in this context so why is it generating a Perl REGEX error message then?

To prove my point, taking it a little further, how about this output?

C:\#Links\ExifTool\exiftool -a -s -m -ee "D:\Photos\GoPro10\2022\2022_01_13\GOPR0030.JPG" "-(.*Date)"
Invalid TAG name: "(.*Date)"
FileModifyDate                  : 2022:01:13 12:53:26+01:00
FileAccessDate                  : 2024:03:20 21:15:05+01:00
FileCreateDate                  : 2024:03:19 18:21:14+01:00
ModifyDate                      : 2022:01:13 12:53:24
CreateDate                      : 2022:01:13 12:53:24
SubSecCreateDate                : 2022:01:13 12:53:24.8230
SubSecModifyDate                : 2022:01:13 12:53:24.8230

It is definitely processing it as a regular expression and I get every TAG ending in "Date". The parentheses appear to be what makes it work, because if I leave them out it no longer works.

We can see from the earlier post that it adds the string start (^) and the string end ($) to the REGEX itself so one just has to allow for that.
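
To illustrate the mechanism, here is a small Python sketch of what the error message suggests: the tag name is dropped unescaped into a pattern of the form `^NAME( \(|$)`. This is only a guess based on the `m/^(g( \(|$)/` text in the error above. ExifTool itself is Perl and its actual code is not shown here, and the helper `tag_pattern` and the tag list are hypothetical.

```python
import re


def tag_pattern(name, escape=False):
    """Build a pattern shaped like the one in the error message:
    '^' + name + '( \\(|$)'. With escape=True the name is matched literally."""
    body = re.escape(name) if escape else name
    return re.compile("^" + body + r"( \(|$)")


# A few tag names, chosen for illustration.
tags = ["FileModifyDate", "CreateDate", "DateTimeOriginal", "Make"]

# Unescaped: the parentheses and .* act as regex syntax.
unescaped = tag_pattern("(.*Date)")
matched = [t for t in tags if unescaped.search(t)]

# Escaped: the same string is treated as a literal name and matches nothing.
escaped = tag_pattern("(.*Date)", escape=True)
literal = [t for t in tags if escaped.search(t)]

# An unbalanced name such as "(g" fails to compile as a pattern at all.
try:
    tag_pattern("(g")
    compiled = True
except re.error:
    compiled = False
```

Under these assumptions `matched` holds only the names ending in "Date", `literal` is empty, and `compiled` is False, which mirrors the behaviour seen in the two tests above.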

I'm certainly not complaining about this working in this way as it is a great option to have.

Phil Harvey

This is not a feature, and I'll close this door in the next version.  The invalid tag name is just generating some errors internally in ExifTool.

- Phil