regex end-of-line problem

Started by camerond, July 30, 2016, 10:41:30 AM

camerond

I have been handed a few digital images, scanned from old transparencies, that were created some time ago by somebody else with an unknown program.
The header says the camera model is a Zoran COACH, and there is plenty of non-standard metadata formatting, so I am not sure whether this is a general bug in exiftool or something triggered by the bad headers. The software is listed as "Coachware 1.0".

The exif dates are all given as a single space character. The html dump reports that they are two-byte strings - space followed by null.
My problem is creating a general test with exiftool.

Test version: 10.24
OS: Confirmed on Linux as well as MS Windows: bash and tcsh under CentOS 6.8, as well as bash under msys on Windows 7 x64

The first command works as expected (the -X was used simply to reveal the presence of the space):
exiftool -X -EXIF:CreateDate  -if '$EXIF:CreateDate =~ /^ / ' -ext jpg .
But while all images that match this condition will have invalid dates, I had initially tried the following command to get the date string that is exactly a single space.

exiftool -X -EXIF:CreateDate  -if '$EXIF:CreateDate =~ /^ $/ ' -ext jpg .
However, this failed to match.
Anything with the EOL marker $ also failed to match: /^ *$/  and /^ +$/.
I was finally able to get sensible results by using the alternate syntax: m{^ $}, with or without repetition characters.

Am I missing something, or is there a bug?

I eventually found a non-human image that I feel the owner would be happy for me to attach as an example.

Hayo Baan

Interesting... It works when you use e.g., \z. My guess is the $ gets interpreted by exiftool and it just is not handling the $ as you expected here. Exiftool has to interpret all $ otherwise you could not use e.g., $EXIF:CreateDate or refer to tags in case insensitive ways. Knowing this, catering for use of the $ as end of line matcher might not be trivial...

Anyway, by using either \Z or \z your problem is solved. However in this case (since you are looking at a single space, always) it would be even simpler to write: -if '$EXIF:CreateDate eq " "'
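
A minimal sketch of the two alternatives just mentioned, wrapped in shell variables and a helper. The names COND_END_Z, COND_ONE_SPACE and list_matching are illustrative assumptions, not part of ExifTool; the conditions are single-quoted so the shell passes `$` and `\` through untouched.

```shell
# Condition using \z (absolute end of string), so no bare "$" anchor
# appears in the pattern.
COND_END_Z='$EXIF:CreateDate =~ /^ \z/'

# Plain string comparison, enough when the value is always one space.
COND_ONE_SPACE='$EXIF:CreateDate eq " "'

# Hypothetical helper: list JPGs in directory $1 whose CreateDate
# satisfies the condition given in $2.
list_matching() {
    exiftool -X -EXIF:CreateDate -if "$2" -ext jpg "$1"
}
```

With exiftool on the PATH, this would be called as `list_matching . "$COND_END_Z"` or `list_matching . "$COND_ONE_SPACE"`.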

Note that these files are quite messed up by the software that was used to edit them; exiftool reports a number of issues:

  • [minor] Overlapping MakerNotes values
  • Bad MakerNotes offset for tag 0x003c
  • Bad MakerNotes offset for tag 0x0100

Anyway, I hope I solved the $ mystery for you.
Hayo Baan – Photography
Web: www.hayobaan.nl

StarGeek

I believe it's the $/ combo, which is getting interpreted as a new line (see -p option).  You can also fix it by putting a capture around the dollar sign /^ ($)/
* Did you read FAQ #3 and use the command listed there?
* Please use the Code button for exiftool code/output.
 
* Please include your OS, Exiftool version, and type of file you're processing (MP4, JPG, etc).

Phil Harvey

StarGeek is correct about the $/.  ExifTool parses the expression before passing it to Perl.  The expression should be /^ $$/.  From the documentation for the -if option:

            3) Tags in the string are interpolated the same way as with -p
            before the expression is evaluated.  In this interpolation, $/ is
            converted to a newline and $$ represents a single "$" symbol (so
            Perl variables, if used, require a double "$").
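
To see why the original condition failed, here is a toy shell model of just the two substitutions quoted above (`$$` becomes `$`, and `$/` becomes a newline). The function name interpolate is made up for illustration, tag names are deliberately left unexpanded, and this is not ExifTool's actual code.

```shell
# Toy model of the two documented rules only; tag names are not expanded.
interpolate() {
    s=$1
    out=
    nl='
'
    while [ -n "$s" ]; do
        case $s in
            '$$'*) out="$out\$";  s=${s#??} ;;   # "$$" -> literal "$"
            '$/'*) out="$out$nl"; s=${s#??} ;;   # "$/" -> newline
            *)     rest=${s#?}                   # copy one character
                   out=$out${s%"$rest"}
                   s=$rest ;;
        esac
    done
    printf '%s\n' "$out"
}

interpolate '/^ $/'    # "$/" becomes a newline, so the closing slash is lost
interpolate '/^ $$/'   # prints /^ $/
interpolate 'm{^ $}'   # prints m{^ $}  ("$}" is left alone)
```

In this model the first pattern loses its closing delimiter, while the `$$` form and the `m{...}` form both reach Perl with the end-of-line anchor intact.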


- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

camerond

Thank you all for the explanations - sorry I didn't read the manual properly.

I note in passing that m{^ +$} and m{^ +$$} both result in the same expression, and seem to evaluate the same way.


Quote from: Hayo Baan on July 30, 2016, 11:42:04 AM
However in this case (since you are looking at a single space, always) it would be even simpler to write: -if '$EXIF:CreateDate eq " "'
Yes, I had known that, but I was looking for a more general solution, as I have seen cameras filling other fields with different numbers of spaces.
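
For that more general case, one possible approach is to build the condition from the tag name, using the `$$` form from the -if documentation quoted earlier in the thread. The helper name blank_cond is an assumption made for this sketch.

```shell
# Hypothetical helper: print an -if condition that is true when tag $1
# consists only of one or more spaces. "$$" reaches Perl as a single "$".
blank_cond() {
    printf '$%s =~ /^ +$$/\n' "$1"
}

blank_cond EXIF:CreateDate   # prints $EXIF:CreateDate =~ /^ +$$/
```

It could then be used as `exiftool -X -EXIF:ModifyDate -if "$(blank_cond EXIF:ModifyDate)" -ext jpg .`; the shell does not re-expand the output of the command substitution, so the dollar signs arrive unchanged.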

Quote
Note these files are quite messed up by the software that was used to edit them, exiftool reports a number of issues:
Yes, my solution there was to simply delete the MakerNotes, as there seemed to be nothing ExifTool could decode anyway. I suspect the contents were mainly rubbish.

Phil Harvey

Quote from: camerond on August 02, 2016, 09:07:56 AM
I note in passing that m{^ +$} and m{^ +$$} both result in the same expression, and seem to evaluate the same way.

True.  $} doesn't currently have any meaning for ExifTool, so it leaves it alone.

- Phil

RobertBew

I have the same problem as EddiGordo. I am behind a company firewall and a proxy my 2 cents on a proxy authentication problem. I have tried the command line execution : it fails but does not write any log.

Any suggestion ?

Peter

Hayo Baan

Hi Peter,

I think you replied to the wrong thread...

Phil Harvey

Nope.  It was sleeper spam (he added a spam link in his signature after posting).  Account has been deleted.

This is the thing to watch for with random posts.  (Bitzear is another one to watch -- I'll delete his account too if he adds a link to his signature.  Usually I'll delete all his posts as well, but I've left RobertBew's above since we're talking about it here.)

I wrote a script to scan the forum for members with website or signature URLs, for the express purpose of weeding out the sleeper spam, because the SMF forum has no automated tools to help with this.

- Phil

StarGeek

I went and googled some of the text of that post and found they lifted it from a four and a half year old post on a different forum on flower-platform.com.  How odd.

Hayo Baan

Right, I had almost deleted Bitzgear as his post is almost certainly spam...

StarGeek

No almost about it.  Google the text of the post "Content Awareness spot" and it pops up on a lot of random forums from the past couple months.

Hayo Baan

Quote from: StarGeek on August 09, 2016, 11:35:06 AM
No almost about it.  Google the text of the post "Content Awareness spot" and it pops up on a lot of random forums from the past couple months.

Interesting way to use google, neat!  :)
I just deleted his account and post.
