Filename wildcards

Started by stan, May 23, 2015, 03:24:12 PM

stan

When I use dir *.tif? and dir *.tif* it shows me both .TIF and .TIFF files. But while exiftool -p "$FileName" *.tif* shows me the same files (.TIF and .TIFF), exiftool -p "$FileName" *.tif? shows me only .TIFF files.

So ExifTool is not treating ? as a wildcard for an optional character?

Phil Harvey

I never understood the way Windows handled wildcards.  A bit of googling gives this:

Quote
The * wildcard will match any sequence of characters
               (0 or more, including NULL characters)

The ? wildcard will match a single character
               (or a NULL at the end of a filename)

OK.  That's weird.

I can probably emulate this behaviour.  Let me look into this.

ExifTool currently matches '?' with exactly one character, no matter where it is.  (The null character is non-existent, and it doesn't make sense to match this.)
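
As an illustration of the "exactly one character" rule, here is a minimal Python sketch. It uses the standard library's fnmatch module as a stand-in, since fnmatch also treats '?' as exactly one character; it is not ExifTool's own code, and the function name and file names are made up for the example.

```python
from fnmatch import fnmatchcase

def strict_match(pattern, filenames):
    """Return the names matching pattern, with '?' meaning exactly one character."""
    # Compare in lower case so that .TIF and .tif are treated alike.
    return [f for f in filenames if fnmatchcase(f.lower(), pattern.lower())]

files = ["scan.TIF", "scan.TIFF"]  # hypothetical file names
print(strict_match("*.tif?", files))  # ['scan.TIFF']
print(strict_match("*.tif*", files))  # ['scan.TIF', 'scan.TIFF']
```

Under this rule "*.tif?" needs a fourth extension character, so only the .TIFF file is returned, while "*.tif*" returns both.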

- Phil

Edit:  I've looked into this a bit.  The root problem is that wildcards are handled by the individual application in Windows, while they are handled by the command shell on other systems.  ExifTool has historically used File::Glob::bsd_glob() to expand wildcards, but since the recent addition of Windows Unicode filename support, ExifTool does the expansion manually.  I will have to test to see how bsd_glob() behaves, then try to make ExifTool consistent with this behaviour.  I don't know if this will give the behaviour you expect, but at least it will be consistent with older versions, and in Windows each app can (apparently) do what it wants.

Thanks for pointing this out.
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

Phil Harvey

I've taken a quick look at this.  I think that the ExifTool behaviour is consistent with previous versions, and I'm not convinced that this should change.  With ExifTool, the "?" represents any single character.  I ran some tests using the "dir" command, and the Windows behaviour doesn't match the quote I found above...

If I have a directory containing 2 files, "a.tif" and "a.tiff", this is what I see:

arg      Windows         ExifTool
*.t?     -               -
*.ti?    a.tif, a.tiff   a.tif
*.tif?   a.tif, a.tiff   a.tiff

I can't explain or understand why "*.ti?" matches "a.tiff" in Windows, while "*.t?" doesn't match "a.tif".
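
To make the mismatch with the quoted rule concrete, here is a minimal Python sketch that takes the quote literally: '*' matches any sequence, '?' matches one character, and a '?' at the very end of the pattern may also match nothing. The function name is hypothetical and the translation to a regular expression is only one possible reading of the quote, not Windows' actual algorithm.

```python
import re

def quoted_rule_match(pattern, filename):
    """Match using the quoted rule: trailing '?' may match nothing."""
    body = pattern.rstrip("?")            # pattern without its trailing '?'s
    trailing = len(pattern) - len(body)   # number of trailing '?'s
    parts = []
    for ch in body:
        if ch == "*":
            parts.append(".*")            # any sequence, including empty
        elif ch == "?":
            parts.append(".")             # exactly one character
        else:
            parts.append(re.escape(ch))
    parts.append(".?" * trailing)         # each trailing '?' is optional
    regex = "".join(parts)
    return re.fullmatch(regex, filename, re.IGNORECASE | re.DOTALL) is not None

files = ["a.tif", "a.tiff"]
for pat in ("*.t?", "*.ti?", "*.tif?"):
    print(pat, [f for f in files if quoted_rule_match(pat, f)])
# *.t? []
# *.ti? ['a.tif']
# *.tif? ['a.tif', 'a.tiff']
```

Read this way, the quote agrees with the "dir" results for "*.t?" and "*.tif?", but predicts only a.tif for "*.ti?", whereas "dir" also listed a.tiff.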

The ExifTool rules are easier to understand, but I should see about documenting this behaviour since it differs from Windows.

- Phil

stan

Quote from: Phil Harvey on May 25, 2015, 03:00:48 PM
I can't explain or understand why "*.ti?" matches "a.tiff" in Windows, while "*.t?" doesn't match "a.tif".

Phil, I have had the Windows command line's seemingly anomalous behavior explained to me by an expert and understand it now. A single ? means "zero or one character". So "*.t?" will match both "*.t" and "*.ti" (among others) but not "*.tif". Similarly "*.ti?" should match both "*.ti" and "*.tif" but not "*.tiff". Instead it does match "*.tiff" also.

The reason is that, even though Windows allows long file names, it is still influenced by its DOS predecessor, which used 8.3 file names. So any wildcard that matches a 3-letter extension also matches more than 3 letters, i.e. "*.???" is as good as writing "*.*", and "*.ti?" will match an extension of any length as long as it starts with "ti".

Similarly, 7 ?s for the file name will match up to 7 letters, but as soon as you use 8, it will match >8 letters also. So "????????.*" is again as good as writing "*.*" and will match filenames (and extensions) of any length.
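
The rules described in this post can be sketched in Python as follows. This encodes only the explanation given here and has not been checked against Windows itself. The function names are hypothetical, and two details are assumptions: the pattern is split at its last dot into a name part and an extension part, and the "matches anything longer" rule is applied only to a part that contains a wildcard and can reach 8 characters (name) or 3 characters (extension).

```python
import re

def part_to_regex(part, limit):
    """Translate the name or extension part of a pattern to a regex."""
    out = []
    max_len = 0  # longest string this part could match on its own
    for ch in part:
        if ch == "*":
            out.append(".*")
            max_len = float("inf")
        elif ch == "?":
            out.append(".?")  # "zero or one character"
            max_len += 1
        else:
            out.append(re.escape(ch))
            max_len += 1
    has_wildcard = "*" in part or "?" in part
    if has_wildcard and max_len >= limit:
        out.append(".*")  # assumed 8.3 effect: anything longer also matches
    return "".join(out)

def described_match(pattern, filename):
    """Match using the rules from the post; pattern must contain a dot."""
    name_pat, _, ext_pat = pattern.rpartition(".")
    regex = part_to_regex(name_pat, 8) + r"\." + part_to_regex(ext_pat, 3)
    return re.fullmatch(regex, filename, re.IGNORECASE | re.DOTALL) is not None

files = ["a.tif", "a.tiff"]
for pat in ("*.t?", "*.ti?", "*.tif?"):
    print(pat, [f for f in files if described_match(pat, f)])
# *.t? []
# *.ti? ['a.tif', 'a.tiff']
# *.tif? ['a.tif', 'a.tiff']
```

With these assumptions the sketch gives the same results as the "dir" output reported earlier in the thread for a.tif and a.tiff.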

Hayo Baan

I guess only Microsoft can think of something so twisted and illogical... :o

Anyway, thanks for looking into this  :)
Hayo Baan – Photography
Web: www.hayobaan.nl

Phil Harvey

Thanks for the explanation.  But if nobody objects, I think I'll leave the ExifTool behaviour unchanged (i.e. a ? matches exactly one character).  If you want to emulate the Windows matching, you can use * instead where appropriate.

- Phil