ExifTool Forum

ExifTool => Bug Reports / Feature Requests => Topic started by: stan on May 23, 2015, 03:24:12 PM

Title: Filename wildcards
Post by: stan on May 23, 2015, 03:24:12 PM
When I use dir *.tif? and dir *.tif* it shows me both .TIF and .TIFF files. But while exiftool -p "$FileName" *.tif* shows me the same files (.TIF and .TIFF), exiftool -p "$FileName" *.tif? shows me only .TIFF files.

So ExifTool is not treating ? as a wildcard for an optional character?
Title: Re: Filename wildcards
Post by: Phil Harvey on May 23, 2015, 06:47:36 PM
I never understood the way Windows handled wildcards.  A bit of googling gives this:

Quote
The * wildcard will match any sequence of characters
               (0 or more, including NULL characters)

The ? wildcard will match a single character
               (or a NULL at the end of a filename)

OK.  That's weird.

I can probably emulate this behaviour.  Let me look into this.

ExifTool currently matches '?' with exactly one character, no matter where it appears.  (The null character is non-existent, so it doesn't make sense to match it.)

- Phil

Edit:  I've looked into this a bit.  The root problem is that wildcards are handled by the individual application in Windows, while they are handled by the command shell on other systems.  ExifTool has historically used File::Glob::bsd_glob() to expand wildcards, but recently, with the addition of Windows Unicode filename support, ExifTool now does the expansion manually.  I will have to test to see how bsd_glob() behaves, then try to make ExifTool consistent with this behaviour.  I don't know if this will give the behaviour you expect, but at least it will be consistent with older versions, and in Windows each app can (apparently) do what it wants.

Thanks for pointing this out.
Title: Re: Filename wildcards
Post by: Phil Harvey on May 25, 2015, 03:00:48 PM
I've taken a quick look at this.  I think that the ExifTool behaviour is consistent with previous versions, and I'm not convinced that this should change.  With ExifTool, the "?" represents any single character.   I ran some tests using the "dir" command, and the Windows behaviour doesn't match the quote I found above...

If I have a directory containing 2 files, "a.tif" and "a.tiff", this is what I see:


arg      Windows          ExifTool
*.t?     -                -
*.ti?    a.tif, a.tiff    a.tif
*.tif?   a.tif, a.tiff    a.tiff

I can't explain or understand why "*.ti?" matches "a.tiff" in Windows, while "*.t?" doesn't match "a.tif".

The ExifTool rules are easier to understand, but I should see about documenting this behaviour since it differs from Windows.
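
The ExifTool column in the table above is what any matcher gives when "?" stands for exactly one character. A minimal sketch in Python, assuming fnmatch.fnmatchcase as a stand-in for ExifTool's own expansion code (which is not shown in this thread); the helper name strict_matches is made up for illustration:

```python
from fnmatch import fnmatchcase

# The two test files from the table above.
FILES = ["a.tif", "a.tiff"]

def strict_matches(pattern, names=FILES):
    """Return the names matching pattern, with '?' meaning exactly one character.

    Note: fnmatchcase is case-sensitive, unlike Windows file names;
    that difference does not matter for these lower-case test names.
    """
    return [name for name in names if fnmatchcase(name, pattern)]

# Print the matches for each pattern from the table.
for pattern in ("*.t?", "*.ti?", "*.tif?"):
    print(pattern, strict_matches(pattern))
```

With this rule, "*.t?" matches neither file, "*.ti?" matches only a.tif, and "*.tif?" matches only a.tiff.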

- Phil
Title: Re: Filename wildcards
Post by: stan on June 07, 2015, 11:37:35 PM
Quote from: Phil Harvey on May 25, 2015, 03:00:48 PM
I can't explain or understand why "*.ti?" matches "a.tiff" in Windows, while "*.t?" doesn't match "a.tif".

Phil, I have had the Windows command line's seemingly anomalous behavior explained to me by an expert and understand it now. A single ? means "zero or one character". So "*.t?" will match both "*.t" and "*.ti" (among others) but not "*.tif". Similarly "*.ti?" should match both "*.ti" and "*.tif" but not "*.tiff". Instead it does match "*.tiff" also.

The reason is that even though Windows allows long file names, it's still influenced by its DOS predecessor that used 8.3 file names. So any wildcard that matches a 3 letter extension also matches >3 letters, i.e. "*.???" is as good as writing "*.*" and "*.ti?" will match an extension of any length as long as it starts with "ti".

Similarly, 7 ?s for the file name will match up to 7 letters, but as soon as you use 8, it will match >8 letters also. So "????????.*" is again as good as writing "*.*" and will match filenames (and extensions) of any length.
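
The rules described in this post can be modelled with a small pattern-to-regex translation. This is only a sketch of the description above, not of how Windows actually implements matching; the function names and the exact thresholds (8 pattern characters for the name, 3 for the extension) are assumptions taken from the wording of the post:

```python
import re

def windows_like_regex(pattern):
    """Translate a wildcard pattern using the rules described in this post.

    Assumptions (a model of the description, not of Windows itself):
      * '*' matches any run of characters
      * '?' matches zero or one character
      * a name part of 8 or more pattern characters, or an extension part
        of 3 or more, also matches longer names/extensions (8.3 legacy)
    """
    def part(text, limit):
        body = "".join(
            ".*" if ch == "*" else ".?" if ch == "?" else re.escape(ch)
            for ch in text
        )
        # Once the 8.3 limit is reached, allow any further characters.
        return (body + ".*") if len(text) >= limit else body

    name, dot, ext = pattern.rpartition(".")
    if not dot:  # pattern has no extension part
        return re.compile(part(pattern, 8) + r"\Z", re.IGNORECASE)
    return re.compile(part(name, 8) + r"\." + part(ext, 3) + r"\Z", re.IGNORECASE)

def windows_like_matches(pattern, names):
    """Return the names that match pattern under the modelled rules."""
    rx = windows_like_regex(pattern)
    return [name for name in names if rx.match(name)]

files = ["a.tif", "a.tiff"]
for pattern in ("*.t?", "*.ti?", "*.tif?"):
    print(pattern, windows_like_matches(pattern, files))
```

Under these assumptions the model gives the same results as the Windows column in Phil's table: nothing for "*.t?", and both files for "*.ti?" and "*.tif?".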
Title: Re: Filename wildcards
Post by: Hayo Baan on June 08, 2015, 02:39:37 AM
I guess only Microsoft can think of something so twisted and illogical... :o

Anyway, thanks for looking into this  :)
Title: Re: Filename wildcards
Post by: Phil Harvey on June 08, 2015, 07:25:51 AM
Thanks for the explanation.  But if nobody objects, I think I'll leave the ExifTool behaviour unchanged (i.e. a ? matches a single character).  If you want to emulate the Windows matching, you can use * instead where appropriate.
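
As an illustration of the "use * instead" suggestion, here is a minimal Python sketch, again assuming fnmatch.fnmatchcase as a stand-in for a matcher in which "?" is exactly one character. The file a.tif.bak is a made-up example added to show that "*" is broader than the Windows "?":

```python
from fnmatch import fnmatchcase

# a.tif and a.tiff are the files from this thread; a.tif.bak is hypothetical.
names = ["a.tif", "a.tiff", "a.tif.bak"]

def matches(pattern):
    """Return the names matching pattern ('?' = exactly one character)."""
    return [name for name in names if fnmatchcase(name, pattern)]

print(matches("*.tif?"))  # strict '?': only the .tiff file
print(matches("*.tif*"))  # '*' also accepts nothing, or several characters
```

The trade-off is that "*.tif*" picks up anything after ".tif", so it can match more files than intended.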

- Phil