New API WindowsLongPath option

Started by Phil Harvey, November 01, 2024, 04:19:32 PM

FrankB

Quote from: Phil Harvey on November 04, 2024, 04:45:40 PM
Yes, I'm not adding the "\\?\" prefix unless the length gets close to the limit, or if it is needed for a UNC drive.

In my view the only difference between UNC and non-UNC is the prefix itself: \\?\ <=> \\?\UNC\. UNC paths, just like non-UNC paths, work fine without prefixing if the length is < 247.

But that's only for the record. You can leave it like it is.
I will continue to use this fix while testing GUI, and if anything comes up, I will notify you.
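For the record, the prefix rule discussed above can be sketched in a few lines of Perl. This is only an illustration: add_long_prefix is a hypothetical helper, not part of ExifTool, and it assumes the path is already absolute. The 247 threshold is the one mentioned in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not ExifTool's API): add the long-path prefix only
# when an absolute Windows path exceeds the given length limit.
sub add_long_prefix {
    my ($path, $limit) = @_;
    $limit = 247 unless defined $limit;       # threshold used in this thread
    $path =~ tr{/}{\\};                        # normalize slashes to backslashes
    return $path if $path =~ /^\\\\\?\\/;      # already prefixed with \\?\
    return $path if length($path) <= $limit;   # short paths work unprefixed
    if ($path =~ /^\\\\/) {                    # UNC path: \\server\share\...
        return '\\\\?\\UNC' . substr($path, 1);
    }
    return '\\\\?\\' . $path;                  # drive path: C:\...
}

print add_long_prefix('C:\\foto\\abc.jpg'), "\n";
print add_long_prefix('\\\\server\\share\\' . ('x' x 250)), "\n";
```

The only branch that differs between the two cases is the prefix: a UNC path drops one leading backslash and gets \\?\UNC in front, a drive path simply gets \\?\ in front.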

Frank

Phil Harvey

I see. OK.  No worries though.  I'll leave it as is.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

FrankB

Problem found with -Api WindowsLongPath and proposed solution

Hi Phil,

The problem I found with -Api WindowsLongPath is that it doesn't accept international (wide) characters in the directory name (e.g. Greek: Ελληνικα). The undocumented option -Api debug=1 shows that getcwd is the culprit:
K:\TestPath\SHORT DIR\SUB DIR1\SUB DIR2\SUB DIR3\Ελληνικα>exiftool -filename -api windowslongpath=1 -api debug=1 Greek.jpg
WindowsLongPath input : Greek.jpg
WindowsLongPath getcwd: K:\TestPath\SHORT DIR\SUB DIR1\SUB DIR2\SUB DIR3\???????a

Side Note: Later on when I scanned where getcwd is used in ExifTool, I noticed that the -filepath is not supported for these directory names, because getcwd is unable to cope with wide characters.

Thinking of a solution, I came up with the Win32 API function GetFullPathName in kernel32.dll. This function can handle wide characters, and as a bonus it handles all the special cases, such as relative paths on another drive, DOS device filenames, UNC paths, etc.
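To show what "wide" means in practice: the W variant of the function expects a UTF-16LE string ending in a two-byte NUL, and returns its result in the same form. Below is a minimal sketch of that round trip using Perl's core Encode module. The helper names to_wide and from_wide are hypothetical, and the sketch makes no Windows call, so it runs on any platform.

```perl
use strict;
use warnings;
use utf8;                          # this source contains a UTF-8 literal
use Encode qw(encode decode);

# Hypothetical helpers, not part of ExifTool.

# Encode a Perl string as UTF-16LE and append the two-byte NUL terminator
# that wide-character Windows functions expect.
sub to_wide {
    my ($str) = @_;
    return encode('UTF-16LE', $str) . "\0\0";
}

# Decode a UTF-16LE buffer and cut it at the first NUL character, since a
# buffer filled by a Windows call is usually larger than the returned text.
sub from_wide {
    my ($buf) = @_;
    my $str = decode('UTF-16LE', $buf);
    $str =~ s/\0.*//s;
    return $str;
}

my $dir  = "K:\\TestPath\\Ελληνικα";
my $wide = to_wide($dir);
printf "%d characters -> %d bytes\n", length($dir), length($wide);
print "round trip ok\n" if from_wide($wide) eq $dir;
```

Every character in this example occupies two bytes in the wide form, so the Greek letters survive unchanged instead of being turned into question marks.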

I ran tests on Windows XP, 7, 8, 10 and 11, on filenames containing combinations of short, long, ANSI and wide characters, on local hard drives, mapped network drives and UNC paths, both from a CMD prompt and from ExifToolGui. They all behaved well.

See the attached zip for the full exiftool.pm and the directory with filenames I used for testing.

Curious to hear what you think of this.

Frank
#------------------------------------------------------------------------------
# Rebuild a path as an absolute long path to be usable in Windows system calls
# Inputs: 0) ExifTool ref, 1) path string
# Returns: normalized long path
# Note: this should only be called for Windows systems
# References:
# - https://learn.microsoft.com/en-us/dotnet/standard/io/file-path-formats
# - https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation
#
# GetFullPathName supported by Windows XP and later. It handles:
# full path names                EG: c:\foto\sub\abc.jpg
# relative                       EG: .\abc.jpg,  ..\abc.jpg
# full UNC paths                 EG: \\server\share\abc.jpg
# relative UNC paths             EG: .\abc.jpg,  ..\abc.jpg
# Dos device paths               EG: \\.\c:\fotoabc.jpg
# relative path on other drives  EG: z:abc.jpg (working dir on z: z:\foto called from c:\foto)
# Wide chars                     EG: Chars that need UTF8.
#
# I don't know exactly how Win32::API::More->new works, but I would imagine it does a LoadLibrary and a GetProcAddress.
# So I only want to do that once. The variable $GetFullPathName should be defined globally to achieve this.
#
sub WindowsLongPath($$)
{
    my ($self, $path) = @_;
    my $debug = $$self{OPTIONS}{Debug};
    my $out = $$self{OPTIONS}{TextOut};

    $debug and print $out "WindowsLongPath input : $path\n";
    $path =~ tr(/)(\\); # convert slashes to backslashes

    if ($path =~ /^\\\\\?\\/) {                                                               # already a device path in the format we want
      $debug and print $out "WindowsLongPath (Already prefixed) return: $path\n";
      return $path;
    }

    if (!defined($GetFullPathName)) {                                                         # Need to import (once) GetFullPathName?
      $GetFullPathName = Win32::API::More->new('kernel32.dll',                                # Note: Last param lpFilePart not used!
      'DWORD GetFullPathNameW(LPCWSTR lpFileName,' .                                          # We need the W(ide) version.
      ' DWORD nBufferLength,' .                                                               # Length buffer provided. If you pass 0, the return value is the length required.
      ' LPWSTR lpBuffer,' .                                                                   # Receives output from GetFullPathName
      ' LPWSTR *lpFilePart);');                                                               # Pointer within the buffer, where the filename starts.

      $debug and print $out "GetFullPathName loaded : ", (defined $GetFullPathName ? 'yes' : 'no'), "\n";
    }

    my $enc = $$self{OPTIONS}{CharsetFileName};
    my $encPath = $self->Encode($path, 'UTF16', 'II', $enc) . "\0\0";                         # Need to encode to UTF16 (plus wide NUL terminator)
    my $LenReq = $GetFullPathName->Call($encPath, 0, 0, 0) + 1;                               # first pass gets length required. Add +1 for safety, Needs Null terminator?
    my $FullPath = "\0" x ($LenReq * 2);                                                      # create buffer to hold Full Path (2 bytes per character)
    $GetFullPathName->Call($encPath, $LenReq, $FullPath, 0);                                  # FullPath is UTF16 now
    $path = $self->Decode($FullPath, 'UTF16', 'II', $enc);                                    # Decode

    if ($path =~ /^\\\\/) {
      $path = '\\\\?\UNC' . substr($path, 1) unless length($path) <= 247;
    } else {
      $path = '\\\\?\\' . $path unless length($path) <= 247;
    }

    $debug and print $out "WindowsLongPath return: $path\n";
    return $path;
}
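A note on the output buffer in the routine above: GetFullPathNameW writes two bytes per character, so the buffer has to be a string of NUL bytes twice as long as the character count. In Perl the quotes matter here, because "\0" is a one-byte string while an unquoted \0 is a reference to the number 0. The sketch below shows the difference. wide_buffer is a hypothetical helper, and 260 is just an example size.

```perl
use strict;
use warnings;

# Hypothetical helper: zero-filled buffer for $chars UTF-16 code units.
sub wide_buffer {
    my ($chars) = @_;
    return "\0" x ($chars * 2);    # "\0" is a one-byte NUL string
}

my $buf   = wide_buffer(260);
my $wrong = \0 x 2;                # (\0) is a reference; it is stringified, then repeated

printf "buffer: %d bytes\n", length $buf;
print  "unquoted form starts with: ", substr($wrong, 0, 6), "\n";
```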

TestLongPath.zip

Phil Harvey

Hi Frank,

Excellent!  Thanks!!  It will probably be Monday before I can take a close look at this.

- Phil

Phil Harvey

Hi Frank,

I found a bit of time today.

Wow, this is excellent!  Much cleaner than what I was doing before.  There were a couple of minor things that I fixed.  I also changed Win32::API::More->new to Win32::API->new to match the way it is used in the rest of the code.  It seems to work for me, but I couldn't figure out how to cd into your test directory with Unicode characters, so my testing wasn't complete.  Attached is my new version of the code.

- Phil

FrankB

Thanks for your high opinion, Phil.
I felt I had to come up with something that works, after I bet on the wrong horse by proposing Cwd!

I may be able to teach you something: how to CD into a Unicode directory.
- Select the text in the address bar in Windows Explorer, and copy it with CTRL+C.
- Type CD " in the CMD prompt, then paste with a right click, or use Edit/Paste from the menu.

copy_and_paste.jpg

Will look into your changes, but no doubt I'll be fine with them. And after all, I'm not a Perl programmer.

Frank


Phil Harvey

Thanks.  I was able to "cd" into the Unicode directories and all my tests ran without a problem.

- Phil