New API WindowsLongPath option

Started by Phil Harvey, November 01, 2024, 04:19:32 PM


FrankB

Quote from: Phil Harvey on November 04, 2024, 04:45:40 PM
Yes, I'm not adding the "\\?\" prefix unless the length gets close to the limit, or if it is needed for a UNC drive.

In my view the only difference between UNC and non-UNC paths is the prefix itself: \\?\ versus \\?\UNC\. UNC paths, just like non-UNC paths, work fine without the prefix if the length is < 247.

But that's only for the record. You can leave it like it is.
I will continue to use this fix while testing GUI, and if anything comes up, I will notify you.

Frank

Phil Harvey

I see. OK.  No worries though.  I'll leave it as is.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

FrankB

Problem found with -Api WindowsLongPath and proposed solution

Hi Phil,

The problem I found with -Api WindowsLongPath is that it doesn't accept international (wide) characters in the directory name (e.g. Greek: Ελληνικα). The undocumented option -Api debug=1 shows that getcwd is the culprit:
K:\TestPath\SHORT DIR\SUB DIR1\SUB DIR2\SUB DIR3\Ελληνικα>exiftool -filename -api windowslongpath=1 -api debug=1 Greek.jpg
WindowsLongPath input : Greek.jpg
WindowsLongPath getcwd: K:\TestPath\SHORT DIR\SUB DIR1\SUB DIR2\SUB DIR3\???????a

Side note: when I later scanned where getcwd is used in ExifTool, I noticed that -filepath is not supported for these directory names either, because getcwd is unable to cope with wide characters.

Thinking of a solution, I came up with the Win32 API function GetFullPathName in kernel32.dll. This function can handle wide characters, and as a bonus it handles all the special cases, like relative paths on another drive, DOS device filenames, UNC paths, etc.

I ran tests on Windows XP, 7, 8, 10 and 11, on filenames containing combinations of short, long, ANSI and wide characters, on local hard drives, mapped network drives and UNC paths, both in a CMD prompt and from ExifToolGui. They all behaved well.

See the attached zip for the full exiftool.pm and the directory with filenames I used for testing.

Curious to hear what you think of this.

Frank
#------------------------------------------------------------------------------
# Rebuild a path as an absolute long path to be usable in Windows system calls
# Inputs: 0) ExifTool ref, 1) path string
# Returns: normalized long path
# Note: this should only be called for Windows systems
# References:
# - https://learn.microsoft.com/en-us/dotnet/standard/io/file-path-formats
# - https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation
#
# GetFullPathName supported by Windows XP and later. It handles:
# full path names                EG: c:\foto\sub\abc.jpg
# relative                       EG: .\abc.jpg,  ..\abc.jpg
# full UNC paths                 EG: \\server\share\abc.jpg
# relative UNC paths             EG: .\abc.jpg,  ..\abc.jpg
# DOS device paths               EG: \\.\c:\foto\abc.jpg
# relative path on other drives  EG: z:abc.jpg (working dir on z: z:\foto called from c:\foto)
# Wide chars                     EG: Chars that need UTF8.
#
# I don't know exactly how Win32::API::More->new works, but I would imagine it does a LoadLibrary and a GetProcAddress.
# So I only want to do that once. The variable $GetFullPathName should be defined globally to achieve this.
#
sub WindowsLongPath($$)
{
    my ($self, $path) = @_;
    my $debug = $$self{OPTIONS}{Debug};
    my $out = $$self{OPTIONS}{TextOut};

    $debug and print $out "WindowsLongPath input : $path\n";
    $path =~ tr(/)(\\); # convert slashes to backslashes

    if ($path =~ /^\\\\\?\\/) {                                                               # already a device path in the format we want
      $debug and print $out "WindowsLongPath (Already prefixed) return: $path\n";
      return $path;
    }

    if (!defined($GetFullPathName)) {                                                         # Need to import (once) GetFullPathName?
      $GetFullPathName = Win32::API::More->new('kernel32.dll',                                # Note: Last param lpFilePart not used!
      'DWORD GetFullPathNameW(LPCWSTR lpFileName,' .                                          # We need the W(ide) version.
      ' DWORD nBufferLength,' .                                                               # Length buffer provided. If you pass 0, the return value is the length required.
      ' LPWSTR lpBuffer,' .                                                                   # Receives output from GetFullPathName
      ' LPWSTR *lpFilePart);');                                                               # Pointer within the buffer, where the filename starts.

      $debug and print $out "GetFullPathName loaded : ", (defined($GetFullPathName) ? 1 : 0), "\n";  # a function call is not interpolated inside a string
    }

    my $enc = $$self{OPTIONS}{CharsetFileName};
    my $encPath = $self->Encode($path, 'UTF16', 'II', $enc);                                  # Need to encode to UTF16
    my $LenReq = $GetFullPathName->Call($encPath, 0, 0, 0) + 1;                               # first pass gets length required. Add +1 for safety, Needs Null terminator?
    my $FullPath = "\0" x $LenReq x 2;                                                        # create a NUL-filled buffer (2 bytes per UTF16 char) to hold the full path
    $GetFullPathName->Call($encPath, $LenReq, $FullPath, 0);                                  # FullPath is UTF16 now
    $path = $self->Decode($FullPath, 'UTF16', 'II', $enc);                                    # Decode

    if ($path =~ /^\\\\/) {
      $path = '\\\\?\UNC' . substr($path, 1) unless length($path) <= 247;
    } else {
      $path = '\\\\?\\' . $path unless length($path) <= 247;
    }

    $debug and print $out "WindowsLongPath return: $path\n";
    return $path;
}

TestLongPath.zip

Phil Harvey

Hi Frank,

Excellent!  Thanks!!  It will probably be Monday before I can take a close look at this.

- Phil

Phil Harvey

Hi Frank,

I found a bit of time today.

Wow, this is excellent!  Much cleaner than what I was doing before.  There were a couple of minor things that I fixed.  I also changed Win32::API::More->new to Win32::API->new to match the way it is used in the rest of the code.  It seems to work for me, but I couldn't figure out how to cd into your test directory with Unicode characters, so my testing wasn't complete.  Attached is my new version of the code.

- Phil

FrankB

Thanks for your high opinion Phil.
I felt I had to come up with something that works, after I bet on the wrong horse by proposing Cwd!

I may be able to teach you something: how to CD into a Unicode directory.
- Select the text in the address bar in Windows Explorer and copy it with Ctrl+C.
- Type CD " in the CMD prompt, then paste using right click, or Edit/Paste from the menu.

copy_and_paste.jpg

Will look into your changes, but no doubt I'll be fine with them. And after all, I'm not a Perl programmer.

Frank


Phil Harvey

Thanks.  I was able to "cd" into the Unicode directories and all my tests ran without a problem.

- Phil

Phil Harvey

Frank,

I've just released 13.03 with a few minor changes, but it should behave the same as the last version I sent you unless I broke something.  I'm much happier with the WindowsLongPath code in this release vs. 13.02.  Thanks for figuring this out.

- Phil

FrankB

Thanks a million Phil.

FYI I will release ExiftoolGui V636 soon, within a couple of days.
That will have this as a default option!

Did some more testing meanwhile, also performance comparisons. All OK.

Frank

herb

Hello Phil, hello Frank,

Great, absolutely great!
Thanks for this feature.

But please allow a question.
Searching for "support of long path in windows" I always read
- a special registry key has to be set and
- the application needs a "long path aware" setting in its manifest

How does ExifTool fulfill these requirements?

Thanks in advance
Best regards
herb

FrankB

@Phil: Allow me to answer that one, correct me if I'm wrong.

@Herb: Short answer: ExifTool (and GUI) don't do it that way. There is also the option to prefix paths with \\?\ (or \\?\UNC\ for UNC paths), thereby 'opting in' to long paths for selected API calls. That is what ExifTool (and GUI) do.

Read the last part of this page:
https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation?tabs=registry#functions-without-max_path-restrictions
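
To illustrate the prefixing idea, here is a minimal Perl sketch. It is not ExifTool's actual code: the helper name add_long_path_prefix is hypothetical, it assumes the path is already absolute, and the 247 threshold is simply the value mentioned earlier in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not ExifTool's actual code): add the long-path prefix
# to an absolute Windows path once it exceeds a length threshold.
sub add_long_path_prefix {
    my ($path, $threshold) = @_;
    $threshold = 247 unless defined $threshold;   # value mentioned earlier in this thread
    $path =~ tr{/}{\\};                           # convert slashes to backslashes
    return $path if $path =~ /^\\\\\?\\/;         # already prefixed with \\?\
    return $path if length($path) <= $threshold;  # short paths work without the prefix
    if ($path =~ /^\\\\/) {                       # UNC path: \\server\share\...
        return '\\\\?\\UNC' . substr($path, 1);   # becomes \\?\UNC\server\share\...
    }
    return '\\\\?\\' . $path;                     # drive path: C:\... becomes \\?\C:\...
}

print add_long_path_prefix('C:\\foto\\abc.jpg'), "\n";
print add_long_path_prefix('C:\\' . ('d' x 250) . '\\abc.jpg'), "\n";
print add_long_path_prefix('\\\\server\\share\\' . ('d' x 250) . '\\abc.jpg'), "\n";
```

The first call should leave the short path unchanged, while the other two should come back with the \\?\ and \\?\UNC\ prefixes respectively.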

Frank

herb

Hello,

thanks for your quick reply, thanks for the private lesson
and thanks again for this great feature.

Best regards
herb

FrankB

Just to manage expectations...

The total length of the path can exceed 260 chars, but you may still experience limitations. To name a few:

- The individual parts of the path (subdirectories or filename) cannot exceed 260 chars.
(c:\dir1\sub dir1\sub dir2\sub dir3\file.ext; individual part = dir1, or sub dir1, etc.)
- Currently CreateProcess does not support a CurrentDirectory (AKA working directory) longer than 260 chars. This prevents GUI from starting ExifTool in a directory longer than 260 chars.
- Depending on the device/filesystem used (Hard drive, Network share, fat32, ntfs etc) you may get different results, or even errors.
- You may even notice that Windows Explorer will not always accept a long file name.
- etc.
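
As a rough way to check the first point before processing a file, the following Perl sketch lists the path components that exceed a given length. The helper name overlong_components is hypothetical, and the 260 limit is taken from the list above; the real per-component limit depends on the file system and may be lower.

```perl
use strict;
use warnings;

# Hypothetical helper: return the path components longer than $max characters.
sub overlong_components {
    my ($path, $max) = @_;
    $path =~ s/^\\\\\?\\(?:UNC\\)?//;                        # drop a \\?\ or \\?\UNC\ prefix if present
    my @parts = grep { length $_ } split /[\\\/]/, $path;    # split on \ or /, skip empty parts
    return grep { length($_) > $max } @parts;
}

my $path = 'c:\\dir1\\' . ('x' x 300) . '\\file.ext';
my @bad  = overlong_components($path, 260);                  # 260 is the figure quoted above
printf "%d component(s) exceed the limit\n", scalar @bad;
```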

But I think it's an important step forward, and we will just have to find out what the limits are. That's why it's useful to test this feature.

Frank