Author Topic: Automated tests for Unicode file names?  (Read 1106 times)

obetz

  • Sr. Member
  • ****
  • Posts: 244
Automated tests for Unicode file names?
« on: June 24, 2019, 11:15:09 AM »
Hi Phil,

Are there automated tests that check how the Windows version of ExifTool handles Unicode file names?

It seems that the Strawberry Perl-based version gives different results from the standard Windows ExifTool.

Since I personally use only 7-bit file names (usually without spaces), I'm not very familiar with such issues.

Oliver

Phil Harvey

  • ExifTool Author
  • Administrator
  • ExifTool Freak
  • *****
  • Posts: 16519
    • ExifTool Home Page
Re: Automated tests for Unicode file names?
« Reply #1 on: June 24, 2019, 11:24:10 AM »
Windows Unicode filename support has been a real thorn in my side.  ActivePerl really sucks here, and I've had to code many work-arounds to patch these problems although I wasn't able to solve them all.  This alone will require a lot of testing if I switch Perl versions in the Windows distro.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

obetz

  • Sr. Member
  • ****
  • Posts: 244
Re: Automated tests for Unicode file names?
« Reply #2 on: June 24, 2019, 01:38:30 PM »
The difference I found was not caused by Strawberry Perl but by MinGW doing file globbing, although I had int _CRT_glob = 0; in my code.

MinGW seems to have a "read only" alias from _dowildcard to _CRT_glob so if you write to _CRT_glob, reading from it does not work!

Code:
int _CRT_glob = 0;
[...]
i = _CRT_glob;
creates an error "error: '_dowildcard' undeclared (first use in this function)"!

Code:
_dowildcard = 0;
[...]
i = _CRT_glob;
works. Strange, isn't it?

So back to my initial question: Do you have automated tests for Unicode file name handling?

My first experiment was to run exiftool -filename *.jpg > test.txt in a directory with a mix of weird file names. But I guess there are far more demanding cases.

Oliver
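
For anyone who wants to script that first experiment, here is a minimal sketch in Python. It is not part of ExifTool or of the attached archive. The sample names, the four-byte placeholder content and all helper function names are assumptions made up for illustration. The only ExifTool features used are the -charset filename=utf8 option, the -T table output and the FileName tag. ExifTool may report warnings for the placeholder files, so copy in real images if the metadata matters.

```python
import os
import shutil
import subprocess
import tempfile
import unicodedata

# Hypothetical sample names; extend with whatever scripts matter to you.
SAMPLE_NAMES = ["plain.jpg", "with space.jpg", "Grüße.jpg", "Привет.jpg", "日本語.jpg"]

# Shortest JPEG-like placeholder (SOI + EOI markers); an assumption for the
# sketch, swap in real images if the metadata itself matters.
PLACEHOLDER = b"\xff\xd8\xff\xd9"


def make_test_files(directory, names=SAMPLE_NAMES):
    """Create one placeholder file per name and return what the OS lists."""
    for name in names:
        with open(os.path.join(directory, name), "wb") as handle:
            handle.write(PLACEHOLDER)
    return os.listdir(directory)


def normalized(names):
    """Sort names in NFC form, since some filesystems store decomposed names."""
    return sorted(unicodedata.normalize("NFC", name) for name in names)


def exiftool_command(directory):
    """Build the command line; -T prints one tab-separated row per file."""
    return ["exiftool", "-charset", "filename=utf8", "-T", "-FileName", directory]


def reported_names(directory):
    """Run exiftool and return the file names it prints, one per line."""
    result = subprocess.run(exiftool_command(directory), capture_output=True)
    text = result.stdout.decode("utf-8", errors="replace")
    return [line for line in text.splitlines() if line]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        on_disk = make_test_files(tmp)
        if shutil.which("exiftool") is None:
            print("exiftool not found; created", len(on_disk), "files only")
        else:
            same = normalized(reported_names(tmp)) == normalized(on_disk)
            print("names match" if same else "names differ")
```

Passing the directory instead of *.jpg keeps shell or C runtime globbing out of the picture, so a mismatch points at ExifTool or Perl rather than at the launcher.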

Phil Harvey

  • ExifTool Author
  • Administrator
  • ExifTool Freak
  • *****
  • Posts: 16519
    • ExifTool Home Page
Re: Automated tests for Unicode file names?
« Reply #3 on: June 24, 2019, 01:46:10 PM »
Hi Oliver,

Yes, strange.

Sorry, I didn't answer your original question.  No, I don't have any automated tests for file name handling.

- Phil

obetz

  • Sr. Member
  • ****
  • Posts: 244
Re: Automated tests for Unicode file names?
« Reply #4 on: June 24, 2019, 02:13:39 PM »
Any suggestions on what I should test?

Phil Harvey

  • ExifTool Author
  • Administrator
  • ExifTool Freak
  • *****
  • Posts: 16519
    • ExifTool Home Page
Re: Automated tests for Unicode file names?
« Reply #5 on: June 24, 2019, 05:00:00 PM »
As well as Unicode file names, try files and folders with Unicode characters in the directory specification.

You could try renaming files as well, by writing the FileName tag with a Unicode name.

Also, try setting some filesystem parameters on a Unicode file name. (Like FileModifyDate.)

- Phil
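
The three suggestions above could be scripted roughly as follows. This is a sketch under assumptions, not Phil's method: the directory and file names, the helper functions and the two-second tolerance are all hypothetical. The command builders only assemble argument lists using ExifTool's -charset filename=utf8 option and the writable FileName and FileModifyDate tags. The check helpers inspect the filesystem afterwards, so they work no matter how ExifTool was launched.

```python
import os
import time
import unicodedata

# Assumes exiftool is on PATH; adjust for the pp or Strawberry Perl variant.
EXIFTOOL = ["exiftool", "-charset", "filename=utf8"]


def make_unicode_tree(root, folder="Фото 日本", name="ä.jpg"):
    """Create a file inside a directory whose name is also non-ASCII."""
    directory = os.path.join(root, folder)
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, name)
    with open(path, "wb") as handle:
        handle.write(b"\xff\xd8\xff\xd9")  # placeholder content, not a real photo
    return path


def rename_command(path, new_name):
    """Command that renames a file by writing the FileName tag."""
    return EXIFTOOL + ["-FileName=" + new_name, path]


def set_mtime_command(path, stamp):
    """Command that sets FileModifyDate; stamp looks like 2019:06:24 12:00:00."""
    return EXIFTOOL + ["-FileModifyDate=" + stamp, path]


def has_entry(directory, name):
    """True if the directory lists the name, compared in NFC form."""
    wanted = unicodedata.normalize("NFC", name)
    return any(unicodedata.normalize("NFC", entry) == wanted
               for entry in os.listdir(directory))


def mtime_matches(path, stamp, tolerance=2.0):
    """True if the modification time equals stamp (local time) within tolerance seconds."""
    expected = time.mktime(time.strptime(stamp, "%Y:%m:%d %H:%M:%S"))
    return abs(os.path.getmtime(path) - expected) <= tolerance
```

A test run would call subprocess.run(rename_command(path, "Grüße.jpg")) and then assert has_entry(os.path.dirname(path), "Grüße.jpg"). The FileModifyDate check follows the same pattern with mtime_matches. The tolerance is there because some filesystems store timestamps with coarse resolution.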

obetz

  • Sr. Member
  • ****
  • Posts: 244
Re: Automated tests for Unicode file names?
« Reply #6 on: July 04, 2019, 06:06:57 PM »
The tests I created yield the same result for the installed version and the pp version, so I don't think that the calling method makes a difference in parameter handling.

If anybody wants to extend them, I attached an archive.

BTW: Running "ExifTool -ver" takes 245 ms with the standard (pp) ExifTool and 148 ms with the installed (Strawberry) ExifTool. This is mostly calling overhead: the self-tests from the t directory ran at a similar speed.

Oliver
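
To reproduce a startup comparison like this one, a small timing helper is enough. The sketch below is an assumption about how one might measure it, not the method used for the numbers above. The function name and the choice of the median over five runs are made up for illustration.

```python
import shutil
import statistics
import subprocess
import sys
import time


def startup_time_ms(command, runs=5):
    """Return the median wall-clock time in milliseconds to run command."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(command, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=True)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


if __name__ == "__main__":
    # Fall back to timing the Python interpreter when exiftool is not installed.
    if shutil.which("exiftool"):
        command = ["exiftool", "-ver"]
    else:
        command = [sys.executable, "-c", "pass"]
    print(command[0], round(startup_time_ms(command)), "ms")
```

The median is less sensitive than the mean to a slow first run, which often includes disk cache misses or, for packed executables, a one-time unpacking step.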