Proposal: Give Windows ExifTool access to full Unicode command-line arguments

Started by johnrellis, July 01, 2019, 05:13:14 PM

johnrellis

Quote: As long as a test script doesn't work correctly
What test script are you referring to?

johnrellis

My proposal doesn't address general Windows Perl issues with Unicode command-line arguments, and it makes no sense to evaluate it on that basis. Rather, it fixes a very particular problem with ExifTool, whose implementation doesn't use Perl Unicode strings.

The proposal makes a very small surgical change to how command-line arguments are passed to ExifTool. 

Currently, when a user does "chcp 65001", ExifTool receives the command-line arguments encoded in UTF-8. However, the C runtime startup code that builds argv for main() replaces any argument characters not in the system code page with "?". The modified "ppl.c" launcher eliminates that replacement with "?" -- that's all it does.
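
As an illustration of what the "?" replacement does to a file name, here is a minimal portable C sketch. It does not call any Windows API. narrow_lossy() is a hypothetical stand-in for the conversion to a single-byte system code page (assumed here to cover only code points below 0x100, and ignoring any best-fit mappings Windows may apply), and encode_utf8() is a hand-written encoder showing the bytes that survive when the argument is passed through as UTF-8. Both names are made up for this example and are not taken from ppl.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: models a single-byte system code page that can only
 * represent code points below 0x100. Everything else becomes '?',
 * which is the lossy step the proposal wants to avoid.
 * 'out' must have room for n + 1 bytes. Returns bytes written (no NUL). */
size_t narrow_lossy(const uint32_t *cps, size_t n, unsigned char *out)
{
    assert(cps != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = (cps[i] < 0x100) ? (unsigned char)cps[i] : (unsigned char)'?';
    out[n] = '\0';
    return n;
}

/* Encodes code points as UTF-8. No validation of surrogates or of
 * values above 0x10FFFF is done in this sketch.
 * 'out' must have room for 4 * n + 1 bytes. Returns bytes written (no NUL). */
size_t encode_utf8(const uint32_t *cps, size_t n, unsigned char *out)
{
    assert(cps != NULL && out != NULL);
    size_t k = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t c = cps[i];
        if (c < 0x80) {
            out[k++] = (unsigned char)c;
        } else if (c < 0x800) {
            out[k++] = (unsigned char)(0xC0 | (c >> 6));
            out[k++] = (unsigned char)(0x80 | (c & 0x3F));
        } else if (c < 0x10000) {
            out[k++] = (unsigned char)(0xE0 | (c >> 12));
            out[k++] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
            out[k++] = (unsigned char)(0x80 | (c & 0x3F));
        } else {
            out[k++] = (unsigned char)(0xF0 | (c >> 18));
            out[k++] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
            out[k++] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
            out[k++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    out[k] = '\0';
    return k;
}
```

For the code points of "a日.jpg" (0x61, 0x65E5, 0x2E, 0x6A, 0x70, 0x67), narrow_lossy() yields "a?.jpg", so the original name can no longer be recovered, whereas encode_utf8() yields 8 bytes in which 日 is preserved as E6 97 A5.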

obetz

Quote from: johnrellis on July 05, 2019, 04:57:18 PM
Quote: As long as a test script doesn't work correctly
What test script are you referring to?

This one: https://exiftool.org/forum/index.php/topic,10128.msg53670.html#msg53670 (fixed sources attached to today's post)

As I wrote there: I couldn't even find the correct settings to get the environment correctly parsed and/or printed together with the other text and args.

I don't want to be responsible for the code of A. Sinan Unur. He wrote "I got very few test failures due to these changes", but I'm used to delivering code with no test failures, not a few. He doesn't say what the problems are. He writes "I know what needs to be fixed, but, as the title says, this post is the first in a series", but he never wrote another article.

Diff today's version against your submission to see what I changed. For example, "malloc(len + 1)" was not needed since WideCharToMultiByte() already returns the correct length. I also dislike the many malloc() calls, most of them without a matching free().
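
The point about the buffer length can be sketched with the usual two-call pattern. This assumes the wide string is NUL-terminated and passed with a length of -1, in which case the size returned by WideCharToMultiByte() already counts the terminating NUL, so malloc(len) is enough. wide_to_utf8() is a hypothetical helper name, not the function in the attached sources, and the _WIN32 guard is only there so the snippet compiles to nothing on other platforms.

```c
#include <assert.h>

#ifdef _WIN32
#include <windows.h>
#include <stdlib.h>

/* Converts a NUL-terminated UTF-16 string to a newly allocated UTF-8
 * string. Returns NULL on failure. The caller owns the result and
 * must release it with free(). */
char *wide_to_utf8(const wchar_t *wide)
{
    assert(wide != NULL);

    /* First call: ask for the required size in bytes. Because the
     * input length is -1, the result includes the terminating NUL. */
    int len = WideCharToMultiByte(CP_UTF8, 0, wide, -1, NULL, 0, NULL, NULL);
    if (len == 0)
        return NULL;

    char *utf8 = malloc((size_t)len);   /* no "+ 1" needed here */
    if (utf8 == NULL)
        return NULL;

    /* Second call: do the actual conversion into the buffer. */
    if (WideCharToMultiByte(CP_UTF8, 0, wide, -1, utf8, len, NULL, NULL) == 0) {
        free(utf8);                     /* don't leak on failure */
        return NULL;
    }
    return utf8;
}
#endif
```

A launcher that converts every argument this way would call free() on each result once the converted argument is no longer needed, which also addresses the missing free() calls mentioned above.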

I would do it differently, but I'm not gonna do the work until I know it's needed in the end. For the moment, this version should be good for tackling the Perl aspects.

Oliver

obetz

Quote from: Hayo Baan on July 05, 2019, 05:22:28 AM
WOW, that took them a very long time to get right in Python. I don't think you can expect to get a similar fix (though work-around is probably a better description since most issues seemed to be related to how Microsoft handles things) for Perl, let alone you be able to do so yourself.

The replies to my question on the perl.win32.vanilla mailing list (https://www.nntp.perl.org/group/perl.win32.vanilla/2019/07.html) give an idea of how likely it is that Perl will handle this some day. The mailing list was down for a while, which is why I am only posting this update now.

Regarding "fix" or "work-around": As I said, Microsoft implemented Unicode a very long time ago, and therefore they have some legacy issues. If they hadn't had the courage to make a decision back then, we could have waited much longer for Unicode to spread widely. That is why I am reluctant to criticise their solution.