Mixed character sets not working in Windows

Started by mitch, June 21, 2016, 04:31:26 PM


mitch

Hello!

I'm working on an application that uses Exiftool to read/write XMP into image files. Everything works fine on Mac, but I'm having issues with different character sets on Windows.

I can use the -charset filename= option, and that appears to work, but if a filename (or path) contains a mix of foreign characters AND numbers (such as Љ219.jpg) then the file cannot be found: "No matching files". There is a similar issue if the path (or username) contains a mix of these characters: "Wildcards don't work in the directory specification. No matching files".

I'm able to circumvent this by creating a read stream of the file (with NodeJS). I can then pipe that data into exiftool through stdin, but assuming that the path/filename has a mix of these characters, I'm not able to write data back into a file this way.

I've also tried creating an ARG file with the parameters, but get similar errors when trying to run any command.

Any help is appreciated. Thanks!

Phil Harvey

I don't understand this.  Numbers are the same as any other ASCII character.  Are you sure your encoding is correct?

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

mitch

Thanks for the quick response!

Oddly enough, I can't seem to work with different charsets at all now.
I ran chcp 65001 and changed the console font to "Lucida Console" to ensure that the foreign characters display properly.

Here are some of the commands I'm trying via Windows command line:

exiftool -charset filename=cyrillic йцуке.JPG
exiftool -charset filename=cp1251 йцуке.JPG
exiftool -charset filename=russian йцуке.JPG

Both exiftool and the example image file are located in the same folder, but I get a "No matching files" error for each of these commands.

Phil Harvey

It may be easiest to put the file names on separate lines in a UTF-8 text file, and use the ExifTool -@ option to read them from there (and use -charset filename=utf8).  This avoids the problems of special characters on the command line.

- Phil

Edit:  If you aren't aware of the FAQ, FAQs 10 and 18 talk about this.
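
For readers generating the argument file from NodeJS (as the original poster does), the suggestion above might be sketched roughly as follows. This is only an illustration: the helper names (buildArgFileContents, writeArgFile, runWithArgFile) and the file name exiftool-args.txt are hypothetical, and it assumes an exiftool executable is available on the PATH.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { execFile } = require('child_process');

// Hypothetical helper: the -@ option expects one argument per line,
// so an argument containing a line break cannot be represented.
function buildArgFileContents(args) {
  for (const arg of args) {
    if (/[\r\n]/.test(arg)) {
      throw new Error('Argument contains a line break: ' + JSON.stringify(arg));
    }
  }
  return args.join('\n') + '\n';
}

// Hypothetical helper: writes the arguments as UTF-8 text and returns the file path.
function writeArgFile(args, dir = os.tmpdir()) {
  const argFile = path.join(dir, 'exiftool-args.txt');
  fs.writeFileSync(argFile, buildArgFileContents(args), { encoding: 'utf8' });
  return argFile;
}

// Assumption: "exiftool" resolves on the PATH. Not called here; shown for context.
function runWithArgFile(fileNames, callback) {
  const argFile = writeArgFile(fileNames);
  execFile('exiftool', ['-charset', 'filename=utf8', '-@', argFile], callback);
}
```

Calling runWithArgFile(['йцуке.JPG'], (err, stdout, stderr) => { ... }) would then pass the file name to exiftool through the text file rather than the command line. The path of the argument file itself is still given on the command line, so it is safest to keep that path plain ASCII.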

mitch

That worked perfectly!

I appreciate you taking the time to help. :)

mitch

Hello again, I just had a follow-up question.
The command works great for file and folder names, but it can't seem to handle user accounts with foreign characters. What is the best way to handle this case?

For example, if I try running:
exiftool -charset filename=UTF8 -@ C:\Users\йцуке\Desktop\exiftool.txt

I get the following error:
Error opening arg file

Any ideas?
Thanks!

Phil Harvey

For this you need to get the character set right on the command line.  If you are using -charset filename=UTF8 and ran chcp 65001 in the console and the file name looks correct then it should work.

- Phil

mitch

I was able to bypass this by streaming in the text file via NodeJS into Exiftool. Here is the command for anyone interested:


const { spawn } = require('child_process');
const fs = require('fs');

// `exiftool` (path to the executable) and TEXT_FILE_PATH (the UTF-8 arg file) are defined elsewhere.
// The child is named `child` so it does not shadow Node's global `process`.
// "-@ -" makes exiftool read its arguments from stdin.
const child = spawn(exiftool, ["-charset", "filename=UTF8", "-@", "-"]);
child.stdout.on('data', (data) => { console.log('CHILD_PROCESS STDOUT:\n' + data.toString()); });
child.stderr.on('data', (err) => { console.log('CHILD_PROCESS STDERR:\n' + err.toString()); });

// Read the arg file in Node and forward each chunk to exiftool's stdin.
const is = fs.createReadStream(TEXT_FILE_PATH);
is.on('data', (chunk) => { console.log('READ_STREAM DATA:\n' + chunk.toString()); child.stdin.write(chunk); });
is.on('error', (err) => { console.log('READ_STREAM ERR:\n' + err); child.stdin.end(); });
is.on('end', () => { console.log('READ_STREAM END: All data has been read'); });
// Closing stdin tells exiftool that no more arguments are coming.
is.on('close', () => { console.log('READ_STREAM CLOSE: Read stream has been closed'); child.stdin.end(); });


Thanks again!
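
For small argument files, the same workaround can be written more compactly with a synchronous call. The sketch below is one possible variant, not the poster's code: the function name runArgFileViaStdin and its parameters are hypothetical, and the default command assumes exiftool is on the PATH. The idea is unchanged: Node reads the file (so the Unicode path never reaches the command line) and hands the contents to the child on stdin, where "-@ -" picks them up.

```javascript
const fs = require('fs');
const { spawnSync } = require('child_process');

// Hypothetical helper: reads the arg file in Node and feeds it to the child's stdin.
// `command` and `leadingArgs` are parameters only so the sketch can be tried
// with a stand-in program; the defaults mirror the command used in this thread.
function runArgFileViaStdin(
  argFilePath,
  command = 'exiftool',
  leadingArgs = ['-charset', 'filename=UTF8']
) {
  const contents = fs.readFileSync(argFilePath); // raw bytes, passed through unchanged
  const result = spawnSync(command, [...leadingArgs, '-@', '-'], {
    input: contents,   // becomes the child's stdin
    encoding: 'utf8',  // decode stdout/stderr as UTF-8 strings
  });
  if (result.error) {
    throw result.error; // e.g. the executable could not be started
  }
  return result; // { status, stdout, stderr, ... }
}
```

Because spawnSync blocks until the child exits, this form suits scripts and one-off jobs; the streaming version above is the better fit inside an application that must stay responsive.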