Script Mode and Unicode File Names

Started by dbuchhorn, December 14, 2018, 08:51:34 AM

dbuchhorn

I use exiftool in script mode from a Java program (on Windows). All data is written and read as UTF-8. When exiftool runs in script mode and a file path contains Unicode characters, a "File not found" error is reported (for example, "File not found - ©öäütest.jpg"). When exiftool is called in "normal" mode, the same file is processed without error.

Error example:
exiftool -stay_open true -@ -
-g
öäütest.jpg
-execute
öäütest.jpg
File not found - ©öäütest.jpg

Phil Harvey

I think you should try adding the -charset filename=utf8 option.

This is discussed in some detail here.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

dbuchhorn

Thanks for the fast reply. I tested this option in two ways:
1: exiftool -charset filename=utf8 -stay_open true -@ -
2: exiftool -stay_open true -@ -
    -charset filename=utf8
    ...

But I always get this warning: "Warning: Tag 'charset' is not defined, Nothing to do."

Phil Harvey

Quote from: dbuchhorn on December 14, 2018, 10:09:45 AM
1: exiftool -charset filename=utf8 -stay_open true -@ -

This should work.

Quote
2: exiftool -stay_open true -@ -
    -charset filename=utf8

Each argument needs to be on a separate line:

exiftool -stay_open true -@ -
-charset
filename=utf8


- Phil
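
For readers calling exiftool from Java, as in this thread, the one-argument-per-line rule can be sketched as below. This is only an illustration, not code from either poster: the class `StayOpenArgs` and its methods are hypothetical names, and the sketch only builds the text that would be written to exiftool's standard input (the `-` after `-@`). It does not start exiftool.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

public class StayOpenArgs {

    /**
     * Builds one stay_open request: every argument on its own line,
     * followed by -execute. An option and its value ("-charset",
     * "filename=utf8") must be passed as two separate list entries.
     */
    public static String buildRequest(List<String> args) {
        StringBuilder sb = new StringBuilder();
        for (String arg : args) {
            sb.append(arg).append('\n');
        }
        sb.append("-execute\n");
        return sb.toString();
    }

    /** Encodes the request as UTF-8 bytes, ready to write to the process's stdin. */
    public static byte[] encodeRequest(List<String> args) {
        return buildRequest(args).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] unused) {
        // \u00f6\u00e4\u00fc is "öäü"; escapes keep the source file encoding-neutral.
        String request = buildRequest(
                List.of("-charset", "filename=utf8", "-g", "\u00f6\u00e4\u00fctest.jpg"));
        System.out.print(request);
    }
}
```

Passing `"-charset filename=utf8"` as a single list entry would put both words on one line, which is the mistake that produced the "Tag 'charset' is not defined" warning above.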

dbuchhorn

I found the problem: "-charset filename=utf8" is two arguments, not one. Now it works. But why must utf8 be set for the file name? The exiftool documentation says UTF-8 is the default charset, so the file name should already be parsed in the right charset. Writing metadata with Unicode characters works too.

Phil Harvey

UTF-8 is the default character set for tag values.  The default for file names depends on your system settings.

- Phil

dbuchhorn

I still have a problem with this. Everything works fine if a file with UTF-8 characters in its name is processed in the first request. In later requests, files with UTF-8 characters in their names are not found. I tried to find a way to reproduce this on the command line, because I normally use exiftool from a Java program (exiftool -charset filename=utf8 -stay_open true -@ -).

Here is an example that reproduces the problem on the command line:
- you need two image files: test.jpg and ©öäütest.jpg
- create a command-line argument file encoded as UTF-8 (test.args):

-g
-j
-ExifTool:all
©öäütest.jpg
-execute
-g
-j
-ExifTool:all
test.jpg
-execute
-g
-j
-ExifTool:all
©öäütest.jpg
-execute



- start exiftool:
>exiftool -charset filename=utf8 -@ test.args

Output:

[{
  "SourceFile": "©öäütest.jpg",
  "ExifTool": {
    "ExifToolVersion": 11.23
  }
}]
[{
  "SourceFile": "test.jpg",
  "ExifTool": {
    "ExifToolVersion": 11.23
  }
}]
Error: File not found - ©öäütest.jpg


The third request will fail.
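
For anyone who wants to generate the argument file from Java rather than a text editor, here is a minimal sketch. It assumes the file layout shown above (three options, the file name, then -execute for each request). The class `ArgFileWriter` and its methods are hypothetical names chosen for this illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class ArgFileWriter {

    /** Builds the lines of the argument file: one request per file name. */
    public static List<String> buildLines(List<String> fileNames) {
        List<String> lines = new ArrayList<>();
        for (String name : fileNames) {
            lines.add("-g");
            lines.add("-j");
            lines.add("-ExifTool:all");
            lines.add(name);
            lines.add("-execute");
        }
        return lines;
    }

    /** Writes the argument file explicitly as UTF-8, independent of the platform default. */
    public static void writeArgFile(Path target, List<String> fileNames) throws IOException {
        Files.write(target, buildLines(fileNames), StandardCharsets.UTF_8);
    }

    public static void main(String[] unused) throws IOException {
        // \u00a9\u00f6\u00e4\u00fc is "©öäü"; escapes keep the source file encoding-neutral.
        String unicodeName = "\u00a9\u00f6\u00e4\u00fctest.jpg";
        writeArgFile(Path.of("test.args"), List.of(unicodeName, "test.jpg", unicodeName));
    }
}
```

Naming the charset explicitly matters on Windows, where the platform default encoding is often not UTF-8.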

Phil Harvey

With a lot of work I have managed to reproduce this problem.

Your -charset filename=utf8 applies only to the first command.  You need to do this:

exiftool -@ test.args -common_args -charset filename=utf8

- Phil
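
Applied to the Java stay_open case from earlier in the thread, the same idea might look like the sketch below. It is an assumption-laden illustration rather than a tested integration: the class `ExifToolCommand` and the method `stayOpenCommand` are hypothetical, and "exiftool" is assumed to be on the PATH. The point is only the ordering: -common_args goes on the command line itself, after `-@ -`, with the charset option following it so that it applies to every -execute request.

```java
import java.util.ArrayList;
import java.util.List;

public class ExifToolCommand {

    /** Builds the process command line; each option and value is its own element. */
    public static List<String> stayOpenCommand(String exiftoolPath) {
        List<String> cmd = new ArrayList<>();
        cmd.add(exiftoolPath);
        cmd.add("-stay_open");
        cmd.add("true");
        cmd.add("-@");
        cmd.add("-");              // read further arguments from stdin
        cmd.add("-common_args");   // everything after this applies to every request
        cmd.add("-charset");
        cmd.add("filename=utf8");
        return cmd;
    }

    public static void main(String[] unused) {
        // The ProcessBuilder is only constructed here, not started.
        ProcessBuilder pb = new ProcessBuilder(stayOpenCommand("exiftool"));
        System.out.println(String.join(" ", pb.command()));
        // prints: exiftool -stay_open true -@ - -common_args -charset filename=utf8
    }
}
```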

dbuchhorn

Thank you very much. This works now for stay_open too.