[Exiftool + gallery-dl] Bypass the special/japanese characters problems in Win10

Started by vitor.mathews, July 05, 2022, 09:26:05 PM


vitor.mathews

TL;DR: I use Exiftool as a gallery-dl postprocessor command to add EXIF metadata to the downloaded files. I need a PowerShell/CMD command (to run as another postprocessor command before the Exiftool one) that writes the downloaded file's path and selected metadata values generated by gallery-dl to a text file, one per line, each prefixed with the corresponding Exiftool option. I can then use this text file as an arg file in my Exiftool postprocessor command to work around problems with special characters.

Whenever a special character like ⧸ or any Japanese character is written to a file's EXIF metadata using Exiftool, it is replaced by a bunch of "???". And when such characters are present in the filename, Exiftool can't even find the file. It looks like a problem on the Windows side, which can't handle these characters well. The Exiftool documentation says here and here that using an arg file is a good way to bypass this problem. So the solution I thought of is to use an exec.command (a gallery-dl postprocessor command) to first write the file path and the metadata values given by gallery-dl to a text file, each value preceded by the corresponding Exiftool option (1), and then run a second exec.command using this arg file (2) to correctly write special characters to their respective fields and also avoid problems with special characters in filenames.

1. WHAT THE TEXT FILE WOULD LOOK LIKE:

-title={content:?//}
-xpcomment={favorite_count} Likes
-keywords={hashtags!S}
-createdate={date}
{_path[:4]}


PS: The values in { } come from gallery-dl and should already have been converted to real values, so taking this tweet as an example, the arg file would become:

-title=#深夜の真剣お絵描き60分一本勝負うどんげ
-xpcomment=13119 Likes
-keywords=深夜の真剣お絵描き60分一本勝負
-createdate=2022-07-02 22:34:43
D:\Downloads\twitter\poccheinfinity 1543362559075463169 p1 [2022-07-02].jpg
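
For reference, the layout above (one argument per line, target file last) can be produced with a few lines of Python. This is only a sketch under assumptions: the function names `build_arg_lines` and `write_arg_file` are made up for illustration, and the tag values are assumed to arrive as ordinary Python strings. Line breaks inside a value (tweets often contain them) are folded into spaces so that each argument stays on its own line.

```python
from pathlib import Path


def build_arg_lines(tags, target):
    """Return one ExifTool argument per list item, with the target file last."""
    lines = []
    for name, value in tags.items():
        # Collapse every run of whitespace (including line breaks) into a
        # single space so the value cannot spill onto a second line.
        flat = " ".join(str(value).split())
        lines.append(f"-{name}={flat}")
    lines.append(str(target))
    return lines


def write_arg_file(argfile, tags, target):
    """Write the arguments as UTF-8 text, replacing any earlier arg file."""
    text = "\n".join(build_arg_lines(tags, target)) + "\n"
    Path(argfile).write_text(text, encoding="utf-8")
```

Because the file is rewritten from scratch on every call, a single arg file can be reused for each download.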


2. USING THE GENERATED TEXT FILE AS AN ARG FILE FOR EXIFTOOL:

From gallery-dl config file:
"postprocessor":
{
    "exiftool-twitter":
    {
        "name": "exec",
        "async": false,
        "command": ["exiftool", "-charset", "filename=utf8", "-@", "~/gallery-dl/ARG FILE.txt"],
        "event" : "after"
    }
}


So how could I generate this text file, with one argument per line, using an exec.command postprocessor? I think it would take just a PowerShell/CMD command, but I don't know how to write it (I'm a newbie). It would also be good if the text file were replaced on each newly downloaded file, so I don't end up with hundreds of txt files.
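
One possible shape for such a helper, sketched in Python rather than PowerShell/CMD: a small script that takes the arg file path as its first argument and writes every remaining argument to it, one per line, overwriting the previous contents. The script name `write_args.py` is hypothetical, and the sketch assumes that a Python interpreter is available and that the exec postprocessor passes the already-substituted values to the script intact; neither has been tested here.

```python
import sys
from pathlib import Path


def main(argv):
    """argv[0] is the arg file to overwrite; each remaining item becomes one line."""
    if len(argv) < 2:
        print("usage: write_args.py ARGFILE ARG [ARG ...]", file=sys.stderr)
        return 2
    # expanduser() resolves a leading "~" to the user's home directory.
    argfile = Path(argv[0]).expanduser()
    # Fold line breaks inside a value into spaces: one argument per line.
    lines = [" ".join(arg.split()) for arg in argv[1:]]
    # write_text() replaces the file, so only one arg file ever exists.
    argfile.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return 0


# Do nothing when the file is loaded without command-line arguments.
if __name__ == "__main__" and sys.argv[1:]:
    sys.exit(main(sys.argv[1:]))
```

The first exec.command would then call this script with the arg file path followed by the lines from (1), and the second exec.command would run Exiftool with `-@` on the same file.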
 
And if there's an easier way to correctly read/write these special characters, please let me know.

Windows 10 Home 21H2 19044.1766 | gallery-dl 1.22.3 | Exiftool 12.42 (Windows Executable)

StarGeek

I was never able to get good results out of the options in FAQ #18.  I had to resort to using this StackOverflow answer, which unfortunately has a side effect on some older GUIs.

But one option that works well is if you can convert the higher order UTF characters into html entities and use the -E (-escapeHTML) option.  As an example, I have a short AutoIt script here which grabs the text in the clipboard, converts it, and saves it back into the clipboard.
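
As a rough illustration of the conversion itself (in Python, not the AutoIt script mentioned above), the standard library can turn every non-ASCII character into a numeric HTML entity. The function name `to_html_entities` is made up for this sketch, and it is an assumption here that -E decodes numeric entities of this form, so check the result on a test file first.

```python
import html


def to_html_entities(text):
    """Return an ASCII-only version of text using HTML entities."""
    # Escape &, < and > first so literal ampersands are not mistaken for
    # the start of an entity when the text is decoded again.
    escaped = html.escape(text, quote=False)
    # "xmlcharrefreplace" rewrites each non-ASCII character as &#NNNN;.
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")
```

The output contains only plain ASCII, so it should pass through the Windows command line unchanged.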
"It didn't work" isn't helpful. What was the exact command used and the output.
Read FAQ #3 and use that cmd
Please use the Code button for exiftool output

Please include your OS/Exiftool version/filetype

vitor.mathews

Quote from: StarGeek on July 05, 2022, 11:24:46 PM
I was never able to get good results out of the options in FAQ #18.  I had to resort to using this StackOverflow answer, which unfortunately has a side effect on some older GUIs.

Man, THANK YOU SO MUCH! Finally a solution that actually works! I spent so much time trying solutions that don't work... I have yet to find out if this side effect will affect me or not tho.

Quote
But one option that works well is if you can convert the higher order UTF characters into html entities and use the -E (-escapeHTML) option.  As an example, I have a short AutoIt script here which grabs the text in the clipboard, converts it, and saves it back into the clipboard.
"convert the higher order UTF characters into html entities" - How to do this? Need to use this script? Because I need something that works automatically through my gallery-dl post-processor, with no manual work involved.

Quote
At some point I should probably try and figure out the Windows Subsystem for Linux and see if the problem persists there.
Also, have you tested it already?

StarGeek

Quote from: vitor.mathews on July 06, 2022, 06:05:49 PM
"convert the higher order UTF characters into html entities" - How to do this?

I have no clue as to what abilities gallery-dl has, so I don't know how you might be able to integrate it. The script I linked is just an example of how I do it.

Quote
Quote
At some point I should probably try and figure out the Windows Subsystem for Linux and see if the problem persists there.
Also, have you tested it already?

Did I say that at some point?  Wasn't in this thread.

Anyway, I have the subsystem installed thanks to Chocolatey, but have no idea how to run it.  So I haven't done any testing with it.

vitor.mathews

Quote
I have no clue as to what abilities gallery-dl has, so I don't know how you might be able to integrate it. The script I linked is just an example of how I do it.
Well, it can download an image and then run a postprocessor command to process the file. The postprocessor command is basically a Powershell/CMD one (with proper formatting), so I can run Exiftool commands from it to process the downloaded file. If there is an Exiftool or PS/CMD command to do what you said, it would be possible to add it to the gallery-dl postprocessor command as well.

Quote
Did I say that at some point?  Wasn't in this thread.
Yes, here.

StarGeek

Quote from: vitor.mathews on July 06, 2022, 08:58:34 PM
Well, it can download an image and then run a postprocessor command to process the file. The postprocessor command is basically a Powershell/CMD one (with proper formatting), so I can run Exiftool commands from it to process the downloaded file. If there is an Exiftool or PS/CMD command to do what you said, it would be possible to add it to the gallery-dl postprocessor command as well.

There wouldn't be anything exiftool could do in this case, as far as I know.  You might have to look into PowerShell commands or do some scripting in something like AutoIt/AutoHotKey.