stay_open output file to stdout on Python

Started by dvc, October 30, 2023, 06:23:31 AM


dvc

I recently learned about the option to output a file to stdout using -o -. And in a single call it works great:
import subprocess

proc = subprocess.Popen(['exiftool', '-jpgfromraw', '-b', '-o -', '/Volumes/Disk/0204.NEF'],
                         stdout=subprocess.PIPE)
output = proc.stdout.read()  # reads until exiftool exits and closes its stdout
with open('/Volumes/Disk/01.jpg', 'wb') as out_file:
    out_file.write(output)

However, when I try to do the same in stay_open mode, the resulting file is limited to 65 KB. In addition, output = proc.stdout.read() stops working, and changing the size passed to os.read either shrinks the result or leaves it capped at 65 KB.
proc = subprocess.Popen(
    ['exiftool', "-stay_open", "True", "-@", "-"],
    universal_newlines=True,
    stdin=subprocess.PIPE, stdout=subprocess.PIPE)

exiftool_args_extract_preview = f"-b\n-jpgfromraw\n-0-\n/Volumes/Disk/0204.NEF\n"
proc.stdin.write(str.join("\n", (exiftool_args_extract_preview, "-execute\n")))

proc.stdin.flush()
output = ""
fd = proc.stdout.fileno()
output = os.read(fd, 65535)
with open('/Volumes/Disk/01.jpg', 'wb') as out_file:
    out_file.write(output)

Does anyone know how this can be fixed?

Phil Harvey

I'll assume you're on MacOS.  There should be no 65 kB limitation on MacOS.

What do you mean when you say it "stops working"?  Does it hang forever or return nothing?

You need to keep reading exiftool's stdout until you get the "{ready}" message.  So you will need to put the read (and maybe the flush) in a loop.
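
For illustration, such a loop might look like the following in Python. This is only a sketch: it assumes the pipes are opened in binary mode and that the terminator is the Unix-style "{ready}" followed by a newline, and read_until_ready is a hypothetical helper name, not part of exiftool.

```python
import os

READY = b"{ready}\n"  # assumed terminator; on Windows the line ending would differ


def read_until_ready(fd, chunk_size=65536):
    """Read from file descriptor fd until the {ready} marker arrives.

    Returns the bytes that precede the marker. Raises EOFError if the
    pipe is closed before the marker is seen.
    """
    data = bytearray()
    # Test the accumulated data, because the marker can be split across reads.
    while not data.endswith(READY):
        block = os.read(fd, chunk_size)  # returns as soon as some data is available
        if not block:
            raise EOFError("pipe closed before {ready} was seen")
        data += block
    return bytes(data[:-len(READY)])
```

It would be called with proc.stdout.fileno() after writing the arguments and "-execute" to stdin and flushing.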

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

dvc

Quote from: Phil Harvey on October 30, 2023, 06:53:25 AM
What do you mean when you say it "stops working"?  Does it hang forever or return nothing?

Yes, macOS. output = proc.stdout.read() hangs forever and returns nothing.
It never gets as far as creating the file:
with open('/Volumes/...

dvc

Perhaps the problem is the universal_newlines= argument. The first, working example does not specify it, so it defaults to universal_newlines=False.
OK, let's do the same. In that case stdin no longer expects a string but a bytes-like object.

    proc = subprocess.Popen(
        ['exiftool', "-stay_open", "True", "-@", "-"],
        universal_newlines=False,
        stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    exiftool_args_extract_preview = f"-b\n-jpgfromraw\n-0-\n/Volumes/Disk/0204.NEF\n"
    www = str.join("\n", (exiftool_args_extract_preview, "-execute\n")).encode()
    proc.stdin.write(www)

But the situation is the same. With output = proc.stdout.read() it does not respond at all, and with os.read I again get only 65 KB.

Phil Harvey

Does read() read until the EOF?  If so, it will never return.  You need to use a form of read that returns immediately.  Either that or maybe you can specify a timeout?
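
In Python, one way to get a read with a timeout on macOS or Linux is select.select on the pipe's file descriptor, followed by os.read. The sketch below assumes a Unix-like system (select does not work on pipes on Windows), and read_available is a hypothetical helper name.

```python
import os
import select


def read_available(fd, timeout=1.0, chunk_size=65536):
    """Wait up to `timeout` seconds for data on fd.

    Returns None on timeout, b"" at end of file, otherwise the bytes
    that were available (at most chunk_size).
    """
    readable, _, _ = select.select([fd], [], [], timeout)
    if not readable:
        return None  # nothing arrived in time
    return os.read(fd, chunk_size)
```

A timeout alone does not tell you when the output is complete, so the result still has to be checked for the "{ready}" message.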

- Phil

dvc

A timeout has no effect. My Python knowledge is still not enough to work out how to get an EOF, or how to avoid waiting for one  :'(

dvc

Perhaps I should move away from working with standard input/output and use io.BytesIO objects instead.
What I mean is: would this work if I substituted a pointer to such an in-memory object for the file path? Like this:
...
        exiftool_args_extract_preview = f"-b\n-jpgfromraw\n-w\n{io.BytesIO_bject}\n/Volumes/Disk/0204.NEF\n"
        self.exiftool.stdin.write(str.join("\n", (exiftool_args_extract_preview, "-execute\n")))
...
I wonder if anyone has already done this?
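
One thing to note about this idea: exiftool runs as a separate process and only sees the text of its arguments, and -w expects text (an extension or file name format), so an in-memory Python object cannot be handed to it this way. Formatting a BytesIO into the argument string inserts only its text representation. A small illustration, with made-up variable names:

```python
import io

buf = io.BytesIO()
# The f-string inserts the object's text representation, not the object itself.
args = f"-b\n-jpgfromraw\n-w\n{buf}\n/Volumes/Disk/0204.NEF\n"
print(args.splitlines()[3])  # something like <_io.BytesIO object at 0x...>
```

A BytesIO can still be used on the Python side, as the place to collect whatever is read from exiftool's stdout.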

Phil Harvey

You might be better off asking in a Python forum.  I've never programmed Python myself.  I did write a C++ interface myself, but it is maybe more complicated than you need because it is multi-threaded.

- Phil

StarGeek

You might also look into PyExifTool rather than writing your own wrapper.  I don't know enough to help with coding, but I do see a closed issue on that repository where someone was extracting 3 MB of binary data from a file, so it appears to be possible.
"It didn't work" isn't helpful. What was the exact command used and the output.
Read FAQ #3 and use that cmd
Please use the Code button for exiftool output

Please include your OS/Exiftool version/filetype

dvc

I did it! I have already taken a different path and no longer need this, but if anyone is interested, here is the solution. The result is kept in RAM for further processing.

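import os
from subprocess import Popen, PIPE
from io import BytesIO

    def execute_exiftool_stdout(self, raw_file_path):
        """ExifTool task: extract the preview from a raw file via stdout."""
        # self.exiftool is the Popen object started in stay_open mode with binary pipes.
        self.exiftool.stdin.write((f"-b\n-jpgfromraw\n-0-\n{raw_file_path}\n-execute\n").encode('utf-8'))
        self.exiftool.stdin.flush()
        fd = self.exiftool.stdout.fileno()
        marker = b"{ready}\n"
        buffer = BytesIO()
        tail = b""
        while True:
            # os.read returns whatever is available, up to 65535 bytes per call.
            block = os.read(fd, 65535)
            if not block:  # EOF: exiftool has exited
                break
            buffer.write(block)
            # Track the last bytes seen, since the marker can be split across two reads.
            tail = (tail + block)[-len(marker):]
            if tail == marker:
                break
        data = buffer.getvalue()
        # Strip the marker so that only the image data is returned.
        return data[:-len(marker)] if data.endswith(marker) else data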