Question about stdin and stdout, and writing truncated files.

Started by colkadome, April 14, 2021, 07:03:05 AM


colkadome

Hi there,

I have a particular use-case with ExifTool on Amazon's Lambda and S3, where I'd like to modify the metadata of certain S3 files (mainly the description/copyright of JPEGs and MP4s) and serve the modified files to a client.

Right now, on a Lambda, I'm piping the input file from S3 to exiftool's stdin, then piping stdout to an output file on S3. This is fine for JPEGs, but large MP4s (> 100 MB) take a while, so I'm wondering if either of these is possible:

- Allow exiftool to start writing to stdout before stdin has finished to speed it up (apologies if there is already an option for this that I'm unaware of).
- Allow exiftool to read only the first N bytes (enough to get the metadata) and write a truncated version of the file's head, then append the rest of the input file's data to the end of the output file. This could be really hacky and may only work for some formats, but would be really fast thanks to S3's multipart CopySource feature, which copies large chunks of S3 files around.

Thanks.

Phil Harvey

ExifTool should already be writing the output file before it has finished reading the input file. The only case where this wouldn't happen is if there are pointers at the start of the file that need to be updated based on information found later in the file.
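
To benefit from this, the calling side has to drain stdout while it is still feeding stdin, rather than sending the whole file first. Here is a minimal Python sketch of that pattern. The helper name stream_through, the chunk size, and the tag values in EXIFTOOL_CMD are all made up for illustration. The command assumes the usual "exiftool -TAG=value - > out" pipeline form, and src and dst stand in for whatever S3 download and upload streams you already have.

```python
import subprocess
import threading

# Hypothetical command line: the tag values are placeholders, and the final "-"
# asks exiftool to read the file from stdin (the edited file goes to stdout).
EXIFTOOL_CMD = ["exiftool", "-Description=example text", "-Copyright=example owner", "-"]

CHUNK = 1024 * 1024  # 1 MiB per read; an arbitrary choice


def stream_through(cmd, src, dst, chunk_size=CHUNK):
    """Run cmd, feeding src to its stdin while copying its stdout to dst.

    src and dst are binary file-like objects. Returns the child's exit code.
    """
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    def feed():
        # Writer thread: push input chunks, then close stdin to signal EOF.
        try:
            while True:
                chunk = src.read(chunk_size)
                if not chunk:
                    break
                proc.stdin.write(chunk)
        except OSError:
            pass  # the child stopped reading early; its exit code reports why
        finally:
            try:
                proc.stdin.close()
            except OSError:
                pass

    writer = threading.Thread(target=feed, daemon=True)
    writer.start()

    # Main thread: forward output as soon as the child produces it, so neither
    # pipe fills up and blocks the other side.
    while True:
        chunk = proc.stdout.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)

    writer.join()
    return proc.wait()
```

Usage would look like stream_through(EXIFTOOL_CMD, s3_body, upload_stream), where both stream objects are placeholders. If the file is one of the cases described above, where pointers at the start depend on data found later, the output will still only appear once the input has been read.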

As for reading the first N bytes and writing a truncated header, I don't even see how this would work, but it certainly wouldn't be possible for some files.
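
For what it's worth, the S3 half of that idea is only byte-range bookkeeping. The sketch below computes the CopySourceRange values that would cover the unchanged tail of the source object, assuming you somehow know the offset where the reusable data starts (tail_start). Finding that offset is the part that may not be possible. The function name and the default part size are made up for illustration. Note also that S3 requires every part except the last to be at least 5 MiB, so the rewritten head uploaded as the first part would need to be at least that large.

```python
MIN_PART = 5 * 1024 * 1024         # S3 minimum for every part except the last
MAX_PART = 5 * 1024 * 1024 * 1024  # S3 maximum size of a single part


def plan_tail_copy(source_size, tail_start, part_size=64 * 1024 * 1024):
    """Return 'bytes=first-last' ranges covering source bytes tail_start..end.

    Ranges are zero-based and inclusive, one per copied part, in order.
    """
    if not 0 <= tail_start < source_size:
        raise ValueError("tail_start must lie inside the source object")
    if not MIN_PART <= part_size <= MAX_PART:
        raise ValueError("part_size must be between 5 MiB and 5 GiB")

    ranges = []
    start = tail_start
    while start < source_size:
        end = min(start + part_size, source_size) - 1  # inclusive last byte
        ranges.append(f"bytes={start}-{end}")
        start = end + 1
    return ranges
```

Each returned string would be passed as the source range of one part-copy request, after the part holding the rewritten head. The whole upload also has to stay within S3's 10,000-part limit.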

- Phil