Writing metadata in chunks/streams

Started by evan.kennedy, May 25, 2022, 08:58:19 PM


evan.kennedy

Hello! I apologize if this has been answered, but I'm not quite finding the information I need.

I need to process potentially large (largest so far is over 2GB) video files in the cloud and add metadata to them. I was wondering if it is necessary when writing metadata for the entire file to be in memory at once. I was trying some tests using a memory-constrained docker container and kept running into "Killed!" with various options.

My ideal solution would be to use the "-" option to pipe data from STDIN and not need to wait for the entire file before sending data back; then that could be placed into a larger pipeline to my CDN. Given my naivety about the whole metadata writing process, is that even possible? If not, is there a way to process in chunks on the filesystem without the whole file being in memory at once? Maybe in my testing I needed a larger file to hit chunking limits (I only went to about 220MB)?
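For reference, the piped invocation I have in mind looks something like the sketch below. The URLs, bucket name, and Title tag are all placeholders, and whether ExifTool can actually stream the result without buffering depends on the file format (formats that need random access may still be held in memory):

```shell
# Hypothetical streaming pipeline -- all names here are placeholders.
# "-" tells exiftool to read the file from stdin; when writing tags
# on stdin input, the edited file is sent to stdout.
curl -s https://cdn.example.com/input.mp4 \
  | exiftool -Title="My Video" - \
  | aws s3 cp - s3://my-bucket/output.mp4
```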

Any pointers would be much appreciated, and thank you very much in advance.

Phil Harvey

Rewriting a file requires that the entire file be read.  If you are doing this via a pipe, then ExifTool may need to buffer the entire file in memory if random access is required (likely).  It would be best to copy the file to a disk then edit it there first.
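A minimal sketch of that copy-then-edit approach (file paths and the Title tag are placeholders; -overwrite_original suppresses the "_original" backup copy that ExifTool would otherwise create):

```shell
# Copy the object to local disk first, then edit in place there.
cp /mnt/incoming/video.mp4 /tmp/work.mp4

# Edit on disk; exiftool can then seek within the file as needed
# instead of buffering the whole stream in memory.
exiftool -overwrite_original -Title="My Video" /tmp/work.mp4

# /tmp/work.mp4 can now be streamed onward (e.g. to a CDN).
```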

- Phil

evan.kennedy

That makes sense. I'll do some tests by copying first, then working on the file on disk. If there are any conditions that would guarantee a file doesn't need to be in memory all at once, that would be ideal, since I have strong control over the type and encoding of the video files being processed.

I'll explore this further, that puts me in the right direction.

Thank you very much for your reply and all you do!

WilliamHerren

If you are aware that your metadata will never be divided or combined into several parts and that none of your streams do so,