Writing metadata in chunks/streams

Started by evan.kennedy, May 25, 2022, 08:58:19 PM


evan.kennedy

Hello! I apologize if this has been answered, but I'm not quite finding the information I need.

I need to process potentially large (largest so far is over 2GB) video files in the cloud and add metadata to them. I was wondering if it is necessary when writing metadata for the entire file to be in memory at once. I was trying some tests using a memory-constrained docker container and kept running into "Killed!" with various options.

My ideal solution would be to use the "-" option to pipe data from STDIN and not need to wait for the entire file before sending data back; then that could be placed into a larger pipeline to my CDN. Given my naivety about the whole metadata writing process, is that even possible? If not, is there a way to process in chunks on the filesystem without the whole file being in memory at once? Maybe in my testing I needed a larger file to hit chunking limits (I only went to about 220MB)?
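For reference, the piped invocation I have in mind looks something like the sketch below. The URLs, bucket name, and Title tag are all placeholders, and whether ExifTool can actually stream the result without buffering depends on the file format (formats that need random access may still be held in memory):

```shell
# Hypothetical streaming pipeline -- all names here are placeholders.
# "-" tells exiftool to read the file from stdin; when writing tags
# on stdin input, the edited file is sent to stdout.
curl -s https://cdn.example.com/input.mp4 \
  | exiftool -Title="My Video" - \
  | aws s3 cp - s3://my-bucket/output.mp4
```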

Any pointers would be much appreciated, and thank you very much in advance.

Phil Harvey

Rewriting a file requires that the entire file be read.  If you are doing this via a pipe, then ExifTool may need to buffer the entire file in memory if random access is required (likely).  It would be best to copy the file to a disk then edit it there first.
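A minimal sketch of that copy-then-edit approach (file paths and the Title tag are placeholders; -overwrite_original suppresses the "_original" backup copy that ExifTool would otherwise create):

```shell
# Copy the object to local disk first, then edit in place there.
cp /mnt/incoming/video.mp4 /tmp/work.mp4

# Edit on disk; exiftool can then seek within the file as needed
# instead of buffering the whole stream in memory.
exiftool -overwrite_original -Title="My Video" /tmp/work.mp4

# /tmp/work.mp4 can now be streamed onward (e.g. to a CDN).
```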

- Phil

evan.kennedy

That makes sense. I'll do some tests by copying first, then working on the file on disk. If there are any conditions that would guarantee a file doesn't need to be in memory all at once, that would be ideal, since I have strong control over the type and encoding of the video files being processed.

I'll explore this further, that puts me in the right direction.

Thank you very much for your reply and all you do!

WilliamHerren

If you are aware that your metadata will never be divided or combined into several parts and that none of your streams do so,