ExifTool Forum

ExifTool => Bug Reports / Feature Requests => Topic started by: andrew.grant on December 05, 2011, 01:35:53 AM

Title: High memory consumption when recursively scanning large image directories
Post by: andrew.grant on December 05, 2011, 01:35:53 AM
Hi Phil,

Firstly, many thanks for making this immensely powerful tool freely available. I have been using it in conjunction with PowerShell to help a friend find related pictures scattered among hundreds of subdirectories and thousands of images.

We have encountered a scalability issue with the way ExifTool uses memory. I have read the post '-b and -x switches - Out of memory', but unfortunately we're still seeing this issue in 8.71 owing to the number of .JPG images we're scanning.

Issue: When retrieving all EXIF data by scanning large image folders to create a CSV file, exiftool's memory usage quickly grows until all the RAM on our 32-bit Windows XP machine is used. This limits the size of the image library we can scan.

System: Windows XP x32 and Windows 7 X64

Version: 8.71

Command: exiftool.exe -q -s -ext .JPG -r -csv "c:\library\images" > exif-info.csv

Instead of holding all the tags in memory before producing the CSV output, is it feasible to produce the CSV output as the tool runs the scan?

Please let me know if you want a screenshot of the high memory usage.

Many thanks,
Andrew
Title: Re: High memory consumption when recursively scanning large image directories
Post by: Phil Harvey on December 05, 2011, 07:16:29 AM
Hi Andrew,

Yes.  The -csv option is memory intensive when reading a large number of files.  See this post (https://exiftool.org/forum/index.php/topic,3551.msg16175.html#msg16175) for an explanation.  Basically, the solution is to not use the -csv option.  I should probably add this to the performance hints.

- Phil
Title: Re: High memory consumption when recursively scanning large image directories
Post by: andrew.grant on December 09, 2011, 03:40:49 AM
Hi Phil,

Thanks for your fast response and answer. I think I understand now why information gathered while building a CSV must be accumulated in memory (as the columns in the CSV aren't known until the last picture is scanned(?)).  I will have a go at using the JSON format from PowerShell.

Best regards,
Andrew
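
A minimal Python sketch of the point above: it shows why the CSV header row cannot be written until every record has been seen. This is not ExifTool's actual implementation. The helper names (collect_columns, write_csv, sample_records) and the sample tag values are hypothetical. The first pass keeps only the column names in memory and the second pass streams the rows, which assumes the records can be produced twice.

```python
import csv
import io


def collect_columns(records):
    """First pass: remember only the tag names, in first-seen order."""
    columns = {}
    for record in records:
        for name in record:
            columns.setdefault(name, None)
    return list(columns)


def write_csv(records, columns, out):
    """Second pass: stream one row per record, leaving missing tags blank."""
    writer = csv.DictWriter(
        out, fieldnames=columns, restval="", lineterminator="\n"
    )
    writer.writeheader()
    for record in records:
        writer.writerow(record)


def sample_records():
    # Hypothetical per-file tag records; the second file introduces
    # columns that the first file does not have.
    yield {"SourceFile": "a.jpg", "Make": "Canon"}
    yield {"SourceFile": "b.jpg", "Model": "D90", "ISO": 200}


columns = collect_columns(sample_records())
buffer = io.StringIO()
write_csv(sample_records(), columns, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Had the header been written after the first record alone, it would have lacked the Model and ISO columns needed by the second. A per-record format such as JSON avoids this because each record carries its own tag names.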

Title: Re: High memory consumption when recursively scanning large image directories
Post by: Phil Harvey on December 09, 2011, 07:02:04 AM
Quote from: andrew.grant on December 09, 2011, 03:40:49 AM
(as the columns in the CSV aren't known until the last picture is scanned(?)).

Exactly.

- Phil