High memory consumption when recursively scanning large image directories

Started by andrew.grant, December 05, 2011, 01:35:53 AM


andrew.grant

Hi Phil,

Firstly, many thanks for making this immensely powerful tool freely available for use. I have been using it in conjunction with PowerShell to help a friend find related pictures scattered among hundreds of subdirectories and thousands of images.

We have encountered a scalability issue with the way ExifTool uses memory. I have read the post '-b and -x switches - Out of memory', but unfortunately we're still seeing this issue in 8.71 owing to the number of .JPG images we're scanning.

Issue: When retrieving all EXIF data by scanning large image folders to create a CSV file, the exiftool application's memory usage quickly grows to the point where all the RAM on our 32-bit Windows XP machine is used. This limits the size of the image library we can scan.

System: Windows XP (32-bit) and Windows 7 (64-bit)

Version: 8.71

Command: exiftool.exe -q -s -ext .JPG -r -csv "c:\library\images" > exif-info.csv

Instead of holding all the tags in memory before producing the CSV output, is it feasible to produce the CSV output as the tool runs the scan?

Please let me know if you want a screenshot of the high memory usage.

Many thanks,
Andrew

Phil Harvey

Hi Andrew,

Yes.  The -csv option is memory intensive when reading a large number of files.  See this post for an explanation.  Basically, the solution is to not use the -csv option.  I should probably add this to the performance hints.
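As a rough illustration of dropping -csv, the sketch below rebuilds the command from the first post with -json in its place and sends ExifTool's output straight to a file. The helper names (build_scan_command, run_scan) and the output path are hypothetical, and whether -json keeps memory flat on a particular library is an assumption worth checking on a small folder first.

```python
import subprocess


def build_scan_command(image_dir, exiftool="exiftool"):
    """Return the argument list for a recursive .JPG scan with JSON output."""
    # Same switches as the command in the first post, with -csv swapped for -json.
    return [exiftool, "-q", "-s", "-ext", ".JPG", "-r", "-json", image_dir]


def run_scan(image_dir, out_path, exiftool="exiftool"):
    """Run the scan and write ExifTool's stdout directly to out_path."""
    with open(out_path, "wb") as out:
        # stdout is handed to the file, so Python never buffers the output itself.
        result = subprocess.run(build_scan_command(image_dir, exiftool), stdout=out)
    return result.returncode
```

Something like run_scan(r"c:\library\images", "exif-info.json") would then be the equivalent of the original redirect, assuming exiftool is on the PATH.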

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

andrew.grant

Hi Phil,

Thanks for your fast response and answer. I think I understand now why the information gathered while building a CSV must be accumulated in memory: presumably the full set of columns in the CSV isn't known until the last picture has been scanned(?). I will have a go at using the JSON format from PowerShell.
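To check my understanding, here is a minimal Python sketch of the difference. It is only an illustration of the general idea, not ExifTool's actual code, and the function names and sample tags are made up. The CSV writer cannot emit its header until it has seen every record, while a per-record format (one JSON object per line here, which is not the same layout as ExifTool's -json output) can be written as each file is read.

```python
import csv
import io
import json


def write_csv_buffered(records, out):
    """Write records as CSV; all rows are held until the header is known."""
    rows = list(records)  # the whole scan sits in memory here
    columns = {}
    for row in rows:
        for key in row:
            columns.setdefault(key)  # remember each tag name in first-seen order
    writer = csv.DictWriter(out, fieldnames=list(columns), restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return list(columns)


def write_json_lines(records, out):
    """Write one JSON object per line as each record arrives; nothing is kept."""
    count = 0
    for row in records:
        out.write(json.dumps(row) + "\n")
        count += 1
    return count


# Hypothetical records: the second file introduces a tag the first one lacks.
sample = [
    {"SourceFile": "a.jpg", "Make": "Canon"},
    {"SourceFile": "b.jpg", "ISO": 100},
]
csv_out = io.StringIO()
write_csv_buffered(iter(sample), csv_out)
json_out = io.StringIO()
write_json_lines(iter(sample), json_out)
```

The ISO column only appears with the second file, so the CSV header could not have been written after the first one, whereas each JSON line is complete on its own.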

Best regards,
Andrew

