Dramatic drop of performance after 100 images processed

Started by oleg_osovitskiy, May 21, 2023, 02:47:29 AM


oleg_osovitskiy

Hello, I have a performance problem with the Perl library (Image::ExifTool).

I wrote a Perl script that walks through my images (with File::Find) and extracts only 2 tags: 'Model' and 'LensModel'.
Then I see which camera and lens were used most. I want to have statistics by year.
Normally I keep around 10,000 shots per year (+/- 50%).

When I debugged my script on a set of 12 images, it finished in less than 1 second.
But as soon as I run it on a whole year of images, each subsequent image takes longer and longer to process.
I run it on 1080p (vertical side) images, not full-size ones; they are small, around 1 MB each.

200 images are processed in 45 seconds
373 images are processed in 414 seconds (but only 243 images returned correct tags, other threw errors)
As you can see, fewer than twice as many images took roughly 9x as long to process.
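
The two timings quoted above can be broken down per image. This is only a rough sketch of the arithmetic; it assumes the longer run spent the same 45 seconds on its first 200 images, which the post does not state.

```perl
use strict;
use warnings;

# Figures quoted in the post: [images processed, elapsed seconds]
my @runs = ( [ 200, 45 ], [ 373, 414 ] );

# Average seconds per image over a whole run
sub avg_per_image {
    my ( $images, $seconds ) = @_;
    return $seconds / $images;
}

# Assumption: the second run handled its first 200 images at the same pace
# as the first run, so the extra images account for the extra time.
sub marginal_per_image {
    my ( $first, $second ) = @_;
    return ( $second->[1] - $first->[1] ) / ( $second->[0] - $first->[0] );
}

printf "run 1 average:  %.3f s/image\n", avg_per_image( @{ $runs[0] } );
printf "run 2 average:  %.3f s/image\n", avg_per_image( @{ $runs[1] } );
printf "images 201-373: %.3f s/image\n", marginal_per_image(@runs);
```

Under that assumption the later images cost a bit over 2 seconds each, against roughly 0.2 seconds each for the first 200, which fits a per-call cost that keeps growing.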

After a certain number of images, the processing speed settles at approximately 4-5 seconds per image and stays there.

The library certainly works fast on a small number of files. Am I using it wrong for a big number of files (10k-20k)? If I remove the call to ImageInfo(), execution is instant, so the time is clearly being spent in ImageInfo().
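
One way to confirm where the time goes, rather than removing the call, is to time each call individually. The helper below is a minimal sketch using the core Time::HiRes module; the name `timed` and the stand-in workload are hypothetical and only there so the snippet runs on its own.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Run a code reference and return (elapsed seconds, its scalar result).
sub timed {
    my ($code) = @_;
    my $start  = time();
    my $result = $code->();
    return ( time() - $start, $result );
}

# Stand-in workload (hypothetical). In the real script this would wrap the
# call being measured, e.g. sub { $exifTool->ImageInfo($jpg_file, ...) }
my ( $elapsed, $value ) = timed( sub { my $n = 0; $n += $_ for 1 .. 1000; $n } );
printf "took %.6f s, result %d\n", $elapsed, $value;
```

Printing the elapsed time every 100 files or so would show whether the per-call cost is flat or climbing.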

Here is how I use it:

# Tags: Model, LensModel
my $tag_cam  = 'Model';
my $tag_lens = 'LensModel';
my $tag_fnum = 'ApertureValue';

my @tags = ( $tag_cam, $tag_lens );

# Create the object once
my $exifTool = Image::ExifTool->new;
$exifTool->Options(Unknown => 0,
                   FastScan => 1,
                   Duplicates => 0,
                   ExtendedXMP => 0);

find( { wanted => \&process_image, no_chdir => 1 }, '.');

sub process_image
{
   my $debug_info = 0;

   if (($File::Find::name !~ /IMG_\d+/) &&
       ($File::Find::name =~ /\.jpg$/i)
       )
   {
      $total_image_count++;
      if (!($total_image_count % 100))
      {
        print(".");
      }

      my $jpg_file = $_;

      print("found: $File::Find::name\n") if($debug_info);

      # Get hash of meta information tag names/values from an image
      my $info = $exifTool->ImageInfo($jpg_file, \@tags, undef);
      ....
      my $lens_model = $$info{$tag_lens};
      my $cam_model  = $$info{$tag_cam};
      ....
      }
      else
      {
        printf("Exif read error!\n") if($debug_info);
      }
   }
}


Any ideas what I am doing wrong?

D:\Oleg\src\_perl\exif_stat>perl --version
This is perl 5, version 32, subversion 1 (v5.32.1) built for MSWin32-x64-multi-thread

D:\Oleg\src\_perl\exif_stat>cpan Image::ExifTool
Loading internal logger. Log::Log4perl recommended for better logging
CPAN: CPAN::SQLite loaded ok (v0.219)
CPAN: LWP::UserAgent loaded ok (v6.67)
Fetching with LWP:
http://cpan.strawberryperl.com/authors/01mailrc.txt.gz
CPAN: YAML::XS loaded ok (v0.82)
Fetching with LWP:
http://cpan.strawberryperl.com/modules/02packages.details.txt.gz
Fetching with LWP:
http://cpan.strawberryperl.com/modules/03modlist.data.gz
Database was generated on Sat, 20 May 2023 06:20:09 GMT
Updating database file ... Done!
CPAN: Module::CoreList loaded ok (v5.20210123)
Image::ExifTool is up to date (12.60).

Oh, and I also noticed that some images (e.g. from a Huawei P30 Lite) make the library crash, and after that it NEVER returns correct information for any subsequent image until the script is restarted.

Disk speed is not the issue: I tried both an HDD and an SSD, with no difference.

StarGeek

I recall this thread which was also about a performance hit over time.  Maybe that will help?
* Did you read FAQ #3 and use the command listed there?
* Please use the Code button for exiftool code/output.
 
* Please include your OS, Exiftool version, and type of file you're processing (MP4, JPG, etc).

oleg_osovitskiy

Quote from: StarGeek on May 21, 2023, 11:21:35 AM
I recall this thread which was also about a performance hit over time.  Maybe that will help?

Ahhhh! Thanks! That was the culprit:

      my $info = $exifTool->ImageInfo($jpg_file, \@tags, undef);

I passed @tags as a reference (\@tags), and the list kept growing with every call!

Now it works MUCH faster:

      my $info = $exifTool->ImageInfo($jpg_file, @tags, undef);

Now it processed 323 images in 12 seconds, or roughly 27 images/sec.

I expect it to process the folder with 10k images in a reasonable amount of time.
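
The difference between the two calls comes down to Perl's argument passing: a callee that receives `\@tags` can change the caller's array, while a flattened `@tags` cannot be lengthened by the callee. The sketch below illustrates this with a hypothetical stand-in routine, `fake_lookup`; it is not ExifTool's real logic and does not reproduce exactly what ImageInfo() does to the list. As far as I understand the Image::ExifTool documentation, ImageInfo() does update a referenced tag list on return, which is why sharing one list across calls is risky.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a routine that edits a referenced tag list in
# place. This is NOT ExifTool's real logic; it simply appends one entry per
# call so the growth is easy to see.
sub fake_lookup {
    my @args = @_;    # copy the arguments, as most subs do
    for my $arg (@args) {
        push @$arg, 'Extra' if ref $arg eq 'ARRAY';
    }
    return scalar @args;
}

my @shared = ( 'Model', 'LensModel' );
fake_lookup( 'a.jpg', \@shared ) for 1 .. 3;
print scalar(@shared), "\n";    # 5: the caller's array grew

my @flat = ( 'Model', 'LensModel' );
fake_lookup( 'a.jpg', @flat ) for 1 .. 3;
print scalar(@flat), "\n";      # 2: a flattened list cannot grow

my @copied = ( 'Model', 'LensModel' );
fake_lookup( 'a.jpg', [@copied] ) for 1 .. 3;
print scalar(@copied), "\n";    # 2: each call got a throwaway copy
```

If a list reference is still wanted, passing a fresh anonymous copy such as `[@tags]` on each call keeps the original list unchanged.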

Here are the results of processing all images for the year 2022:
Total images processed: 12193
------------------------
NIKON D750      :  9914
NIKON D850      :  1994
VOG-L04         :    25
---------------------------------------
Nikon 70.0-200.0 mm f/2.8      :  6502
Nikon 14.0-24.0 mm f/2.8       :  2377
Nikon 35.0 mm f/2.0            :   984
Sigma 35.0 mm f/1.4            :   624
Nikon 85.0 mm f/1.8            :   613
Samyang 12.0 mm f/2.8          :   584
Nikon 105.0 mm f/2.8           :   224
HUAWEI P30 Pro Rear Main Camera :    25
---------------------------------------

Processing is finished in 436 seconds!

12k images in roughly 7 minutes, that's very good!!!

During processing, perl was using roughly 4-6% of the CPU and 28 MB of RAM, and was reading 6-7 MB/s from the disk.
The hardware is a 10th-gen Core i7 with 32 GB of RAM.

Phil Harvey

I'll add a note to the documentation to hopefully help people avoid this pitfall in the future.

- Phil