Problems with -r and -csv combined options

Started by nbsoft, January 30, 2014, 10:59:44 AM


nbsoft

Hello.

I'm running into trouble when combining the -r and -csv options on a complex directory tree.

In all cases, unless otherwise specified, I run the command:

exiftool -r -csv *

from the top-level directory.

Under Ubuntu Linux (Deft distro, a forensic one)
- both the default (8.x) and the latest exiftool versions always freeze (the process stays at 50-60% CPU, but nothing happens even after hours and hours...).

With very simple directory trees the program works fine.

Under Windows (standalone exiftool version, not Perl)
- the latest version (9.48) stops immediately with no output and the message:


Image/ExifTool/XMP.pm did not return a true value at Image/ExifTool/HTML.pm line 20.
BEGIN failed--compilation aborted at Image/ExifTool/HTML.pm line 20.
Compilation failed in require at Image/ExifTool.pm line 1773.


If I run simply

exiftool -r *

without -csv, it prints just the first file analysis and then stops with the previous message.

- the last two production versions (9.27 and 9.46) seem to work, but I have to wait until the directory recursion finishes before any output appears. I guess that in -csv mode the program buffers the output.

If I run simply

exiftool -r *

without -csv, it prints the output file by file, without having to wait for the entire recursion.

Is this normal?
It seems that on large directory trees the Windows version works better than the Linux (Perl) one. That sounds strange...
The Windows version 9.48 seems to be a little buggy.

Another question: is it possible to get unbuffered output (i.e. "file by file") in -csv mode as well? That seems a better choice when analyzing large directory trees. Having all the -csv columns is not a problem (the empty ones can be deleted afterwards, once the csv file has been imported into a spreadsheet).

Thanks in advance for your attention.

Nicola

Phil Harvey

Hi Nicola,

From the application documentation for the -csv option:

            Note that this option is fundamentally different than all other
            output format options because it requires information from all
            input files to be buffered in memory before the output is written.
            This may result in excessive memory usage when processing a very
            large number of files with a single command.


You should use some other output option when you have such a large number of files.

The error you get with HTML.pm is due to a corrupted installation.  Uninstall according to the instructions here, then re-run exiftool.

- Phil

Edit:  I have had this same question a few times recently, but if you think about it for a minute you should realize why the -csv option must cache the information in memory before it can print even the first line of the output:

> exiftool -csv -artist -title -imagesize a.jpg
SourceFile,Artist,ImageSize
a.jpg,me,3936x2608

> exiftool -csv -artist -title -imagesize a.jpg b.jpg
SourceFile,Artist,Title,ImageSize
a.jpg,Phil,,3936x2608
b.jpg,,A Title,8x8
    2 image files read
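
The two runs above can be mimicked with a short Python sketch. It is only an illustration of why the header row depends on every input file, not ExifTool's actual implementation; the names `csv_columns` and `to_csv` and the sample records are hypothetical, with values borrowed from the second run.

```python
import csv
import io

def csv_columns(records, requested):
    """Columns for the header row: SourceFile first, then every requested
    tag that occurs in at least one record (requested order is kept)."""
    present = set()
    for record in records:
        present.update(record)
    return ["SourceFile"] + [tag for tag in requested if tag in present]

def to_csv(records, requested):
    """Write the CSV only after ALL records have been collected."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=csv_columns(records, requested),
        restval="",               # missing tags become empty cells
        extrasaction="ignore",    # skip tags that were not requested
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()

# Hypothetical records modeled on the example in this post.
requested = ["Artist", "Title", "ImageSize"]
a = {"SourceFile": "a.jpg", "Artist": "Phil", "ImageSize": "3936x2608"}
b = {"SourceFile": "b.jpg", "Title": "A Title", "ImageSize": "8x8"}

print(to_csv([a], requested))     # header has no Title column
print(to_csv([a, b], requested))  # adding b.jpg changes the header
```

The header for `a.jpg` alone lacks the Title column, so it could not have been printed correctly before `b.jpg` was read.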

nbsoft

Thank you very much for your prompt reply and for your wonderful software.

I had guessed that the reason was the need to determine the number of columns, but what if you included ALL the columns? E.g. OpenOffice supports up to 1024 columns.

An unbuffered -csv option could actually be very useful...

Once more thanks!

Phil Harvey

ExifTool currently recognizes 18075 pre-defined tags (somewhat beyond the OpenOffice limit), and that doesn't include the non-pre-defined tags that ExifTool extracts.  Not to mention duplicate tags.

If you want unbuffered, -json is a much better format.

- Phil
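
If a spreadsheet-friendly file is still wanted, the -json output can be converted to CSV in a separate step once the run has finished. The Python sketch below assumes the output is a JSON array with one object per file, each carrying a "SourceFile" key; the function name `json_to_csv` and the sample text are hypothetical, with values taken from the example earlier in this thread.

```python
import csv
import io
import json

def json_to_csv(json_text):
    """Convert a JSON array of per-file objects into CSV text.
    Columns appear in first-seen order; missing tags become empty cells."""
    records = json.loads(json_text)
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()

# Hypothetical sample shaped like saved -json output.
sample = """[
  {"SourceFile": "a.jpg", "Artist": "Phil", "ImageSize": "3936x2608"},
  {"SourceFile": "b.jpg", "Title": "A Title", "ImageSize": "8x8"}
]"""

print(json_to_csv(sample))
```

The conversion itself still has to see every record before writing the header, but that cost is paid after the scan, not while ExifTool is walking the directory tree.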