Segmentation fault on big XML file input

Started by findus, August 29, 2012, 07:53:21 AM


findus

Hi!

I'm using exiftool to get the MIME type of files, and at the same time to extract EXIF data from files that contain it.
I have discovered a file that makes exiftool crash with a segmentation fault. It's a big XML file in UTF-16LE encoding.

Some metadata: (produced using Cygwin)

$ file failfile.xml
failfile.xml: XML  document, Little-endian UTF-16 Unicode text, with CRLF line terminators

$ file -i failfile.xml
failfile.xml: application/xml; charset=utf-16le

$ ls -la failfile.xml
-rwx------+ 1 Findus Domain Users 131897984 Sep  7  2011 failfile.xml


When I run exiftool on the file (for example with the command "exiftool failfile.xml"), the result is a long processing time (about 2 minutes), huge memory consumption (1.75 GB), and a segmentation fault before any output is produced.

If I convert the file to UTF-8, it still takes a very long time and uses a lot of memory, but I get a result and the program exits cleanly.
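(For anyone who wants to reproduce the conversion step: something like the following iconv invocation should work. The sample file created here is just a small stand-in for the real failfile.xml.)

```shell
# Create a small UTF-16LE XML sample (stand-in for the real failfile.xml)
printf '<?xml version="1.0"?><root/>' | iconv -f UTF-8 -t UTF-16LE > sample-utf16.xml

# Convert UTF-16LE to UTF-8
iconv -f UTF-16LE -t UTF-8 sample-utf16.xml > sample-utf8.xml

# The converted copy is now plain UTF-8 and can be fed to exiftool
cat sample-utf8.xml
```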

If I test on a smaller (a few kB) UTF-16LE file, exiftool works like a charm.

I am using exiftool 8.99 in Cygwin on Windows 7 (same problem when I run it outside Cygwin).

I can supply the file if it would help, but since it is quite big, I'm not uploading it now, per the instructions in:
https://exiftool.org/forum/index.php/topic,6.0.html

Phil Harvey

Thanks for this report.  Could you email me the file or make it available for download?  My email is philharvey66 at gmail.com

- Phil