I know that ExifTool is generally extremely fast, even for large files.
But I have some TIFF files from my users where ExifTool takes 30 seconds or more at 100% CPU utilization to extract the embedded EXIF, IPTC and XMP data. The command line I use is a simple:
-G1
-m
-all:all
<FileName>
The file is a 100 MB TIFF file produced by Photoshop. I use ExifTool version 9.08.
I can upload the file for Phil if required and email him the link (it's a file from one of my users and thus I cannot just publish it).
Sure. Send me the link and I'll take a look (philharvey66 at gmail.com)
- Phil
Thanks, Phil.
Uploading now, link will be in your inbox shortly.
Thanks for the sample. I can confirm that ExifTool is very slow reading this file. The reason is that the GPS IFD is badly corrupted, and points to a list of 15 million IFDs (!!) which ExifTool tries to process. Ouch!
I'll see what I can do to patch this problem at my end, but this TIFF is corrupted and should be repaired.
- Phil
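To illustrate the kind of failure described above, here is a minimal Python sketch of a defensive walk over a TIFF IFD chain. It is not ExifTool's code and does not reproduce the exact corruption in this file; the function name walk_ifd_chain and the max_ifds limit are hypothetical. It only shows how a reader can refuse to follow an implausibly long or looping chain instead of spending 30 seconds on it.

```python
import struct


def walk_ifd_chain(data, max_ifds=1000):
    """Return the offsets of the chained IFDs in a TIFF byte string.

    Raises ValueError when the chain looks corrupted. max_ifds is an
    arbitrary illustrative limit, not a value taken from ExifTool.
    """
    if len(data) < 8:
        raise ValueError("truncated TIFF header")
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None:
        raise ValueError("missing TIFF byte-order mark")
    magic, offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("bad TIFF magic number")

    offsets = []
    visited = set()
    while offset != 0:
        if offset in visited:
            raise ValueError("IFD chain loops back on itself")
        if len(offsets) >= max_ifds:
            raise ValueError("implausibly long IFD chain")
        if offset + 2 > len(data):
            raise ValueError("IFD offset points past end of file")
        # Each IFD starts with a 2-byte entry count, followed by
        # 12-byte entries and a 4-byte offset to the next IFD.
        (count,) = struct.unpack_from(order + "H", data, offset)
        next_pos = offset + 2 + 12 * count
        if next_pos + 4 > len(data):
            raise ValueError("IFD entries run past end of file")
        visited.add(offset)
        offsets.append(offset)
        (offset,) = struct.unpack_from(order + "I", data, next_pos)
    return offsets
```

An application could run a check like this before handing a file to a metadata reader and flag the file as damaged when it raises.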
Yikes. :o
I will send you another sample (JPEG this time) which exhibits the same behavior. This file is much smaller.
My application has to deal with images gathered over a decade and maimed with all kinds and combinations of software. To handle all this, or at least to flag files as "damaged, please repair" is a big effort.
I guess with "should be repaired" you mean a specific ExifTool command or an update of the GPS data?
If you try to write this TIFF with ExifTool, you will get this message and ExifTool won't write the file.
Error: Bad IFD or truncated file in GPS1
This is what happens when ExifTool finds a serious problem -- it refuses to write the file because the risk of damage to the image is too high. There is nothing ExifTool can do to fix this. You would have to load it into an image editor and rewrite the file to fix this.
JPEG images are different because the metadata is separate from the image. For these you can follow the directions in FAQ number 20 (https://exiftool.org/faq.html#Q20) to rebuild the EXIF with whatever metadata ExifTool can read.
- Phil
I got the JPEG, thanks.
This one is a mystery. I see nothing wrong with the file, and ExifTool processes it in 0.13 seconds on my system here (Mac OS X 10.8).
Can you reproduce this problem? On what system and with what ExifTool version?
- Phil
Hi, Phil
you are correct with your observation. It took me a while to figure this out...
- When I extract data from the JPEG on my local hard disk, it works. And in about 0.2 seconds.
- When I copy the file to a network share and this share contains Czech characters, my app waits for 60 seconds for the ExifTool child process to return. Actually it waits for ExifTool to write {ready}, which does not happen.
When I run the same ARGs file I use for the child process on the Windows command line, ExifTool just returns with "file not found".
Can it be that ExifTool does not emit a {ready} when I use it with stay_open and there is only one file to process, and this file cannot be found?
The underlying problem of all this is of course again Perl's incompatibility with Windows UNICODE file names, or other file names containing characters Perl is not aware of. Under Windows, my app uses GetShortPathName to produce file and folder names ExifTool can digest. Unfortunately, on network shares this does not work. Here GetShortPathName always returns the original unmodified path, with Czech, Chinese or whatever characters are used. Also, in modern Windows versions, support for "short" path names can be turned off to increase file system performance and efficiency. ExifTool then cannot be used to process any files containing characters not supported by Perl :(
I fear that I have to tell my users not to use their own language when naming folders or files, but stick to ASCII/ANSI or at least their local code page. Which of course will not work in international teams. Is there any news from the Perl team about true UNICODE or UTF-8 file name support?
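As a rough illustration of the "local code page" restriction mentioned above, the following Python sketch reports which characters of a path cannot be encoded in a given code page. The function name unencodable_chars and the default of cp1252 are assumptions made for the example; the code page actually in effect depends on the Windows installation, and passing this check does not guarantee that ExifTool can open the file.

```python
def unencodable_chars(path, codepage="cp1252"):
    """Return the distinct characters in path that codepage cannot encode.

    codepage is a Python codec name such as "cp1252" (Western European)
    or "cp1250" (Central European). The default is only an assumption.
    """
    bad = []
    for ch in path:
        try:
            ch.encode(codepage)
        except UnicodeEncodeError:
            if ch not in bad:
                bad.append(ch)
    return bad


# A Czech folder name on a share: fine in cp1250, not in cp1252.
czech_path = r"\\server\share\Příliš\IMG_0001.jpg"
print(unencodable_chars(czech_path, "cp1252"))
print(unencodable_chars(czech_path, "cp1250"))
```

An application could use a check like this to warn about a path up front instead of waiting for the child process to time out.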
I just verified that ExifTool does send a {ready} message when only one file is named and that file doesn't exist. So the missing {ready} is a mystery.
I have a partial solution for the Windows Unicode filename problem but I haven't implemented it because it isn't a full solution. It would allow the Unicode file names to be specified on the command line, but would still fail if a directory name was specified. Would this be useful to you?
Also, the Windows command line is notoriously bad for passing Unicode characters on the command line. Most Windows systems seem to use Windows Latin 1 by default, but you would need to use UTF-8 to have a full character set. Can you ensure that the command processor running exiftool always uses UTF-8?
- Phil
Hi, Phil
I will need to investigate this further then. It was just my guess because my application waits in a loop, checking ExifTool output for the {ready} token several times a second. Do you send the {ready} to stdout or stderr in this case?
Maybe a flush is missing (?) and the small amount of data never reaches my app because Windows does not bother to send the few bytes (because ExifTool is still running in stay_open mode and Windows thinks it will send more data soon). This idle wait until timeout is actually a bigger problem for me than the actual "can't find this file because of weird file name" problem.
I use only UTF-8 ARG files to communicate with ExifTool to work around any codepage-related command line issues. ExifTool does not handle these file names even if I put the command prompt into UNICODE or UTF-8 code page before issuing ExifTool commands, or running UTF-8 encoded ARG files.
Quote: but I haven't implemented it because it isn't a full solution.
Well, this would help in quite a number of cases. Not in all, of course. In my experience, users more often use 'locale' characters in their folder names than in their file names. File names often use the standard number sequence generated by the camera, some sort of global project/client id or they are made up from date and time stored in the file. Folder names often include person or location names, which often require locale-specific characters.
Perl should really be supporting UNICODE or UTF-8 file names on Windows by now. How hard can that be? I'm waiting for this for two years now :'(
Quote from: Mac2 on December 18, 2012, 08:25:07 AM
Do you send the {ready} to stdout or stderr in this case?
{ready} always goes to stdout.
Quote: Maybe a flush is missing (?) and the small amount of data never reaches my app because Windows does not bother to send the few bytes (because ExifTool is still running in stay_open mode and Windows thinks it will send more data soon). This idle wait until timeout is actually a bigger problem for me than the actual "can't find this file because of weird file name" problem.
At my end, exiftool flushes the stdout stream when the {ready} message is sent. Windows really shouldn't be holding onto output when this is done.
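For reference, waiting for the {ready} token on stdout with a hard timeout can be sketched in Python as below. This is an illustration under assumptions, not the poster's application code: the helper names start_reader and read_until_ready are hypothetical, and the stream is expected to be the text-mode stdout of a child process started with ExifTool's -stay_open option. A background thread does the blocking reads so the caller can tell "{ready} arrived", "process closed its output" and "nothing arrived in time" apart.

```python
import queue
import threading
import time

READY = "{ready}"


def start_reader(stream):
    """Pump lines from a blocking text stream into a queue."""
    q = queue.Queue()

    def pump():
        for line in stream:
            q.put(line)
        q.put(None)  # marks end of stream

    threading.Thread(target=pump, daemon=True).start()
    return q


def read_until_ready(q, timeout):
    """Collect lines until {ready} is seen.

    Returns (lines, status) where status is "ready", "eof" or "timeout".
    """
    deadline = time.monotonic() + timeout
    lines = []
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return lines, "timeout"
        try:
            line = q.get(timeout=remaining)
        except queue.Empty:
            return lines, "timeout"
        if line is None:
            return lines, "eof"
        if line.strip() == READY:
            return lines, "ready"
        lines.append(line.rstrip("\r\n"))
```

Because error messages such as "file not found" go to stderr, a caller would typically drain stderr with a second reader so that neither pipe fills up while the other is being waited on.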
Quote: Well, this would help in quite a number of cases. Not in all, of course. In my experience, users more often use 'locale' characters in their folder names than in their file names.
Yes, but do you pass directory names or individual file names to exiftool? The solution works if you pass file names, even if the path to the file contains special characters.
Quote: Perl should really be supporting UNICODE or UTF-8 file names on Windows by now. How hard can that be? I'm waiting for this for two years now :'(
I'm not sure if this is Perl's problem. Perl uses the standard C libraries for file I/O, and as far as I know these have never supported Unicode file names on Windows.
- Phil
I'll check why I don't get any output in stdout for this specific case. Very strange.
I use complete paths in the arg file (folder and file names). One ARG file usually processes files from different folders. For performance I use ExifTool always in batch mode.
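The batch setup described here can be sketched in Python as follows. The function name build_argfile_text is hypothetical, the default options are the ones from the first post, and the layout (one argument per line, full file paths, a single -execute at the end) is an assumption about how such an ARG file might be put together, not the poster's actual code.

```python
def build_argfile_text(paths, options=("-G1", "-m", "-all:all")):
    """Build the text of an ARG file: one argument per line.

    paths are complete file paths (folder plus file name). They are
    written unquoted, because each line of an ARG file is one argument.
    """
    lines = list(options)
    for path in paths:
        if "\n" in path or "\r" in path:
            raise ValueError("path contains a line break: %r" % path)
        lines.append(path)
    lines.append("-execute")  # tells a stay_open process to run the batch
    return "\n".join(lines) + "\n"


def write_argfile(filename, paths):
    """Write the ARG file as UTF-8, as described in the thread."""
    with open(filename, "w", encoding="utf-8", newline="\n") as f:
        f.write(build_argfile_text(paths))
```

Paths from different folders can go into the same batch, since every line carries the complete path.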
Maybe I could change this to process files "per folder". Then set the "current directory" of the ExifTool process to that folder before processing the ARG file. And use only file names in the ARG file... This would allow me to take advantage of your "half" solution. I don't know right now if this is possible but I will look into this.
As I said, the half solution would work with special characters in the path, as long as only files were specified. So this should work:
exiftool "sømé diréctøry nåme/sømé fîlé nåmé.jpg"
but this wouldn't
exiftool "sømé diréctøry nåme"
So it shouldn't be necessary to change working directories.
- Phil
Quote: as long as only files were specified.
Ah, sorry :-[
I misread that as "only file names were specified". I never process entire directories with ExifTool, I always specify complete file names in the ARG files I use. Both the file names and the folder names may contain non-ASCII characters.
If this would work, my problem would be solved. And I'm sure many other users are waiting for this too (I've read past threads about UNICODE file names and potential workarounds before posting).