bug: hang or fail on large quantity multiple change in win xp

Started by pb, August 04, 2011, 04:45:24 PM


coz

Bogdan,

ExifTool (and ExifToolGUI!) are indeed great.  However, I have not used any other tool, other than Windows Explorer and Windows Photo Viewer, to view the files. 

I did use TeraCopy to copy the files from my laptop to my external drives and again to my desktop.  I don't think that would have done it but I will do some further research.

Chris

BogdanH

I believe you. But the fact is, ExifTool gives warnings for some files.
Windows Photo Viewer... as far as I can remember, this viewer can't use the Exif:Orientation tag value to display (portrait-oriented) images properly... hence, to see them vertically oriented, the user must rotate them manually. If I'm right on this, then by doing so, the jpg image file is being "corrected"... just thinking out loud. If I'm wrong, then it was something else.
Years ago, when I discovered that metadata was being altered without my knowledge (by software I wouldn't have suspected), I started searching for a tool... and found ExifTool. Impressed by its capability (and support!), I decided to make "my private small GUI" for it -I mean, who on earth can remember all these tag names and options :)
I'm not saying other good software doesn't exist... one just needs to check the consequences before trusting it fully. For example: even though I know Corel PSP is a poor player in the metadata area, I'm still using it for editing my photos... but not for managing metadata.

Bogdan

pb

I believe I used ctrl-A, but the past has a habit of becoming vague over time.

I used to keep smaller folders of images, but it became a nuisance for locating or browsing images.  However a few thousand is the max in win xp because otherwise the filesystem becomes too slow.

Quote from: BogdanH on August 05, 2011, 04:08:05 PM
Yes, I suggest you wait until I fix this known bug first -don't waste energy :)
If all files are selected, then GUI simply sends (for example):
exiftool -Exif:Artist="MyName" -Exif:Copyright="C by Myname" *.*
-or in case file extension filter is used:
exiftool -Exif:Artist="MyName" -Exif:Copyright="C by Myname" *.jpg
In this case I see no way the command line could exceed the ~32k character (Windows) limit.

Yes, I can imagine one could skip the first or last file when selecting all files (to be sure, Ctrl+A should be used). And from the GUI's perspective: only files visible in the GUI matter when the GUI "counts" them. If some of the selected files are not image files, then, after the process has finished, ExifTool will report that (via the GUI) -that is, this shouldn't cause any "unexpected" error.
To save ExifTool the time of reading files it doesn't need, the file extension filter should be used if possible -especially when working on a large number of files.
Speaking of which... do you really regularly keep 2000+ image files in a single folder? Just wondering.  :)

Bogdan

pb

I just counted the number of chars in the failing command line I posted earlier, and it is no more than 8236 bytes (the fastest way for me to count was to create a file of it and look at the size).  Obviously, that is nowhere near 32k, so maybe there is something else going on as well, possibly with exiftool.  However, again I'll wait til you get the known bug fixed before I try running more tests.
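For illustration only, here is a minimal Python sketch of the length check described above. The two limits are the documented Windows ones (32,767 characters for CreateProcess, 8,191 for a line typed into or passed through cmd.exe); which one applies depends on how the program is launched, and that is an assumption here, not something confirmed for the GUI. The helper names and the file names are hypothetical.

```python
# Documented Windows limits, in characters. Which one matters depends on
# how ExifTool is started; this sketch does not know how the GUI does it.
CREATEPROCESS_LIMIT = 32767  # maximum lpCommandLine length for CreateProcess
CMD_EXE_LIMIT = 8191         # maximum command line length accepted by cmd.exe


def build_command(tags, filenames):
    """Join tag assignments and quoted file names into one command line."""
    parts = ["exiftool"]
    parts += ['-{}="{}"'.format(name, value) for name, value in tags.items()]
    parts += ['"{}"'.format(name) for name in filenames]
    return " ".join(parts)


def check_length(command):
    """Report the length of a command line and whether it fits each limit."""
    n = len(command)
    return {
        "length": n,
        "fits_createprocess": n <= CREATEPROCESS_LIMIT,
        "fits_cmd_exe": n <= CMD_EXE_LIMIT,
    }


if __name__ == "__main__":
    tags = {"Exif:Artist": "MyName", "Exif:Copyright": "C by Myname"}
    # Hypothetical file names, only to show how the length grows.
    files = ["IMG_{:04d}.jpg".format(i) for i in range(1, 2001)]
    print(check_length(build_command(tags, files)))
    print(check_length(build_command(tags, ["*.jpg"])))
```

Note that a line of 8236 characters is well under the CreateProcess limit but slightly over the cmd.exe one, so it may be worth checking whether a shell is involved anywhere before ruling out length as a factor.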

Quote from: BogdanH on August 05, 2011, 12:52:08 PM
Ok... I'll be as short as possible on this (you'll understand why).

I've said that since the latest GUI version, there's no limit on the number of selected files. But as it seems, I should add "in most cases". The thing is, when implementing this feature, I forgot to do that for the Exif, Iptc and Xmp edit [ ^ ] buttons!! I still can't believe this happened  :-[
That is, these three buttons (and hopefully nothing else) still behave the old way: if the total length of all filenames (incl. ExifTool parameters) exceeds ~30 KB, then "unpredictable" things can happen.
I will fix this in the next few days.
However, if all files are selected in the filelist pane, then the above limitation doesn't exist (because in this case, the GUI passes the *.* wildcard expression to ExifTool).

Bogdan

Phil Harvey

Quote from: BogdanH on August 06, 2011, 08:37:29 AM
Why there's no problem when modifying IPTC section only? I don't know..

The answer is simple:  ExifTool only modifies the EXIF if you are writing EXIF information, for 2 reasons:  1) It's faster, and 2) I don't like the idea of ExifTool changing something unless you tell it to.  If you write only IPTC tags, you shouldn't expect the EXIF to change.

Having said this, for most raw images the IPTC is actually stored inside the TIFF/EXIF structure.  So for these images, editing the EXIF is unavoidable.  But for JPEG images, the EXIF and IPTC are in separate segments.

- Phil

pb

Thanks for this clarification, though it doesn't currently affect me.  I strongly agree with the sentiment about what exiftool should and shouldn't modify.  However, for a neophyte like me, the story about raw images is new info and counterintuitive.  If you don't already mention it somewhere in the FAQ, it might prove useful for people like me if you added it.

Quote from: Phil Harvey on August 06, 2011, 02:03:42 PM
Quote from: BogdanH on August 06, 2011, 08:37:29 AM
Why there's no problem when modifying IPTC section only? I don't know..

The answer is simple:  ExifTool only modifies the EXIF if you are writing EXIF information, for 2 reasons:  1) It's faster, and 2) I don't like the idea of ExifTool changing something unless you tell it to.  If you write only IPTC tags, you shouldn't expect the EXIF to change.

Having said this, for most raw images the IPTC is actually stored inside the TIFF/EXIF structure.  So for these images, editing the EXIF is unavoidable.  But for JPEG images, the EXIF and IPTC are in separate segments.

- Phil

Phil Harvey

Quote from: pb on August 06, 2011, 02:14:51 PM
However, for a neophyte like me, the story about raw images is new info and counterintuitive.  If you don't already mention it somewhere in the FAQ, it might prove useful for people like me if you added it.

It is true that I haven't documented this.  The reason is that explaining it properly requires an explanation of metadata structure of the various image formats, which would also be useful, but is something that I have been trying to avoid.  Also, this is an implementation-specific detail that I would prefer not to document in case I want to change the details of the implementation in a future ExifTool version.

- Phil

pb

I have now tested the multiple-file hang using version 4.17 of exiftoolgui (and v 8.60 of exiftool).  The problem is exactly the same as before, and "it" hangs at exactly the same point on the same directory (well, a copy of the original directory) -- after processing 257 files.  This time I verified that the command line to exiftool indeed contains "*.*".  The second spawned exiftool is hung -- it stops any further i/o activity and uses no more cpu.

I don't know how exiftoolgui is communicating with exiftool -- conceivably the hang could be due to some pipe problem -- so I ran the exiftool command directly from the command prompt.  Running it that way does not hang, and finishes modifying the entire directory, without any problem.

My conclusion is that the problem is probably coming from whatever method is being used to communicate between exiftoolgui and exiftool.  My guess is that you have some kind of pipe, and due to some error in handling the pipe a deadly embrace kind of situation is happening -- exiftool is blocked because the pipe is full (or waiting for it to be emptied), while exiftoolgui is blocked thinking that it needs more data in the pipe before emptying it, or maybe blocked in some intermediate operation before working on emptying the pipe again, for example some auto-refresh task (or blocked in some weird way, e.g. caused by an exiftool error message about fixing something in an image file).  As usual, there are many ways to lose;  I'm just guessing at some.
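To make the "deadly embrace" guess concrete, here is a minimal Python sketch of the general pattern, not of exiftoolgui's actual code (which I have not seen). A stand-in child process, which is NOT ExifTool, writes a large number of warning lines to stderr and progress lines to stdout. If the parent read only one pipe to the end before touching the other, the child would block once the unread pipe filled up, and both sides would wait forever. Draining both pipes concurrently avoids that.

```python
import subprocess
import sys

# Stand-in child (not ExifTool): emits enough output on BOTH streams to
# overflow a typical pipe buffer if either stream is left unread.
CHILD_CODE = (
    "import sys\n"
    "for i in range(20000):\n"
    "    sys.stderr.write('Warning: minor problem in file %d\\n' % i)\n"
    "    sys.stdout.write('updated file %d\\n' % i)\n"
)


def run_and_drain(argv):
    """Run a child process and read stdout and stderr until it exits.

    communicate() services both pipes concurrently, so the child can never
    block on a full pipe while the parent waits on the other one.
    """
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = proc.communicate()
    return proc.returncode, out, err


if __name__ == "__main__":
    code, out, err = run_and_drain([sys.executable, "-c", CHILD_CODE])
    print(code, len(out.splitlines()), len(err.splitlines()))
```

The failing variant would be something like calling `proc.stdout.read()` to completion before ever reading `proc.stderr`: with only a few warnings it works, with many it stalls, which would at least be consistent with a hang that shows up only on certain directories.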

I should also mention one small irregularity, namely that using the copyright symbol from the command line that exiftoolgui sends to exiftool (as reported by Process Explorer) in the command I used to run exiftool directly from a shell results in a wrong symbol ending up in the exif field.  I'm pretty confident that this is an independent charset problem, but just in case it's not, I'm mentioning it.

BogdanH

Now that you've mentioned it: the GUI uses the standard MS CreateProcess routine for calling ExifTool, and pipes, of course. I can imagine that the pipe handling (in/out) could be the reason for this issue.
I've run the GUI with about 1500 mixed JPG and RAW files (several GBytes) and everything is fine -so it's hard for me to find out where the problem is. Anyway, I will take a closer look at my piping methods, and if needed I'll make a "special" GUI version where you can change some related parameters, which should help locate what needs to be changed.

Bogdan

PS: I'm in a hurry.. just in short: to enter the copyright symbol in the GUI, press (and hold) the Alt key and type 0169 on the numeric keypad (it's a Windows thing).

pb

One more thing that I should have mentioned in my last post was that I notice that a hang has occurred when the gui fails to redraw its window after having been occluded by something else.  This might only be a symptom of the fact that it is hung, but it might also mean that something in the user interface hangs first, and that propagates to failing to read from the exiftool pipe.  This might or might not be related to the fact that exiftool issues a fairly large number of warnings about minor problems in the exif data.

pb

Quote from: BogdanH on August 10, 2011, 01:22:04 AM

Bogdan

PS: I'm in a hurry.. just in short: to enter the copyright symbol in the GUI, press (and hold) the Alt key and type 0169 on the numeric keypad (it's a Windows thing).

Although that's useful info, that isn't what I was wondering about.  Here's what was weird:  sysinternals Process Explorer shows the command line that exiftoolgui gave to the shell to run exiftool.  What it showed was the circle-c copyright symbol.  I used the windows "copy to clipboard" (ctrl-c) and then pasted that command line into an emacs shell buffer.  Emacs also then showed the symbol correctly.  I then ran the command line, which ran just fine.  Then when I used either exiftool or exiftoolgui to show what ended up in the copyright field, it showed a different pair of characters (now a pair, perhaps reflecting a conversion to unicode or other 16 bit coding.)  What I am trying to figure out is exactly what all character conversions were taking place to cause this, and what character I should have handed to exiftool that would be exactly the character that exiftoolgui actually gave it in the command line it used.  (BTW I am running a US English Windows installation, with everything set to US English.)
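One plausible mechanism for a single symbol turning into a pair of characters can be shown with a short Python sketch. This is an assumption about what may have happened, not a diagnosis: it only demonstrates that the copyright sign is one byte in Windows-1252 (the code page Alt+0169 refers to on a US English system) and two bytes in UTF-8, and that reading those two bytes back as Windows-1252 yields two characters. The `roundtrip` helper is hypothetical.

```python
COPYRIGHT = "\u00a9"  # the copyright sign

# One byte in Windows-1252, two bytes in UTF-8.
ansi_bytes = COPYRIGHT.encode("cp1252")  # b'\xa9'
utf8_bytes = COPYRIGHT.encode("utf-8")   # b'\xc2\xa9'

# UTF-8 bytes misread as Windows-1252 become two characters.
mojibake = utf8_bytes.decode("cp1252")


def roundtrip(text, written_as, read_as):
    """Encode text in one charset and decode it in another."""
    return text.encode(written_as).decode(read_as, errors="replace")


if __name__ == "__main__":
    print(ansi_bytes, utf8_bytes, repr(mojibake))
    # The reverse mistake: a lone 0xA9 byte is not valid UTF-8.
    print(repr(roundtrip(COPYRIGHT, "cp1252", "utf-8")))
    # Plain ASCII survives either way.
    print(roundtrip("(c) Copyright", "ascii", "utf-8"))
```

Comparing the raw bytes stored in the file against these two encodings would show which conversion actually took place.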

BTW, based on Phil Harvey's answer to a similar question on the exiftool forum, it's probably not such a great idea to use the copyright symbol in exif data, only in xmp data, since as I understand his response, strictly speaking it is not allowed.  I think I will end up just saying "(c) Copyright" instead of using the symbol, since it is much more likely to survive obscure character set conversions.

PH Edit: Added link to thread in the exiftool board

BogdanH

@pb
First, thanks again for taking the time to give additional info. As said, I can hardly test my code, because I don't get these "hangs". Here is a link to a "beta" version of the GUI:
Link removed
-to start somewhere, I have significantly increased the pipe buffer.
Now, if you would be so kind as to try it out... if the buffer is/was the reason, then I expect that either there's no "hang" anymore or many more than 257 of your files will be processed. If nothing changes, I'll need to look elsewhere.
You've mentioned you get many "minor" errors.. have you tried to "check" the Options>Ignore existing minor errors menu? Does it make any difference in the number of files processed before the "hang"?

Thank you,
Bogdan

PS: as soon as you confirm downloading, BetaGUI will be deleted.

pb

I haven't had time to try the beta gui yet, but I do have more information to report.  Sorry I did not get a chance to tell you last night (my time zone - Pacific) so you would have seen it early your time.

I realized that I could check whether the problem was coming from the many warning messages that exiftool generates on my files, because as you mentioned to coz earlier in this thread, it fixes the problems on the first pass, and subsequent runs will not find those problems.  So, I ran exiftoolgui on the same directory after I had already run exiftool once.  The result was that exiftoolgui completed the entire directory of over 2000 files without hanging.  This makes me believe that the problem is related to the many warning messages.  Coz's complaint and resolution are consistent with this theory.  I can send you the full output of exiftool if you want to analyze why it could cause a problem.

Doing this did uncover some other less serious problems, though:

1.  After all exiftool runs had finished, exiftoolgui used 100% of cpu time on one cpu (I have a dual cpu machine), for more than one minute (maybe several minutes -- I did not time it) before it was able to refresh the ui and show a message box telling me about warnings/errors.

2.  Then, after I clicked 'ok' on the message box, exiftoolgui again took a very long time at 100% cpu before it finally refreshed the ui again and became available for input.  I did have autorefresh turned on, and was displaying only filesystem information (so presumably exiftoolgui did not have to read the contents of all files).  Doing a manual refresh by clicking the 'refresh' button only takes a few seconds (at most 10 sec) on the same directory, so apparently a lot of unnecessary crunching is going on.

3.  One of the error messages from exiftool was for a non-image file in the directory.  Exiftool interprets *.* as meaning all files, while exiftoolgui only displays image files, and only certain image files.  In my case, this was not a problem, but in the case of image files that are not supported by exiftoolgui's file pane, but which are supported by exiftool, this can result in exiftool processing files that the user was not aware would be processed.
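Point 3 can be sketched in Python as follows. The set of extensions shown in the file pane is hypothetical (I do not know the GUI's real list), and so are the helper names. The first helper shows which files a bare *.* would touch beyond what the user sees; the second builds an argument list that uses ExifTool's `-ext` option to restrict processing to the visible extensions instead of passing *.*.

```python
import os

# Hypothetical list of extensions the file pane displays; the GUI's real
# list is not known here.
VISIBLE_EXTENSIONS = {".jpg", ".jpeg", ".tif", ".tiff"}


def split_by_visibility(filenames, visible=VISIBLE_EXTENSIONS):
    """Separate files the pane would show from files *.* would also match."""
    shown, hidden = [], []
    for name in filenames:
        ext = os.path.splitext(name)[1].lower()
        (shown if ext in visible else hidden).append(name)
    return shown, hidden


def exiftool_args(tag_args, visible=VISIBLE_EXTENSIONS, directory="."):
    """Build an argument list that limits ExifTool to the visible extensions."""
    args = ["exiftool"] + list(tag_args)
    for ext in sorted(visible):
        args += ["-ext", ext.lstrip(".")]
    args.append(directory)
    return args


if __name__ == "__main__":
    shown, hidden = split_by_visibility(["a.JPG", "notes.txt", "b.cr2"])
    print("shown:", shown)
    print("also matched by *.*:", hidden)
    print(exiftool_args(["-Exif:Artist=MyName"], {".jpg"}, "DIR"))
```

Passing a list of arguments rather than one long string also sidesteps quoting problems when the list is handed to a process-spawning call.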

pb

Ok, I downloaded the beta and confirmed that I can run it.   (Have not had time to test it on the problem directory yet.)

Quote from: BogdanH on August 11, 2011, 12:58:22 PM
@pb
First, thanks again for taking the time to give additional info. As said, I can hardly test my code, because I don't get these "hangs". Here is a link to a "beta" version of the GUI:
Beta1GUI
-to start somewhere, I have significantly increased the pipe buffer.
Now, if you would be so kind as to try it out... if the buffer is/was the reason, then I expect that either there's no "hang" anymore or many more than 257 of your files will be processed. If nothing changes, I'll need to look elsewhere.
You've mentioned you get many "minor" errors.. have you tried to "check" the Options>Ignore existing minor errors menu? Does it make any difference in the number of files processed before the "hang"?

Thank you,
Bogdan

PS: as soon as you confirm downloading, BetaGUI will be deleted.

BogdanH