Hi,
FYI there is a blog post about using ExifTool as a validation tool.
http://openpreservation.org/blog/2017/01/17/tiff-format-validation-easy-peasy/
Here is what they say:
Validation: ExifTool is not really meant for validation, either. It's for metadata extraction. The information about image errors is just a by-product if the tool runs into any problems while trying to extract metadata. So it's not really fair to treat ExifTool like a validation tool, as it would never complain about an absolute unreadable TIFF which cannot be opened by any viewer, as long as all the metadata can get extracted. That might be the reason why ExifTool has the highest percentage of presumably valid TIFF files within this test. So, "valid" for ExifTool means, that there were no warnings or errors in the metadata output.
Handling: It's a command-line-tool with quite good possibilities to batch whole folders and output human-readable csv (though the csv can have many, many columns, as images can have a myriad of metadata).
Interesting, thanks.
- Phil
Why do I get the feeling that Common Mistake 3 (http://www.exiftool.org/mistakes.html) was used during the testing :)
And this is the first I hear of the Google ImageTestSuite and it's gone :( Nevermind
Still a cool article.
Yes, it would be interesting to see the script they used.
I just downloaded the TIFF google image test suite (https://code.google.com/archive/p/imagetestsuite/downloads) from the link in the article, so it is still there. :)
I am also inspired to add a new -validate feature to the next ExifTool release. Currently only 7 of the 166 google test images pass the validation (or 49 if minor warnings are ignored).
- Phil
Quote from: Phil Harvey on January 17, 2017, 01:41:39 PM
I just downloaded the TIFF google image test suite (https://code.google.com/archive/p/imagetestsuite/downloads) from the link in the article, so it is still there. :)
Ah, I didn't have GoogleApis.com whitelisted in my Noscript. I see it now.
Quote from: Phil Harvey on January 17, 2017, 01:41:39 PM
Yes, it would be interesting to see the script they used.
I did not do the testing but as far as I know it was
exiftool -a -u -U -H -g1 -r -csv inputfolder > out.csv
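For anyone who wants to tally such a CSV afterwards: here is a minimal Python sketch that applies the article's definition of "valid" (no warnings or errors in the metadata output). It assumes the file has a SourceFile column and that warning and error columns have headers ending in "Warning" or "Error" after the group prefix (for example "ExifTool:Warning"). The function name rows_with_problems is made up for illustration, and this is not necessarily what was done in the test.

```python
import csv


def rows_with_problems(csv_path):
    """Map each SourceFile to its non-empty Warning/Error columns.

    Assumption: headers look like "Group:Warning" or "Group:Error"
    (or plain "Warning"/"Error" when no group prefix is present).
    """
    flagged = {}
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            problems = {
                column: value
                for column, value in row.items()
                # Skip the None key DictReader uses for surplus fields,
                # and skip empty or missing cells.
                if column
                and column.split(":")[-1] in ("Warning", "Error")
                and value
            }
            if problems:
                flagged[row.get("SourceFile", "")] = problems
    return flagged
```

Files that do not appear in the returned dictionary would count as "presumably valid" in the article's sense, which only says that metadata extraction raised no complaint.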
Quote from: Phil Harvey on January 17, 2017, 01:41:39 PM
I am also inspired to add a new -validate feature to the next ExifTool release. Currently only 7 of the 166 google test images pass the validation (or 49 if minor warnings are ignored).
I think that would make a lot of people in the digital preservation community very happy. But as you see, that is no easy task (or you have to limit that feature to selected file formats).
Quote from: Luiz on January 19, 2017, 02:39:06 PM
I did not do the testing but as far as I know it was
exiftool -a -u -U -H -g1 -r -csv inputfolder > out.csv
OK. That's not really a very good way to validate images.
Quote
I think that would make a lot of people in the digital preservation community very happy. But as you see that is no easy task (or you have to limit that feature to selected file formats)
I'm not thinking about writing a full validator for all file formats. I don't know if it is even feasible to implement a full validation of TIFF images. I'm just thinking about an option to add more validation at the expense of processing speed for users that are looking for this.
- Phil
Quote from: Phil Harvey on January 17, 2017, 01:41:39 PM
I am also inspired to add a new -validate feature to the next ExifTool release.
If you want to go ahead with this topic, there is another blog post called "repairing TIFF images - a preliminary report" containing a collection of real-world examples of common errors in TIFF files. Maybe knowing about these errors will help you.
https://kulturreste.blogspot.de/2017/01/repairing-tiff-images-preliminary-report.html
Quote from: Phil Harvey on January 20, 2017, 07:29:20 AM
OK. That's not really a very good way to validate images.
What would be your advice for validating and simultaneously getting as much metadata as possible out of an image?
Quote from: Luiz on February 07, 2017, 05:47:11 AM
If you want to go ahead with this topic, there is another blog post called "repairing TIFF images - a preliminary report" containing a collection of real-world examples of common errors in TIFF files. Maybe knowing about these errors will help you.
https://kulturreste.blogspot.de/2017/01/repairing-tiff-images-preliminary-report.html
Thanks.
Quote
What would be your advice for validating and simultaneously getting as much metadata as possible out of an image?
With ExifTool 10.41 or later, I would do this:
exiftool -api validate -a -u -G1 FILE
This will add extra warnings when problems are detected.
But before the validate feature was added, I would have recommended trying to write something to the file using ExifTool since the writing code was much more strict than the reader.
- Phil
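For scripting, the warnings from the command above could be collected with something like the following Python sketch. It assumes exiftool is on the PATH and that each output line has the usual "[Group] Tag Name : value" layout produced by -G1. The helper names find_problems and validate_file are made up for illustration and are not part of ExifTool.

```python
import re
import subprocess

# Matches lines of the form "[Group]   Tag Name   : value".
LINE_RE = re.compile(r"^\[(?P<group>[^\]]+)\]\s+(?P<tag>.+?)\s*: ?(?P<value>.*)$")


def find_problems(output):
    """Return (group, tag, value) for every Warning or Error line."""
    problems = []
    for line in output.splitlines():
        match = LINE_RE.match(line)
        if match and match.group("tag") in ("Warning", "Error"):
            problems.append(
                (match.group("group"), match.group("tag"), match.group("value"))
            )
    return problems


def validate_file(path):
    """Run the command from the post on one file and collect its problems.

    Assumption: exiftool 10.41 or later is installed and on the PATH.
    Only standard output is inspected here.
    """
    result = subprocess.run(
        ["exiftool", "-api", "validate", "-a", "-u", "-G1", path],
        capture_output=True,
        text=True,
        check=False,
    )
    return find_problems(result.stdout)
```

An empty list from validate_file only means ExifTool reported nothing while reading the metadata. As noted earlier in the thread, that is not the same as full validation of the image data.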
Phil, this new validation feature looks very helpful. Thanks for the advice and explanation.