ExifTool Forum

ExifTool => The "exiftool" Application => Topic started by: Luiz on January 17, 2017, 05:18:39 AM

Title: ExifTool used for TIFF format validation
Post by: Luiz on January 17, 2017, 05:18:39 AM
Hi,

FYI there is a blog post about using ExifTool as a validation tool.
http://openpreservation.org/blog/2017/01/17/tiff-format-validation-easy-peasy/

Here is what they say:

Validation: ExifTool is not really meant for validation, either. It's for metadata extraction. The information about image errors is just a by-product if the tool runs into any problems while trying to extract metadata. So it's not really fair to treat ExifTool like a validation tool, as it would never complain about an absolute unreadable TIFF which cannot be opened by any viewer, as long as all the metadata can get extracted. That might be the reason why ExifTool has the highest percentage of presumably valid TIFF files within this test. So, "valid" for ExifTool means, that there were no warnings or errors in the metadata output.

Handling: It's a command-line-tool with quite good possibilities to batch whole folders and output human-readable csv (though the csv can have many, many columns, as images can have a myriad of metadata).
Title: Re: ExifTool used for TIFF format validation
Post by: Phil Harvey on January 17, 2017, 07:49:48 AM
Interesting, thanks.

- Phil
Title: Re: ExifTool used for TIFF format validation
Post by: StarGeek on January 17, 2017, 12:01:18 PM
Why do I get the feeling that Common Mistake 3 (http://www.exiftool.org/mistakes.html) was used during the testing :)

And this is the first I've heard of the Google ImageTestSuite, and it's gone :( Nevermind

Still a cool article.
Title: Re: ExifTool used for TIFF format validation
Post by: Phil Harvey on January 17, 2017, 01:41:39 PM
Yes, it would be interesting to see the script they used.

I just downloaded the TIFF google image test suite (https://code.google.com/archive/p/imagetestsuite/downloads) from the link in the article, so it is still there. :)

I am also inspired to add a new -validate feature to the next ExifTool release.  Currently only 7 of the 166 google test images pass the validation (or 49 if minor warnings are ignored).

- Phil
Title: Re: ExifTool used for TIFF format validation
Post by: StarGeek on January 17, 2017, 02:13:48 PM
Quote from: Phil Harvey on January 17, 2017, 01:41:39 PM
I just downloaded the TIFF google image test suite (https://code.google.com/archive/p/imagetestsuite/downloads) from the link in the article, so it is still there. :)

Ah, I didn't have GoogleApis.com whitelisted in my Noscript.  I see it now.
Title: Re: ExifTool used for TIFF format validation
Post by: Luiz on January 19, 2017, 02:39:06 PM
Quote from: Phil Harvey on January 17, 2017, 01:41:39 PM
Yes, it would be interesting to see the script they used.

I did not do the testing but as far as I know it was
exiftool -a -u -U -H -g1 -r -csv inputfolder > out.csv
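
For anyone wanting to reproduce the article's notion of "valid" (no warnings or errors in the metadata output) from a CSV like out.csv, here is a rough Python sketch. It assumes the CSV has a SourceFile column and that warning/error columns are named Warning or Error, optionally with a group prefix such as ExifTool:Warning. The function name files_with_problems and the sample rows (including the warning text) are made up for illustration and are not real ExifTool output.

```python
import csv
import io


def files_with_problems(csv_text):
    """Map each SourceFile to its non-empty Warning/Error cells.

    Assumption: problem columns are called "Warning" or "Error",
    possibly prefixed with a group name ("ExifTool:Warning").
    """
    problems = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        hits = {
            col: val
            for col, val in row.items()
            # col is None for surplus cells, val is None for missing cells
            if col and col.split(":")[-1] in ("Warning", "Error") and val
        }
        if hits:
            problems[row["SourceFile"]] = hits
    return problems


# Made-up sample rows, only to show the expected shape of the CSV.
sample = (
    "SourceFile,ExifTool:Warning,IFD0:ImageWidth\n"
    "inputfolder/a.tif,,640\n"
    "inputfolder/b.tif,Bad IFD0 directory,\n"
)

print(files_with_problems(sample))
```

A file that does not show up in the result only means ExifTool reported nothing while extracting metadata. It says nothing about whether the image data itself can be opened.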

Quote from: Phil Harvey on January 17, 2017, 01:41:39 PM
I am also inspired to add a new -validate feature to the next ExifTool release.  Currently only 7 of the 166 google test images pass the validation (or 49 if minor warnings are ignored).

I think that would make a lot of people in the digital preservation community very happy. But as you see that is no easy task (or you have to limit that feature to selected file formats)
Title: Re: ExifTool used for TIFF format validation
Post by: Phil Harvey on January 20, 2017, 07:29:20 AM
Quote from: Luiz on January 19, 2017, 02:39:06 PM
I did not do the testing but as far as I know it was
exiftool -a -u -U -H -g1 -r -csv inputfolder > out.csv

OK.  That's not really a very good way to validate images.

Quote
I think that would make a lot of people in the digital preservation community very happy. But as you see that is no easy task (or you have to limit that feature to selected file formats)

I'm not thinking about writing a full validator for all file formats.  I don't know if it is even feasible to implement a full validation of TIFF images.  I'm just thinking about an option to add more validation at the expense of  processing speed for users that are looking for this.

- Phil
Title: Re: ExifTool used for TIFF format validation
Post by: Luiz on February 07, 2017, 05:47:11 AM
Quote from: Phil Harvey on January 17, 2017, 01:41:39 PM
I am also inspired to add a new -validate feature to the next ExifTool release.

If you want to go ahead with this topic, there is another blog post called "repairing TIFF images - a preliminary report" containing a collection of real-world examples with common errors in TIFF. Maybe knowing these errors will help you.
https://kulturreste.blogspot.de/2017/01/repairing-tiff-images-preliminary-report.html

Quote from: Phil Harvey on January 20, 2017, 07:29:20 AM
OK.  That's not really a very good way to validate images.

What would be your advice for validating and simultaneously getting as much metadata as possible out of an image?
Title: Re: ExifTool used for TIFF format validation
Post by: Phil Harvey on February 07, 2017, 07:35:52 AM
Quote from: Luiz on February 07, 2017, 05:47:11 AM
If you want to go ahead with this topic, there is another blog post called "repairing TIFF images - a preliminary report" containing a collection of real-world examples with common errors in TIFF. Maybe knowing these errors will help you.
https://kulturreste.blogspot.de/2017/01/repairing-tiff-images-preliminary-report.html

Thanks.

Quote
What would be your advice for validating and simultaneously getting as much metadata as possible out of an image?

With ExifTool 10.41 or later, I would do this:

exiftool -api validate -a -u -G1 FILE

This will add extra warnings when problems are detected.
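
For batch use, the command above could be wrapped in a script along these lines. This is only a sketch: it assumes exiftool is on the PATH, that output lines look like "[Group] Tag Name : value", and that minor warnings start with "[minor]". The function names and the sample report text are made up for illustration and were not captured from a real run.

```python
import re
import subprocess

# Assumed line layout with -G1: "[Group]   Tag Name   : value"
LINE_RE = re.compile(r"^\[(?P<group>[^\]]+)\]\s+(?P<tag>.*?)\s*:\s?(?P<value>.*)$")


def run_validate(path):
    """Run the command from the post on one file and return its text output."""
    result = subprocess.run(
        ["exiftool", "-api", "validate", "-a", "-u", "-G1", path],
        capture_output=True,
        text=True,
    )
    return result.stdout


def parse_report(text):
    """Split the report into (group, tag, value) tuples, skipping other lines."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            entries.append(
                (match.group("group"), match.group("tag"), match.group("value"))
            )
    return entries


def split_problems(entries):
    """Return (errors, major_warnings, minor_warnings) as lists of messages."""
    errors, major, minor = [], [], []
    for _group, tag, value in entries:
        if tag == "Error":
            errors.append(value)
        elif tag == "Warning":
            # Assumption: minor warnings carry a "[minor]" prefix.
            (minor if value.startswith("[minor]") else major).append(value)
    return errors, major, minor


# Made-up report text, only to show the parsing.
sample_report = (
    "[ExifTool]      Warning                         : Example major problem\n"
    "[ExifTool]      Warning                         : [minor] Example minor problem\n"
    "[IFD0]          Image Width                     : 640\n"
)

print(split_problems(parse_report(sample_report)))
```

For a real file, split_problems(parse_report(run_validate("FILE"))) gives the three lists. Whether to count a file as passing when only minor warnings remain is left to the caller.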

But before the validate feature was added, I would have recommended trying to write something to the file using ExifTool since the writing code was much more strict than the reader.

- Phil
Title: Re: ExifTool used for TIFF format validation
Post by: Luiz on February 09, 2017, 11:27:33 AM
Phil, this new validation feature looks very helpful. Thanks for the advice and explanation.