ExifTool Forum

General => Metadata => Topic started by: bertalanimre on February 20, 2017, 06:07:15 AM

Title: Corrupted PDF files for testing
Post by: bertalanimre on February 20, 2017, 06:07:15 AM
Hey Forum,

I was wondering... I'm creating a bash script that will clean PDF files with a combination of Ghostscript, ExifTool and QPDF (I will share it on GitHub, but later). However, I want to test it with PDF files on which exiftool would fail with the following command:
exiftool -all= -XMPToolkit= -overwrite_original filename.pdf

When exactly does exiftool return an error stating that it cannot write the metadata to the file? I assume permissions are one cause, but what if the PDF file is damaged or corrupted somehow? Can you guys give me example PDF files I can test the command with? The script will try to execute the command, and if it finishes with a nonzero exit status (so with an error), it will write the error and the filename into a txt file for further debugging.
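The error-logging step described above could look roughly like the sketch below. It is only an illustration, not the actual script: the function name strip_metadata and the log file name exiftool_errors.txt are made up here, and it relies on exiftool exiting with a nonzero status when a write fails.

```shell
#!/usr/bin/env bash
# Hypothetical log file name (an assumption, not from the original post).
ERROR_LOG="${ERROR_LOG:-exiftool_errors.txt}"

# Strip metadata from one PDF. If exiftool exits with a nonzero status,
# append the file name and exiftool's messages to the log and return 1.
strip_metadata() {
    local file="$1" output
    if ! output=$(exiftool -all= -XMPToolkit= -overwrite_original "$file" 2>&1); then
        printf '%s: %s\n' "$file" "$output" >> "$ERROR_LOG"
        return 1
    fi
}

# Process every file given on the command line; keep going after a failure.
for pdf in "$@"; do
    strip_metadata "$pdf" || echo "failed: $pdf (see $ERROR_LOG)" >&2
done
```

Capturing stderr together with stdout (2>&1) keeps exiftool's own error text next to the file name, so the log alone should be enough to see why a given file failed.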
Title: Re: Corrupted PDF files for testing
Post by: Phil Harvey on February 20, 2017, 07:40:31 AM
Just rename any random file to .pdf and the command will fail (as long as the file isn't a format that ExifTool can write).

- Phil
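
A minimal sketch of that suggestion, assuming random bytes as the "random file" (they should not match any format ExifTool can write). The helper name make_fake_pdf and the file name not_really.pdf are made up for illustration.

```shell
#!/usr/bin/env bash
# Write 2048 random bytes to the given path. With a .pdf extension this
# gives a file that is named like a PDF but does not contain one.
make_fake_pdf() {
    local path="$1"
    head -c 2048 /dev/urandom > "$path"
}

# Example use (commented out so the sketch has no side effects):
#   make_fake_pdf not_really.pdf
#   exiftool -all= -XMPToolkit= -overwrite_original not_really.pdf
#   echo "exit status: $?"
```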
Title: Re: Corrupted PDF files for testing
Post by: bertalanimre on February 20, 2017, 07:54:18 AM
Actually, I've found out that exiftool FIXES corrupt PDF files. xD

I corrupted an example PDF file using this site:
https://corrupt-a-file.net/

OK, the PDF became unreadable. However, exiftool was still able to read some metadata, though it also notified me about the file's state and that it was corrupted. Then I tried to remove all the metadata with -all= and what happened? The PDF became readable and openable. :D It was a blank page, so all the content was lost, but still, it partly fixed the file. Lol, or is it surprising only to me?
Title: Re: Corrupted PDF files for testing
Post by: Phil Harvey on February 20, 2017, 08:32:45 AM
This is surprising.