Corrupted PDF files for testing

Started by bertalanimre, February 20, 2017, 06:07:15 AM


bertalanimre

Hey Forum,

I was wondering... I'm writing a bash script that cleans PDF files using a combination of Ghostscript, ExifTool and QPDF (I'll share it on GitHub, but later). However, I'd like to test it with PDF files on which exiftool fails when run with the following command:
exiftool -all= -XMPToolkit= -overwrite_original filename.pdf

When exactly does exiftool return an error stating that it cannot write the metadata to the file? I assume permissions are one cause, but what if the PDF file is damaged or corrupted somehow? Can you guys give me example PDF files I can test the command with? The script will try to execute the command, and if it finishes with a nonzero exit status (i.e. with an error), it will write the error and the filename into a txt file for further debugging.
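A minimal sketch of the error-logging step described above, assuming the script only needs the exit status and the message exiftool prints. The names clean_pdf, ERROR_LOG and EXIFTOOL_BIN are hypothetical and not part of exiftool; only the exiftool command line itself comes from this post.

```shell
#!/usr/bin/env bash
# Hypothetical names for illustration: clean_pdf, ERROR_LOG, EXIFTOOL_BIN.
ERROR_LOG="${ERROR_LOG:-exiftool_errors.txt}"
EXIFTOOL_BIN="${EXIFTOOL_BIN:-exiftool}"

clean_pdf() {
    local file="$1" output status
    # Capture stdout and stderr together so the message can be logged.
    output=$("$EXIFTOOL_BIN" -all= -XMPToolkit= -overwrite_original "$file" 2>&1)
    status=$?
    if [ "$status" -ne 0 ]; then
        # Nonzero exit status: record the filename, the status and the message.
        printf '%s: exit %d: %s\n' "$file" "$status" "$output" >> "$ERROR_LOG"
    fi
    return "$status"
}
```

It could be called in a loop such as `for f in *.pdf; do clean_pdf "$f"; done`, which leaves one line in the log per failed file.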

Phil Harvey

Just rename any random file to .pdf and the command will fail (as long as the file isn't a format that ExifTool can write).

- Phil
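As a rough sketch of the suggestion above, a test fixture can be generated instead of renaming an existing file. This assumes a plain text file counts as "a format that ExifTool can't write"; the helper name make_fake_pdf is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: writes a plain text file under a .pdf name,
# which is equivalent to renaming a non-PDF file to .pdf.
make_fake_pdf() {
    local target="$1"
    printf 'this is not a PDF\n' > "$target"
}
```

Running `make_fake_pdf broken.pdf` and then the exiftool command on broken.pdf should, under that assumption, end with a nonzero exit status.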
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

bertalanimre

Actually, I've found out that exiftool FIXES corrupt PDF files. xD

I corrupted an example PDF file using this site:
https://corrupt-a-file.net/

OK, the PDF became unreadable. However, exiftool was still able to read some metadata, though it also notified me about the file's state and that it was corrupted. Then I tried to remove all the data with -all=, and what happened? The PDF became readable and openable again. :D It was a blank page, so all the information was lost, but still, it partly fixed the metadata in it. Lol, or is it surprising only to me?

Phil Harvey
