Hey Forum!
I know that changes made with ExifTool in PDFs are reversible. Please keep in mind that my main goal is not (yet) to erase data, just to modify it.
According to the manual, this should be a working command I assume:
exiftool -args -extractEmbedded -all:Creator= Cleaned/out.pdf
The output is:
0 image files updated
1 image files unchanged
If I examine the command with v2 enabled and grepped to the word "Creator", I get the following:
Deleting PostScript:Creator
Deleting PDF:Creator
Deleting XMP-iptcExt:Creator
Deleting XMP-pdf:Creator
Deleting XMP-dc:Creator
However, if I ask for the detailed information in the file and grep for Creator again, my creator name is still in the file:
Without v2:
-Creator=John Doe
With v2:
| | | CreatorTool = Adobe InDesign CS6 (Macintosh)
| | | - Tag 'x:xmpmeta/rdf:RDF/rdf:Description/xmp:CreatorTool'
Even if I use -all:all=, the information still remains in that PDF. Can it be write-protected? How can I erase the information without damaging the table of contents?
How is this even possible? Did I execute the command wrong? Please give me a hint or something.
Thanks in advance!
Bert
Wuhu, found it!
It was GhostScript which helped me out. Although it is not as good as ExifTool with PDFs, it does remove the embedded metadata without harming the table of contents. After this, I can simply remove all the metadata with ExifTool and make it permanent with a qpdf linearization.
Phil: If you are interested, I can give the whole process to you in a bash script file I'm working on at the moment. It is not a big deal but might give you some good thoughts. :)
Quote from: bertalanimre on February 03, 2017, 03:54:45 AM
According to the manual, this should be a working command I assume:
exiftool -args -extractEmbedded -all:Creator= Cleaned/out.pdf
Without v2:
-Creator=John Doe
With v2:
| | | CreatorTool = Adobe InDesign CS6 (Macintosh)
| | | - Tag 'x:xmpmeta/rdf:RDF/rdf:Description/xmp:CreatorTool'
Of course "CreatorTool" is still there because you just deleted "Creator". The -v2 option won't show tags which have been deleted (even though they still remain in dead sections of the PDF file).
Quote from: bertalanimre on February 03, 2017, 05:59:32 AM
Phil: If you are interested, I can give the whole process to you in a bash script file I'm working on at the moment. It is not a big deal but might give you some good thoughts. :)
It isn't useful for me, but it would be of use to others, so posting it here would be great.
- Phil
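A rough way to check the point about dead sections yourself is to search the raw bytes of the file for the old value after ExifTool reports it deleted. This is only a sketch: count_raw_matches is a hypothetical helper name, and it only finds strings that are stored uncompressed and unencoded in the PDF, so a count of 0 does not prove the data is gone.

```shell
#!/bin/sh
# count_raw_matches (hypothetical helper): print how many lines of FILE
# contain the literal string NEEDLE, reading the binary file as text.
count_raw_matches() {
    needle="$1"
    file="$2"
    # -a: treat binary data as text, -F: fixed string, -c: count matching lines
    grep -a -F -c -- "$needle" "$file"
}

# Example, using the file name and value from this thread:
#   count_raw_matches "John Doe" Cleaned/out.pdf
```

A non-zero count after running exiftool -all:Creator= would be consistent with the old value still sitting in an earlier revision of the file.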
OK, so the following set of commands deletes not just the plain metadata but also the embedded metadata from PDF files, which normally won't be deleted by ExifTool.
Requirements:
- Exiftool 10.35
- Qpdf 6.0.0
- GhostScript 9.20
Commands that can be put into a bash script if you wish:
gs -q -sOutputFile=temp.pdf -sDEVICE=pdfwrite -dNOPAUSE -dBATCH -dPDFSETTINGS=/prepress input.pdf
exiftool -args -extractEmbedded -all= temp.pdf 2> /dev/null
exiftool -args -extractEmbedded -XMPToolkit= ./temp.pdf 2> /dev/null
rm -rf input.pdf
qpdf --linearize temp.pdf input.pdf
find ./ -name '*.pdf_original' -type f -exec rm -rf {} \;
rm -rf temp.pdf
I hope it helps somebody. I've spent the last 2 days getting this together in a quite complicated script cleaning PDFs and InDesign files :)
Just a few comments:
You can avoid the "find" line by adding -overwrite_original to your ExifTool commands. Also, the -args and -extractEmbedded options do nothing when writing. As well, I don't see what use -XMPToolkit= is since you already removed all XMP with -all= earlier. So I think this single ExifTool command will do what you want:
exiftool -all= -overwrite_original temp.pdf 2> /dev/null
- Phil
Cheers Phil for the ideas. I'll implement them in the next version.
However, the XMPToolkit tag gets written back into the file. To be honest, I skipped one line from my own script, where I add a title and an author with ExifTool. After that, XMPToolkit is filled in again, saying that the file was modified with ExifTool. That is why I have this last part stuck in.
But cheers again. You are a great help to all of us. :) And I thank you for that!
OK then, the command would be:
exiftool -all= -xmptoolkit= -author="author name" -title="some title" -overwrite_original temp.pdf 2> /dev/null
Still, no need for multiple commands.
- Phil
Thank you! Works like a charm. :)
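For anyone landing here later, the whole pipeline from this thread with Phil's single ExifTool call folded in might look roughly like the sketch below. The names clean_pdf, run and DRY_RUN are made up for illustration, the tool options are the ones posted above, and it assumes gs, exiftool and qpdf are on the PATH. Unlike the original script it stops at the first failing step and does not hide ExifTool's error output.

```shell
#!/usr/bin/env bash
# Sketch of the cleaning pipeline from this thread.
# clean_pdf, run and DRY_RUN are hypothetical names, not part of any tool.

# Run a command, or only print it when DRY_RUN=1. The printed form does not
# preserve quoting; it is meant for inspection only.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Usage: clean_pdf FILE AUTHOR TITLE
clean_pdf() {
    local input="$1" author="$2" title="$3"
    local tmp="${input%.pdf}.tmp.pdf"

    # 1. Rewrite the PDF with Ghostscript so the old embedded metadata is dropped.
    run gs -q -sOutputFile="$tmp" -sDEVICE=pdfwrite -dNOPAUSE -dBATCH \
        -dPDFSETTINGS=/prepress "$input" || return 1
    # 2. Delete all metadata and set author/title in a single ExifTool call.
    run exiftool -all= -xmptoolkit= -author="$author" -title="$title" \
        -overwrite_original "$tmp" || return 1
    # 3. Linearize with qpdf, writing the result over the original file.
    run qpdf --linearize "$tmp" "$input" || return 1
    # 4. Remove the temporary file.
    run rm -f "$tmp"
}
```

Calling DRY_RUN=1 clean_pdf input.pdf "author name" "some title" prints the four commands without touching any file, which is a cheap way to review them before a real run.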
Phil,
I deleted my PDF metadata Creator and Creator Tool by using:
exiftool -creator= xxx.PDF
Is this thread saying they are not actually deleted but rather still stored somewhere in the PDF? I am using Windows so I am guessing some of this conversation doesn't apply to Windows. Is that correct?
Thanks
Quote from: Phil Harvey on February 03, 2017, 11:26:32 AM
OK then, the command would be:
exiftool -all= -xmptoolkit= -author="author name" -title="some title" -overwrite_original temp.pdf 2> /dev/null
Still, no need for multiple commands.
- Phil
ExifTool does the same thing on all platforms. Windows is not special.
I think that reading the start of the PDF tags documentation (https://exiftool.org/TagNames/PDF.html) will answer your question.
- Phil
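As the start of that documentation describes, ExifTool edits a PDF by appending an incremental update, so the original values stay in the file and the edit can be undone by deleting the PDF-update pseudo-group. A minimal sketch of that, where revert_pdf_edits is a hypothetical wrapper name and exiftool is assumed to be on the PATH:

```shell
#!/bin/sh
# revert_pdf_edits (hypothetical helper): undo an ExifTool edit to a PDF by
# removing the incremental update that ExifTool appended to the file.
revert_pdf_edits() {
    file="$1"
    if [ ! -f "$file" ]; then
        echo "no such file: $file" >&2
        return 2
    fi
    if ! command -v exiftool >/dev/null 2>&1; then
        echo "exiftool not found on PATH" >&2
        return 127
    fi
    # Deleting the PDF-update pseudo-group rolls the metadata edit back.
    exiftool -PDF-update:all= "$file"
}
```

If the old Creator value shows up again after this, it was never really removed, which is why the Ghostscript and qpdf steps earlier in the thread are needed for a permanent cleanup. The bare exiftool -PDF-update:all= command is the same on Windows.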