Batch removing metadata from 300+ PDFs in subfolders

Started by mackey, September 08, 2021, 11:38:09 AM


mackey

Hi, I am organizing a company's entire WooCommerce product catalog in preparation for uploading it to their new website. I found that the PDF schematics attached to most products contain metadata that we would like to strip (Author, Title, and sometimes the location of the original file on the computer it was created on, C:\etc. etc.). The PDFs (~320 of them) are spread across a number of subfolders. Is there a way to strip the metadata from every PDF under the root folder?

I've tried doing this with Acrobat and for some reason the quality of the PDF drawings deteriorated greatly. Any help would be appreciated!

StarGeek

The simplest command would be
exiftool -All= -ext pdf /path/to/pdfs/

But exiftool's edits to PDFs are reversible (see the PDF tags page), and the files would have to be re-linearized to permanently remove the data, which isn't as easy to do in batch. That page gives a command that can be used with qpdf, but that program only processes one file at a time and doesn't directly edit the original.
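For anyone who does want the permanent route, the one-file-at-a-time limit can be worked around with a loop. This is only a rough sketch: it assumes qpdf is installed and that `qpdf --linearize in.pdf out.pdf` is the command in question. The `QPDF` variable, the `linearize_tree` function name, and the `.linearized.tmp` suffix are made up for illustration. Try it on a copy of the folder first.

```shell
#!/bin/sh
# Sketch only: re-linearize every PDF under a root folder with qpdf.
# QPDF is a hypothetical override so a different qpdf binary can be used.
QPDF="${QPDF:-qpdf}"

linearize_tree() {
    root="$1"
    # Note: filenames containing newlines are not handled by this simple loop.
    find "$root" -type f -iname '*.pdf' | while IFS= read -r pdf; do
        # Temporary name does not end in .pdf, so find will not pick it up.
        tmp="$pdf.linearized.tmp"
        # qpdf writes a new file; replace the original only on a clean exit.
        if "$QPDF" --linearize "$pdf" "$tmp"; then
            mv "$tmp" "$pdf"
        else
            rm -f "$tmp"
            echo "skipped (qpdf reported a problem): $pdf" >&2
        fi
    done
}
```

Run it as `linearize_tree /path/to/pdfs/` after the exiftool step. Files for which qpdf returns a non-zero status (including warnings) are left untouched and reported, which is the cautious choice here.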

I would think that there should be an option somewhere in Acrobat to tell it not to re-compress the images, but I don't have access to it to double-check.
"It didn't work" isn't helpful. What was the exact command used and the output.
Read FAQ #3 and use that cmd
Please use the Code button for exiftool output

Please include your OS/Exiftool version/filetype

mackey

Maybe I can get away with just running exiftool. I'm not sure how many people will care enough to try to reverse the process to see the metadata (there's nothing sensitive, I just wanted to clean the files up). If I run exiftool like you said, will it seek out and find any/all PDFs in the subfolders?

StarGeek

Add the -r (-recurse) option to recurse into subdirectories. Don't use something like *.pdf, as that will block recursion (see above link and Common Mistake #2). Just pass a directory name, and the -ext (-extension) option will limit processing to PDFs.

The above command will create backup files. Add the -overwrite_original option to suppress this.
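Putting the options from this thread together, a minimal sketch might look like the following. The function names `list_pdfs` and `strip_pdf_metadata` are hypothetical helpers, and `/path/to/pdfs/` is the same placeholder as in the earlier command. The `find` preview is not part of exiftool; it is only a quick way to see which files a recursive run should reach before anything is modified.

```shell
#!/bin/sh
# Preview which files a recursive run would touch (case-insensitive .pdf match).
list_pdfs() {
    find "$1" -type f -iname '*.pdf' | sort
}

# Strip all metadata from PDFs under the given folder and its subfolders,
# without leaving *_original backup copies behind.
strip_pdf_metadata() {
    exiftool -All= -r -ext pdf -overwrite_original "$1"
}
```

Call `list_pdfs /path/to/pdfs/` first to check the file list, then `strip_pdf_metadata /path/to/pdfs/`. Since -overwrite_original leaves no backups, it is probably safest to run this on a copy of the folder the first time.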