Batch process PDFs to update Title, Author & Creator metadata properties

Started by DBartholomew, March 03, 2024, 05:31:21 PM

DBartholomew

I want to share a project I've been working on that lets me batch process PDF documents to prepare them for publication on our website.

I'm using ExifTool to replace the PDF metadata properties, Title, Author and Creator.  The Title is replaced using the PDF filename.  Author and Creator are replaced with predefined standard values contained in a batch file called PDF-Properties.bat.

I've spent several weeks developing and testing this solution on a variety of computers in our office, and hope others may find it useful.

The commands also preserve each file's "Last Modified" date, and run in the context of the "Invoker".  The latter was necessary because most of my coworkers do not have administrative permissions on our corporate Microsoft Windows network.  Without it, the program would only run if the user was an administrator.

This is a great tool for us because we have to publish a lot of PDF documents on the internet.  Almost all of our documents lack the PDF metadata property "Title", and our CMS displays the PDF Title in the browser tab.  When the Title is empty, the CMS displays a content ID number in the browser tab instead.

Replacing the Author and Creator metadata properties is useful because it lets us insert the Company Name in place of the original author's name, which helps protect employee privacy.

This solution runs stand-alone, requiring no installation on Windows.  Simply copy the Exiftool folder to the desktop and run it in situ.

File Contents:

ClickMe.bat:
@echo off
cmd /min /C "set __COMPAT_LAYER=RunAsInvoker"
for /r "%cd%" %%x in (*.pdf) do call "ExifSys\PDF-Properties.bat" "%%x"
cls


PDF-Properties.bat
@echo off
"ExifSys\exiftool.exe" -P -overwrite_original -Title="%~n1" -Author="Your Name Here" -Creator="Your Name Here" %1

README.txt
This folder is set up to batch process PDF files and edit their Meta-Data Properties.

Instructions:

1.) Copy one or more PDF files into this ExifTool folder.
2.) Rename the files to have meaningful filenames.
3.) Double click the file labeled "ClickMe.bat".

This will run the program, and all the PDF files contained in the folder will have the following properties adjusted:

"Title" property will be set to the filename (without the ".pdf" extension)
"Author" and "Creator" properties will be changed to "Your Name Here"

4.) Optionally, drag any individual PDF file onto a copy of ExifTool.exe renamed to "ExifTool(-k).exe".
A command window will appear displaying all of the PDF properties for that file. 
You can use this to confirm the program worked as intended.

Note: This forum limits the size of the .zip file I am attaching, so I removed the copy of Exiftool(-k).exe from the root folder to save space.  It is an optional feature used to inspect PDF file metadata.

One thing I wish I could control: 
Exiftool.exe runs in the command window and displays a status message as it runs:

"1 image files updated". 

I wish it recognized that we are working with PDF files and changed the message to just say,
"1 files updated".

I hope others find this as useful as I do.
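
On the status message mentioned above, one possible workaround is sketched here. ExifTool's -q (-quiet) option suppresses its normal informational messages, so PDF-Properties.bat could print its own wording instead. The echo text is only a placeholder, and the errorlevel checks assume exiftool returns a non-zero exit status when it hits an error.

```batch
@echo off
rem -q hides exiftool's "image files updated" summary; warnings and errors still show.
"ExifSys\exiftool.exe" -q -P -overwrite_original -Title="%~n1" -Author="Your Name Here" -Creator="Your Name Here" %1
rem Placeholder messages; adjust the wording to taste.
if errorlevel 1 echo Problem updating "%~nx1"
if not errorlevel 1 echo Updated "%~nx1"
```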

StarGeek

I'm not trying to put down your hard work, but you can probably change your ClickMe.Bat to this and drop the second bat file

@echo off
cmd /min /C "set __COMPAT_LAYER=RunAsInvoker"
"ExifSys\exiftool.exe" -P -overwrite_original -ext PDF "-Title<Basename" -Author="Your Name Here" -Creator="Your Name Here" .
cls

Your ClickMe.Bat appears to be running exiftool in a loop, which is Common Mistake #3.  Exiftool can directly process an entire directory at once without looping. Its biggest performance hit is the startup time, so launching it once per file can greatly extend processing time. The -ext (-extension) option is used to limit processing to only PDFs. The filename without the extension is copied to the title with
"-Title<Basename"

The dot represents the current directory, so the command will process the files in the directory where the bat file is located.  Adding the -r (-recurse) option will allow recursion into subdirectories.
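
Putting those pieces together, a recursive version of the single-call bat file might look like the sketch below. It assumes the same ExifSys folder layout as the original post. The `cd /d "%~dp0"` line and the plain `set` are additions that do not appear in the thread. A variable set inside `cmd /C "..."` normally exists only in that short-lived child process, so this sketch sets `__COMPAT_LAYER` directly in the script, where exiftool.exe can inherit it.

```batch
@echo off
rem Run from the folder that holds this bat file, however it was launched.
cd /d "%~dp0"
rem Set the compatibility layer in this session so child processes inherit it.
set "__COMPAT_LAYER=RunAsInvoker"
rem One exiftool call for the whole tree: -r recurses, -ext PDF limits processing to PDFs.
"ExifSys\exiftool.exe" -P -overwrite_original -r -ext PDF "-Title<Basename" -Author="Your Name Here" -Creator="Your Name Here" .
```

Dropping -r gives the non-recursive behavior of the command above.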

Also, remember that changes exiftool makes to PDFs are reversible unless the file is re-linearized with a program such as qpdf.  See note 1 & 2 on the PDF Tags page

"It didn't work" isn't helpful. What was the exact command used and the output.
Read FAQ #3 and use that cmd
Please use the Code button for exiftool output

Please include your OS/Exiftool version/filetype

DBartholomew

Cool thanks, I gave that a try and it works.  I appreciate your help.

DBartholomew

Quote from: StarGeek on March 03, 2024, 07:10:00 PM
Also, remember that changes exiftool makes to PDFs are reversible unless the file is re-linearized with a program such as qpdf.  See note 1 & 2 on the PDF Tags page

I'd like to re-linearize the PDFs with QPDF.exe; however, it seems to require installation on Windows.  My goal is to have a tool that doesn't require installing software, since most of our users don't have permission to install Windows programs.

Do you know if QPDF can be run locally without running an installer?  Reversible metadata isn't a showstopper, but it would seem more secure if the author and creator names were permanently removed.

StarGeek

Quote from: DBartholomew on March 03, 2024, 08:44:17 PM
Do you know if QPDF can be run locally without running an installer?

The Releases page on Github has a zip file for each of the builds.  I believe the msvc and mingw zips are Windows builds, just using different compilers, and there's a 32-bit and a 64-bit version of each.

You might download one of those and run qpdf from the /bin directory and see if it will pick up the needed dll files that are in that directory.

I have qpdf installed through Chocolatey which takes care of things like updating the PATH, so I don't know if it can be considered portable and run from the /bin/ directory or not.
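
If the zip build does run in place, a re-linearizing step could be sketched roughly as follows. The folder names are hypothetical: the sketch assumes the zip was extracted into a "qpdf" folder beside the bat file and that output goes to a "Linearized" subfolder. It has not been confirmed in this thread that qpdf runs this way without installation.

```batch
@echo off
rem Hypothetical layout: the qpdf zip extracted into a "qpdf" folder beside this bat file.
cd /d "%~dp0"
if not exist "Linearized" mkdir "Linearized"
rem qpdf takes one input and one output file per call, so a loop is needed here.
for %%f in (*.pdf) do "qpdf\bin\qpdf.exe" --linearize "%%f" "Linearized\%%~nxf"
```

This would be run after the exiftool step, and the copies in the Linearized folder would be the ones to publish. Writing to a separate folder leaves the originals untouched, although the new files will not keep the original "Last Modified" dates.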

DBartholomew

I've updated the attached zip file to include the revised command syntax and simplified configuration suggested in this thread. Thanks again StarGeek for the sage advice about looping and more efficient command syntax.

As before, I've removed the copy of ExifTool(-k).exe due to upload size limitations in this forum.  To use it, simply copy ExifTool.exe from the ExifSys folder into the ExifTool root folder and rename the copy to ExifTool(-k).exe.  Then drag PDFs onto ExifTool(-k).exe to inspect their metadata.