Search and Replace Keywords in PDF

Started by riffrack, November 21, 2013, 08:26:46 AM


riffrack

Hello there

A complete newbie here. I have spent a good few hours searching through the various sites explaining ExifTool; however, I cannot find a straightforward explanation of how to accomplish the following:

I have a few hundred PDF files with almost correct keywords stored as XML. Now I want to search and replace only part of these keywords. E.g. a spelling mistake: the XML tag was spelt <privat></privat> instead of <private></private>. This is part of a larger XML string in the keywords metadata.

In another case I need to remove an XML tag completely: <Country>UK</Country><Country>IT</Country><Country>US</Country>. Here I would need to remove the value <Country>IT</Country> completely or in other words replace <Country>IT</Country> with an empty string.

I use Windows OS.

Can this be done with a single command line or is it more complicated? I would need to run this as a batch.

Any help is much appreciated.

Phil Harvey

Sorry, but ExifTool doesn't write XML (unless you are talking about XMP, which is RDF/XML).

So I don't think you can use ExifTool to do what you want.

But if you are talking about XMP, removing the XMP Country tag is done like this:

exiftool -country= some.pdf

But note that PDF edits by ExifTool are reversible, so the original information still actually exists in the file.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

riffrack

Hi Phil

Thanks for your answer. All I want to do is replace a string. It doesn't matter if the text in the keywords field is formatted as XML or not. Let's say the keywords just contained plain text: "This is a PDF documeent" and I would want to replace the typo "documeent" with "document". Can I do this with ExifTool? Simple Search and Replace.

If I open a PDF document in Adobe Reader, I can click on Properties and edit the keywords field with any text I want. Somehow I would like to be able to correct this data.

Any help much appreciated.

Phil Harvey

Yes.  You will be able to do this with ExifTool.

First, use ExifTool to extract the information with the -s option:

exiftool -s file.pdf

Then, look through the information until you find the name of the tag you want to change.  Use a command like this to write the new information (where TAG is the name of the tag to write):

exiftool -TAG="some new string" file.pdf

- Phil
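
For the batch case asked about earlier in the thread, the two steps above (read the tag, then write a corrected value) could be wrapped in a small script. What follows is only a rough sketch in Python, not something posted in this thread. It assumes the tag really is called Keywords (check with "exiftool -s file.pdf" first), that exiftool is on the PATH, and that the tag holds a single plain string; list-type values may not round-trip the same way. The function names (fix_keywords, read_args, write_args, process_folder) are made up for illustration.

```python
import subprocess
from pathlib import Path

# Literal search/replace pairs taken from the examples in this thread.
REPLACEMENTS = [
    ("<privat>", "<private>"),
    ("</privat>", "</private>"),
    ("<Country>IT</Country>", ""),
    ("documeent", "document"),
]

# Assumption: the tag is named Keywords; confirm with "exiftool -s file.pdf".
TAG = "Keywords"


def fix_keywords(text, replacements=REPLACEMENTS):
    """Apply each literal search-and-replace pair in order."""
    for old, new in replacements:
        text = text.replace(old, new)
    return text


def read_args(pdf, tag=TAG):
    """Argument list that prints only the value of one tag (-s3)."""
    return ["exiftool", "-s3", f"-{tag}", str(pdf)]


def write_args(pdf, value, tag=TAG):
    """Argument list that writes a new value (an empty value deletes the tag)."""
    return ["exiftool", f"-{tag}={value}", str(pdf)]


def process(pdf):
    """Rewrite the tag in one file; return True if the value changed."""
    result = subprocess.run(read_args(pdf), capture_output=True, text=True, check=True)
    current = result.stdout.rstrip("\r\n")
    updated = fix_keywords(current)
    if updated == current:
        return False
    subprocess.run(write_args(pdf, updated), check=True)
    return True


def process_folder(folder):
    """Run process() on every .pdf file directly inside the given folder."""
    for pdf in sorted(Path(folder).glob("*.pdf")):
        print(pdf, "updated" if process(pdf) else "unchanged")
```

Calling process_folder(r"C:\my_pdfs") (a hypothetical path) would then walk one folder. Passing the arguments as a list avoids the Windows quoting problems that angle brackets cause on the command line. It is probably wise to try this on copies of a few files first.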