ExifTool Forum

General => Metadata => Topic started by: brabo on January 25, 2023, 08:05:42 AM

Title: PDF keywords
Post by: brabo on January 25, 2023, 08:05:42 AM
I want to populate the Title, Author, Subject, and Keywords fields in a PDF. These fields are all empty in the target file (see Before.png and TheRoadNotTaken.pdf). I'm using the following command:

exiftool -Title="The Road Not Taken" -Author="Robert Frost" -Subject="Famous Poems: The Road Not Taken by Robert Frost" -sep ", " -Keywords="road, Robert Frost, famous poems, poetry" TheRoadNotTaken.pdf

In the resulting file (see After.png and TheRoadNotTaken - After ExifTool.pdf), the Title, Author, and Subject are populated correctly, but Keywords is not populated the way I expected:

1. There shouldn't be quotation marks.
2. The part after the second quotation mark -- ; Famous Poems: The Road Not Taken by Robert Frost -- shouldn't be there (it appears to have been copied from Subject).
3. The 4 keywords should be considered to be individual keywords, not one long concatenated keyword.

Can anyone please help?
Title: Re: PDF keywords
Post by: greybeard on January 25, 2023, 08:47:55 AM
Looks OK to me

I just ran the following command on macOS

exiftool -Title -Author -Subject -Keywords -sep "//" 'TheRoadNotTaken - After ExifTool.pdf'
and got this output

Title                          : The Road Not Taken
Author                         : Robert Frost
Subject                        : Famous Poems: The Road Not Taken by Robert Frost
Keywords                       : road//Robert Frost//famous poems//poetry
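Editor's note: the "//" separators in the output above show that ExifTool reads four separate list items. As a rough sketch, an alternative to -sep when writing is to pass one -Keywords= argument per keyword, which makes the item boundaries explicit on the command line. The helper name write_keywords and the PDF_FILE variable are made up for this illustration; only the exiftool options themselves come from the thread. The script only runs exiftool if it is installed and the file exists; otherwise it prints the command it would have run.

```shell
#!/bin/sh
# Sketch: write each keyword as its own -Keywords= argument instead of
# relying on -sep to split one comma-separated string.
# PDF_FILE is a placeholder; point it at your own file.
PDF_FILE="${PDF_FILE:-TheRoadNotTaken.pdf}"

# write_keywords FILE KEYWORD...   (hypothetical helper, not part of ExifTool)
write_keywords() {
    file=$1
    shift
    n=$#
    # Append one "-Keywords=<keyword>" argument per keyword ...
    for kw in "$@"; do
        set -- "$@" "-Keywords=$kw"
    done
    # ... then drop the original bare keywords from the argument list.
    shift "$n"

    if command -v exiftool >/dev/null 2>&1 && [ -f "$file" ]; then
        exiftool "$@" "$file"
    else
        # Dry run: show the command instead of executing it.
        printf 'would run: exiftool'
        for a in "$@"; do
            printf ' "%s"' "$a"
        done
        printf ' "%s"\n' "$file"
    fi
}

write_keywords "$PDF_FILE" "road" "Robert Frost" "famous poems" "poetry"
```

As in the original command, -Keywords is left unqualified here, so ExifTool decides which group or groups receive the value.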
Title: Re: PDF keywords
Post by: brabo on January 25, 2023, 09:41:35 AM
Thanks for checking. I'm using a Windows 10 PC and am seeing the incorrect results in Adobe Acrobat Reader v2022 by clicking on File -> Properties.
Title: Re: PDF keywords
Post by: greybeard on January 25, 2023, 10:14:46 AM
Quote from: brabo on January 25, 2023, 09:41:35 AM
Thanks for checking. I'm using a Windows 10 PC and am seeing the incorrect results in Adobe Acrobat Reader v2022 by clicking on File -> Properties.

Are you sure that's not just the way Acrobat shows properties? I tried a couple of other programs (Chrome and Mac Preview) and they show what you would expect.
Title: Re: PDF keywords
Post by: brabo on January 25, 2023, 10:44:11 AM
I have multiple PDFs that show a simple list of words and phrases separated by commas in the Keywords field in Reader; they don't include quotation marks and don't repeat the content of the Subject field.

But I did not create those PDFs so don't know how those fields were populated. Also I haven't used ExifTool on those files; I received them with all the metadata already populated.

So I'm assuming that those are examples of how Acrobat typically displays keywords.
Title: Re: PDF keywords
Post by: brabo on January 25, 2023, 12:36:18 PM
I just found this screenshot online as an example.
Title: Re: PDF keywords
Post by: Hubert on January 25, 2023, 03:43:48 PM
Writing the tag -xmp-pdf:subject (as opposed to just -subject) prevents the Subject field being duplicated in the Keywords field in Acrobat Reader.

I haven't (yet) discovered a way of removing the enclosing double quotes in Acrobat Reader. In Preview on macOS they are displayed like this:

(https://i.ibb.co/GTPrPGB/keywords.png) (https://ibb.co/sH3X3ck)


Title: Re: PDF keywords
Post by: brabo on January 25, 2023, 04:56:21 PM
Thank you! That solves a big part of the problem!
Title: Re: PDF keywords
Post by: brabo on January 26, 2023, 10:57:56 AM
If anyone knows how to get rid of the quotation marks in the Keywords field, please let me know. Thanks!

exiftool -Title="The Road Not Taken" -Author="Robert Frost" -xmp-pdf:Subject="Famous Poems: The Road Not Taken by Robert Frost" -sep ", " -Keywords="road, Robert Frost, famous poems, poetry" TheRoadNotTaken.pdf
Title: Re: PDF keywords
Post by: greybeard on January 26, 2023, 11:21:34 AM
Quote from: brabo on January 26, 2023, 10:57:56 AM
If anyone knows how to get rid of the quotation marks in the Keywords field, please let me know. Thanks!

exiftool -Title="The Road Not Taken" -Author="Robert Frost" -xmp-pdf:Subject="Famous Poems: The Road Not Taken by Robert Frost" -sep ", " -Keywords="road, Robert Frost, famous poems, poetry" TheRoadNotTaken.pdf

Maybe this is a question for the Adobe forum. The quotes are not physically in the file, so it's the way that Adobe chooses to display the keywords.
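Editor's note: one way to check this for yourself is to list every stored copy of the Keywords and Subject tags together with the group each copy lives in. This is only a sketch: the function name show_keyword_tags and the PDF_FILE variable are invented for the example, while -a (show duplicate tags), -G1 (show the family 1 group name) and -s (short tag names) are standard exiftool options. If exiftool or the file is missing, the script just prints a "skipped" message.

```shell
#!/bin/sh
# Sketch: list every copy of Keywords and Subject, with group names, to see
# what is actually stored in the file as opposed to what a viewer displays.
# PDF_FILE is a placeholder; point it at your own file.
PDF_FILE="${PDF_FILE:-TheRoadNotTaken.pdf}"

# show_keyword_tags FILE   (hypothetical helper, not part of ExifTool)
show_keyword_tags() {
    if command -v exiftool >/dev/null 2>&1 && [ -f "$1" ]; then
        # -a  : allow duplicate tag names (e.g. the same tag in two groups)
        # -G1 : prefix each line with its family 1 group
        # -s  : print tag names instead of descriptions
        exiftool -a -G1 -s -Keywords -Subject "$1"
    else
        echo "skipped: exiftool or $1 not available"
    fi
}

show_keyword_tags "$PDF_FILE"
```

If no quotation marks appear in any of the listed values, that would support the view that the quotes come from the viewer's display rather than from the file.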
Title: Re: PDF keywords
Post by: StarGeek on January 26, 2023, 01:52:45 PM
You should take note that there are several PDF tags that have conflicting uses and setup. There are PDF-specific tags (https://exiftool.org/TagNames/PDF.html) and then there are XMP PDF tags (https://exiftool.org/TagNames/XMP.html#pdf). And there is some conflict between those tags and the more common Dublin Core XMP tags (https://exiftool.org/TagNames/XMP.html#dc), which are often in PDFs as well.

For example, the Dublin Core Subject (XMP-dc:Subject) is a list type tag often used for keywords, and it can appear in PDFs. But PDF:Subject and XMP-pdf:Subject are strings, with the latter marked as Avoid, so you have to explicitly use XMP-pdf:Subject to set it.

PDF:Keywords is a list type tag, but XMP-pdf:Keywords is a string.

This basically comes down to FAQ #3 (https://exiftool.org/faq.html#Q3). Find a file that has the data in the right places, or if you have a program that writes the data, use that to write unique values to all the tags you want to use. Then use the command in FAQ #3 to figure out which tags you actually want to use.
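Editor's note: a minimal sketch of that approach, assuming you have one PDF that displays correctly in Acrobat and one written by ExifTool. The file names KnownGood.pdf and "TheRoadNotTaken - After ExifTool.pdf" are placeholders, and compare_tags is a made-up helper. It dumps both files with the FAQ #3 options (-a -G1 -s) and diffs the results, so tags that differ in group or value stand out.

```shell
#!/bin/sh
# Sketch: dump all tags (with group names) from a reference PDF and from the
# ExifTool-written PDF, then diff the two dumps.
# Both file names below are placeholders.
REF_FILE="${REF_FILE:-KnownGood.pdf}"
NEW_FILE="${NEW_FILE:-TheRoadNotTaken - After ExifTool.pdf}"

# compare_tags REFERENCE NEW   (hypothetical helper, not part of ExifTool)
compare_tags() {
    if ! command -v exiftool >/dev/null 2>&1 || [ ! -f "$1" ] || [ ! -f "$2" ]; then
        echo "skipped: exiftool or input files not available"
        return 0
    fi
    ref_dump=$(mktemp)
    new_dump=$(mktemp)
    # Same options as the FAQ #3 command: duplicates, group names, short names.
    exiftool -a -G1 -s "$1" > "$ref_dump"
    exiftool -a -G1 -s "$2" > "$new_dump"
    # diff exits non-zero when the dumps differ; that is expected here.
    diff "$ref_dump" "$new_dump"
    status=$?
    rm -f "$ref_dump" "$new_dump"
    return "$status"
}

compare_tags "$REF_FILE" "$NEW_FILE"
```

File-level tags such as file name, size and modification date will differ between any two files, so the lines worth reading are the ones in the PDF and XMP groups.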