Problem: Setting multiple keywords in PDF files does not work correctly

Started by Dieter Geiss, January 27, 2021, 05:04:22 PM


Dieter Geiss

I am using the newest version, 12.16, on Windows 10.

After setting multiple keywords for a PDF file as explained in the FAQ, it seems to work at first, but when you open the document properties in the PDF viewer, the keywords are not displayed correctly. The Keywords tag should hold multiple strings (string+). You can easily reproduce this problem.

I used this command line:
exiftool -sep ", " -keywords="one, two, three" Test.pdf

If you then call
exiftool Test.pdf

you get
Keywords                        : one, two, three

Looks correct but in the PDF Viewer you get this in the keywords field:
"one, two, three"

First, it's only one line, not three. Second, the double quotation marks shouldn't be there.

It should look like this in the PDF viewer:
one
two
three

If you enter these three lines manually in the viewer, save the file, and then call
exiftool -sep "/" Test.pdf

you see that the PDF viewer saved it correctly:
Keywords                        : one/two/three

The question is: How can I add multiple lines to the multi-line Keywords field of a PDF without the double quotation marks? What is the correct command line?

StarGeek

This is a case of multiple tags with the same name, so the command in FAQ #3 should be used to verify the actual location of the tags.

With that command, you will probably see something like this
C:\>exiftool -G1 -a -s -keywords y:/!temp/test.pdf
[PDF]           Keywords                        : one, two, three
[XMP-pdf]       Keywords                        : one, two, three


So the Keywords have been written to two tags.  The problem is that XMP-pdf:Keywords is not a list type tag but is a simple string tag.
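If you need to run this check on many files, the output of the -G1 -a -s command can be scanned for tags that show up in more than one group. This is only a rough sketch: parse_g1_output and duplicated_tags are hypothetical helper names, not part of exiftool, and the sample text is simply the output shown above.

```python
import re
from collections import defaultdict

# Sample text copied from the exiftool -G1 -a -s output shown above.
SAMPLE = """\
[PDF]           Keywords                        : one, two, three
[XMP-pdf]       Keywords                        : one, two, three
"""

# One output line looks like: [Group]   TagName   : value
LINE_RE = re.compile(r"^\[(?P<group>[^\]]+)\]\s+(?P<tag>\S+)\s*: (?P<value>.*)$")


def parse_g1_output(text):
    """Map each tag name to a list of (group, value) pairs."""
    found = defaultdict(list)
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            found[match.group("tag")].append(
                (match.group("group"), match.group("value"))
            )
    return dict(found)


def duplicated_tags(text):
    """Return the tags that appear in more than one group, with their groups."""
    return {
        tag: [group for group, _ in entries]
        for tag, entries in parse_g1_output(text).items()
        if len(entries) > 1
    }


print(duplicated_tags(SAMPLE))  # {'Keywords': ['PDF', 'XMP-pdf']}
```

Any tag listed here was written to several locations, so a viewer may be reading a different copy than the one you expect.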

I'm guessing that the PDF viewer is Adobe Reader?  Or is it some other PDF reader?

Adobe Reader ignores PDF:Keywords.  It fills the Keywords property by combining two tags, XMP-pdf:Keywords and XMP-dc:Subject.  The first is, as mentioned, a simple string and the second is a list type tag.  So, for example, if you use this command
exiftool -XMP:Keywords="one, two, three" -XMP:Subject="four, five, six" -sep ", " test.pdf
Reader will show four keywords
"one, two, three"; four; five; six

It is also important to note that the quotation marks are not part of what is saved in the file.  They are there to show that the text between them is a single keyword.

To add to all this fun, there is also XMP-pdf:Subject which is a simple string.  Exiftool avoids writing this unless it already exists or is specified by name.

The takeaway of all this is to only write XMP:Subject and not Keywords.
"It didn't work" isn't helpful. What was the exact command used and the output.
Read FAQ #3 and use that cmd
Please use the Code button for exiftool output

Please include your OS/Exiftool version/filetype

Dieter Geiss

Yes, it's Adobe Reader. Thanks for the clarification. The customer wants multiple lines in the field in Reader's document properties, not a single line, so that the keywords can be copied into a document one line below the other.

I will try XMP:Subject.

StarGeek

Quote from: Dieter Geiss on January 27, 2021, 07:07:49 PM
The customer wants multiple lines in the field in the Reader's document properties.

Your customer is probably going to be disappointed as, at least with my copy, Reader doesn't list them on separate lines.

Double quotes to group if needed, semicolons to separate the words.  Neither the quotes nor the semicolons are actually embedded in the file.  That's just how Reader groups and separates.

Dieter Geiss

Yes, but the strange thing is that in Reader you can simply enter multiple lines. Save the file, reopen it, and you will see that the line breaks are preserved. If you then call exiftool to show the value, it also appears as a one-liner, but with the correct breaks: each line is a separate segment in the output (separated by ", " or whatever you set as the separator). Isn't that strange?

If you open the PDF file as a text file, you find this:

         <dc:subject>
            <rdf:Bag>
               <rdf:li>Line 1</rdf:li>
               <rdf:li>Line 2</rdf:li>
               <rdf:li>Line 3</rdf:li>
            </rdf:Bag>
         </dc:subject>

<pdf:Keywords>Line 1&#xD;&#xA;Line 2&#xD;&#xA;Line 3</pdf:Keywords>


So, both the Subject (as list items) and the Keywords (using the ascii codes for cr/lf) show the line breaks. The question is: Is there any possibility to write this to the file using the ExifTool?

Phil Harvey

...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

StarGeek

Quote from: Dieter Geiss on January 28, 2021, 03:45:43 AM
Yes, but the strange thing is, using the reader, you can just input multiple lines. Then save the file, reopen it and you will see, the linebreaks will be preserved.

I can't seem to get Reader to edit the keywords, and from checking online, the Adobe forums seem to indicate that it isn't possible with Reader (example post).  How did you get Reader to edit the keywords?

Quote
So, both the Subject (as list items) and the Keywords (using the ascii codes for cr/lf) show the line breaks.

The Subject entries are not separated by line feeds; they're separated by the <rdf:li>/</rdf:li> containers.  If you added -api Compact=AllSpace to an exiftool command that writes the file, you would get
<dc:subject><rdf:Bag><rdf:li>Line 1</rdf:li><rdf:li>Line 2</rdf:li><rdf:li>Line 3</rdf:li></rdf:Bag></dc:subject>
and it would still show up the same way.
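To illustrate that the list items come from the <rdf:li> containers and not from the line breaks between them, here is a small sketch that parses both layouts with Python's standard XML parser. The wrapper element and its namespace declarations are added only so the fragment can be parsed on its own (they are an assumption of this sketch, not something copied from the file), and subject_items is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Standard Dublin Core and RDF namespace URIs used by XMP.
NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}

# Indented layout, as found in the PDF.
PRETTY = """
         <dc:subject>
            <rdf:Bag>
               <rdf:li>Line 1</rdf:li>
               <rdf:li>Line 2</rdf:li>
               <rdf:li>Line 3</rdf:li>
            </rdf:Bag>
         </dc:subject>
"""

# Same data with all whitespace between the elements removed.
COMPACT = (
    "<dc:subject><rdf:Bag><rdf:li>Line 1</rdf:li><rdf:li>Line 2</rdf:li>"
    "<rdf:li>Line 3</rdf:li></rdf:Bag></dc:subject>"
)


def subject_items(fragment):
    """Return the text of each rdf:li item inside dc:subject."""
    # Wrap the fragment so the dc: and rdf: prefixes are declared.
    wrapped = '<root xmlns:dc="{dc}" xmlns:rdf="{rdf}">{body}</root>'.format(
        dc=NS["dc"], rdf=NS["rdf"], body=fragment
    )
    root = ET.fromstring(wrapped)
    return [li.text for li in root.findall("dc:subject/rdf:Bag/rdf:li", NS)]


print(subject_items(PRETTY))   # ['Line 1', 'Line 2', 'Line 3']
print(subject_items(COMPACT))  # ['Line 1', 'Line 2', 'Line 3']
```

Both layouts give the same three items, which is why the compact form shows up the same way in a viewer.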

Quote
The question is: Is there any possibility to write this to the file using the ExifTool?

After playing around with Reader and exiftool, it looks like you have to duplicate the XMP-dc:Subject keywords into the pattern you want in XMP-pdf:Keywords.  If you set just XMP-dc:Subject, you get a semicolon separated list.  If you set just XMP-pdf:Keywords with the line breaks, you get quotes around it indicating a single keyword.

So you'll have to do something like this
exiftool -sep ", " -XMP-dc:Subject="Line 1, Line 2, Line 3" -E -XMP-pdf:Keywords="Line 1&#xA;Line 2&#xA;Line 3" test.pdf

I only used Line Feeds (&#xA;) above, but if you wanted Carriage Return Line Feed, you would use &#xD;&#xA;.
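If the keywords come from a script rather than being typed by hand, the two arguments can be generated from a single list so they never get out of sync. This is a minimal sketch that only assembles the argument list for the command shown above and does not run exiftool. build_exiftool_args is a hypothetical helper name, and the escaping step assumes that -E applies to every value in the command, so a literal & or < in a keyword has to be written as an HTML entity.

```python
import html


def build_exiftool_args(keywords, pdf_path, sep=", ", line_break="&#xA;"):
    """Build the argument list that writes the same keywords to both tags.

    XMP-dc:Subject gets the keywords as a list (split by -sep), and
    XMP-pdf:Keywords gets one string with an escaped line break between them.
    """
    for keyword in keywords:
        if sep in keyword:
            # -sep would split this keyword into two list items.
            raise ValueError("keyword contains the separator: %r" % keyword)

    # Assumption: with -E, values are HTML-unescaped when written, so
    # literal &, < and > must be escaped here.
    escaped = [html.escape(keyword, quote=False) for keyword in keywords]

    return [
        "exiftool",
        "-sep", sep,
        "-XMP-dc:Subject=" + sep.join(escaped),
        "-E",
        "-XMP-pdf:Keywords=" + line_break.join(escaped),
        pdf_path,
    ]


args = build_exiftool_args(["Line 1", "Line 2", "Line 3"], "test.pdf")
print(args)
```

Passing the list to subprocess.run(args) instead of building one command string avoids the shell quoting problems around the & character. For Carriage Return Line Feed, call it with line_break="&#xD;&#xA;".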