ExifTool Forum

General => Metadata => Topic started by: nbsusa on June 06, 2016, 02:07:13 PM

Title: Leading semicolon and space as well as quotes in keywords
Post by: nbsusa on June 06, 2016, 02:07:13 PM
I am using exiftool to write some keywords into a PDF. I am using a simple command line and getting close to what I need. I have tried both -xmp:subject and -keywords with pretty much the same result.

This is what I am using right now in my Windows command line (using most recent version of exiftool as of today):

exiftool -xmp:Subject="Document 6426; Report; 1967 Aug 23; Environmental Education for Urban Schools (An Address Delivered at the 14th Annual National Conservation Education Association Conference in Springfield, Missouri) / by Edward A. Ames (Executive Director of Wave Hill Center for Environmental Studies)." -sep ";" MPHS-CCR_6426.pdf -overwrite_original

When I view PDF properties I see the following (note leading ; and space as well as quotes beginning right before Environmental):

; Document 6426;  Report;  1967 Aug 23; " Environmental Education for Urban Schools (An Address Delivered at the 14th Annual National Conservation Education Association Conference in Springfield, Missouri) / by Edward A. Ames (Executive Director of Wave Hill Center for Environmental Studies)."

When viewing Advanced > Dublin Core > Subject, the keywords are correct.

How can I get this to display in the normal PDF properties without the leading semicolon and space, and without the quotes? When using -keywords I get quotes around everything and the Dublin Core keywords are not split up.
Title: Re: Leading semicolon and space as well as quotes in keywords
Post by: Hayo Baan on June 06, 2016, 03:53:15 PM
You get the leading spaces because you entered them yourself after the ; ;)
The solution is simple: either don't enter them, or specify the separator as "; ".

As for the leading semicolon (the list separator), I don't see this in the output of exiftool. The same goes for the quotes. In both cases my guess would be that the application you use to view the file info is causing this (in the case of the quotes, probably because there are "special" characters in the keyword). What application are you using to view the file info?
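
The separator point above can be illustrated with a plain string split. This is a minimal Python sketch, assuming -sep splits the value on exactly the separator string it is given; it models the idea only and is not ExifTool's own code. The helper name split_keywords is hypothetical.

```python
from __future__ import annotations


def split_keywords(value: str, sep: str) -> list[str]:
    """Split a keyword string on the exact separator, keeping any surrounding spaces."""
    return value.split(sep)


value = "Document 6426; Report; 1967 Aug 23"

# Splitting on ";" alone keeps the space that followed each semicolon.
print(split_keywords(value, ";"))   # ['Document 6426', ' Report', ' 1967 Aug 23']

# Splitting on "; " (semicolon + space) gives clean items.
print(split_keywords(value, "; "))  # ['Document 6426', 'Report', '1967 Aug 23']
```

Under that assumption, the extra space lands at the start of every item after the first, which matches the stray spaces seen in the keyword list.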
Title: Re: Leading semicolon and space as well as quotes in keywords
Post by: nbsusa on June 06, 2016, 04:03:17 PM
Thanks for the reply. I am using both Adobe Acrobat and Acrobat Reader to view the completed PDFs. Our customer is requiring the first semicolon (with the space following) to be removed, as well as the quotes that show up. If I change my command line to use -keywords= instead of -xmp:subject= I do not get the leading semicolon, but I get quotes around the entire string (which should be 4 separate keyword fields).
Title: Re: Leading semicolon and space as well as quotes in keywords
Post by: StarGeek on June 06, 2016, 05:57:20 PM
You might want to try the command in FAQ 3 (http://www.exiftool.org/faq.html#Q3).  I'm guessing that Adobe is reading the keywords from multiple tags and combining them.  Plus, there are PDF specific tags, like PDF:Keywords.

Title: Re: Leading semicolon and space as well as quotes in keywords
Post by: Phil Harvey on June 06, 2016, 08:50:16 PM
I also suggest reading FAQ 17 (https://exiftool.org/faq.html#Q17).

- Phil
Title: Re: Leading semicolon and space as well as quotes in keywords
Post by: nbsusa on June 07, 2016, 09:07:28 AM
Not really sure where to go from here, as neither FAQ 3 nor FAQ 17 helped with the issue.
Title: Re: Leading semicolon and space as well as quotes in keywords
Post by: Phil Harvey on June 07, 2016, 09:59:16 AM
Did you use Adobe Acrobat to change the metadata as required by the customer, and then use ExifTool to read this metadata?  You should then be able to write the metadata exactly like this using ExifTool.

- Phil
Title: Re: Leading semicolon and space as well as quotes in keywords
Post by: nbsusa on June 07, 2016, 10:29:59 AM
Same issue. I exported a csv that contained the info I hand typed into the PDF Keywords field, modified the csv, then imported it. I still get the leading semicolon. The issue with the quotes I was having was due to a comma within part of the text. But, I need to get rid of the leading semicolon. I have no idea why that would keep appearing.
Title: Re: Leading semicolon and space as well as quotes in keywords
Post by: Phil Harvey on June 07, 2016, 10:48:37 AM
Some concrete examples are necessary.  Can you post the modified CSV and the command you used to import it?

- Phil
Title: Re: Leading semicolon and space as well as quotes in keywords
Post by: nbsusa on June 07, 2016, 01:37:37 PM
I've attached the output I created using the following command line:

exiftool -csv -r MPHS-CCR_6426.pdf > out.csv

I've also attached my modified CSV, and I used the following command line to update the PDF. The only things I changed were to remove the Keywords field and modify the Subject field.

exiftool -sep ";" -csv=in.csv MPHS-CCR_6426.pdf -overwrite_original
Title: Re: Leading semicolon and space as well as quotes in keywords
Post by: Phil Harvey on June 07, 2016, 01:45:09 PM
Here's what I get (changing the SourceFile to a.pdf so I can test your CSV file):

% exiftool a.pdf -sep ";" -csv=/Users/phil/Desktop/in.csv
    1 image files updated

% ./exiftool a.pdf -keywords -subject -G1
[XMP-dc]        Subject                         : Document 77777,  Report,  1967 Aug 23,  Environmental Education for Urban Schools (An Address Delivered at the 14th Annual National Conservation Education Association Conference in Springfield Missouri) / by Edward A. Ames (Executive Director of Wave Hill Center for Environmental Studies).
[PDF]           Subject                         : Document 77777; Report; 1967 Aug 23; Environmental Education for Urban Schools (An Address Delivered at the 14th Annual National Conservation Education Association Conference in Springfield Missouri) / by Edward A. Ames (Executive Director of Wave Hill Center for Environmental Studies).


I don't see the leading semicolon that you mention.

The only problem I see is an extra space before all entries after the first in XMP:Subject.  You should use -sep "; " (semicolon+space) in the first command above to avoid this.

- Phil
Title: Re: Leading semicolon and space as well as quotes in keywords
Post by: nbsusa on June 07, 2016, 02:44:04 PM
Phil, it's when you physically open the PDF in Adobe Acrobat or Reader and view File/Properties and look at Keywords where you will see it. I initially had the -sep with "; " but a previous reply told me to take the space out.
Title: Re: Leading semicolon and space as well as quotes in keywords
Post by: Phil Harvey on June 07, 2016, 02:57:23 PM
I think the previous reply said to either remove the space after the semicolon in the tag value, or add a space to the -sep argument.

Could you post the ExifTool output as I have done (the second command in my last post) for two PDF files:  one that shows the extra semicolon in Adobe Acrobat, and one that doesn't?

- Phil
Title: Re: Leading semicolon and space as well as quotes in keywords
Post by: nbsusa on June 07, 2016, 03:50:06 PM
Phil, I get the same look that you get when running that. It does not show the leading semicolon running either one. The only way it shows up is when looking at the PDF properties in Acrobat. I have not looked in any other PDF viewer to see if it is there because the client needs it to look a certain way in Acrobat. I'm attaching a screen shot.
Title: Re: Leading semicolon and space as well as quotes in keywords
Post by: Phil Harvey on June 07, 2016, 04:05:06 PM
Yes, but what is the difference in the metadata (comparing ExifTool outputs) between one that shows the leading semicolon in Acrobat and one that doesn't?  There must be a difference, and ExifTool will show you what it is.
Title: Re: Leading semicolon and space as well as quotes in keywords
Post by: nbsusa on June 07, 2016, 04:11:15 PM
With the leading semicolon:
[PDF]           Keywords                        : Document, 6426;, Report;, 1967, Aug, 23;, Environmental, Education, for, Urban, Schools, (An, Address, Delivered, at, the, 14th, Annual, National, Conservation, Education, Association, Conference, in, Springfield, Missouri), /, by, Edward, A., Ames, (Executive, Director, of, Wave, Hill, Center, for, Environmental, Studies).
[XMP-dc]        Subject                         : Document 6426; Report; 1967 Aug 23; Environmental Education for Urban Schools (An Address Delivered at the 14th Annual National Conservation Education Association Conference in Springfield Missouri) / by Edward A. Ames (Executive Director of Wave Hill Center for Environmental Studies).


Without leading semicolon:
[PDF]           Keywords                        : Document, 6426;, Report;, 1967, Aug, 23;, Environmental, Education, for, Urban, Schools, (An, Address, Delivered, at, the, 14th, Annual, National, Conservation, Education, Association, Conference, in, Springfield, Missouri), /, by, Edward, A., Ames, (Executive, Director, of, Wave, Hill, Center, for, Environmental, Studies).
[XMP-dc]        Subject                         : Document 6426, Report, 1967 Aug 23, Environmental Education for Urban Schools (An Address Delivered at the 14th Annual National Conservation Education Association Conference in Springfield Missouri) / by Edward A. Ames (Executive Director of Wave Hill Center for Environmental Studies).


I see commas in the one without the semicolon even though when viewing in PDF properties it has semicolons.
Title: Re: Leading semicolon and space as well as quotes in keywords
Post by: Phil Harvey on June 07, 2016, 04:15:37 PM
This output is very different than the one I posted.  The XMP-dc:Subject is stored incorrectly in the one with the leading semicolon.   It was written without the -sep "; " option, so it is one long string instead of separate items.
Title: Re: Leading semicolon and space as well as quotes in keywords
Post by: nbsusa on June 07, 2016, 04:28:03 PM
Sorry about that.

I just ran it again on the one with the leading semicolon, and this is all it outputs now:

[XMP-dc]        Subject                         : Document 6426, Report, 1967 Aug 23, Environmental Education for Urban Schools (An Address Delivered at the 14th Annual National Conservation Education Association Conference in Springfield Missouri) / by Edward A. Ames (Executive Director of Wave Hill Center for Environmental Studies).
Title: Re: Leading semicolon and space as well as quotes in keywords
Post by: Phil Harvey on June 07, 2016, 04:30:11 PM
OK, so what is the difference between this and the other?  You might have to compare the full ExifTool output.

- Phil
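
One way to do the comparison suggested above is to save the full output for each file (for example with the -G1 and -a options used elsewhere in this thread) and diff the two texts. The following is a minimal Python sketch using the standard difflib module; the function name diff_exiftool_output is hypothetical, and the sample strings are shortened illustrations modeled on the outputs posted here, not real captured output.

```python
from __future__ import annotations

import difflib


def diff_exiftool_output(with_issue: str, without_issue: str) -> list[str]:
    """Return only the lines that differ between two saved ExifTool text outputs."""
    delta = difflib.ndiff(with_issue.splitlines(), without_issue.splitlines())
    # "- " marks lines only in the first output, "+ " lines only in the second.
    return [line for line in delta if line.startswith(("- ", "+ "))]


# Illustrative sample data (assumption): two outputs that differ in one tag.
with_semicolon = (
    "[PDF]     Keywords : test, 1;, test, 2\n"
    "[XMP-dc]  Subject  : test 1; test 2"
)
without_semicolon = (
    "[PDF]     Keywords : test, 1;, test, 2\n"
    "[XMP-dc]  Subject  : test 1, test 2"
)

for line in diff_exiftool_output(with_semicolon, without_semicolon):
    print(line)
```

Any tag that appears in the result is a candidate for what Acrobat is reacting to.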
Title: Re: Leading semicolon and space as well as quotes in keywords
Post by: StarGeek on June 07, 2016, 04:32:44 PM
I believe the problem can be seen right there in the CSV file.  Keywords in the original file exists, but is empty.

I went and downloaded Adobe Reader to see which tags it was reading.  It looks like the relevant fields are XMP-pdf:Keywords and XMP-dc:Subject.  Reader reads both of these tags and shows them as "Keywords".  Adding to the fun is the fact that XMP-pdf:Keywords appears to be a string, not a list, using semicolons as separators.

nbsusa, if you clear out XMP-pdf:Keywords first, you probably will remove the leading blank keyword.
exiftool -XMP-pdf:Keywords=
  see edit 2

edit:
Command I used to fill the tags:
exiftool -XMP-pdf:Subject="XMP-pdf:Subject" -XMP-pdf:Keywords="XMP-pdf:Keywords" -XMP-dc:Subject="XMP-dc:Subject" -PDF:Subject="PDF:Subject" -PDF:Keywords="PDF:Keywords"
And what Adobe reader showed
(http://imgur.com/P88ixyc.jpg)

Edit2: Actually, it may not.  I cleared XMP-pdf:Keywords out, double checked, and even though the XMP-pdf:Keywords tag didn't exist, Reader showed the leading semicolon in Properties.  Clearing out XMP-dc:Subject and just using XMP-pdf:Keywords did not leave any leading or trailing semicolons.  The solution might be to just use XMP-pdf:Keywords.  And since that tag appears to be a string, not a list, it won't be affected by the -sep option.  You just have to separate the keywords with (SemicolonSpace).
Title: Re: Leading semicolon and space as well as quotes in keywords
Post by: nbsusa on June 08, 2016, 09:04:30 AM
I'm still not getting the desired result.

If I use -sep "; " -xmp-pdf:keywords="Document 6426; Report; 1967 Aug 23; Environmental Education for Urban Schools (An Address Delivered at the 14th Annual National Conservation Education Association Conference in Springfield Missouri) / by Edward A. Ames (Executive Director of Wave Hill Center for Environmental Studies)."

Then the PDF (in Reader) displays "Document 6426; Report; 1967 Aug 23; Environmental Education for Urban Schools (An Address Delivered at the 14th Annual National Conservation Education Association Conference in Springfield Missouri) / by Edward A. Ames (Executive Director of Wave Hill Center for Environmental Studies)."

Note the quotes around everything. This does not split the four separate keywords (I can see they are not split in Adobe Acrobat) and does not display them in Dublin Core Properties as separate keyword lists.

Maybe this is not possible to do with exiftool?
Title: Re: Leading semicolon and space as well as quotes in keywords
Post by: Phil Harvey on June 08, 2016, 09:22:54 AM
You should be using XMP-dc:Subject, not XMP-pdf:Keywords.

- Phil
Title: Re: Leading semicolon and space as well as quotes in keywords
Post by: nbsusa on June 08, 2016, 09:39:18 AM
Phil, when I use:
-sep "; " -XMP-dc:Subject="Document 6426; Report; 1967 Aug 23; Environmental Education for Urban Schools (An Address Delivered at the 14th Annual National Conservation Education Association Conference in Springfield Missouri) / by Edward A. Ames (Executive Director of Wave Hill Center for Environmental Studies)."

I get this in the PDF Keywords (viewing in Reader)

; Document 6426; Report; 1967 Aug 23; Environmental Education for Urban Schools (An Address Delivered at the 14th Annual National Conservation Education Association Conference in Springfield Missouri) / by Edward A. Ames (Executive Director of Wave Hill Center for Environmental Studies).

Back to the leading semicolon. However, the Dublin Core properly displays the 4 separate keyword lists.
Title: Re: Leading semicolon and space as well as quotes in keywords
Post by: Phil Harvey on June 08, 2016, 11:32:35 AM
Sorry.  I just re-read StarGeek's post in which he suggested using XMP-pdf:Keywords.  This tag is a simple string, not a list like XMP-dc:Subject.

StarGeek suggested using XMP-pdf:Keywords because it avoided the leading semicolon, but it doesn't look like this is the right solution if Acrobat puts quotes around a complex string like this.

I don't understand why you continue to have this problem.  ExifTool can write anything you want, so it should be easy to reproduce exactly what Acrobat writes.

You have worn me down, so I finally gave in and tested this with a trial version of Adobe Acrobat.  Here is what I did (and what we have been trying to get you to do all along):

1. Write the Keywords you want using Adobe Acrobat.  (For this test I wrote "test 1; test 2; test 3", without the quotes)

2. Use ExifTool to see what was written:

> exiftool ~/Desktop/a.pdf -G1 -a -subject -keywords
[XMP-dc]        Subject                         : test 1, test 2, test 3
[XMP-pdf]       Keywords                        : test 1; test 2; test 3
[PDF]           Keywords                        : test, 1;, test, 2;, test, 3


3. Use ExifTool to write the same thing to another file:

> exiftool ~/Desktop/b.pdf -xmp-dc:subject="test 1, test 2, test 3" -xmp-pdf:keywords="test 1; test 2; test 3" -pdf:keywords="test, 1;, test, 2;, test, 3" -sep ", "
    1 image files updated


4. Open the other file in Acrobat to verify that the Keywords appear as desired --> YES THEY DO!

The only trick was using -sep ", " to split the strings into separate list items when writing.

Note that even though it looks nice in Acrobat, it is wrong.  Acrobat did not split the PDF:Keywords into separate items properly.

- Phil
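
For more than one file, the three tag values in step 3 can be generated from a single keyword list. This is a minimal Python sketch that only builds the argument list and does not run ExifTool; the function name build_keyword_args is hypothetical, and it assumes no keyword contains ", ", since the -sep ", " option would split such a keyword in two.

```python
from __future__ import annotations


def build_keyword_args(keywords: list[str]) -> list[str]:
    """Build ExifTool arguments mirroring the three tags written in step 3 above."""
    for keyword in keywords:
        if ", " in keyword:
            # Assumption: -sep ", " would split this keyword into separate items.
            raise ValueError(f"keyword contains the list separator: {keyword!r}")

    semicolon_joined = "; ".join(keywords)
    return [
        "-sep", ", ",
        "-xmp-dc:subject=" + ", ".join(keywords),
        "-xmp-pdf:keywords=" + semicolon_joined,
        # Reproduce the PDF:Keywords value ExifTool reported for the
        # Acrobat-written file: the semicolon-joined string split on whitespace.
        "-pdf:keywords=" + ", ".join(semicolon_joined.split()),
    ]


print(build_keyword_args(["test 1", "test 2", "test 3"]))
```

The resulting list could be passed to something like subprocess.run(["exiftool", *args, "b.pdf"]), assuming exiftool is on the PATH. Passing arguments as a list avoids shell quoting problems with the semicolons.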
Title: Re: Leading semicolon and space as well as quotes in keywords
Post by: nbsusa on June 08, 2016, 11:52:13 AM
That worked like a champ! I apologize for not quite understanding everything you were asking me to do. I didn't realize I needed to do all 3. My fault and I appreciate everything everyone did. Thank you very much!
Title: Re: Leading semicolon and space as well as quotes in keywords
Post by: StarGeek on June 08, 2016, 02:46:44 PM
Adobe Reader completely ignored PDF:Keywords.  Annoying difference between two pieces of software by the same company.

Title: Re: Leading semicolon and space as well as quotes in keywords
Post by: Phil Harvey on June 08, 2016, 07:13:53 PM
I have no idea which of those 3 tags are important.  However, Adobe Acrobat wrote them all, so that's what I did too.

- Phil