Hi
Can someone help me understand why the following command will overwrite the Title tag, but not delete the subject tag of my PDF file?
exiftool -Title="My Title" -Subject= myfile.pdf
Oddly, the value of the Subject field still shows in Acrobat Reader DC and PDF Architect, but not when I extract the metadata with ExifTool:
exiftool -a myfile.pdf
Upon inspection of the PDF file in a text editor, I can still see this towards the end of the file:
/Title()
/Subject(Original_value)
The desired values only show in a later %BeginExifToolUpdate section, which, I presume, other PDF applications don't take into consideration.
Thanks!
R.
See the 3rd paragraph under PDF tags (https://exiftool.org/TagNames/PDF.html).
So using a text editor to look through the file will find the original data, as mentioned in that link.
I do find it odd that Acrobat would see that data, as exiftool is following Adobe's rules for incremental updates.
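To illustrate why a text editor still finds the old values, here is a minimal Python sketch. It assumes the edit is simply appended after the %BeginExifToolUpdate marker mentioned above. The sample bytes are a made-up stand-in, not a valid PDF, and split_at_update is a hypothetical helper, not part of exiftool.

```python
from typing import Tuple

MARKER = b"%BeginExifToolUpdate"  # marker quoted earlier in this thread


def split_at_update(data: bytes) -> Tuple[bytes, bytes]:
    """Split file bytes into (original part, appended update).

    The appended part is empty when the marker is not present.
    """
    idx = data.find(MARKER)
    if idx == -1:
        return data, b""
    return data[:idx], data[idx:]


# Synthetic stand-in for an edited file. This is NOT a valid PDF; it only
# mimics the layout described in the thread (old values first, update last).
sample = (
    b"%PDF-1.4\n"
    b"<< /Title() /Subject(Original_value) >>\n"
    b"%%EOF\n"
    b"%BeginExifToolUpdate\n"
    b"<< /Title(My Title) >>\n"
    b"%%EOF\n"
)

original, appended = split_at_update(sample)
# The old Subject bytes are still physically present in the original part...
print(b"/Subject(Original_value)" in original)
# ...while the appended update no longer carries a Subject entry.
print(b"/Subject(" in appended)
```

A reader that honours the appended section sees only the new values, while a plain text search over the whole file still turns up the old ones.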
Quote: "I do find it odd that Acrobat would see that data, as exiftool is following Adobe's rules for incremental updates."
Agreed. Even after linearizing with qpdf, the result is the same. As Acrobat Reader is a reference implementation, I'm not sure whether this should be considered a bug somewhere. In any case, PDF Architect shows the same values.
I also noticed that reprocessing the file through Ghostscript's ps2pdf sets the expected blank value in the Subject field.
This is unsettling. It has been tested previously with Adobe products and worked as specified at that time. I don't have time to re-test it now, but I'll look into this with a current version of Adobe Reader when I get a chance.
- Phil
Here's a test scenario for your consideration:
- I produced the attached PDF file with PDF Creator, setting original values:
(https://i.imgur.com/sUtPVaE.png)
- Then, I used exiftool: exiftool -Title="New title" -Subject= blank.pdf. Metadata is not as expected in Acrobat Reader DC:
(https://i.imgur.com/afVRIsv.png)
- However, if I use ps2pdf: ps2pdf blank.pdf blank-reprocessed.pdf, I end up with the expected output:
(https://i.imgur.com/bIQL0O0.png)
Acrobat Reader version:
(https://i.imgur.com/KQsn5j9.png)
I hope this helps.
What happens if you use this?
exiftool -Description= myfile.pdf
Using exiftool to look at the data for your example, it shows that PDF Creator fills both PDF:Subject and XMP:Description with your "Original subject".
I should have looked at my notes on Adobe Reader. My previous research, though a couple of years old, showed that Adobe Reader will fill the "Subject" field with data from any of these tags:
PDF:Subject
XMP-dc:Description
XMP-pdf:Subject
XMP-xmp:Description
The last two are probably pretty rare, but all of these would be cleared with:
exiftool -subject= -description= myfile.pdf
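For anyone scripting this, a rough Python sketch of the same idea, naming each of the four groups explicitly so that only the tags listed above are targeted. build_clear_subject_command is a hypothetical helper made up for this post; it only builds the argument list and does not run exiftool.

```python
import shlex
from typing import List

# The four tags listed above as sources for Adobe Reader's "Subject" field.
SUBJECT_TAGS = [
    "PDF:Subject",
    "XMP-dc:Description",
    "XMP-pdf:Subject",
    "XMP-xmp:Description",
]


def build_clear_subject_command(pdf_path: str) -> List[str]:
    """Hypothetical helper: exiftool arguments that blank each tag above."""
    # "-GROUP:TAG=" with nothing after "=" asks exiftool to delete that tag.
    return ["exiftool"] + [f"-{tag}=" for tag in SUBJECT_TAGS] + [pdf_path]


cmd = build_clear_subject_command("myfile.pdf")
# Shell-style rendering of the command; nothing is executed here.
print(shlex.join(cmd))
```

To actually apply it, the list can be passed to subprocess.run(cmd, check=True), assuming exiftool is on the PATH.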
Yes, removing Description gets the expected results :-)
Note that I initially had the issue from a file that wasn't produced by PDF Creator, so I suppose that this way of setting the metadata isn't uncommon.
Thanks!
Glad you figured it out. I had just tried this myself with a different PDF file and couldn't reproduce these results.
Thanks StarGeek.
- Phil