ExifTool Forum

ExifTool => Newbies => Topic started by: mrgou on January 22, 2020, 05:00:56 PM

Title: Removing PDF Subject Tag with Exiftool
Post by: mrgou on January 22, 2020, 05:00:56 PM
Hi

Can someone help me understand why the following command overwrites the Title tag but does not delete the Subject tag of my PDF file?

exiftool -Title="My Title" -Subject= myfile.pdf

Oddly, the value of the Subject field still shows in Acrobat Reader DC and PDF Architect, but not when extracting metadata with ExifTool:

exiftool -a myfile.pdf

Upon inspection of the PDF file in a text editor, I can still see this towards the end of the file:

/Title()
/Subject(Original_value)


The desired values only show in a later %BeginExifToolUpdate section, which, I presume, other PDF applications don't take into consideration.

Thanks!

R.
Title: Re: Removing PDF Subject Tag with Exiftool
Post by: StarGeek on January 22, 2020, 05:58:24 PM
See the 3rd paragraph under PDF tags (https://exiftool.org/TagNames/PDF.html).

So using a text editor to look through the file will find the original data, as mentioned in that link.

I do find it odd that Acrobat would see that data, as exiftool is following Adobe's rules for incremental updates.
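
To illustrate why a text editor still finds the old value, here is a minimal sketch that uses a stand-in text file rather than a real PDF. The file contents are an assumption modelled on the snippet quoted in the first post; the only point is that an update appended to the end of the file leaves the earlier bytes in place.

```shell
#!/bin/sh
# Build a tiny stand-in file (NOT a valid PDF) that mimics the layout
# described in this thread: original Info entries first, then an
# appended section that starts with the %BeginExifToolUpdate marker.
demo=$(mktemp)
printf '%s\n' \
  '/Title()' \
  '/Subject(Original_value)' \
  '%BeginExifToolUpdate' \
  '/Title(My Title)' > "$demo"

# Count /Subject entries up to and including the update marker line.
before_update=$(sed '/%BeginExifToolUpdate/q' "$demo" | grep -a -c '/Subject' || true)

# Count /Subject entries in the appended section only.
after_update=$(sed '1,/%BeginExifToolUpdate/d' "$demo" | grep -a -c '/Subject' || true)

echo "before update: $before_update, after update: $after_update"
rm -f "$demo"
```

The old `/Subject` entry is found only in the part of the file that precedes the update, which matches what a raw text search of the real file turns up.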
Title: Re: Removing PDF Subject Tag with Exiftool
Post by: mrgou on January 24, 2020, 12:46:14 PM
Quote: "I do find it odd that Acrobat would see that data, as exiftool is following Adobe's rules for incremental updates."

Agreed. Even after linearizing with qpdf, the result is the same. As Acrobat Reader is a reference implementation, I'm not sure whether this should be treated as a bug somewhere. Anyway, PDF Architect shows the same values.

I actually noticed that reprocessing the file through Ghostscript's ps2pdf sets the expected blank value in the Subject field.
Title: Re: Removing PDF Subject Tag with Exiftool
Post by: Phil Harvey on January 24, 2020, 01:03:30 PM
This is unsettling.  It has been tested previously with Adobe products and worked as specified at that time.  I don't have time to re-test it now, but I'll look into this with a current version of Adobe Reader when I get a chance.

- Phil
Title: Re: Removing PDF Subject Tag with Exiftool
Post by: mrgou on January 25, 2020, 04:48:05 AM
Here's a test scenario for your consideration:

(https://i.imgur.com/sUtPVaE.png)
(https://i.imgur.com/afVRIsv.png)
(https://i.imgur.com/bIQL0O0.png)
Acrobat Reader version:
(https://i.imgur.com/KQsn5j9.png)

I hope this helps.
Title: Re: Removing PDF Subject Tag with Exiftool
Post by: StarGeek on January 25, 2020, 11:49:21 AM
What happens if you use this?
exiftool -Description= myfile.pdf

Using exiftool to look at the data for your example, it shows that PDF Creator fills both PDF:Subject and XMP:Description with your "Original subject".
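
One way to see which group each copy of the value lives in is to ask exiftool to show duplicates and family 1 group names. A minimal sketch: `show_subject_sources` is a hypothetical helper name, while `-a`, `-G1` and `-s` are standard exiftool options.

```shell
#!/bin/sh
# show_subject_sources is a hypothetical helper, not part of ExifTool.
# -a  : allow duplicate tags to be shown
# -G1 : prefix each tag with its family 1 group (e.g. PDF, XMP-dc)
# -s  : print tag names instead of descriptions
show_subject_sources() {
    exiftool -a -G1 -s -Subject -Description "$1"
}
```

Calling `show_subject_sources myfile.pdf` should list every Subject and Description tag along with the group it was read from.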

I should have looked at my notes on Adobe Reader.  My previous research, though a couple of years old, showed that Adobe Reader will fill the "Subject" field with data from these tags:
PDF:Subject
XMP-dc:Description
XMP-pdf:Subject
XMP-xmp:Description


The last two are probably pretty rare, but all of these would be cleared with:
exiftool -subject= -description= myfile.pdf
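
The same clean-up can also be spelled out group by group, which makes it explicit that the four tags listed above are the ones being targeted. This is a sketch only: `clear_subject_everywhere` is a hypothetical wrapper name, and the group-qualified tag names are taken from the list in this post.

```shell
#!/bin/sh
# clear_subject_everywhere is a hypothetical wrapper, not part of ExifTool.
# It deletes each of the four tags named in the list above, addressing
# every one by its explicit group so nothing else is touched.
clear_subject_everywhere() {
    exiftool \
        -PDF:Subject= \
        -XMP-dc:Description= \
        -XMP-pdf:Subject= \
        -XMP-xmp:Description= \
        "$1"
}
```

Usage would be `clear_subject_everywhere myfile.pdf`; by default exiftool keeps a backup of the original as `myfile.pdf_original`.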
Title: Re: Removing PDF Subject Tag with Exiftool
Post by: mrgou on January 25, 2020, 12:07:16 PM
Yes, removing Description gets the expected results :-)

Note that I initially had the issue from a file that wasn't produced by PDF Creator, so I suppose that this way of setting the metadata isn't uncommon.

Thanks!
Title: Re: Removing PDF Subject Tag with Exiftool
Post by: Phil Harvey on January 28, 2020, 07:23:52 AM
Glad you figured it out.  I had just tried this myself with a different PDF file and couldn't reproduce these results.

Thanks StarGeek.

- Phil