PDF Instance ID & Document ID?

Started by soozie, February 28, 2017, 11:10:56 AM

soozie

When I print my PDF's metadata I see two fields: Instance ID and Document ID. Can you tell me what these are and what they are used for?

Thanks

Phil Harvey

I would guess that the DocumentID doesn't change when the document is edited, but that the InstanceID does.  That is just a guess though.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

soozie

Based on the discussion quoted below, do you think a new Instance ID is created for every save? What ExifTool options would I use to export all the data in the PDF mentioned below to a txt file, so I can better see what they are referring to?

Remember: the amount of metadata that a program uses when creating files is limitless. XMP is built on XML, so any metadata tags can be defined. Let's take a real-world example of how powerful PDF metadata can be when created from certain programs. Download Trustwave's Global Security Report PDF from 2013. Run it in exiftool. What do you see? That's right, the "History" metadata fields will show you not only that the document was saved 497 times, but it will also show you the exact times that it was saved, the program used to save it each time, and the Document Instance ID for each save (less exciting).

http://www.4n6k.com/2014/02/forensics-quickie-pdf-metadata.html

Phil Harvey

I think that implies that the instance ID is different each time the document is saved.

You can do this to export metadata to a text file:

exiftool -w txt FILE
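
To make the DocumentID / InstanceID distinction concrete, here is a minimal Python sketch that reads the two IDs and the save history out of an XMP packet. It is only an illustration under assumptions: the sample packet, every ID, date, and software name in it, and the helper name summarize_ids are all made up, and the sample assumes the writing application records one History entry per save and updates InstanceID each time, which not every program does.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs used in XMP packets.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "xmpMM": "http://ns.adobe.com/xap/1.0/mm/",
    "stEvt": "http://ns.adobe.com/xap/1.0/sType/ResourceEvent#",
}

# Hypothetical XMP packet: all IDs, dates, and agent names are invented.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/"
    xmlns:stEvt="http://ns.adobe.com/xap/1.0/sType/ResourceEvent#">
   <xmpMM:DocumentID>uuid:doc-0001</xmpMM:DocumentID>
   <xmpMM:InstanceID>uuid:inst-0003</xmpMM:InstanceID>
   <xmpMM:History>
    <rdf:Seq>
     <rdf:li stEvt:action="created" stEvt:instanceID="uuid:inst-0001"
             stEvt:when="2013-01-05T09:00:00Z" stEvt:softwareAgent="Example Editor 1.0"/>
     <rdf:li stEvt:action="saved" stEvt:instanceID="uuid:inst-0002"
             stEvt:when="2013-01-06T14:30:00Z" stEvt:softwareAgent="Example Editor 1.0"/>
     <rdf:li stEvt:action="saved" stEvt:instanceID="uuid:inst-0003"
             stEvt:when="2013-01-07T08:15:00Z" stEvt:softwareAgent="Example Editor 1.0"/>
    </rdf:Seq>
   </xmpMM:History>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""


def summarize_ids(xmp_text):
    """Return (DocumentID, InstanceID, history) parsed from an XMP packet string."""
    root = ET.fromstring(xmp_text)
    doc_id = root.findtext(".//xmpMM:DocumentID", namespaces=NS)
    inst_id = root.findtext(".//xmpMM:InstanceID", namespaces=NS)

    # Each history entry is an rdf:li element whose fields are stEvt:* attributes.
    stevt = "{" + NS["stEvt"] + "}"
    history = []
    for li in root.iterfind(".//xmpMM:History/rdf:Seq/rdf:li", NS):
        history.append({
            "action": li.get(stevt + "action"),
            "instanceID": li.get(stevt + "instanceID"),
            "when": li.get(stevt + "when"),
            "softwareAgent": li.get(stevt + "softwareAgent"),
        })
    return doc_id, inst_id, history


if __name__ == "__main__":
    doc_id, inst_id, history = summarize_ids(SAMPLE_XMP)
    print("DocumentID:", doc_id)   # one value for the whole document
    print("InstanceID:", inst_id)  # value for the current saved version
    for event in history:
        print(event["when"], event["action"], event["instanceID"])
```

In this made-up sample the DocumentID appears once, while each History entry carries its own instance ID and the top-level InstanceID matches the most recent entry. That is the pattern the quoted article describes, but a real file should be checked against its own exported metadata rather than assumed to follow it.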

- Phil