Create date extracted wrongly from a pdf

Started by jjp, March 17, 2014, 01:46:34 AM


jjp

Hi

I am using ExifTool to extract "Create Date" and "Modify Date" from PDF documents.
For one PDF, the "Create Date" is extracted incorrectly, while the "Modify Date" is extracted correctly.
The Adobe Reader properties dialog shows both dates correctly.

ExifTool extracted dates:
Create Date                     : 1910:20:51 11:03:64
Modify Date                     : 2002:05:11 10:56:13

Adobe reader dates:
Create date: 11-05-2002 10:36:46
Modify date: 11-05-2002 10:56:13

Using a hex viewer I have found the date strings in the pdf:
/CreationDate (D:191020511103646)
/ModDate (D:20020511105613)

Notice that the creation date has 15 digits while the modify date has only 14.
I am not sure how to interpret the creation date format, though the last 12 digits seem to match a full date and time with the year written as only 2 digits.
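
For illustration, here is a minimal Python sketch of how a fixed-width reading of the 15-digit string could produce the value shown above. It assumes the standard D:YYYYMMDDHHmmSS layout is applied strictly by position; the function name split_fixed_width is hypothetical, and this is not ExifTool's actual code.

```python
def split_fixed_width(pdf_date):
    """Split a PDF date string by the field widths of D:YYYYMMDDHHmmSS."""
    # Drop the "D:" prefix if it is present.
    digits = pdf_date[2:] if pdf_date.startswith("D:") else pdf_date
    widths = [4, 2, 2, 2, 2, 2]  # year, month, day, hour, minute, second
    fields = []
    pos = 0
    for width in widths:
        fields.append(digits[pos:pos + width])
        pos += width
    # Return the six fields plus whatever characters were left unread.
    return fields, digits[pos:]


fields, leftover = split_fixed_width("D:191020511103646")
print("{}:{}:{} {}:{}:{}".format(*fields), "| leftover:", leftover)
```

Read this way, the extra digit in the year shifts every later field by one position, which is consistent with the nonsensical month (20), day (51) and second (64) in the extracted value.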

The pdf: http://www.researchgate.net/publication/11344214_The_pharmacodynamics_and_pharmacokinetics_of_mivacurium_in_children/file/72e7e51a45c45ab4b3.pdf

I use the Perl module, but I have also tried the Windows executable.

Best Regards
Jesper

Phil Harvey

Hi Jesper,

Wow, that has to be some sort of bug.  Apparently, a year of 19102 means 2002 (= 1900 + 102).  Very funny.  This is not valid according to the PDF specification, and was obviously created by some buggy software that didn't survive the Y2K transition.  Apparently Adobe Reader handles this special case, but I'm not sure it makes sense for me to add a patch to ExifTool for this.
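
For anyone who needs to work around this in their own code, a rough Python sketch of the "1900 + 102" interpretation might look like the following. It assumes the only malformed case is a 15-digit string whose year is written as "19" followed by a 3-digit offset from 1900, and it ignores time zone suffixes and shortened dates that valid PDF date strings may contain. The function name parse_pdf_date is hypothetical and this is not part of ExifTool.

```python
import re
from datetime import datetime


def parse_pdf_date(value):
    """Parse D:YYYYMMDDHHmmSS, tolerating a 5-digit year such as 19102."""
    match = re.fullmatch(r"D:(\d{14,15})", value)
    if not match:
        raise ValueError("unsupported PDF date string: %r" % value)
    digits = match.group(1)
    if len(digits) == 15:
        # Assumed Y2K quirk: year written as "19" + (year - 1900),
        # so 19102 is read as 1900 + 102 = 2002.
        if not digits.startswith("19"):
            raise ValueError("unexpected 15-digit date: %r" % value)
        year = 1900 + int(digits[2:5])
        rest = digits[5:]
    else:
        year = int(digits[:4])
        rest = digits[4:]
    # The remaining 10 digits are month, day, hour, minute, second.
    month, day, hour, minute, second = (
        int(rest[i:i + 2]) for i in range(0, 10, 2)
    )
    return datetime(year, month, day, hour, minute, second)


print(parse_pdf_date("D:191020511103646"))  # the malformed /CreationDate
print(parse_pdf_date("D:20020511105613"))   # the valid /ModDate
```

With the strings from this file, the malformed creation date comes out as 2002-05-11 10:36:46, matching what Adobe Reader displays.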

- Phil

jjp

Hi Phil,

Thanks for the quick reply.
If it isn't valid according to the PDF specification, then I agree that ExifTool shouldn't try to handle it.

Best Regards
Jesper