ExifTool Forum

General => Metadata => Topic started by: jjp on March 17, 2014, 01:46:34 AM

Title: Create date extracted wrongly from a pdf
Post by: jjp on March 17, 2014, 01:46:34 AM
Hi

I am using ExifTool to extract "Create Date" and "Modify Date" from PDF documents.
In one PDF the "Create Date" is extracted with a wrong value, while the "Modify Date" is extracted correctly.
Adobe Reader's document properties show both dates correctly.

ExifTool extracted dates:
Create Date                     : 1910:20:51 11:03:64
Modify Date                     : 2002:05:11 10:56:13

Adobe Reader dates:
Create date: 11-05-2002 10:36:46
Modify date: 11-05-2002 10:56:13

Using a hex viewer, I found the date strings in the PDF:
/CreationDate (D:191020511103646)
/ModDate (D:20020511105613)

Notice that the creation date has 15 digits and the modify date has only 14.
I am not sure how to interpret the creation date format, though the last 12 digits seem to match a full date and time with the year written as only 2 digits.

The pdf: http://www.researchgate.net/publication/11344214_The_pharmacodynamics_and_pharmacokinetics_of_mivacurium_in_children/file/72e7e51a45c45ab4b3.pdf

I use the Perl module, but I have also tried the Windows exe file.

Best Regards
Jesper
Title: Re: Create date extracted wrongly from a pdf
Post by: Phil Harvey on March 17, 2014, 09:27:11 AM
Hi Jesper,

Wow, that has to be some sort of bug.  Apparently, a year of 19102 means 2002 (= 1900 + 102).  Very funny.  This is not valid according to the PDF specification, and was obviously created by some buggy software that didn't survive the Y2K transition.  Adobe Reader seems to handle this special case, but I'm not sure it makes sense for me to add a patch to ExifTool for this.
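
To illustrate the arithmetic, here is a minimal Python sketch of how a reader could normalize such a date. It is not ExifTool code and not a proposed patch. It assumes the buggy writer emitted "19" followed by a three-digit count of years since 1900, so only strings with exactly 15 digits that start with "191" are rewritten. The names normalize_pdf_date and parse_pdf_date are hypothetical, and whether Adobe Reader applies this same rule is a guess.

```python
import re
from datetime import datetime

# Assumed shape of the damaged value: "D:" + "19" + three-digit years-since-1900
# (100-199) + MMDDHHMMSS. This pattern is an illustration, not taken from any spec.
_Y2K_DAMAGED = re.compile(r"D:19(1\d\d)(\d{10})")


def normalize_pdf_date(raw: str) -> str:
    """Rewrite 'D:191YYMMDDHHMMSS' as 'D:20YYMMDDHHMMSS'; return other input unchanged."""
    m = _Y2K_DAMAGED.fullmatch(raw)
    if m is None:
        return raw
    year = 1900 + int(m.group(1))  # e.g. 1900 + 102 = 2002
    return f"D:{year:04d}{m.group(2)}"


def parse_pdf_date(raw: str) -> datetime:
    """Parse the first 14 digits after 'D:' as YYYYMMDDHHMMSS.

    Assumes a full date and time is present; shorter PDF dates and any
    time zone suffix are not handled in this sketch.
    """
    fixed = normalize_pdf_date(raw)
    return datetime.strptime(fixed[2:16], "%Y%m%d%H%M%S")


if __name__ == "__main__":
    for value in ("D:191020511103646", "D:20020511105613"):
        print(value, "->", parse_pdf_date(value).isoformat(sep=" "))
```

With the two strings from the hex dump, this yields 2002-05-11 10:36:46 for the creation date and 2002-05-11 10:56:13 for the modify date, which agrees with the values Adobe Reader displays. A valid 14-digit date is left alone because the pattern only matches when there are exactly 15 digits.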

- Phil
Title: Re: Create date extracted wrongly from a pdf
Post by: jjp on March 18, 2014, 02:43:27 AM
Hi Phil,

Thanks for the quick reply.
If it isn't valid according to the PDF specification, then I agree that ExifTool shouldn't try to handle it.

Best Regards
Jesper