Does exiftool differentiate between PDF/A-3a, A-3b and A-3u?

Started by roadowl, February 09, 2020, 11:38:46 AM


roadowl

Hi,

I noticed that when I open a PDF/A-3a file created with ABBYY, PDF viewers such as evince and xreader include 'Format: A-3a' in the document properties (see screenshot).

However, exiftool just shows PDF/A-3, without the 'a'. See the command-line output below.

Is the 'a' represented by exiftool in some other way?

~/tmp/19780826-vnkk-piet-grijs-zon > exiftool  19780826-vnkk-piet-grijs-zon.pdf
ExifTool Version Number         : 10.80
File Name                       : 19780826-vnkk-piet-grijs-zon.pdf
Directory                       : .
File Size                       : 3.1 MB
File Modification Date/Time     : 2020:02:09 15:21:15+01:00
File Access Date/Time           : 2020:02:09 17:30:27+01:00
File Inode Change Date/Time     : 2020:02:09 15:21:15+01:00
File Permissions                : rwxr--r--
File Type                       : PDF
File Type Extension             : pdf
MIME Type                       : application/pdf
PDF Version                     : 1.5
Linearized                      : Yes
Page Count                      : 2
Warning                         : Superfluous BOM at start of XMP
Format                          : application/pdf
Producer                        : ABBYY FineReader 15
Keywords                        :
Creator Tool                    : ABBYY FineReader 15
Create Date                     : 2020:02:09 14:21:14Z
Modify Date                     : 2020:02:09 14:21:14Z
Document ID                     : uuid:745B25F8-0771-49F8-B74C-7FD3A242C311
Part                            : 3
Conformance                     : A
Tagged PDF                      : Yes
Creator                         : ABBYY FineReader 15



StarGeek

I could be wrong but I think that the Conformance tag is the "A" you're looking for.

It took some doing to find a sample PDF that also had the Part and Conformance tags.  Take a look at this pdf.  Does it show up as PDF/A-3u for you?  Exiftool shows this for that file:
C:\>exiftool -g1 -a -s -part -Conformance Y:\!temp\qqq\1406.6126.pdf
---- XMP-pdfaid ----
Part                            : 3
Conformance                     : U


Also, take note that these are XMP tags, and they are easily altered.  It appears, then, that the data is based entirely on what the originating program filled in as metadata, not on the actual structure of the PDF.  If that's the case, then relying on these tags means trusting the program that created the file.
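
To illustrate why these tags are only a declaration: in many PDF/A files the XMP packet sits in the file as plain, uncompressed text, so the two values can be read (or changed) with a simple byte search. Below is a minimal Python sketch, assuming the conventional `pdfaid` prefix and an uncompressed packet. The function name `declared_pdfa_level` is made up for this example and is not part of exiftool.

```python
import re

# Assumption: the XMP packet is uncompressed and uses the usual "pdfaid"
# prefix. Matches both the element form (<pdfaid:part>3</pdfaid:part>)
# and the attribute form (pdfaid:part="3").
_PDFAID = re.compile(
    rb"pdfaid:(part|conformance)\s*(?:=\s*[\"']([^\"']+)[\"']|>\s*([^<\s]+)\s*<)"
)


def declared_pdfa_level(data: bytes) -> dict:
    """Return the PDF/A part and conformance *declared* in raw XMP bytes.

    This only reads what the producing program wrote; it does not check
    whether the file actually meets the standard.
    """
    found = {}
    for name, attr_value, elem_value in _PDFAID.findall(data):
        value = attr_value or elem_value
        # Keep the first occurrence of each property.
        found.setdefault(name.decode("ascii"), value.decode("ascii", "replace"))
    return found


if __name__ == "__main__":
    xmp = (
        b'<rdf:Description xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">\n'
        b"  <pdfaid:part>3</pdfaid:part>\n"
        b"  <pdfaid:conformance>U</pdfaid:conformance>\n"
        b"</rdf:Description>"
    )
    print(declared_pdfa_level(xmp))
    # Changing one byte changes the declared level; nothing else in the
    # file has to change for a metadata reader to report something new.
    print(declared_pdfa_level(xmp.replace(b">U<", b">B<")))
```

A real check of conformance needs a PDF/A validator; a metadata reader can only report the claim.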
* Did you read FAQ #3 and use the command listed there?
* Please use the Code button for exiftool code/output.
 
* Please include your OS, Exiftool version, and type of file you're processing (MP4, JPG, etc).

roadowl

Thank you. Yes, I am well aware that the output given for your PDF is to a large extent 'arbitrary fill-in'.
The output I get is:

[18:04:47] bjd@skyscraper: ~/dt > exiftool 1406.6126.pdf
ExifTool Version Number         : 10.80
File Name                       : 1406.6126.pdf
Directory                       : .
File Size                       : 5.6 MB
File Modification Date/Time     : 2020:02:09 18:04:33+01:00
File Access Date/Time           : 2020:02:09 18:04:33+01:00
File Inode Change Date/Time     : 2020:02:09 18:04:33+01:00
File Permissions                : rw-rw-r--
File Type                       : PDF
File Type Extension             : pdf
MIME Type                       : application/pdf
PDF Version                     : 1.4
Linearized                      : No
Page Count                      : 16
XMP Toolkit                     : Adobe XMP Core 4.0-c316 44.253921, Sun Oct 01 2006 17:14:39
Schemas Schema                  : PRISM metadata
Schemas Namespace URI           : http://prismstandard.org/namespaces/basic/2.2/
Schemas Prefix                  : prism
Schemas Property Name           : aggregationType
Schemas Property Value Type     : Text
Schemas Property Category       : external
Schemas Property Description    : Type of Aggregation; e.g. journal, series
Producer                        : pdfTeX
Format                          : application/pdf
Title                           : PDF/A-3u as an archival format for Accessible mathematics
Creator                         : Ross Moore
Description                     : Using PDF/A-3u to deliver the LaTeX source of mathematical expressions, for Accessibility and other purposes
Subject                         : PDF/A-3u, Accessible mathematics
Aggregation Type                : journal
Copyright                       : Macquarie University
Publisher                       : Macquarie University
ISSN                            : Lecture Notes in Artificial Intelligence
Volume                          : LNAI 8543
Number                          : CICM 2014
Cover Date                      : 2014
Issue Name                      : Springer Lecture Notes in Computer Science
Page Range                      : 184-199
Starting Page                   : 184
Ending Page                     : 199
Part                            : 3
Conformance                     : U
Creator Tool                    :
Marked                          : True
Modify Date                     : 2014:06:25 01:40:33+00:00
Create Date                     : 2014:06:25 01:40:33+00:00
Metadata Date                   : 2014:06:25 01:40:33+00:00
Page Mode                       : UseOutlines
Author                          : Ross Moore
Keywords                        : PDF/A-3u, Accessible mathematics
Trapped                         : False
GTS PDFA1 Version               : PDF/A-3u:2012
PTEX Fullbanner                 : This is pdfTeX, Version 3.1415926-2.3-1.40.12 (TeX Live 2011) kpathsea version 6.0.1


From this, and from my own PDF, I conclude (tentatively) that exiftool's 'Conformance' does indeed denote the lowercase suffix after the 'Part' identifier ('3' in both your PDF and mine). I was taking it to refer to the 'A' in 'PDF/A', but apparently that is wrong. I suppose the 'A' in PDF/A is implied by the version being 1.4 or greater, but that's a different question for another time (I need my PDFs to conform to official international 'archival' standards).
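
As a small illustration of that reading, the two exiftool lines can be combined into the label the viewers show. This is a rough Python sketch that works on exiftool's plain-text output as pasted above; the helper name `pdfa_label` is invented for this example, and lowercasing the conformance letter is my assumption about how the viewers build their string.

```python
from typing import Optional


def pdfa_label(exiftool_output: str) -> Optional[str]:
    """Build a label such as 'PDF/A-3a' from exiftool's text output.

    Returns None when no 'Part' line is present.
    """
    tags = {}
    for line in exiftool_output.splitlines():
        # exiftool prints "Tag Name   : value"; split at the first colon.
        name, sep, value = line.partition(":")
        if sep:
            tags.setdefault(name.strip(), value.strip())
    part = tags.get("Part")
    if not part:
        return None
    # Assumption: viewers show the conformance letter in lowercase.
    return f"PDF/A-{part}{tags.get('Conformance', '').lower()}"


if __name__ == "__main__":
    sample = (
        "---- XMP-pdfaid ----\n"
        "Part                            : 3\n"
        "Conformance                     : U\n"
    )
    print(pdfa_label(sample))
```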



StarGeek

What I meant was: how does that PDF display in the program you used in the screenshot?  Does it show up as PDF/A-3u in that program?

roadowl

Ah, sorry.
Yes, its properties show "PDF/A-3u". As does exiftool, only it shows a capital 'A' and 'U'. Maybe that is confusing?
As for its contents, my PDF reader shows only the last two pages (bug filed with Linux Mint's xreader, which is based on poppler). Mozilla shows the entire contents.