I've been working with some PDFs and learning how to edit them. One thing I haven't found an easy way to do is check whether a PDF already has bookmarks or whether I need to add them. Right now, I'm stuck with using this command:
qpdf --json file.pdf | jq ".outlines" | grep -Poi "Title\": "
It uses qpdf to dump all the data in JSON format, then jq (aka sed for JSON data, https://stedolan.github.io/jq/) to pull out the "outlines" structure, and finally grep to see if there are any title entries. I then check the return code: 0 = bookmarks, 1 = no bookmarks. A bit much when I'm just looking for a True/False answer.
If this wouldn't be an easy add then this can be ignored.
edit: Just figured out I could use pdftk (https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/) to remove a step
pdftk file.pdf dump_data | grep "BookmarkBegin"
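For a plain True/False answer, the pdftk check can be wrapped in a small shell function that relies on grep's exit status. This is only a sketch: has_bookmark_records and pdf_has_bookmarks are hypothetical helper names, and it assumes that dump_data prints one BookmarkBegin line per bookmark, as the command above implies.

```shell
# Hypothetical helper: reads `pdftk ... dump_data` text on stdin and
# succeeds (exit 0) if at least one BookmarkBegin record is present.
has_bookmark_records() {
    grep -q '^BookmarkBegin'
}

# Hypothetical wrapper: prints True or False for one PDF file.
# Assumes pdftk is installed and on PATH.
pdf_has_bookmarks() {
    if pdftk "$1" dump_data 2>/dev/null | has_bookmark_records; then
        echo True
    else
        echo False
    fi
}
```

Calling pdf_has_bookmarks file.pdf would then print a single word. Note that a file pdftk cannot read also ends up as False in this sketch.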
Can you upload a sample with a bookmark so I can take a look?
- Phil
Link sent by DM.
Hi StarGeek,
Try this outlines.config file:
%Image::ExifTool::UserDefined = (
'Image::ExifTool::PDF::Root' => {
Outlines => {
SubDirectory => { TagTable => 'Image::ExifTool::UserDefined::Outlines' },
},
},
);
%Image::ExifTool::UserDefined::Outlines = (
Count => { Name => 'NumOutlines' },
);
1; #end
And this command:
exiftool -config outlines.config -numoutlines FILE
I think that the PDF contains bookmarks if NumOutlines exists. It works for the files you sent, but you should probably do more testing.
- Phil
Hello,
I use ExifTool to get information about various collections of PDF files (title, author, keywords/subject, number of pages). I recently added -pagemode.
When bookmarks are available, the value is "UseOutlines". When the value is "UseNone" or is empty, there are no bookmarks.
Quote from: Phil Harvey on April 22, 2023, 10:30:37 PM
It works for the files you sent, but you should probably do more testing.
A quick check looks good. Many thanks.
Quote from: sevy on April 23, 2023, 01:04:13 AM
When bookmark is available, the value is "UseOutlines". When the value is "UseNone" or is empty, there is no bookmarks.
A quick check shows that I have PDFs where this isn't the case. Several files have bookmarks (and the above config works on them), but they do not have a PageMode tag.
Running on my Calibre library, I have 926 PDFs. Checking to see which files have a NumOutlines tag from the above config and are either missing a PageMode or have a $PageMode!~/UseOutlines/i results in 435 files. Of those, 60 come back with a NumOutlines of 0 (I'll have to edit the config to return undefined instead of 0). That leaves 375 PDFs that have a usable NumOutlines but do not have a matching PageMode.
I still have to take a closer look at some of these to make sure, but I do not think PageMode gives an accurate result regarding the existence of bookmarks.
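The comparison above could be scripted roughly as follows. This is a sketch under assumptions, not the check actually run on the library: has_outlines and pagemode_disagrees are hypothetical helper names, a missing or zero NumOutlines is treated as "no bookmarks", and the exiftool wiring in the trailing comment assumes the outlines.config file from earlier in the thread.

```shell
# Hypothetical helper: decide whether a PDF has bookmarks from the
# NumOutlines value. An empty value (tag missing), a value that is not
# a plain non-negative integer, or 0 all count as "no bookmarks".
has_outlines() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -gt 0 ]
}

# Hypothetical helper: succeeds when bookmarks exist but PageMode is
# missing or does not contain UseOutlines (case-insensitive), i.e. the
# mismatch described above.
pagemode_disagrees() {
    has_outlines "$1" || return 1
    mode=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    case "$mode" in
        *useoutlines*) return 1 ;;
        *) return 0 ;;
    esac
}

# Example wiring (assumes exiftool is installed; -config must come first):
#   num=$(exiftool -config outlines.config -s3 -NumOutlines file.pdf)
#   mode=$(exiftool -s3 -PageMode file.pdf)
#   pagemode_disagrees "$num" "$mode" && echo "file.pdf: bookmarks, no UseOutlines"
```

Keeping the decision logic separate from the exiftool calls makes it easy to try against values copied from a few known files before running it over a whole library.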
Thanks for pointing that out. I will have to adapt my workflow.