ExifTool Forum

ExifTool => Bug Reports / Feature Requests => Topic started by: StarGeek on April 22, 2023, 02:50:11 PM

Title: REQ: Detect if a PDF has bookmarks
Post by: StarGeek on April 22, 2023, 02:50:11 PM
I've been working with some PDFs and learning how to edit them. One thing I haven't found an easy way to do is check whether a PDF already has bookmarks or whether I need to add them.  Right now, I'm stuck with this command:
qpdf --json file.pdf | jq ".outlines" | grep -Poi "Title\": "

It uses qpdf to dump all the data in JSON format, then jq (https://stedolan.github.io/jq/), aka sed for JSON data, to pull out the "outlines" structure, then grep to see if there are any title entries.  Finally, I check the return code: 0 = bookmarks, 1 = no bookmarks.  A bit much when I'm just looking for a True/False answer.

If this wouldn't be an easy add then this can be ignored.

edit: Just figured out I could use pdftk (https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/) to remove a step:
pdftk file.pdf dump_data | grep "BookmarkBegin"
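
As a rough sketch of how the pdftk check could be turned into a plain True/False test, the helper below reads the dump_data output on stdin and relies only on grep's exit status. The function name has_bookmarks is made up for illustration, and it assumes that each bookmark record in the dump starts with a line beginning with "BookmarkBegin".

```shell
# Hypothetical helper: reads `pdftk FILE dump_data` output on stdin.
# Exit status 0 = at least one bookmark record found, 1 = none.
has_bookmarks() {
    # -q suppresses output; only the exit status is used.
    grep -q '^BookmarkBegin'
}

# Usage (assumes pdftk is installed and file.pdf exists):
#   pdftk file.pdf dump_data | has_bookmarks && echo "bookmarks" || echo "no bookmarks"
```

Because the helper only looks at text on stdin, it can be tried on a saved dump without rerunning pdftk.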
Title: Re: REQ: Detect if a PDF has bookmarks
Post by: Phil Harvey on April 22, 2023, 05:39:54 PM
Can you upload a sample with a bookmark so I can take a look?

- Phil
Title: Re: REQ: Detect if a PDF has bookmarks
Post by: StarGeek on April 22, 2023, 10:00:49 PM
Link sent by DM.
Title: Re: REQ: Detect if a PDF has bookmarks
Post by: Phil Harvey on April 22, 2023, 10:30:37 PM
Hi StarGeek,

Try this outlines.config file:

%Image::ExifTool::UserDefined = (
    'Image::ExifTool::PDF::Root' => {
        Outlines => {
            SubDirectory => { TagTable => 'Image::ExifTool::UserDefined::Outlines' },
        },
    },
);

%Image::ExifTool::UserDefined::Outlines = (
    Count => { Name => 'NumOutlines' },
);

1; #end

And this command:

exiftool -config outlines.config -numoutlines FILE

I think that the PDF contains bookmarks if NumOutlines exists.  It works for the files you sent, but you should probably do more testing.

- Phil
Title: Re: REQ: Detect if a PDF has bookmarks
Post by: sevy on April 23, 2023, 01:04:13 AM
Hello,

I use ExifTool to get info about various collections of PDF files (title, author, keywords/subject, number of pages). I recently added -pagemode.
When bookmarks are available, the value is "UseOutlines". When the value is "UseNone" or is empty, there are no bookmarks.

Title: Re: REQ: Detect if a PDF has bookmarks
Post by: StarGeek on April 23, 2023, 11:17:08 AM
Quote from: Phil Harvey on April 22, 2023, 10:30:37 PM
It works for the files you sent, but you should probably do more testing.

A quick check looks good.  Many thanks.
Title: Re: REQ: Detect if a PDF has bookmarks
Post by: StarGeek on April 23, 2023, 12:09:20 PM
Quote from: sevy on April 23, 2023, 01:04:13 AM
When bookmarks are available, the value is "UseOutlines". When the value is "UseNone" or is empty, there are no bookmarks.

A quick check shows that I have PDFs where this isn't the case.  Several files have bookmarks (and the above config works on them) but do not have a PageMode tag.

I ran this on my Calibre library, which has 926 PDFs.  Checking which files have a NumOutlines tag from the above config and are either missing PageMode or have a $PageMode!~/UseOutlines/i results in 435 files.  Of those, 60 come back with a NumOutlines of 0 (I'll have to edit the config to return not defined instead of 0).  That leaves 375 PDFs that have a usable NumOutlines but do not have a matching PageMode.

I still have to take a closer look at some of these to make sure but I do not think PageMode gives an accurate result regarding the existence of bookmarks.
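
Until the config is changed, the "0 means no bookmarks" case can be handled on the shell side. This is only a sketch: the helper name has_outlines is invented here, and it assumes the value comes from the outlines.config posted earlier in the thread, with an empty string standing in for a missing NumOutlines tag.

```shell
# Hypothetical helper: interpret the NumOutlines value printed by exiftool.
# An empty value (tag missing) or 0 is treated as "no bookmarks".
has_outlines() {
    case "$1" in
        ''|0) return 1 ;;   # missing tag or zero count
        *)    return 0 ;;   # any other value counts as having bookmarks
    esac
}

# Usage (assumes exiftool is installed and outlines.config is in the current directory):
#   n=$(exiftool -config outlines.config -s3 -NumOutlines file.pdf)
#   has_outlines "$n" && echo "bookmarks" || echo "no bookmarks"
```

The -s3 option makes exiftool print only the tag value, so a file without the tag yields an empty string.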
Title: Re: REQ: Detect if a PDF has bookmarks
Post by: sevy on April 24, 2023, 12:34:53 PM
Thanks for pointing that out. I will have to adapt my workflow.