I was recently trying to add bookmarks to a PDF I'd generated with pdftk. It turns out to be fairly simple to add bookmarks to a PDF using Ghostscript, following maggoteer's post to the Ubuntu forums. The syntax is:
$ gs -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=out.pdf in-*.pdf pdfmarks
where out.pdf is the generated PDF, in-*.pdf are the input PDFs, and pdfmarks is a text file with contents like:
[/Title (Title Page) /Page 1 /OUT pdfmark
[/Title (Table of Contents) /Page 3 /OUT pdfmark
...
Nice and easy.
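If you'd rather drive Ghostscript from Python than from the shell, a minimal sketch might look like the following. The function names (build_gs_command, merge) are my own, hypothetical choices; the Ghostscript flags are exactly the ones from the command above. Note that the shell expands in-*.pdf for you, while in Python you have to expand the glob yourself.

```python
import glob
import subprocess


def build_gs_command(output, inputs, pdfmarks):
    """Return the Ghostscript argument list used in the command above.

    The pdfmarks file goes last, after all of the input PDFs.
    """
    return [
        "gs", "-dBATCH", "-dNOPAUSE", "-sDEVICE=pdfwrite",
        "-sOutputFile={}".format(output),
        *inputs,
        pdfmarks,
    ]


def merge(output, pattern, pdfmarks):
    """Expand the input glob (sorted, like the shell) and run Ghostscript."""
    inputs = sorted(glob.glob(pattern))
    subprocess.check_call(build_gs_command(output, inputs, pdfmarks))


# Hypothetical usage, equivalent to the shell command above:
# merge("out.pdf", "in-*.pdf", "pdfmarks")
```

Building the argument list separately from running it makes it easy to print the command and check it before Ghostscript touches anything.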
For nested levels, use the /Count attribute. For example:
[/Count 3 /Title (Chapter 1) /Page 1 /OUT pdfmark
[/Count -2 /Title (Section 1.1) /Page 2 /OUT pdfmark
[/Title (Section 1.1.1) /Page 3 /OUT pdfmark
[/Title (Section 1.1.2) /Page 4 /OUT pdfmark
[/Count -1 /Title (Section 1.2) /Page 5 /OUT pdfmark
[/Title (Section 1.2.1) /Page 6 /OUT pdfmark
[/Title (Section 1.3) /Page 7 /OUT pdfmark
The argument to /Count gives the number of immediately subordinate
bookmarks. The sign of the argument sets the default display
(negative for closed, positive for open).
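Counting children by hand gets tedious, so here is a rough sketch of generating the pdfmark lines from a nested outline. The tuple layout (title, page, is_open, children) and the function names are assumptions of mine for illustration, and I'm assuming plain ASCII titles; this is not necessarily how pdf-merge.py does it.

```python
def ps_string(text):
    """Wrap text as a PostScript string literal.

    Backslashes and parentheses are backslash-escaped. Balanced
    parentheses don't strictly need it, but escaping them is harmless.
    """
    for char in ("\\", "(", ")"):
        text = text.replace(char, "\\" + char)
    return "({})".format(text)


def outline_to_pdfmarks(nodes):
    """Return pdfmark lines for a list of (title, page, is_open, children)."""
    lines = []
    for title, page, is_open, children in nodes:
        parts = []
        if children:
            # /Count is the number of immediate children; a negative
            # value means the bookmark is closed by default.
            count = len(children) if is_open else -len(children)
            parts.append("/Count {}".format(count))
        parts.append("/Title {}".format(ps_string(title)))
        parts.append("/Page {}".format(page))
        parts.append("/OUT pdfmark")
        lines.append("[" + " ".join(parts))
        # Children follow their parent directly, depth first.
        lines.extend(outline_to_pdfmarks(children))
    return lines


outline = [
    ("Chapter 1", 1, True, [
        ("Section 1.1", 2, False, [
            ("Section 1.1.1", 3, True, []),
            ("Section 1.1.2", 4, True, []),
        ]),
        ("Section 1.2", 5, False, [
            ("Section 1.2.1", 6, True, []),
        ]),
        ("Section 1.3", 7, True, []),
    ]),
]
print("\n".join(outline_to_pdfmarks(outline)))
```

Run on this outline, the sketch reproduces the seven-line example above.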
You can also set up the document info dictionary with something like:
[ /Title (My Test Document)
/Author (John Doe)
/Subject (pdfmark 3.0)
/Keywords (pdfmark, example, test)
/DOCINFO pdfmark
If you want more detail, take a look at Adobe's pdfmark reference.
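The info dictionary is just as easy to generate. A small sketch, assuming ASCII values and a hypothetical helper name (docinfo_pdfmark) of my own:

```python
def docinfo_pdfmark(info):
    """Format a dict of document-info fields as a /DOCINFO pdfmark."""
    def ps_string(text):
        # Backslash-escape backslashes and parentheses for PostScript.
        for char in ("\\", "(", ")"):
            text = text.replace(char, "\\" + char)
        return "({})".format(text)

    entries = [
        "/{} {}".format(key, ps_string(value))
        for key, value in info.items()
    ]
    return "[ " + "\n".join(entries + ["/DOCINFO pdfmark"])


print(docinfo_pdfmark({
    "Title": "My Test Document",
    "Author": "John Doe",
}))
```

The result can be appended to the same pdfmarks file as the bookmark entries.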
I've bundled the whole pdfmarks-generation bit into a script,
pdf-merge.py, which generates the pdfmark file and runs
Ghostscript automatically. Think of it as a bookmark-preserving
version of pdftk's cat. The script uses pdftk internally to
extract bookmark information from the source PDFs.
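To give a feel for the extraction step, here is a rough sketch of parsing bookmarks out of pdftk's dump_data output. I'm assuming the record format that recent pdftk versions print (a BookmarkBegin line followed by BookmarkTitle, BookmarkLevel, and BookmarkPageNumber lines); older versions may differ, and the function names are hypothetical rather than taken from pdf-merge.py.

```python
import subprocess

REQUIRED = {"BookmarkTitle", "BookmarkLevel", "BookmarkPageNumber"}


def parse_bookmarks(dump_data):
    """Return (title, level, page) tuples from pdftk dump_data text."""
    records = []
    current = None
    for line in dump_data.splitlines():
        if line.strip() == "BookmarkBegin":
            current = {}
            records.append(current)
        elif line.startswith("Bookmark") and current is not None:
            # Split on the first colon only; titles may contain colons.
            key, _, value = line.partition(":")
            current[key] = value.strip()
    return [
        (r["BookmarkTitle"], int(r["BookmarkLevel"]),
         int(r["BookmarkPageNumber"]))
        for r in records
        if REQUIRED <= r.keys()  # skip incomplete records
    ]


def read_bookmarks(pdf_path):
    """Run `pdftk <pdf> dump_data` and parse its bookmark records."""
    output = subprocess.check_output(["pdftk", pdf_path, "dump_data"])
    return parse_bookmarks(output.decode("ascii", errors="replace"))


sample = """\
NumberOfPages: 7
BookmarkBegin
BookmarkTitle: Chapter 1: Intro
BookmarkLevel: 1
BookmarkPageNumber: 1
BookmarkBegin
BookmarkTitle: Section 1.1
BookmarkLevel: 2
BookmarkPageNumber: 2
"""
print(parse_bookmarks(sample))
```

When merging several PDFs, each page number would still need to be shifted by the total page count of the files that come before it.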
The script also adds a bit of PostScript to ignore any bookmarks in
the source PDFs during the Ghostscript run. The only bookmarks in the
output will be the ones you specify explicitly in the pdfmarks file.
If for some reason the automatically generated pdfmarks are not quite
what you want, the script can pause (via --ask) to allow you to
tweak the pdfmarks manually before running Ghostscript.