Ringlord Technologies Products

The "Linux LaTeX-PDF HOW-TO" by Udo Schuermann

Revised: 2-Jul-2003
Revised: 24-Apr-2003
Revised: 17-Jan-2003
Revised: 9-Aug-2002
Revised: 8-Apr-2002
Revised: 24-Jul-2001
Revised: 18-Jun-2001
Revised: 17-April-2001
Revised: 16-April-2001
Revised: 15-April-2001

HTML is well-positioned for delivering information with contextual formatting to a wide variety of rendering agents ("browsers"). In most cases HTML is more than sufficient for this task, but some occasions indisputably demand more control over formatting and a document feature set that exceeds the "one-page fits all" model of HTML. In such cases LaTeX and the Portable Document Format (PDF) offer a perfect solution!

Why not stay with TeX/LaTeX and the .dvi (device independent) format? Most people in the academic community are perfectly comfortable with .dvi files, knowing already about tools such as dvips(1), kdvi, xdvi, etc. that allow printing and viewing of .dvi files. In the Microsoft Windows world, however, technical expertise is often sorely lacking (let's face it: Microsoft sells to people who would mistake MS Word for a DTP program) so .dvi files are not an option when addressing the general public.

And this brings us to the purpose of this document: how to create high-quality (e.g. fast-displaying) PDF documents from your LaTeX sources. All of that capability is available completely free of charge, and I've done all the work for you with this comprehensive how-to document. The resulting documents will offer the following benefits to those who read your work:

1. Fast-rendering, high quality fonts ("Computer Modern"),
4. High-quality thumbnail/preview images for your document's pages,
5. Hyperlinks within the document and for external materials using the web-browser.

The combination of tools available allows you to generate documents whose visual quality and feature-completeness can rival anything you can get out of commercial tools on other platforms. What's more, few of the documents created with Adobe Distiller are as feature-rich as what you get with these tools. And with Linux it's all free. No surprise at all.

Let's get started!

What You Need

Linux is not required, but given the context of this document I make the assumption here that you have an operating system that is basically Unix-like. Solaris, BSD, Linux… they should all work the same for this purpose:

1. You need an installation of TeX as well as the LaTeX macro package. Enter the command "latex" at the shell prompt. Do you get something like the following?

This is TeX, Version 3.14159 (Web2C 7.3.1)
**

If so, then press control-d now and ignore TeX's confused reply. You have TeX and LaTeX installed.

2. Now type "thumbpdf". Do you get something like the following?

THUMBPDF 1.4, 22.04.1999 - Copyright (c) 1999 by Heiko Oberdiek.
*** make "thumbpdf.pdf" / run pdftex ***
*** parse "thumbpdf.pdf" ***

If so, then you have the excellent THUMBPDF program installed and have all the pieces that you really need! Of course, you don't need the THUMBPDF program: without it your PDF files will have only blank thumbnails for the pages. Few people seem to bother with those anyway, probably because they don't know how to create them…

Note: If your version of THUMBPDF is later than 1.11 (and certainly if it is 2.10 or later) then you need to use the --v2 parameter on the script provided in Figure 8: earlier versions of THUMBPDF created (and relied upon) files named "thumb???.png" but this pattern has now changed to "thb?.png".

A Quickie Test

To see if you have all the right pieces in place, create yourself a working directory (perhaps called "tex") and then cut-and-paste the following text into a file in that directory named "test.tex":

Figure 1: test.tex, A Small Test Document

    \documentclass{article}
    \usepackage{thumbpdf}
    \usepackage[pdftex,
            colorlinks=true,
            urlcolor=rltblue,       % \href{...}{...} external (URL)
            filecolor=rltgreen,     % \href{...} local file
            linkcolor=rltred,       % \ref{...} and \pageref{...}
            pdftitle={Untitled},
            pdfauthor={Your Name},
            pdfsubject={Just a test},
            pdfkeywords={test, testing, testable stuff},
            pdfproducer={pdfLaTeX},
            pdfadjustspacing=1,
            pagebackref,
            pdfpagemode=None,
            bookmarksopen=true]{hyperref}
    \usepackage{color}
    \definecolor{rltred}{rgb}{0.75,0,0}
    \definecolor{rltgreen}{rgb}{0,0.5,0}
    \definecolor{rltblue}{rgb}{0,0,0.75}

    \title{Untitled}
    \author{Your Name}
    \date{\today}

    \begin{document}\label{start}
    \maketitle
    \section{Section One}
    A test document from
    \href{http://www.Ringlord.com/}{Ringlord Technologies}!
    \subsection{Subsection One Dot One}
    Hello, this is section 1.1
    \subsection{Subsection One Dot Two}
    And here we have section 1.2; here's a link to
    \href{Other.pdf}{another Document (Other.pdf)}, which probably
    does not exist (but that's OK).
    \section{Section Two}
    You could also click on the following number to jump to the
    first page, namely page \pageref{start}\ldots
    \label{end}\end{document}

Figure 2: Acrobat Reader 4.0 showing our test document. At this resolution it may not be obvious, but the outline in the left column reflects the section and subsections of the text on the page.

Now run the command

pdflatex test

twice in a row to ensure that the \pageref references are resolved. If you are not familiar with the way TeX and LaTeX work, you should know at least this:

TeX and LaTeX work as a compiler does: they parse the document specification and build output from that. As LaTeX encounters \label commands (markers whose section and/or page number you can reference) it writes these to a file for future reference. Any forward references (say a reference on page 1 to an object somewhere else, maybe on page 50) are not yet known at the time that the reference on page 1 is processed.

When you run the program a second time, it can fill in the reference on page 1 using the label that it recorded on page 50 during the previous run.
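
The two-pass behaviour is easy to observe with a tiny document of your own. The following is only a minimal sketch; the file contents and the label name "sec:details" are made up for illustration and are not part of the test document:

```latex
% forward.tex: a forward reference that needs two runs to resolve
\documentclass{article}
\begin{document}

\section{Introduction}
% On the first run the label below has not been seen yet, so LaTeX
% prints "??" here and warns about undefined references.
The details are in section~\ref{sec:details}
on page~\pageref{sec:details}.

\newpage

\section{Details}\label{sec:details}
% When this \label is processed, its section and page number are
% written to the auxiliary file (forward.aux) for the next run.
Here are the details.

\end{document}
```

After the first run the reference on page 1 shows question marks; the second run reads the auxiliary file written by the first and fills in the real numbers.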

But back to our test.tex file: You should get several dozen lines of output from the program, most or all of which you can ignore, and then the shell prompt back. If you get errors, or the prompt does not return, check that you copied and pasted the entire test document from Figure 1 correctly.

If everything went as planned then you should now have a simple PDF file named "test.pdf" (yes, test.tex --> test.pdf, that is the expected pattern). You could now blindly trust that your PDF document is ready for distribution. Instead, it would be good to check it. Further below I'll also show you how to add thumbnail images to the PDF for some extra karma.

Try the following commands in succession. As soon as one of them works you should have a way of verifying your documents. I'll leave it up to you to research the operation of these programs:

2. xpdf test.pdf
3. gs test.pdf

If you have Adobe Acrobat Reader (acroread) installed then you have access to a set of features of the PDF document that the other programs may not make fully accessible to you. Figure 2 shows you what this document looks like on my 1600x1200 sized display screen. Note the bookmark outline in the left column. And if you look closely you'll notice that there is some blue (web link) and green (external document link) text, too. There is a red page number, too, that links to a page in this document, but it's too small to read at the resolution of that image. You should be able to see that on your own copy, though!

At this point you've managed to produce a fairly nice, if simple, PDF document. In the next sections I'll show you how to add some cool features.

PDF PlusPlus

This is a good time to examine the various parts of that "test.tex" (Figure 1) file to get an idea of what it is that got us this far. This will also let you customize the output. Here is that relevant section:

Figure 3: Hyperlinking for PDF

    \usepackage[pdftex,
            colorlinks=true,
            urlcolor=rltblue,       % \href{...}{...} external (URL)
            filecolor=rltgreen,     % \href{...} local file
            linkcolor=rltred,       % \ref{...} and \pageref{...}
            pdftitle={Untitled},
            pdfauthor={Your Name},
            pdfsubject={Just a test},
            pdfkeywords={test testing testable},
            pdfadjustspacing=1,
            pagebackref,
            pdfpagemode=None,
            bookmarksopen=true]{hyperref}

The hyperref macro package provides you with the ability to specify hyperlinks in your document. The package is configured with a series of comma-separated options in the (optional) [ ]-list:

pdftex
Regardless of whether you use pdftex or pdflatex, you should use "pdftex" here.
colorlinks=true
This changes hyperlinks from an ugly outline-box to coloured text. The colours are specified in the following options:
urlcolor=rltblue
The colour of external resources. These would be specified like \href{http://www.Ringlord.com/publications/latex-pdf-howto.html}{The Ringlord Technologies LaTeX-PDF HowTo}
filecolor=rltgreen
The colour of documents on the local system. If you created a set of PDF files you could link back and forth between them. The \href format is the same, except that you would use a normal filename rather than an "http://" URL.
linkcolor=rltred
The colour of references internal to the current document: section and page numbers.
pdftitle, pdfauthor, pdfsubject, pdfkeywords
Title, author, subject, and keywords for the document. These are accessible (in Acrobat Reader) from the Document Info menu item.
pdfadjustspacing=1
Ensures that pdfLaTeX uses the same spacing as LaTeX.
pagebackref
Creates links to page numbers in the bibliography. No effect if you have no bibliography, so there's no pain if you specify it and don't use it.
pdfpagemode=None
The document will open without the navigation outline. In Figure 2 we manually opened that outline.
bookmarksopen=true
The whole bookmark tree is expanded for maximum exposure of all deeper elements. This works regardless of whether "pdfpagemode" causes the navigation outline to be initially visible or not.
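
If the long option list gets unwieldy, hyperref also lets you set many of these options after the package is loaded, using its \hypersetup command. The following is a minimal sketch of that alternative and not the form used in Figure 1; it sets only the colour and document-information options, and the label name "sec:links" is made up for illustration:

```latex
\documentclass{article}
\usepackage{color}
\definecolor{rltred}{rgb}{0.75,0,0}
\definecolor{rltgreen}{rgb}{0,0.5,0}
\definecolor{rltblue}{rgb}{0,0,0.75}

% The driver option stays in the \usepackage line ...
\usepackage[pdftex]{hyperref}
% ... while colours and document information move to \hypersetup.
\hypersetup{
  colorlinks=true,
  urlcolor=rltblue,       % \href to a URL
  filecolor=rltgreen,     % \href to a local file
  linkcolor=rltred,       % \ref and \pageref
  pdftitle={Untitled},
  pdfauthor={Your Name},
  pdfsubject={Just a test},
  pdfkeywords={test, testing, testable stuff}
}

\begin{document}
\section{Links}\label{sec:links}
External: \href{http://www.Ringlord.com/}{Ringlord Technologies}.
Local file: \href{Other.pdf}{Other.pdf}.
Internal: section~\ref{sec:links} on page~\pageref{sec:links}.
\end{document}
```

Compiled with pdflatex (twice), this should show one link in each of the three colours.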

Now, you are probably wondering what's with these "rltred", "rltgreen", and "rltblue" colours. These are defined after (and could be defined before) the hyperref package, but must be defined after the color package is included. Of note here is the fact that I'm using a somewhat darker green than usual because green on a white computer monitor background gets washed-out too easily.

Figure 4: Defining Colours by Name

    \usepackage{color}
    \definecolor{rltred}{rgb}{0.75,0,0}
    \definecolor{rltgreen}{rgb}{0,0.5,0}
    \definecolor{rltblue}{rgb}{0,0,0.75}
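
Named colours are not limited to hyperlinks; the color package's \textcolor command accepts them as well. As a small sketch (the sentences are only illustrative), this lets you compare the predefined "green" with the darker "rltgreen" on your own screen:

```latex
\documentclass{article}
\usepackage{color}
% Each rgb component is a value between 0 and 1.
\definecolor{rltgreen}{rgb}{0,0.5,0}

\begin{document}
Predefined green: \textcolor{green}{this text uses the stock colour.}

Darker green: \textcolor{rltgreen}{this text uses the custom colour.}
\end{document}
```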

Mixing PDF and LaTeX Commands

If you tried to compile "test.tex" as a normal LaTeX document, you may have noticed (with some dismay) that it no longer compiled. That's because the various references to PDF constructs aren't understood by plain LaTeX.

There's an easy way to fix that, however. It involves a bit of TeX code:

Figure 5: Flexible generation of both PDF and DVI files

    \newif\ifpdf
    \ifx\pdfoutput\undefined
      \pdffalse              % we are not running PDFLaTeX
    \else
      \pdfoutput=1           % we are running PDFLaTeX
      \pdftrue
    \fi
    %----------------------------------------------------------------------
    % The following is the construct that interests us in the end:
    \ifpdf
      % Put PDF-specific stuff here
    \else
      % Put LaTeX-specific stuff here
    \fi

If that doesn't make a lot of sense to you, don't worry about it. The last five lines are the interesting stuff. It's basically an if-then-else statement that allows us to do some stuff when compiling the document for PDF, and something else for LaTeX. I'll have another example for you later (see Figure 8) that includes all the tricks. For now, just keep reading:
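
To make the \ifpdf construct concrete, here is a minimal sketch of a complete document that picks driver-specific options depending on which program compiles it. Note the assumptions: it loads the separate ifpdf package (not mentioned above, and assumed to be installed) as a stand-in for the hand-written test, and the graphicx package and the dvips driver option are illustrative choices rather than part of the recipe in this document:

```latex
\documentclass{article}

% Assumption: the ifpdf package is available. It defines \ifpdf,
% which is true when the output is PDF and false when it is DVI.
\usepackage{ifpdf}

\ifpdf
  % PDF-specific setup
  \usepackage[pdftex]{graphicx}
  \usepackage[pdftex,colorlinks=true]{hyperref}
\else
  % LaTeX (DVI) setup
  \usepackage[dvips]{graphicx}
  \usepackage[dvips]{hyperref}
\fi

\begin{document}
\ifpdf
  This copy was produced by pdfLaTeX.
\else
  This copy was produced by LaTeX (DVI output).
\fi

See \href{http://www.Ringlord.com/}{Ringlord Technologies}.
\end{document}
```

The same source should then compile under both "latex" and "pdflatex", with each run taking only the branch meant for it.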

The THUMBPDF program takes as its input a PDF file. From that it can generate a series of PNG images (what, you're still using GIF?! Shame on you!) which it then merges into the PDF file. This works well enough, but THUMBPDF doesn't create terribly nice looking snapshots of the pages: there is no antialiasing. Does that mean THUMBPDF is useless? Not at all! The author of THUMBPDF is a smart person and has allowed us to skip the generation of PNG images, thereby accepting existing images, or images that we've generated with another program! Haha, flexible software. Gotta love it!

So, what program should we use to generate high-quality scaled representations of our PDF document's pages?

The combination that has worked very well for me is Ghostscript to take snapshots of the pages as JPEG images, and The GIMP to scale and post-process these images into PNG files that THUMBPDF can read.

What, The GIMP is slow? It's true that The GIMP is a large, large program and takes about five seconds (on my system) to load. But there's a way to get it to process all of our pages without reloading the program so we pay this slow-loading penalty only once per document. That's not bad. Once loaded The GIMP isn't all that slow, so let's go with that solution. If you're adventurous you can use the PPM utilities, ImageMagick, or whatever tool you prefer.

Using The GIMP gives me a chance to introduce non-interactive use of that program, which took me a while to figure out and get right. Why not teach you that, too? :-)

Using Ghostscript to take Page Snapshots

The following invocation of Ghostscript (gs) will take an inputfile.pdf and generate a 36 DPI snapshot of each of the file's pages, writing these to files named 1.jpeg, 2.jpeg, etc.:

Figure 6: Getting Ghostscript to Make Snapshots

    gs -q -NOPAUSE -sDEVICE=jpeg -sOutputFile=%d.jpeg -r36x36 \
       "inputfile.pdf" &>/dev/null <

But wait, I've already incorporated this in a nifty tool (see Figure 8). We'll now proceed to using The GIMP to perform automated processing of multiple image files from the command line. Yes, this little recipe is useful for just about any command-line usage of The GIMP:

Processing Multiple Images With the GIMP From the Command Line

Store the following file as "scale-file.scm" in your GIMP scripts directory. If you have GIMP 1.2 installed then there should be a directory .gimp-1.2 in your home directory. Within that is a directory named scripts, so you'd store the following file as "~/.gimp-1.2/scripts/scale-file.scm":

Figure 7: Scaling an Image with The GIMP

    ; Retrieves a file, scales and post-processes it, then writes
    ; it to another file. Store this in your ~/.gimp-1.2/scripts
    ; directory; and this is how you call it:
    ;   (scale-file "page-1.jpeg" "thumb001.png" 82 106)
    (define (scale-file source-name dest-name width height)
      (set! img (car (gimp-file-load 1 source-name source-name)))
      (gimp-image-undo-disable img)
    ; (gimp-display-new img)
      (gimp-image-scale img width height)
      (plug-in-unsharp-mask 1 0 (car (gimp-image-active-drawable img))
                            3 0.5 50)
      (set! drawable (car (gimp-image-flatten img)))
      (gimp-file-save 1 img drawable dest-name dest-name)
      (gimp-image-undo-enable img)
    )

As the comment in this file explains, the script accepts an input file and an output file, as well as the desired dimensions for the image to be written to the output file. After scaling, it performs an "unsharp mask" operation to undo some of the blurring effect caused by the down-scale operation. This is especially valuable for small thumbnails: they will have a slightly heightened contrast, which improves their appearance.

Note, the "1" as the first argument in the commands refers to the fact that this script is to run in non-interactive mode. I've commented out the (gimp-display-new img) command; if you ran this in interactive mode it would not just prompt you for various parameters but would also produce an image window to show you the result. But we definitely do not want to bother with visible stuff and interactive operations in this project. Onwards we go:

A Totally Cool Unified PDF-Thumbnail Tool

The program in Figure 8 puts the previous two together into a unifying whole. The effect is that we give this program a PDF file and it spits out a bunch of scaled and nicely post-processed PNG images that are ready to use by the THUMBPDF program for insertion into our PDF document.

There will be a slight delay while Ghostscript does its work. Then there is a slightly longer delay (probably five seconds or more) while The GIMP is loaded. After that it processes between 10 and 20 page images per second (depends on your processor's speed, of course). So you see, even with a 100 page document you're not likely to feel yourself growing old while waiting for this to finish :-)

Ah, before I forget: There are some shell script variables at the very beginning of the program. You may want to look over them (and the comments) to verify that they are not blatantly wrong or in danger of conflicting with something of your own making.

And finally: as of July 24, 2001 the Totally Cool Unified PDF-Thumbnail Tool creates all the Lispish code described above on its own, so you don't need to store any of it in The GIMP directory. Just make sure that the script knows from where The GIMP loads its scripts…

N.B. This script relies on ThumbPDF, whose filename pattern has undergone changes from release to release. Please search for '@@@' in this script and apply whatever changes may be necessary.

References

I am indebted to the numerous sources of fine (and free!) software, and the uncounted individuals on the internet who have shared their various knowledge with the rest of us. Without their sharing, I could not have gathered all this information here, built the meager tools you find here, and then shared it in turn with others.

1. Patrick Jöckel's How-to make a PDF-document from a LaTeX-source?, 15-Apr-2003.
2. Donald E. Knuth, The TeXbook, Addison-Wesley Publishing Company, 1986.
3. Leslie Lamport, LaTeX: A Document Preparation System, Addison-Wesley Publishing Company, 1986.
4. Heiko Oberdiek, for the THUMBPDF software (a perl script).
5. Tobias Oetiker, Hubert Partl, Irene Hyna, and Elisabeth Schlegl, The Not So Short Introduction to LaTeX (http://www.ctan.org/tex-archive/info/lshort/english/lshort.pdf), April 2001.
6. D. P. Story (http://www.math.uakron.edu/~dpstory/), Using LaTeX to Create Quality PDF Documents (http://www.math.uakron.edu/~dpstory/latx2pdf.html), August 11, 1999.

Thanks go to Andreas Abraham for suggesting the -L / --landscape option.

Thanks go to Jürgen Wenzel, who made me aware of the new filename pattern that later versions of THUMBPDF use, a change which caused the script in Figure 8 to break.

Thanks go to Tom Heinrich as well as George Dowding, both of whom independently pointed out to me that the $THUMBPDF_V2 flag needed to be checked near the end of the script, too. Took me a while to get around to adding that to the script in Figure 8 above.

If this article has helped you, or you have suggestions for improvements or corrections, why not drop us a line? Go to the contacts page. Use the help or status email addresses. Thanks!

All content is copyright © Ringlord Technologies unless otherwise stated. We do encourage deep linking to our site's pages but forbid direct reference to images, software or other non-page resources stored here; likewise, do not embed our content in frames or other constructs that may mislead the reader about the content ownership. Play nice, yes?
