HTMLDOC supports most HTML 3.2 elements and some HTML 4.0 elements, and can generate title and table-of-contents pages. It does not currently support stylesheets.
HTMLDOC can be used as a standalone application, in a batch document processing environment, or as a web-based report generation application.
No restrictions are placed upon the output produced by HTMLDOC.
sgi that generated "compiled" Standard Generalized Markup Language ("SGML") files that could be used by the Electronic Book Technologies ("EBT") documentation products (EBT is now owned by INSO). When sgi stopped supporting these tools we turned to INSO, but the cost of their tools is prohibitive to small businesses.
In the end we decided to write our own program to generate our documentation. HTML seemed to be the source format of choice since WYSIWYG HTML editors are widely (and freely) available and at worst you can use a plain text editor. We needed HTML output for documentation on our web server, PDF for customers to read and/or print from their computers, and PostScript for our own printing needs.
The result of our efforts is the HTMLDOC software, which is available for UNIX® and Microsoft® Windows®. Among other things, this user's manual is produced using HTMLDOC.
This manual is organized into tutorial and reference chapters:
The Graphics Interchange Format is the copyright and GIF(SM) is the service mark property of CompuServe Incorporated.
Compaq, Digital, and Tru64 are registered trademarks of Compaq.
Intel is a registered trademark of Intel Corporation.
IRIX and sgi are registered trademarks of Silicon Graphics, Inc.
Linux is a registered trademark of Linus Torvalds.
MacOS is a registered trademark of Apple Computer, Inc.
Microsoft, Windows, Windows 95, Windows 98, Windows Me, Windows 2000, Windows NT, and Windows XP are registered trademarks of Microsoft Corporation.
Red Hat and RPM are registered trademarks of Red Hat, Inc.
Solaris is a registered trademark of Sun Microsystems, Inc.
SPARC is a registered trademark of SPARC International, Inc.
UNIX is a registered trademark of the X/Open Company, Ltd.
HTMLDOC is copyright 1997-2002 by Easy Software Products. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
A copy of the GNU General Public License is included in Appendix A of this manual. If this appendix is missing from your copy of HTMLDOC, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
This software is based in part on the work of the Independent JPEG Group and FLTK project.
HTMLDOC may compile and run on other platforms; however, we have not tested, and do not provide support for, platforms other than those listed previously.
% dselect install htmldoc-version-linux-2.0-intel.deb ENTER
% dselect remove htmldoc ENTER
% rpm -i htmldoc-version-linux-2.0-intel.rpm ENTER
% rpm -e htmldoc ENTER
% gunzip htmldoc-version-platform.tar.gz ENTER
% tar xf htmldoc-version-platform.tar ENTER
% ./setup ENTER
Substitute the correct version and platform strings as appropriate.
% /etc/software/htmldoc.remove ENTER
Double-click on the HTMLDOC package in the Finder and follow the installer prompts.
Secure (https) URL support can be enabled via the OpenSSL library. You should use at least version 0.9.6.
If necessary, set the CC environment variable to the name and path of your ANSI C compiler:
% setenv CC /path/to/compiler ENTER [C Shell]
% CC=/path/to/compiler; export CC ENTER [Bourne/Korn Shell]
Similarly, if your C++ compiler is not called CC, gcc, c++, or g++, set the CXX environment variable to the name and path of your C++ compiler:
% setenv CXX /path/to/compiler ENTER [C Shell]
% CXX=/path/to/compiler; export CXX ENTER [Bourne/Korn Shell]
Then run the following command to configure HTMLDOC for installation in the default directories:
% ./configure ENTER
The default configuration will install HTMLDOC in the /usr/bin directory with the data files under /usr/share/htmldoc and the documentation and on-line help under /usr/share/doc/htmldoc. Use the --prefix option to change the installation prefix, for example to /usr/local:
% ./configure --prefix=/usr/local ENTER
If the FLTK library is not installed in a standard location for your compilers, use the --with-fltk-includes and --with-fltk-libs options to point to the FLTK library:
% ./configure --with-fltk-libs=/path/to/fltk/lib \
    --with-fltk-includes=/path/to/fltk ENTER
Finally, if the OpenSSL library is not installed in a standard location for your compilers, use the --with-openssl-includes and --with-openssl-libs options to point to the OpenSSL library:
% ./configure --with-openssl-libs=/path/to/openssl/lib \
    --with-openssl-includes=/path/to/openssl ENTER
% make ENTER
If you get any fatal errors, please subscribe to the HTMLDOC mailing list and send a copy of the make/compiler output to "htmldoc@easysw.com" for assistance. Please note the version of HTMLDOC that you are using as well as any pertinent system information (operating system, OS version, compiler, etc.).
To subscribe to the HTMLDOC mailing list, send a message to "majordomo@easysw.com" with the text:
subscribe htmldoc
in the message body. You must subscribe to the list to post questions and comments.
% make install ENTER
If you are installing in a restricted directory like /usr then you'll need to be logged in as root.
To install HTMLDOC without InstallShield, create an installation directory and copy the htmldoc.exe executable, the afm directory, the data directory, and the doc directory to it.
Then use the regedit program to create the following two string entries:
HKEY_LOCAL_MACHINE\Software\Easy Software Products\HTMLDOC\data
HKEY_LOCAL_MACHINE\Software\Easy Software Products\HTMLDOC\doc
This chapter describes how to start HTMLDOC and convert HTML files into PostScript and PDF files.
Note:
HTMLDOC currently does not support HTML 4.0 features such as
stylesheets or the |
To start HTMLDOC under UNIX type:
% htmldoc ENTER
Choose HTMLDOC from the Start menu to start HTMLDOC under Windows.
The HTMLDOC window (Figure 2-1) shows the list of input files that will be converted. Start by clicking on the Web Page radio button (1) to specify that you will be converting an HTML web page file.
Figure 2-1 - The HTMLDOC Window
Then choose a file for conversion by clicking on the Add Files... button (2). When the file chooser dialog appears (Figure 2-2), double-click on the HTML file (3) you wish to convert from the list of files.
Figure 2-2 - The File Chooser Dialog
Now that you've chosen an HTML file to convert, click on the Output tab (4) to set the output file (Figure 2-3). Type the name of the output file into the Output Path field or click on the Browse... button (5) to select the output file using the file chooser.
Figure 2-3 - The Output Tab
Since you chose to convert a Web Page instead of a book, HTMLDOC has automatically chosen to produce a PDF file.
HTMLDOC uses HTML heading elements to delineate chapters and headings in a book. The H1 element is used for chapters:
    <HTML>
    <HEAD>
    <TITLE>The Little Computer that Could</TITLE>
    </HEAD>
    <BODY>
    <H1>Chapter 1 - The Little Computer is Born</H1>
    ...
    <H1>Chapter 2 - Little Computer's First Task</H1>
    ...
    </BODY>
    </HTML>
Sub-headings are marked using the H2 through H6 elements.
Note:
When using book mode, HTMLDOC starts rendering with the first |
Start by clicking on the Book radio button (1) to specify you'll be converting one or more HTML files into a book.
Then choose one or more files for conversion by clicking on the Add Files... button (2). When the file chooser dialog appears, pick the file(s) you wish to convert from the list of files and then click on the OK button.
Figure 3-1: The Input Tab
HTMLDOC supports automatic generation of a title page using an image file, the title text, and other META information. Type the title image filename into the Title File field or click on the Browse... button (3) to select a title image for your book. HTMLDOC can also use an HTML file that you have generated for the title page(s). To use an HTML title page, type the title filename into the Title File field or click on the Browse... button (3) to select an HTML file for your book.
Figure 3-2: The Output Tab
The output format is set in the Output tab (4). Click on the Output tab and then click on the HTML, PS, or PDF radio buttons to set the output format.
Now that you've chosen an output format, type the name of the output file into the Output Path field or click on the Browse... button (5) to select the output file using the file chooser.
Once you have chosen the output file you can generate it by clicking on the Generate button (6) at the bottom of the HTMLDOC window.
HTMLDOC can save the list of HTML files, the title file, and all other options to a special .BOOK file so you can regenerate your book when you make changes to your HTML files.
Click on the Save button (7) to save the current book to a file.
This chapter describes how to use HTMLDOC from the command-line to convert web pages and generate books.
Note: The free version of HTMLDOC for Windows does not include the command-line program.
To convert a single web page type:
% htmldoc --webpage -f output.pdf filename.html ENTER
% htmldoc --webpage -f output.ps filename.html ENTER
To convert more than one web page with page breaks between each HTML file, type:
% htmldoc --webpage -f output.pdf file1.html ... fileN.html ENTER
% htmldoc --webpage -f output.ps file1.html ... fileN.html ENTER
The --webpage option tells HTMLDOC that you want to convert web pages or other unstructured HTML files. You can also use --continuous to convert multiple HTML files without page breaks between files and --book to convert structured HTML files with headings into a book with a table of contents. The default document type is --book.
The -f option tells HTMLDOC the file to generate. If you don't specify an output file, a PDF file is sent to the standard output. The output.pdf and output.ps arguments are the names of the output files you want to generate. The .pdf extension specifies that you want to generate a PDF file, while the .ps extension specifies PostScript output.
The filename.html, file1.html, and fileN.html arguments are the input HTML files you want to convert. The HTML files can also be URLs, for example:
% htmldoc --webpage -f output.pdf http://slashdot.org/ ENTER
% htmldoc --webpage -f output.ps http://freshmeat.net/ http://easysw.com/ ENTER
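For batch jobs, the single-page conversion can be wrapped in a loop. The following is a minimal sketch rather than part of HTMLDOC: the convert_pages function name and the HTMLDOC variable are assumptions introduced for illustration, and only the --webpage and -f options described in this chapter are used.

```shell
#!/bin/sh
# Name of the converter program; overridable so the loop can be dry-run.
HTMLDOC=${HTMLDOC:-htmldoc}

# convert_pages FILE...  (hypothetical helper)
# Convert each HTML file to a PDF with the same base name,
# stopping at the first conversion that fails.
convert_pages() {
    for page in "$@"; do
        # "${page%.html}" strips a trailing .html before .pdf is appended.
        "$HTMLDOC" --webpage -f "${page%.html}.pdf" "$page" || return 1
    done
}
```

Calling convert_pages with a list of files such as index.html and news.html would then run one htmldoc command per file, writing index.pdf and news.pdf next to the sources.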
Type one of the following commands to generate a book from one or more HTML files:
% htmldoc --book -f output.html file1.html ... fileN.html ENTER
% htmldoc --book -f output.pdf file1.html ... fileN.html ENTER
% htmldoc --book -f output.ps file1.html ... fileN.html ENTER
where output.html, output.pdf, and output.ps are the names of the files you want to generate, and file1.html to fileN.html are the HTML files you want to use for the book.
The --book option tells HTMLDOC that you want to generate a book from the HTML file(s) you specified.
The -f option tells HTMLDOC what file to generate. If you don't specify an output file then a PDF file is sent to the standard output.
HTMLDOC will build a table of contents for the book using the heading elements (H1, H2, etc.) in your HTML files. It will also add a title page using the document TITLE text and other META information you supply in your HTML files. See Chapter 6 - HTML Reference for more information on the META variables that are supported.
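When a book is published in all three formats, the three commands differ only in the output file name. A minimal sketch of a wrapper follows; the build_book function and the HTMLDOC variable are hypothetical names chosen for this example, and the only options used are --book and -f as described above.

```shell
#!/bin/sh
# Name of the converter program; overridable so the function can be dry-run.
HTMLDOC=${HTMLDOC:-htmldoc}

# build_book BASENAME FILE...  (hypothetical helper)
# Generate BASENAME.html, BASENAME.pdf, and BASENAME.ps from the same
# list of input files, stopping at the first format that fails.
build_book() {
    base=$1
    shift
    for ext in html pdf ps; do
        "$HTMLDOC" --book -f "$base.$ext" "$@" || return 1
    done
}
```

For example, build_book manual file1.html file2.html would produce manual.html, manual.pdf, and manual.ps from the same two chapters.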
Note:
When using book mode, HTMLDOC starts rendering with the first |
The --titlefile option sets the HTML file or image to use on the title page:
% htmldoc --titlefile filename.bmp ... ENTER
% htmldoc --titlefile filename.gif ... ENTER
% htmldoc --titlefile filename.jpg ... ENTER
% htmldoc --titlefile filename.png ... ENTER
% htmldoc --titlefile filename.html ... ENTER
HTMLDOC supports BMP, GIF, JPEG, and PNG images, as well as generic HTML text you supply for the title page(s).
This chapter describes how to interface HTMLDOC to your web server using CGI scripts and programs.
Note: The free version of HTMLDOC for Windows does not support use from a web server.
HTMLDOC can be used in a variety of ways to generate formatted reports on a web server. The most common way is to combine HTMLDOC with a CGI script or program and send the output to the HTTP client.
To make this work the CGI script or program must send the appropriate HTTP attributes, the required empty line to signify the beginning of the document, and then execute the HTMLDOC program to generate the HTML, PostScript, or PDF file as needed.
Another way to generate PDF files from your reports is to use HTMLDOC as a "portal" application. When used as a portal, HTMLDOC automatically retrieves the named document or report from your server and passes a PDF version to the web browser. See the next sections for more information.
WARNING: Passing information directly from the web browser to HTMLDOC can potentially expose your system to security risks. Always be sure to "sanitize" any input from the web browser so that filenames, URLs, and options passed to HTMLDOC are not acted on by the shell program.
Shell scripts are probably the easiest to work with, but are normally limited to GET type requests. Here is a script called topdf that acts as a portal, converting the named file to PDF:
    #!/bin/sh
    #
    # Sample "portal" script to convert the named HTML file to PDF on-the-fly.
    #
    # Usage: http://www.domain.com/path/topdf/path/filename.html
    #
    #
    # The "options" variable contains any options you want to pass to HTMLDOC.
    #
    options="-t pdf --webpage --header ... --footer ..."
    #
    # Tell the browser to expect a PDF file...
    #
    echo "Content-Type: application/pdf"
    echo ""
    #
    # Run HTMLDOC to generate the PDF file.  $options is left unquoted so it
    # splits into separate arguments; the URL is quoted so the shell does not
    # split or expand anything that arrives in PATH_INFO...
    #
    htmldoc $options "http://${SERVER_NAME}:${SERVER_PORT}${PATH_INFO}"
Users of this CGI would reference the URL "http://www.domain.com/topdf.cgi/index.html" to generate a PDF file of the site's home page.
The options variable in the script can be set to use any supported command-line option for HTMLDOC; for a complete list see Chapter 8 - Command-Line Reference.
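Because the portal script builds its URL from values supplied by the browser, it is worth checking that URL before passing it to HTMLDOC. The function below is a minimal sketch of one such check and is not part of HTMLDOC: the is_safe_url name and the exact set of permitted characters are assumptions, and a production script may need a stricter or site-specific list.

```shell
#!/bin/sh
# is_safe_url URL  (hypothetical helper, not part of HTMLDOC)
# Succeeds only if URL starts with http:// or https:// and consists solely
# of letters, digits, and a conservative set of URL punctuation characters.
is_safe_url() {
    # Require an explicit http or https scheme.
    case "$1" in
        http://*|https://*) ;;
        *) return 1 ;;
    esac
    # Reject the URL if any character falls outside the whitelist.
    # Shell metacharacters inside the brackets are backslash-escaped.
    case "$1" in
        *[!A-Za-z0-9~_*\(\)/:%?+\&@\;=,.-]*) return 1 ;;
    esac
    return 0
}
```

In the portal script, the htmldoc command could then be run only when is_safe_url succeeds for the constructed URL, with an error page sent to the browser otherwise. Characters such as spaces, quotes, backticks, and dollar signs are deliberately left out of the permitted set.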
Perl scripts offer the ability to generate more complex reports, pull data from databases, etc. The easiest way to interface Perl scripts with HTMLDOC is to write a report to a temporary file and then execute HTMLDOC to generate the PDF file.
Here is a simple Perl subroutine that can be used to write a PDF report to the HTTP client:
    sub topdf {
        # Get the filename argument...
        my $filename = shift;
        # Make stdout unbuffered...
        select(STDOUT); $| = 1;
        # Write the content type to the client...
        print "Content-Type: application/pdf\n\n";
        # Run HTMLDOC to provide the PDF file to the user.  The list form of
        # system() bypasses the shell, so $filename is never interpreted by it...
        system("htmldoc", "-t", "pdf", "--quiet", "--webpage", $filename);
    }
PHP is quickly becoming the most popular server-side scripting language available. PHP provides a passthru() function that can be used to run HTMLDOC. This combined with the header() function can be used to provide on-the-fly reports in PDF format.
Here is a simple PHP function that can be used to convert an HTML report to PDF and send it to the HTTP client:
    function topdf($filename, $options = "") {
        # Write the content type to the client...
        header("Content-Type: application/pdf");
        flush();
        # Run HTMLDOC to provide the PDF file to the user...
        passthru("htmldoc -t pdf --quiet --jpeg --webpage $options '$filename'");
    }
The function accepts a filename and an optional "options" string for specifying the header, footer, fonts, etc.
To prevent malicious users from passing in unauthorized characters into this function, the following function can be used to verify that the URL/filename does not contain any characters that might be interpreted by the shell:
    function bad_url($url) {
        // See if the URL starts with http: or https:...
        if (strncmp($url, "http://", 7) != 0 &&
            strncmp($url, "https://", 8) != 0) {
            return 1;
        }
        // Check for bad characters in the URL...
        $len = strlen($url);
        for ($i = 0; $i < $len; $i ++) {
            if (!strchr("~_*()/:%?+-&@;=,$.", $url[$i]) &&
                !ctype_alnum($url[$i])) {
                return 1;
            }
        }
        return 0;
    }
Another method is to use the escapeshellarg() function provided with PHP 4.0.3 and higher to generate a quoted shell argument for HTMLDOC.
To make a "portal" script, add the following code to complete the example:
    global $SERVER_NAME;
    global $SERVER_PORT;
    global $PATH_INFO;
    global $QUERY_STRING;
    if ($QUERY_STRING != "") {
        $url = "http://${SERVER_NAME}:${SERVER_PORT}${PATH_INFO}?${QUERY_STRING}";
    } else {
        $url = "http://${SERVER_NAME}:${SERVER_PORT}$PATH_INFO";
    }
    if (bad_url($url)) {
        print("<HTML><HEAD><TITLE>Bad URL</TITLE></HEAD>\n"
             ."<BODY><H1>Bad URL</H1>\n"
             ."<P>The URL <B><TT>$url</TT></B> is bad.</P>\n"
             ."</BODY></HTML>\n");
    } else {
        topdf($url);
    }
C programs offer the best flexibility and easily support on-the-fly report generation without the need for temporary files.
Here are some simple C functions that can be used to generate a PDF report to the HTTP client from a temporary file or pipe:
    #include <stdio.h>
    #include <stdlib.h>

    /* topdf() - convert an HTML file to PDF */
    FILE *topdf(const char *filename) /* HTML file to convert */
    {
      char command[1024]; /* Command to execute */

      puts("Content-Type: application/pdf\n");
      fflush(stdout); /* Send the header before HTMLDOC writes its output */
      snprintf(command, sizeof(command), "htmldoc -t pdf --webpage %s", filename);
      return (popen(command, "w"));
    }

    /* topdf2() - pipe HTML output to HTMLDOC for conversion to PDF */
    FILE *topdf2(void)
    {
      puts("Content-Type: application/pdf\n");
      fflush(stdout); /* Send the header before HTMLDOC writes its output */
      return (popen("htmldoc -t pdf --webpage -", "w"));
    }
Java programs are a portable way to add PDF support to your web server. Here is a class called htmldoc that acts as a portal, converting the named file to PDF. It can also be called by your Java servlets to process an HTML file and send the result to the client in PDF format:
    class htmldoc
    {
      // Convert named file to PDF on stdout...
      public static int topdf(String filename) // I - Name of file to convert
      {
        String              command;   // Command string
        Process             process;   // Process for HTMLDOC
        Runtime             runtime;   // Local runtime object
        java.io.InputStream input;     // Output from HTMLDOC
        byte                buffer []; // Buffer for output data
        int                 bytes;     // Number of bytes

        // First tell the client that we will be sending PDF...
        System.out.print("Content-type: application/pdf\n\n");

        // Construct the command string
        command = "htmldoc --quiet --jpeg --webpage -t pdf --left 36 " +
                  "--header .t. --footer .1. " + filename;

        // Run the process and wait for it to complete...
        runtime = Runtime.getRuntime();

        try
        {
          // Create a new HTMLDOC process...
          process = runtime.exec(command);

          // Get stdout from the process and a buffer for the data...
          input  = process.getInputStream();
          buffer = new byte[8192];

          // Read output from HTMLDOC until we have it all...
          while ((bytes = input.read(buffer)) > 0)
            System.out.write(buffer, 0, bytes);

          // Return the exit status from HTMLDOC...
          return (process.waitFor());
        }
        catch (Exception e)
        {
          // An error occurred - send it to stderr for the web server...
          System.err.print(e.toString() + " caught while running:\n\n");
          System.err.print(" " + command + "\n");
          return (1);
        }
      }

      // Main entry for htmldoc class
      public static void main(String[] args) // I - Command-line args
      {
        String server_name,  // SERVER_NAME env var
               server_port,  // SERVER_PORT env var
               path_info,    // PATH_INFO env var
               query_string, // QUERY_STRING env var
               filename;     // File to convert

        if ((server_name = System.getProperty("SERVER_NAME")) != null &&
            (server_port = System.getProperty("SERVER_PORT")) != null &&
            (path_info = System.getProperty("PATH_INFO")) != null)
        {
          // Construct a URL for the resource specified...
          filename = "http://" + server_name + ":" + server_port + path_info;

          if ((query_string = System.getProperty("QUERY_STRING")) != null)
          {
            filename = filename + "?" + query_string;
          }
        }
        else if (args.length == 1)
        {
          // Pull the filename from the command-line...
          filename = args[0];
        }
        else
        {
          // Error - no args or env variables!
          System.err.print("Usage: htmldoc.class filename\n");
          return;
        }

        // Convert the file to PDF and send to the web client...
        topdf(filename);
      }
    }
There are two types of HTML files - structured documents using headings (H1, H2, etc.) which HTMLDOC calls "books", and unstructured documents that do not use headings which HTMLDOC calls "web pages".
A very common mistake is to try converting a web page using:
htmldoc -f filename.pdf filename.html
which will likely produce a PDF file with no pages. To convert web page files you must use the --webpage option at the command-line or choose Web Page in the input tab of the GUI.
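One way to avoid this mistake in scripts is to pick the document type from the file itself. The sketch below is only a heuristic and rests on an assumption: it treats any file containing an H1 element as a book and everything else as a web page. The doc_type function name is hypothetical.

```shell
#!/bin/sh
# doc_type FILE  (hypothetical helper)
# Print "--book" if FILE contains at least one H1 start tag,
# otherwise print "--webpage".
doc_type() {
    # -i matches <H1> and <h1>; [ >] accepts tags with or without attributes.
    if grep -qi '<h1[ >]' "$1"; then
        printf '%s\n' "--book"
    else
        printf '%s\n' "--webpage"
    fi
}
```

The printed option can then be placed on the htmldoc command line ahead of the -f option, so that files without headings are converted with --webpage instead of the default --book.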
HTMLDOC does not support HTML 4.0 elements, attributes, stylesheets, or scripting.
The following HTML elements are recognized by HTMLDOC:
Element | Version | Supported? | Notes |
---|---|---|---|
!DOCTYPE | 3.0 | Yes | DTD is ignored |
A | 1.0 | Yes | See Below |
ACRONYM | 2.0 | Yes | No font change |
ADDRESS | 2.0 | Yes | |
AREA | 2.0 | No | |
B | 1.0 | Yes | |
BASE | 2.0 | No | |
BASEFONT | 1.0 | No | |
BIG | 2.0 | Yes | |
BLINK | 2.0 | No | |
BLOCKQUOTE | 2.0 | Yes | |
BODY | 1.0 | Yes | |
BR | 2.0 | Yes | |
CAPTION | 2.0 | Yes | See Below |
CENTER | 2.0 | Yes | |
CITE | 2.0 | Yes | Italic/Oblique |
CODE | 2.0 | Yes | Courier |
DD | 2.0 | Yes | |
DEL | 2.0 | Yes | Strikethrough |
DFN | 2.0 | Yes | Helvetica |
DIR | 2.0 | Yes | |
DIV | 3.2 | Yes | |
DL | 2.0 | Yes | |
DT | 2.0 | Yes | Italic/Oblique |
EM | 2.0 | Yes | Italic/Oblique |
EMBED | 2.0 | Yes | HTML Only |
FONT | 2.0 | Yes | See Below |
FORM | 2.0 | No | |
FRAME | 3.2 | No | |
FRAMESET | 3.2 | No | |
H1 | 1.0 | Yes | Boldface, See Below |
H2 | 1.0 | Yes | Boldface, See Below |
H3 | 1.0 | Yes | Boldface, See Below |
H4 | 1.0 | Yes | Boldface, See Below |
H5 | 1.0 | Yes | Boldface, See Below |
H6 | 1.0 | Yes | Boldface, See Below |
HEAD | 1.0 | Yes | |
HR | 1.0 | Yes | See Below |
HTML | 1.0 | Yes | |
I | 1.0 | Yes | |
IMG | 1.0 | Yes | See Below |
INPUT | 2.0 | No | |
INS | 2.0 | Yes | Underline |
ISINDEX | 2.0 | No | |
KBD | 2.0 | Yes | Courier Bold |
LI | 2.0 | Yes | |
LINK | 2.0 | No | |
MAP | 2.0 | No | |
MENU | 2.0 | Yes | |
META | 2.0 | Yes | See Below |
MULTICOL | N3.0 | No | |
NOBR | 1.0 | No | |
NOFRAMES | 3.2 | No | |
OL | 2.0 | Yes | |
OPTION | 2.0 | No | |
P | 1.0 | Yes | |
PRE | 1.0 | Yes | |
S | 2.0 | Yes | Strikethrough |
SAMP | 2.0 | Yes | Courier |
SCRIPT | 2.0 | No | |
SELECT | 2.0 | No | |
SMALL | 2.0 | Yes | |
SPACER | N3.0 | Yes | |
STRIKE | 2.0 | Yes | |
STRONG | 2.0 | Yes | Boldface Italic/Oblique |
SUB | 2.0 | Yes | Reduced Fontsize |
SUP | 2.0 | Yes | Reduced Fontsize |
TABLE | 2.0 | Yes | See Below |
TD | 2.0 | Yes | |
TEXTAREA | 2.0 | No | |
TH | 2.0 | Yes | Boldface Center |
TITLE | 2.0 | Yes | |
TR | 2.0 | Yes | |
TT | 2.0 | Yes | Courier |
U | 1.0 | Yes | |
UL | 2.0 | Yes | |
VAR | 2.0 | Yes | Helvetica Oblique |
WBR | 1.0 | No | |
HTMLDOC supports many special HTML comments to initiate page breaks, set the header and footer text, and control the current media options:
<!-- FOOTER LEFT "foo" -->
<!-- FOOTER CENTER "foo" -->
<!-- FOOTER RIGHT "foo" -->
<!-- HALF PAGE -->
<!-- HEADER LEFT "foo" -->
<!-- HEADER CENTER "foo" -->
<!-- HEADER RIGHT "foo" -->
<!-- MEDIA BOTTOM nnn -->
<!-- MEDIA COLOR "foo" -->
<!-- MEDIA DUPLEX NO -->
<!-- MEDIA DUPLEX YES -->
<!-- MEDIA LANDSCAPE NO -->
<!-- MEDIA LANDSCAPE YES -->
<!-- MEDIA LEFT nnn -->
<!-- MEDIA POSITION nnn -->
<!-- MEDIA RIGHT nnn -->
<!-- MEDIA SIZE foo -->
<!-- MEDIA TOP nnn -->
<!-- MEDIA TYPE "foo" -->
<!-- NEED length -->
Starts a new page if there are fewer than length units left on the current page. The length value defaults to lines of text but can be suffixed by in, mm, or cm to convert from the corresponding units.
<!-- NEW PAGE -->
<!-- NEW SHEET -->
<!-- NUMBER-UP nn -->
<!-- PAGE BREAK -->
The HEADER and FOOTER comments allow you to set an arbitrary string of text for the left, center, and right headers and footers. Each string consists of plain text; special values or strings can be inserted using the dollar sign ($):
$$
$CHAPTER
$CHAPTERPAGE
$CHAPTERPAGE(format)
$CHAPTERPAGES
$CHAPTERPAGES(format)
$DATE
$HEADING
$LOGOIMAGE
$PAGE
$PAGE(format)
$PAGES
$PAGES(format)
$TIME
$TITLE
Limited typeface specification is currently supported to ensure portability across platforms and for older PostScript printers:
Requested Font | Actual Font |
---|---|
Arial | Helvetica |
Courier | Courier |
Helvetica | Helvetica |
Monospace | Courier |
Sans-Serif | Helvetica |
Serif | Times |
Symbol | Symbol |
Times | Times |
All other unrecognized typefaces are silently ignored.
Currently HTMLDOC supports a maximum of 1000 chapters (H1 headings). This limit can be increased by changing the MAX_CHAPTERS constant in the config.h file included with the source code.
All chapters start with a top-level heading (H1) markup. Any headings within a chapter must be of a lower level (H2 to H15). Each chapter starts a new page or the next odd-numbered page if duplexing is selected.
Note: Heading levels 7 to 15 are not standard HTML and will not likely be recognized by most web browsers.
The headings you use within a chapter must start at level 2 (H2). If you skip levels the heading will be shown under the last level that was known. For example, if you use the following hierarchy of headings:
    <H1>Chapter Heading</H1>
    ...
    <H2>Section Heading 1</H2>
    ...
    <H2>Section Heading 2</H2>
    ...
    <H3>Sub-Section Heading 1</H3>
    ...
    <H4>Sub-Sub-Section Heading 1</H4>
    ...
    <H4>Sub-Sub-Section Heading 2</H4>
    ...
    <H3>Sub-Section Heading 2</H3>
    ...
    <H2>Section Heading 3</H2>
    ...
    <H4>Sub-Sub-Section Heading 3</H4>
    ...
the table-of-contents that is generated will show:
VALUE="#"
TYPE="1"
TYPE="a"
TYPE="A"
TYPE="i"
TYPE="I"
External URL and internal (#target and filename.html) links are fully supported for HTML and PDF output.
When generating PDF files, local PDF file links will be converted to external file links for the PDF viewer instead of URL links. That is, you can directly link to another local PDF file from your HTML document with:
<A HREF="filename.pdf">...</A>
HTMLDOC supports the following META attributes for the title page and document information:
<META NAME="AUTHOR" CONTENT="...">
<META NAME="COPYRIGHT" CONTENT="...">
<META NAME="DOCNUMBER" CONTENT="...">
<META NAME="GENERATOR" CONTENT="...">
<META NAME="KEYWORDS" CONTENT="...">
<META NAME="SUBJECT" CONTENT="...">
The BREAK attribute is still supported by the HR element:
<HR BREAK>
Support for the BREAK attribute is deprecated and will be removed in a future release of HTMLDOC.
MAX_COLUMNS constant in the config.h file included with the source code. HTMLDOC supports HTML 3.0 tables with the following exceptions:
The CAPTION element is always shown at the top of the table.
HTMLDOC does not support HTML 4.0 table elements or attributes, such as TBODY, THEAD, TFOOT, or RULES.
.BOOK files. The buttons on the bottom of the HTMLDOC window allow you to manage these files and generate formatted documents.
Note: Saving a document is not the same as generating a document. The book files saved to disk by the Save and Save As... buttons are not the final HTML, PDF, or PostScript output files. You generate those files by clicking on the Generate button.
Note: Generating a document is not the same as saving a document. To save the current HTML files and settings in the HTMLDOC GUI, click on the Save or Save As... buttons instead.
Figure 7-1 - The Input Tab
The Delete Files button only removes the files from the Input Files list. The files are not removed from disk.
Click on the Browse... button to select a logo image file using the file chooser dialog.
Click on the Browse... button to select a title file using the file chooser dialog.
Figure 7-2 - The Output Tab
Directory output is not available when generating PDF files.
Note: HTMLDOC uses Flate compression, which is not encumbered by patents and is also used by the popular PKZIP and gzip programs. Flate is a lossless compression algorithm (that is, you get back exactly what you put in) that performs very well on indexed images and text.
Figure 7-3 - The Page Tab
HTMLDOC supports the following standard page size names:
Click in the Page Size field and enter the page width and length separated by the letter "x" to select a custom page size. Append the letters "in" for inches, "mm" for millimeters, or "cm" for centimeters.
Select the desired text in each of the option buttons to customize the header and footer for the document/body pages. The left-most option buttons set the text that is left-justified, while the middle buttons set the text that is centered and the right buttons set the text that is right-justified. Each choice corresponds to the following text:
Choice | Description |
---|---|
Blank | The field should be blank. |
Title | The field should contain the document title. |
Chapter Title | The field should contain the current chapter title. |
Heading | The field should contain the current heading. |
Logo | The field should contain the logo image. |
1,2,3,... | The field should contain the current page number in decimal format (1, 2, 3, ...) |
i,ii,iii,... | The field should contain the current page number in lowercase roman numerals (i, ii, iii, ...) |
I,II,III,... | The field should contain the current page number in uppercase roman numerals (I, II, III, ...) |
a,b,c,... | The field should contain the current page number using lowercase letters. |
A,B,C,... | The field should contain the current page number using UPPERCASE letters. |
Chapter Page | The field should contain the current chapter page number. |
1/N,2/N,... | The field should contain the current and total number of pages (n/N). |
1/C,2/C,... | The field should contain the current and total number of pages in the chapter (n/N). |
Date | The field should contain the current date (formatted for the current locale). |
Time | The field should contain the current time (formatted for the current locale). |
Date + Time | The field should contain the current date and time (formatted for the current locale). |
Figure 7-4 - The TOC Tab
Figure 7-5 - The Colors Tab
#RRGGBB. Click on the Lookup... button to pick the color graphically.
Figure 7-6 - The Fonts Tab
The Embed Fonts check box controls whether or not fonts are embedded in PostScript and PDF output.
Figure 7-7 - The PS Tab
PostScript Level 2 is compatible with most PostScript printers and supports printer commands and JPEG image compression.
PostScript Level 3 is compatible with only the newest PostScript printers and supports Flate image compression in addition to the Level 2 features.
setpagedevice commands for the page size and duplex settings. Click in the check box to enable or disable printer commands.
Printer commands are only available with Level 2 and 3 output and may not work with some printers.
The Include Xerox Job Comments check box controls whether or not the output files contain Xerox job comments. Click in the check box to enable or disable the job comments.
Job comments are available with all levels of PostScript output.
Figure 7-8 - The PDF Tab
The Document page mode displays only the document pages. The Outline page mode displays the table-of-contents outline as well as the document pages. The Full-Screen page mode displays the document pages on the whole screen; this mode is used primarily for presentations.
The Single page layout displays a single page at a time. The One Column page layout displays a single column of pages at a time. The Two Column Left and Two Column Right page layouts display two columns of pages at a time; the first page is displayed in the left or right column as selected.
Figure 7-9 - The Security Tab
The security tab (Figure 7-9) allows you to enable PDF document encryption and security features.
The Encryption buttons control whether or not encryption is performed on the PDF file. Encrypted documents can be password protected and also provide user permissions.
The Permissions buttons control what operations are allowed by the PDF viewer.
The Owner Password field contains the document owner password, a string that is used by Adobe Acrobat to control who can change document permissions, etc.
If this field is left blank, a random 32-character password is generated so that no one can change the document using the Adobe tools.
The Include Links option controls whether or not the internal links in a document are included in the PDF output. The document outline (shown to the left of the document in Acrobat Reader) is unaffected by this setting.
The User Password field contains the document user password, a string that is used by Adobe Acrobat to restrict viewing permissions on the file.
If this field is left blank, any user may view the document without entering a password.
Figure 7-10 - The Options Tab
The options tab (Figure 7-10) contains the HTML file editor of your choice and allows you to save the settings and options that will be used in new documents.
The HTML Editor field contains the name of the HTML editor to run when you double-click on an input file or click on the Edit Files... button. Enter the program name in the field or click on the Browse... button to select the editor using the file chooser.
The %s is added automatically to the end of the command name to insert the name of the file to be edited. If you are using Netscape Composer to edit your HTML files you should put "-edit" before the %s to tell Netscape to edit the file and not display it.
The Browser Width slider specifies the width of the browser in pixels that is used to scale images and other pixel measurements to the printable page width. You can adjust this value to more closely match the formatting on the screen.
The default browser width is 680 pixels which corresponds roughly to a 96 DPI display. The browser width is only used when generating PostScript or PDF files.
The Search Path field specifies a search path for files that are loaded by HTMLDOC. It is usually used to get images that use absolute server paths to load.
Directories are separated by the semicolon (;) so that drive letters (and eventually URLs) can be specified.
The Proxy URL field specifies a URL for an HTTP proxy server.
The Tooltips check button controls the appearance of tooltip windows over GUI controls.
The Modern Look check button controls the appearance of the GUI controls.
The Strict HTML check button controls strict HTML conformance checking. When checked, HTML elements that are improperly nested and dangling close elements will produce error messages.
The Save Options and Defaults button saves the HTML editor and all of the document settings on the other tabs for use in new documents. These settings are also used by the command-line version of HTMLDOC.
Figure 7-11 - The File Chooser
The file chooser (Figure 7-11) allows you to select one or more files and create files and directories.
The Directory option button (1) shows the current directory or folder that is displayed in the file list (3). Click on the option button to navigate to other directories or folders.
The directory buttons (2) allow you to go up one level in the directory hierarchy, create a new directory, and change the filename filter settings, respectively.
The file list (3) lists the files and directories in the current directory or folder. Double-click on a file or directory to select that file or directory. Drag the mouse or hold the CTRL key down while clicking to select multiple files.
The Filename field contains the currently selected filename. Type a name in the field to select a file or directory. As you type, any matching filenames will be highlighted; press the TAB key to accept the matches.
The dialog buttons (5) close the file chooser dialog window. Click on the OK button to accept your selections or the Cancel button to reject your selections and cancel the file operation.
This chapter describes all of the command-line options supported by HTMLDOC.
Note: The free version of HTMLDOC for Windows does not include the command-line program.
% htmldoc options filename1.html ... filenameN.html ENTER
% htmldoc options filename.book ENTER
The first form converts the named HTML files to the specified output format immediately. The second form loads the specified .book file and displays the HTMLDOC window, allowing a user to make changes and/or generate the document interactively.
If no output file or directory is specified, then all output is sent to the standard output file.
The -d option specifies an output directory for the document files.
This option is not compatible with the PDF output format.
The -f option specifies an output file for the document.
The -t option specifies the output format for the document and can be one of the following:
Format | Description |
---|---|
html | Generate one or more indexed HTML files. |
pdf | Generate a PDF file (default version - 1.3). |
pdf11 | Generate a PDF 1.1 file for Acrobat Reader 2.0. |
pdf12 | Generate a PDF 1.2 file for Acrobat Reader 3.0. |
pdf13 | Generate a PDF 1.3 file for Acrobat Reader 4.0. |
pdf14 | Generate a PDF 1.4 file for Acrobat Reader 5.0. |
ps | Generate one or more PostScript files (default level). |
ps1 | Generate one or more Level 1 PostScript files. |
ps2 | Generate one or more Level 2 PostScript files. |
ps3 | Generate one or more Level 3 PostScript files. |
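The interaction between the -t, -d, and -f options can be sketched in Python. The helper name output_args is hypothetical and not part of HTMLDOC; it only assembles an argument list from the format names in the table above and rejects the one combination this chapter describes as incompatible (an output directory with PDF output).

```python
# Format names taken from the -t table above.
FORMATS = {"html", "pdf", "pdf11", "pdf12", "pdf13", "pdf14",
           "ps", "ps1", "ps2", "ps3"}

def output_args(fmt, outfile=None, outdir=None):
    """Build the -t/-d/-f part of an htmldoc command line (hypothetical helper)."""
    if fmt not in FORMATS:
        raise ValueError("unknown output format: " + fmt)
    if outdir is not None and fmt.startswith("pdf"):
        # The -d option is not compatible with the PDF output format.
        raise ValueError("-d cannot be used with PDF output")
    args = ["-t", fmt]
    if outdir is not None:
        args += ["-d", outdir]
    if outfile is not None:
        args += ["-f", outfile]
    return args

print(output_args("ps2", outdir="out"))
print(output_args("pdf14", outfile="manual.pdf"))
```

The directory and file names are placeholders. If neither an output file nor a directory is given, the resulting command writes to the standard output file as described above.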
The -v option specifies that progress information should be sent/displayed to the standard error file.
The --batch option specifies a book file that you would like to generate without the GUI popping up. This option can be combined with other options to generate the same book in different formats and sizes:
% htmldoc --batch filename.book -f filename.ps ENTER
% htmldoc --batch filename.book -f filename.pdf ENTER
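When the same book has to be generated in several formats from a script, the two commands above can be produced programmatically. The following Python sketch uses a hypothetical helper, batch_commands, that is not part of HTMLDOC; it only builds the argument lists from the --batch and -f options shown above and does not run anything.

```python
def batch_commands(book, outputs):
    """Return one htmldoc argument list per output file (hypothetical helper)."""
    return [["htmldoc", "--batch", book, "-f", out] for out in outputs]

# Print the command lines for the two example outputs.
for argv in batch_commands("filename.book", ["filename.ps", "filename.pdf"]):
    print(" ".join(argv))
```

On a system where the htmldoc program is installed, each list could be handed to subprocess.run() to perform the conversion.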
The --bodycolor option specifies the background color for all pages in the document. The color can be specified by a standard HTML color name or as a 6-digit hexadecimal number of the form #RRGGBB.
The --bodyfont option specifies the default text font used for text in the document body. The typeface parameter can be one of the following:
typeface | Actual Font |
---|---|
Arial | Helvetica |
Courier | Courier |
Helvetica | Helvetica |
Monospace | Courier |
Sans-Serif | Helvetica |
Serif | Times |
Symbol | Symbol |
Times | Times |
The --bodyimage option specifies the background image for all pages in the document. The supported formats are BMP, GIF, JPEG, and PNG.
The --book option specifies that the input files comprise a book with chapters and headings.
The --bottom option specifies the bottom margin. The default units are points (1 point = 1/72nd inch); the suffixes "in", "cm", and "mm" specify inches, centimeters, and millimeters, respectively.
This option is only available when generating PostScript or PDF files.
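The unit suffixes accepted by the margin options can be illustrated with a short Python sketch. The helper name margin_to_points is hypothetical and not part of HTMLDOC; it only applies the conversions stated above (1 point = 1/72nd inch, with "in", "cm", and "mm" suffixes).

```python
# Points per unit, derived from 1 point = 1/72nd inch.
POINTS_PER_UNIT = {"in": 72.0, "cm": 72.0 / 2.54, "mm": 72.0 / 25.4}

def margin_to_points(value):
    """Convert a margin string such as "0.5in" or "36" to points."""
    for suffix, factor in POINTS_PER_UNIT.items():
        if value.endswith(suffix):
            return float(value[:-len(suffix)]) * factor
    # No suffix: the value is already in points (the default units).
    return float(value)

print(margin_to_points("0.5in"))
print(margin_to_points("36"))
```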
The --browserwidth option specifies the browser width in pixels. The browser width is used to scale images and pixel measurements when generating PostScript and PDF files. It does not affect the font size of text.
The default browser width is 680 pixels, which corresponds roughly to a 96 DPI display. Make sure that your images and tables are no wider than the browser width; otherwise the output may overlap or be truncated in places.
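As a rough illustration of this scaling, the Python sketch below assumes that pixel measurements are scaled linearly by the ratio of the printable page width to the browser width. The helper name and the 468-point printable width (a Letter page with 1 inch left and right margins) are example assumptions; the exact computation HTMLDOC performs is not described here.

```python
def pixels_to_points(pixels, browser_width=680, printable_width=468.0):
    """Scale a pixel measurement to points (assumed linear scaling)."""
    return pixels * printable_width / browser_width

# An image as wide as the browser fills the printable width;
# an image half as wide fills half of it.
print(pixels_to_points(680))
print(pixels_to_points(340))
```

Under this assumption, an image wider than the browser width would come out wider than the printable area, which is why such images should be avoided.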
The --charset option specifies the 8-bit character set encoding to use for the entire document. HTMLDOC comes with the following character set files:
charset | Character Set |
---|---|
cp-874 | Windows code page 874 |
cp-1250 | Windows code page 1250 |
cp-1251 | Windows code page 1251 |
cp-1252 | Windows code page 1252 |
cp-1253 | Windows code page 1253 |
cp-1254 | Windows code page 1254 |
cp-1255 | Windows code page 1255 |
cp-1256 | Windows code page 1256 |
cp-1257 | Windows code page 1257 |
cp-1258 | Windows code page 1258 |
iso-8859-1 | ISO-8859-1 |
iso-8859-2 | ISO-8859-2 |
iso-8859-3 | ISO-8859-3 |
iso-8859-4 | ISO-8859-4 |
iso-8859-5 | ISO-8859-5 |
iso-8859-6 | ISO-8859-6 |
iso-8859-7 | ISO-8859-7 |
iso-8859-8 | ISO-8859-8 |
iso-8859-9 | ISO-8859-9 |
iso-8859-14 | ISO-8859-14 |
iso-8859-15 | ISO-8859-15 |
koi8-r | KOI8-R |
The --color option specifies that color output is desired.
This option is only available when generating PostScript or PDF files.
The --compression option specifies that Flate compression should be performed on the output file(s). The optional level parameter is a number from 1 (fastest and least amount of compression) to 9 (slowest and most amount of compression).
This option is only available when generating Level 3 PostScript or PDF files.
The --continuous option specifies that the input files comprise a web page (or site) and that no title page or table-of-contents should be generated. Unlike the --webpage option described later in this chapter, page breaks are not inserted between each input file.
This option is only available when generating PostScript or PDF files.
The --datadir option specifies the location of data files used by HTMLDOC.
The --duplex option specifies that the output should be formatted for two sided printing.
This option is only available when generating PostScript or PDF files. Use the --pscommands option to generate PostScript duplex mode commands.
The --effectduration option specifies the duration of a page transition effect in seconds.
This option is only available when generating PDF files.
The --embedfonts option specifies that fonts should be embedded in PostScript and PDF output. This is especially useful when generating documents in character sets other than ISO-8859-1.
The --encryption option enables encryption and security features for PDF output.
This option is only available when generating PDF files.
The --firstpage option specifies the first page that will be displayed in a PDF file. The page parameter can be one of the following:
page | Description |
---|---|
p1 | The first page of the document. |
toc | The first page of the table-of-contents. |
c1 | The first page of chapter 1. |
This option is only available when generating PDF files.
The --fontsize option specifies the base font size for the entire document in points (1 point = 1/72nd inch).
The --fontspacing option specifies the line spacing for the entire document as a multiplier of the base font size. A spacing value of 1 makes each line of text the same height as the font.
The --footer option specifies the contents of the page footer. The lcr parameter is a three-character string representing the left, center, and right footer fields. Each character can be one of the following:
lcr | Description |
---|---|
. | A period indicates that the field should be blank. |
: | A colon indicates that the field should contain the current and total number of pages in the chapter (n/N). |
/ | A slash indicates that the field should contain the current and total number of pages (n/N). |
1 | The number 1 indicates that the field should contain the current page number in decimal format (1, 2, 3, ...) |
a | A lowercase "a" indicates that the field should contain the current page number using lowercase letters. |
A | An uppercase "A" indicates that the field should contain the current page number using UPPERCASE letters. |
c | A lowercase "c" indicates that the field should contain the current chapter title. |
C | An uppercase "C" indicates that the field should contain the current chapter page number. |
d | A lowercase "d" indicates that the field should contain the current date. |
D | An uppercase "D" indicates that the field should contain the current date and time. |
h | An "h" indicates that the field should contain the current heading. |
i | A lowercase "i" indicates that the field should contain the current page number in lowercase roman numerals (i, ii, iii, ...) |
I | An uppercase "I" indicates that the field should contain the current page number in uppercase roman numerals (I, II, III, ...) |
l | A lowercase "l" indicates that the field should contain the logo image. |
t | A lowercase "t" indicates that the field should contain the document title. |
T | An uppercase "T" indicates that the field should contain the current time. |
Setting the footer to "..." disables the footer entirely.
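The three-character lcr string can be checked before it is passed to HTMLDOC. The following Python sketch uses a hypothetical helper, describe_lcr, that is not part of HTMLDOC; it copies the formatting characters from the table above and reports what each of the left, center, and right fields would contain.

```python
# Formatting characters from the table above.
FIELD_CODES = {
    ".": "blank",
    ":": "current and total number of pages in the chapter (n/N)",
    "/": "current and total number of pages (n/N)",
    "1": "current page number in decimal format",
    "a": "current page number using lowercase letters",
    "A": "current page number using uppercase letters",
    "c": "current chapter title",
    "C": "current chapter page number",
    "d": "current date",
    "D": "current date and time",
    "h": "current heading",
    "i": "current page number in lowercase roman numerals",
    "I": "current page number in uppercase roman numerals",
    "l": "logo image",
    "t": "document title",
    "T": "current time",
}

def describe_lcr(lcr):
    """Map a three-character lcr string to its left/center/right contents."""
    if len(lcr) != 3:
        raise ValueError("lcr must be exactly three characters")
    for ch in lcr:
        if ch not in FIELD_CODES:
            raise ValueError("unknown formatting character: " + ch)
    return dict(zip(("left", "center", "right"),
                    (FIELD_CODES[ch] for ch in lcr)))

print(describe_lcr("h.1"))
```

For example, "h.1" places the current heading on the left, leaves the center blank, and puts the decimal page number on the right. The same check applies to the lcr strings of the header options, which use the same formatting characters.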
The --format option specifies the output format for the document and can be one of the following:
Format | Description |
---|---|
html | Generate one or more indexed HTML files. |
pdf | Generate a PDF file (default version - 1.3). |
pdf11 | Generate a PDF 1.1 file for Acrobat Reader 2.0. |
pdf12 | Generate a PDF 1.2 file for Acrobat Reader 3.0. |
pdf13 | Generate a PDF 1.3 file for Acrobat Reader 4.0. |
pdf14 | Generate a PDF 1.4 file for Acrobat Reader 5.0. |
ps | Generate one or more PostScript files (default level). |
ps1 | Generate one or more Level 1 PostScript files. |
ps2 | Generate one or more Level 2 PostScript files. |
ps3 | Generate one or more Level 3 PostScript files. |
The --gray option specifies that grayscale output is desired.
This option is only available when generating PostScript or PDF files.
The --header option specifies the contents of the page header. The lcr parameter is a three-character string representing the left, center, and right header fields. See the --footer option for the list of formatting characters.
Setting the header to "..." disables the header entirely.
The --headfootfont option specifies the font that is used for the header and footer text. The font parameter can be one of the following:
This option is only available when generating PostScript or PDF files.
The --headfootsize option sets the size of the header and footer text in points (1 point = 1/72nd inch).
This option is only available when generating PostScript or PDF files.
The --headingfont option sets the typeface that is used for headings in the document. The typeface parameter can be one of the following:
typeface | Actual Font |
---|---|
Arial | Helvetica |
Courier | Courier |
Helvetica | Helvetica |
Monospace | Courier |
Sans-Serif | Helvetica |
Serif | Times |
Symbol | Symbol |
Times | Times |
The --help option displays all of the available options to the standard output file.
The --helpdir option specifies the location of the on-line help files.
The --jpeg option enables JPEG compression of continuous-tone images. The optional quality parameter specifies the output quality from 0 (worst) to 100 (best).
This option is only available when generating Level 2 and Level 3 PostScript or PDF files.
The --landscape option specifies that the output should be in landscape orientation (long edge on top).
This option is only available when generating PostScript or PDF files.
The --left option specifies the left margin. The default units are points (1 point = 1/72nd inch); the suffixes "in", "cm", and "mm" specify inches, centimeters, and millimeters, respectively.
This option is only available when generating PostScript or PDF files.
The --linkcolor option specifies the color of links in HTML and PDF output. The color can be specified by name or as a 6-digit hexadecimal number of the form #RRGGBB.
The --links option specifies that PDF output should contain hyperlinks.
The --linkstyle option specifies the style of links in HTML and PDF output. The style can be "plain" for no decoration or "underline" to underline links.
The --logoimage option specifies the logo image for the HTML navigation bar and page headers and footers for PostScript and PDF files. The supported formats are BMP, GIF, JPEG, and PNG.
The --no-compression option specifies that Flate compression should not be performed on the output files.
The --no-duplex option specifies that the output should be formatted for one sided printing.
This option is only available when generating PostScript or PDF files. Use the --pscommands option to generate PostScript duplex mode commands.
The --no-embedfonts option specifies that fonts should not be embedded in PostScript and PDF output.
The --no-encryption option specifies that no encryption/security features should be enabled in PDF output.
This option is only available when generating PDF files.
The --no-jpeg option specifies that JPEG compression should not be performed on large images.
The --no-links option specifies that PDF output should not contain hyperlinks.
The --no-localfiles option disables access to local files on the system. This option should be used when providing remote document conversion services.
The --no-numbered option specifies that headings should not be numbered.
The --no-pscommands option specifies that PostScript device commands should not be written to the output files.
The --no-strict option turns off strict HTML conformance checking.
The --no-title option specifies that the title page should not be generated.
The --no-toc option specifies that the table-of-contents pages should not be generated.
The --no-xrxcomments option specifies that Xerox PostScript job comments should not be written to the output files.
This option is only available when generating PostScript files.
The --numbered option specifies that headings should be numbered.
The --nup option sets the number of pages that are placed on each output page. Valid values for the pages parameter are 1, 2, 4, 6, 9, and 16.
The --outdir option specifies an output directory for the document files.
This option is not compatible with the PDF output format.
The --outfile option specifies an output file for the document.
The --owner-password option specifies the owner password for a PDF file. If not specified or the empty string (""), a random password is generated.
This option is only available when generating PDF files.
The --pageduration option specifies the number of seconds that each page will be displayed in the document.
This option is only available when generating PDF files.
The --pageeffect option specifies the page effect to use in PDF files. The effect parameter can be one of the following:
effect | Description |
---|---|
none | No effect is generated. |
bi | Box Inward |
bo | Box Outward |
d | Dissolve |
gd | Glitter Down |
gdr | Glitter Down and Right |
gr | Glitter Right |
hb | Horizontal Blinds |
hsi | Horizontal Sweep Inward |
hso | Horizontal Sweep Outward |
vb | Vertical Blinds |
vsi | Vertical Sweep Inward |
vso | Vertical Sweep Outward |
wd | Wipe Down |
wl | Wipe Left |
wr | Wipe Right |
wu | Wipe Up |
This option is only available when generating PDF files.
The --pagelayout option specifies the initial page layout in the PDF viewer. The layout parameter can be one of the following:
layout | Description |
---|---|
single | A single page is displayed. |
one | A single column is displayed. |
twoleft | Two columns are displayed with the first page on the left. |
tworight | Two columns are displayed with the first page on the right. |
This option is only available when generating PDF files.
The --pagemode option specifies the initial viewing mode in the PDF viewer. The mode parameter can be one of the following:
mode | Description |
---|---|
document | The document pages are displayed in a normal window. |
outline | The document outline and pages are displayed. |
fullscreen | The document pages are displayed on the entire screen in "slideshow" mode. |
This option is only available when generating PDF files.
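The presentation-related PDF options (--pagemode, --pageeffect, --pageduration, and --effectduration) are typically used together. The Python sketch below assembles them into one argument list. The helper name slideshow_args is hypothetical, and the sketch assumes that each value follows its option as a separate argument; the effect codes are copied from the --pageeffect table.

```python
# Effect codes from the --pageeffect table.
PAGE_EFFECTS = {"none", "bi", "bo", "d", "gd", "gdr", "gr", "hb", "hsi",
                "hso", "vb", "vsi", "vso", "wd", "wl", "wr", "wu"}

def slideshow_args(effect, page_seconds, effect_seconds):
    """Build PDF slideshow options (hypothetical helper)."""
    if effect not in PAGE_EFFECTS:
        raise ValueError("unknown page effect: " + effect)
    return ["--pagemode", "fullscreen",
            "--pageeffect", effect,
            "--pageduration", str(page_seconds),
            "--effectduration", str(effect_seconds)]

# A dissolve effect lasting 1 second, with each page shown for 10 seconds.
print(" ".join(slideshow_args("d", 10, 1)))
```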
The --path option specifies a search path for files that are loaded by HTMLDOC. It is usually used to get images that use absolute server paths to load.
Directories are separated by the semicolon (;) so that drive letters and URLs can be specified. Quotes around the directory parameter are optional. They are usually used when the directory string contains spaces.
--path "dir1;dir2;dir3;...;dirN"
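A script that collects its search directories in a list can join them with semicolons as described above. The Python sketch below uses a hypothetical helper, path_option, and placeholder directory names; neither is part of HTMLDOC.

```python
def path_option(directories):
    """Build the --path option from a list of directories (hypothetical helper)."""
    for directory in directories:
        if ";" in directory:
            # A semicolon inside a name would be read as a separator.
            raise ValueError("directory contains a semicolon: " + directory)
    return ["--path", ";".join(directories)]

print(path_option(["C:/My Images", "/var/www/htdocs"]))
```

When the result is passed as an argument list (for example to subprocess.run()), the quotes are not needed even if a directory contains spaces, because no shell splits the string.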
The --permissions option specifies the document permissions. The available permission parameters are listed below:
Permission | Description |
---|---|
all | All permissions |
annotate | User can annotate document |
copy | User can copy text and images from document |
modify | User can modify document |
print | User can print document |
no-annotate | User cannot annotate document |
no-copy | User cannot copy text and images from document |
no-modify | User cannot modify document |
no-print | User cannot print document |
none | No permissions |
The --encryption option must be used in conjunction with the --permissions parameter.
--permissions no-print --encryption
Multiple options can be specified with multiple --permissions entries as needed.
--permissions no-print --permissions no-copy --encryption
This option is only available when generating PDF files.
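The pairing of --permissions entries with --encryption can be sketched in Python. The helper name permission_args is hypothetical and not part of HTMLDOC; it emits one --permissions entry per value from the table above and always appends --encryption, since the two must be used together.

```python
# Permission names from the table above.
PERMISSIONS = {"all", "annotate", "copy", "modify", "print",
               "no-annotate", "no-copy", "no-modify", "no-print", "none"}

def permission_args(permissions):
    """Build --permissions entries followed by --encryption (hypothetical helper)."""
    args = []
    for permission in permissions:
        if permission not in PERMISSIONS:
            raise ValueError("unknown permission: " + permission)
        args += ["--permissions", permission]
    args.append("--encryption")
    return args

print(" ".join(permission_args(["no-print", "no-copy"])))
```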
The --portrait option specifies that the output should be in portrait orientation (short edge on top).
This option is only available when generating PostScript or PDF files.
The --pscommands option specifies that PostScript device commands should be written to the output files.
This option is only available when generating Level 2 and Level 3 PostScript files.
The --quiet option prevents error messages from being sent to stderr.
The --right option specifies the right margin. The default units are points (1 point = 1/72nd inch); the suffixes "in", "cm", and "mm" specify inches, centimeters, and millimeters, respectively.
This option is only available when generating PostScript or PDF files.
The --size option specifies the page size. The size parameter can be one of the following standard sizes:
size | Description |
---|---|
Letter | 8.5x11in (216x279mm) |
A4 | 8.27x11.69in (210x297mm) |
Universal | 8.27x11in (210x279mm) |
Custom sizes are specified by the page width and length separated by the letter "x". Append the letters "in" for inches, "mm" for millimeters, or "cm" for centimeters.
This option is only available when generating PostScript or PDF files. Use the --pscommands option to generate PostScript page size commands.
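The standard and custom size strings can be converted to points with a short Python sketch. The helper name size_to_points is hypothetical and not part of HTMLDOC. It takes the millimeter dimensions of A4 and Universal from the table above and, as a simplifying assumption, requires a unit suffix on custom sizes and treats the standard names as case-sensitive.

```python
import re

# Points per unit, derived from 1 point = 1/72nd inch.
POINTS_PER_UNIT = {"in": 72.0, "cm": 72.0 / 2.54, "mm": 72.0 / 25.4}

# Standard sizes from the table above, written as custom size strings.
STANDARD_SIZES = {"Letter": "8.5x11in", "A4": "210x297mm",
                  "Universal": "210x279mm"}

def size_to_points(size):
    """Return (width, length) in points for a standard or custom size."""
    spec = STANDARD_SIZES.get(size, size)
    match = re.fullmatch(r"(\d+(?:\.\d+)?)x(\d+(?:\.\d+)?)(in|cm|mm)", spec)
    if match is None:
        raise ValueError("unrecognized page size: " + size)
    factor = POINTS_PER_UNIT[match.group(3)]
    return float(match.group(1)) * factor, float(match.group(2)) * factor

print(size_to_points("Letter"))
print(size_to_points("210x297mm"))
```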
The --strict option turns on strict HTML conformance checking. When enabled, HTML elements that are improperly nested and dangling close elements will produce error messages.
The --textcolor option specifies the default text color for all pages in the document. The color can be specified by a standard HTML color name or as a 6-digit hexadecimal number of the form #RRGGBB.
The --textfont option sets the typeface that is used for text in the document. The typeface parameter can be one of the following:
typeface | Actual Font |
---|---|
Arial | Helvetica |
Courier | Courier |
Helvetica | Helvetica |
Monospace | Courier |
Sans-Serif | Helvetica |
Serif | Times |
Symbol | Symbol |
Times | Times |
The --title option specifies that a title page should be generated.
The --titlefile option specifies an HTML file to use for the title page.
The --titleimage option specifies the title image for the title page. The supported formats are BMP, GIF, JPEG, and PNG.
The --tocfooter option specifies the contents of the table-of-contents footer. The lcr parameter is a three-character string representing the left, center, and right footer fields. See the --footer option for the list of formatting characters.
Setting the TOC footer to "..." disables the TOC footer entirely.
The --tocheader option specifies the contents of the table-of-contents header. The lcr parameter is a three-character string representing the left, center, and right header fields. See the --footer option for the list of formatting characters.
Setting the TOC header to "..." disables the TOC header entirely.
The --toclevels option specifies the number of heading levels to include in the table-of-contents pages. The levels parameter is a number from 1 to 6.
The --toctitle option specifies the string to display at the top of the table-of-contents; the default string is "Table of Contents".
The --top option specifies the top margin. The default units are points (1 point = 1/72nd inch); the suffixes "in", "cm", and "mm" specify inches, centimeters, and millimeters, respectively.
This option is only available when generating PostScript or PDF files.
The --user-password option specifies the user password for a PDF file. If not specified or the empty string (""), no password will be required to view the document.
This option is only available when generating PDF files.
The --verbose option specifies that progress information should be sent/displayed to the standard error file.
The --version option displays the HTMLDOC version number.
The --webpage option specifies that the input files comprise a web page (or site) and that no title page or table-of-contents should be generated. HTMLDOC will insert a page break between each input file.
This option is only available when generating PostScript or PDF files.
The --xrxcomments option specifies that Xerox PostScript job comments should be written to the output files.
This option is only available when generating PostScript files.
HTMLDOC sends error and status messages to stderr unless the --quiet option is provided on the command-line. Applications can capture these messages to relay errors or statistics to the user.
The BYTES: message specifies the number of bytes that were written to an output file. If the output is directed at a directory then multiple BYTES: messages will be sent.
The PAGES: message specifies the number of pages that were written to an output file. If the output is directed at a directory then multiple PAGES: messages will be sent. No PAGES: messages are sent when generating HTML output.
The ERRnnn: messages specify an error condition. Error numbers 1 to 14 map to the following errors:
Error numbers 100 to 505 correspond directly to an HTTP status code.
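An application that captures stderr might sort these messages as in the Python sketch below. The helper name parse_messages is hypothetical, and the exact layout of each message is an assumption: the sketch expects the BYTES:, PAGES:, or ERRnnn: prefix at the start of a line, followed by a number or by message text. The sample input is made up for illustration and is not real HTMLDOC output.

```python
import re

def parse_messages(stderr_text):
    """Collect BYTES:, PAGES:, and ERRnnn: messages from captured stderr text."""
    summary = {"bytes": [], "pages": [], "errors": []}
    for line in stderr_text.splitlines():
        count = re.match(r"(BYTES|PAGES):\s*(\d+)", line)
        if count:
            summary[count.group(1).lower()].append(int(count.group(2)))
            continue
        error = re.match(r"ERR(\d+):\s*(.*)", line)
        if error:
            number = int(error.group(1))
            summary["errors"].append({
                "number": number,
                # Error numbers 100 to 505 correspond to HTTP status codes.
                "http_status": 100 <= number <= 505,
                "text": error.group(2),
            })
    return summary

# Made-up sample input; real messages may be laid out differently.
sample = "PAGES: 12\nBYTES: 34567\nERR404: example message\n"
print(parse_messages(sample))
```

Keeping the counts in lists allows for the multiple BYTES: and PAGES: messages that are sent when the output is directed at a directory.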
HTMLDOC is provided under the GNU General Public License ("GPL") with a license exception for the OpenSSL toolkit. A copy of the exception and license follows this introduction.
For those not familiar with the GNU GPL, the license basically allows you to:
What this license does not allow you to do is make changes or add features to HTMLDOC and then sell a binary distribution without source code. You must provide source for any changes or additions to the software, and all code must be provided under the GPL.
In addition, as the copyright holder of HTMLDOC, Easy Software Products grants the following special exception:
No developer is required to provide this exception in a derived work.
Easy Software Products also sells rights to the HTMLDOC source code under a binary distribution license for vendors that are unable to release source code for their additions and modifications to HTMLDOC under the GNU GPL. For information please contact us at the address shown above.
Easy Software Products sells software support for HTMLDOC. You can find out more at our web site:
http://www.easysw.com/
Version 2, June 1991
Copyright 1989, 1991 Free Software Foundation, Inc. 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.
The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Library General Public License instead.) You can apply it to your programs, too.
When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things.
To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it.
For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights.
We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software.
Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations.
Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all.
The precise terms and conditions for copying, distribution and modification follow.
Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does.
You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee.
These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it.
Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Program.
In addition, mere aggregation of another work not based on the Program with the Program (or with a work based on the Program) on a volume of a storage or distribution medium does not bring the other work under the scope of this License.
The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable.
If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code.
If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply and the section as a whole is intended to apply in other circumstances.
It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice.
This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License.
Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation.
This appendix describes the HTMLDOC .book file format.
The HTMLDOC .book file format is a simple text format that provides the command-line options and files that are part of the document. These files can be used from the GUI or from the command line using the --batch option:
htmldoc filename.book
htmldoc --batch filename.book
The first form loads the book and displays the GUI, if configured. The second form processes the book in batch mode without showing the GUI. Windows users should use the ghtmldoc.exe executable to show the GUI and htmldoc.exe for batch mode:
ghtmldoc.exe filename.book
htmldoc.exe --batch filename.book
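The batch invocation can also be driven from a script. The following Python sketch is illustrative only: the helper names batch_command and run_batch are hypothetical, and it assumes that htmldoc (or htmldoc.exe on Windows) can be found on the search path.

```python
import subprocess
import sys


def batch_command(book_path, platform=sys.platform):
    """Build the argument list for a batch run (hypothetical helper)."""
    # Assumption: Windows uses htmldoc.exe for batch mode, as described above.
    exe = "htmldoc.exe" if platform.startswith("win") else "htmldoc"
    return [exe, "--batch", book_path]


def run_batch(book_path):
    """Run HTMLDOC in batch mode; raises CalledProcessError on failure."""
    return subprocess.run(batch_command(book_path), check=True)
```

Calling run_batch("filename.book") would then be equivalent to typing the batch command shown above.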
Each .book file starts with a line reading:
#HTMLDOC 1.8.17
The version number (1.8.17) is optional.
Following the header is a line containing the options for the book. You can use any valid command-line option on this line:
-f htmldoc.pdf --titleimage htmldoc.png --duplex --compression=9 --jpeg=90
Long option lines can be broken by placing a trailing backslash (\) at the end of each line that continues on the next line:
-f htmldoc.pdf --titleimage htmldoc.png --duplex \
--compression=9 --jpeg=90
Following the options is a list of files or URLs to include in the document:
intro.html
1-install.html
2-starting.html
3-books.html
4-cmdline.html
5-cgi.html
6-htmlref.html
7-guiref.html
8-cmdref.html
a-license.html
b-book.html
c-relnotes.html
The following is the complete book file needed to generate this documentation:
#HTMLDOC 1.8.13
-f htmldoc.pdf --titleimage htmldoc.png --duplex --compression=9 --jpeg=90
intro.html
1-install.html
2-starting.html
3-books.html
4-cmdline.html
5-cgi.html
6-htmlref.html
7-guiref.html
8-cmdref.html
a-license.html
b-book.html
c-relnotes.html
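To make the layout concrete, here is a minimal Python sketch that splits a book file into its three parts: the header, the options, and the file list. It is not how HTMLDOC itself reads book files. The function name parse_book is hypothetical, and the sketch assumes that each file or URL sits on its own line, that blank lines can be ignored, and that options can be tokenized with POSIX shell-style quoting rules.

```python
import shlex


def parse_book(text):
    """Parse new-format .book text into (version, options, files)."""
    # Assumption: blank lines carry no meaning and can be dropped.
    lines = [ln.rstrip() for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith("#HTMLDOC"):
        raise ValueError("missing #HTMLDOC header line")

    # The version number after #HTMLDOC is optional.
    header = lines[0].split()
    version = header[1] if len(header) > 1 else None

    # Collect the option line, joining continuation lines that
    # end with a trailing backslash.
    option_parts = []
    index = 1
    while index < len(lines):
        line = lines[index]
        index += 1
        if line.endswith("\\"):
            option_parts.append(line[:-1].strip())
        else:
            option_parts.append(line.strip())
            break

    # Assumption: shell-style tokenizing is adequate for the options.
    options = shlex.split(" ".join(option_parts))

    # Everything after the options is treated as a file or URL.
    files = [ln.strip() for ln in lines[index:]]
    return version, options, files
```

For a book file like the one above, the function would return the version string, a list of option tokens such as -f and htmldoc.pdf, and the list of HTML files in document order.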
Prior to HTMLDOC version 1.8.12, the book file format was slightly different:
#HTMLDOC version
file count
file(s)
options
While HTMLDOC still supports reading this format, we do not recommend using it for new books. In particular, when generating a document using the --batch option, some options may not be applied correctly since the files are loaded prior to setting the output options in the old format.
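An old-format book can be rewritten in the current layout by moving the options ahead of the file list. The Python sketch below is illustrative only. The function name convert_old_book is hypothetical, and the sketch assumes that the old format stores the header, the file count, each file, and the options on separate lines in that order.

```python
def convert_old_book(text):
    """Rewrite old-format .book text (header, count, files, options)
    in the current layout (header, options, files)."""
    # Assumption: each item is on its own line and blank lines are ignorable.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) < 2 or not lines[0].startswith("#HTMLDOC"):
        raise ValueError("not an old-format book file")

    # The second line holds the number of files that follow.
    count = int(lines[1])
    files = lines[2:2 + count]

    # Anything after the files is treated as options.
    options = " ".join(lines[2 + count:])

    return "\n".join([lines[0], options] + files) + "\n"
```

Writing the options before the files means they are set before any file is loaded, which avoids the batch-mode problem described above.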
This appendix provides the release notes for each version of HTMLDOC.