Pretty HTML
What is it?
Pretty HTML (phtml) is a simple tool that will reformat and word-wrap an HTML file.
Pretty HTML is not meant to accurately wrap text at a specific margin, nor is it prepared to handle grossly invalid HTML (such as <dd> tags without the enclosing <dl> tag -- there are actually some brain-dead tools out there that generate junk like that!!).
Pretty HTML removes superfluous spacing in tags, wraps text approximately at the margin that you specify (default is 75), and changes the indentation of the HTML to make the code more human-readable.
How to use it?
Pretty HTML is a post-processing tool. As such, you will use it on the completed page that WebLord (or any other program) has built. Let's assume this page is called MyPage.html and you would like to reformat this page to really impress those who look at the code and want to learn from it. This is the command line you'd use:
phtml FROM MyPage.html TO NewPage.html
Instead of FROM and TO (whose keywords are required whenever you name a file on the command line), you could also use I/O (Input/Output) redirection:
phtml <MyPage.html >NewPage.html
Pretty HTML understands two additional arguments:
phtml <MyPage.html >NewPage.html WIDTH 120 TAB 8
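If you call Pretty HTML from a script rather than by hand, the command lines above can be assembled programmatically. The following is only a sketch: the helper name build_phtml_command and its parameter names are made up for this illustration and are not part of Pretty HTML. It uses only the FROM, TO, WIDTH, and TAB keywords described in this document, and it only builds the argument list without running anything.

```python
from typing import List, Optional


def build_phtml_command(
    source: Optional[str] = None,
    target: Optional[str] = None,
    width: Optional[int] = None,
    tab: Optional[int] = None,
) -> List[str]:
    """Assemble a phtml argument list (hypothetical helper, not part of phtml)."""
    command = ["phtml"]
    if source is not None:
        command += ["FROM", source]  # when omitted, phtml reads from stdin
    if target is not None:
        command += ["TO", target]  # when omitted, phtml writes to stdout
    if width is not None:
        command += ["WIDTH", str(width)]
    if tab is not None:
        # The documented range for TAB is 0 through 8.
        if not 0 <= tab <= 8:
            raise ValueError("TAB must be between 0 and 8")
        command += ["TAB", str(tab)]
    return command


# Show the command line that would be executed.
print(" ".join(build_phtml_command("MyPage.html", "NewPage.html", width=120, tab=8)))
```

The resulting list could then be passed to subprocess.run(command, check=True), provided phtml is installed and on your PATH.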
Command line arguments
Each value must be preceded by its (capitalized) keyword; the parameters themselves are optional and may be given in any order on the command line.
- FROM filename
- The name of the input file. If you omit this parameter, Pretty HTML will read the page from stdin (standard input, i.e. the console) which you can redirect.
- TO filename
- The name of the processed output file. If you omit this parameter, Pretty HTML will write the resulting page to stdout (standard output, i.e. the console) which you can redirect.
- TAB tabsize
- A value ranging from 0 (zero) to 8. Zero is the default and tells Pretty HTML to use Smart Indentation. A value from 1 through 8 indents each context level by that number of spaces. Whenever possible, Pretty HTML collapses every 8 spaces into a single tab character.
Smart Indentation adjusts the number of characters used for contextual indentation according to the tag being indented. For example, <LI> would use 4 because that's how many characters the tag ( <...> ) occupies. <DIV> uses 5, <BLOCKQUOTE> uses 12.
- WIDTH rightmargin
- A value that represents the approximate right margin where text gets wrapped to a new line. It must be stressed that tags and their elements themselves are not wrapped; in addition, words can extend beyond the margin if they start before the wrap-column.
This "relaxed (lazy) wrapping" is far simpler to implement, requires less memory overhead, may result in speedier operation, and (most importantly) makes no difference whatsoever to the functionality and presentation of the web page.
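The TAB and WIDTH rules above can be illustrated with a rough sketch. This is not Pretty HTML's actual code; the function names are invented here, and two details are assumptions: that tab collapsing applies to leading indentation only, and that "starts before the wrap-column" means the word's first character falls before the margin.

```python
from typing import List


def smart_indent_width(tag_name: str) -> int:
    """Smart Indentation: the width of '<' + tag name + '>'."""
    return len(tag_name) + 2


def indent_prefix(columns: int) -> str:
    """Build an indentation prefix, collapsing every 8 spaces into one tab."""
    tabs, spaces = divmod(columns, 8)
    return "\t" * tabs + " " * spaces


def relaxed_wrap(text: str, width: int = 75) -> List[str]:
    """Lazy wrapping: a word stays on the line if it starts before the margin.

    Words that start before the wrap column may therefore extend past it.
    """
    lines: List[str] = []
    current = ""
    for word in text.split():
        if not current:
            current = word
        elif len(current) + 1 < width:
            # The word would start before the margin, so keep it on this line.
            current += " " + word
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines
```

With a margin of 5, relaxed_wrap("aaa bbb ccc ddd", width=5) yields two lines of seven characters each: every second word starts before the margin and is therefore allowed to run past it.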
How to use it from within WebLord?
The following inheritable properties can be defined in your SITE object or separately for each PAGE object in order to apply pretty formatting to your HTML:
- html-reformat = "yes";
- Should Pretty HTML be applied to the output?
- html-reformat-width = "-1";
- After what column should text be wrapped? A value of -1 disables this, producing lines of unrestricted length. A value somewhere around 70 is suggested.
- html-reformat-tabsize = "0";
- Each contextual level in the HTML is indented. A value of 0 disables all contextual indentation (everything gets smashed against the left margin); a value of 1 produces very sparse indentation. A value of 8 (the maximum) will use one tab for each indentation level. All values in between will use multiple spaces, which is wasteful in terms of the amount of data that needs to be transmitted to the client. Try to use 0, 1, or 8. Avoid 2 through 7. Every 8 spaces are collapsed to a tab, however, so using 4 might be a good compromise.
- html-reformat-tagcase = "lower";
- Should the HTML tags (and attribute keys) be forced to upper case, lower case, or left alone? Use "upper" or "lower"; any other value is ignored and will leave the case of the tags as your files provided them.
- html-reformat-quotepolicy = "strict";
- HTML 4.0 requires that ALL values in HTML attributes are quoted. It used to be OK to say <A href=http://cnn.com>...</a>, but to be correct it really should be <A href="http://cnn.com">...</a>.
- html-reformat-comments = "sloppy";
- To be strictly correct, HTML comments begin with <!-- followed by at least one whitespace; the comment is closed by a whitespace followed by -->. There are some HTML editors (or people) who think that the following is a valid comment when in fact it is not: <!--Hi-->. WebLord (and Pretty HTML) will accept the "sloppy" version of these "comments" if you specify the value "sloppy".
- html-reformat-augmentable = "yes";
- Inserts missing </TD> and </TR> tags in order to help browsers like Netscape eliminate ungainly whitespace in tables. This is an experimental feature that is new in 2.2.
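To make the quote policy and the comment setting more concrete, here is a small sketch of the two checks. It is not WebLord's implementation. The function names are invented, the regular expressions are simplifications, and it assumes that the "strict" quote policy means adding the missing quotes, which the description above implies but does not state outright.

```python
import re

# One attribute: leading whitespace, a name, "=", then a quoted or bare value.
# Quoted values are matched whole so that their contents are never rewritten.
ATTRIBUTE = re.compile(r'''(\s+[A-Za-z][\w-]*\s*=\s*)("[^"]*"|'[^']*'|[^\s"'>]+)''')

# Strict comments need whitespace after "<!--" and before "-->" (as described above).
STRICT_COMMENT = re.compile(r"<!--\s.*?\s-->", re.DOTALL)
SLOPPY_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def quote_attributes(tag: str) -> str:
    """Wrap bare attribute values in double quotes (assumed 'strict' behavior)."""

    def add_quotes(match: "re.Match[str]") -> str:
        prefix, value = match.group(1), match.group(2)
        if value[0] in "\"'":
            return match.group(0)  # already quoted, leave untouched
        return f'{prefix}"{value}"'

    return ATTRIBUTE.sub(add_quotes, tag)


def is_comment(text: str, policy: str = "strict") -> bool:
    """Decide whether text counts as a comment under the given policy."""
    pattern = SLOPPY_COMMENT if policy == "sloppy" else STRICT_COMMENT
    return pattern.fullmatch(text) is not None
```

Under these assumptions, quote_attributes turns <A href=http://cnn.com> into <A href="http://cnn.com">, and is_comment rejects <!--Hi--> unless the policy is "sloppy".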
Other uses of Pretty HTML
Pretty HTML ("software") was written by Udo Schuermann. Although it is copyrighted by me, all parties except those noted below have permission to use the software, and even to redistribute it with their own products, so long as the following are observed:
- No money is charged for the software, either explicitly or implicitly,
- The author (Udo Schuermann) is informed of any and all such redistributions,
- Explicit mention is made of the software and the author (Udo Schuermann) in the documentation of your product.
The following parties are explicitly prohibited from using, examining, storing, redistributing, or in any other way using or misusing the software:
- Microsoft
- Microsoft affiliates
- Microsoft employees
- Microsoft sycophants
Any and all violations will become the mysterious cause of further delays in Microsoft's operating system development cycle, mysterious appearances of Windows NT Blue Screen of Death (BSoD), and other mysterious, but rather serious shit.