Project.pod -- Overall documentation for the Web::Chain project. Version 0.8
The Web::Chain project is a set of OO modules that implement utilities for managing a particular type of hypertext, in which a series of pages (or "nodes") are connected in a fixed linear sequence (essentially a doubly linked list).
This is sometimes called a "browse sequence": it gives readers the option of reading straight through the content as though it were a book, one page after another.
The usual form of inheritance -- often called "implementation inheritance" -- has some well-known problems, to the point where the Gang of Four can just recite the slogan "inheritance breaks encapsulation" and expect everyone to know what they're talking about.
My take on this problem is that with a typical inheritance-based design, you end up with a tree where every branch becomes a single, massive logical unit. In order to make any use of a subclass at the bottom of a chain of "is a" relationships, you typically need to understand all of the classes above it. You end up with "modules" without modularity, at least from the point of view of someone trying to use the code later. On the other hand, inheritance can often make it easier to write variant versions of existing code, but that's a dangerous asymmetry: easy to write, but hard to understand later.
The general prescriptions to get around this problem are:
(a) use aggregation in preference to inheritance (My tentative rule-of-thumb: reserve implementation inheritance for quick fixes of early design errors).
(b) use "interface inheritance" rather than "implementation inheritance".
With "interface inheritance" an abstract base class is used to specify the behavior of a type of object, and different possible implementations of that behavior inherit this spec from the abstract class. This is intended to make it possible to write new forms of the object that existing code can use without modification.
It's often said that Perl "does not support interface inheritance", but that, of course, just sounds like a challenge to a real Perl programmer.
One of the purposes of this project was to implement a variety of interface inheritance in perl, and see how well it could be made to work.
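As a rough illustration of the idea (not the actual Web::Chain code), here is a minimal sketch of one way to approximate interface inheritance in Perl: the base class only names the methods, and dies if a subclass has not supplied them. The package names Sketch::Reader and Sketch::Reader::Upcase, and the read_node method, are hypothetical names invented for this example.

```perl
use strict;
use warnings;

# Hypothetical abstract base class: it only specifies the interface.
package Sketch::Reader;

sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}

# "Abstract" method: dies unless a subclass overrides it.
sub read_node {
    my ($self) = @_;
    die ref($self) . " must implement read_node\n";
}

# One concrete implementation of the interface (also hypothetical).
package Sketch::Reader::Upcase;
our @ISA = ('Sketch::Reader');

sub read_node {
    my ($self, $text) = @_;
    return uc $text;
}

package main;

# Client code relies only on the interface, not on any implementation.
sub process {
    my ($reader, $text) = @_;
    return $reader->read_node($text);
}

my $result = process( Sketch::Reader::Upcase->new, 'hello' );

# Using the abstract class directly fails at run time.
my $base_worked = eval { process( Sketch::Reader->new, 'hello' ); 1 };
my $error       = $@;
```

The check happens only at run time, when the missing method is actually called; that is the price of doing this in a language with no compile-time notion of an interface.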
In a typical OOP design, each object has an associated type, and along with it you get a set of methods designed to work on this type. While a single choice (e.g. the data format you intend to work on) can be easily encoded as a single object type, it's less clear what the right way is to achieve polymorphism when there are two (or more) independent choices that need to be made (e.g. in our case, the input and the output data formats).
One way of doing it might be multiple inheritance, but that's frowned on even more strongly than implementation inheritance.
It could be done with two layers of inheritance:
output_enabled_object "is a",
input_enabled_object "is a",
storage_and_manipulation_object
But long chains of inheritance can be clumsy (as is discussed above).
Instead, this design works with "has a" relationships: each Web::Chain::IO object has an input handle and an output handle, though the actual code that implements these handles is loaded dynamically when the choice of data format is made. The Web::Chain::IO class just specifies the interface.
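The "has a" arrangement described above might be sketched along the following lines. This is an illustration under stated assumptions, not the real Web::Chain::IO: the packages Sketch::IO, Sketch::Format::Rawtext and Sketch::Format::Html, and the method names, are all hypothetical. To keep the example self-contained the handler classes are defined inline; in a real layout each would presumably live in its own file and be pulled in with require() once the format is known.

```perl
use strict;
use warnings;

# Hypothetical format handlers, defined inline for this sketch only.
package Sketch::Format::Rawtext;
sub new   { return bless {}, shift }
sub label { return 'rawtext' }

package Sketch::Format::Html;
sub new   { return bless {}, shift }
sub label { return 'html' }

# The IO object "has a" input handler and "has a" output handler.
package Sketch::IO;

sub new {
    my ($class, %args) = @_;
    my $self = bless {}, $class;
    $self->{input}  = _handler_for( $args{input_format} );
    $self->{output} = _handler_for( $args{output_format} );
    return $self;
}

# Turn a format name into a handler object, chosen at run time.
sub _handler_for {
    my ($format) = @_;
    my $handler_class = 'Sketch::Format::' . ucfirst( lc $format );
    die "No handler for format '$format'\n"
        unless $handler_class->can('new');
    return $handler_class->new;
}

sub describe {
    my ($self) = @_;
    return $self->{input}->label . ' -> ' . $self->{output}->label;
}

package main;

my $io = Sketch::IO->new(
    input_format  => 'rawtext',
    output_format => 'html',
);
my $summary = $io->describe;
```

The two choices (input format, output format) stay independent: supporting a new format means adding one handler class, not a new branch in an inheritance tree for every input/output combination.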
The Node class is a variation of the "flyweight" pattern: a given node may be contained by different chain objects, but it is always the *same* node, not a copy. Note that since the next and previous linkage is implemented at the Node level, there are limits on how different one Chain can be from another if both include some of the same Nodes.
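The consequence of sharing nodes can be seen in a very small sketch. The data layout here (plain hashes with name and next keys, chains as arrays of references) is an assumption made for illustration only, not the layout the Node class actually uses.

```perl
use strict;
use warnings;

# Hypothetical node records: plain hashes holding a name and a next link.
my %node_a = ( name => 'A', next => undef );
my %node_b = ( name => 'B', next => undef );

# Two chains that both contain node A: each stores a reference, not a copy.
my @chain_one = ( \%node_a, \%node_b );
my @chain_two = ( \%node_a );

# Changing the linkage through one chain...
$chain_one[0]{next} = 'B';

# ...is visible through the other, because it is the same node.
my $seen_from_two = $chain_two[0]{next};
my $same_node     = $chain_one[0] == $chain_two[0];   # compares references
```

Since the link lives on the node itself, the second chain cannot give node A a different successor without also changing what the first chain sees.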
The interface/implementation division here bears a strong resemblance to the "Strategy" pattern, in that the higher-level commands may access different implementations in the lower-level code, depending on circumstances.
This project is intended to facilitate processing a linear chain of nodes of information, commonly (though not necessarily) a series of web pages joined in a sequence by "next" and "previous" links.
In principle a hypertext document can be thought of as a series of nodes (or "pages") connected in an arbitrary manner, but early experimentation with hypertext systems showed a tendency for users to feel "lost in hyperspace". They would often wander aimlessly without any sense of where they were in the overall structure.
A simple thing that can be done to combat this is to organize a hypertext into a linear sequence (i.e. a "browse sequence").
In the ideal case clicking on the "Next" link should take you to a page with a thematic connection to the page you've just been reading, but this is not strictly necessary (and not always possible: webs don't easily convert to linear form).
The authors and editors working on such projects need tools to manipulate these chains of nodes. Consider what needs to be done when you add a new page at some point in the sequence: the "next" and "previous" links in the new page must point at two existing nodes, and each of those nodes has a link that must also be updated to point at the new node. That makes four link updates in total, in three pages. Moving a single page from one place to another in the sequence involves six such link updates, in five pages (six pages, if the segment being moved is longer than one page).
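The bookkeeping for inserting a page can be sketched in a few lines of Perl. The in-memory layout (a hash of nodes keyed by page name, each with prev and next entries) and the function name insert_between are assumptions for illustration; the real modules work on files rather than on a hash like this.

```perl
use strict;
use warnings;

# Hypothetical in-memory nodes, keyed by page name.
my %nodes = (
    intro => { prev => undef,   next => 'body' },
    body  => { prev => 'intro', next => undef  },
);

# Insert $new between $before and $after; returns the number of link updates.
sub insert_between {
    my ($nodes, $new, $before, $after) = @_;
    my $updates = 0;

    $nodes->{$new}{prev}    = $before; $updates++;   # new page, "previous" link
    $nodes->{$new}{next}    = $after;  $updates++;   # new page, "next" link
    $nodes->{$before}{next} = $new;    $updates++;   # earlier neighbor
    $nodes->{$after}{prev}  = $new;    $updates++;   # later neighbor

    return $updates;
}

my $count = insert_between( \%nodes, 'aside', 'intro', 'body' );
```

Here $count comes out as 4, and three entries of %nodes have been touched: the new page and its two neighbors.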
These operations can be done manually, but they are far too tedious and error-prone to want to do very many of them that way. The goal here is to automate the process in a convenient way.
My personal interest in this problem arises from a long-standing hypertext writing project I'm engaged in, which goes by the unimpressive name of "the doomfiles" (I've used "doom" as a handle for a long time -- much longer than the silly game has existed -- e.g. I've been known to use "The Voice of Doom" as a college radio airname).
This project might be compared to a "personal blog", but it predates the blog era; in fact, it predates the web era. It was originally just a large file without read-protection, whose intended audience was other users of Stanford's unix systems. Hypertext links were then implemented simply as text searches on upper-case keywords.
I still use this "rawtext" format for writing new DF material, so it's one of the primary hypertext formats (along with HTML) that the initial version of this project needed to support.
Later (after the web was invented) I converted this text format to a series of interconnected web-pages.
So yes, "Rawtext" is yet another alternative method of creating HTML files by writing in some "simpler" text-like format, a la all the different "wikis".
In my defense:
The rules of the original 'rawtext' source format are simple: A link is an upper-case term surrounded by whitespace. The destination it jumps to is the same upper-case term flush-left (immediately following a horizontal divider line of equal-signs, or the top of the raw text file).
A rawtext file, then, might look something like the following.
__________________________________________________
|TOP |
| |
| Some discussion of |
| what this thing is |
| you're looking at. HISTORY |
| |
| |
|=== |
|FIRST_THOUGHT |
| |
| Musing away already |
| Muse muse muse... But what |
| Music or museums? about? CAVEATS |
| |
|=== |
|CAVEATS |
| |
| Hedging now better |
| than excuses later. |
| |
|=== |
|BIG_TOPIC |
| |
| Getting into |
| something bit Or relatively ASIDE |
| large at any rate. |
| Something |
| important. |
| |
| Another important |
| thought... TANGENTIAL_RAMBLE |
| |
|=== |
|ASIDE |
| |
| Does big really matter? |
| |
|=== |
|HISTORY |
| |
| How this got |
| started. |
| |
|=== |
|NOTES_FOR_THE_FUTURE |
| |
| Flesh this out |
| cut that down. |
| |
|=== |
|TANGENTIAL_RAMBLE |
| |
| Something completely |
| irrelevant, as though CAVEATS |
| that were unusual. |
| |
| |
|=== |
|FIN |
| |
| See you. |
| |
|__________________________________________________|
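The link rules given above are simple enough to sketch in a short script. This is only an illustration, not the project's parser, and it makes some assumptions the rules do not spell out: an "upper-case term" is taken to be two or more characters drawn from A-Z, digits and underscore, and the end of a line counts as whitespace. The sample text is a cut-down version of the example file, without the box drawn around it.

```perl
use strict;
use warnings;

my $rawtext = <<'END';
TOP

   Some discussion of
   what this thing is
   you're looking at.     HISTORY

===
HISTORY

   How this got
   started.
END

my (%anchor_line, @links);
my $expect_anchor = 1;    # the top of the file counts like a divider
my $current;
my $line_no = 0;

for my $line ( split /\n/, $rawtext ) {
    $line_no++;
    if ( $line =~ /^=+\s*$/ ) {                # divider line of equal signs
        $expect_anchor = 1;
        next;
    }
    if ( $expect_anchor && $line =~ /^([A-Z][A-Z0-9_]+)\s*$/ ) {
        $current = $1;                          # flush-left destination term
        $anchor_line{$current} = $line_no;
        $expect_anchor = 0;
        next;
    }
    $expect_anchor = 0;
    # Assumed link pattern: an upper-case term with whitespace on both sides.
    while ( $line =~ /(?<=\s)([A-Z][A-Z0-9_]+)(?=\s|$)/g ) {
        push @links, [ $current, $1 ];          # [ from node, to node ]
    }
}

# Links whose destination never appears as a flush-left term.
my @dangling = grep { !exists $anchor_line{ $_->[1] } } @links;
```

For this sample the script records two destinations (TOP and HISTORY), one link from TOP to HISTORY, and no dangling links.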
For our purposes, "Html" format is more rigidly defined than the usual HTML page would be. The Html format has a header with a standard layout, including a "PREV" link to the previous page in the sequence, and a standard footer with a "NEXT" link to the next page in the sequence. The main body of content is embedded in <PRE></PRE> tags, so that white space is significant: this allows the use of graphical layout to suggest the finer-grained "hyperlinks" between each paragraph of the content. Currently, the Web::Node module has no understanding of this fine-grained content; it's just treated as a blob.
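To make the shape of such a page concrete, here is a minimal sketch that wraps a blob of content in a PREV header, a PRE block and a NEXT footer. The exact markup is an assumption: the real header and footer layout is not specified here, and the function names render_page and escape_html, as well as the ".html" file naming, are hypothetical.

```perl
use strict;
use warnings;

# Escape the characters that would otherwise be read as markup inside <PRE>.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Assumed page layout: PREV link, content in <PRE>, NEXT link.
sub render_page {
    my (%page) = @_;
    my $body = escape_html( $page{content} );
    return join "\n",
        qq{<A HREF="$page{prev}.html">PREV</A>},
        '<PRE>',
        $body,
        '</PRE>',
        qq{<A HREF="$page{next}.html">NEXT</A>},
        '';
}

my $html = render_page(
    prev    => 'TOP',
    next    => 'CAVEATS',
    content => "   Musing away already\n   Music & museums?     CAVEATS",
);
```

The content is passed through untouched apart from escaping, which matches the "blob" treatment described above: the spacing inside the PRE block is preserved exactly.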
My older scripts for processing doomfiles rawtext and transforming it into html were always a little buggy and brittle -- hence this more careful re-write, trying to apply some more modern design principles.
If the job is done right, it should facilitate transforming this project into other data formats in the future. Possibly the HTML will become XHTML, possibly a reason might emerge to switch to a database-backed design, etc.
Future development plans include:
Joseph Brenner, <doom@kzsu.stanford.edu>
Copyright (C) 2004 by Joseph Brenner
This document is part of a free software project; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.2 or, at your option, any later version of Perl 5 you may have available.
None reported... yet.