November 9, 2004
What's the big deal about converting pod to html? Isn't that a
solved problem by now? Of course. Many times. And
an unsolved problem is only a little worse than a
problem solved too many times...
My mission, should you choose to accept it (and I don't know why you would), was to extract the pod from a project with multiple *.pm/*.pod files and convert it into a tree of html files. The main feature I wanted was conversion of L<> links to other modules in the project into relative html links. In general, I wanted to do this relatively simple job without putting too much work into the process.
If you're looking for a quick recommendation, I settled on
Pod::Simple::HTMLBatch,
which did the job without too many rough spots. My impression is that Sean M. Burke is the
man to watch in this field, and with any luck he will succeed in what seems like
the cursed project of replacing Pod::Html...
Also, Mark Overmeer looks like he's got some interesting ideas going with his OODoc (some custom extensions to pod markup to deal with large OO projects).
Pod::Html / pod2html by Tom Christiansen, included in the perl core

Description: The original. Works on one file at a time, handles cross-references as absolute hrefs (prefix set with --htmlroot).

Advantages: Universally available. The first thing everyone looks at.

Disadvantages: No one seems to like it, all agree it needs revision, and starting from scratch is preferred to maintenance. Outputs html that doesn't validate (if I remember right). File-oriented, making it less useful with large projects.
SYNOPSIS

    use Pod::Html;
    pod2html("pod2html",
             "--podpath=lib:ext:pod:vms",
             "--podroot=/usr/src/perl",
             "--htmlroot=/perl/nmanual",
             "--libpods=perlfunc:perlguts:perlvar:perlrun:perlop",
             "--recurse",
             "--infile=foo.pod",
             "--outfile=/perl/nmanual/foo.html");
pods2html by Steven McDougall, uses Pod::Tree::HTML, HTML::Stream

Description: pods2html is a front end for Pod::Tree::HTML, which does the actual translation one file at a time (though internally I believe it redundantly traverses the entire tree to do link resolution correctly). The pods2html script uses File::Find to traverse the tree of pod (PODdir), and writes the output to a parallel tree of html (HTMLdir).

Advantages: Works on a directory tree (not file-oriented). Does the main job I was after: turns L<> links into relative html links.

Disadvantages: The relative links it generates are a little peculiar, going all the way back up to the package root before dropping back down to the destination. It has a --base option which is supposed to be like the --htmlroot of pod2html, but doesn't seem to do anything.
SYNOPSIS

    pods2html [--base url] [--css url] [--index title] [--[no]toc] [--hr level]
              [--bgcolor #rrggbb] [--text #rrggbb] PODdir HTMLdir
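The difference between a shortest relative link and the roundabout up-to-the-root-and-back-down style complained about above can be sketched in a few lines of Perl. This is only an illustration, not code taken from pods2html or any other module reviewed here: the helper names (page_parts, shortest_href, via_root_href) and the mapping of My::Module to My/Module.html are assumptions made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: split a package name into the directories leading to
# its page and the page's file name, assuming My::Module -> My/Module.html.
sub page_parts {
    my ($module) = @_;
    my @dirs = split /::/, $module;
    my $file = pop(@dirs) . '.html';
    return (\@dirs, $file);
}

# Shortest relative href from the page for $from to the page for $to.
sub shortest_href {
    my ($from, $to) = @_;
    my ($from_dirs)       = page_parts($from);
    my ($to_dirs, $file)  = page_parts($to);
    my @up   = @$from_dirs;
    my @down = @$to_dirs;

    # Drop the leading directories that both pages share.
    while (@up && @down && $up[0] eq $down[0]) {
        shift @up;
        shift @down;
    }
    return join '/', ('..') x @up, @down, $file;
}

# The roundabout style: climb all the way to the root, then descend again.
sub via_root_href {
    my ($from, $to) = @_;
    my ($from_dirs)      = page_parts($from);
    my ($to_dirs, $file) = page_parts($to);
    return join '/', ('..') x @$from_dirs, @$to_dirs, $file;
}

print shortest_href('My::Module::Util', 'My::Module::Base'), "\n";  # prints Base.html
print via_root_href('My::Module::Util', 'My::Module::Base'), "\n";  # prints ../../My/Module/Base.html
```

Both hrefs resolve to the same page as long as the html tree mirrors the package hierarchy; the shorter form just survives better when a subtree is moved around as a unit.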
Pod::Tree::HTML 1.10 by Steven McDougall, uses Pod::Tree::PerlPod, Pod::Tree::PerlMap

Description: Quoted from the documentation for Pod::Tree::PerlPod:

"Pod::Tree::PerlPod" translates Perl PODs to HTML. It does a recursive subdirectory search through $perl_dir to find PODs. "Pod::Tree::PerlPod" generates a top-level index of all the PODs that it finds, and writes it to HTML_dir"/pod.html". "Pod::Tree::PerlPod" generates and uses an index of the PODs that it finds to construct HTML links. Other modules can also use this index.

From the documentation for "Pod::Tree::HTML":

"Pod::Tree::HTML" reads a POD and translates it to HTML. The source and destination are fixed when the object is created. Options are provided for controlling details of the translation. The "translate" method does the actual translation.

Advantages: Quoting the docs again: "Pod::Tree::PerlPod" indexes PODs by the base name of the POD file. To link to perlsub.pod, write L<perlsub>

Disadvantages: As of this writing, PerlPod.pm does not pass perl -c as shipped (Doh!):

    "my" variable $pods masks earlier declaration in same scope at ... PerlPod.pm line 151.
SYNOPSIS

    $perl_map = new Pod::Tree::PerlMap;
    $perl_pod = new Pod::Tree::PerlPod $perl_dir, $HTML_dir, $perl_map, %opts;

    $perl_pod->scan;
    $perl_pod->index;
    $perl_pod->translate;

    $top = $perl_pod->get_top_entry;

    # Alternately:
    use Pod::Tree::HTML;
    $source = "file.pod";
    $dest   = "file.html";
    $html   = new Pod::Tree::HTML $source, $dest, %options;
    $html->translate;
Pod::HtmlTree 0.97 by Mike Schilli, uses Pod::Html

Description: Quoting the docs: ... like to navigate between all those manual pages in your distribution and even view their source code? ... traverses your module's distribution directory finds all *.pm files recursively and calls "pod2html()" on them, hereby resolving all POD links (L<...> style). It then saves the nicely formatted HTML files under "docs/html" and updates each "SEE ALSO" section to contain links to every other *.pm file in your module's distribution.

Advantages: Automatically fills in a blank SEE ALSO section for you (a nice touch). That includes a link to the source code itself (which would be a very nice touch if it worked, but it doesn't seem to link to the right place).

Disadvantages: Always puts the output in a subdirectory of the pod directory called "docs/html". Uses pod2html internally, so the quality of the html output is not likely to be great. Fixing up html links later in post-processing seems a little cheesy to me (I'm looking for decent pod processing modules so I can avoid doing things like that). Doesn't do relative html links; it generates funky HREF strings with a "/./" stuck in after the $htmlroot string. If $htmlroot is blank, the HREFs begin with "/" and that isn't good for much. Processes L<> links even if they're indented (technically that should mean it's a code block deserving <PRE></PRE> tags).
SYNOPSIS

    use Pod::HtmlTree qw(pod2htmltree);
    pod2htmltree($httproot);
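The complaint about indented L<> links comes down to one rule of pod: a paragraph that begins with whitespace is verbatim text, so nothing inside it should be treated as a link. Here is a minimal sketch of honoring that rule when hunting for links. It is not how Pod::HtmlTree or any real pod parser works; the function name links_outside_verbatim is made up, the input is assumed to be pure pod (no surrounding Perl code), and only simple L<...> codes without nested formatting are recognized.

```perl
use strict;
use warnings;

# Hypothetical helper: return the contents of every simple L<...> code that
# appears in an ordinary paragraph, skipping verbatim (indented) paragraphs.
sub links_outside_verbatim {
    my ($pod) = @_;
    my @links;

    # Pod paragraphs are separated by one or more blank lines.
    for my $para (split /\n(?:[ \t]*\n)+/, $pod) {
        next if $para =~ /^[ \t]/;    # verbatim paragraph: leave it alone
        push @links, $para =~ /L<([^<>]+)>/g;
    }
    return @links;
}

my $pod = join "\n\n",
    "=head1 SEE ALSO",
    "See L<My::Module> for details.",
    "    print 'L<Not::A::Link>';",
    "Also L<perlpod>.";

print join(', ', links_outside_verbatim($pod)), "\n";  # prints My::Module, perlpod
```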
Pod::Simple::HTML by Sean M. Burke, uses Pod::Simple and Pod::Simple::PullParser

Description: Pod::Simple::PullParser is described: This class is for using Pod::Simple to build a Pod processor -- but one that uses an interface based on a stream of token objects, instead of based on events.

Advantages: (none listed)

Disadvantages: The documentation has even more TODOs scattered around than in Pod::Simple::HTMLBatch.
SYNOPSIS

    TODO (sic)

    perl -MPod::Simple::HTML -e \
      "exit Pod::Simple::HTML->filter(shift)->errors_seen" \
      thingy.pod
DocSet by Stas Bekman, uses Pod::POM

Description: From the documentation for DocSet: This package builds a docset from sources in different formats. The generated documents can be all nicely interlinked and to have the same look and feel. Currently it knows to handle input formats:

* POD
* HTML

and knows to generate:

* HTML
* PS
* PDF

Advantages: Used in production on a high visibility project with massive amounts of documentation: mod_perl. Can merge together project docs with source written in pod and/or html.

Disadvantages: Documentation is weak. You're expected to clone the "examples" directory and figure it out. The configuration file is perl code that defines a massive hash of complex structures. You need to edit a lot of Template Toolkit templates manually to get something that doesn't look like the mod_perl web site.
SYNOPSIS

    docset_build [options] base_full_path relative_to_base_confer_file_location

    Options:
      -h    this help
      -v    verbose
      -i    podify pseudo-pod items (s/^* /=item */)
      -s    create the splitted html version (not implemented)
      -t    create tar.gz (not implemented)
      -p    generate PS file
      -d    generate PDF file
      -f    force a complete rebuild
      -a    print available hypertext anchors (not implemented)
      -l    perform L<> links validation in pod docs
      -e    slides mode (for presentations) (not implemented)
      -m    executed from Makefile (forces rebuild, no PS/PDF file, no tgz archive!)
Pod::Simple::HTMLBatch by Sean M. Burke, uses Pod::Simple

Description: Not a sub-class of Pod::Simple::HTML. Works on a tree of files ("Batch").

Advantages: Recently written, under active maintenance. Sean M. Burke hangs out on pod-people@perl.org and answers questions. Does the main job I was after: turns L<> links into relative html links with minimal fuss. Used by the search.cpan.org site.

Disadvantages: The documentation is a little sketchy (as of this writing, there are lots of gaps labeled TODO). The appearance of the contents page it generates isn't great: one alphabetized linear list for each top-level directory (I just link past it to use an introductory page of my own). Options do exist to customize (or skip) the contents page generation (I haven't tried them, myself). There's a bunch of javascript folderol for doing user selectable style sheets, the supreme coolness of which is not immediately apparent to me. (There's a partially documented option, $batchconv->add_css( $url ), which I would guess would override that stuff.)
SYNOPSIS

    use Pod::Simple::HTMLBatch;
    my $batchconv = Pod::Simple::HTMLBatch->new;
    $batchconv->verbose(3);
    $batchconv->batch_convert( [ $in_dir ], $out_dir );

From the command line (to get html format docs for all installed modules):

    mkdir html_docs; cd html_docs
    perl -mPod::Simple::HTMLBatch -ePod::Simple::HTMLBatch::go @INC .
OODoc 0.90 by Mark Overmeer (a large but self-contained bundle)

Description: "extends pod with some keywords to be able to document error messages, inheritance, and examples (it is the step from visual markup to logical markup) in your code, but can also accept plain pod..."

Advantages: Could be a good idea: it's occurred to me that pod and OOP aren't the best mix (e.g. documentation for a sub-class doesn't describe inherited methods, so you need to look elsewhere to learn about all of them, following the inheritance chain up the docs).

Disadvantages: Haven't played with it yet. Markup uses special extensions of its own, which may slow adoption.
SYNOPSIS See OODoc::Parser::Markov
Marek::Pod::HTML by Marek Rouchal, uses Pod::Parser and Pod::Checker (both in the core library); and HTML::Entities, HTML::TreeBuilder

Description: Written about five years ago as a candidate to replace Pod::HTML (and pod2html); works on one or more files; has optional ToC generation. Comes with a "mpod2html" script that acts as a front end. From the documentation for "mpod2html": An important note: mpod2html will cross-link only those documents that are processed in one conversion session. The benefit is that you will get only working hyperlinks, no "dead ends". The downside is that you cannot simply convert one additional Pod and everything will be nicely cross-linked.

Advantages: I gather the author was trying to play nice with the community, and was working in the then current style (e.g. using the Pod::Parser module, officially blessed into the core library). He's succeeded in getting a number of related modules placed in the perl core.

Disadvantages: And yet, this module seems to have been abandoned (it still resides in the preliminary "Marek" namespace). It requires that you give it both the file system version of a module name and also the package name defined inside the file (usually right at the top)... why doesn't it just read the package name for you? That task was pushed out to the mpod2html script, which makes the module itself much less practical to work with on its own.
SYNOPSIS

    use Marek::Pod::HTML;
    pod2html( { -dir => 'html' },
              { '/usr/lib/perl5/Pod/HTML.pm' => 'Pod::HTML' });

Alternately:

    mpod2html [ -converter module ] [ -suffix suffix ] [ -filesuffix suffix ]
        [ -dir path ] [ -libpods pod1,pod2,... ]
        [ -(no)localtoc ] [ -(no)navigation ]
        [ -(no)toc ] [ -tocname filename ] [ -toctitle title ]
        [ -(no)idx ] [ -idxopt options ] [ -idxname filename ] [ -idxtitle title ]
        [ -(no)ps ] [ -psdir path ] [ -psfont font ] [ -papersize format ]
        [ -(no)inc ] [ -(no)script ] [ -(no)warnings ] [ -(no)verbose ] [ -(no)banner ]
        [ -stylesheet link ]
        [ dir1 , dir2 , ... ] [ pod1 , pod2 , ... ]
Pod::POM by Andy Wardley, uses Pod::POM::Constants, Pod::POM::Nodes and Pod::POM::View::Pod

Description: Quoting the docs: This module implements a parser to convert Pod documents into a simple object model form known hereafter as the Pod Object Model. The object model is generated as a hierarchical tree of nodes, each of which represents a different element of the original document. The tree can be walked manually and the nodes examined, printed or otherwise manipulated. In addition, Pod::POM supports and provides view objects which can automatically traverse the tree, or section thereof, and generate an output representation in one form or another. A script is provided for converting Pod documents to other format by using the view objects provided. The pom2 script should be called with two arguments, the first specifying the output format, the second the input filename. ...

Advantages: Used by Stas Bekman's DocSet. Andy Wardley himself explains that Pod::POM's advantage is "flexibility in being able to customise the generated output". Cute touch: if you replace "pom2" with symlinks pom2html and pom2text, it will determine the output format from the name of the symlink.

Disadvantages: I almost missed this one: it may be condemned to obscurity because it doesn't have "Html" in its name. Doesn't seem to make any attempt at converting L<> linkage to html links. File oriented: you need to write your own recursive descent and filename crunching code.
SYNOPSIS

    $ pom2 text My/Module.pm > README
    $ pom2 html My/Module.pm > ~/public_html/My/Module.html

    # Alternately:
    use Pod::POM;

    my $parser = Pod::POM->new(\%options);

    # parse from a text string
    my $pom = $parser->parse_text($text)
        || die $parser->error();

    # parse from a file specified by name or filehandle
    my $pom = $parser->parse_file($file)
        || die $parser->error();

    # parse from text or file
    my $pom = $parser->parse($text_or_file)
        || die $parser->error();

    # examine any warnings raised
    foreach my $warning ($parser->warnings()) {
        warn $warning, "\n";
    }

    # print table of contents using each =head1 title
    foreach my $head1 ($pom->head1()) {
        print $head1->title(), "\n";
    }

    # print each section
    foreach my $head1 ($pom->head1()) {
        print $head1->title(), "\n";
        print $head1->content();
    }

    # print the entire document as HTML
    use Pod::POM::View::HTML;
    print Pod::POM::View::HTML->print($pom);

    # create custom view
    package My::View;
    use base qw( Pod::POM::View::HTML );

    sub view_head1 {
        my ($self, $item) = @_;
        return '<h1>',
               $item->title->present($self),
               "</h1>\n",
               $item->content->present($self);
    }

    package main;
    print My::View->print($pom);
There are more ways to do it than are dreamt of in your philosophies, Horatio:
Let us turn back the clock to the golden years of the perl 5.6 era:
perldelta - what's new for perl v5.6.x: As of release 5.6.0 of Perl, Pod::Parser is now the officially sanctioned "base parser code" recommended for use by all pod2xxx translators. Pod::Text (pod2text) and Pod::Man (pod2man) have already been converted to use Pod::Parser and efforts to convert Pod::HTML (pod2html) are already underway.

Too bad something went wrong with that effort...
But hey, at least now it's on the todo list:
en-5.8.5 - perltodo: "POD -> HTML conversion still sucks. Which is crazy given just how simple POD purports to be, and how simple HTML can be."
Neglect not the perl pod-people list, if you have any interest in this subject at all: pod-people archive. Some miscellaneous messages quoted from perl.pod-people follow...
Some promises from Sean M. Burke, from just this year (2004):
From: sburke[at]cpan.org (Sean M. Burke)
Date: Thu, 15 Jul 2004 12:28:46 -0800
Subject: Re: perltodo - POD - HTML conversion still sucks (I think that not!)

Michael G Schwern wrote:
>Instead of just putting in a new POD -> HTML conversion module and leaving
>the old one around, consider gutting POD::Html and making it a thin
>wrapper around a cleaner module (such as POD::HtmlEasy). Clean up the
>old messes.

I've already got that basically done for the next release of Pod-Simple.
Jumping back a few years, to 2002:
From: rra[at]stanford.edu (Russ Allbery)
Date: Mon, 22 Jul 2002 00:30:01 -0700
To: Dave Storrs
Subject: Re: Pod::Html question: L<> with text

Dave Storrs writes:
> But whenever I try that, it tells me that it "could not resolve link"
> and spits up (instead of a link, I just get <EM> tags). I poked at the
> source a bit and then wrote the following patch to Pod::Html.pm, but it
> seems like I'm probably missing something. Is there a better way to do
> this?

The problem that you're running into is that pod2html in the Perl distribution needs some serious loving attention. It's currently quite a bit behind the curve compared to pod2text or pod2man.

I've been tempted a few times to write a new one for my own purposes, as none of the other POD to HTML translators out there quite do what I want, but as there are already something like four of them, I've held off since it feels like a waste of energy. Parsing POD into HTML can require different techniques than the translators I've already written and is better suited for a tree-style parse, and there are apparently new POD parsers coming up that will make that easier.

In the meantime, if you're just trying to convert your own documents, I'd poke around in CPAN and try one of the other POD to HTML translators and see if they work better. If you're trying to get things working for people who are just using the version that comes with Perl, you may be out of luck for the time being, but with any luck a better converter will be in the next release of Perl. (It's possible people fixed some things for 5.8.0, but I'm pretty sure that pod2html is still rather behind the curve.)

===

From: sburke[at]cpan.org (Sean M. Burke)
Date: Mon, 22 Jul 2002 03:11:50 -0600
To: Russ Allbery <rra[at]stanford.edu>, Dave Storrs <dstorrs[at]dstorrs.com>
Subject: Re: Pod::Html question: L<> with text

Russ Allbery wrote:
>[...]The problem that you're running into is that pod2html in the Perl
>distribution needs some serious loving attention.[...]

Well, not so much attention, as total replacement. The current Pod::Html code is, frankly, the worst (semi-)working code that I've ever seen written in any high-level language -- with the one exception of the "Universal Bulletin Board" source code.

Incidentally, now that my book is finally done, I've been poking at my mostly-done perlpodspec-compliant Pod parser (to replace Pod::Parser), and it's going surprisingly well. The first thing that I mean to do with it (as a proof of concept, notably) is write a new pod2html. I think I mentioned this a few days ago, so sorry if I'm repeating myself.

>[...]Parsing POD into HTML can require different techniques than the
>translators I've already written and is better suited for a tree-style
>parse, and there are apparently new POD parsers coming up that will make
>that easier.[...]

Yes, much much easier -- as easy as it should have been from the beginning! The new Pod parser essentially makes the difference between a tree view and a token view a merely superficial interface question, instead of a substantial difference.

===

From: sburke[at]cpan.org (Sean M. Burke)
Date: Mon, 22 Jul 2002 05:06:25 -0600
To: Dave Storrs <dstorrs[at]dstorrs.com>, <pod-people[at]perl.org>
Subject: Re: Pod::Html question: L<> with text

Dave Storrs wrote:
>[...]From what I see in the man page, I should be able to do the following:
> Please click L<here|http://archive.develooper.com> for the
> archives
>[...]

Actually, no; that L<text|scheme:...> syntax is expressly forbidden. As perlpod says:

<<
Or you can link to a web page:

* L<scheme:...>
  Links to an absolute URL. For example, L<http://www.perl.org/>. But note that there is no corresponding L<text|scheme:...> syntax, for various reasons.
>>

If your perlpod doesn't say that, see
http://public.activestate.com/cgi-bin/perlbrowse?filename=pod%2Fperlpod.pod&action=print
or, the real scary stuff:
http://public.activestate.com/cgi-bin/perlbrowse?filename=pod%2Fperlpodspec.pod&action=print

So instead of anything involving the forbidden L<here|http://...> syntax, try something like:

The archives are at L<http://archive.develooper.com>
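If you want to find out whether your own pod uses the L<text|scheme:...> form that the quoted perlpod text rules out, a rough scan is easy to write. This is a sketch under stated assumptions, not part of any pod tool: the function name forbidden_links is made up, it applies only the rule as quoted in the message above, and it handles only simple L<...> codes with no nested formatting codes.

```perl
use strict;
use warnings;

# Hypothetical helper: return every L<text|scheme:...> code in a chunk of pod.
# A single colon after the leading word marks a URL scheme; the (?!:) keeps
# module names such as My::Module from being mistaken for one.
sub forbidden_links {
    my ($pod) = @_;
    return $pod =~ m{(L<[^<>|]+\|[A-Za-z][A-Za-z0-9+.-]*:(?!:)[^<>]*>)}g;
}

my $pod = 'Please click L<here|http://archive.develooper.com> for the archives. '
        . 'See L<the docs|My::Module> and L<http://www.perl.org/>.';

print "$_\n" for forbidden_links($pod);  # prints L<here|http://archive.develooper.com>
```

Plain L<scheme:...> links and L<text|Module> links pass untouched, so only the combination that the message warns about is reported.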