More 2 Cent Tips


PerlHoo rocks

Thu, 11 Dec 2003 15:07:49 -0800
By Rick Moen (rick@linuxmafia.com)

Some folks will have noticed me referring people to flat ASCII files I've squirreled away over the years on my Web server, usually inside http://linuxmafia.com/~rick/linux-info . While useful, this collection has always been (1) butt-ugly and (2) disorganised.

I've long realised I needed some sort of proper Web framework for all that material, and Rob Tougher's work updating the Gazette's HTML showed me how much improvement the addition of cascading stylesheets (CSS) can bring with only modest effort[1]. All of these thoughts came together when I ran across PerlHoo, a Yahoo-like Web directory system implemented in two simple Perl CGI scripts.

Please see the description by author Jonathan Eisenzopf <eisen@pobox.com> in his series of three articles at Mother of Perl, http://www.webreference.com/perl/tutorial (recommended reading).

PerlHoo is simple, malleable, lightweight, fast (up to some thousands of documents per directory), and can point to URLs on or off your system. Its design limitations are:

If you need those things, there's a follow-on called PHPhoo. Personally, I neither wanted nor needed them, and PerlHoo's exactly right for my needs.

There were two minor problems with Eisenzopf's design, as I found it in his most-recent (v. 1.1) tarball:

  1. Sucky URLs. PerlHoo indexes show up at CGI-synthesised virtual directory locations, e.g., http://linuxmafia.com/cgi-bin/perlhoo.pl/Apps for the Apps directory of PerlHoo's document tree. Finding a way to substitute something shorter for the "cgi-bin/perlhoo.pl" portion of those URLs would fix several things at once:

Fixing this required use of Apache mod_rewrite to make the undesirable path element disappear, and a tiny bit of surgery on PerlHoo itself.

  2. Outdated and somewhat broken HTML. Eisenzopf's CGI-generated pages lack SGML DTDs, closing "body" and "html" tags, and the required "ul" pair to go with their use of "li" elements. The pages rely upon setting specific colours by their hexadecimal identities, rather than using CSS. They also incorrectly use a nested "p" and "h3" structure to attempt physical markup. I've fixed all of these things, so that pages generated by perlhoo.pl are now CSS-oriented and pass the W3C validator as HTML 4.01 Transitional.
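To make the HTML fixes concrete, here is a minimal sketch of the page skeleton they imply: a DOCTYPE declaration, a linked stylesheet in place of hard-coded colours, a "ul" pair around the "li" elements, an "h3" that is not nested inside a "p", and closing "body" and "html" tags. It is written in Python purely for illustration and is not code from perlhoo.pl; the function name render_index and the stylesheet path /kb.css are assumptions.

```python
from html import escape

# Standard DOCTYPE for HTML 4.01 Transitional.
DOCTYPE = ('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" '
           '"http://www.w3.org/TR/html4/loose.dtd">')


def render_index(title, links, stylesheet="/kb.css"):
    """Return one directory page as a string.

    links is a list of (url, label) pairs. The function name and the
    default stylesheet path are hypothetical, chosen for this sketch.
    """
    lines = [
        DOCTYPE,
        "<html>",
        "<head>",
        "  <title>{}</title>".format(escape(title)),
        # Presentation lives in the stylesheet, not in colour attributes.
        '  <link rel="stylesheet" type="text/css" href="{}">'.format(
            escape(stylesheet, quote=True)),
        "</head>",
        "<body>",
        # The heading stands on its own rather than inside a "p" element.
        "  <h3>{}</h3>".format(escape(title)),
    ]
    if links:
        # HTML 4.01 requires at least one "li" inside a "ul", so the
        # list is emitted only when there is something to put in it.
        lines.append("  <ul>")
        for url, label in links:
            lines.append('    <li><a href="{}">{}</a></li>'.format(
                escape(url, quote=True), escape(label)))
        lines.append("  </ul>")
    else:
        lines.append("  <p>No entries yet.</p>")
    lines += ["</body>", "</html>"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_index("Apps", [("http://linuxmafia.com/kb", "Linuxmafia Knowledgebase")]))
```

Escaping URLs and labels matters as much as the tag structure: an unescaped "&" in a link is one of the more common reasons a generated page fails validation.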

Just so other people don't have to reinvent those particular wheels, I've posted my modified and documented version of PerlHoo at http://linuxmafia.com/pub/linux/apps/ . The tarball includes full instructions on how to configure Apache, including mod_rewrite.
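For readers who want a feel for the URL fix before opening the tarball, the following Python sketch models the kind of mapping a rewrite rule performs: a short public prefix is translated internally to the CGI script's path, and everything after the prefix is passed along as the virtual directory. The prefix /kb and the script path come from the URLs quoted in this article; the regular expression, the function name, and the Apache directives shown in the comment are assumptions, not the rules shipped in the tarball.

```python
import re

PUBLIC_PREFIX = "/kb"                  # short path seen by visitors
SCRIPT_PATH = "/cgi-bin/perlhoo.pl"    # real location of the CGI script

# Assumed pattern: the prefix alone, or the prefix followed by "/...".
# In Apache, a roughly equivalent (untested, hypothetical) rule might be:
#   RewriteEngine On
#   RewriteRule ^/kb(/.*)?$ /cgi-bin/perlhoo.pl$1 [PT]
_RULE = re.compile("^" + re.escape(PUBLIC_PREFIX) + r"(/.*)?$")


def rewrite(path):
    """Map a public URL path to the internal CGI path.

    Paths that do not start with the public prefix are returned unchanged.
    """
    match = _RULE.match(path)
    if match is None:
        return path
    # group(1) is None when the request is for the bare prefix.
    return SCRIPT_PATH + (match.group(1) or "")


if __name__ == "__main__":
    for p in ("/kb", "/kb/Apps", "/kbase", "/pub/linux/apps/"):
        print(p, "->", rewrite(p))
```

The rewrite covers only incoming requests. The links PerlHoo generates must also use the short prefix, which is presumably where the small change to the script itself comes in.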

My PerlHoo instance, "Linuxmafia Knowledgebase", can now be found at http://linuxmafia.com/kb .

To answer the other obvious question: Why, yes, of course I've gotten Ben Okopnik hooked. I'm no dummy! Ben says he's hacked PerlHoo separately to support individual stylesheets for each directory of PerlHoo's index, but I've not yet seen the results.

[1] One difference being that Rob has graphical design talent. I'm certainly not trying to denigrate Rob's excellent work.

Copyright © 2004, Rick Moen, rick@linuxmafia.com.

This article was originally published in issue 98 of Linux Gazette, January 2004.