Focus: Automation of Websites
Web Site Automation
Enterprises can save time and other resources by automating
their Web sites, and a good way to do it is with Perl and other open source
languages and tools.
By Dr Seamus Phan
Writing HTML pages by hand can be tedious, but not incredibly hard if you have
been doing it for years.
To speed things up, you can also rely on sophisticated HTML and site authoring
tools such as Macromedia Dreamweaver (www.macromedia.com) or Adobe GoLive (www.adobe.com)
to take the hassle out of hand-coding pages and sites.
However, you cede control to these tools, since the code they generate may not
be legible when you try to edit it later. In this age of personalisation, site authoring
or page writing by hand can become increasingly cumbersome and ineffective, since customers
demand personalised Web sites.
Although there are enterprise tools that allow and automate site personalisation,
most are proprietary and demand more resources than many Webmasters and administrators
can summon.
How can Web sites be automated and yet retain some cost effectiveness? The answer
seems to lie in the use of open source languages such as Perl.
Practical Extraction and Report Language, or Perl (www.perl.org), is one of
the most widely used server-based scripting languages on the Internet. It was
invented by linguist Larry Wall in 1987.
Today, Perl 5.x runs well on almost any flavour of Unix, Linux, Mac OS X, and
Windows. Sites large and small use Perl extensively for a variety
of server-based scripted tasks, including Amazon.com, Slashdot.org, the US Census
Bureau, the Swedish pension system, the Netcraft Internet survey, and even MessageLabs'
SkyScan antivirus system, according to the Perl Foundation (www.perl-foundation.org).
A good example of a highly evolved and automated Web site is Slashdot, a regular
hangout for geeks.
According to the Perl Foundation, Chris Nandor, senior programmer, Open Source
Development Network, said: "Perl makes our lives at Slashdot tremendously
simpler than they would be otherwise. Whether it is running our internal ticket
system or handling millions of page views a day with mod_perl, or writing tools
for maintenance and administration, Perl is the glue that holds everything together."
Why automate?
There is a distinction between client-side and server-side Web site automation
tools. The likes of Dreamweaver and GoLive are really client-side automation
tools, since authoring is done at the user level on a desktop computer.
Server-side automation tools, on the other hand, are more sophisticated and have
a steeper learning curve, but unlike client-side tools they require little human
intervention once the system gets going.
If you need to personalise your Web site with template-based designs, so that
either your organisation or your customers can select preferred interfaces, automation
is required. It is almost impossible to accede to customers' requests for
specific designs if you author your sites manually.
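As a rough illustration of template-based personalisation, the sketch below renders one piece of content in whichever design a visitor has chosen. It is written in Python, one of the open source options mentioned later in this article. The template names, the markup and the render function are assumptions made for the example, not part of any particular product.

```python
from string import Template

# Hypothetical templates; in practice these would live in files or a database.
TEMPLATES = {
    "plain": Template("<html><body><h1>$title</h1>$body</body></html>"),
    "boxed": Template(
        '<html><body><div class="box"><h2>$title</h2>$body</div></body></html>'
    ),
}


def render(content, preference="plain"):
    """Render one piece of content in the visitor's preferred template."""
    # Fall back to the default design if the preference is unknown.
    template = TEMPLATES.get(preference, TEMPLATES["plain"])
    return template.substitute(title=content["title"], body=content["body"])


# The same content, shown in two different designs.
news = {"title": "New products", "body": "<p>Three new widgets this month.</p>"}
plain_page = render(news, "plain")
boxed_page = render(news, "boxed")
print(boxed_page)
```

The content is written once and the design is chosen at request time, which is what makes it practical to offer customers a choice of interfaces.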
If you need to keep track of customers or employees and maintain a database
of details, as well as present information to them in their preferred formats
and depth, automation is required as well.
If you are in a large organisation with thousands of pages of information that
need to be frequently updated or refreshed, it would be unwise to keep the Web
site manually updated.
The running costs will add up and your Web site will soon be bleeding profusely.
In this scenario, automation can remove most of the hassle of keeping your content
up-to-date.
And why shouldn't you automate? In scenarios where you have little content and
you can do with a Weblog (blog) or journal-like script instead, automation will
be overkill. There are many nice Web sites, as well as enterprise micro-sites
that benefit nicely from blogs instead of full automation.
If you can modularise your Web site into micro-sites and manage running journals
or blogs, then you may not want to use more sophisticated Web site automation
scripts, since they come with a steep learning curve.
One-line automation
Without venturing into Perl, Server Side Includes (SSI) on Unix can be a godsend
if you intend to automate some parts of your Web site.
SSI should be activated through the srm.conf file on your Unix server, where
you should add "AddType text/x-server-parsed-html .shtml" to the
MIME list.
The next thing you do is insert the type of SSI command you need within the
Web page. For example, if you need to display a file "home.html" within
"show.shtml", insert <!--#include file="home.html" --> within it.
Some common SSI commands include showing when the document was last modified,
by using <!--#echo var="LAST_MODIFIED" -->. If you would like to show the local
time on a Web page, insert <!--#echo var="DATE_LOCAL" -->.
To actually show template-based designs with SSI, such as configurable headers
and footers, you insert include commands that pull in your header and footer files.
The advantage of using SSI is that you can segregate the content from the
header and footer elements.
Further, you only need to edit the header and footer templates once. With the
single-line SSI commands in place on every page, the entire site will then show
the edited header or footer without any further work.
In this respect, the function of SSI is similar to Cascading Style Sheets (CSS),
which are progressively being used to replicate typographic styles across entire
Web sites.
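To make the mechanism concrete, here is a minimal sketch in Python of what the server does with an include command: it finds each command in the page and substitutes the contents of the named file. The file names, the sample markup and the in-memory files dictionary standing in for the server's document root are all assumptions for illustration. A real Web server does this work for you.

```python
import re

# Matches commands of the form <!--#include file="name" -->
INCLUDE = re.compile(r'<!--#include\s+file="([^"]+)"\s*-->')


def expand_includes(page, files):
    """Replace each include command with the text of the named file.

    `files` is a hypothetical in-memory stand-in for the document root,
    mapping file names to their contents.
    """
    def substitute(match):
        name = match.group(1)
        # Leave a visible marker for a missing file rather than failing.
        return files.get(name, "[missing include: %s]" % name)

    return INCLUDE.sub(substitute, page)


# Hypothetical header and footer shared by every page on the site.
files = {
    "header.html": "<h1>Acme Widgets</h1>",
    "footer.html": "<p>Copyright Acme</p>",
}
page = (
    '<!--#include file="header.html" -->\n'
    "<p>Welcome to our catalogue.</p>\n"
    '<!--#include file="footer.html" -->'
)
html = expand_includes(page, files)
print(html)
```

Changing the text stored under header.html changes the output of every page that includes it, which is the edit-once behaviour described above.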
Cooking with Perl
Throughout the Asia-Pacific, where economies are highly competitive
and customers demand returns aggressively, it is no wonder that Asia-based companies
are adopting open source applications rapidly.
In my capacity as a consultant, I have largely employed open source solutions,
including Perl and other scripting languages such as PHP/Zend or even Python,
simply because customers demand cost-effective and high-performance systems.
N. Viswanathan, a programmer with cgenerator (www.cgenerator.com), a Singapore-based
company providing systems integration and content solutions, said: "I like
Perl because of its programming flexibility and ease of use. It is an elegant
language, not something like JSP where you need a framework and so forth. Perl
is simple and powerful, like Cartesian Equations." Cgenerator is a content-centric
company that uses server-centric automation to manage a variety of information
sources.
Another Perl resource is Paul Helinski's Website Automation Toolkit published
by John Wiley & Sons (www.wiley.com/compbooks).
It advocates the use of a proprietary SiteWrapper technology, which is a complex
Perl-based automation system that can be somewhat difficult to set up, although
the results can be gratifying once you get past the initial installation hurdles.
The reason is that much of the Perl source code has URI references that require
individual addressing and editing before they will work on a server.
It is not a colossal task akin to writing middleware to connect enterprise applications,
but just a slight hassle.
A good place to look for Web site automation Perl scripts
is HotScripts (www.hotscripts.com/Perl/Scripts_and_Programs/). You can find
modules under Calendars, Chats, Click Tracking, Content Management, Groupware,
Mailing Lists, Polls, Search Engines, Site Navigation, Web Traffic Analysis,
etc.
The advantage of HotScripts is that the directory is elegantly
organised, and many scripts are either available for free, or on a trial basis.
The scripts should work under Perl 5.x, and will require a standard Apache Web
server system; some require SQL servers such as MySQL or Mini SQL (mSQL).
Perl is moving on to version 6, and this new version is intended to address many of
the current inadequacies within the language itself. Perl's creator, Larry Wall,
said: "Perl 5 was my rewrite of Perl.
I want Perl 6 to be the community's rewrite of Perl and of the community."
A sneak preview is available in a book entitled "Perl 6 Essentials"
by Allison Randal, Dan Sugalski and Leopold Tötsch (O'Reilly, ISBN 0596004990).
This article first appeared in Network Computing Asia