Issue of December 2003 


Focus: Automation of Websites

Web Site Automation

Enterprises can save time and other resources by automating their Websites. And a good way to do it is by using Perl and other open source languages and tools.

by Dr Seamus Phan

Writing HTML pages by hand can be tedious, but not incredibly hard if you have been doing it for years.

To speed things up, you can also rely on sophisticated HTML and site authoring tools such as Macromedia Dreamweaver or Adobe GoLive to take the hassle out of hand-coding pages and sites.

However, you cede control to these tools, since the code they generate may not be legible when you try to edit it later. In this age of personalisation, site authoring or page writing can become increasingly cumbersome and ineffective, since customers demand personalised Web sites.

Although there are enterprise tools that allow and automate site personalisation, most are proprietary and demand more resources than many Webmasters and administrators can summon.

How can Web sites be automated and yet retain some cost effectiveness? The answer seems to lie in the use of open source languages such as Perl.

Practical Extraction and Report Language, or Perl, is one of the most widely used server-based scripting languages on the Internet. It was invented by linguist Larry Wall in 1987.

Today, Perl 5.x runs well on almost any flavour of Unix, Linux, Mac OS X, and Windows. Sites large and small use Perl extensively for a variety of server-based scripted tasks. According to the Perl Foundation, users include the US Census Department, the Swedish Pension system, the Netcraft Internet Survey, and even MessageLabs' SkyScan Antivirus system.

A good example of a highly evolved and automated Web site is Slashdot, a regular hangout for geeks.

According to the Perl Foundation, Chris Nandor, senior programmer, Open Source Development Network, said: "Perl makes our lives at Slashdot tremendously simpler than they would be otherwise. Whether it is running our internal ticket system or handling millions of page views a day with mod_perl, or writing tools for maintenance and administration, Perl is the glue that holds everything together."

Why automate?

There is a distinction between client-side and server-side Web site automation tools. The likes of Dreamweaver and GoLive are really client-side automation tools, since authoring is done at the user level on a desktop computer.

Server-side automation tools, on the other hand, are more sophisticated and have a steeper learning curve, but unlike client-side tools they require little human intervention once the system gets going.

If you need to personalise your Web site with template-based designs, so that either your organisation or your customers can select preferred interfaces, automation is required. It is almost impossible to accede to customers' requests for specific designs if you author your sites manually.

If you need to keep track of customers or employees and maintain a database of details, as well as present information to them in their preferred formats and depth, automation is required as well.
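The idea of serving each customer a preferred design can be sketched in a few lines of Perl. This is only an illustration under stated assumptions: the %name% placeholder convention, the template texts, the render_page function and the customer record are all hypothetical, and a real site would load templates from files and customer details from a database.

```perl
use strict;
use warnings;

# Hypothetical page designs keyed by theme name. The %name% placeholder
# style is an illustrative convention, not any particular module's syntax.
my %templates = (
    plain => "Welcome, %name%. Latest news: %headline%\n",
    fancy => "*** Hello %name%! *** Today: %headline%\n",
);

# Pick the template for the requested theme and fill in its placeholders.
sub render_page {
    my ($theme, $vars) = @_;
    # Fall back to the plain design if the requested theme does not exist.
    my $template = exists $templates{$theme} ? $templates{$theme} : $templates{plain};
    # Replace each %key% with its value; unknown keys become empty strings.
    $template =~ s/%(\w+)%/defined $vars->{$1} ? $vars->{$1} : ''/ge;
    return $template;
}

# Hypothetical customer record holding a stored design preference.
my %customer = ( name => 'Mei Ling', theme => 'fancy' );

print render_page($customer{theme},
    { name => $customer{name}, headline => 'New catalogue online' });
```

Adding a new design then means adding one template, not re-authoring every page.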

If you are in a large organisation with thousands of pages of information that need to be frequently updated or refreshed, it would be unwise to keep the Web site manually updated.

The running costs will add up and your Web site will soon be bleeding profusely. In this scenario, automation can remove most of the hassle of keeping your content up-to-date.
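As a rough sketch of what such automation might look like, the Perl script below regenerates a set of static pages from content records in one pass. The records, the page layout and the function names build_page and publish_all are hypothetical; in practice the records might come from a database, the values would need HTML escaping, and the script might be run on a schedule.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical content records; a real site might read these from a database.
my @articles = (
    { slug => 'news',   title => 'Company News', body => 'We shipped a product.' },
    { slug => 'events', title => 'Events',       body => 'Meet us in Singapore.' },
);

# Wrap one record in a shared page layout (no HTML escaping in this sketch).
sub build_page {
    my ($article) = @_;
    return "<html><head><title>$article->{title}</title></head>\n"
         . "<body><h1>$article->{title}</h1>\n"
         . "<p>$article->{body}</p></body></html>\n";
}

# Write every record out as a static HTML file and return the paths written.
sub publish_all {
    my ($dir, @records) = @_;
    my @written;
    for my $article (@records) {
        my $path = File::Spec->catfile($dir, "$article->{slug}.html");
        open my $fh, '>', $path or die "Cannot write $path: $!";
        print {$fh} build_page($article);
        close $fh or die "Cannot close $path: $!";
        push @written, $path;
    }
    return @written;
}

my $outdir = tempdir(CLEANUP => 1);   # scratch directory for this demo
my @files  = publish_all($outdir, @articles);
print "Wrote ", scalar(@files), " pages to $outdir\n";
```

Changing the layout in build_page and rerunning the script refreshes every page at once, which is the saving the article has in mind.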

And why shouldn't you automate? In scenarios where you have little content and can make do with a Weblog (blog) or journal-like script instead, automation will be overkill. There are many good Web sites, as well as enterprise micro-sites, that do well with blogs instead of full automation.

If you can modularise your Web site into micro-sites and manage running journals or blogs, then you may not want to use more sophisticated Web site automation scripts, since they come with a steep learning curve.

One-line automation

Without venturing into Perl, Server Side Includes (SSI) on Unix can be a godsend if you intend to automate some parts of your Web site.

SSI should be activated through the srm.conf file on your Unix server, where you should add the line "AddType text/x-server-parsed-html .shtml" to the MIME list.

The next thing you do is insert the type of SSI command you need within the Web page. For example, if you need to display a file "home.html" within "show.shtml", insert <!--#include file="home.html" --> within it.

Some common SSI commands include showing when the document was last modified, by using <!--#echo var="LAST_MODIFIED" -->. If you would like to show the local time on a Web page, insert <!--#echo var="DATE_LOCAL" -->.

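To actually show template-based designs with SSI, such as configurable headers and footers, you insert commands such as <!--#include file="header.html" --> or <!--#include file="footer.html" -->, where the file names stand in for your own template files.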

The advantage of using SSI is that you can segregate the content from the header and footer elements.

Further, you only need to edit the header and footer templates once. Every page on the site that carries the single-line SSI commands will then show the edited header or footer automatically.
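To make the mechanism concrete, here is a minimal Perl sketch of what happens when a page containing include directives is parsed. It is not how a Web server implements SSI. It only mimics the substitution step for directives of the form <!--#include file="..." -->. The file names, their contents and the expand_includes function are hypothetical, and the "files" are held in memory instead of being read from disk.

```perl
use strict;
use warnings;

# Hypothetical shared templates, held in memory for this sketch.
my %files = (
    'header.html' => '<h1>Acme Widgets</h1>',
    'footer.html' => '<p>Contact: webmaster</p>',
);

# Replace each <!--#include file="NAME" --> directive with the contents of
# the named file, roughly as an SSI-enabled server does before sending a page.
sub expand_includes {
    my ($page, $files) = @_;
    $page =~ s{<!--#include\s+file="([^"]+)"\s*-->}
              { exists $files->{$1} ? $files->{$1} : "[include not found: $1]" }ge;
    return $page;
}

my $page = <<'END';
<!--#include file="header.html" -->
<p>Page-specific content goes here.</p>
<!--#include file="footer.html" -->
END

print expand_includes($page, \%files);
```

Because every page refers to the header and footer by name only, editing the one shared entry changes the output of every page that includes it.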

In this instance, the function of SSI is similar to Cascading Style Sheets (CSS), which are progressively being used to replicate typographic styles across entire Websites.

Cooking with Perl

Throughout the Asia-Pacific, where economies are highly competitive and customers demand returns aggressively, it is no wonder that Asia-based companies are adopting open source applications rapidly.

In my capacity as a consultant, I have largely employed open source solutions, including Perl and other scripting additions such as PHP/Zend or even Python, simply because customers demand cost-effective and high-performance systems.

N. Viswanathan, a programmer with cgenerator, a Singapore-based company providing systems integration and content solutions, said: "I like Perl because of its programming flexibility and ease of use. It is an elegant language, not something like JSP where you need a framework and so forth. Perl is simple and powerful, like Cartesian Equations." Cgenerator is a content-centric company that uses server-centric automation to manage a variety of information sources.

Another Perl resource is Paul Helinski's Website Automation Toolkit, published by John Wiley & Sons.

It advocates the use of a proprietary SiteWrapper technology, which is a complex Perl-based automation system that can be somewhat difficult to set up, although the results can be gratifying once you get past the initial installation hurdles.

This is because much of the Perl source code has URI references that require individual addressing and editing before they will work on a server.

It is not a colossal task akin to writing middleware to connect enterprise applications, but just a slight hassle.

A good place to look for Web site automation Perl scripts is the Perl/Scripts_and_Programs/ section of HotScripts. You can find modules under Calendars, Chats, Click Tracking, Content Management, Groupware, Mailing Lists, Polls, Search Engines, Site Navigation, Web Traffic Analysis, and more.

The advantage of HotScripts is that the directory is elegantly organised, and many scripts are either available for free, or on a trial basis.

The scripts should work under Perl 5.x and will require a standard Apache Web server system. Some also require SQL servers such as MySQL or Mini SQL.

Perl is moving on to version 6, and this new version will address many of the current inadequacies within the code itself. Perl's founder Larry Wall said, "Perl 5 was my rewrite of Perl. I want Perl 6 to be the community's rewrite of Perl and of the community."

A sneak preview is available in a book entitled "Perl 6 Essentials" by Allison Randal, Dan Sugalski and Leopold Tötsch (O'Reilly, ISBN 0596004990).

This article first appeared in Network Computing Asia


Copyright 2001: Indian Express Newspapers (Bombay) Limited (Mumbai, India). All rights reserved throughout the world.
This entire site is compiled in Mumbai by the Business Publications Division (BPD) of the Indian Express Newspapers (Bombay) Limited. Site managed by BPD.