Apt App servers

How do you build servers to suit your applications? by Graeme K Le Roux

Some years ago, someone from Novell remarked that using the then-new OS/2 was like driving a semi-trailer to the local 7-11 for a Slurpee. His point was that OS/2 was overkill for the bulk of server situations at the time, i.e. file storage and print serving.

He was right, but his comment overlooked the fact that OS/2 was not intended simply to be a platform for storing files or feeding printers; it was meant to run applications like database engines and mainframe gateways.

Today, with the advent of low-cost NAS, using a server for file storage is a rather expensive option. Unless you have many printers which require a lot of host processing, purpose-built print servers are arguably a more cost-effective option than a generic print server. Today, servers run applications such as e-mail or more general messaging services, database engines and Web services, and act as hosts for thin client applications. Each of these applications places different requirements on its host platform (which is what a server is), and thus the configuration of a server can be critical to its performance. Configuration can also affect reliability.

Broadly speaking, applications can be divided into two basic categories: I/O-bound and compute-bound. As their names suggest, the former is limited by the I/O capabilities of the platform while the latter is limited by processing capability. Under most circumstances, it is an extremely bad idea to mix the two types of applications because, in general, you can cost-effectively build a system to handle a very high rate of I/O, or you can build it to have a very high processing throughput, but not both.

Consider the difference between a basic Web server and a supercomputer. The Web server is designed to take a large number of small requests and respond with larger parcels of data in the form of Web pages. In this case, little processing is involved but disk and network I/O rates are critical.

A supercomputer, on the other hand, works by loading an entire job into memory, throwing massive processing power at it, and then downloading the results from memory to clients, usually via a dedicated front-end processor. The supercomputer may chug away for hours on a single job; that's what it's for.

Bus stops
The difference between a host built for I/O-bound applications and one built for compute-bound ones often boils down to a few key hardware issues which are often neglected. All computers are built around one or more system buses with a CPU at one end and a terminator at the other. Devices such as disk controllers, network interfaces, etc., are attached to a system bus somewhere between the CPU and the terminator.

System buses have a finite bandwidth; overload the bus and connected devices will spend a significant proportion of their time waiting for access to it. For example, a high-end RAID controller or a Fibre Channel card on the same system bus as a 100 Mbps network adapter or, worse, a gigabit NIC can overload the PCI bus and thus prevent either device from running at full speed. This would therefore be a poor configuration for a very heavily loaded Web server.
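As a rough illustration of the arithmetic, the sketch below adds up the peak demand of the devices on one bus and compares it with the bus's peak bandwidth. The figures are assumptions for illustration only: a 32-bit, 33 MHz PCI bus peaks at about 133 MB/s, a gigabit NIC can move about 125 MB/s, and the 100 MB/s figure for the RAID controller is hypothetical, as is the function name.

```python
def bus_load(bus_mb_per_s, device_demands_mb_per_s):
    """Return the devices' combined peak demand as a fraction of bus bandwidth."""
    return sum(device_demands_mb_per_s.values()) / bus_mb_per_s


# Theoretical peak of a 32-bit, 33 MHz PCI bus (about 133 MB/s).
PCI_32BIT_33MHZ_MB_PER_S = 133.0

# Illustrative peak demands in MB/s; the RAID figure is an assumption.
devices = {"gigabit NIC": 125.0, "RAID controller": 100.0}

load = bus_load(PCI_32BIT_33MHZ_MB_PER_S, devices)
print(f"Peak demand is {load:.0%} of bus bandwidth")
if load > 1.0:
    print("The bus is oversubscribed: devices will wait for access to it")
```

Real buses rarely sustain their theoretical peak, so a load well below 100 percent can still cause contention; treat the result as a first screening check rather than a measurement.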

You could use a platform with a separate system bus for the NIC and the storage controller. But whether this would be of any benefit would depend on the way in which the application uses the processor/memory sub-system which connects the two buses. In most cases, having more than one bus helps, but at some point you would need to consider a combination of SAN and load-balancing technology.

Another thing which can drag down the performance of a system bus is the nature of the devices attached to it. So-called "bus-mastering" devices are generally far more efficient in their use of a bus and usually faster (i.e. they have a higher throughput) than non-bus-mastering devices. They also place a lower load on the system CPU, which allows it to work more efficiently. As a general rule, choose bus-mastering devices in preference to non-bus-mastering devices in servers.

Another trap for the unwary is in mixing different types of devices on peripheral buses: USB, FireWire and, most notably, SCSI. The basic rule is that a peripheral bus will run at the speed of the slowest device. Hang a legacy SCSI tape drive on a Wide SCSI bus and the bus runs at the speed of the legacy device. Thus, a relatively slow device like a tape unit on the same bus as fast devices like hard disks will cause the throughput of the bus to be governed by the speed of the tape unit.

As a general rule, try to group devices by type (hard disks; tapes and removable storage; scanners) and put each group on a separate bus (a channel, in SCSI terms), if not a separate controller. It is also a good idea to avoid placing a mix of internal and external devices on the same SCSI bus. Such a mix works perfectly well, but you generally get better performance if you avoid it. Multi-channel SCSI adapters are relatively inexpensive and, under modern OSs, installing multiple SCSI adapters in a single host is quite simple.
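The grouping rule can be sketched in a few lines. The model below simply applies the rule of thumb stated above (a shared bus runs at the speed of its slowest device); the device names and MB/s figures are hypothetical, and the function names are invented for this example.

```python
from collections import defaultdict


def channel_speed(devices):
    """Rule of thumb from the text: a shared bus runs at its slowest device's speed."""
    return min(speed for _name, _kind, speed in devices)


def group_by_type(devices):
    """Assign each device type to its own channel."""
    channels = defaultdict(list)
    for device in devices:
        _name, kind, _speed = device
        channels[kind].append(device)
    return dict(channels)


# (name, type, speed in MB/s); all figures are illustrative assumptions.
devices = [
    ("disk0", "disk", 40),
    ("disk1", "disk", 40),
    ("tape0", "tape", 5),
    ("scanner0", "scanner", 5),
]

print("Everything on one bus:", channel_speed(devices), "MB/s")
for kind, members in group_by_type(devices).items():
    print(f"Separate {kind} channel:", channel_speed(members), "MB/s")
```

With everything on one bus the disks are held to the tape drive's speed; on their own channel they run at their own.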

Don't share disks
One of the most common mistakes made in configuring servers is sharing a hard disk between executables (the OS and application programs) and data (e.g. databases, Web pages, etc.). Hard disks are mechanical devices consisting of one or more platters and a head/positioner assembly which flies over the platter(s).

No matter how many tricks you do in the software and hardware to buffer and/or queue read and write requests, the inescapable physical fact is that a disk head cannot be in two places at once. Furthermore, the slowest operation a disk is capable of, aside from spinning up or down, is moving its heads. If you put data and executables on a single disk, then you guarantee that the disk will have to move its heads a lot more than if you provide separate disks.
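A toy model makes the point concrete. The sketch below counts how far a head travels, in tracks, when requests for executables and data are interleaved on one disk, and compares that with two disks that each serve one kind of request. The track numbers are invented for illustration, requests are served strictly in arrival order, and no buffering or request reordering is modelled.

```python
def head_travel(tracks, start=0):
    """Total distance (in tracks) the head moves to service requests in order."""
    total, position = 0, start
    for track in tracks:
        total += abs(track - position)
        position = track
    return total


# Hypothetical layout: executables near the outer edge, data far away.
exe_requests = [10, 20, 30, 40]
data_requests = [900, 910, 920, 930]

# One shared disk: the two request streams arrive interleaved.
interleaved = [t for pair in zip(exe_requests, data_requests) for t in pair]
one_disk = head_travel(interleaved)

# Two disks: each head only ever serves its own stream.
two_disks = head_travel(exe_requests) + head_travel(data_requests)

print("Shared disk head travel:", one_disk)      # 6210 tracks
print("Separate disks head travel:", two_disks)  # 970 tracks
```

Real drives and operating systems reorder and cache requests, so the gap in practice is smaller than in this model, but the direction of the effect is the same.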

Further, in a multi-threaded OS environment, you have a situation where two or more processes are attempting to read and/or write to different locations at the same time. If one process is the OS, it will preempt the others; and if a process is dealing with real time I/O (say streaming data out a network port), data may be lost or unacceptably delayed.

As a general rule, build a server with at least two physically separate hard disks and install your executables on one and data on the other. In practice, this simple and inexpensive change to server configuration can as much as double the performance of database engines, messaging hosts (e.g. Microsoft Exchange) and Web servers. Empirical evidence suggests that it also makes for a much more reliable server, particularly in a Windows environment. This is possibly due to the slower rate of fragmentation on system disks.

Ramming up performance
The next most important thing to remember about configuring servers is that nothing improves the performance of virtual memory like real memory. In most cases, your system disk (the one with all the executables) is also where the OS is going to place the data associated with its virtual memory system. This is data like any other, and if the OS swaps memory "pages" to disk too frequently, you end up with the same problem we discussed above with regard to simple application data.

Note that the problem here is not the number of pages which are swapped to disk; it's the frequency with which the system has to access the hard disk rather than actually doing useful processing.

So reduce the number of times an OS has to go to the hard disk for memory pages by adding more physical memory to store pages in. While OS vendors will tell you how much memory is required to run their OSs, that is not the amount required for the OS to successfully run an application.
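The effect of extra physical memory on paging can be sketched with a small simulation. The code below counts page faults (trips to disk) for a fixed sequence of page references under least-recently-used (LRU) replacement. LRU is an assumption chosen for simplicity, not a description of how any particular OS pages, and the reference sequence is a textbook-style example rather than measured data.

```python
from collections import OrderedDict


def page_faults(reference_string, frames):
    """Count page faults under LRU replacement with a fixed number of frames."""
    resident = OrderedDict()  # pages in memory, least recently used first
    faults = 0
    for page in reference_string:
        if page in resident:
            resident.move_to_end(page)  # mark as most recently used
        else:
            faults += 1  # page must be fetched from disk
            if len(resident) >= frames:
                resident.popitem(last=False)  # evict least recently used page
            resident[page] = True
    return faults


refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
for frames in (3, 4, 5):
    print(f"{frames} frames: {page_faults(refs, frames)} faults")
```

For this sequence the fault count falls from 10 with three frames to 8 with four and 5 with five: the same workload goes to disk half as often once it fits in memory.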

To run an application, you have to add the amount of memory required by the application to the amount required by the OS. You also typically have to add a small amount of memory for each concurrent user. For example, an NT server typically requires about 256 MB RAM and Microsoft's IIS server needs about 128 MB RAM, so the minimum for a stable, reasonably heavily used NT IIS server is about 384 MB RAM. Obviously, the amount of memory you put in a server varies with the application; application servers in a large thin client environment can easily use a gigabyte of RAM or more.
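The sizing sum above is simple enough to write down as a function. In this sketch the OS and application figures are the ones quoted in the text; the per-user allowance and user count in the second call are hypothetical, as is the function name.

```python
def minimum_ram_mb(os_mb, app_mb, per_user_mb=0, concurrent_users=0):
    """OS requirement + application requirement + a small allowance per user."""
    return os_mb + app_mb + per_user_mb * concurrent_users


# Figures from the text: NT server about 256 MB, IIS about 128 MB.
print(minimum_ram_mb(256, 128))  # 384

# Hypothetical allowance of 1 MB for each of 100 concurrent users.
print(minimum_ram_mb(256, 128, per_user_mb=1, concurrent_users=100))  # 484
```

Treat the result as a floor, not a target; the rules of thumb at the end of the article recommend erring well above it.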

Processing for power
Another issue is processing power or throughput, that is, the number of instructions that can be processed per second. There are two ways to increase processing throughput: install a faster CPU or add multiple CPUs. Which option works better depends upon the nature of the code being processed. This is not a new problem; the pioneers of supercomputing had to grapple with it and it was the core of the RISC/CISC debate.

Let's say you have 21 pairs of numbers to add up. Now suppose you have a CPU which can do the job in 21 seconds and another which can do the job in 42 seconds. You can either use one fast CPU and get the job done in 21 seconds or you can use two of the slower CPUs and get the job done in the same time. You'd simply pick the cheapest option. But what if every third pair of numbers was dependent upon the result of the preceding two calculations?

The fast CPU still does the job in 21 seconds, and so will the slower CPUs, provided we present the sets of numbers in the same order. But what if we don't? Say we present the additions which are dependent upon other calculations first. The fast CPU will have to stop and presumably execute extra code to find and do the other additions first. The pair of CPUs will fare better: CPU 1 will wait until CPU 2 does the calculations required; it will then do its additions while CPU 2 waits. CPU 1 will then do the next two additions and so on.

Whether or not two CPUs are better than one here depends upon the processing overhead incurred by the faster CPU when it hits an addition which requires it to process other additions first. It is this problem of dependence which currently very often prevents GHz-class CPUs from achieving their benchmark performance in practice. As a rule of thumb, servers generally provide better throughput in practice with multiple CPUs than with single faster units. You can also aggregate CPUs to achieve superior performance. For example, if your server offers a maximum CPU speed of 1.2 GHz, you can install two 900-MHz units and achieve, in theory, an aggregate processing throughput equivalent to 1.8 GHz.
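The idealised arithmetic behind these figures can be sketched as follows. Both functions assume work that divides perfectly across CPUs with no dependencies or coordination overhead, which is exactly the assumption the dependence problem above undermines; the function names are invented for this example.

```python
def ideal_job_seconds(single_cpu_seconds, n_cpus):
    """Best-case run time if the work splits evenly with no overhead."""
    return single_cpu_seconds / n_cpus


def aggregate_clock_mhz(cpu_mhz, n_cpus):
    """Theoretical aggregate clock rate; an upper bound, not a prediction."""
    return cpu_mhz * n_cpus


# The 21-addition example: one fast CPU versus two CPUs of half the speed.
print(ideal_job_seconds(21, 1))  # 21.0 seconds
print(ideal_job_seconds(42, 2))  # 21.0 seconds

# The aggregation example: two 900 MHz CPUs in a system that tops out at 1.2 GHz.
max_supported_mhz = 1200
cpu_mhz = 0.75 * max_supported_mhz            # the 75 percent guideline
print(cpu_mhz)                                # 900.0
print(aggregate_clock_mhz(cpu_mhz, 2))        # 1800.0
```

How close a real server gets to these numbers depends on how much of the workload can actually run in parallel.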

Rules of thumb

  • Don't use a server for simple file storage.
  • Don't mix compute- and I/O-bound applications on a single platform.
  • Choose bus-mastering adapters where possible and avoid overloading system buses.
  • Use multi-channel SCSI controllers and/or multiple controllers to avoid mixing device types (hard disks; tapes and removable storage; scanners). Where possible, do not place external and internal devices on the same channel.
  • Provide separate hard disks for data and executables.
  • Ensure that you have sufficient physical memory; err on the side of overkill.
  • Multiple slower CPUs are often a more robust solution than single fast ones. Ideally, choose multiple CPUs, each of which runs at about 75 percent of the maximum processor speed the system supports.

Graeme K. Le Roux is the director of Morsedawn (Australia), a company which specialises in network design and consultancy. He writes for Network Computing-Asian Edition.


Copyright 2001: Indian Express Group (Mumbai, India). All rights reserved throughout the world. This entire site is compiled in Mumbai by The Business Publications Division of the Indian Express Group of Newspapers. Site managed by BPD