Powering processors for Unix
Five years ago, IBM set out to build a system that would change
the way one looked at conventional Unix servers. The result
is the pSeries 690 server (code named Regatta) which was
launched in October. With a new processor called POWER4
and features borrowed from its mainframe systems, Regatta
might just be a threat to Sun Microsystems, which has long
been dominant in the server market.
Dr. Joel Tendler, Program Director, Technology
Assessment, Enterprise Systems Group, IBM, has been one of
the architects of this system. In an exclusive interview
with Network Magazine, he speaks about the innovations that
go into Regatta and explains how it stands up against the
competition.
by Brian Pereira
You have been instrumental in designing the POWER4 processor.
What are some of the bottlenecks and technical limitations
with processor architecture? How has IBM worked round these
bottlenecks during development of the POWER4 processor?
When we began working on POWER4 we asked ourselves, how do you
get the most out of the transistors you have? How do you
build those transistors? If you look at our development
over the last few years, we have been announcing breakthroughs
almost on an annual basis. In 1997 we announced copper (as
a conducting material on the chip). Up to that time chips
were using aluminum which generates too much heat. Copper
is a better conductor and hence the chip runs cooler and
faster. The next breakthrough was Silicon on Insulator (SOI)
where we put a protective layer inside the silicon, effectively
giving the electrons a shorter path between gates. Both
of these innovations are inside our POWER4 chip. We have
also announced other innovations like 'low-K' dielectric
which reduces crosstalk enabling us to move the copper wires
closer. But it's not just having good chips that matters; it's
also about using that technology. Technology goes beyond
silicon: what matters is how you build systems and put
them together.
As you increase the number of processors within a system
the number of interactions (between components) goes up.
That's where the Switch comes in. Until now, servers used
Crossbar Switches. But the circuitry necessary to make this
work increases (as you add more processors). With POWER4
we use a Distributed Switch which does not have a single
point of control; the control is spread out among the
chips. With this switch I can pass information between my
processors with higher bandwidth and in shorter time.
The bus is something we call Synchronous Wave Pipeline Interface.
It allows me to run the buses at speeds never achieved before.
With the POWER4 system the bus speed between chips is 650
MHz. Sun's UltraSPARC III processor communicates at 150
MHz, so our bus speed is more than four times as fast. These are dual
buses that run at half the processor speed, and that will
continue to be the case as we increase processor speeds
in future.
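The bus figures in this answer can be checked with a short sketch. It assumes the 1.3 GHz processor clock quoted later in the interview; the function and variable names are illustrative only.

```python
# Illustrative check of the bus figures quoted in the interview.
# Assumption: the 1.3 GHz POWER4 clock mentioned later in the interview.

def bus_speed_mhz(processor_mhz: float) -> float:
    """Chip-to-chip bus speed, taken as half the processor clock (per the interview)."""
    return processor_mhz / 2

power4_bus = bus_speed_mhz(1300)   # MHz
ultrasparc3_bus = 150              # MHz, figure quoted in the interview
ratio = power4_bus / ultrasparc3_bus

print(f"POWER4 bus: {power4_bus:.0f} MHz, ratio vs. 150 MHz: {ratio:.2f}x")
```

Under the same half-clock rule, the future 2 GHz part mentioned later would imply a 1000 MHz bus.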
IBM has been building mainframes for 30 years and has
a good knowledge of that technology. It is now putting that
technology in Unix/midrange systems. What kind of mainframe
technology goes into the new eServer pSeries 690?
We have put in things like Chipkill technology, partitioning
and self-healing. Chipkill technology means that even if
a memory chip fails, I can reconstruct all the information
inside that failed chip. This increases the reliability
of the system's memory.
Partitioning technology allows you to virtualize servers.
Our p690 system allows you to build up to 16 logical partitions,
but it's not just the number; it's how you build those
partitions. The resources in these partitions are uncoupled
and each partition can be configured to have a specific
number of resources. The granularity of these partitions
is one processor, one PCI adapter and a minimum of 1 GB
memory, in increments of 256 MB.
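The granularity rules described here can be written down as a small check. This is only a sketch of one reading of the interview (at least one processor, one PCI adapter, and 1 GB of memory, with memory in 256 MB steps, and at most 16 partitions); the class and function names are hypothetical and do not correspond to any IBM tool.

```python
from dataclasses import dataclass
from typing import List

# Limits as stated in the interview; names are illustrative assumptions.
MAX_PARTITIONS = 16
MIN_MEMORY_MB = 1024
MEMORY_STEP_MB = 256

@dataclass
class Partition:
    processors: int
    pci_adapters: int
    memory_mb: int

def is_valid_partition(p: Partition) -> bool:
    """True if the partition meets the minimum granularity described above."""
    return (
        p.processors >= 1
        and p.pci_adapters >= 1
        and p.memory_mb >= MIN_MEMORY_MB
        and p.memory_mb % MEMORY_STEP_MB == 0
    )

def is_valid_layout(partitions: List[Partition]) -> bool:
    """True if there are at most 16 partitions and each one is valid."""
    return len(partitions) <= MAX_PARTITIONS and all(
        is_valid_partition(p) for p in partitions
    )

print(is_valid_partition(Partition(1, 1, 1024)))   # smallest partition
print(is_valid_partition(Partition(2, 1, 1100)))   # memory not a 256 MB multiple
```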
Our competition (Sun) offers 18 domains with 72 processors,
but those are physical partitions. Partitioning was first
introduced on the IBM mainframes in the late 1980s.
What are the migration issues involved when companies move
from a room full of servers to a single server with multiple
processors?
The application has to be ported, but that's not an issue.
There are many customers who have made that migration via
AIX. It's more an organizational issue (a decision to move
to a single box). We have tools that offer migration from
Solaris to AIX. It becomes a lot easier because now I am
able to take the multiple boxes and manage them in one place.
That adds value to the customer not just in terms of flexibility
but also in terms of TCO. It's not just the box; it's the
management of the box, the reliability of the box. It's
things like eLiza (self-healing technology): the vendor
can now stop worrying about the box and invest his resources
in building applications so that he can satisfy his customers.
eLiza technology automatically takes corrective action and
thus reduces TCO.
What are the potential applications for the p690 server?
The p690 is designed for both technical and commercial applications.
On the commercial side, it can be used by the banking,
finance, and manufacturing sectors. These industries demand
high reliability, the ability to maintain applications,
high-bandwidth connectivity, and predictable performance.
The other area is high-performance computing, where it
can be used in universities, for weather forecasting, and
in the defense industry. I can build a supercomputer by
connecting a series of such machines. Each processor is
capable of four floating-point operations per cycle. At 1.3
GHz, that's over 5 gigaflops per processor, and over 166
gigaflops for a 32-way system. Take six of those systems and
I have about a teraflop.
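The arithmetic behind these figures can be reproduced as a theoretical peak calculation. The sketch assumes the 1.3 GHz clock and four floating-point operations per cycle per processor quoted in the interview; sustained performance on real workloads would be lower.

```python
# Theoretical peak only. Assumptions taken from the interview:
# a 1.3 GHz clock and 4 floating-point operations per cycle per processor.
CLOCK_GHZ = 1.3
FLOPS_PER_CYCLE = 4
PROCESSORS_PER_SYSTEM = 32
SYSTEMS = 6

per_processor_gflops = CLOCK_GHZ * FLOPS_PER_CYCLE
per_system_gflops = per_processor_gflops * PROCESSORS_PER_SYSTEM
cluster_gflops = per_system_gflops * SYSTEMS

print(f"Per processor: {per_processor_gflops:.1f} gigaflops")
print(f"32-way system: {per_system_gflops:.1f} gigaflops")
print(f"Six systems:   {cluster_gflops:.1f} gigaflops")
```

Six 32-way systems come out just under 1000 gigaflops, which is why the answer rounds to a teraflop.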
What is the basic configuration of the p690? What is
its scalability like?
The p690 comes in modules. I can have a 1-, 2-, 3-, or 4-module
system. Each module has 8 processors. So it can range from
an 8-way symmetric multiprocessor system to a 32-way
symmetric multiprocessor. So we offer 8-, 16-, 24-, or
32-way systems. The processors are either 1.1 GHz (1100 MHz)
or 1.3 GHz. In future we will move to 1.5
GHz and 2 GHz, and the buses will scale with it.
We believe that you must design today not just with the
technology that is available, but with the technology that
will come in the next few years.
How does the price of this machine compare with that of
the Sun Fire 15K, assuming both have similar configurations?
Regatta is half the price of a Sun Fire. On a processor-to-processor
basis, we offer twice the computing power at half the cost.
Our 32-way system outperforms Sun's 72-way server.
So you are looking at putting fewer processors in the
box?
We are not looking at putting fewer processors in the box,
but putting in enough processors to do the job.
Fewer processors add value to the customer too, because
many software vendors charge by number of processors. If
we offer 32 processors at 1.3 GHz it is a much smaller number
than 72 processors at 900 MHz (as in the Sun Fire). So I
have software savings in addition to higher performance.
On the technology front how does Regatta compare with
HP Superdome and Sun Fire?
Both of those are using technology that is a generation or two
old. We started this project five years ago (end of 1996),
and when we began we tried to forecast where our competitors
would be now and set our targets for the Regatta system.
Both HP and Sun have slipped years from where they said
they would be. We delivered Regatta on the date we said
we would five years ago.
Their switches are a generation behind us. In the area of
the chip itself, at 1100 MHz the POWER4 chip consumes 115
watts of power. The Sun at 900 MHz consumes 75 watts. As
you increase the frequency, the power increases proportionally,
so at 1100 MHz the Sun chip would be about 90 watts, and that's
for one processor. The 115 watts for the POWER4 chip
is for both processors and the second-level cache.
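The power comparison in this answer rests on the stated assumption that power scales proportionally with frequency. A minimal sketch of that arithmetic follows, using only the wattage and clock figures quoted in the interview; the function name is illustrative.

```python
# Assumption (from the interview): power scales linearly with clock frequency.

def scaled_power_watts(watts: float, from_mhz: float, to_mhz: float) -> float:
    """Scale a power figure linearly from one clock frequency to another."""
    return watts * to_mhz / from_mhz

sun_at_1100 = scaled_power_watts(75, from_mhz=900, to_mhz=1100)

# The 115 W POWER4 figure covers two processors plus the second-level cache,
# so dividing by two gives an upper bound on the per-processor share.
power4_per_processor_max = 115 / 2

print(f"Sun chip scaled to 1100 MHz: {sun_at_1100:.1f} W for one processor")
print(f"POWER4 at 1100 MHz: at most {power4_per_processor_max:.1f} W per processor")
```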
So it's not just important to have good technology, but
also to know how to use it. It's how you build systems using
technology. Our chips run cooler. The hotter the chip, the
lower the reliability, so a cooler chip reduces your TCO.
Comparing our system with HP Superdome, I'd say Regatta
outperforms the HP machine and takes up less space. If you
look at the HP machine in terms of weight and packaging,
there is a significant difference.
Until now developers would first make applications for the
Solaris platform. Having a Unix server with new technology
does not give you an advantage. What about the applications?
We have set up porting centers where we work with the developers
to help them port their applications to our platform. So
eventually all the important applications will be there
on our system.
Brian Pereira can be reached at brianp@networkmagazineindia.com