No more idling, go with the grid
Grid computing is a better way to use idle computing power
in your enterprise. Here are a few ways to benefit from these new technologies,
says Dr Alok Chaturvedi
In the last decade we started seeing an increased use of distributed computing
systems in academic and scientific settings. Early examples of distributed computing
mostly involved the linking of distributed supercomputing resources to create
one large virtual supercomputer.
In more recent times, we are seeing less powerful single processor machines
linked together to produce some of the most powerful supercomputers on earth.
This trend has been greatly facilitated by grid computing.
What It Attempts
Grid computing attempts to solve the problems associated with linking geographically
distributed heterogeneous computing resources through a network such as the
Internet. It draws its name from the idea of an electronic power grid, where
users can tap into the grid to take advantage of idle computing resources.
The central issue addressed by grid architecture is the definition of the protocols
necessary to enable interoperability and the sharing of resources between organisations.
It is a standards-based open architecture that allows extensibility, interoperability,
portability and code sharing.
Thus, grid computing can be viewed as an infrastructure that enables the integrated,
collaborative use by scientific institutions of computers, networks, visualisation
resources, storage and data sets that may be owned by different organisations.
Gains Of Grid Computing
The gains to organisations using distributed computing enabled by grid technology
can be significant, particularly as previous studies have estimated that
organisations tend to use only 20 percent of their desktop computing capacity.
However, the collaborative use of computing resources raises a number of interesting
problems for businesses.
A specific issue is the problem of sharing and using resources in distributed
computing systems such as computational grids and P2P computing networks.
Some Grids
An early implementation of a grid community took place at NASA, creating the
Information Power Grid which used grid-based technology to interconnect its
supercomputing facilities to allow multi-site simulations of whole space shuttle
designs. In Europe, there are 20 grid projects underway, involving scientific
and academic institutions such as CERN.
In fact one could argue that Europe is leading the charge in distributed computing
or grid technology. They have two major initiatives planned. The first is called
Enabling Grids for E-Science, which aims to build the largest international
computing grid to date. It involves over 70 institutions around Europe with
the goal of providing 24-hour access to computing capacity equivalent to 20,000
of today's most powerful personal computers. The second is a project led
by France's National Centre for Scientific Research, which aims to link
together seven supercomputing centres around Europe on optical networks.
During the last year, Purdue and Indiana University linked up their supercomputers
to run advanced simulations for the US Defence Department. Recently they also
joined the NSF-funded Teragrid, which links their computing facilities with
computing centres around the country on a high-speed (20 to 40 Gbps) network.
Purdue plans to allocate 15-20 percent of its available compute cycles to this
grid.
In May this year, the National Centre for Supercomputing Applications took 70
Sony Playstation 2 Game Consoles, linked them up using off-the-shelf components,
and created one of the 500 most powerful supercomputers, running on Linux for
$50,000.
If you compare this to the huge costs of buying and operating mainframes or
supercomputers, these new low-cost virtual supercomputers mark a significant
development.
Outside Academic Streams
We also saw the rise of distributed computing outside the academic and scientific
realm. For example, the nineties saw the rise of SETI@home, a downloadable screensaver
used to mine astronomical data. It runs in the background when users' computers
are idle, and currently has 3.8 million users in 226 countries, spanning 77
different processor types.
We have also seen the rise of related peer-to-peer networks such as Napster,
Gnutella and Kazaa which involve individuals sharing files across the Internet.
The rise of distributed computing systems in the academic and scientific community
generated a similar trend in the business sector. We began to see business organisations
increasingly using the distributed computing ideas being developed in the scientific
and academic sectors. For instance, we initially witnessed the rise of cluster-based
computing, which involved loosely-coupled homogeneous single or multi-processor
systems connected to each other through the Internet.
Vendors With Grids
The trend towards distributed computing has gained momentum in the last two
years with the introduction of computational grid-based initiatives by several
large technology firms including IBM, Sun, Oracle, HP, Computer Associates and
Avaki. HP, for example, began offering an enterprise grid consulting programme
to allow firms to respond to enlarged workload by tapping additional computing
resources.
Oracle has started shipping its database and application servers (Oracle Database 10g
and Oracle Application Server 10g) with grid-enabling technology. In January this year
IBM introduced grid technology-based products targeted at specific industries
that require high-powered computing. These industries include the financial,
pharmaceutical, aerospace, life science and automotive sectors. The idea is
that companies in these industries can use grid-based solutions to meet their
high-end and/or high-throughput computing needs.
Applications On Grid
The types of applications envisaged for computing grids include
advanced financial or numerical analyses, collaborative design and research,
scientific analysis such as protein folding or genome mapping, simulations,
and analysis of complex systems, among other uses.
These distributed computing systems would also allow firms to handle spikes
in demand as well as earn revenues on unused system capacity (assuming firms
earn revenues on capacity leased to the computing grid), greatly enhancing the
efficiency of their systems while serving as an insurance mechanism for computing
demand fluctuations.
Grids In Enterprises
Grid-based solutions and products can be highly effective and useful for businesses.
One financial company implemented grid technology from IBM to run a numerical
analysis on a wealth management programme, and cut the time required to complete
the analysis from 280 seconds to 15.
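To put those two run times in perspective, the short Python sketch below works out the implied speed-up. The function names are illustrative only and are not part of any IBM product; the only inputs are the 280-second and 15-second figures quoted above.

```python
def speedup(before_s: float, after_s: float) -> float:
    """Return how many times faster the job runs after the change."""
    if before_s <= 0 or after_s <= 0:
        raise ValueError("run times must be positive")
    return before_s / after_s


def time_saved_fraction(before_s: float, after_s: float) -> float:
    """Return the share of the original run time that was eliminated."""
    if before_s <= 0 or after_s <= 0:
        raise ValueError("run times must be positive")
    return (before_s - after_s) / before_s


factor = speedup(280, 15)
saved = time_saved_fraction(280, 15)
print(f"speed-up: {factor:.1f}x, time saved: {saved:.1%}")
# prints "speed-up: 18.7x, time saved: 94.6%"
```

In other words, the analysis ran almost 19 times faster. The article does not say how many machines were involved, so nothing can be inferred here about how efficiently each one was used.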
Pharmaceutical company Aventis recently implemented data grid technology from
Avaki to allow secure, wide-area access to bioinformatics research data while
allowing users to significantly increase the speed of applications.
Swiss pharmaceutical giant Novartis initially considered buying a supercomputer
to design new drugs. However, the realisation that they had access to a virtual
supercomputer in the unused cycles of the thousands of PCs used in their offices
led them to buy a grid-based solution from United Devices. This resulted
in a computational grid of 2,700 desktops that led to the discovery of several
important molecules. The programme was so successful that they plan to expand
the grid to include all the 70,000 computers on their corporate network.
In November last year, HP announced that it was linking up some of its high-end
computers with BAE Systems (a British aerospace and defence systems company),
Cardiff University, the University of Wales, Swansea, and the Institute of High
Performance Computing in Singapore. The aim was to create an inter-organisational,
international, multiple-platform, distributed computing grid to do advanced
simulations and design.
Grid Computing Standards
The increased interest in distributed computing systems has also initiated
a movement to establish sets of standards for grid computing. The Open Grid
Services Architecture is an attempt to define the mechanisms for creating, managing
and exchanging information among entities. It draws from the Globus Toolkit
developed at Argonne National Laboratory and the University of Southern California.
Advantages For Enterprises
There are many advantages for firms moving to a distributed computing environment.
First, it allows companies to exploit under-utilised resources within the organisation
and within other organisations. Additionally, becoming a part of such a system
will give companies access to additional resources at times when they may need
them. Thus, organisations benefit not only from being able to earn revenues
on their idle capacity (from other organisations that are part of the computing
grid), but also from being able to take advantage of other organisations'
idle capacity. Such systems would thus allow firms to handle spikes in
computing demand while simultaneously increasing the efficiency of their systems.
Second, organisations belonging to a computing grid are capable of doing jobs
that benefit from large-scale parallel processing. They also gain access to
hardware and software that are not a part of their own infrastructure. Thus,
they can better manage their own investments in computing resources and gain
access to a more balanced set of resources.
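The lend-and-borrow idea behind these advantages can be sketched in a few lines of Python. This is only an illustration under simple assumptions: capacity and demand are measured in the same hypothetical unit (say, CPU-hours per period), the figures are invented, and the function name settle_capacity is not taken from any real grid product.

```python
def settle_capacity(capacity: int, demand: int) -> tuple[int, int, int]:
    """Split one period's demand into local work, borrowed and lent capacity."""
    used_locally = min(capacity, demand)
    borrowed = max(0, demand - capacity)  # a spike is covered by other members
    lent = max(0, capacity - demand)      # idle capacity is offered to others
    return used_locally, borrowed, lent


# Hypothetical demand per period for a firm that owns 100 units of capacity.
demand_per_period = [20, 20, 150, 40]
for demand in demand_per_period:
    used, borrowed, lent = settle_capacity(100, demand)
    print(f"demand={demand:>3}  local={used:>3}  borrowed={borrowed:>3}  lent={lent:>3}")
```

In the quiet periods the firm uses only 20 of its 100 units, matching the 20 percent utilisation estimate quoted earlier, and can lend the remaining 80. In the spike period it borrows 50 units instead of buying hardware that would sit idle the rest of the time.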
Classes Of Useful Applications
The classes of grid applications that businesses might find useful include:
- Distributed High-Performance Computing (HPC) used in
computational science.
- High-Throughput Computing (HTC) used in large-scale
simulations, chip designs and parameter studies.
- Remote software access used by ASPs and Web services.
- Data-intensive computing used in drug design, particle
physics, and stock prediction.
- On-demand real-time computing such as medical instrumentation
and remote medicine.
- Collaborative computing used to do combined design,
data exploration, and education.
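Of the classes above, high-throughput parameter studies are the easiest to picture in code, because every run is independent of the others. The Python sketch below is a minimal local illustration, not a grid implementation: a thread pool stands in for the grid nodes, and simulate is a hypothetical placeholder for a real simulation job. On an actual grid, a scheduler would hand each combination to whichever machine is idle.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def simulate(params):
    """Hypothetical stand-in for one independent simulation run."""
    rate, years = params
    # Compound growth of 1.0 unit at `rate` per year over `years` years.
    return params, (1.0 + rate) ** years


def run_parameter_study(rates, horizons, workers=4):
    """Evaluate every (rate, years) combination.

    The runs do not depend on each other, so each one can be handed
    to whichever worker happens to be free.
    """
    combinations = list(product(rates, horizons))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(simulate, combinations))


results = run_parameter_study(rates=[0.02, 0.05], horizons=[1, 10])
best = max(results, key=results.get)
print(best)  # prints "(0.05, 10)"
```

Because no run waits on another, adding workers shortens the overall study without any change to the simulation code itself. Threads are used here only to keep the sketch self-contained; they do not by themselves speed up CPU-bound Python code.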
A Few Recommendations
Here are some recommendations for exploiting the emerging high-performance
computing environment:
- A computational environment that can support high-productivity
organisations: Building a computational environment for high-productivity
activities with new generations of high-end programming environments, software
tools, architectures and hardware components would result in many benefits.
It would enhance computational efficiency and performance, reduce the time
and cost of developing applications, and enhance portability to insulate research
and operational applications from system specifics. It would also provide
a common user interface standard for ease of use, improve reliability, reduce
the risk of malicious activities, and create a scalable architecture for processors,
memory, interconnects, system software and programming environments. Further,
a computational environment would provide high bandwidth and design point
tailorability such as programming models and virtual machines.
- Support for a large-scale storage network for data
intensive applications: Forming a support mechanism for applications with
large, high-rate streams called Data Intensive Applications (DIA) would enable
the full use of increasing processing capabilities and reduce the under-utilisation
of system resources due to restrictive data flow and high latency.
Organisations would be capable of many new developments by creating these DIAs.
For instance, new data access architectures to optimise processing of high-rate
stream data applications could be developed.
Data Flow and Placement architecture could also be made to
significantly improve data availability. Another benefit would be to create
augmented and adaptive cache techniques to optimise data movement and utilisation.
Other advantages include the design of innovative models of traffic, network
and control, as well as measurement and validation tools.
- A dynamic, trusted, collaboration environment where security
management is a priority: Developing a dynamic, trusted collaboration environment
would support the secure creation of technologies for policy management, group
communication, secure services, data sharing and collaboration spaces.
Examples would include creating a single log-in process, team-based access
control, a common user interface, inter-domain key management, a certificate
cache architecture, facilities for dynamic delegation, system configuration,
administration tools and methods, security documentation of open source
systems, and information assurance methods and tools.
- A unified, high-speed, wired and wireless network infrastructure:
The establishment of a Unified Network Architecture would serve as a highly
dynamic runtime environment to support fine granularity network services as
a basis for describing, provisioning and tailoring resources. It would offer
flexible, efficient and secure protocols for communication strategies, scalable
network management, and quality of service management. Also, it would supply
a quantifiable improvement in network services and fault tolerance, and add
multi-tiered mobile security such as dynamic access control and separate traffic
and administrative services.
Dr Alok Chaturvedi is the Director of Purdue Homeland Security
Institute, Purdue University, West Lafayette, Indiana, USA