Storage architecture
Pooled storage is the way to go
A look at how enterprise applications can benefit when storage
resources are grouped into a common pool.
by Agendra Kumar
IT architecture has undergone rapid and continuous change in the past few decades,
evolving from mainframe-based systems to client-server to multi-tiered, Web-based
architectures. However, the way systems handle storage has remained essentially
the same, resulting in more storage headaches. The next major change in IT architecture
will create a new computing
paradigm in which storage is no longer tied to individual servers, offering
simplification and scalability through consolidation.
Several trends are converging to speed change in this area:
- As data-centric systems become increasingly critical to business operation,
keeping data online and available becomes a primary concern. The Internet,
for example, fuels 24x7 service expectations. To address this, companies adopt
redundant storage techniques such as mirroring and replication, increasing
overall storage capacity and complexity.
- Per-gigabyte storage hardware costs are dropping rapidly because of advances
in storage technology. But the demand for storage is increasing more rapidly
than the manageable capacity, fed by a growing number of data-centric applications
and the increasing use of analytic applications using this data. Consider
the daily archival of clickstream data on major websites: this gives an
idea of the magnitude and growth rate of the problem.
- Hardware costs, although significant, are only a small part of the total
cost of ownership for storage; the costs of managing and maintaining this
storage can dwarf the actual hardware outlay. The total cost of ownership
for storage is growing at a rapid pace, exacerbated by difficulties finding
and keeping qualified administrators for critical environments.
In short, we have reached a point at which hardware advances alone cannot keep
pace with the storage and data requirements of today's IT environment. We need
to change our way of thinking about storage, focusing on how storage fits in
the IT architecture overall, instead of within individual systems.
The next major evolution of IT architecture will alleviate these problems, consolidating
storage in the enterprise by de-coupling storage from individual systems. This
shift has already begun with the emergence of the Storage Area Network
(SAN), the first step toward a consolidated storage environment.
Future Server Architecture
Analysts have predicted these changes will occur with the arrival of a Future
Server Architecture that is already emerging today. This architecture
will leverage the possibilities of SANs for true storage consolidation, with
server, OS, and software enhancements to manage consolidated data.
According to the Gartner Group, the future server architecture
requires software to manage and access consolidated data, such as cluster file
systems, logical and physical partitioning, and OS management frameworks. Unix
server and storage management vendors are already working on these solutions,
and this architecture should become mainstream in a few years, driven by the
enormous benefits of simplified management and reduced cost of ownership.
Storage Area Networks (SANs) have taken the first step in this new architecture
by physically de-coupling storage from specific servers. One of the great hopes
for SANs is simplification through consolidation. But the standard operating
system, volume management and file system software accessing this storage is
still designed for the old "dedicated storage per server" model that
becomes increasingly difficult to manage as systems and storage proliferate.
Emerging Storage Requirements
Consolidated storage hardware, as exemplified by SANs, is a
necessary first step towards a consolidated storage architecture. But achieving
true storage and data consolidation requires supporting technologies that can
offer a single image of shared storage for data access and data management activities.
The emerging storage architecture requires advances in both hardware and software,
including:
- Shared storage hardware architecture (SAN)
- OS, volume and file system software that leverages shared storage for shared
data and consolidated storage management
- Applications designed to leverage shared data
The shared hardware architecture is already in place with SANs, while the system
software component is coming into place with a number of products such as clustered
file systems and SAN management applications. Clustered file system applications
offering shared data access are beginning to deliver the benefits of this architecture:
namely, storage and server consolidation, improved scalability, and reduced
total cost of ownership through simplified management.
Applications designed for a "shared data" or cluster-specific environment
are rare and exist primarily as special-purpose applications, with a few notable
exceptions in the database space. However, many applications are naturally suited
to a clustered storage environment, offering a chance to adopt this architecture
transparently and immediately for key components of the IT space.
Do You Need to Share Data?
One might question the need for concurrent access to the same data from multiple
servers. Why do we need software changes given the physical consolidation of
the SAN? The answer is that sharing data solves many pressing problems in IT
today.
An environment that can provide a single image of consolidated storage to multiple
servers has many advantages. While a SAN lets you physically consolidate storage,
this environment lets you consolidate the data itself. For example:
- Rather than maintaining multiple images of important data, you can put
critical data in a single location on the highest-performing, most highly
available storage devices. This simplifies data management and potentially
reduces the aggregate storage capacity required.
- Load management is simplified because it is easier to add servers for applications
or to switch a server's tasks to take over a different application. You can
simply add the new files to a shared file system, accessible by all file servers.
- Capacity planning is greatly simplified, as storage decisions are de-coupled
from server hardware decisions. You can add new storage when storage demands
exceed capacity, without having to assign the storage to specific servers.
- Shared storage also makes it easier to provide highly available systems.
If multiple servers have mounted the same file system, then they are available
to pick up the load of a failed server almost instantly. Failover times can
be reduced to seconds instead of many minutes.
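The capacity-planning point in the list above can be illustrated with simple arithmetic. The following is a minimal sketch, not a sizing tool: the server names and gigabyte figures are hypothetical and do not come from any real deployment. It compares whether a request for extra space can be satisfied when each server owns its own disks versus when all capacity sits in one shared pool.

```python
# Hypothetical figures for illustration only; none come from the article.
# Each entry maps a server name to (capacity provisioned in GB, GB in use).
servers = {
    "web": (500, 450),
    "mail": (500, 120),
    "dss": (500, 480),
}


def stranded_free_space(servers):
    """Free space per server under the dedicated-storage model."""
    return {name: cap - used for name, (cap, used) in servers.items()}


def can_place_dedicated(servers, name, extra_gb):
    """With dedicated storage, growth must fit on that server's own disks."""
    cap, used = servers[name]
    return used + extra_gb <= cap


def can_place_pooled(servers, extra_gb):
    """With pooled storage, growth only has to fit in the total free space."""
    total_cap = sum(cap for cap, _ in servers.values())
    total_used = sum(used for _, used in servers.values())
    return total_used + extra_gb <= total_cap


if __name__ == "__main__":
    print(stranded_free_space(servers))
    print(can_place_dedicated(servers, "web", 100))
    print(can_place_pooled(servers, 100))
```

In this made-up case the web server cannot grow by 100 GB on its own disks, even though the mail server has far more than that sitting idle. Treated as one pool, the same hardware absorbs the request without any new purchase.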
Applications
The concept of storage linked to specific systems is ingrained in current application
design and systems. Several applications in the enterprise can benefit immediately,
without application re-working, from shared storage implementations using clustered
file systems that arbitrate access between multiple servers and ensure file
system consistency.
Data extraction/loads for decision support systems today are one example of
a common situation made more difficult by the lack of a single file system name
space shared between servers. Extract/load operations are often high maintenance
processes relying on hardware dependent scripts developed and maintained locally.
A clustered file system offers a much more manageable solution. Rather than
exporting and loading data, you can simply split the mirror image for the production
system, perform the extract from the split mirror, and access it immediately,
using a shared file system, from the source DSS system.
Application binary repositories can also benefit from storage consolidation.
The problem is particularly acute in object oriented middleware environments,
in which many applications need to access the same binary image of the application
set. Rather than maintaining and upgrading distributed versions, a shared file
system enables centralized access to these files, improving overall manageability
as well as consolidating storage for better efficiency.
Database data is a special case, as it frequently resides outside the file system
in raw partitions. Because the database manages access to data at a very granular
level, the database engine itself must work in a clustered environment. Oracle
Parallel Server, for example, can leverage an integrated computing/storage
environment in which storage and data are shared between servers. This leads
to improved scalability and manageability.
Many other potential applications for data sharing exist in the enterprise space,
including the rather common applications of file serving, e-mail serving, and
source code repositories for development environments.
Future Watch
Although the conservative approach is to wait for server vendors to take this
emerging architecture "mainstream," many businesses cannot afford
to do so. Growing e-Businesses and others with pressing data management needs
can achieve significant benefits immediately by moving to shared data implementations
for key applications and servers.
Products are available in the market that offer a single system image
and single storage image in the SAN environment. In many cases, these products
are based on mature file system, volume management, and clustering software,
so they represent an evolutionary change to proven software instead
of a completely new endeavor.
The writer is Country Manager, VERITAS Software.