Issue of September 2002 
Special Feature - Storage
A storage architecture for your enterprise

As the volume of data generated in business grows, it becomes increasingly difficult to manage data and storage subsystems. A storage architecture is the solution, but how do you choose the right one for your enterprise?

Information may be a prized business asset, but what's the use of having it digitized and stored away if you can't access it when it's needed most? Organizations store data on various storage subsystems scattered across the enterprise, so locating a specific piece of information can be a tedious and time-consuming process, and managing multiple storage subsystems can be a harrowing task for network administrators. Another problem is the growth in the volume of data generated as the business expands. The solution to all this is a proper storage architecture that provides for scalable systems and for the integration and management of storage subsystems. The solution should also offer data security, protection, and recovery.

Acquiring storage needs to be more than just a disk or tape purchase. Your network should evolve to accommodate the storage architecture and function as a whole.

Part of laying the foundation for a storage solution is understanding the OS (Operating System) environment and the technical architecture that defines how storage is connected to the servers that run the different enterprise applications.

Protecting enterprise data

Besides choosing a suitable storage architecture and implementing the appropriate solutions, an IT manager must also consider data protection and recovery. For this there are backup/restore strategies and solutions.

To achieve efficient backup and recovery in an enterprise, one should take a consolidated information infrastructure approach. In a traditional distributed enterprise with decentralized storage, each platform has its own storage and backup process. To achieve zero downtime, the information must be centralized through a networked storage infrastructure. This allows the information to be centrally protected, shared, and managed.

The storage software will then enable business continuance on top of the networked storage layer. You can use the usual backup solutions like various tape media and disk drives.

Server-less backup
In a server-less backup architecture, the backup server is reduced to the role of a coordinator. In a traditional LAN environment, multiple clients or servers are backed up to one server that has the tape automation subsystem attached. Backups run at each server or over the LAN using products like Legato NetWorker or Veritas NetBackup. This technique provides a central point of control for the backup software, but has the disadvantage of requiring all data to cross a relatively slow LAN to reach the server performing the backup.

A SAN solution eliminates the LAN bottleneck and provides direct backup from multiple servers to the tape automation subsystems. This LAN-free backup technique provides multiple paths into the tape automation systems typically through a FC hub or switch. The SAN architecture for tape backup also eliminates the need to pass the data from the disk arrays through a server.

With server-less backup, the server issues the backup commands (control path), but the data path goes directly from the disk subsystem to the tape automation subsystem. This technique frees up cycles on the server to do productive processing and simplifies the movement of data.
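
The control-path/data-path split described above can be sketched with a toy Python model (the class names and the extended_copy call are illustrative stand-ins, not a real backup API):

```python
# Toy model of server-less backup (hypothetical classes, not a real backup API):
# the server issues the copy command (control path), while the blocks flow
# directly from the disk subsystem to the tape subsystem (data path).

class DiskSubsystem:
    def __init__(self, blocks):
        self.blocks = blocks                  # block_id -> data

class TapeSubsystem:
    def __init__(self):
        self.written = []

class DataMover:
    """SAN-resident agent that moves blocks on the server's behalf."""
    def extended_copy(self, disk, tape, block_ids):
        for bid in block_ids:
            tape.written.append(disk.blocks[bid])   # disk -> tape; no server in the data path

class BackupServer:
    """Issues commands only; never touches the data itself."""
    def run_backup(self, mover, disk, tape):
        mover.extended_copy(disk, tape, sorted(disk.blocks))

disk = DiskSubsystem({0: b"jan", 1: b"feb", 2: b"mar"})
tape = TapeSubsystem()
BackupServer().run_backup(DataMover(), disk, tape)
print(tape.written)   # [b'jan', b'feb', b'mar']
```

In real deployments the DataMover role is played by a SAN appliance or an intelligent storage controller executing SCSI Extended Copy commands.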

An enterprise can lose data due to a number of reasons like human errors, disk failure, hardware and software malfunction, and natural disasters. There are a variety of hardware and software tools available to perform efficient backup and recovery. Some examples are tape drives, tape autoloaders, tape libraries, and software like Legato NetWorker, Veritas NetBackup, and CA ArcServe.

Whatever the choice of storage, IT heads need to realize that even in today's tough economy, a comprehensive storage strategy translates into competitive advantage and significant financial gain.

Storage Media and backup devices

There are various types of storage media that an enterprise can use for primary storage and backup.

Disk Drives
The most common storage media in use today. Data is stored on magnetic or optical disks; magnetic disks offer capacities of up to 100 GB, and optical disks even more. Many factors affect the performance of a disk drive, such as data transfer speed, latency, access time, and rotational speed.

Hard disks are either IDE (Integrated Drive Electronics) or SCSI. The advantage of IDE is its lower cost; the advantage of SCSI is that seven or more devices can be attached to the same controller board. SCSI drives are typically used in high-end servers and storage networks.
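
To see how rotational speed and access time interact, here is a back-of-the-envelope calculation in Python (the 7,200 rpm and 9 ms seek figures are assumed example values, not measurements of any particular drive):

```python
def avg_rotational_latency_ms(rpm):
    """On average the platter must turn half a revolution before the data arrives."""
    return 0.5 * 60_000 / rpm            # 60,000 ms per minute

def avg_access_time_ms(seek_ms, rpm):
    """Access time is roughly average seek time plus average rotational latency."""
    return seek_ms + avg_rotational_latency_ms(rpm)

# A 7,200 rpm drive with a 9 ms average seek:
print(round(avg_rotational_latency_ms(7200), 2))   # 4.17 (ms)
print(round(avg_access_time_ms(9.0, 7200), 2))     # 13.17 (ms)
```

The same arithmetic shows why faster spindles pay off: at 15,000 rpm the average rotational latency drops to 2 ms.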

Tape Drives
A good choice for creating backups and an essential ingredient of a disaster recovery solution. Tapes can be stored in libraries for easy access and management. Current tape cartridges support up to 10 GB of storage. Tape drives are available in formats like DLT (Digital Linear Tape), LTO (Linear Tape Open), Ultrium (a high-end LTO format), DAT (Digital Audio Tape), QIC, and Travan.

Recordable DVDs
This media provides a high-capacity solution: you can share, store, back up, and transport large multimedia and data files. DVDs written on an HP DVD-Writer can be read in existing DVD players and DVD-ROM drives. Capacities go up to 4.7 GB.

Magneto-optical media
Magneto-optical rewriteable and permanent write-once disks for optical jukeboxes and drives have capacities ranging from 1.2 GB to 9.1 GB per disk. Manufacturers claim an archival life of 100 years.

Storage solutions should integrate well with OS environments. The various flavors of Unix and Windows NT/2000 represent the open systems area. There is a lot of diversity in these systems and many hardware and software solutions may not interoperate in all these environments. The current versions of NetWare, Windows, and all Unix variants have good support for storage integration.

For instance, there's HP-UX 11i, which integrates well with HP's storage systems.

Third-party storage software (like that offered by Veritas) should also integrate well with the operating system.
One should also consider storage architectures such as HP's ENSA-2 (Enterprise Network Storage Architecture), HP's FSAM (Federated Storage Area Management) and IBM's SSA (Serial Storage Architecture).

A highly available storage architecture lets you:

  • Store and access information on an as-needed basis.
  • Extend your resources through virtual capabilities.
  • Scale up and scale out to meet changing needs.
  • Simplify storage management.
  • Protect your data and your investment.
  • Improve efficiency of storage systems by helping you manage more storage with the same resources.

Careful selection of the type of storage system is another important consideration. There are three technical architectures to choose from: DAS, NAS, and SAN.

DAS (Direct Attached Storage) is commonly used for storing data, but it is limited because it does not allow an organization to share data easily. It is not an enterprise network storage architecture in the true sense; it is also inflexible and offers only short-term cost and technology benefits. NAS (Network Attached Storage) and SAN (Storage Area Network) have evolved as more reliable enterprise network storage architectures. Let's look at all three in detail.

DAS is the most common form of storage used by companies. Storage devices like disk drives or JBODs (Just a Bunch of Disks), tape devices, and RAID systems are directly attached to a host computer. They use various adapters and standardized protocols like SCSI (Small Computer Systems Interface) and FC (Fibre Channel).

In DAS, I/O (Input/Output) is done in blocks from the server to the storage. This type of attachment has set the standard for performance and utilization by the server. With DAS, performance is typically characterized by response time for access to data and by bandwidth for aggregate transfer rate of data.
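
The two measures are related: aggregate bandwidth is simply the block size multiplied by the number of I/O operations completed per second. A quick illustration in Python (the 64 KB block size and 1,500 IOPS are made-up example numbers):

```python
def bandwidth_mb_per_s(block_size_bytes, iops):
    """Aggregate transfer rate = block size x I/O operations per second."""
    return block_size_bytes * iops / 1_000_000

# 64 KB blocks at 1,500 I/Os per second:
print(bandwidth_mb_per_s(64 * 1024, 1500))   # 98.304 (MB/s)
```

Small random I/Os stress response time; large sequential transfers stress bandwidth, which is why both figures matter when sizing DAS.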

DAS has limited flexibility and does not allow you to scale outward easily. A company's network will typically continue to grow while data on the various DAS devices remains scattered across different locations of the network, preventing all the users in the network from accessing and sharing the data. Management becomes a hellish task with so many disparate 'islands' of information all over, and data security and protection are equally difficult, as no form of centralized control can be established.

It's no secret that companies serious about data storage cannot rely solely on DAS as an effective storage strategy; it has too many limitations. IDC estimates that storage management costs can be reduced by 40 percent, and that IT administrators can manage 750 percent more storage capacity, by moving away from traditional DAS.

However in a SOHO (Small Office Home Office) environment with very few users, a DAS can handle the storage needs fairly well.

NAS is a specialized file server that can be plugged into the network (LAN) just like a network printer, hence the name 'network-attached'. It provides file-level access to data and uses standardized protocols like NFS (Network File System, Unix-based), CIFS (Common Internet File System, Windows-based), and TCP/IP to communicate. One of the attractive features of a NAS is that it can serve both Unix and Windows users seamlessly and share the same data between the different architectures. It is ideal for small and mid-sized companies with a fairly large volume of data.

With a NAS, storage does not become an integral part of the server. It has a storage-centric design where the server still handles all the processing of data but the NAS device delivers the data to the user. If you have an overflowing hard disk on your main server, a NAS device can allow you to stretch and offer breathing room. You can move archives or completed projects from the main server to the NAS and still allow users in the network to access it. Data storage, security, and backup management can also be centralized.

Computer systems access data on a NAS over the network via a file 'redirector' that converts a file access from the native file system (on the originating computer) into a network operation over TCP. The NAS device runs software that allows its file system to serve each client's access; that file system determines whether the data requested by the application client is in its cache or on the storage.
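
A toy model of that redirector in Python (the mount table and host name are hypothetical, and no real NFS or CIFS traffic is involved): paths under a network mount point are routed to the NAS, everything else to the local file system.

```python
NAS_MOUNTS = {"/mnt/nas": "nas01.example.com"}   # assumed client-side mount table

def route(path):
    """Decide whether a file access stays local or becomes a network operation."""
    for mount_point, nas_host in NAS_MOUNTS.items():
        if path.startswith(mount_point + "/"):
            return ("network", nas_host)   # redirector wraps the call in NFS/CIFS over TCP
    return ("local", None)                 # native file system handles it

print(route("/mnt/nas/projects/q3.dat"))   # ('network', 'nas01.example.com')
print(route("/home/user/notes.txt"))       # ('local', None)
```

The application never knows the difference: it simply opens a path, and the redirection happens beneath the file system interface.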

NAS adds overhead to the LAN because it rides on the TCP/IP protocol stack and consumes processing power. It may also introduce latency into the network, since another processing element is placed in the I/O access path.

A SAN is a high-speed dedicated storage network or subnetwork that can integrate RAID arrays, tape backups, CD-ROM libraries, and JBODs. The SAN allows data transfers between computers and disks at high peripheral-channel speeds, as if the storage were directly attached. It is ideal for large companies whose networks span large geographical areas and for medium-sized companies that expect quick growth. SANs are used by industries like petroleum, banking and financial services, and retail manufacturing.

SAN deployments are largely driven by the use of the FC (Fibre Channel) standard as a common interface. FC uses a circuit/packet-switched topology capable of providing multiple simultaneous point-to-point connections between devices. It offers good connectivity and scalability, and allows large distances between devices. The advent of the SAN is driven by the requirements of applications like data warehousing, data mining, and OLTP (On Line Transaction Processing), which need high bandwidth and cannot tolerate latency. The FC interconnection protocol allows data transmission speeds of up to 1 Gbps, considerably faster than the parallel SCSI buses found in traditional PC and server devices.
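
A note on the 1 Gbps figure: Fibre Channel uses 8b/10b encoding, so every data byte costs 10 bits on the wire. A rough payload conversion (nominal rates, ignoring protocol framing):

```python
def fc_payload_mb_per_s(line_rate_gbps):
    """Under 8b/10b encoding, 10 line bits carry one data byte (8 data bits)."""
    return line_rate_gbps * 1000 / 10

print(fc_payload_mb_per_s(1.0))   # 100.0 (MB/s)
```

This is why 1 Gbps FC is usually quoted as roughly 100 MB/s of usable throughput.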

Like NAS, a SAN treats storage as separate from the server, but unlike NAS, the SAN architecture involves an independent network or subnetwork. It provides its own network for storage and offloads the primary network from all storage-related I/O and backups. With a SAN, servers are not directly involved in the storage process; they simply monitor it. By removing I/O from the servers and the LAN/WAN, you free up bandwidth for applications, which enhances network performance and removes traditional bottlenecks. With a SAN switch, you can permit concurrent traffic between all servers on the network and share all the storage devices.

SANs support disk mirroring, backup and restore, archival and retrieval of archived data, data migration from one storage device to another, and sharing of data among different servers in a network. SANs can incorporate subnetworks with NAS systems.

FC is without question the backbone of a SAN architecture, but SCSI can still be used as the interface linking storage devices to the SAN backbone, because FC supports the simultaneous transfer of different protocols. SANs also support ESCON (IBM's optical fiber interface).

Instead of putting the storage directly on the network, the SAN concept puts a network between the storage subsystems and the server. This means that a SAN actually adds network latency compared to the DAS storage model. SAN standards are still in the formative stage, and vendors like EMC, Compaq, and HP have announced proprietary standards. This collection of proprietary architectures may create roadblocks to successful NAS and SAN integration and to data sharing between heterogeneous platforms.


Copyright 2001: Indian Express Group (Mumbai, India). All rights reserved throughout the world. This entire site is compiled in Mumbai by The Business Publications Division of the Indian Express Group of Newspapers. Site managed by BPD