
Designing an Enterprise Storage Network

Enterprise Storage Network (ESN) is an architecture used to extend the reach and flexibility of a storage infrastructure in order to leverage its value to enterprises. What this means is that an ESN architecture consists of both enterprise storage and storage networking technologies to deliver a common way to manage, protect, and share information regardless of distance or scale.

While designing and implementing an ESN, you have to take into consideration several aspects, such as the possible future expansion of the enterprise, the number of connections required, security issues, scalability, availability, performance, and the services required. It is important to understand at the outset that ESN represents an inclusive strategy: it encompasses SAN, NAS, and direct attached connections, because the requirements of most practical situations cannot be met by a single connection topology.

Enterprise Storage
Enterprise storage is at the heart of an Enterprise Storage Network. In order to meet the demands placed upon an information infrastructure by an organization where information is a critical part of its success, the enterprise storage system must perform certain functions.

These are:

  • Heterogeneous connectivity-The application and computing environment of an organization must be flexible to meet the broad and evolving needs of that organization. In order to permit this flexibility, the information infrastructure must be capable of supporting any of the potential computing environments the organization would need at present and in the future, be it Unix, NT, Midrange, Linux, or Mainframe operating systems and associated hardware platforms.
  • Functionally flexible-As information needs of an organization evolve and grow, so will the applications of an enterprise storage system. In order to extend its useful life and reap the greatest RoI, enterprise storage must be scalable. For instance, a system may start off by providing storage to a legacy mainframe application that will eventually become obsolete. The system can then be re-deployed to provide network attached storage for a web environment.
  • Information protection-As information becomes more and more critical to the success of an organization, the ability to ensure continuous access to that information becomes equally important. Enterprise storage must provide for fault tolerance so as to avoid unplanned outages, information availability in the event of unplanned outages, information recovery following such outages, and enable the continuance of information availability through planned interruptions (e.g. backup or batch processes).
  • Information management-Information explosion and technological advances have resulted in an increased demand for information storage. As the amount of information organizations are demanding continues to grow, the ability to manage that information and the infrastructure that supports it also becomes a very critical factor. Enterprise storage systems need to not only facilitate the operation, maintenance, and allocation of information storage; but in addition, they need to facilitate the ability to access that information from any system across the enterprise--all with fewer and fewer resources per unit of data.
  • Information sharing-Speed and flexibility have become the dominant drivers in the way organizations manage their information. Hence, organizations will have to be able to leverage the same information among many applications and hosts. Enterprise storage needs to provide the functionality to enable multiple systems to access the same information for such applications as web hosting, messaging, data warehousing and engineering systems.
Enterprise storage provides the foundation and functionality for an Enterprise Storage Network. In order to leverage the value of this foundation across the entire organization, seamless access to the information infrastructure must be provided across the enterprise. Storage networking offers two technologies to address this evolving connectivity need: Storage Area Networks (SANs) and Network Attached Storage (NAS).

Storage Area Networks
SANs today extend the point-to-point SCSI protocol used to connect host servers to their storage arrays into a network architecture by encapsulating it in the fibre channel protocol, which supports a many-to-many topology. In addition to enabling storage and hosts to be networked together, fibre channel also provides a distance (10 km) and speed (100 MB/s) improvement over SCSI.

As a result of these advantages offered by fibre channel, storage infrastructures can now be designed to offer:

  • Greater consolidation
  • Better utilization of storage assets (disk and tape)
  • Ability to access any storage system from any host

It is important to note that while fibre channel enables these types of networked architectures, it is a technology based on channel I/O. That is, each host effectively has its own dedicated storage; it just happens that the storage may be in multiple systems or even in another building.

Network Attached Storage
Network attached storage leverages the maturity and universality of IP networks to provide access to information. While a SAN essentially provides network topologies to do block-level SCSI I/O, a NAS provides network access to files. In order to accomplish this over IP, specialized file-serving protocols such as NFS (for Unix) and CIFS (for NT) are used by the hosts/servers to communicate with a file server. The file server itself manages and accesses information and distributes it over the network to the initiating host. When you access a network drive through your workstation (e.g. a 'G' drive on Windows), you are using network attached storage, where the file server is some network server that you have been given access to through your network login.

Because the file server provides a centralized point of management for the file system that holds the files to be accessed over the network, the same file could be accessed by any of the hosts or clients logged in to the file server. This makes network attached storage ideal for applications and environments that require the sharing of files among multiple hosts or clients even if they each have different operating systems.

On the other hand, IP and Ethernet networks do not provide the same predictability of performance through a dedicated-like service that channel based technologies such as SCSI or fibre channel are designed to do. As a result, NAS is less suited to those applications and environments that depend on dedicated access to storage (e.g. databases or client-server applications).

Why ESN?
A number of factors have come together to initiate a fundamental change in the way information infrastructures need to be architected to meet the needs of organizations. These include the sheer quantity of information, technology advancements, limited human resources, and the need to be flexible. As a result, traditional point-to-point storage topologies are unable to meet the emerging needs of organizations: they either fail to take advantage of the available technology or are inflexible and require more effort to manage and maintain.

Enterprise storage networks, through both SAN and NAS technologies, offer the ability to implement an information infrastructure that addresses these factors.

What's changed?
The following challenges have converged to initiate the need for storage networks:

  • Rate of information growth.
  • Drive for management efficiency.
  • Storage/disk technology advancements.
  • Pace of change in business requirements.

The quantity and criticality of information that organizations are storing, managing, and maintaining is growing at an exponential rate. The contributing factors include the explosion of Internet and web content, the emergence of digital media assets (audio, video, and images), data warehousing, and CRM applications. As a result, IT shops are realizing that deploying storage in traditional (point-to-point or host-captive) architectures leaves a multitude of storage pools that are unwieldy and costly to manage (purchasing, maintenance, providing access to applications, backup, etc.).

Storage networks offer the ability to consolidate both the storage and management for large environments, thereby reducing the acquisition and management costs for the information infrastructure. This is increasingly important in an economy in which the ability to attract capable professionals to manage and operate these information infrastructures is becoming progressively more difficult.

In conjunction with and perhaps fuelling the growth of information are the advancements in storage technology. Both of these enable the storage of large amounts of information and present new challenges in terms of data protection, management, and performance. For instance, disk drive densities tend to double every 1.5 to 2 years--at the same cost! On the other hand, enterprise storage arrays have a fixed number of connections (SCSI, FC, or otherwise). Since the rise in capacity of storage arrays has outpaced the number of connectivity ports, we have arrived at a point where a storage network is necessary to exploit the potential capacity (and therefore value) of enterprise storage arrays.

Finally, the pace of change has accelerated to the point where any infrastructure put in place must either be flexible enough to address both current and anticipated needs, or it must be rebuilt time and again as the needs of the organization change. Applications, databases, and the computers that host them are being re-engineered and re-deployed.

Ultimately, the content itself is the crux around which changing infrastructure revolves. Information storage architectures are tied to those applications, databases, and computers; and therefore, must also be flexible enough to enable the re-engineering and re-deployment of an organization's assets.

Results
The factors described above result in two "situations" that ultimately relate to issues of distance and connectivity. These situations are:

  • Captive storage
  • Captive hosts

'Captive storage' describes the situation where all connections in a storage array are fully utilized, but the array itself has available capacity. This typically results from situations where each host has small information storage requirements, the disk drive density increases beyond the needs of the number of hosts that can connect to an array, or when the array has relatively few available connections.

For instance, consider consolidating NT application and file serving information on a storage array with 32 available SCSI connections and an available capacity of 9 TB of protected storage. If the typical NT server requires 50 GB of storage and you connect 32 servers, you would utilize 1.6 TB, or less than 20 per cent of the available capacity of the array.
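The utilization arithmetic in this example can be checked in a few lines. The figures come from the text above; treating the 9 TB as binary terabytes is a simplifying assumption:

```python
# Captive storage: all 32 SCSI ports consumed, but most capacity sits idle.
# Figures are from the worked example; adjust them for your own environment.
PORTS = 32                     # available SCSI connections on the array
GB_PER_SERVER = 50             # typical NT server storage requirement (GB)
ARRAY_CAPACITY_GB = 9 * 1024   # 9 TB of protected storage (binary TB assumed)

used_gb = PORTS * GB_PER_SERVER          # storage actually consumed
utilization = used_gb / ARRAY_CAPACITY_GB

print(f"Storage used: {used_gb / 1024:.1f} TB")
print(f"Utilization:  {utilization:.1%}")  # well under 20 per cent
```

Every SCSI port is in use, yet more than four fifths of the array's capacity remains stranded, which is exactly the captive-storage situation described above.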

This results in a higher than optimal cost per unit of storage (because of the fixed costs associated with the array and storage software). Further, that array is now unavailable to the next server, forcing it onto a second array. This makes it harder to pass information back and forth with the servers on the first array and increases the effort required to manage the environment.

A 'Captive host' describes the reverse of captive storage: the storage requirements of the hosts connected to a storage array grow to exceed the capacity of that array. This means that an additional array must be deployed and the hosts and information re-deployed and balanced across the two arrays. This involves substantial effort and downtime associated with planning, executing, and testing the migration of the hosts and storage from the original array to the new configuration. In addition, you end up underutilizing the storage assets if there is available capacity elsewhere in the infrastructure that is impractical to use due to distance or connectivity requirements between host and storage.

Traditional information storage infrastructures are intimately coupled with the application infrastructure (including processors, software, etc.) they support. What this means is that when you need to make a change in the application infrastructure, the information infrastructure is affected as well (and vice versa). For example, consolidating a number of applications onto a single host platform would involve migration of applications as well as configuring and implementing the new (consolidated) host platform. In addition to this application infrastructure change, the storage infrastructure must be revisited to ensure that this new single host has access to the combined information of all the original applications. Ultimately, this coupling makes it more difficult for organizations to be flexible and to rapidly deploy, change, or remove applications to meet their evolving needs.

What is required?
Enterprise storage networks present the ability for organizations to de-couple their application and information infrastructures and thereby generate both the flexibility and speed demanded of information technology today. Practically speaking, this means being able to move, add, and change application hosts and/or information storage systems without having to rebuild the infrastructure around the change. In order to achieve this goal, three design concepts should be considered.

These are:

  • Functional (enterprise) storage-the information infrastructure must be functional enough to be capable of performing information-related tasks (such as data replication) independent of the application/host.
  • Virtualization of storage-the information storage pool must appear (to the applications/hosts) as if it were contained in a single array, making information accessible from anywhere in the enterprise.
  • Virtualization of applications (hosts)-applications software must be host agnostic, that is, it must be capable of being deployed on one processor today and on a different processor tomorrow.
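The "virtualization of storage" concept can be sketched as a mapping layer that presents a single logical pool while deciding behind the scenes which physical array actually holds each volume. This is a hypothetical illustration of the idea, not the API of any real product; all names and the placement policy are invented for the example:

```python
# Illustrative sketch: hosts request logical volumes from one pool; the pool
# places each volume on whichever array has the most free capacity, so hosts
# never need to know (or care) which physical array holds their data.
class VirtualStoragePool:
    def __init__(self):
        self.arrays = {}        # array name -> free capacity (GB)
        self.volume_map = {}    # logical volume -> (array name, size in GB)

    def add_array(self, name, capacity_gb):
        self.arrays[name] = capacity_gb

    def create_volume(self, volume, size_gb):
        # Simple placement policy: pick the array with the most free space.
        array = max(self.arrays, key=self.arrays.get)
        if self.arrays[array] < size_gb:
            raise RuntimeError("storage pool exhausted")
        self.arrays[array] -= size_gb
        self.volume_map[volume] = (array, size_gb)
        return array

pool = VirtualStoragePool()
pool.add_array("array-1", 500)
pool.add_array("array-2", 300)
pool.create_volume("oracle-data", 400)  # placed on array-1 (most free space)
pool.create_volume("web-content", 250)  # placed on array-2 (now has more free)
```

The point of the sketch is the indirection: adding a third array or rebalancing volumes changes only the mapping layer, not the hosts, which is precisely the de-coupling the design concept calls for.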

While implementation of these design concepts together would create the most flexible infrastructure, it is often neither workable nor necessary. More practically, organizations use the applicable portions of these design concepts to strike a balance of their current and anticipated needs.

Storage Area Network or Network Attached Storage
Given the multitude of available storage networking technologies, probably the first consideration should be whether to implement a Storage Area Network or Network Attached Storage.

SANs provide the characteristics of a dedicated storage connection while, at the same time, allowing robust network topologies to be created to meet complex connectivity needs. Beyond the 100 MB/s bandwidth of a fibre channel connection, the primary benefits of a fibre channel network are the flexibility in distance and connectivity that it offers. Because of these advantages, organizations can realize the benefits of flexibility and speed in their information infrastructure. For these reasons, SANs are well suited to applications where dedicated storage is required. These include client/server applications, databases, and applications that require high-performance storage.

NAS enables sharing of one particular file with multiple hosts in a heterogeneous environment at the same time. In addition, NAS leverages the mature local area network technologies to provide networking. These technologies are both cheaper and more mature than their fibre channel counterparts. On the other hand, local area network performance is not as predictable as channel based architectures. For these reasons, NAS suits file serving (network directory) consolidation and file serving applications (e.g. CAD, SW development, web hosting, or web mail).

While both technologies offer advantages and disadvantages that help decide which technology best meets the specific needs of a given environment, it is often the case that enterprise storage network implementations require both SAN and NAS. For instance, a typical enterprise environment will have network shared directories and web serving (well suited for NAS); and at the same time, have client/server and database applications (well suited for SAN). Further, in many cases, the file server used for NAS can be connected on the SAN to its storage.

Design Considerations

SAN: Achieving Availability Service Levels
Various levels of availability can be achieved through both switch and director architectures. Here, we'll examine the spectrum of availability service levels and how they can be attained using the available technologies.

Connectivity extension can be achieved using any of the SAN technologies; however, if the goal is merely to leverage an existing infrastructure beyond the current distance or connectivity limitations, FC-AL hub topologies represent the most inexpensive way to accomplish this goal. A typical example is an existing infrastructure based on enterprise storage being extended to a small number of departmental hosts that may be in a separate server room some distance from the primary data center.

Performance can be layered on top of connectivity extension by deploying a FC-SW switch or director in lieu of a hub. The topology would be quite similar to the above connectivity extension topology. However, it is important to note that in both cases a single path to storage is used to achieve the objectives of connectivity and performance. Such an architecture is still subject to faults and is not classified as high availability.

High availability is achieved through architectures with switches or through component design with directors.

Directors are, by design, highly available devices. This is achieved through redundant internal components and the application of highly available (not modular) fibre channel ports. A high availability architecture using directors can be achieved through single connections between devices and the director. In this architecture, single points of failure in the connection path still exist in the host bus adapter and the director ports (even though they are of a high reliability design).

In order to achieve high availability using switches, dual paths through the fabric must be established in order to achieve the same level of redundancy of switching components as with the director architecture. While sometimes more costly and complex to implement than the director architecture, the switch architecture does eliminate all single points of failure and thereby provides improved path availability over the high availability director architecture.

Fault tolerant ("five nines", or 99.999 per cent, availability) architectures exceed high availability by providing multiple highly available paths for data. Again, this can be achieved through either a switch or a director architecture. The fault tolerant director architecture is achieved through dual paths through the director (i.e., dual, high-availability paths). Ideally, the dual paths would be implemented through separate directors. In order to deliver this level of availability through a switch architecture, a minimum of three paths would be required because each path does not provide high availability by itself. A triple or quadruple path switch architecture does provide fault tolerant levels of availability while trading off additional network complexity.
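The trade-off between path count and availability can be sketched with a back-of-envelope model. It assumes paths fail independently, and the per-path availability figure is purely illustrative, not a vendor number:

```python
# Rough model: if each independent host-to-storage path is up with
# probability `per_path`, the chance that at least one of `paths` is up is
# 1 - (probability that all of them are down simultaneously).
def combined_availability(per_path: float, paths: int) -> float:
    return 1 - (1 - per_path) ** paths

PER_PATH = 0.99  # assumed availability of a single switched path (illustrative)

for n in (1, 2, 3):
    a = combined_availability(PER_PATH, n)
    print(f"{n} path(s): {a:.6f}")
```

Under these assumptions, two such paths fall short of five nines while three exceed it, which mirrors the point above: when each path is not highly available on its own, a minimum of three paths is needed to reach fault tolerant service levels.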

SAN: Architecture for Growth
An important design consideration for a SAN implementation is how to accommodate growth and change in the environment. With this in mind, there are three basic SAN architecture philosophies that we will explore. These are Centralized, Departmental, and Mixed (Core/Edge) architectures.

Centralized SANs are characterized by consolidation of information into a single point of control while allowing the application/host resources to be distributed as necessary throughout the enterprise. One goal that should be kept in mind is to give any application access to any information, both in the original implementation and as the environment grows and changes. If the environment is small enough, this can be achieved, at least initially, through a single-tier architecture.

The specific number of host and storage ports will determine if a switch or director will be required (directors can support larger numbers of host and storage connections due to the sheer quantity of ports they carry). In order to account for future growth (addition of hosts and storage to the fabric), unused or open switch or director ports need to be provided.

For instance, consider an initial environment with 10 dually connected hosts (20 fabric ports required) and a single storage array with 5 connections (5 fabric ports). This would require 25 fabric ports and could be implemented using a single 25 or even 32 port director. However, if we want to allow for 10 additional hosts and an additional storage array to be connected, we would need 25 additional ports in the future. More importantly, the additional hosts must be able to access the original storage array, and the original hosts the new array.

In order to accomplish this, we would need an implementation using 2 directors, initially deployed with 12-16 ports each, but with a capacity for up to a total of 64, while still maintaining the "universal access" design goal.
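The port budgeting in the worked example above reduces to simple arithmetic. The helper below is illustrative only; the figures are those given in the text:

```python
# Fabric port budgeting: each host consumes one fabric port per path, and
# each storage array connection consumes one more.
def fabric_ports(hosts: int, paths_per_host: int, storage_ports: int) -> int:
    return hosts * paths_per_host + storage_ports

# Initial environment: 10 dually connected hosts + one array with 5 ports.
initial = fabric_ports(hosts=10, paths_per_host=2, storage_ports=5)

# Planned growth: 10 more dual-path hosts and a second 5-port array.
future = fabric_ports(hosts=20, paths_per_host=2, storage_ports=10)

print(f"Initial fabric ports needed: {initial}")
print(f"After planned growth:        {future}")
# The grown environment exceeds a single 32-port director, which is why the
# text arrives at two directors with headroom to expand.
```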

A single-tier architecture will not support larger environments because of the number of host and storage ports required. In order to scale to larger environments while still providing universal access and meeting availability requirements, it is necessary to deploy a 2-tier architecture: the first tier provides the connections to the application/host infrastructure, and the second tier provides the connections to the information/storage infrastructure. Extensive use of inter-switch links (ISLs) between the first and second tiers enables universal access to information for all hosts. The number of ISLs must be carefully determined to allow both for multi-pathing through the fabric (for high availability or fault tolerance) and for the necessary bandwidth (to avoid network congestion). As with the single-tier architecture, unused ports must be designed into each switch or director; however, fewer free ports are required per device because any given device need only connect into the fabric in order to have access to all devices already connected.

Departmental SAN implementations apply when the organization has multiple independent departments with relatively independent information needs. In these situations, the application/host infrastructure for each department for the most part leverages its own information infrastructure. On the other hand, the need for common infrastructure management, the occasional need to share information across the organization, and the need of the organization to re-organize and re-focus without having to re-engineer its infrastructure all necessitate that this be tied in to a single enterprise storage network. These types of environments typically will evolve from separate departmental SANs.

In order to link these "SAN islands", it is most practical to create a core fabric that each island can plug into.

This keeps management simple and provides for future growth, because only the core needs to be designed for extension while each department can be designed independently. In such a design, switches can be used to meet departmental needs, while a director serves the core to minimize design complexity.

Mixed (Core/Edge) architectures are a hybrid of centralized and departmental. They make the most sense when there is a core central infrastructure that can be leveraged by distributed relatively independent computing environments. In these situations the core information infrastructure is architected for high scalability and availability (typically using a director) and services the primary data center information and application resources.

The distributed departments are typically implemented using switch architectures that are connected back to the core infrastructure in order to leverage the centralized resources (e.g. centralized tape backup, data warehouse databases, ERP systems, etc.)

NAS: Designing in Scalability
Network attached storage implementations in an organization tend to begin as general purpose file servers providing NAS to individual departments. As the organizational demand and the mission criticality of NAS increases, more and more of these departmental file servers are deployed. Eventually, the NAS infrastructure reaches a point where it becomes necessary to consolidate in order to achieve the availability requirements and curb the growth of management effort required to maintain the environment.

When implementing the consolidated NAS infrastructure there are several factors that should be considered in the design to ensure it scales well:

  • Management of the infrastructure.
  • Enterprise connectivity.
  • Business continuity.
  • Performance and load balancing.

Management is directly related to the level of consolidation (number of systems) and the ability to provide enterprise management of both the storage and the connectivity portions of the infrastructure. In order to minimize the effort/resources required to manage the infrastructure, it is important to meet the needs of the infrastructure with the fewest number of file servers.

At the same time, each server must be capable of addressing all enterprise storage arrays so that there can be a single management framework for the entire infrastructure. This is typically achieved by leveraging a cluster of high-powered special purpose file servers in front of high-capacity enterprise storage systems. Management efficiency can be achieved through optimal clustering (where one standby file server services multiple live file servers) and leveraging the management capabilities of enterprise storage.

Enterprise connectivity means being able to access any file in any file system from anywhere in the enterprise. To best achieve this, the IP network and file server connections must be optimized to allow for initial infrastructure access needs as well as future growth. As with SAN infrastructures, it is important to provide unused network connections on each file server so that future network expansions can be reached with a minimum of hops through the network.

Likewise, file servers must expand their storage connections to allow for growth of their file systems and connectivity to multiple storage arrays. File servers that support high levels of enterprise connectivity are characterized by multiple network and storage (SCSI or fibre channel) connections. In addition, enterprise connectivity means being able to provide access to all flavors of hosts. Today, simultaneous access to Unix and NT systems has become a standard.

Business continuity means being able to minimize both planned and unplanned downtime (inaccessibility). This becomes more and more critical as NAS is both centralized and consolidated where a single outage could impact a large part of the organization.

Unplanned downtime is typically addressed by eliminating single points of failure in the network, storage, and file server. Enterprise storage systems (unlike JBOD systems) have built-in redundancy; at the disk level, this is typically through mirroring or RAID implementations. While any of these schemes eliminates single points of failure, consideration should be given to RAID group size, as it directly impacts rebuild times while recovering from a failure--thus extending the exposure to a second failure leading to data loss.
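The rebuild-window exposure mentioned above can be estimated with a rough model. It assumes drives fail independently with exponentially distributed lifetimes; the MTBF and rebuild times below are illustrative assumptions, not measured values:

```python
import math

# During a rebuild lasting `rebuild_hours`, estimate the probability that a
# second drive in the same RAID group fails before the rebuild completes.
def second_failure_probability(group_size: int,
                               mtbf_hours: float,
                               rebuild_hours: float) -> float:
    surviving = group_size - 1  # drives still exposed after the first failure
    # P(at least one surviving drive fails within the rebuild window),
    # assuming independent exponential lifetimes with the given MTBF.
    return 1 - math.exp(-surviving * rebuild_hours / mtbf_hours)

MTBF = 500_000  # assumed drive MTBF in hours (illustrative)

for drives, rebuild in ((5, 8), (15, 24)):
    p = second_failure_probability(drives, MTBF, rebuild)
    print(f"{drives}-drive group, {rebuild} h rebuild: {p:.5%}")
```

The absolute numbers matter less than the trend: larger RAID groups and longer rebuild times both widen the window in which a second failure can cause data loss, which is why group size deserves attention in the design.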

File servers typically utilize a combination of cluster capability (for high availability) and remote mirroring (ideally synchronous, for fault tolerance). Planned downtime, on the other hand, is primarily due to data backup and maintenance. To keep backup and maintenance from impacting operations, separate backup systems (file servers and disk) must be incorporated to perform that function. Typical implementations have a dedicated backup file server that mounts copies of the "live" file systems in order to perform zero-impact backups during operations.

Performance and load balancing capabilities of the NAS infrastructure will directly affect its ability to provide flexibility and growth. The more hosts/clients that a single file server can service, the greater the level of consolidation, connectivity, and management efficiency that can be attained.

Further, as file systems are created, changed, and expanded, it becomes necessary to balance the load across file servers in the infrastructure so that a single file server is not a choke-point while others remain idle. For these reasons, it is generally less expensive to implement larger, faster file server systems that can share access to all the NAS information.

Conclusion
Enterprise storage networks are a marriage of enterprise storage, storage area networks, and network attached storage. While there is much debate over whether SAN or NAS will emerge as the ultimate storage networking technology, it is more practical to consider where each technology can best be applied to meet the information needs of the organization; the two are often complementary. Above all, it is important to understand the organizational forces that are driving the need to implement an enterprise storage network. The information infrastructure should then be designed leveraging the available storage, SAN, and NAS technologies to best meet the current and future needs of that organization.

T. Srinivasan, country manager, EMC India, can be reached at vasan_srini@emc.com


Copyright 2001: Indian Express Group (Mumbai, India). All rights reserved throughout the world. This entire site is compiled in Mumbai by The Business Publications Division of the Indian Express Group of Newspapers. Site managed by BPD