Data protection
Data protection in heterogeneous IT environments
Here's a look at ways in which you can deploy data protection measures in a
diverse IT environment.
by P K Gupta
Enterprises use various topologies and architectures to protect data efficiently
across different IT set-ups. IT infrastructures can range from single-server
set-ups to those with hundreds of servers. The components of a data protection
solution for an organisation have to be chosen depending upon the size of the
data set residing on the network, the nature of its business and the type of
policy it has adopted to protect information.
The role of software
Backup software supports all the topologies and architectures in use in the
industry. By choosing a combination of the modules a vendor offers, an
organisation can protect its data as per corporate policies and requirements.
In a typical client-server architecture, these applications protect servers
(data) by initiating the data flow to a backup device, as per policies, and
sending metadata (indexing information) to the backup server.
Software solutions can back up data from various operating systems onto several
tape drives simultaneously (tape multiplexing), pushing up the performance numbers
for the hardware being used. Controlling the jukebox picker and loader, putting
the right media in the right drive and keeping track of media within and outside
the tape library are some of the key functions of backup and recovery software.
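The media-tracking and indexing role described above can be pictured with a small sketch. It is only an illustration under stated assumptions: the class names, the location strings and the sample client and path are all hypothetical, and a real backup product keeps a far richer catalogue.

```python
from dataclasses import dataclass, field


@dataclass
class Media:
    label: str      # hypothetical volume label or barcode
    location: str   # hypothetical, e.g. "library:slot-3" or "offsite:vault"


@dataclass
class Catalog:
    """Toy index held on the backup server (illustrative only)."""
    media: dict = field(default_factory=dict)   # label -> Media
    index: dict = field(default_factory=dict)   # (client, path) -> [labels]

    def add_media(self, label, location):
        self.media[label] = Media(label, location)

    def record_backup(self, client, path, label):
        # The data itself goes to the backup device; only this
        # indexing information reaches the backup server.
        self.index.setdefault((client, path), []).append(label)

    def move_media(self, label, new_location):
        # Track a tape as it leaves or re-enters the library.
        self.media[label].location = new_location

    def locate(self, client, path):
        """Return (label, location) of the most recent copy, or None."""
        labels = self.index.get((client, path))
        if not labels:
            return None
        latest = self.media[labels[-1]]
        return latest.label, latest.location


catalog = Catalog()
catalog.add_media("TAPE001", "library:slot-3")
catalog.record_backup("hr-server", "/data/payroll.db", "TAPE001")
catalog.move_media("TAPE001", "offsite:vault")
print(catalog.locate("hr-server", "/data/payroll.db"))
```

Because the index lives apart from the data, a restore request can be answered with the tape to fetch and where it currently sits, even after the tape has left the library.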
With the benefit of experience, the industry has developed standards that let
data backed up from one operating system (OS) be restored to another OS,
eliminating the dependence upon a particular machine and its attached devices.
Selecting hardware
Hardware tends to be at the forefront of any evaluation for a disaster recovery
(DR) solution as it is more tangible. The hardware and the technology to be
deployed are selected based on the size of the data set and the incremental
additions expected to it.
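As a rough illustration of sizing against the data set and its incremental additions, the sketch below projects growth and counts the cartridges one full backup would need. The 5 per cent monthly growth rate, the 500 GB starting size and the 100 GB cartridge capacity are assumed figures, not recommendations, and the function names are hypothetical.

```python
import math


def projected_size_gb(current_gb, monthly_growth_rate, months):
    """Compound growth of the data set over the planning horizon."""
    return current_gb * (1 + monthly_growth_rate) ** months


def cartridges_needed(size_gb, cartridge_capacity_gb):
    """Whole cartridges required for one full backup of size_gb."""
    return math.ceil(size_gb / cartridge_capacity_gb)


size_now = 500.0                                    # assumed current size in GB
size_later = projected_size_gb(size_now, 0.05, 12)  # assumed 5% growth per month

print(round(size_later, 1))
print(cartridges_needed(size_now, 100), cartridges_needed(size_later, 100))
```

Running the same numbers for each candidate device shows how soon the data set would outgrow it.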
Recovery objectives drive this selection, as some technologies are quick at
backups but suffer from slow restore performance. On the other hand, some are
designed for quicker restores but suffer from low data capacities on the media.
It's a trade-off, and management has to take a call on the technology to be
adopted by the organisation.
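The trade-off can be made concrete with simple arithmetic. In this sketch the two technologies, their throughput figures, the 1,800 GB data set and the eight-hour recovery objective are all invented for illustration. Real devices rarely sustain their rated speed, so treat the results as best-case estimates.

```python
def transfer_hours(data_gb, throughput_mb_per_s):
    """Hours to move data_gb at a sustained rate (1 GB = 1000 MB)."""
    seconds = data_gb * 1000 / throughput_mb_per_s
    return seconds / 3600


def meets_objective(data_gb, restore_mb_per_s, rto_hours):
    """True if a full restore fits inside the recovery time objective."""
    return transfer_hours(data_gb, restore_mb_per_s) <= rto_hours


# Hypothetical technologies: (backup MB/s, restore MB/s)
candidates = {"tech_a": (120, 40), "tech_b": (60, 90)}

for name, (backup_rate, restore_rate) in candidates.items():
    print(name,
          round(transfer_hours(1800, backup_rate), 2),   # backup hours
          round(transfer_hours(1800, restore_rate), 2),  # restore hours
          meets_objective(1800, restore_rate, rto_hours=8))
```

With these assumed figures, the technology that backs up faster misses the recovery objective while the slower one meets it, which is why the restore figure should lead the evaluation.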
An important consideration is whether the backup hardware is compatible with
the latest connectivity options. For instance, can it be connected to a SAN,
which matters for both scalability and investment protection?
Another important aspect that organisations tend to forget while designing their
data protection policies is the recovery objective for each data set on the
network. All policies should be driven by the recovery objectives of the data
being protected.
Input/Output operations
Backup and recovery operations are I/O-intensive. During these processes, system
performance takes a hit as system resources are utilised to fulfil I/O requests.
Over the past five years, technology has greatly reduced this performance
degradation, and there are now ways to protect data sets in the network with
no impact (or a negligible impact) on systems.
Types of backup
Here we look at the means of data protection for the various network topologies
in an enterprise.
LAN-based backup
The most common topology is the good old LAN-based backup. In this system, networked
servers back up their data onto a designated server that has a tape drive connected
to it. Either the backup server initiates policy-based backups or clients issue
ad hoc backups to the backup device. During backup, there might be a dip in
LAN and server performance.
The technology is elementary and simple to implement. LAN-based backups are
effective in smaller set-ups and work well in environments that are not 24x7
and can afford network performance degradation for a few hours during the
day. To improve performance, dedicated LANs are put into use, enabling faster
backups and avoiding the use of the production or functional unit's LAN bandwidth.
SAN-based backup
With a Fibre Channel (FC) SAN, CPUs get more data to process. FC SANs not only
facilitate faster data transfer from disks to CPUs but also permit device sharing
that, in turn, leads to LAN and server-free backups.
LAN-free backup
Network performance suffered as a large number of servers came on to the network.
The need was to avoid using the LAN to move data to the backup device.
A backup server can initiate a backup for different clients on the SAN. Each
client, after receiving the backup request, writes directly to the tape library.
Connectivity to the tape library is generally through the FC HBA (Host Bus
Adapter) located on the server, which gives direct connectivity to the SAN
switch. The tape library is also attached to the SAN switch. Note that in
LAN-free backups, server performance may deteriorate, as each server still
handles its own backup I/O.
Server-free backup
An extension of the LAN-free backup, the server-free backup lets servers maintain
their computing performance for applications. Once device sharing had been
achieved, freeing up the CPU cycles of production servers was not far away. With
the advent of new storage technologies, making copies of data and mounting them
on different servers was no longer difficult. These multiple copies could then
be used for operations such as backup and data warehousing.
One way of doing server-free backups is to make split-mirror copies of the storage
box (primary storage hardware) and leverage the technology provided by the primary
storage vendor. This method requires the integration of backup software with
the disk vendor's hardware APIs to ensure consistent backups.
Data can be restored to the server connected to the backup device and then resynchronised
to the production server via the storage box interface.
NAS-based backup
With the advent of Network Data Management Protocol (NDMP), NAS devices can pump
data directly to the tape devices connected to the SCSI or the Fibre Channel
port of the NAS box. The backup server might be any machine on the network. It
issues a backup command to the NAS box, and all the data, whichever operating
system it belongs to, is backed up to the tape library.
NAS boxes that are not NDMP-compliant can still be backed
up to a tape library attached directly to them. Normally, these NAS solutions
use Windows-based or Linux-based operating systems to power them and they can
be used as backup servers or storage nodes (machines to which backup devices
are attached) to back up data in a LAN-free mode.
NDMP backups are, by virtue of their architecture, server-free backups as well.
No production server's CPU cycles are utilised when backup happens in an NDMP
or NAS-based set-up. Only the NAS box does the I/O operation to the tape devices,
freeing the CPUs of the production servers from this operation.
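The topologies covered so far differ mainly in which production resources carry the backup load. The sketch below encodes that comparison as data, using only the characteristics stated in this article. The class, field and function names are hypothetical, and the dedicated backup LAN variant is left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topology:
    name: str
    data_over_lan: bool         # does backup data cross the production LAN?
    uses_production_cpu: bool   # do production servers drive the backup I/O?


TOPOLOGIES = [
    Topology("LAN-based", True, True),
    Topology("LAN-free (SAN)", False, True),
    Topology("Server-free (split mirror)", False, False),
    Topology("NAS with NDMP", False, False),
]


def candidates(lan_can_degrade, servers_can_degrade):
    """Names of topologies that respect the stated tolerances."""
    return [
        t.name
        for t in TOPOLOGIES
        if (lan_can_degrade or not t.data_over_lan)
        and (servers_can_degrade or not t.uses_production_cpu)
    ]


# A 24x7 shop that can tolerate neither LAN nor server slowdown:
print(candidates(lan_can_degrade=False, servers_can_degrade=False))
```

A set-up that can afford a few hours of degradation would pass `True` for both arguments and keep every option open, including the simple LAN-based approach.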
Over the years
Backup and restore technologies have changed drastically over the past few years.
Tape is a slow medium to recover data from. Therefore, some vendors have started
offering backup-to-disk solutions as an alternative. The technology has become
economical, with the cost of disks falling sharply over the last two years. The
fast speeds of hard disks might be a big motivator for enterprises to go in for
backup-to-disk, but as of today, removable media, along with their low cost,
create a very high barrier preventing most enterprises from looking at disks
for backup and recovery.
The author is the Director - Strategic Development APAC,
Legato Software, a division of EMC.