Issue of January 2005 



Data Replication

Introducing the Pull Factor For Advanced Data Replication

Data replication processes often use resources at your primary data site, which slows transaction processing. Lim Beng Lay suggests a strategy to prevent that and optimize the output of the data replication architecture.

In today's dynamic business environment, it is quite apparent that business continuity requirements are changing by the day. While trying to address business continuity needs today, enterprises must respond to new business drivers such as round-the-clock operations, higher service-level expectations, closer regulatory scrutiny for out-of-region data protection requirements, and increased sensitivity to loss of data and information assets. Hence the challenge is to reduce risk and increase business resilience, as well as reduce costs and increase efficiency.

Degree of resilience

While the need for business continuity is universally applicable for all organizations, the degree of resilience in business continuity depends upon several aspects. For instance, in many industries and geographies, government regulations require companies to have effective business continuity plans that enable them to protect information assets and maintain their service capabilities in spite of local or regional disasters. The most commonly regulated industries likely to adopt out-of-region strategies worldwide include telecom, transportation, banking and other financial services, government, utilities, healthcare, and e-commerce.

Other factors

Other factors that are crucial when charting out business continuity plans include data replication with guaranteed integrity and consistency, the scope of the data that needs replication, and the Recovery Time Objective (RTO) and Recovery Point Objective (RPO) the organization must meet.

While trying to ensure resilience through business continuity, organizations often have to meet these requirements under budget and personnel constraints. To cope with this, they take measures to reduce costs and improve efficiency.

Storage consolidation

One of the most common of these measures is storage consolidation. While this approach may work, it has its own associated risks; it's like putting all your eggs in one basket. A consolidated platform requires a greater degree of data protection and disaster resilience.

Most data replication and business continuity solutions can use remote replication capabilities for data protection, but these replication solutions may themselves consume resources that could affect application performance. They also introduce significant management complexity.

That complexity grows in heterogeneous storage environments where organizations are trying to support both local and remote replication processes.

Remote replication

Remote replication processes have long been considered the most acceptable means of protecting organizational data. This method has its own strengths and weaknesses. The introduction of disk-based replication systems has boosted remote replication because disk-based remote replication has significantly improved RTO and RPO.

Replication strategies

There are two types of replication strategies adopted by organizations: synchronous replication for local disasters and asynchronous replication for regional disasters.

Synchronous replication is used with an in-region hot site that provides business continuity and data protection. However, it is limited to relatively short distances (typically less than 50 miles), so it is suitable only for in-region recovery sites. This approach does not account for regional disasters that may affect both the production site and the in-region recovery site.

Asynchronous replication, on the other hand, is typically used by organizations that maintain current data copies at out-of-region recovery sites. Evolving needs and proliferating data sets that require replication have pushed the limits of asynchronous replication solutions. Furthermore, regulations worldwide endorse out-of-region replication for critical industries such as banking and securities trading.
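
One reason distance limits synchronous replication is that every write has to wait for a round trip to the recovery site. The short calculation below is only a rough sketch: it assumes light travels through optical fiber at about 200,000 km per second and counts propagation delay alone, ignoring switches, protocol overhead and the storage systems themselves. The function name is illustrative.

```python
# Rough estimate of the latency that distance adds to each synchronous write.
# Assumption: light travels through optical fiber at about 200,000 km/s,
# and each write needs one full round trip before it is acknowledged.

FIBER_SPEED_KM_PER_S = 200_000.0   # approximate speed of light in fiber
KM_PER_MILE = 1.609344


def round_trip_ms(distance_miles: float) -> float:
    """Return the propagation-only round-trip time in milliseconds."""
    distance_km = distance_miles * KM_PER_MILE
    one_way_s = distance_km / FIBER_SPEED_KM_PER_S
    return 2 * one_way_s * 1000.0


for miles in (50, 500, 2000):
    print(f"{miles:>5} miles: {round_trip_ms(miles):.2f} ms added per write")
```

At 50 miles the added delay is under a millisecond per write; at out-of-region distances it grows to several milliseconds or more, which is why asynchronous methods are preferred there.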

Remote issues

Many issues need to be addressed as far as remote replication is concerned. One of the biggest issues that arise from the use of remote copy solutions is their tremendous consumption of resources. In storage-based solutions, replication uses part of the storage system cache to capture changes and transmit them to the other side. It also uses processing cycles on the storage systems--primarily at the originating (production) data center.

These resources are, in effect, taken away from production applications. The result is lower application performance, or increased cost of added resources to maintain required performance and throughput. Growing data further aggravates this problem. In such a situation, the obvious solution would be to return IT resources to where they belong--the applications.

Remote replication processes can cause significant bandwidth problems that lead to momentary link failures. While both synchronous and asynchronous remote replication processes can co-exist within an organisation, existing solutions require storage for multiple copies of the data, as well as complex management and scripting.

In such a situation, a replication strategy that uses disk-based journaling and a pull-based replication engine to reduce resource consumption and costs, while increasing performance and operational resilience, may turn out to be the best bet. A replication solution providing these features can make data protection and business continuity more efficient and cost-effective than traditional replication methods.

With this kind of strategy, the replication solution essentially writes the designated records to a set of journal volumes at the time of data collection itself. By writing the records to journal disks instead of keeping them in the cache, the replication solution overcomes the limitations of earlier asynchronous replication methods.

Further in the process, writes to the journal can be cached for better application performance. They can then quickly be de-staged to disk to minimize cache usage. In order to achieve this, the journal disks have to be specially designed and optimized for maximum performance.


The journals can contain metadata for each record to ensure the integrity and consistency of the replication process. Each transmitted record set should include a time stamp and sequence number information, enabling the replication engine to verify that all the records are received at the remote site, and to arrange them in the correct write order for storage.
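
The role of the sequence number and time stamp can be illustrated with a small sketch. It is not any vendor's journal format; the record layout and the function names are assumptions made for illustration only. It shows how a receiving site could detect that a record is missing and put the rest back into write order.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class JournalRecord:
    sequence: int      # position in the primary site's write order
    timestamp: float   # time the write was journaled (seconds)
    payload: bytes     # the data that was written


def missing_sequences(records: List[JournalRecord], first: int, last: int) -> List[int]:
    """Return the sequence numbers in [first, last] that never arrived."""
    received = {r.sequence for r in records}
    return [n for n in range(first, last + 1) if n not in received]


def in_write_order(records: List[JournalRecord]) -> List[JournalRecord]:
    """Arrange records in the order the primary site wrote them."""
    return sorted(records, key=lambda r: r.sequence)


# Records can arrive out of order, and one (sequence 3) is lost in transit.
arrived = [
    JournalRecord(4, 100.4, b"D"),
    JournalRecord(1, 100.1, b"A"),
    JournalRecord(2, 100.2, b"B"),
]
print(missing_sequences(arrived, first=1, last=4))   # sequence numbers to request again
print([r.payload for r in in_write_order(arrived)])
```

In this sketch the remote site would hold back record 4 until record 3 has been fetched again, so that data is never stored out of write order.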

By using local disk-based journaling and a pull-based remote replication engine, the solution releases critical resources that are consumed by other asynchronous replication approaches at the primary site, such as disk array cache in storage-based solutions, or server memory in host-based software approaches. This kind of a solution improves cache utilization, lowers costs and improves performance of production transaction applications.

It also maximizes the use of bandwidth by handling the variations of the replication network resources, enabling enterprises to manage bandwidth cost and RPO more flexibly and intelligently.

The pull-based replication engine also contributes to resource optimization. It controls the replication process from the secondary system and frees up valuable production resources on the primary system.
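
A minimal sketch of the pull idea follows. The class and method names are hypothetical and the "sites" are plain in-memory objects, so this only illustrates the division of work: the primary does nothing but append to its journal and answer read requests, while the secondary decides when to fetch and how much.

```python
from collections import deque
from typing import Deque, List, Tuple

Record = Tuple[int, bytes]  # (sequence number, data)


class PrimarySite:
    """Hypothetical primary: it only appends writes to its journal."""

    def __init__(self) -> None:
        self.journal: Deque[Record] = deque()
        self._next_sequence = 1

    def write(self, data: bytes) -> None:
        self.journal.append((self._next_sequence, data))
        self._next_sequence += 1

    def read_journal(self, after_sequence: int, limit: int) -> List[Record]:
        """Serve a read request; the primary never initiates a transfer."""
        pending = [r for r in self.journal if r[0] > after_sequence]
        return pending[:limit]

    def release(self, up_to_sequence: int) -> None:
        """Free journal space once the secondary confirms it has applied records."""
        while self.journal and self.journal[0][0] <= up_to_sequence:
            self.journal.popleft()


class SecondarySite:
    """Hypothetical secondary: it drives replication by pulling from the primary."""

    def __init__(self, primary: PrimarySite) -> None:
        self.primary = primary
        self.applied: List[Record] = []
        self.last_sequence = 0

    def pull(self, limit: int) -> int:
        batch = self.primary.read_journal(self.last_sequence, limit)
        for record in batch:
            self.applied.append(record)
            self.last_sequence = record[0]
        self.primary.release(self.last_sequence)
        return len(batch)


primary = PrimarySite()
secondary = SecondarySite(primary)
for block in (b"A", b"B", b"C"):
    primary.write(block)

secondary.pull(limit=2)
print(secondary.last_sequence, len(primary.journal))
```

Because the scheduling and bookkeeping sit in `SecondarySite.pull`, the production side spends its cycles on application writes rather than on pushing data out.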

Such a solution can also increase resilience: because changes are logged to the journal disk at the primary site before the secondary site is updated, the most current data is retained even during a network or bandwidth outage.

Further, if the replication solution pulls data according to the available bandwidth, buffering changes in journal volumes at the primary site when bandwidth is inadequate for transfer, it can deliver vastly improved RTOs and RPOs along with lower costs and increased resilience.

Moreover, if the solution can extend this approach to three-data-center configurations, organisations can benefit significantly from a more efficient and cost-effective solution for their data protection needs.
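
The effect of buffering in the journal can be seen in a simple simulation. The figures and the function name below are invented for illustration; the sketch assumes fixed time intervals and a secondary that pulls as much as the link allows in each one.

```python
from typing import List


def journal_backlog(writes_mb: List[float], link_mb: List[float]) -> List[float]:
    """Return the journal backlog (MB) left at the primary after each interval.

    writes_mb[i] is the data written at the primary during interval i;
    link_mb[i] is the most the secondary can pull during that interval.
    """
    backlog = 0.0
    history = []
    for written, capacity in zip(writes_mb, link_mb):
        backlog += written                 # new writes land in the journal
        backlog -= min(backlog, capacity)  # the secondary pulls what the link allows
        history.append(backlog)
    return history


# Steady 10 MB of writes per interval; the link drops out for two intervals.
writes = [10, 10, 10, 10, 10, 10]
link = [15, 0, 0, 15, 15, 15]
print(journal_backlog(writes, link))
```

The backlog rises while the link is down and drains once bandwidth returns. The backlog at any moment is the data that exists only at the primary, so it gives a rough picture of how far the secondary copy lags behind.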

The author is the Product Manager, Asia-South of Hitachi Data Systems


Copyright 2003: Indian Express Group (Mumbai, India). All rights reserved throughout the world. This entire site is compiled in Mumbai by The Business Publications Division of the Indian Express Group of Newspapers. Site managed by BPD