Issue of April 2003 
Case Study: NSE’s disaster recovery initiatives
NSE - Not down in disaster

Even during a disaster, National Stock Exchange's critical trade and settlement operations continue normally, thanks to a well-devised business continuity management plan and an elaborate Disaster Recovery (DR) site at Chennai. Here's a look at the exchange's DR operations.

by Soutiman Das Gupta

The National Stock Exchange (NSE) conducts a very high number of trade and settlement transactions every day. It has a feature-rich nationwide network, which can't afford to be down for a moment, for fear of losing revenue and reputation. To ensure uninterrupted service, the exchange implemented measures to support business recovery, which include a DR site in another city.

"The exchange's daily critical operations like trade and settlement will carry on even if there's a disaster," promised C. Kajwadkar, Vice-President, NSE.IT Limited. NSE.IT Limited is an organization which implements all IT-related projects at NSE. "We have implemented a Business Continuity Management (BCM) plan and have deployed elaborate DR solutions as a part of the BCM exercise. So if disaster strikes, we're still ticking."

Participation of Foreign Institutional Investors (FIIs) in the Indian stock market was important to promote economic reforms in the country. This in turn demanded that Indian exchanges follow the practices and guidelines of international exchanges. To maintain international standards and stay committed to continuity, NSE has set up and manages its own DR site in Chennai. All the critical applications and hardware are replicated at the site, which works as a failover in case of an untoward event at the primary site.

"This has made NSE the only exchange in India with a live DR site," claimed Kajwadkar.

NSE also conducted a number of live drills where daily operations at the exchange were conducted entirely out of the DR site. In each drill the performance levels were very satisfactory and the changeover was transparent to the nationwide users.

We look at the plans, solutions, and strategies NSE used to build its DR infrastructure.

DR is the subset
Kajwadkar feels that BCM is a holistic study of which DR is only a subset. He also feels that NSE's focus has been on BCM practices and the DR site implementation was a part of the overall BCM strategy. "BCM is about the ownership of continuity which resides on technology and business heads. And DR takes care of IT systems. The goal is primarily BCM."

BCM through the years
NSE had deployed a basic BCM program in 1997 in its primary site in Bandra-Kurla Complex, Mumbai. It comprised redundant systems with adequate backup and failover. The BCM infrastructure was further developed over the next few years and in 1998, the exchange set up its first DR site in Pune, in line with its BCM policy. The DR site was subsequently migrated to Chennai in 2002.

Technology and loss of business
NSE uses 3,000 VSAT links, owns two VSAT hubs, and is linked to around 1,000 leased lines. Each VSAT link connects to multiple traders in 360 cities nationwide. The systems support around 7,000 concurrent users daily.

To understand the loss of business NSE may suffer in case of a disaster, let's look at the areas that NSE has to manage and its daily turnover:

  • The Wholesale Debt Market (WDM) - Rs 2,900 Crore
  • The Capital Market (CM)/secondary market - Rs 2,500 Crore
  • The derivatives market - Rs 2,600 Crore
  • The entire retail debt market

The exchange runs mission critical applications like trading, clearing and settlement, surveillance, position monitoring, and risk management. In terms of spread of business, NSE's operations in 360 cities connect to brokerage houses, banks, and two depositories.

Analyzing impact
In case of a disaster the tangible and intangible losses will be tremendous. The daily turnover-based revenue in WDM, CM, and derivatives will be affected. Average brokerage of 0.5 percent on daily trading will be lost. Earnings of the exchange's business partners like clearing corporations, depositories, and clearing banks will be affected. And legal liabilities will also arise.
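
To give a rough sense of scale, the turnover figures and the 0.5 percent brokerage quoted above can be combined in a short calculation. This is only an illustrative sketch: it assumes the brokerage rate applies uniformly to all three segments, which the article does not state, and it leaves out the retail debt market because no figure is given for it. The function names are made up for this example.

```python
# Daily turnover figures quoted in the article, in Rs crore.
# The retail debt market is omitted because the article gives no figure for it.
DAILY_TURNOVER_CRORE = {
    "Wholesale Debt Market": 2900,
    "Capital Market": 2500,
    "Derivatives": 2600,
}

# Average brokerage quoted in the article: 0.5 percent of daily trading.
BROKERAGE_RATE = 0.005


def total_turnover_crore(turnover):
    """Sum the daily turnover across all listed segments."""
    return sum(turnover.values())


def brokerage_at_risk_crore(turnover, rate=BROKERAGE_RATE):
    """Estimate brokerage lost per day of outage.

    Assumption (not from the article): the same rate applies to every segment.
    """
    return total_turnover_crore(turnover) * rate


if __name__ == "__main__":
    total = total_turnover_crore(DAILY_TURNOVER_CRORE)
    lost = brokerage_at_risk_crore(DAILY_TURNOVER_CRORE)
    print(f"Combined daily turnover: Rs {total} crore")
    print(f"Brokerage at risk per lost trading day: about Rs {lost:.0f} crore")
```

Under these assumptions the three segments add up to Rs 8,000 crore a day, and a single lost trading day would put roughly Rs 40 crore of brokerage at risk, before counting the intangible losses.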

There'll be a large loss to the trading members due to loss of trading opportunity. It will be a loss of image for NSE, the Indian securities industry, and the nation at large. There will be loss in customer base and goodwill.

And the lack of a BCM plan and DR infrastructure will result in unpredictable recovery time and chaotic recovery of operations.

Four-step methodology
NSE achieved its BCM and DR goals in the form of a four-step plan:

Business Impact Analysis: This studied the impact on the exchange if business operations were to stop. This was necessary to justify any investment in BC and DR infrastructure. Collective wisdom from NSE's business and operation heads was sought. The critical applications pertinent to the business were defined and covered in the DR infrastructure.

"A DR solution is a perpetual effort. So all you do at the primary site must be replicated at the DR site. If you spend x at the primary site, you must spend a similar amount for DR," explained Kajwadkar.

Strategy Selection: BC strategy requirements were identified. Business, technology, and non-technology recovery issues were looked into and aspects like timeframes, options, locations, personnel, and communications were provisioned.

Alternative recovery strategies were identified and the risk associated with each was assessed. A cost-benefit analysis of the recovery strategies was carried out and the findings were presented to the senior management. Alternate storage sites were identified. Provisions were made for emergency telecommunications and data communication.

Detailed Plan development and plan maintenance: This defined aspects like:

  • Plan development requirements (job descriptions, action plans, checklists, matrices and flowcharts).
  • Recovery management and control requirements like team description and team organization.
  • Plan components, drafts, and BC procedures.
  • IT recovery procedures.

Testing, revisions and modifications: This included:

  • Establishing an exercise program, defining exercise requirements, developing realistic scenarios, and creating schedules.
  • Post exercise reporting.
  • Establishing review criteria.
  • Setting audit objectives and scope.
  • Reviewing policies periodically or after events.

The first DR site
In 1997, NSE leased premises in Pune and began to build a DR site. It went live in 1998. Pune was chosen mainly because it was geographically near Mumbai, making it easy to move staff between the primary site and the DR site. The exchange maintained skeleton staff at this facility.

Live drills were performed from Pune, where critical applications were entirely run from the DR infrastructure. Essential staff was shifted from Mumbai prior to the drill.

The new DR site
The migration of the DR site to Chennai began in 2001, and the new site became operational in mid-2002. The move took time mostly because NSE had to set up its own VSAT hub at the site to complement its VSAT hub at the primary site. The exchange also had to wait for statutory clearance from government agencies.

The reasons for choosing Chennai as the new DR site venue are:

  • Mumbai and Pune are in the same seismic zone.
  • Chennai is in a relatively less sensitive seismic zone.
  • Data has to be replicated daily at the DR site. But Pune at that time did not have very good connectivity options from telecoms and ISPs.
  • NSE wanted the DR site in a state with a different political climate.
  • The staff at Pune was underutilized. Personnel at the well-staffed DR site in Chennai are also involved in development and management of applications when there are no disasters and drills.

A 2 Mbps fiber link connects the primary and DR site and incremental data is shipped every three minutes. This is backed by an ISDN link. A Satellite SCPC (Single Channel Per Connection) link is also being considered. Live drills were also conducted in phases.
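
The link speed and shipping interval above set a ceiling on how much changed data can reach the DR site in each cycle. The sketch below works out that ceiling. It is a back-of-the-envelope estimate, not a description of NSE's replication software: the `efficiency` parameter and the function names are assumptions introduced here, and real throughput would be lower once protocol overhead is counted.

```python
LINK_MBPS = 2              # fiber link speed quoted in the article
SHIP_INTERVAL_S = 3 * 60   # incremental data is shipped every three minutes


def max_megabytes_per_interval(link_mbps, interval_s, efficiency=1.0):
    """Upper bound on data (in MB, 10**6 bytes) the link can carry in one interval.

    `efficiency` is a hypothetical factor (0 to 1) for protocol overhead.
    """
    megabits = link_mbps * interval_s * efficiency
    return megabits / 8  # 8 bits per byte


def fits_in_interval(delta_mb, link_mbps=LINK_MBPS,
                     interval_s=SHIP_INTERVAL_S, efficiency=1.0):
    """Check whether one batch of changes can be shipped before the next is due."""
    return delta_mb <= max_megabytes_per_interval(link_mbps, interval_s, efficiency)


if __name__ == "__main__":
    ceiling = max_megabytes_per_interval(LINK_MBPS, SHIP_INTERVAL_S)
    print(f"At most {ceiling:.0f} MB can be shipped per three-minute cycle")
    print("40 MB batch fits:", fits_in_interval(40))
    print("50 MB batch fits:", fits_in_interval(50))
```

At the full 2 Mbps the ceiling is 45 MB per cycle. If a batch of changes were larger than that, shipments would start to fall behind. The three-minute interval also suggests that, in the worst case, up to about three minutes of changes might not yet have reached Chennai when the primary site goes down, although the article does not state a recovery point target.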

What to replicate at the DR site?
NSE used three policies to govern what to replicate at the DR site:

First - All business critical applications must run without compromise. Applications like trading, clearing and settlement, surveillance, position monitoring, and risk management are mission critical and require guaranteed response.

Second - Certain other applications will continue to run, but may show slower response times and lower performance levels.

Third - In case of disaster, certain applications will not be run at all. Delivery deadlines for these applications, like software development and benchmark testing, may be extended.

This is mainly because the primary site has adequate resources. But the DR site may not have those luxuries. A process of resource optimization is necessary at the DR site.
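
One way to picture the three policies is as a tier assigned to each application, which the DR site can then use to decide what to start and in what order. The sketch below is only an illustration of that idea, not NSE's actual tooling. The first-tier and third-tier examples come from the article. The article names no second-tier application, so placing "mail" there is a hypothetical choice made for this example.

```python
from enum import Enum


class Tier(Enum):
    FULL = "runs without compromise"
    DEGRADED = "runs with reduced performance"
    SUSPENDED = "not run during a disaster"


APP_TIERS = {
    # First policy: mission critical applications named in the article.
    "trading": Tier.FULL,
    "clearing and settlement": Tier.FULL,
    "surveillance": Tier.FULL,
    "position monitoring": Tier.FULL,
    "risk management": Tier.FULL,
    # Second policy: hypothetical example, the article names none.
    "mail": Tier.DEGRADED,
    # Third policy: examples named in the article.
    "software development": Tier.SUSPENDED,
    "benchmark testing": Tier.SUSPENDED,
}


def apps_to_run_at_dr(app_tiers):
    """Return the applications the DR site should start, FULL tier first."""
    order = {Tier.FULL: 0, Tier.DEGRADED: 1}
    runnable = [name for name, tier in app_tiers.items() if tier in order]
    # Sort by tier priority, then alphabetically for a stable listing.
    return sorted(runnable, key=lambda name: (order[app_tiers[name]], name))


if __name__ == "__main__":
    for name in apps_to_run_at_dr(APP_TIERS):
        print(f"{name}: {APP_TIERS[name].value}")
```

Keeping the classification in one table makes the resource trade-off explicit: anything marked as suspended frees capacity at the smaller DR site for the applications that must keep their guaranteed response.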

Applications for DR
Eight Stratus fault-tolerant mainframe-class machines run the critical applications at the primary site. They also run other non-mission critical applications like development, testing, load-testing, benchmarking, and simulation. The DR site has three such machines that only replicate the critical applications. The non-mission critical applications are not run during a disaster.

A number of Unix servers and the Oracle database are used for back office activities at the primary site. These are typically clearing and settlement applications. NSE calculated the minimum critical mass needed to carry out these applications to be six servers. So the DR site has six Unix servers, which perform these applications. NT servers are used for mail applications.

The extranet server is hosted at the primary site and mirrored at an external ISP. This is because brokers can always connect to the external ISP through alternate routes via the Internet in case of a disaster.

Wisdom and the future
NSE.IT has gained a lot of expertise and knowledge about BCM from all the projects that its team has implemented for NSE. NSE.IT now hopes to pass on this wisdom to other organizations by way of offering consultancy and planning services.

Kajwadkar says, "The biggest risk to business continuity is the lack of conviction that a risk actually exists." And he hopes enterprises will take BCM more seriously and not wait for disaster to strike first.

In future, NSE plans to enhance the BCM and DR policies, and follow them. As the primary site adds services, the DR site's infrastructure will be enhanced accordingly. And the live drills will continue.

In a nutshell

The company
The National Stock Exchange (NSE) is one of the largest exchanges in India and uses 3,000 VSAT links, owns two VSAT hubs, and is linked to around 1,000 leased lines. Each VSAT link connects to multiple traders in 360 cities nationwide. The systems support around 7,000 concurrent users daily.

The need
To attract international investors, offer Business Continuity (BC), and follow BC policies, NSE needed to set up a Disaster Recovery (DR) infrastructure.

The solution
NSE set up a live DR site in Pune and migrated it to Chennai. Data from the critical business applications are replicated at the site.

The benefits
All critical daily operations at the exchange can continue during a disaster. The live DR site can immediately take over in case of a disastrous event.

Soutiman Das Gupta can be reached at soutimand@networkmagazineindia.com

 
     

© Copyright 2001: Indian Express Newspapers (Bombay) Limited (Mumbai, India). All rights reserved throughout the world.
This entire site is compiled in Mumbai by the Business Publications Division (BPD) of the Indian Express Newspapers (Bombay) Limited. Site managed by BPD.