Even during a disaster, National
Stock Exchange's critical trade and settlement operations
continue normally, thanks to a well-devised business
continuity management plan and an elaborate Disaster
Recovery (DR) site at Chennai. Here's a look at the
exchange's DR operations.
by Soutiman Das Gupta
The National Stock Exchange
(NSE) conducts a very high number of trade and settlement
transactions every day. Its feature-rich nationwide
network can't afford to be down for a moment,
for fear of losing revenue and reputation. To ensure
uninterrupted service, the exchange has implemented adequate
measures to support business recovery, including
a DR site in another city.
"The exchange's daily critical
operations like trade and settlement will carry on even
if there's a disaster," promised C. Kajwadkar,
Vice-President, NSE.IT Limited. NSE.IT Limited is an
organization which implements all IT-related projects
at NSE. "We have implemented a Business Continuity
Management (BCM) plan and have deployed elaborate DR
solutions as a part of the BCM exercise. So if disaster
strikes, we're still ticking."
Participation of foreign institutional investors (FIIs) in the
Indian stock market was important to promote economic
reforms in the country, and it demanded that Indian
exchanges follow the practices and guidelines of international
exchanges. In order to maintain international standards
and be committed to continuity, NSE has set up and manages
its own DR site in Chennai. All the critical applications
and hardware are replicated at the site, which works
as a failover in case of an untoward event at the primary
site.
"This has made NSE the
only exchange in India with a live DR site," claimed
Kajwadkar.
NSE also conducted a number
of live drills where daily operations at the exchange
were conducted entirely out of the DR site. In each
drill the performance levels were very satisfactory
and the changeover was transparent to the nationwide
users.
We look at the plans, solutions,
and strategies NSE used to build its DR infrastructure.
DR is the subset
Kajwadkar feels that BCM is a holistic study of which
DR is only a subset. He also feels that NSE's focus
has been on BCM practices and the DR site implementation
was a part of the overall BCM strategy. "BCM is
about the ownership of continuity which resides on technology
and business heads. And DR takes care of IT systems.
The goal is primarily BCM."
BCM through the years
NSE had deployed a basic BCM program in 1997 in its
primary site in Bandra-Kurla Complex, Mumbai. It comprised
redundant systems with adequate backup and failover.
The BCM infrastructure was further developed over the
next few years and in 1998, the exchange set up its
first DR site in Pune, in line with its BCM policy.
The DR site was subsequently migrated to Chennai in
2002.
Technology and loss of business
NSE uses 3,000 VSAT links, owns two VSAT hubs, and is
linked to around 1,000 leased lines. Each VSAT link
connects to multiple traders in 360 cities nationwide.
The systems support around 7,000 concurrent users daily.
To understand the loss of business
NSE may suffer in case of a disaster, let's look at
the areas that NSE has to manage and its daily turnover:
- The Wholesale Debt Market
(WDM) - Rs 2,900 Crore
- The Capital Market (CM)/secondary
market - Rs 2,500 Crore
- The derivatives market -
Rs 2,600 Crore
- The entire retail debt market
The exchange runs mission critical
applications like trading, clearing and settlement,
surveillance, position monitoring, and risk management.
In terms of spread of business, NSE's operations in
360 cities connect to brokerage houses, banks, and two
depositories.
Analyzing impact
In case of a disaster the tangible and intangible losses
will be tremendous. The daily turnover-based revenue
in WDM, CM, and derivatives will be affected. Average
brokerage of 0.5 percent on daily trading will be lost.
Earnings of the exchange's business partners like clearing
corporations, depositories, and clearing banks will
be affected. And legal liabilities will also arise.
There'll be a large loss to
the trading members due to loss of trading opportunity.
It will be a loss of image for NSE, the Indian securities
industry, and the nation at large. There will be loss
in customer base and goodwill.
And the lack of a BCM plan
and DR infrastructure will result in unpredictable recovery
time and chaotic recovery of operations.
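As a rough illustration of the scale involved, the turnover and brokerage figures quoted above can be combined into a single daily estimate. This is only a sketch: it assumes the 0.5 percent average brokerage applies uniformly to all three segments, which the figures above do not confirm, and it leaves out the retail debt market because no turnover is given for it. The function and variable names are hypothetical.

```python
# Daily turnover figures quoted in the article, in Rs crore.
DAILY_TURNOVER_CRORE = {
    "Wholesale Debt Market": 2900,
    "Capital Market": 2500,
    "Derivatives": 2600,
}

# Average brokerage quoted in the article (0.5 percent).
BROKERAGE_RATE = 0.005


def lost_brokerage_crore(turnover_by_segment, rate, days_down=1):
    """Brokerage forgone if trading halts for `days_down` full days.

    Assumption: the same rate applies to every segment's turnover.
    """
    total_turnover = sum(turnover_by_segment.values())
    return total_turnover * rate * days_down


total = sum(DAILY_TURNOVER_CRORE.values())
loss = lost_brokerage_crore(DAILY_TURNOVER_CRORE, BROKERAGE_RATE)
print(f"Daily turnover at risk: Rs {total:,} crore")
print(f"Brokerage forgone per day: Rs {loss:,.0f} crore")
```

Under these assumptions, a full day's outage puts Rs 8,000 crore of turnover at risk and forgoes about Rs 40 crore in brokerage, before counting partner earnings, legal liabilities, or loss of goodwill.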
Four-step methodology
NSE achieved its BCM and DR goals through a four-step
plan:
Business Impact Analysis: This
studied the impact on the exchange if business operations
were interrupted. This
was necessary to justify any investment in BC and DR
infrastructure. Collective wisdom from NSE's business
and operation heads was sought. The critical applications
pertinent to the business were defined and covered in
the DR infrastructure.
"A DR solution is a perpetual
effort. So all you do at the primary site must be replicated
at the DR site. If you spend x at the primary site,
you must spend a similar amount for DR," explained
Kajwadkar.
Strategy Selection: BC strategy
requirements were identified. Business, technology,
and non-technology recovery issues were looked into
and aspects like timeframes, options, locations, personnel,
and communications were provisioned.
Alternative recovery strategies were drawn up
and the risk associated with each of them
was assessed. A cost-benefit analysis of the recovery
strategies was carried out and the findings were presented to the
senior management. Alternate storage sites were identified.
Provisions were made for emergency telecommunications
and data communication.
Detailed Plan development and
plan maintenance: This defined aspects like:
- Plan development requirements
(job descriptions, action plans, checklists, matrices
and flowcharts).
- Recovery management and
control requirements like team description and team
organization.
- Plan components, drafts,
and BC procedures.
- IT recovery procedures.
Testing, revisions and modifications:
This included:
- Establishing an exercise
program, defining exercise requirements, developing
realistic scenarios, and creating schedules.
- Post exercise reporting.
- Establishing review criteria.
- Setting audit objectives
and scope.
- Reviewing policies periodically
or after events.
The first DR site
In 1997, NSE leased premises in Pune and began to build
a DR site. It went live in 1998. Pune was chosen mainly
because it was geographically near Mumbai, making it
easy to move staff between the primary site and the
DR site. The exchange maintained skeleton staff at this
facility.
Live drills were performed
from Pune, where critical applications were entirely
run from the DR infrastructure. Essential
staff was shifted from Mumbai prior to the drill.
The new DR site
In 2001, the DR site was migrated to Chennai. The new
site became operational in mid-2002. The move took this long mostly
because NSE had to set up its own VSAT hub at the site
to complement its VSAT hub at the primary site. The
exchange also had to wait for statutory clearance from
government agencies.
The reasons for choosing Chennai
as the new DR site venue are:
- Mumbai and Pune are in the
same seismic zone.
- Chennai is in a relatively
less sensitive seismic zone.
- Data has to be replicated
daily at the DR site. But Pune at that time did not
have very good connectivity options from telecoms
and ISPs.
- NSE wanted the DR site in
a state with a different political climate.
- The staff at Pune was underutilized.
Personnel at the well-staffed DR site in Chennai are
also involved in development and management of applications
when there are no disasters and drills.
A 2 Mbps fiber link connects
the primary and DR sites, and incremental data is shipped
every three minutes. This is backed by an ISDN link.
A Satellite SCPC (Single Channel Per Connection) link
is also being considered. Live drills were also conducted
in phases.
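To put the replication figures in perspective, here is a rough back-of-the-envelope sketch of how much incremental data a 2 Mbps link can carry in one three-minute cycle. The link speed and interval come from the figures above; the function names, the decimal definition of a megabit, and the `efficiency` factor for protocol overhead are assumptions made for illustration and do not describe NSE's actual replication software.

```python
LINK_MBPS = 2                # fiber link speed quoted in the article
INTERVAL_SECONDS = 3 * 60    # incremental data is shipped every three minutes


def max_batch_megabytes(link_mbps, interval_seconds, efficiency=1.0):
    """Upper bound on data (in MB) the link can move in one interval.

    Assumptions: 1 Mbps = 1,000,000 bits per second, and `efficiency`
    (0 to 1) is a hypothetical allowance for protocol overhead.
    """
    bits = link_mbps * 1_000_000 * interval_seconds * efficiency
    return bits / 8 / 1_000_000


def fits_in_window(batch_megabytes, link_mbps, interval_seconds, efficiency=1.0):
    """True if a batch of this size can be shipped before the next cycle."""
    return batch_megabytes <= max_batch_megabytes(
        link_mbps, interval_seconds, efficiency
    )


ceiling = max_batch_megabytes(LINK_MBPS, INTERVAL_SECONDS)
print(f"Theoretical ceiling per cycle: {ceiling:.0f} MB")
```

At the theoretical ceiling, each cycle can carry about 45 MB. If the incremental data generated in three minutes regularly exceeds what the link can move, replication falls behind, which is one reason a planner would watch this number as trading volumes grow.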
What to replicate at the
DR site?
NSE used three policies to govern what to replicate
at the DR site:
First - All business critical
applications must run without compromise. Applications
like trading, clearing and settlement, surveillance,
position monitoring, and risk management are mission
critical and require guaranteed response.
Second - Certain other applications
will continue to run, but may show slower response
times and lower performance levels.
Third - In case of disaster, certain applications will
not be run at all. Delivery deadlines for these
activities, like software development and benchmark
testing, may be extended.
This is mainly because the
primary site has adequate resources. But the DR site
may not have those luxuries. A process of resource optimization
is necessary at the DR site.
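The three policies amount to a tiering scheme, which can be sketched as a simple lookup table. The tier names, the `apps_to_run_at_dr` helper, and the "internal reporting" entry are hypothetical and used only for illustration; the article names the mission-critical and suspended applications but does not list any second-tier ones.

```python
from enum import Enum


class Tier(Enum):
    FULL = 1        # first policy: must run without compromise
    DEGRADED = 2    # second policy: runs, but with slower response
    SUSPENDED = 3   # third policy: not run during a disaster


APP_TIERS = {
    "trading": Tier.FULL,
    "clearing and settlement": Tier.FULL,
    "surveillance": Tier.FULL,
    "position monitoring": Tier.FULL,
    "risk management": Tier.FULL,
    # Hypothetical entry: the article gives no example for this tier.
    "internal reporting": Tier.DEGRADED,
    "software development": Tier.SUSPENDED,
    "benchmark testing": Tier.SUSPENDED,
}


def apps_to_run_at_dr(app_tiers):
    """Applications the DR site must provision capacity for."""
    return sorted(
        name for name, tier in app_tiers.items() if tier is not Tier.SUSPENDED
    )


for name in apps_to_run_at_dr(APP_TIERS):
    print(f"{name}: {APP_TIERS[name].name}")
```

Writing the classification down in this form makes the resource optimization explicit: only the first two tiers need hardware at the DR site, and only the first needs it at full strength.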
Applications for DR
Eight Stratus fault-tolerant mainframe-class machines
run the critical applications at the primary site. They
also run other non-mission critical applications like
development, testing, load-testing, benchmarking, and
simulation. The DR site has three such machines that
only replicate the critical applications. The non-mission
critical applications are not run during a disaster.
A number of Unix servers and
the Oracle database are used for back office activities
at the primary site. These are typically clearing and
settlement applications. NSE calculated the minimum
critical mass needed to carry out these applications
to be six servers. So the DR site has six Unix servers,
which perform these applications. NT servers are used
for mail applications.
The extranet server is hosted
at the primary site and mirrored at an external ISP.
This is because brokers can always connect to the external
ISP through alternate routes via the Internet in case
of a disaster.
Wisdom and the future
NSE.IT has gained a lot of expertise and knowledge about
BCM from all the projects that its team has implemented
for NSE. NSE.IT now hopes to pass on this wisdom to
other organizations by way of offering consultancy and
planning services.
Kajwadkar says, "The biggest
risk to business continuity is the lack of conviction
that a risk actually exists." And he hopes enterprises
will take BCM more seriously and not wait for disaster
to strike first.
In future, NSE plans to enhance
the BCM and DR policies, and follow them. As the primary
site adds services, the DR site's infrastructure will
be enhanced accordingly. And the live drills will continue.
The company
The National Stock Exchange (NSE) is one of the
largest exchanges in India and uses 3,000 VSAT
links, owns two VSAT hubs, and is linked to around
1,000 leased lines. Each VSAT link connects to
multiple traders in 360 cities nationwide. The
systems support around 7,000 concurrent users
daily.
The need
To attract international investors, offer Business
Continuity (BC), and follow BC policies, NSE needed
to set up a Disaster Recovery (DR) infrastructure.
The solution
NSE set up a live DR site in Pune and migrated
it to Chennai. Data from the critical business
applications is replicated at the site.
The benefits
All critical daily operations at the exchange
can continue during a disaster. The live DR site
can immediately take over in case of a disastrous
event.
Soutiman Das Gupta can be reached
at soutimand@networkmagazineindia.com