|
Power conditioning is essential for
any enterprise that wants to provide five-nines uptime.
Here are some typical power-related problems and ideal
strategies to counter them.
by Soutiman Das Gupta
The word 'outage'
is likely to scare a CTO more than the man-eating shark
did in the film Jaws. And why not? An outage in your
IT infrastructure will result in loss of productivity,
management dissatisfaction, loss of revenue, loss of
transactions, and may lead to customer dissatisfaction.
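To put "five-nines uptime" in perspective, the short sketch below converts an availability percentage into a yearly downtime budget. The function name and the 365-day year are assumptions made for illustration; they are not drawn from any survey cited here.

```python
# Minutes in a non-leap year (assumption: leap years are ignored for simplicity).
MINUTES_PER_YEAR = 365 * 24 * 60


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for a given availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for label, pct in [("three nines", 99.9), ("four nines", 99.99), ("five nines", 99.999)]:
    minutes = allowed_downtime_minutes(pct)
    print(f"{label} ({pct}%): {minutes:.2f} minutes of downtime per year")
```

At five nines the budget is only a little over five minutes a year, which is why a single unplanned power outage can break the target.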
The shocking
truth
System downtime
may be the result of various factors: the high-tech
ones like network failures or hardware/software crashes,
or the natural ones like earthquakes, floods, and fires.
But according to
the "Cost of Downtime" survey conducted by Contingency
Planning Research Inc., power-related problems were
the most frequent reason for outage in an enterprise.
The survey also
showed that power outages accounted for 25 percent of
outages, and power surges/spikes for three percent.
Together (28 percent), they comprised the most
common causes, and were followed by storms, floods,
hardware errors, and fire. Network outages accounted
for only two percent.
This shows that
while CIOs/CTOs have been busy creating redundant network
architectures with no single point of failure, power
supply and power solutions have taken a back seat.

Defining power
conditioning
Power conditioning
is a well-defined strategy to ensure that an enterprise
has continuous power availability, which is clean, steady,
and free of irregularities. It also encompasses power
redundancy in terms of backup and alternate supply.
Enterprise power
conditioning is a subset of Business Continuity Management
(BCM). BCM's goal is to provide 100 percent business
uptime, of which power conditioning is an important
aspect. This is because electricity is the basic necessity
to run the hardware and the applications, which enable
a business.
Another survey,
jointly conducted by MAIT, Emerson Network Power (ENP)
India, and Feedback Consulting in 325 Indian enterprises,
showed that over 60 percent of firms experienced power
disruption more than once a month. India Inc. could
be losing over Rs 20,000 crore annually in direct losses
due to downtime related to poor power quality and a poor
operating environment. The final bill for India Inc. would
be much higher if indirect losses were taken into
account.
Beyond a UPS
A UPS (Uninterruptible
Power Supply) is the core component of a power conditioning
strategy. It includes components such as an inverter, a voltage
stabilizer, and EMI/RFI filters. It also has a number
of built-in intelligent features, which take care of
most of the typical power-related problems (See box:
What can go wrong with power?). UPSs can also be managed
remotely through a browser, which provides a lot of
flexibility.
But one has to
look beyond a UPS in order to deploy a total power conditioning
strategy. An enterprise also needs to plan redundancy
and backup at various levels of operation. Redundancy
should also be built into each zone and into all pieces
of equipment. Ground faults and wrong wiring issues
have to be dealt with. And the success of the solution
has to be reviewed through audits and checks.
Besides, older
UPSs do not usually take care of all the power-related
problems. For example, a UPS must have an isolation
transformer (to protect from EMI/RFI noise) in its output
circuit to qualify as a power conditioner.
[Figure: An ideal power conditioning strategy]
An ideal power
conditioning strategy
An ideal strategy
is one that encompasses all power-related problems.
In large enterprises, power conditioning is usually
the shared responsibility of the Operations Manager
and the IT Head. In smaller enterprises, the IT Head
should be the one to swing into action.
It is difficult
to suggest an ideal strategy since different companies
from different verticals have different business and
operational
needs. But one may use a broad framework.
1. At the point/gateway
where the power enters your enterprise from the electric
supply, there needs to be an automatic transfer switch.
This is because most companies will have a backup diesel
generator set. The transfer switch swaps between the
two feeds. You can install a manual transfer switch,
but it needs a person to be deployed at the site around
the clock. An automatic transfer switch is useful especially
in the case of remote locations.
2. Power from the
transfer switch flows into a surge suppressor. This
controls any large power fluctuations, which are likely
to damage equipment. It does much the same job
as a domestic PC spike buster, but on a larger
scale.
3. AC power now
passes through a UPS, which has a battery backup and
automatically switches over to the alternative supply
in case of outages. The power is now distributed to
various departments and sections of an enterprise through
a power distribution cabinet. Some telecom switches
and equipment require DC supply. In this case the company
needs to set up DC power systems and interfaces.
4. Cables must
be robust and the conduits and pipes must be laid according
to safety principles. The embedded AC/DC power supply
is also critical. This is the power supply grid present
inside servers, switches, and other devices. Critical
hardware should have dual power grids, so that one acts
as failover.
5. The customer
also has to evaluate and identify the critical areas
for which uptime needs to be enhanced. There may be
a possibility of distribution faults or a fault elsewhere in
the facility. An enterprise can deploy dual power supplies,
dual distribution equipment, and static switches at
the load end.
6. The capability
to monitor operations from remote locations has emerged
as an important feature for any solution. So, all these
solutions should allow browser-based monitoring. Information
about impending failures, like a weak battery bank, and alarm
conditions that need manual intervention can be retrieved
from anywhere in the world.
Some management
software can also send SMS and e-mail messages as alerts.
The remote monitoring and message delivery functions
should be closely integrated with the customer's backend
network.
7. As a part of
a complete power solution offering, all the equipment
has to be backed by services. Servicing starts
right from pre-sales and carries on as a life-long commitment
made by the power vendor to its customers.
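Two steps of the framework above, the automatic transfer switch (step 1) and remote alerting (step 6), can be sketched in a few lines. This is a minimal illustration, not a vendor implementation: the `SiteStatus` record, the 230 V nominal voltage, the 10 percent tolerance, and the 30 percent weak-battery threshold are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import List

NOMINAL_VOLTS = 230.0  # assumed nominal mains voltage
TOLERANCE = 0.10       # assumed acceptable deviation (plus or minus 10 percent)


@dataclass
class SiteStatus:
    """Hypothetical snapshot of readings reported by a monitored site."""
    mains_volts: float
    generator_running: bool
    battery_charge_percent: float


def mains_is_usable(volts: float) -> bool:
    """True when the mains voltage lies inside the assumed tolerance band."""
    low = NOMINAL_VOLTS * (1 - TOLERANCE)
    high = NOMINAL_VOLTS * (1 + TOLERANCE)
    return low <= volts <= high


def select_feed(status: SiteStatus) -> str:
    """Mimic an automatic transfer switch: prefer mains, then the generator set.

    If neither feed is available, the load rides on the UPS battery.
    """
    if mains_is_usable(status.mains_volts):
        return "mains"
    if status.generator_running:
        return "generator"
    return "ups_battery"


def alerts(status: SiteStatus, weak_battery_percent: float = 30.0) -> List[str]:
    """Build the alert messages that a management tool might send by SMS or e-mail."""
    messages = []
    feed = select_feed(status)
    if feed != "mains":
        messages.append(f"Mains out of range; load is on {feed}")
    if status.battery_charge_percent < weak_battery_percent:
        messages.append("Weak battery bank; manual check needed")
    return messages


status = SiteStatus(mains_volts=0.0, generator_running=False, battery_charge_percent=22.0)
print(select_feed(status))
for message in alerts(status):
    print(message)
```

A real transfer switch would also add delays and retransfer rules so that it does not flap between feeds; those are left out to keep the sketch short.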
Assessing downtime
costs
Manoj John, Industry
Manager, Industrial Technologies Practice, Frost &
Sullivan (India) feels that the extent of damage to
an organization in case of a power outage differs from
one company to the other. Downtime cost can be calculated
using a formula based on a number of common cost factors
which reflect business realities.
Some common cost
factors are:
- User productivity
- Revenue or transaction impact
- IT productivity
- Lost future revenue
- Market impact
- Fees/penalties/other charges
In functions like
ATM (Automated Teller Machine) transactions, mobile
telecom, rail reservations, and air reservations, downtime
has a direct impact on revenue and goodwill. The cost
can be calculated by multiplying the transaction volume
per minute by the number of minutes lost in downtime.
This revenue is lost forever.
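The calculation described above can be written out as a short sketch. The function names and every figure in the example are invented for illustration; the article gives only the general method (transaction volume per minute multiplied by minutes lost, plus the other cost factors listed earlier).

```python
from typing import Dict


def lost_transaction_revenue(value_per_minute: float, minutes_down: float) -> float:
    """Direct revenue lost: transaction value per minute times minutes of downtime."""
    return value_per_minute * minutes_down


def total_downtime_cost(direct_loss: float, other_costs: Dict[str, float]) -> float:
    """Add the remaining cost factors (productivity, penalties, and so on)."""
    return direct_loss + sum(other_costs.values())


# Hypothetical figures in rupees, chosen only to show the arithmetic.
direct = lost_transaction_revenue(value_per_minute=50_000, minutes_down=45)
other = {
    "user productivity": 120_000,
    "IT productivity": 30_000,
    "fees/penalties/other charges": 100_000,
}
print(f"Direct revenue loss: Rs {direct:,.0f}")
print(f"Total downtime cost: Rs {total_downtime_cost(direct, other):,.0f}")
```

Factors such as lost future revenue and market impact are harder to quantify; they can be added to the dictionary as estimates once the business agrees on how to value them.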
The MAIT and ENP
survey reported that over 50 percent of the respondents
mentioned impact on business processes, PCs, and servers
as the most severe manifestation of downtime. In today's
networked environment, any break in business continuity
not only results in monetary loss, but also erodes customer
confidence and adversely impacts the image of the organization.
Soutiman Das Gupta
can be reached at soutimand@networkmagazineindia.com
|
What can go wrong with power?
Anomalies in the electrical
supply usually go unnoticed, perhaps because electricity
works silently and invisibly. But a lot can go
wrong with it. Here are some typical power-related
problems.
|
Surge:
A short-term increase in voltage which lasts at
least 1/120 of a second. Surges result from the presence
of high-powered electrical motors like those in air conditioners
and electric pumps. When this equipment is switched
off, the extra voltage is dissipated through the
power line. |
Spike:
A very high but momentary rise in voltage. The worst
spikes come from lightning strikes on the power wiring,
which can seriously damage your servers. Spikes
can also originate from the power grid. Spikes and
surges can progressively damage the power supply and
other components, so after a number of them
your server may simply die. You may lose a
power supply, your motherboard, and even your hard
drive and everything on it. Fortunately, direct
lightning strikes to power lines are rare because
a power line is usually well isolated from earth,
and lightning seeks a path to earth. |
Over-voltage:
Sustained voltages that are higher than normal.
Over-voltage causes a computer's power supply components
to overheat and die. |
Brownouts:
Also called 'sags', a brownout is a sustained decrease
in voltage level. If the voltage is too low for
the computer's power supply to compensate, the computer
can freeze or behave erratically. Erratic behavior
is the worst because it can cause major data corruption
on your hard disk. Hard disk drive motors may also
overheat. Brownouts are the most common power
problem. They are usually caused by the start-up power
demands of electrical devices like motors, compressors,
elevators, and shop tools. The closer to maximum
capacity a power supply is, the less likely it is
to handle a given surge or sag. A computer with
a 300 watt Power Supply Unit (PSU) is likely to
deal better with line irregularities than one with
a 235 watt PSU, although it may never need more
than 200 watts of the PSU's possible output: at 200 watts
the first runs at about 67 percent of capacity and the
second at about 85 percent. |
Power
failure: Any
loss of power lasting more than 1/120 of a second. In other
words, the plain old power blackout. Servers may
lose data in the RAM, cache, and FAT (File Allocation
Table). |
EMI/RFI:
Electro-magnetic Interference/Radio Frequency Interference
may occur due to electromagnetic noise from devices
like printers, radio transmitters, and industrial
equipment. It may cause mysterious hangs and other
problems. |
Switching
Transient:
Instantaneous under-voltage (tapering off), which
is shorter than a spike and may last only a nanosecond.
It can result in quirky computer behavior and puts
stress on components, which can lead to premature
failure. |
Harmonic
Distortion:
Distortion of the normal power waveform. This happens
due to the use of variable speed motors, disk drives,
copiers, and fax machines. It can cause communication
errors, overheating, and hardware damage. |
Wrong
Wiring: Wiring
may be installed incorrectly, yet the equipment may
still work when plugged in. Such equipment carries the
risk of electric shock and early failure. |
|
Ground
Faults:
Faulty earthing of equipment. This can cause mysterious
malfunctions and destroy network equipment, seemingly
without explanation.
|
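The definitions in this box can be turned into a rough classifier for voltage events. This is a simplified sketch under stated assumptions: the article supplies only the 1/120 second threshold, while the 10 percent tolerance band, the one-second cut-off for "sustained", and the 10 percent level treated as a loss of power are invented for illustration. EMI/RFI, harmonic distortion, wiring, and ground faults cannot be identified from voltage and duration alone, so they are not covered.

```python
HALF_CYCLE_S = 1 / 120  # threshold the box uses for surges and power failures
SUSTAINED_S = 1.0       # assumption: events this long count as "sustained"
TOLERANCE = 0.10        # assumption: plus or minus 10 percent is treated as normal
OUTAGE_RATIO = 0.10     # assumption: below 10 percent of nominal is a loss of power


def classify(voltage_ratio: float, duration_s: float) -> str:
    """Label an event from its voltage (as a fraction of nominal) and its duration."""
    if voltage_ratio <= OUTAGE_RATIO:
        # A loss of power longer than 1/120 s is a power failure.
        if duration_s > HALF_CYCLE_S:
            return "power failure"
        return "switching transient"
    if voltage_ratio > 1 + TOLERANCE:
        if duration_s < HALF_CYCLE_S:
            return "spike"          # very high but momentary
        if duration_s < SUSTAINED_S:
            return "surge"          # short-term, at least 1/120 s
        return "over-voltage"       # sustained high voltage
    if voltage_ratio < 1 - TOLERANCE:
        if duration_s < HALF_CYCLE_S:
            return "switching transient"  # instantaneous under-voltage
        return "brownout"                 # sustained low voltage (sag)
    return "normal"


for ratio, seconds in [(1.5, 0.001), (1.3, 0.1), (1.2, 60), (0.8, 5), (0.0, 2)]:
    print(f"{ratio:.1f} x nominal for {seconds} s -> {classify(ratio, seconds)}")
```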
|
UPS topologies
Manoj John, Industry
Manager, Industrial Technologies Practice, Frost
& Sullivan (India) says that UPSs can be deployed
in three topologies: online, offline, and line-interactive.
Online UPS protection
provides the highest level of power quality protection,
power conditioning, and power availability. In
an online UPS, the inverter supplies conditioned
AC power to critical equipment at all times, including when the mains
supply is not available. Raw power from the
mains is used by a rectifier-cum-battery
charger to power the inverter, so the load equipment
never receives raw power from the mains in any
condition.
Offline UPS protection,
also called standby, is a cost-effective
choice for small, non-critical, and stand-alone
applications. In this configuration, raw mains
power is supplied to the load for as long as
it is available. The inverter is normally off.
It starts only after the mains power
fails, and there is a relay changeover with a small
break in output power. The break is normally
shorter than the interruption needed to stop or reboot
a computer.
Line-interactive
UPS protection provides backup along with power conditioning
that is better than that of an offline (standby) UPS.
It is particularly suitable in areas
where power outages are rare, but voltage fluctuations
are wide and frequent. In this configuration
too, power is normally fed from the mains, but through
a voltage stabilizer. The inverter runs and supplies
power only during mains failure conditions.
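The difference between the three topologies comes down to the path power takes to the load when the mains is present and when it fails. The sketch below restates the descriptions above as a small lookup function; the function name and the path strings are illustrative only.

```python
def load_power_path(topology: str, mains_available: bool) -> str:
    """Describe how power reaches the load for a given UPS topology."""
    if topology == "online":
        # The inverter always feeds the load; the load never sees raw mains.
        source = "mains -> rectifier/charger" if mains_available else "battery"
        return f"{source} -> inverter -> load"
    if topology == "offline":
        # Raw mains feeds the load; the inverter starts only on failure,
        # after a relay changeover with a small break in output.
        if mains_available:
            return "mains (raw) -> load"
        return "battery -> inverter -> load"
    if topology == "line-interactive":
        # Mains feeds the load through a voltage stabilizer.
        if mains_available:
            return "mains -> voltage stabilizer -> load"
        return "battery -> inverter -> load"
    raise ValueError(f"unknown topology: {topology}")


for topology in ("online", "offline", "line-interactive"):
    for mains_available in (True, False):
        state = "mains up" if mains_available else "mains down"
        print(f"{topology:16} {state:10} {load_power_path(topology, mains_available)}")
```

Only the online topology keeps the inverter in the path at all times, which is why it is the only one with no changeover break.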
|
|
Conditioned environment
Servers and other
hardware also need environment conditioning to
perform optimally. S.S. Bapat, Country Champion,
Uptime Solution, Emerson Network Power (India)
Private Limited says, "Today's servers and
communication switches generate up to ten times
the heat per square foot as systems manufactured
just ten years ago and that is dictating new approaches
to heat removal."
For example, batteries
used in a UPS are one of the many pieces of equipment
affected by severe heat. The optimal battery temperature
is 77° F (25° C). At 111° F (about 44° C) the design life of
the battery is reduced by more than 80 percent.
The study of environmental
conditioning is extensive and elaborate,
and would need an entire cover story dedicated
to it. In short, however, servers and other computers
ideally need low temperatures, controlled humidity,
and a flow of air over their chassis. It's common
knowledge that domestic-type air conditioners
can cause static electricity on the hardware because
they reduce the humidity in the air. Servers and
other hardware need normal humidity conditions.
The best solution
is to use precision cooling systems, which work
around the clock and are customized for a particular
enterprise. In data centers and server rooms,
conditioned air should be pumped in from vents
on the floor or ceiling to maintain free air flow.
Handheld digital thermometers can be used near
servers and racks to check local temperatures.
|
|