VoIP technology offers tremendous benefits for enterprises. Understanding
how VoIP works, and the protocols and issues involved, is the
first step towards evaluating this technology.
by Soutiman Das Gupta
IP has become an accepted standard for communications over data
networks worldwide, and enterprises have successfully implemented
it. In 1995, IP found a new passenger, which it had to transport
between networks and devices: voice. Voice can now be
packetized and sent over a data network using VoIP (Voice
over Internet Protocol) technology. This revolutionary technology
is now setting off a new trend: the convergence of voice
and data networks. The convergence has also paved the way
for a wide array of new applications.
In India, restrictions on IP Telephony services are about
to be lifted and enterprises will soon be able to fully leverage
these services. As an immediate benefit, an enterprise
can cut costs on long distance calls. Other benefits may go
beyond the monthly phone bill. What's more, being a bit late
to adopt, we can learn from the limitations faced by early
adopters of the technology in other countries. This will help
us introduce a much-evolved, stable, and reliable architecture
in our backyard.
To understand VoIP and its packet-switching nature, we have
to take a closer look at circuit-switching and the PSTN (Public
Switched Telephone Network) architecture.
In a traditional PSTN telephony architecture, when a phone
call is made to a particular number, a dedicated channel is
assigned for the connection. This channel, or circuit, typically
uses 64 Kbps of bandwidth in each direction, for a total transmission
rate of 128 Kbps. The voice traffic passes through the carrier's
switches at the caller's and receiver's end, and all the switch
ports that support the connection are used throughout the
duration of the call. TDM (Time Division Multiplexing, a
technology that transmits multiple signals simultaneously
over a single transmission path) is used to accommodate more
connections within the limited capacity.
Circuit-switching has many disadvantages. Let's say that you
talk for 10 minutes. Since the bandwidth used is 128 Kbps (or
16 KB per second), the total transmission for the length of the conversation
is 9600 KB or roughly 9.4 MB. During this time, the circuit
is continuously open between the two phones. In a typical
voice conversation, while you talk, the other party listens.
This means that only half of the connection is in use at any
given time. There are also lots of 'gaps' or silent periods when no party is
talking. Since the circuit is dedicated, the bandwidth during
the silent periods is reserved but wasted. So, in effect, only
about half of that transmission, roughly 4.7 MB, carries speech. Moreover, the
switch ports at the local exchanges are also dedicated for
the entire duration of the call regardless of the silence.
This makes bandwidth provisioning difficult and time-consuming.
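The arithmetic in the 10-minute example can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function name is illustrative, and the 50 percent "useful" share is the article's rough estimate rather than a measured figure.

```python
# Figures taken from the text: 64 Kbps per direction, two directions, 10 minutes.
KBPS_PER_DIRECTION = 64
DIRECTIONS = 2
CALL_SECONDS = 10 * 60


def call_volume_kb(kbps_per_direction, directions, seconds):
    """Return the kilobytes carried by a dedicated circuit for the whole call."""
    total_kbps = kbps_per_direction * directions   # 128 Kbps
    kilobytes_per_second = total_kbps / 8          # 16 KB per second
    return kilobytes_per_second * seconds


total_kb = call_volume_kb(KBPS_PER_DIRECTION, DIRECTIONS, CALL_SECONDS)
total_mb = total_kb / 1024
useful_mb = total_mb / 2   # assumption from the text: only one party talks at a time

print(f"{total_kb:.0f} KB, {total_mb:.1f} MB, useful ~{useful_mb:.1f} MB")
# prints "9600 KB, 9.4 MB, useful ~4.7 MB"
```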
The PSTN switching infrastructure may not be able to handle
high traffic easily and may buckle. The carrier is not able
to cut down on operations costs and consumers do not get any
cost benefits. Also, billing is based on time and distance, not
on amount of traffic.
It is not easy to initiate new services and/or incorporate
any changes in a PSTN network. Capital costs are high,
partly because of the over-engineering required to support
peak-time traffic. You cannot integrate the network well with
multimedia applications and existing voice margins are very
flat. You need to have different networks for different services.
Circuit switching versus packet switching

Circuit switching:
Does not allow inclusion of new services easily
64 Kbps in each direction
Billing according to time and duration

Packet switching:
Allows new services to be integrated easily
Data is switched
Variable in length and can be compressed to 8 Kbps
Billing according to usage
VoIP uses packet switching, a technology commonly used by data
networks, instead of circuit switching. While circuit
switching keeps the connection open and constant, packet switching
opens the connection just long enough to send a small data
packet, from one system to another. Each packet in a packet
switched network contains a destination address. This allows
all packets in a single message to be dynamically routed on
different paths over the network depending on availability.
The destination computer reassembles the packets back into
their proper sequence.
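The idea of addressed packets that travel independently and are put back in order at the destination can be sketched as follows. The dictionary layout, the `seq` field, and the sample address are assumptions made for illustration; real IP and RTP headers are binary and carry more fields.

```python
import random


def packetize(message: bytes, destination: str, payload_size: int = 8):
    """Split a message into packets, each carrying the destination address
    and a sequence number (the sequence number is an illustrative assumption)."""
    return [
        {"dst": destination, "seq": seq, "payload": message[i:i + payload_size]}
        for seq, i in enumerate(range(0, len(message), payload_size))
    ]


def reassemble(packets):
    """Sort packets by sequence number and join their payloads."""
    ordered = sorted(packets, key=lambda p: p["seq"])
    return b"".join(p["payload"] for p in ordered)


original = b"Voice can now be packetized and sent over a data network."
packets = packetize(original, "10.0.0.2")
random.shuffle(packets)            # simulate packets arriving via different paths
restored = reassemble(packets)
print(restored == original)        # True
```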
A technique called statistical multiplexing is used to accommodate
more signals in a channel. This technique analyzes the traffic
and dynamically changes its pattern of interleaving to use
all the available capacity of an outgoing channel.
Packet switching minimizes the connection time between two
systems and reduces the load on the network. It frees up the
two systems communicating with each other so that they can
accept information from other systems simultaneously. Several
telephone calls can occupy the amount of space used by only
one call in a circuit-switched network. And the size of each
call can be further reduced with the use of data compression.
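As a rough illustration of that last point, the sketch below compares how many voice streams fit into the capacity of one 64 Kbps channel, using the 8 Kbps compressed rate quoted in the comparison above. It counts voice payload only and deliberately ignores packet header overhead, so real figures would be lower.

```python
CHANNEL_KBPS = 64          # capacity of one circuit-switched voice channel


def calls_per_channel(channel_kbps: int, per_call_kbps: int) -> int:
    """Number of voice streams that fit, ignoring packet header overhead."""
    return channel_kbps // per_call_kbps


uncompressed = calls_per_channel(CHANNEL_KBPS, 64)
compressed = calls_per_channel(CHANNEL_KBPS, 8)
print(uncompressed, compressed)   # 1 8
```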
Like any other technology, VoIP is defined by standards. There
are various standards out in the VoIP arena like H.323, SIP,
and MGCP (Media Gateway Control Protocol). A better way to
understand VoIP is to get a lowdown on these standards.
The H.323 specification was ratified in May 1996 by the ITU
(International Telecommunication Union). It defines how voice,
data, and video traffic will be transported over IP-based
LANs and WANs.
H.323 is actually a suite of protocols developed for specific
applications. Recommendations in the H.323 series include
H.225.0 packetization and synchronization, H.245 control, H.261 and
H.263 video codecs, G.711, G.722, G.728, G.729, and G.723
audio codecs, and the T.120 series of multimedia communication
protocols. A codec, which stands for coder-decoder, converts
an audio signal into a compressed digital form for transmission
and back into an uncompressed audio signal for replay.
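To make the coder-decoder idea concrete, here is a minimal sketch of mu-law companding, the principle behind one of the G.711 variants. It uses the continuous textbook formula and a simple 8-bit rounding step, so it is not bit-exact G.711 (which uses a segmented approximation); the function names are illustrative.

```python
import math

MU = 255  # companding constant used by mu-law


def compress(x: float) -> float:
    """Continuous mu-law compression of a sample in the range [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def expand(y: float) -> float:
    """Inverse of compress()."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def encode(x: float) -> int:
    """Coder: compand the sample and quantize it to one signed 8-bit code."""
    return round(compress(x) * 127)


def decode(code: int) -> float:
    """Decoder: turn the 8-bit code back into an approximate sample."""
    return expand(code / 127)


for sample in (0.01, -0.25, 0.5):
    print(sample, encode(sample), round(decode(encode(sample)), 4))
```

Quiet samples get proportionally more code values than loud ones, which is why 8 bits per sample is enough for speech. At 8,000 samples per second, 8 bits per sample gives the 64 Kbps figure used throughout this article.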
SIP (Session Initiation Protocol) is an IETF (Internet Engineering
Task Force) signaling protocol for establishing real-time
calls and conferences over IP networks. Each session may include
different types of data like audio and video although currently
most of the SIP extensions address audio communication.
As a traditional text-based Internet protocol, it resembles
HTTP (Hypertext Transfer Protocol) and SMTP (Simple Mail
Transfer Protocol). SIP uses SDP (Session Description Protocol)
for media description. SIP's basic architecture is client/server
in nature. The main entities in SIP are the User Agent, the
SIP Proxy Server, the SIP Redirect Server and the Registrar.
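Because SIP is text-based, a request can be built and parsed with ordinary string handling, much like HTTP. The sketch below is deliberately simplified: a real INVITE carries further mandatory headers (such as Via and Max-Forwards) and usually an SDP body, and the addresses and Call-ID used here are made up for illustration.

```python
def build_invite(caller: str, callee: str, call_id: str) -> str:
    """Build a stripped-down SIP INVITE request (not a complete, valid one)."""
    lines = [
        f"INVITE sip:{callee} SIP/2.0",
        f"From: <sip:{caller}>",
        f"To: <sip:{callee}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        "Content-Length: 0",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_request(text: str):
    """Split a SIP request into its request line fields and a header dict."""
    head = text.split("\r\n\r\n", 1)[0]
    request_line, *header_lines = head.split("\r\n")
    method, uri, version = request_line.split(" ")
    headers = dict(line.split(": ", 1) for line in header_lines)
    return method, uri, version, headers


message = build_invite("alice@example.com", "bob@example.com", "abc123")
method, uri, version, headers = parse_request(message)
print(method, uri, version, headers["CSeq"])
```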
SIP is independent of the packet layer. The protocol is an
open standard and is scalable. It has been designed to be
a general-purpose protocol. However, extensions to SIP are
needed to make the protocol truly functional in terms of interoperability.
The protocol also enables personal mobility by providing the
capability to reach a called party at a single, location-independent address.
Both SIP and H.323 define mechanisms for call routing, call
signaling, capabilities exchange, media control, and supplementary
services. SIP is a new protocol that promises scalability,
flexibility and ease of implementation when building complex
systems. H.323 is an established protocol that has been widely
used because of its manageability, reliability and interoperability.
MGCP (Media Gateway Control Protocol) is a master/slave protocol
that provides tight coupling between the MG (Media Gateway),
which is the endpoint, and the MGC (Media Gateway Controller)
server. MGCP-based VoIP solutions separate call control (signaling)
intelligence from media handling.
Like SIP and H.323, it relies on a variety of other existing
protocols, such as SDP for describing the media aspects of a call
and RTP/RTCP (Real-time Transport Protocol/RTP Control Protocol), which
are used by MGs for handling the real-time transport of media.
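For readers curious about what RTP adds to each voice packet, the following sketch packs and unpacks the 12-byte fixed RTP header (version, marker, payload type, sequence number, timestamp, and source identifier). It leaves out padding, header extensions, and contributing-source lists, and the helper names and sample values are illustrative.

```python
import struct

RTP_VERSION = 2


def pack_rtp_header(payload_type, seq, timestamp, ssrc, marker=False):
    """Pack the 12-byte fixed RTP header (no padding, extension, or CSRCs)."""
    byte0 = RTP_VERSION << 6
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack("!BBHII", byte0, byte1,
                       seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def unpack_rtp_header(data: bytes):
    """Decode the fixed header fields from the first 12 bytes of a packet."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", data[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# Payload type 0 denotes G.711 mu-law audio; the other values are made up.
header = pack_rtp_header(payload_type=0, seq=1, timestamp=160, ssrc=0x1234)
fields = unpack_rtp_header(header)
print(len(header), fields)
```

The sequence number and timestamp are what allow a receiver to detect loss, reorder packets, and schedule playback.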
Benefits for the enterprise
Originally regarded as a novelty, VoIP technology is attracting
more and more users worldwide because of the benefits it offers
to the enterprise, service provider, and ultimately, the consumer.
Savings: Users can bypass long-distance carriers and their
per-minute usage rates and run their voice traffic over the
Internet for a flat monthly Internet-access fee.
IP networks can be significantly less expensive to operate
and maintain. The simplified network infrastructure of an
Internet Telephony solution cuts costs by connecting IP phones
over the LAN wiring system and eliminates the need for dual cabling.
Internet Telephony can also eliminate toll charges on site-to-site
calls via global four-digit dialing. And, by using the extra
bandwidth on your WAN for IP Telephony, you leverage the untapped
capabilities of your existing data infrastructure to maximize
the return on your current network investment.
Mobility and flexibility: Employees are no longer confined by geographic
location. IP telephones work anywhere on the network, even
over a remote connection. Service can be extended to remote
sites and home offices over cost-effective IP links.
Simplicity and consistency: A common approach to service
deployment can allow cost-savings with the use of common management
tools, resource directories, and a consistent approach to
The ability to network existing PBXs using IP can bring new
value to the enterprise. For example, the ability to consolidate
voice mail onto a single system, or to fewer systems, makes
it easier for voice mail users to network.
Ubiquity: Internet Telephony is supported over a wide variety
of transport technologies. A user can gain access to just
about any business system, whether it's through an analog
line, a DSL line, a LAN, Frame Relay, ATM (Asynchronous Transfer
Mode), SONET or wireless.
Agility: You can add new services and users to the network
with fewer burdens on the existing system. This can pave the
path for more revenue-earning possibilities.
Benefits for the service provider
SPs (Service Providers) can look forward to enterprises outsourcing
their data and voice requirements through them. An existing
service provider can benefit in many ways.
Network capital cost: The SP has already established a
MAN or a WAN for data transport. Adding voice services does
not require heavy capital expenditure. Packet infrastructure
is cheaper than switching infrastructure and offers better
granularity and flexibility than circuit switches. Moreover,
a distributed switching infrastructure can provide modularity.
Transport cost: Since SPs can add voice services over
the same network, the cost of data transport will increase
in the beginning. But with economies of scale due to business
volumes, the cost will fall substantially. An SP can also
interconnect with other major PSTN carriers to extend its
reach and use minutes at a competitive rate.
Bandwidth cost: SPs can control bandwidth cost in many ways.
They can use compression techniques to increase capacity, billing
can be based on usage, and the flexibility of IP can be used
to get efficient bandwidth prioritization.
Operational cost: Operational costs will not increase very much because
skill sets required to manage the VoIP infrastructure are
common. You don't need specialized VoIP personnel in your
organization. Some vendors even promise that you can maintain
unmanned sites in your network.
New services: An SP can offer a range of new services to the
consumer like long distance calls, calling cards, Voice VPN,
UM (Unified Messaging), and mobile services.
Converged networking environment
VoIP's core benefit is its ability to make the next-generation
converged network a reality. In a converged network environment,
telephony and data signals are transmitted as packets over
the data network.
A typical office has separate networks for data transmission
and voice (telephone). The data transmission network uses
switches to connect workstations and segment them. The switches
are connected to a router, which in turn connects to the corporate
WAN or an SP's WAN. The voice network in the office
has a PBX on which telephone lines from the local exchange
terminate. All the telephones in the office are connected
to the PBX.
A converged network can enable you to transmit voice over
your existing data network. Resources that have traditionally been
restricted to data can now be used for telephony. This maximizes
the efficiency of your network. The traditional voice circuits
can be used as backup or even eliminated.
It simplifies your network architecture. A single infrastructure
is capable of carrying both data and telephony traffic. You
don't need to pull separate cables for services. This approach
reduces repair time and streamlines network installations.
Network deployments and reconfigurations are simplified, and
service can be extended to remote sites and home offices over
cost-effective IP links.
A future VoIP network will include IP-based PBXs (iPBXs),
which will emulate the functions of a traditional PBX. These
will allow both standard telephones and multimedia PCs to
connect to either the PSTN or the Internet, providing a seamless
migration path to VoIP. An iPBX can also combine the features
of today's switches and routers and could become the gateway
for a variety of value-added services like directories, message
stores, firewalls and other network-based servers. Such a
VoIP system would also combine real-time and non real-time
QoS is a major concern for the converging network. It's simple
economics really. If you give people better quality in any
dimension, such as better sound quality or better application
availability, they will spend more time using your applications.
Although QoS usually refers to the fidelity of the transmitted
voice and facsimile documents, it can also be applied to network
availability, use of value added features like conferencing
and calling number display, and scalability.
There are three factors that can profoundly impact the QoS.
They are delay, jitter, and packet loss.
High end-to-end delay in a voice network gives rise to echo
and talker overlap. Echo becomes a problem when the round-trip
delay is more than 50 milliseconds. Since echo is perceived
as a significant quality problem, VoIP systems must address
the need for echo control and implement some means of echo
cancellation. Talker overlap (the problem of one caller stepping
on the other talker's speech) becomes significant if the one-way
delay becomes greater than 250 milliseconds. The end-to-end
delay budget is therefore the major constraint and drives
the requirement for reducing delay through a packet network.
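A delay budget can be checked against the two thresholds quoted above with a small helper. The component names and their millisecond values below are hypothetical, and the round-trip delay is assumed to be simply twice the one-way delay.

```python
ECHO_ROUND_TRIP_LIMIT_MS = 50      # threshold from the text
OVERLAP_ONE_WAY_LIMIT_MS = 250     # threshold from the text


def assess_delay(component_delays_ms: dict) -> dict:
    """Sum one-way delay components and compare against both thresholds."""
    one_way = sum(component_delays_ms.values())
    round_trip = 2 * one_way       # assumption: symmetric path
    return {
        "one_way_ms": one_way,
        "round_trip_ms": round_trip,
        "needs_echo_cancellation": round_trip > ECHO_ROUND_TRIP_LIMIT_MS,
        "talker_overlap_risk": one_way > OVERLAP_ONE_WAY_LIMIT_MS,
    }


# Hypothetical one-way budget, in milliseconds.
budget = {"codec": 25, "packetization": 20, "network": 60, "jitter_buffer": 40}
report = assess_delay(budget)
print(report)
```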
A technique called silence suppression detects whenever
there is a gap in the speech and suppresses the transfer of
things like pauses, breaths, and other periods of silence.
This can amount to 50-60 percent of the time of a call, resulting
in considerable bandwidth conservation.
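One simple way to picture silence suppression is an energy threshold applied to each short frame of samples. Real voice activity detectors are more elaborate than this; the threshold value and the sample frames here are arbitrary assumptions for illustration.

```python
def suppress_silence(frames, threshold=500.0):
    """Return (index, frame) pairs for frames loud enough to be worth sending.

    A frame is a non-empty list of signed 16-bit style sample values; its
    "energy" here is simply the mean absolute amplitude.
    """
    kept = []
    for index, frame in enumerate(frames):
        energy = sum(abs(sample) for sample in frame) / len(frame)
        if energy > threshold:
            kept.append((index, frame))
    return kept


frames = [
    [1000, -1200, 900],   # speech
    [3, -2, 4],           # pause
    [800, 700, -950],     # speech
    [0, 1, 0],            # pause
]
sent = suppress_silence(frames)
saved = 1 - len(sent) / len(frames)
print([index for index, _ in sent], f"{saved:.0%} of frames suppressed")
```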
Jitter is the variation in inter-packet arrival time due to
variable transmission delay over the network. Removing jitter
requires collecting packets and holding them long enough to
allow the slowest packets to arrive in time to be played in
the correct sequence. This causes additional delay. The jitter
buffers add delay, which is used to remove the packet delay
variation that each packet is subjected to as it transits
the packet network.
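The trade-off between buffer depth and late packets can be sketched with a fixed jitter buffer. This sketch assumes one packet per 20 ms frame, numbers packets from zero, and treats any packet that arrives after its scheduled playout time as lost; the arrival times are invented.

```python
def playout_schedule(arrivals_ms, frame_ms=20, buffer_ms=60):
    """Return (seq, playout_time) pairs; playout_time is None for late packets.

    arrivals_ms[i] is the arrival time of packet i. Playback of packet 0 is
    delayed by buffer_ms, and each later packet is due one frame after the
    previous one.
    """
    first_playout = arrivals_ms[0] + buffer_ms
    schedule = []
    for seq, arrival in enumerate(arrivals_ms):
        playout = first_playout + seq * frame_ms
        schedule.append((seq, playout if arrival <= playout else None))
    return schedule


arrivals = [30, 75, 62, 140, 115]          # hypothetical arrival times in ms
deep = playout_schedule(arrivals, buffer_ms=60)
shallow = playout_schedule(arrivals, buffer_ms=20)
print(deep)
print(shallow)
```

With the 60 ms buffer every packet is played; with the 20 ms buffer, packets 1 and 3 arrive too late. A deeper buffer hides more jitter but adds directly to the end-to-end delay.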
IP networks cannot guarantee that packets will not be lost
or that they will be delivered
in order. Packets may be dropped under peak loads and during
periods of congestion caused by link failures or inadequate
capacity. Due to the time sensitivity of voice transmissions,
the normal TCP-based re-transmission schemes are not suitable.
Packet losses greater than 10 percent are generally not acceptable.
There are three techniques that can be used (separately or
in combination) to improve network QoS:
Controlling networking environment: You have to provide a
controlled networking environment in which the capacity can
be pre-planned and adequate performance can be assumed. This
would generally be the case with a private IP network or an
Intranet that is owned and operated by a single organization.
Using management tools: You can use management tools to configure
the network nodes, monitor performance, and manage capacity
and flow on a dynamic basis. Traffic can be prioritized by
location, by protocol, or by application type. This allows
real-time traffic to be given precedence over non-critical
traffic. Queuing mechanisms can also be manipulated to minimize
delays for real-time data flows.
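Giving real-time traffic precedence can be pictured as a strict-priority queue, sketched below with Python's `heapq`. The class name and the two traffic classes are illustrative assumptions; real routers and switches offer several queuing disciplines and usually limit how long low-priority traffic can be starved.

```python
import heapq
from itertools import count


class PriorityScheduler:
    """Strict-priority queue: voice always leaves before waiting data."""

    PRIORITY = {"voice": 0, "data": 1}   # lower number is served first

    def __init__(self):
        self._heap = []
        self._order = count()            # keeps arrival order within a class

    def enqueue(self, kind: str, packet: str) -> None:
        heapq.heappush(self._heap, (self.PRIORITY[kind], next(self._order), packet))

    def dequeue(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


scheduler = PriorityScheduler()
for kind, packet in [("data", "d1"), ("voice", "v1"), ("data", "d2"), ("voice", "v2")]:
    scheduler.enqueue(kind, packet)

sent_order = [scheduler.dequeue() for _ in range(len(scheduler))]
print(sent_order)   # ['v1', 'v2', 'd1', 'd2']
```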
Adding control protocols and mechanisms: You can add control
protocols and mechanisms that help avoid or alleviate the
problems inherent in IP networks. Protocols like RTP (Real-time
Transport Protocol) and RSVP can also be used to provide greater
assurances of controlled QoS within the network. Other mechanisms
like admission controls and traffic shaping may also be used
to avoid overloading a network.
Migrating from an Ethernet LAN gives rise to a few delay issues.
Ethernet frames are variable in length, and Ethernet has no
mechanism for prioritizing one frame over another. Therefore,
as network traffic increases, small frames carrying a voice
payload may often have to wait in switch buffer queues behind
large frames carrying data. With voice having a small delay
tolerance, the lack of prioritization across a switched Ethernet
network may degrade the quality of voice communications.
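To get a feel for the size of this effect, the sketch below estimates how long a voice frame waits behind a queue of large data frames. The 1500-byte frame size, the queue depth, and the link speeds are assumptions chosen for illustration, and the calculation ignores preamble and inter-frame gaps.

```python
def serialization_delay_ms(frame_bytes: int, link_mbps: float) -> float:
    """Time needed to put one frame onto the wire, in milliseconds."""
    return frame_bytes * 8 / (link_mbps * 1000)


def queue_wait_ms(frames_ahead: int, frame_bytes: int, link_mbps: float) -> float:
    """Waiting time for a frame queued behind frames_ahead equal-sized frames."""
    return frames_ahead * serialization_delay_ms(frame_bytes, link_mbps)


for mbps in (10, 100, 1000):
    wait = queue_wait_ms(frames_ahead=5, frame_bytes=1500, link_mbps=mbps)
    print(f"{mbps} Mbps link: {wait:.2f} ms behind five 1500-byte frames")
```

On a 10 Mbps link each large frame costs about 1.2 ms, so a handful of queued frames at every hop adds up quickly against a voice delay budget.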
The most promising solution is to handle the problem at Layer
3 via the RSVP (Resource ReSerVation Protocol). RSVP operates
by reserving bandwidth and router/switch buffer space for
certain high priority IP packets like those carrying voice
traffic. In effect, RSVP enables a packet-switched network
to mimic some of the characteristics of a circuit-switched network.
RSVP is still only able to set up paths for high priority
traffic on a 'best effort' basis, and thus it cannot guarantee
the delay characteristics of the network. Furthermore, as
an OSI Layer 3 protocol, its support requires routing functionality
to be added to switches.
Fast Ethernet and Gigabit Ethernet present a clearer migration
path than ATM. Like shared Ethernet, switched Ethernet still
uses a collision-based mechanism, but it is a viable medium
for voice because there is only one desktop talking on a given segment.
Migrating from an ATM network to a converged VoIP infrastructure
may be simple in principle because ATM was designed specifically
to support both voice and data traffic over a common infrastructure.
It also provides multiple QoS levels. ATM's good QoS model
is the basis for defining QoS for all other solutions like
IP. It provides support for a variety of needs through a choice
of three bandwidth allocation mechanisms, namely CBR (Constant
Bit Rate), VBR (Variable Bit Rate), and ABR (Available Bit Rate).
The sophistication and complexity of ATM make it a costly
option and often not as cost-effective as Ethernet or
IP. Other protocol issues and problems are quickly being resolved
through the work of industry groups like the IETF.
VoIP is on its way. And while the higher-ups in the management
are yet to pass your VoIP procurement budget, you can do a
few things to prepare your existing network infrastructure
for the inclusion of new technology.
The priority is to make your LAN (Local Area Network) as efficient
as possible. This can be accomplished in several ways. The
most important way is to get rid of unnecessary protocols.
NetWare supports IP now, Windows 2000 no longer requires NetBIOS
for file and print traffic, and most mainframes now support
IP. Although it may require considerable effort to remove
IPX, SNA, NetBIOS over TCP, and other protocols, it will result
in more available bandwidth on the network, more available
memory, and faster response time in the clients and servers.
Once you have removed all these protocols, you may find that
the level of broadcasts on your network doesn't justify such
small subnets anymore. Moving to larger, 'flat' networks will
allow you to remove routers that are no longer needed. These
routers are not only points of failure, but are frequently
bottlenecks, and usually a major source of delay. With them gone, your
VoIP traffic will be more reliable and of higher quality.
QoS is a major requirement for VoIP. So why wait until the
last minute to QoS-enable your network? Since configuring QoS
and configuring VoIP are both relatively complex tasks, you don't
want to be doing them at the same time if you don't have to.
By working the bugs out of your QoS configuration now, your
VoIP project will be less likely to encounter glitches.
And last, it's never too early to begin planning. Among the
things you can do before you even pick a vendor are:
Identify the types of traffic on your network and prioritize
them. Voice may actually not be the most important.
Determine existing call-traffic statistics and predict future
statistics, including cost, average simultaneous calls,
average duration, and source/destination pairs.
Prepare your network management system for VoIP, including
upgrading your RMON probe and protocol analyzers to recognize
and decode VoIP.
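The call-traffic statistics mentioned in the checklist above can be derived from ordinary call records. The sketch below assumes each record is a (start time, duration) pair in seconds, which is a simplification of what a PBX call log usually contains; the sample records are invented.

```python
def average_simultaneous_calls(calls, window_s: float) -> float:
    """Total call-seconds divided by the observation window (offered load in Erlangs)."""
    return sum(duration for _, duration in calls) / window_s


def peak_simultaneous_calls(calls) -> int:
    """Largest number of calls in progress at the same moment."""
    events = []
    for start, duration in calls:
        events.append((start, 1))               # call begins
        events.append((start + duration, -1))   # call ends
    active = peak = 0
    for _, delta in sorted(events):             # ends sort before starts at equal times
        active += delta
        peak = max(peak, active)
    return peak


# Hypothetical records: (start second, duration in seconds) within a 10-minute window.
calls = [(0, 300), (60, 120), (100, 200), (400, 60)]
average = average_simultaneous_calls(calls, window_s=600)
peak = peak_simultaneous_calls(calls)
print(f"average {average:.2f} simultaneous calls, peak {peak}")
```

Multiplying the peak by the per-call bit rate of the chosen codec gives a first estimate of the voice bandwidth the data network will have to carry.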
Soutiman Das Gupta can be reached at firstname.lastname@example.org