VoIP: Packaged voice and more

VoIP technology offers tremendous benefits for enterprises. Understanding how VoIP works, and the protocols/issues involved, is the first step towards evaluating this technology.

by Soutiman Das Gupta

IP has become an accepted standard for communications over data networks worldwide, and enterprises have successfully implemented it. In 1995, IP found a new passenger, which it had to transport between networks and devices—Voice. Voice can now be packetized and sent over a data network using VoIP (Voice over Internet Protocol) technology. This revolutionary technology is now setting off a new trend—the convergence of voice and data networks. The convergence has also paved the way for a wide array of new applications.

In India, restrictions on IP Telephony services are about to be lifted and enterprises will soon be able to fully leverage these services. As an immediate benefit, an enterprise can cut costs on long distance calls. Other benefits may go beyond the monthly phone bill. What's more, being a bit late to adopt, we can learn from the limitations faced by early adopters of the technology in other countries. This will help us introduce a much-evolved, stable, and reliable architecture in our backyard.

To understand VoIP and its packet-switching nature, we have to take a closer look at circuit-switching and the PSTN (Public Switched Telephone Network) architecture.

PSTN circuit switching

In a traditional PSTN telephony architecture, when a phone call is made to a particular number, a dedicated channel is assigned for the connection. This channel/circuit typically uses 64 Kbps of bandwidth in each direction, for a total transmission rate of 128 Kbps. The voice traffic passes through the carrier's switches at the caller's and receiver's ends, and all the switch ports that support the connection remain in use for the duration of the call. TDM (Time Division Multiplexing), a technology that carries multiple signals over a single transmission path by giving each its own time slot, is used to accommodate more connections within the limited capacity.

Circuit-switching has many disadvantages. Let's say that you talk for 10 minutes. Since the bandwidth used is 128 Kbps (or 16 KB per second), the total transmission over the 600 seconds of the conversation is 9600 KB, or roughly 9.4 MB. During this time, the circuit is continuously open between the two phones. In a typical voice conversation, while you talk, the other party listens. This means that only half of the connection is in use at any given time.

There are also lots of 'gaps' or silent periods when neither party is talking. Since the circuit is dedicated, the bandwidth reserved during the silent periods is wasted. If only the half of the connection that is actually in use were transmitted, the 9.4 MB would in effect be cut in half, down to about 4.7 MB. Likewise, the switch ports at the local exchanges are dedicated for the entire duration of the call regardless of the silence. This makes bandwidth provisioning difficult and time-consuming.
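The arithmetic behind these figures can be checked with a few lines of Python. This is only a sketch of the calculation in the text; it follows the article in treating 1 KB as 1000 bytes for the per-second rate and 1 MB as 1024 KB for the total.

```python
# Worked arithmetic for the 10-minute circuit-switched call described above.
RATE_KBPS_PER_DIRECTION = 64   # figure from the text
DIRECTIONS = 2                 # the circuit is open both ways
CALL_SECONDS = 10 * 60

total_kbps = RATE_KBPS_PER_DIRECTION * DIRECTIONS   # 128 Kbps
kilobytes_per_second = total_kbps / 8                # 16 KB per second
total_kb = kilobytes_per_second * CALL_SECONDS       # 9600 KB
total_mb = total_kb / 1024                           # roughly 9.4 MB
half_mb = total_mb / 2                               # roughly 4.7 MB, one talker at a time

print(f"{total_kb:.0f} KB, {total_mb:.1f} MB, half: {half_mb:.1f} MB")
```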

The PSTN switching infrastructure may not be able to handle high traffic easily and may buckle. The carrier is not able to cut down on operations costs and consumers do not get any cost benefits. Also, billing is based on time and distance—not on amount of traffic.

It is not easy to initiate new services and/or incorporate any changes in a PSTN network. Capital costs are high which is partly due to the over-engineering required to support peak-time traffic. You cannot integrate the network well with multimedia applications and existing voice margins are very flat. You need to have different networks for different services.

Circuit switching versus packet switching
Circuit switching | Packet switching
Uses TDM | Uses statistical multiplexing
Inflexible: does not allow new services to be included easily | Flexible: allows new services to be integrated easily
Data is switched | Data is routed
Connection-oriented | Connectionless
64 Kbps in each direction | Variable length; can be compressed to 8 Kbps
Billing according to time and distance | Billing according to usage

Packet switching
VoIP uses packet switching—a technology commonly used by data networks—instead of circuit switching. While circuit switching keeps the connection open and constant, packet switching opens the connection just long enough to send a small data packet, from one system to another. Each packet in a packet switched network contains a destination address. This allows all packets in a single message to be dynamically routed on different paths over the network depending on availability. The destination computer reassembles the packets back into their proper sequence.
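The split, route, and reassemble cycle described above can be sketched in a few lines. This is an illustrative toy, not how any particular VoIP stack works: the `Packet` class, its `seq` field, and the 4-byte packet size are assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class Packet:
    dest: str       # destination address carried by every packet
    seq: int        # sequence number (assumed field) used to restore order
    payload: bytes


def packetize(message: bytes, dest: str, size: int = 4) -> list[Packet]:
    """Split a message into fixed-size packets, each with its own address."""
    return [Packet(dest, i, message[off:off + size])
            for i, off in enumerate(range(0, len(message), size))]


def reassemble(packets: list[Packet]) -> bytes:
    """Put packets back into their original sequence at the destination."""
    return b"".join(p.payload for p in sorted(packets, key=lambda p: p.seq))


packets = packetize(b"hello, packet-switched world", dest="10.0.0.2")
random.shuffle(packets)          # simulate packets arriving via different paths
restored = reassemble(packets)
```

Because every packet carries its own address and sequence number, the arrival order does not matter to the receiver.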

A technique called statistical multiplexing is used to accommodate more signals in a channel. This technique analyzes the traffic and dynamically changes its pattern of interleaving to use all the available capacity of an outgoing channel.

Packet switching minimizes the connection time between two systems and reduces the load on the network. It frees up the two systems communicating with each other so that they can accept information from other systems simultaneously. Several telephone calls can occupy the amount of space used by only one call in a circuit-switched network. And the size of each call can be further reduced with the use of data compression.
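A rough calculation shows how many compressed calls fit in the space of one 64 Kbps circuit. The 64 Kbps and 8 Kbps figures come from the comparison table above; the 20 ms packet interval and the 40 bytes of IPv4, UDP, and RTP headers per packet are assumptions added here to show that header overhead eats into the gain.

```python
CIRCUIT_KBPS = 64             # one direction of a circuit-switched call
CODEC_KBPS = 8                # compressed voice rate from the table
PACKET_INTERVAL_MS = 20       # assumption: one packet every 20 ms
HEADER_BYTES = 20 + 8 + 12    # assumption: IPv4 + UDP + RTP headers

payload_bytes = CODEC_KBPS * 1000 * PACKET_INTERVAL_MS // 1000 // 8
packets_per_second = 1000 // PACKET_INTERVAL_MS
wire_kbps = (payload_bytes + HEADER_BYTES) * 8 * packets_per_second / 1000

calls_payload_only = CIRCUIT_KBPS / CODEC_KBPS   # ignoring headers
calls_with_headers = CIRCUIT_KBPS / wire_kbps    # counting headers

print(f"payload only: {calls_payload_only:.1f} calls, "
      f"with headers: {calls_with_headers:.2f} calls")
```

Silence suppression and header compression, which this sketch ignores, can improve the ratio further.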

Standards everywhere

Like any other technology, VoIP is defined by standards. There are various standards out in the VoIP arena like H.323, SIP, and MGCP (Media Gateway Control Protocol). A better way to understand VoIP is to get a lowdown on these standards.

Widely used H.323

The H.323 specification was ratified in May 1996 by the ITU (International Telecommunication Union). It defines how voice, data, and video traffic will be transported over IP-based LANs and WANs.

H.323 is actually a suite of protocols developed for specific applications. Recommendations in the H.323 series include H.225.0 packetization and synchronization, H.245 control, H.261 and H.263 video codecs, G.711, G.722, G.728, G.729, and G.723 audio codecs, and the T.120 series of multimedia communication protocols. A codec, which stands for coder-decoder, converts an audio signal into a compressed digital form for transmission and back into an uncompressed audio signal for replay.

A little SIP

SIP (Session Initiation Protocol) is an IETF (Internet Engineering Task Force) signaling protocol for establishing real-time calls and conferences over IP networks. Each session may include different types of data like audio and video, although currently most of the SIP extensions address audio communication.

As a traditional text-based Internet protocol, it resembles HTTP (Hypertext Transfer Protocol) and SMTP (Simple Mail Transfer Protocol). SIP uses SDP (Session Description Protocol) for media description. SIP's basic architecture is client/server in nature. The main entities in SIP are the User Agent, the SIP Proxy Server, the SIP Redirect Server and the Registrar.
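To make the text-based nature of SIP concrete, the sketch below assembles a minimal INVITE request with an SDP body. It is an illustration only: the addresses, tag, branch, and Call-ID values are made up, and a real user agent would generate unique values and handle many more headers.

```python
def build_invite(caller: str, callee: str, caller_ip: str, rtp_port: int) -> str:
    """Assemble a minimal SIP INVITE carrying an SDP media description."""
    sdp = "\r\n".join([
        "v=0",
        f"o=- 0 0 IN IP4 {caller_ip}",
        "s=call",
        f"c=IN IP4 {caller_ip}",
        "t=0 0",
        f"m=audio {rtp_port} RTP/AVP 0",   # payload type 0 is G.711 mu-law
    ]) + "\r\n"
    headers = [
        f"INVITE sip:{callee} SIP/2.0",    # request line, much like HTTP
        f"Via: SIP/2.0/UDP {caller_ip};branch=z9hG4bK-example",
        "Max-Forwards: 70",
        f"From: <sip:{caller}>;tag=1234",  # hypothetical tag value
        f"To: <sip:{callee}>",
        f"Call-ID: example-call-id@{caller_ip}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{caller}>",
        "Content-Type: application/sdp",
        f"Content-Length: {len(sdp.encode())}",
    ]
    return "\r\n".join(headers) + "\r\n\r\n" + sdp


invite = build_invite("alice@example.com", "bob@example.com", "192.0.2.10", 49170)
print(invite)
```

As with HTTP, the headers are plain text lines separated from the body by a blank line, which is what makes SIP messages easy to read and debug.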

SIP is independent of the packet layer. The protocol is an open standard and is scalable. It has been designed to be a general-purpose protocol. However, extensions to SIP are needed to make the protocol truly functional in terms of interoperability. The protocol also enables personal mobility by providing the capability to reach a called party at a single, location-independent address.

Both SIP and H.323 define mechanisms for call routing, call signaling, capabilities exchange, media control, and supplementary services. SIP is a new protocol that promises scalability, flexibility and ease of implementation when building complex systems. H.323 is an established protocol that has been widely used because of its manageability, reliability and interoperability with PSTN.

Master/slave MGCP

MGCP (Media Gateway Control Protocol) is a master/slave protocol that provides tight coupling between the MG (Media Gateway), which is the endpoint, and the MGC (Media Gateway Controller) server. MGCP-based VoIP solutions separate call control (signaling) intelligence from media handling.

Like SIP and H.323, it relies on a variety of other existing protocols like SDP for describing the media aspects of a call, and RTP/RTCP (Real-time Transport Protocol/RTP Control Protocol), which are used by MGs for handling the real-time transport of media streams.
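Every RTP packet starts with a fixed 12-byte header that carries the sequence number and timestamp a receiver needs to reorder and play out voice. The sketch below packs and unpacks that fixed header, assuming no padding, no header extension, and no CSRC entries; the function names are chosen for this example.

```python
import struct

RTP_VERSION = 2


def pack_rtp_header(payload_type: int, seq: int, timestamp: int,
                    ssrc: int, marker: bool = False) -> bytes:
    """Pack the fixed 12-byte RTP header (no padding, extension, or CSRCs)."""
    byte0 = RTP_VERSION << 6
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack("!BBHII", byte0, byte1,
                       seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def unpack_rtp_header(data: bytes) -> dict:
    """Decode the fixed header fields from the first 12 bytes of a packet."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", data[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


header = pack_rtp_header(payload_type=0, seq=1, timestamp=160, ssrc=0x1234ABCD)
fields = unpack_rtp_header(header)
```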

Benefits for the enterprise

Originally regarded as a novelty, VoIP technology is attracting more and more users worldwide because of the benefits it offers to the enterprise, service provider, and ultimately, the consumer.

Cost Savings: Users can bypass long-distance carriers and their per-minute usage rates and run their voice traffic over the Internet for a flat monthly Internet-access fee.

IP networks can be significantly less expensive to operate and maintain. The simplified network infrastructure of an Internet Telephony solution cuts costs by connecting IP phones over the LAN wiring system and eliminates the need for dual cabling.

Internet Telephony can also eliminate toll charges on site-to-site calls via global four-digit dialing. And, by using the extra bandwidth on your WAN for IP Telephony, you leverage the untapped capabilities of your existing data infrastructure to maximize the return on your current network investment.

Portability and flexibility: Employees are no longer confined by geographic location. IP telephones work anywhere on the network, even over a remote connection. Service can be extended to remote sites and home offices over cost-effective IP links.

Simplicity and consistency: A common approach to service deployment can allow cost-savings with the use of common management tools, resource directories, and a consistent approach to network security.

The ability to network existing PBXs using IP can bring new values to the enterprise. For example, the ability to consolidate voice mail onto a single system, or to fewer systems, makes it easier for voice mail users to network.

Ubiquity: Internet Telephony is supported over a wide variety of transport technologies. A user can gain access to just about any business system, whether it's through an analog line, a DSL line, a LAN, Frame Relay, ATM (Asynchronous Transfer Mode), SONET or wireless.

Operational Agility: You can add new services and users to the network with fewer burdens on the existing system. This can pave the path for more revenue-earning possibilities.

Benefits for the service provider

SPs (Service Providers) can look forward to enterprises outsourcing their data and voice requirements through them. An existing service provider can benefit in many ways.

Low network capital cost: The SP has already established a MAN or a WAN for data transport. Adding voice services does not require heavy capital expenditure. Packet infrastructure is cheaper than switching infrastructure and offers better granularity and flexibility than circuit switches. Moreover, a distributed switching infrastructure can provide modularity.

Transport cost: Since SPs can add voice services over the same network, the cost of data transport will increase in the beginning. But with economies of scale due to business volumes, the cost will reduce substantially. An SP can also interconnect with other major PSTN carriers to extend its reach and use minutes at a competitive rate.

Bandwidth cost: An SP can control bandwidth cost in many ways. It can use compression techniques to increase capacity, base billing on usage, and use the flexibility of IP to prioritize bandwidth efficiently.

Operational Cost: Operational costs will not increase very much because skill sets required to manage the VoIP infrastructure are common. You don't need specialized VoIP personnel in your organization. Some vendors even promise that you can maintain unmanned sites in your network.

New services: An SP can offer a range of new services to the consumer like long distance calls, calling cards, Voice VPN, UM (Unified Messaging), and mobile services.

A converged networking environment

VoIP's core benefit is its ability to make the next-generation converged network a reality. In a converged network environment, telephony and data signals are transmitted as packets over the data network.

A typical office has separate networks for data transmission and voice (telephone). The data transmission network uses switches to connect workstations and segment them. The switches are connected to a router, which in turn connects to the corporate WAN or an SP's WAN. The voice network in the office has a PBX on which telephone lines from the local exchange terminate. All the telephones in the office are connected to the PBX.

A converged network can enable you to transmit voice over the existing data network. Resources that have traditionally been restricted to data can now be used for telephony. This maximizes the efficiency of your network. The traditional voice circuits can be used as backup or even eliminated.

It simplifies your network architecture. A single infrastructure is capable of carrying both data and telephony traffic. You don't need to pull separate cables for services. This approach reduces repair time and streamlines network installations and reconfigurations.

Network deployments and reconfigurations are simplified, and service can be extended to remote sites and home offices over cost-effective IP links.

A future VoIP network will include IP-based PBXs (iPBXs), which will emulate the functions of a traditional PBX. These will allow both standard telephones and multimedia PCs to connect to either the PSTN or the Internet, providing a seamless migration path to VoIP. An iPBX can also combine the features of today's switches and routers and could become the gateway for a variety of value-added services like directories, message stores, firewalls and other network-based servers. Such a VoIP system would also combine real-time and non-real-time communications.

QoS factors

QoS is a major concern for the converging network. It's simple economics really. If you give people better quality in any dimension like better sound quality, and better application availability, they will spend more time using your applications and services.

Although QoS usually refers to the fidelity of the transmitted voice and facsimile documents, it can also be applied to network availability, use of value added features like conferencing and calling number display, and scalability.

There are three factors that can profoundly impact the QoS. They are delay, jitter, and packet loss.

High end-to-end delay in a voice network gives rise to echo and talker overlap. Echo becomes a problem when the round-trip delay is more than 50 milliseconds. Since echo is perceived as a significant quality problem, VoIP systems must address the need for echo control and implement some means of echo cancellation. Talker overlap (the problem of one caller stepping on the other talker's speech) becomes significant if the one-way delay becomes greater than 250 milliseconds. The end-to-end delay budget is therefore the major constraint and drives the requirement for reducing delay through a packet network.
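The two thresholds above lend themselves to a simple delay-budget check. The sketch below sums per-component one-way delays and compares them with the 50 ms round-trip and 250 ms one-way limits from the text. The component names and values are hypothetical, and the round trip is assumed to be twice the one-way delay.

```python
ECHO_ROUND_TRIP_MS = 50     # echo becomes a problem above this (from the text)
OVERLAP_ONE_WAY_MS = 250    # talker overlap becomes significant above this


def assess_delay(components_ms: dict[str, float]) -> dict[str, object]:
    """Sum one-way delay components and flag the two problems named above."""
    one_way = sum(components_ms.values())
    round_trip = 2 * one_way            # assumes a symmetric path
    return {
        "one_way_ms": one_way,
        "round_trip_ms": round_trip,
        "needs_echo_cancellation": round_trip > ECHO_ROUND_TRIP_MS,
        "talker_overlap_risk": one_way > OVERLAP_ONE_WAY_MS,
    }


# Hypothetical budget for one direction of a call, in milliseconds.
budget = {"codec": 25, "packetization": 20, "network": 60, "jitter_buffer": 40}
report = assess_delay(budget)
print(report)
```

With these example numbers the call stays under the talker-overlap limit but still needs echo cancellation, which matches the article's point that echo control is required in practice.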

A technique, called silence suppression, detects whenever there is a gap in the speech and suppresses the transfer of things like pauses, breaths, and other periods of silence. This can amount to 50-60 percent of the time of a call, resulting in considerable bandwidth conservation.
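One simple way to detect silence is to compare the average energy of each short audio frame against a threshold and skip frames that fall below it. The sketch below uses that approach; the threshold value and the tiny four-sample frames are assumptions for illustration, and production detectors are considerably more sophisticated.

```python
def frame_energy(frame: list[int]) -> float:
    """Mean squared amplitude of one frame of audio samples."""
    return sum(s * s for s in frame) / len(frame)


def suppress_silence(frames: list[list[int]],
                     threshold: float = 1000.0) -> list[tuple[int, list[int]]]:
    """Keep only frames loud enough to be speech, tagged with their index."""
    return [(i, f) for i, f in enumerate(frames) if frame_energy(f) >= threshold]


# Hypothetical 16-bit samples: frames 0 and 2 are near-silent, 1 and 3 are speech.
frames = [
    [0, 3, -2, 1],
    [900, -1200, 1100, -800],
    [1, 0, -1, 2],
    [700, -650, 820, -900],
]
kept = suppress_silence(frames)
saved_fraction = 1 - len(kept) / len(frames)
```

Keeping the frame index alongside each retained frame lets the receiver know where the gaps were, so it can fill them with comfort noise or silence.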

Jitter is the variation in inter-packet arrival time due to variable transmission delay over the network. Removing jitter requires collecting packets in a jitter buffer and holding them long enough to allow the slowest packets to arrive in time to be played in the correct sequence. The buffer therefore trades additional delay for the removal of the delay variation that each packet picks up as it transits the packet network.
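The trade-off between jitter and buffer delay can be illustrated with a short sketch. The jitter estimate uses the running-average formula from the RTP specification (RFC 3550), and the fixed playout buffer is a simplification. The timestamps are made-up values, and sender and receiver clocks are assumed to be synchronized.

```python
def interarrival_jitter(send_ms: list[float], recv_ms: list[float]) -> float:
    """Running jitter estimate: J += (|D| - J) / 16 for each packet pair."""
    jitter = 0.0
    for i in range(1, len(send_ms)):
        d = (recv_ms[i] - recv_ms[i - 1]) - (send_ms[i] - send_ms[i - 1])
        jitter += (abs(d) - jitter) / 16
    return jitter


def late_packets(send_ms: list[float], recv_ms: list[float],
                 buffer_ms: float) -> list[bool]:
    """A packet is late if it arrives after its playout time (send + buffer)."""
    return [r > s + buffer_ms for s, r in zip(send_ms, recv_ms)]


send = [0, 20, 40, 60]        # one packet every 20 ms
recv = [50, 75, 88, 130]      # transit times of 50, 55, 48 and 70 ms

jitter = interarrival_jitter(send, recv)
late_small_buffer = late_packets(send, recv, buffer_ms=60)
late_large_buffer = late_packets(send, recv, buffer_ms=70)
```

A 60 ms buffer drops the slowest packet, while a 70 ms buffer plays all four at the cost of 10 ms more delay on every packet.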

IP networks cannot guarantee that packets will not be lost or that they will be delivered in order. Packets may be dropped under peak loads and during periods of congestion caused by link failures or inadequate capacity. Because voice transmissions are time-sensitive, the normal TCP-based retransmission schemes are not suitable: a retransmitted packet would arrive too late to be played. Packet losses greater than 10 percent are generally not acceptable.

QoS solutions

There are three techniques that can be used (separately or in combination) to improve network QoS:

Controlling networking environment: You have to provide a controlled networking environment in which the capacity can be pre-planned and adequate performance can be assumed. This would generally be the case with a private IP network or an Intranet that is owned and operated by a single organization.

Using management tools: You can use management tools to configure the network nodes, monitor performance, and manage capacity and flow on a dynamic basis. Traffic can be prioritized by location, by protocol, or by application type. This allows real-time traffic to be given precedence over non-critical traffic. Queuing mechanisms can also be manipulated to minimize delays for real-time data flows.

Adding control protocols and mechanisms: You can add control protocols and mechanisms that help avoid or alleviate the problems inherent in IP networks. Protocols like RTP (Real-time Transport Protocol) and RSVP (Resource ReSerVation Protocol) can also be used to provide greater assurances of controlled QoS within the network. Other mechanisms like admission controls and traffic shaping may also be used to avoid overloading a network.
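Giving real-time traffic precedence over non-critical traffic can be pictured as a strict-priority queue, in which a voice packet always leaves before any waiting data packet. The sketch below is a toy model of that idea; the traffic classes, their ranking, and the `PriorityScheduler` name are assumptions, and real routers use more elaborate schemes to keep low-priority traffic from starving.

```python
import heapq
from itertools import count


class PriorityScheduler:
    """Strict-priority queue: lower rank is sent first, FIFO within a class."""

    PRIORITY = {"voice": 0, "video": 1, "data": 2}   # assumed traffic classes

    def __init__(self) -> None:
        self._heap: list[tuple[int, int, str]] = []
        self._arrival = count()    # tie-breaker that preserves arrival order

    def enqueue(self, traffic_class: str, packet: str) -> None:
        rank = self.PRIORITY[traffic_class]
        heapq.heappush(self._heap, (rank, next(self._arrival), packet))

    def dequeue(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


scheduler = PriorityScheduler()
for traffic_class, packet in [("data", "d1"), ("voice", "v1"),
                              ("data", "d2"), ("voice", "v2")]:
    scheduler.enqueue(traffic_class, packet)

send_order = [scheduler.dequeue() for _ in range(len(scheduler))]
```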

Migration issues
Migrating from an Ethernet LAN gives rise to a few delay issues. Ethernet frames are variable in length, and Ethernet has no mechanism for prioritizing one frame over another. Therefore, as network traffic increases, small frames carrying a voice payload may often have to wait in switch buffer queues behind large frames carrying data. With voice having a small delay tolerance, the lack of prioritization across a switched Ethernet network may degrade the quality of voice communications.

The most promising solution is to handle the problem at Layer 3 via the RSVP (Resource ReSerVation Protocol). RSVP operates by reserving bandwidth and router/switch buffer space for certain high priority IP packets like those carrying voice traffic. In effect, RSVP enables a packet-switched network to mimic some of the characteristics of a circuit-switched multiplexer network.

RSVP is still only able to set up paths for high priority traffic on a 'best effort' basis, and thus it cannot guarantee the delay characteristics of the network. Furthermore, as an OSI Layer 3 protocol, its support requires routing functionality to be added to switches.

Fast Ethernet and Gigabit Ethernet present a clearer migration path than ATM. Like shared Ethernet, switched Ethernet still uses a collision-based mechanism, but it is a viable medium for voice because there is only one desktop talking on a given LAN segment.

Migrating from an ATM network to a converged VoIP infrastructure may be simple in principle because ATM was designed specifically to support both voice and data traffic over a common infrastructure. It also provides multiple QoS levels. ATM's good QoS model is the basis for defining QoS for all other solutions like IP. It provides support for a variety of needs through a choice of three bandwidth allocation mechanisms, namely CBR (Constant Bit Rate), VBR (Variable Bit Rate), and ABR (Available Bit Rate).

The sophistication and complexity of ATM make it a costly option that is often not as cost-effective as Ethernet or IP. Other protocol issues and problems are quickly being resolved through the work of industry groups like the IETF.

Network warm-up exercises

VoIP is on its way. And while the higher-ups in the management are yet to pass your VoIP procurement budget, you can do a few things to prepare your existing network infrastructure for the inclusion of new technology.

The priority is to make your LAN (Local Area Network) as efficient as possible. This can be accomplished in several ways. The most important way is to get rid of unnecessary protocols. NetWare supports IP now, Windows 2000 no longer requires NetBIOS for file and print traffic, and most mainframes now support IP. Although it may require considerable effort to remove IPX, SNA, NetBIOS over TCP, and other protocols it will result in more available bandwidth on the network, more available memory, and faster response time in the clients and servers.

Once you have removed all these protocols, you may find that the level of broadcasts on your network doesn't justify such small subnets anymore. Moving to larger, 'flat' networks will allow you to remove routers that are no longer needed. These routers are not only points of failure, but are frequently bottlenecks, and usually a major source of delay. Without them, your VoIP traffic will be more reliable and of higher quality.

QoS is a major requirement for VoIP. So why wait until the last minute to QoS-enable your network? Since the configuration of QoS and VoIP are both relatively complex tasks, you don't want to be doing them at the same time if you don't have to. By working the bugs out of your QoS configuration now, your VoIP project will be less likely to encounter glitches.

And last, it's never too early to begin planning. Among the things you can do before you even pick a vendor are:

  • Identify the types of traffic on your network and prioritize them. Voice may actually not be the most important.
  • Determine existing call-traffic statistics and predict future statistics, including cost, average simultaneous calls, average duration, and source/destination pairs.
  • Prepare your network management system for VoIP, including upgrading your RMON probe and protocol analyzers to recognize and decode VoIP.
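The call-traffic statistics mentioned in the list above can be derived from ordinary call records. The sketch below assumes each record is a hypothetical (start time, duration) pair in seconds and computes the average duration, the average number of simultaneous calls over an observation window, and the peak number of simultaneous calls.

```python
def call_stats(calls: list[tuple[int, int]], window_s: int) -> dict[str, float]:
    """Summarize (start_s, duration_s) call records over an observation window."""
    total_call_seconds = sum(duration for _, duration in calls)

    # Sweep through call starts (+1) and ends (-1) in time order. Tuples sort
    # so that an end at a given instant is counted before a start at that instant.
    events = sorted([(start, 1) for start, _ in calls] +
                    [(start + duration, -1) for start, duration in calls])
    active = peak = 0
    for _, delta in events:
        active += delta
        peak = max(peak, active)

    return {
        "average_duration_s": total_call_seconds / len(calls),
        "average_simultaneous": total_call_seconds / window_s,
        "peak_simultaneous": peak,
    }


# Hypothetical records for one hour of traffic.
records = [(0, 300), (120, 180), (200, 600), (900, 120)]
stats = call_stats(records, window_s=3600)
print(stats)
```

The peak figure, rather than the average, is usually the one that matters when sizing WAN bandwidth for voice.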

Soutiman Das Gupta can be reached at soutimand@networkmagazineindia.com


Copyright 2001: Indian Express Group (Mumbai, India). All rights reserved throughout the world. This entire site is compiled in Mumbai by The Business Publications Division of the Indian Express Group of Newspapers. Site managed by BPD