P2P is supposed to revolutionize the way one perceives traditional
client/server computing. Here's a lowdown on this technology.
by A. K. Vanwasi
P2P (Peer-to-Peer) is the new buzzword that is supposed to change
the way one perceives traditional computing.
A P2P network enables its 'member nodes' to search for and
share content and spare computing resources in real time
over the Internet. The P2P information-sharing paradigm is expected
to enhance workgroup productivity and support community activity.
The P2P philosophy differs significantly from conventional
client/server computing. Currently, most Internet applications
use client/server computing. Here, a central server stores
information and distributes it to clients in response to their
requests. However, the stored information is static: only the
central server can update it.
P2P networking is an alternative to client/server computing. It
is server-less computing. There exist many varieties of P2P
networks, such as Gnutella, Grid computing and SETI@home. Member
nodes of a particular P2P network share common networking
software and may be located on the Internet or on a private
network. They actively participate in information search and
distribution. Each member node also contributes to the shared
information repository.
P2P networks also enable member nodes to share spare computing
power and data storage. Thus, a P2P network
enables a web of millions of computers to act as a gigantic
distributed virtual supercomputer.
Here, we provide an overview of various P2P networks and related
issues.
P2P basics
The best way to understand P2P computing is to start with
its components and terminology.
P2P network: A P2P network comprises online 'member
nodes' located on the Internet. Each member node runs common
P2P networking protocol software and keeps an individual information
repository. A member node can search for, and directly share,
the information repository and computational resources of other
member nodes.
Member Node: A member node can act as both a client
and a server. It also maintains the IP addresses of a few other
member nodes.
Open P2P network: Such a P2P network is able to distribute
all types of content, i.e. music, video and data files.
Content-based search: A P2P network supports content-based
search. A member node, i.e. the source, queries the P2P network
for specific content. The P2P network software on other
member nodes checks the local information repository for the
requested content. If a match is found on a member node, a
response is sent to the source. The source can then download
the content directly.
Dynamic indexing: A member node can be accessed only
while it is online. Search results therefore reflect only content
that is currently available, which keeps the information repository
up to date.
Server-based indexing: A single server maintains the
index of the contents of all member nodes. When a member node
requests content, the server replies with a list of IP addresses
of the other member nodes that hold it.
P2P multicast: A source node sends only one copy to
a multicast group. This copy is replicated downstream only
where distribution paths diverge, so overlapping paths carry
the data once. This eases traffic on the Internet.
P2P network load-balancing: A P2P network load-balancing technique
monitors traffic and the demand profile for a particular information
item and then redistributes content to ease the load on an
individual node. The load-balancing policy may also place
the content close to where it is used.
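The saving described under P2P multicast can be illustrated with
a small count. The sketch below is only an illustration: the
distribution tree, the node names and the assumption that every
leaf is a receiver are invented here. It compares how many times
a link carries the data when the source sends a separate copy to
each receiver with how many times when a single copy is replicated
along the tree.

```python
from collections import deque

# Hypothetical distribution tree: each key forwards to its listed children.
TREE = {
    "source": ["r1"],
    "r1": ["r2", "r3"],
    "r2": ["a", "b"],
    "r3": ["c", "d"],
}
RECEIVERS = ["a", "b", "c", "d"]  # assumption: every leaf is a receiver


def path_length(tree, root, target):
    """Number of links on the path from root to target (breadth-first search)."""
    queue = deque([(root, 0)])
    while queue:
        node, hops = queue.popleft()
        if node == target:
            return hops
        for child in tree.get(node, []):
            queue.append((child, hops + 1))
    raise ValueError(f"{target} is not reachable from {root}")


def unicast_link_uses(tree, root, receivers):
    """One separate copy per receiver: every link on every path is used."""
    return sum(path_length(tree, root, r) for r in receivers)


def multicast_link_uses(tree):
    """One copy replicated downstream: each link carries the data once."""
    return sum(len(children) for children in tree.values())


print(unicast_link_uses(TREE, "source", RECEIVERS))  # 12
print(multicast_link_uses(TREE))                     # 7
```

In this toy tree the shared links near the source carry four copies
under unicast but only one under multicast, which is where the
reduction from 12 to 7 link uses comes from.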
P2P networks
A P2P network can be something as simple as a LAN workgroup,
or something along the lines of Napster, the popular file-sharing
program that lets you share files over the Internet. Here
are a few examples of existing P2P networks and their functionality.
LAN workgroups: Small P2P networks have their origin
in LAN environments. Workgroups in a LAN use popular operating
systems like Apple Macintosh or Windows for sharing files.
Napster: Napster is a popular global online network for
sharing music files. Its member nodes are located on the Internet,
and every member node runs the proprietary Napster client
software.
Napster uses a central server to maintain a directory of the music
files that are available on different member nodes. Because
it relies on a central server, Napster is not a true P2P network.
Its operation involves the following steps:
-
A client member node sends the central server a request
for an MP3 song.
-
On receipt of a request, the server searches its directory for
the desired music file and returns the list of member nodes
that have the requested content.
-
The client member node then establishes a connection with a member
node that has the desired file and downloads the music file.
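The three steps above amount to a directory lookup followed by a
direct transfer between peers. The following is a minimal sketch
of the directory part only, not Napster's actual protocol or
software: the class name `CentralIndex`, its methods, and the
addresses and file names are all hypothetical.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical directory server: maps a file name to the peers sharing it."""

    def __init__(self):
        self._peers_by_file = defaultdict(set)

    def register(self, peer_address, file_names):
        """A member node announces the files it is willing to share."""
        for name in file_names:
            self._peers_by_file[name.lower()].add(peer_address)

    def unregister(self, peer_address):
        """Drop a member node that has gone offline."""
        for peers in self._peers_by_file.values():
            peers.discard(peer_address)

    def lookup(self, file_name):
        """Return the sorted addresses of peers that hold the file."""
        return sorted(self._peers_by_file.get(file_name.lower(), set()))


index = CentralIndex()
index.register("10.0.0.5:6699", ["song_a.mp3", "song_b.mp3"])
index.register("10.0.0.9:6699", ["song_b.mp3"])
print(index.lookup("song_b.mp3"))  # ['10.0.0.5:6699', '10.0.0.9:6699']
index.unregister("10.0.0.5:6699")
print(index.lookup("song_b.mp3"))  # ['10.0.0.9:6699']
```

The server only answers the question of who has a file. The download
itself would take place directly between the two member nodes and
is not shown here.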
Gnutella: It is an open, online P2P network. It does
not require a central server for indexing data files and is
the least proprietary of the networks described here.
Gnutella is both a piece of specialized software and a communication
protocol. It enables a host computer to function as both a
client and a server. To use it, a software application called
a 'servent' (server + client) is left running on each member
node.
Using the Gnutella protocol, a member node can search for
other online member nodes. Usually, each member node keeps
track of four or five other member nodes. The result is a
virtual web of interlinked member nodes.
The Gnutella protocol enables a member node to designate a file
as a shared file. To coordinate communication among member
nodes, the current Gnutella protocol (Version 0.4) defines
five descriptors: Ping, Pong, Query, Query-Hit and Push.
The steps involved in communication are:
-
To find other online member nodes, a member node sends a
Ping packet that announces its presence.
-
When another member node receives a Ping packet, it responds
by sending a Pong packet containing its IP address, its port
number, and the number and total size of the files it shares.
The receiving node also forwards a copy of the Ping packet
to other member nodes.
-
A Gnutella Query packet enables a requesting member node to
search for the desired data files on other member nodes
on the basis of content.
-
If a member node finds a match, it responds to
the requesting member node with a Query-Hit packet containing
its IP address, its port number and the name of the matching data file.
-
On receipt of a Query-Hit, the requesting member node can download
the desired data file. Every Query packet carries a
'Time To Live' field that limits the lifetime of the Query
packet in the network. This prevents a Query packet from
circulating indefinitely.
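The effect of the 'Time To Live' field on a flooded Query can be
sketched with a small simulation. This is a simplified model, not
an implementation of the Gnutella wire format: the overlay, the
node names and the shared files are hypothetical, TTL is treated
as the number of hops a Query may travel, and each node is assumed
to ignore a Query it has already handled so that the flood terminates
cleanly.

```python
from collections import deque

# Hypothetical overlay: each node knows a few neighbours (symmetric links).
NEIGHBOURS = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D", "F"],
    "F": ["E"],
}
# Hypothetical shared files per node.
SHARED = {
    "D": ["report.txt"],
    "F": ["report.txt", "notes.txt"],
}


def flood_query(origin, keyword, ttl):
    """Flood a Query from origin; return the nodes that would send a Query-Hit."""
    seen = {origin}  # nodes that have already handled this Query
    hits = []
    # The origin sends the Query to each of its neighbours with the full TTL.
    queue = deque((neighbour, ttl) for neighbour in NEIGHBOURS[origin])
    while queue:
        node, remaining = queue.popleft()
        if node in seen or remaining <= 0:
            continue  # duplicate, or the Query has expired
        seen.add(node)
        if any(keyword in name for name in SHARED.get(node, [])):
            hits.append(node)
        # Forward to every neighbour with the TTL reduced by one hop.
        for neighbour in NEIGHBOURS[node]:
            queue.append((neighbour, remaining - 1))
    return hits


print(flood_query("A", "report", ttl=2))  # ['D']
print(flood_query("A", "report", ttl=4))  # ['D', 'F']
```

With a TTL of 2 the Query from node A reaches D but expires before
it gets to F. Raising the TTL widens the search at the cost of more
traffic.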
Freenet: It is a Java-based open P2P network. Like
Gnutella, it does not use a central server. It operates
across platforms. Its routing and caching schemes differ from
Gnutella's and are designed for efficient file distribution
and scalability.
Morpheus and KaZaA: These are two popular server-less
file-swapping programs. Member nodes running this
software can swap video files. The programs are also used to
exchange songs and grainy reruns of TV shows.
Distributed Computing
Distributed computing is a popular application area for P2P
networks. It is based on the premise that during normal operation,
90 percent of a desktop computer's processing cycles remain
unused. One estimate suggests that Internet-connected PCs offer
an aggregate of at least 10 billion MHz of processing power and
1000 TB of storage. P2P networking enables organizations to
exploit these globally distributed spare computing resources.
Further, distributed computing decentralizes computation
tasks and reduces project cost and administration. Here are
some key applications of distributed computing.
SETI@home: This P2P network uses thousands of online PC
nodes located on the Internet to help in the search for extraterrestrial
intelligence. Users participate by running a free program
that downloads and analyzes radio telescope data. The chances
that any one computer will detect a faint signal from
an extraterrestrial civilization are slim, but this has not prevented
enthusiasts from participating in the SETI program.
Grid Computing: A Grid links PCs to form a virtual supercomputer
using 'Globus' middleware. Globus is downloadable from the
Internet (www.globus.org). It enables organizations to create
an infrastructure that allows them to share data, applications,
processing power and storage. It provides an automatic way
of discovering resources and scheduling applications to run
computation tasks on shared computational resources available
on the Internet.
There are now hundreds of grids, composed of millions of PCs
known as nodes, linked all around the globe. Grid
computing should enable corporations to have cheap but massive
computing power on tap and, more importantly, on a pay-as-needed
basis.
It is estimated that in the near future, scientific and commercial
computing grids will be linked together into national grids.
Eventually, national grids will be linked into a global grid.
Drug Research Program: As a philanthropic effort, Intel has initiated
a drug research program. To participate in this program, one
can download an application from www.intel.com/cure.
To find a potential cure or new drug, researchers must evaluate
the cancer-fighting potential of hundreds of millions of molecules.
It is estimated that the task will require a minimum of 24
million hours of number crunching.
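A little arithmetic shows why a task of this size is spread across
many PCs. The sketch below assumes that the 24 million hours are
hours of a single PC's processing time and that the work divides
evenly with no overhead. The pool of 100,000 participants and the
8 idle hours per day are hypothetical figures, not numbers from
the program.

```python
TOTAL_CPU_HOURS = 24_000_000  # minimum figure quoted above


def days_to_finish(participants, idle_hours_per_day):
    """Elapsed days if the work splits evenly and perfectly across participants."""
    return TOTAL_CPU_HOURS / (participants * idle_hours_per_day)


print(days_to_finish(1, 24))       # 1000000.0 days: one PC running non-stop
print(days_to_finish(100_000, 8))  # 30.0 days: hypothetical volunteer pool
```

Under these assumptions a single PC would need about 2,700 years,
while the hypothetical pool would finish in about a month.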
Commercial Application: DataSynapse's 'WebProc' product
enables a customer to tap the unused processing power of PCs to
conduct complex financial and business computations. The company
pays people for the use of their idle computers.
The 'WebProc' software breaks complex computing jobs into smaller
tasks and distributes them over the Internet, in encrypted
form, to PCs that have spare computing capacity.
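The general idea of breaking a job into smaller tasks and combining
the partial results can be sketched as follows. This is not the
vendor's software or API: the function names are hypothetical,
local threads stand in for remote PCs, and the encryption and
network transport are left out.

```python
from concurrent.futures import ThreadPoolExecutor


def split_job(numbers, chunk_size):
    """Break one large job into independent tasks of at most chunk_size items."""
    return [numbers[i:i + chunk_size] for i in range(0, len(numbers), chunk_size)]


def run_task(chunk):
    """Stand-in for the work a single idle PC would do on its task."""
    return sum(x * x for x in chunk)


def run_distributed(numbers, chunk_size=1000, workers=4):
    """Farm the tasks out to a pool of workers and combine the partial results."""
    tasks = split_job(numbers, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(run_task, tasks))
    return sum(partials)


data = list(range(10_000))
print(run_distributed(data) == sum(x * x for x in data))  # True
```

The approach works here because each task is independent of the
others, so the order in which workers finish does not matter.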
Endeavor Technology has developed P2P products that enable
an enterprise to sell digital goods and information directly
from their own computers without a website or server.
Peer issues
Like most new technologies, P2P is still evolving. There are
some factors that affect the smooth functioning of P2P networks.
Solving these problems will be key to the success of
P2P.
-
Bandwidth: In contrast to the asymmetrical channels that suit
client/server computing, P2P networks require symmetrical
communication channels, i.e. the same upstream and downstream
bandwidth. Further, P2P computing applications
are bandwidth-hungry and call for better quality of service
(QoS). Telecom service providers have to make suitable provision
for this.
-
Computing power:
PCs and computers used in P2P applications will require
more computing power to manage the communication and server
overhead.
-
Security: In P2P networking, many applications can access
another PC's hard disk directly, so security is the main
concern. Downloading files from other computers makes
the system vulnerable to viruses. Computers also need
to authenticate the machines they communicate with.
-
Interoperability:
Member nodes of a P2P network run a variety of operating
systems, networking technologies and other platforms in
business applications. Thus, advanced interoperability techniques
will be necessary. XML-based interoperability technologies
hold a lot of potential for P2P computing.
-
Copyright:
P2P networks allow members to share information, which may
raise issues of copyright infringement, intellectual
piracy and the undesirable spread of malicious content. Thus,
P2P computing may form the basis of many chaotic and lawless
communities. We can only hope that as P2P communities
evolve, they become more stable.
-
QoS:
The success of P2P computing requires better quality of service
and content. Performance scores based on user feedback can
provide information for improving connection reliability
and content quality.
-
IP addressing:
A P2P network is a transient community. Its member nodes do
not have always-on connectivity to the Internet. In other words,
member nodes do not have permanent IP addresses. This calls
for a new IP addressing scheme.
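The bandwidth point in the list above can be made concrete with
a rough calculation. When one member node downloads from another,
the transfer can go no faster than the sender's upstream rate or
the receiver's downstream rate, whichever is lower. The line speeds
and file size below are hypothetical, and protocol overhead is
ignored.

```python
def transfer_seconds(file_megabytes, sender_up_mbps, receiver_down_mbps):
    """Time to move a file when the slower of the two link directions is the limit."""
    bottleneck_mbps = min(sender_up_mbps, receiver_down_mbps)
    return file_megabytes * 8 / bottleneck_mbps  # 8 bits per byte


# Hypothetical asymmetric line on the sender: 8 Mbit/s down, 1 Mbit/s up.
print(transfer_seconds(5, sender_up_mbps=1, receiver_down_mbps=8))  # 40.0
# Hypothetical symmetric line: 8 Mbit/s in both directions.
print(transfer_seconds(5, sender_up_mbps=8, receiver_down_mbps=8))  # 5.0
```

On the asymmetric line the sender's slow upstream channel dominates,
which is why peers that also act as servers benefit from symmetrical
bandwidth.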
Conclusion
P2P computing offers a better alternative to client/server
computing for searching and sharing content. However, the success
of P2P computing requires satisfactory resolution of security
issues, upgrading of PCs and provisioning of high-capacity
symmetrical-bandwidth networks.
P2P is still too early in its evolution to have a real impact
on client/server computing in the near term. However, it has
the potential to revolutionize the way we perceive traditional
computing and, in the long term, to offer significant cost benefits.
A. K. Vanwasi is GM (R&D), ITI
LTD., Naini. He can be reached at vanwasi_nni@itiltd.co.in