
The Protocols Behind E-mail

E-mail was the first business-to-business killer application of the Internet. It ranks with the printing press, the telephone and television in mass impact. Today, more than three billion messages zip back and forth each day, and e-mail has become the oxygen of the Internet.

E-mail is the preferred method for many personal communications as well. A message may run from a few lines to several pages. Audio/video clips and faxes can also be transported as attachments.

E-mail is one of the fastest, cheapest and most reliable methods of transporting messages over long distances. Further, unlike fax, the message is available right on your desktop. The e-mail service is location independent. A recipient who is away from his usual location can connect to his e-mail server from anywhere in the world to access new mail.

E-mail services were invented nearly 20 years ago. However, the essential features of e-mail remain the same.

The e-mail system is a distributed application. It uses the TCP/IP protocol suite for reliable message transport.

A typical e-mail system comprises message preparation, the 'Message Transport System' (MTS) and message retrieval by the recipient. 'Simple Mail Transfer Protocol' (SMTP) was the basic protocol of the e-mail system, covering message preparation, transportation and retrieval. It remains the workhorse of the e-mail system. SMTP is a simple and reliable protocol of the TCP/IP protocol suite. It was good for a homogeneous network. The Internet, however, is a heterogeneous network, and to overcome this limitation 'Post Office Protocol 3' (POP3) was standardized for message retrieval. POP3 is a very popular protocol and is widely used. However, it supports offline working only and does not support message retrieval from multiple locations.

'Internet Message Access Protocol 4' (IMAP4) overcomes the above limitations of the POP3 protocol. IMAP4 is also an open standard. It has many useful features: offline, online and disconnected-user access modes, and message retrieval from multiple locations.

Despite its limitations, POP3 is currently widely used, while IMAP4 deployment is gaining momentum. A user is thus confronted with a choice between the POP3 and IMAP4 protocols.

Basics Of E-mail System
The e-mail system is used to transport messages between users over the Internet. It is based on open standards defined by the 'Internet Engineering Task Force' (IETF). It comprises client-side and server-side parts. The SMTP protocol governs server-to-server and sending-client-to-mail-server communications.

Client side: It is also known as the user side. It is a human being (or a process on behalf of a human being) wishing to use mail transfer service.

User: An originator or recipient of e-mail.

Sending user: A user who originates a mail is termed as the sending user.

Receiving user: The recipient of the mail is known as a receiving user. A receiving user retrieves his mail from the designated (also nearest) mail server.

Multiple recipients: A sending user may opt to send the mail to more than one receiving user. These recipients could be located anywhere on the Internet.

E-mail: A sequence of ASCII characters of arbitrary length that conforms to RFC 822 standard.

E-mail address: It comprises two parts: the local part and the domain name. The local part is the mailbox ID, that is, the name of an individual mailbox. The domain name pertains to the mail destination. It is a hierarchically structured global character string, registered in the domain name system.
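The two-part structure of an address can be illustrated with a minimal Python sketch. The helper name `split_address` and the `example.org` address are assumptions made for illustration only. The sketch relies on the standard library's `email.utils.parseaddr` to strip any display name.

```python
from email.utils import parseaddr


def split_address(address: str) -> tuple[str, str]:
    """Split an e-mail address into (local part, domain name)."""
    # parseaddr drops a display name such as "A User <user@example.org>"
    _display_name, addr_spec = parseaddr(address)
    # Split on the last "@"; everything after it is the domain name
    local_part, separator, domain = addr_spec.rpartition("@")
    if not separator or not local_part or not domain:
        raise ValueError(f"not a mailbox address: {address!r}")
    return local_part, domain


local_part, domain = split_address("A User <user@example.org>")
print("mailbox ID:", local_part)
print("domain name:", domain)
```

The domain name is what a mail server would look up in the DNS. The local part only has meaning to the destination mail server.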

User agent (UA): It is client-side software installed on a user's machine. It performs dual functions. While sending mail, it functions as the sending UA, and while retrieving mail, it functions as the receiving UA.

Sending UA: It accepts the message and formats it as per the RFC 822 standard. The sending UA adds header fields that comprise the e-mail addresses of the sending and receiving users, the date, the subject and so on.

It is also responsible for establishing a TCP connection with the designated (nearest) mail server and sending the formatted mail. It follows SMTP-defined rules for transporting the mail.

It also acts as an SMTP client to the sending mail server.
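What a sending UA does can be sketched with Python's standard `email` and `smtplib` modules. This is only an illustration under assumptions: the addresses, the server name `mail.example.org` and the helper names `compose` and `send` are hypothetical, and `send` is defined but not called because it needs a reachable mail server.

```python
import smtplib
from email.message import EmailMessage
from email.utils import formatdate


def compose(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Format a message with the header fields a sending UA adds."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Date"] = formatdate(localtime=True)
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send(msg: EmailMessage, server: str = "mail.example.org", port: int = 25) -> None:
    """Act as SMTP client to the designated mail server (hypothetical host)."""
    # smtplib opens the TCP connection and runs the SMTP dialogue
    with smtplib.SMTP(server, port) as smtp:
        smtp.send_message(msg)


message = compose(
    "sender@example.org",
    "receiver@example.net",
    "Greetings",
    "Hello from the sending UA.",
)
print(message.as_string())
```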

Receiving UA: It is installed on the receiving user's machine. It establishes a TCP connection with the designated receiving mail server to retrieve the mail. It uses one of two standard protocols, POP3 or IMAP4, to retrieve the mail.

Server side: It is also known as the 'Message Transport System' (MTS). It comprises a network of mail servers over the Internet. Each mail server functions as a 'Mail Transfer Agent' (MTA). MTAs mediate mail server-to-server communication using SMTP. An MTA is installed on every mail server. It can function as a sending MTA, a receiving MTA or an intermediate MTA.

Sending MTA: It is the designated MTA (mail server) for the sending user. It performs the following functions:

  • It accepts mail from the sending UA using SMTP.
  • It obtains the IP address of the recipient's mail server from the domain name server, using the domain name in the given e-mail address.
  • It tries to transport the mail directly to the receiving MTA, if possible. Alternatively, it obtains the IP address of an intermediate MTA and relays the mail using SMTP.

Receiving MTA: It is the designated MTA for the receiving user. It is installed on a receiving mail server. It is responsible for reception and storage of mail in the mailbox of the receiving user.

A 'Mail Delivery Agent' (MDA) is also located on the receiving mail server along with the MTA. The MDA shares local work with the MTA. A receiving UA contacts the MDA for its mail.

Intermediate MTA: For end-to-end transport of e-mail, SMTP compliant intermediate MTAs are installed at the network nodes. An intermediate MTA relays/forwards the mail for onward transport towards receiving MTA.

Mailbox: The receiving mail server stores the incoming mail in the mailbox. It is the designated storage space for the receiving user. It is identified by a unique ASCII character string.

Mail retrieval: A receiving UA can retrieve (access) mail in offline, online and disconnected-user access modes. Here is a brief description of each.

Offline access: Here, a client application (receiving UA) periodically connects to the receiving mail server, either when a user dials in or at some preset interval. The receiving UA downloads all messages. Mostly, downloaded messages are deleted from the receiving mail server's mailbox. Now, messages can be processed at the user end.

Offline upload: Similar to offline access, messages from the sending end can be transferred using client applications.

Online access: Here, connection between the receiving end user and the receiving mail server is maintained all the time. The receiving UA does not store the messages locally. It only retrieves the messages from the mailbox as needed. Here, connectivity between mail server and receiving user is essential for operation.

Disconnected-user access mode: This access method has the characteristics of both offline and online access modes. Here, a user can connect to mail server periodically to download the messages. These can then be read, deleted and organized in offline manner.

Here, while messages are downloaded to the receiving user, they are not deleted at the server end. Thus, when a receiving user again contacts the mail server, he can access previous mail as well. This non-deleting feature also enables a user to read his mail from multiple locations.

E-mail format standard: The e-mail system uses RFC 822 for formatting an e-mail. This standard is defined for text messages, and SMTP requires that it be used for constructing e-mail. It views a message as comprising two parts: envelope and content.

The envelope contains the information required for transporting and delivering a message. The content begins with header fields such as date, from, subject, to and cc.

The header fields and the message body are separated (delimited) by a blank line. The SMTP session that carries the message begins with the 'HELO' command. The message is in the form of a memo and comprises ASCII characters only.
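The layout described above, information fields followed by a blank line and then the body, can be seen by parsing a small message with Python's standard `email` package. The sample message is made up for illustration. On the wire, lines would end in CRLF rather than the bare newline used here.

```python
from email import message_from_string

# A hypothetical message: information fields, one blank line, then the body
RAW = (
    "Date: Mon, 5 Feb 2001 10:00:00 +0530\n"
    "From: sender@example.org\n"
    "To: receiver@example.net\n"
    "Cc: observer@example.net\n"
    "Subject: Meeting\n"
    "\n"
    "The meeting starts at noon.\n"
)

msg = message_from_string(RAW)

# Everything before the blank line is parsed into named fields
for name, value in msg.items():
    print(f"{name}: {value}")

# Everything after the blank line is the body
body = msg.get_payload()
print("body:", body.strip())
```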

SMTP: It is an application-level protocol and a member of the TCP/IP protocol suite. It uses a two-way (duplex) communication channel for transporting/relaying messages. It requires a reliable transport protocol such as TCP for reliable delivery of mail. SMTP ships/relays messages between two mail servers or between the sending UA and the sending mail server.

Normally, SMTP is independent of message format/content. However, for standardization of the e-mail system, it requires the use of RFC 822 for formatting/constructing e-mail.

SMTP requires use of the 7-bit ASCII character set. TCP, however, transports 8-bit bytes. Thus, the high-order bit of each byte is cleared to zero to accommodate 7-bit characters.
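The effect of the 7-bit restriction can be shown with a short Python sketch. The function names are assumptions for illustration. The sample bytes are the word "cafe" with an accented final letter in the Latin-1 encoding, where the accented letter has the value 0xE9.

```python
def is_seven_bit(data: bytes) -> bool:
    """True if every byte has its high-order bit clear (value below 128)."""
    return all(byte < 0x80 for byte in data)


def clear_high_bit(data: bytes) -> bytes:
    """Clear the high-order bit of each byte, as a strict 7-bit path would."""
    return bytes(byte & 0x7F for byte in data)


plain = b"Hello"
accented = b"caf\xe9"  # Latin-1 bytes; 0xE9 is an 8-bit code

print(is_seven_bit(plain))        # True
print(is_seven_bit(accented))     # False
print(clear_high_bit(accented))   # b'cafi'
```

Clearing the bit turns 0xE9 into 0x69, the letter "i", so 8-bit text is silently corrupted rather than carried.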

SMTP records the route followed by a message in the header portion. SMTP uses TCP for message transport. However, it does not guarantee recovery of lost messages. No end-to-end acknowledgement is sent to the sending user for a successfully transmitted message. Further, error indications are also not returned.
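The route information takes the form of 'Received' header fields, one added at the top of the message by each mail server that handles it. A minimal Python sketch of reading that trace follows. The host names and the helper name `route` are hypothetical.

```python
from email import message_from_string

# Hypothetical message that passed through one intermediate MTA
RAW = (
    "Received: from relay.example.net by mail.example.com; Mon, 5 Feb 2001 10:00:07 +0530\n"
    "Received: from mail.example.org by relay.example.net; Mon, 5 Feb 2001 10:00:03 +0530\n"
    "From: sender@example.org\n"
    "To: receiver@example.com\n"
    "Subject: Route demo\n"
    "\n"
    "Body text.\n"
)


def route(msg) -> list[str]:
    """Return the hops in the order the message travelled."""
    hops = msg.get_all("Received", [])
    # Each server prepends its field, so the newest hop comes first;
    # reverse the list and drop the timestamp after the semicolon
    return [hop.split(";")[0].strip() for hop in reversed(hops)]


msg = message_from_string(RAW)
for number, hop in enumerate(route(msg), start=1):
    print(number, hop)
```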

Limitations of SMTP:

  • SMTP cannot transmit executable files. For transmission of binary files using SMTP, it is necessary to convert binary files into text files. There are many popular methods such as Unix uuencode/uudecode and BinHex. However, these are not standard methods.
  • It cannot transmit text data that contains national language characters. These national language characters use 8-bit codes with values of 128 decimal or more.
  • It is limited to 7-bit ASCII characters only.
  • SMTP servers may reject mail messages beyond some specific length.
  • SMTP gateways to X.400 electronic mail networks cannot handle non-textual data included in X.400 messages.

Some SMTP implementations do not adhere to standards completely. Some of the typical problems are:

  • Deletion, addition or reordering of carriage returns and line feeds.
  • Truncating or wrapping lines longer than 76 characters.
  • Removal of trailing white space.
  • Padding of lines in a message to the same length.
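The binary-to-text conversion mentioned above can be sketched with the uuencode line encoding in Python's standard `binascii` module. This is a simplified illustration: it encodes and decodes lines only and omits the "begin" and "end" framing of a full uuencoded file. The helper names are assumptions.

```python
import binascii


def uu_encode(data: bytes) -> list[bytes]:
    """Encode binary data as uuencoded text lines."""
    # b2a_uu accepts at most 45 input bytes per call (one output line)
    return [binascii.b2a_uu(data[i:i + 45]) for i in range(0, len(data), 45)]


def uu_decode(lines: list[bytes]) -> bytes:
    """Reverse uu_encode, restoring the original bytes."""
    return b"".join(binascii.a2b_uu(line) for line in lines)


binary = bytes(range(256))  # every possible byte value
lines = uu_encode(binary)

# The encoded form uses 7-bit characters only
print(all(byte < 0x80 for line in lines for byte in line))
# The round trip restores the original data
print(uu_decode(lines) == binary)
```

Encoded lines can end in spaces, so a mail server that removes trailing white space, one of the problems listed above, may damage such data.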

'Multipurpose Internet Mail Extensions' (MIME): MIME attempts to resolve the above limitations of SMTP. It extends RFC 822.

It includes the following elements.

  • Five new message header fields are defined. These include MIME version, content type, content transfer encoding, content ID and content description. These fields provide the information about the body of a message.
  • A number of content formats are defined. These enable the e-mail system to support multimedia transport as e-mail attachment.
  • Transfer encoding standards have also been defined. These enable conversion of content from one format to another, so that the content is protected from alteration by the e-mail system during transport.
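The MIME header fields and transfer encoding can be observed by building a message with an attachment using Python's standard `email.message.EmailMessage`. The addresses, the file name `clip.au` and the stand-in audio bytes are assumptions for illustration.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "sender@example.org"
msg["To"] = "receiver@example.net"
msg["Subject"] = "Audio clip"
msg.set_content("The clip is attached.")

# Stand-in for real audio data: every possible 8-bit value
clip = bytes(range(256))
msg.add_attachment(clip, maintype="audio", subtype="basic", filename="clip.au")

print("MIME-Version:", msg["MIME-Version"])
# Walk the message tree and show how each part is typed and encoded
for part in msg.walk():
    print(part.get_content_type(), part.get("Content-Transfer-Encoding"))

# The serialized message contains 7-bit characters only
wire = msg.as_bytes()
print(all(byte < 0x80 for byte in wire))
```

The library selects base64 for the binary part, which is what lets 8-bit data survive a 7-bit transport.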

POP3: It is the most common client protocol used for retrieving e-mail from the receiving mail server over the LAN or the Internet. It enables intermittent checking for messages stored at the receiving-mail server.

The important features are:
In its default mode of operation, it supports offline access. A POP3-compliant mail server enables a receiving UA to download the mail. After successful download, the mail is deleted at the mail server.

A POP3-compliant server can also be asked to retain messages after downloading. However, in this mode, each time a receiving user establishes a new connection, all the previously stored messages are downloaded again, along with their attachments.
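The download-then-delete behaviour can be sketched with Python's standard `poplib` module. The host, user name and helper names are hypothetical, and `fetch_from_server` is defined but not called because it needs a live POP3 server. `download_all` accepts any object with the `stat`, `retr`, `dele` and `quit` methods of `poplib.POP3`.

```python
import poplib


def download_all(conn, delete: bool = True) -> list[bytes]:
    """Download every message; optionally mark each for deletion."""
    count, _mailbox_size = conn.stat()
    messages = []
    for number in range(1, count + 1):  # POP3 numbers messages from 1
        _response, lines, _octets = conn.retr(number)
        messages.append(b"\r\n".join(lines))
        if delete:
            conn.dele(number)  # takes effect when the session ends
    conn.quit()  # the server removes marked messages on QUIT
    return messages


def fetch_from_server(host: str, user: str, password: str) -> list[bytes]:
    """Connect to a POP3 server (hypothetical credentials) and download."""
    conn = poplib.POP3(host)  # default POP3 port
    conn.user(user)
    conn.pass_(password)
    return download_all(conn)
```

Calling `download_all(conn, delete=False)` corresponds to the retain mode, in which a later session sees the same messages again.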

Some of the limitations are:

  • Normally, the storage space in the mailbox is limited, and the POP3 server should inform the user when the quota is about to be exhausted.
  • POP3 has no provision for sharing mailboxes or messages.
  • POP3 has limited capability to handle complex messages that contain multimedia attachments.
  • A POP3 client program is best suited for people who have one mailbox and read their e-mail from one PC.
  • POP3 does not have the facility to encrypt user names and passwords.

'Internet Message Access Protocol' (IMAP4): It is a newer e-mail retrieval protocol. It is described in RFC 2060. It has not yet been implemented widely.

Its salient features include:

  • IMAP4 requires a reliable and ordered data stream such as provided by TCP. When TCP is used, IMAP compliant mail server listens on port 143.
  • It supports online, offline and disconnected-user access modes. This allows a client to access and manipulate e-mail messages on a mail server.
  • It permits manipulation of remote mailboxes like a local mailbox. This includes operations for creating, deleting and renaming mailboxes, checking for new messages, permanently removing messages, setting and clearing flags etc.
  • IMAP clients need not download the complete contents of every message. The IMAP mail server first sends a short menu of waiting messages. Thus, important messages can be retrieved quickly.
  • Here, messages are stored on mail servers only. Thus, messages can be accessed from multiple IMAP clients and a user can still see the same status information for all messages.
  • The IMAP server understands the MIME extensions. This enables an IMAP client to select the desired portion of an e-mail for retrieval.
  • IMAP allows the messages to be stored in a hierarchical structure on the IMAP server.
  • IMAP can send passwords and user names in encrypted form.
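Selective retrieval can be sketched with Python's standard `imaplib` module: the client asks for message headers only and leaves the messages on the server. The host, credentials and helper names are hypothetical, and `open_session` is defined but not called because it needs a live IMAP server. `list_headers` accepts any object with the `select`, `search` and `fetch` methods of `imaplib.IMAP4`.

```python
import imaplib


def list_headers(conn, mailbox: str = "INBOX") -> list[bytes]:
    """Fetch only the header of each message in a mailbox."""
    # Read-only selection leaves message flags on the server untouched
    conn.select(mailbox, readonly=True)
    _status, data = conn.search(None, "ALL")
    headers = []
    for number in data[0].split():
        # BODY.PEEK[HEADER] requests the header section only
        _status, parts = conn.fetch(number, "(BODY.PEEK[HEADER])")
        headers.append(parts[0][1])
    return headers


def open_session(host: str, user: str, password: str) -> imaplib.IMAP4:
    """Connect and log in (hypothetical host and credentials)."""
    conn = imaplib.IMAP4(host)  # default IMAP port is 143
    conn.login(user, password)
    return conn
```

Because nothing is deleted or downloaded in full, a second client at another location would see the same mailbox contents.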

POP3 Vs IMAP4
Unlike POP3, which supports only the offline access mode, IMAP4 supports offline, online and disconnected-user access modes.

  • IMAP4 allows users to access and manage messages from more than one location/computer.
  • IMAP4 supports folder hierarchies and concurrent access to shared mailboxes.
  • IMAP4 has database capabilities. This enables a user to search and select messages and part of messages stored on the server.
  • IMAP4 supports MIME. Thus, a user can access audio/video/graphic clips easily.
  • IMAP4 does not rely on file-access protocols and does not need to know the server's file storage format.

Implementation Issues
Changing a mail system from POP3 to IMAP4 is easy, but reverting to POP3 can be an administrative nightmare. Compared to POP3, an IMAP4 server requires more storage, so storage could be critical. Server connect time could also be longer with IMAP4, since users may need time to review message headers and decide which attachments they would like to download. An IMAP4 server could lead to higher technical support costs. Thus, it may be best suited to controlled environments such as universities and large corporates. IMAP4 by itself offers no solution for calendaring or scheduling. One has to adopt other standards-based solutions.

Directory access services: The e-mail system is global in nature. There exist several e-mail systems over the Internet.

Each of these e-mail systems has its own directory services that give information about users, systems and services. Standard access to these different directories is possible by using X.500 and the 'Lightweight Directory Access Protocol' (LDAP).

X.500: X.500 is a directory assistance system for computers in a distributed network. Such a directory contains e-mail addresses of all users. X.500 directory services define different object classes that can be used to identify objects within the directory service tree. These objects include aliases, country codes, localities, organization, organization units and people. However, implementation of such a global directory requires vast storage and computational resources.

LDAP: LDAP is designed to provide access to X.500 and non-X.500 directories. It is standardized by the IETF (RFC 2251). LDAP is specifically targeted at management applications and browser applications that provide read/write interactive access to directories. LDAP is designed to run over a connection-oriented, reliable protocol such as TCP.

It uses 8-bit coding for a character.

X.400 standard: It is an international e-mail interchange standard. It is standardized by ITU-T. It is complex and costly to implement. Thus, its use is confined to large enterprise-wide networks and commercial e-mail service providers. It encompasses all electronic messaging technologies such as telex, teletex, facsimile and proprietary e-mail systems.

Current Status
The latest trend in e-mail systems is towards using open standards. Thus, SMTP is widely used for relaying mail over the Internet. As regards mail retrieval by the recipient, there is a choice between POP3 and IMAP4. Here is a brief description of the available IMAP4 implementations.

Critical Path offers 'InScribe Messaging Server' and 'InJoin Directory Server'. These run on Windows NT/2000, Solaris and SGI Irix and are claimed to scale up to a million users.

Eudora offers 'World-Mail Server 2.0' for Windows NT/2000. It is based on Isocor N-Plex code. It is easy for small organizations to install and run.

Gordano offers 'NTMail'. It can be deployed on Windows NT/2000, Solaris and Linux. It supports up to 10,000 users. It has been selected as 'Editor's Choice' by PC Magazine and Network Computing reviews. It also supports the 'Wireless Application Protocol' (WAP).

Infinite Technology offers 'InterChange'. It combines an e-mail server, HTTP server and e-mail client proxy into a single package. It can be used as a standalone server also. It provides IMAP, POP and SMTP services. Alternatively, it can provide web-based gateway to e-mail systems that do not offer one on their own. It can support WAP also.

iPlanet offers several different IMAP-compliant messaging servers. The offering includes the 'Sun Internet Mail Server' and 'Netscape Messaging Server'. Sun Internet Mail Server requires the Solaris operating system. Netscape Messaging Server runs on Alpha, PA-RISC, Linux and NT. Both servers are carrier grade.

IPswitch offers 'Mail Server' for NT/2000. It costs US$1500 for unlimited use and has modest system requirements.

MiraPoint packages standards-compliant software into server appliances that support 300 to 2500 users depending on the model. It is claimed that these work fast with WAP since they do not need gateways to translate between WAP and IMAP.

Novell offers 'Novell Internet Messaging' system. It can run on NetWare, Solaris and Linux.

'Rockliffe system' offers 'Mailsite' that runs on Windows NT/2000. It can reliably serve up to 1 million users per server.

'Sendmail' is a well-known freeware program. It has now been enhanced and is supported commercially. It is claimed that it can support hundreds of thousands of users on a single server. It also supports the IMAP folder management feature.

Technical Glossary
Domain Name System (DNS): DNS is a member of the TCP/IP protocol suite that resolves names to IP addresses.

Internet Message Access Protocol (IMAP): IMAP is a member of the TCP/IP protocol suite that enables a client to retrieve mail from the mail server. The RFC 2060 standard describes this protocol. It can operate in online, offline and disconnected-user access modes. It allows messages to be stored in a hierarchical order on the IMAP server.

Post Office Protocol (POP): It is a member of the TCP/IP protocol suite. It is used by a recipient's user agent to retrieve mail from the mail server. Its latest version is POP3, which is described in RFC 1939.

Simple Mail Transfer Protocol (SMTP): It is a TCP/IP standard for mail interchange between mail servers and between mail server and sending user. It is described in RFC 821.

Aliases: An alias is an e-mail address that actually refers to a different mailbox. Aliases are commonly set up to reference a function within an organization.

Multipurpose Internet Mail Extensions (MIME): It is a file encoding method that provides a mechanism for translating non-ASCII messages into ASCII format for transmission over the Internet. Different MIME types are defined that allow each file category to be encoded in a defined manner. It is an extension to RFC 822. It overcomes the limitations of SMTP. The specification is provided in RFC 1521 and 1522.

RFC 822: It defines the format for an e-mail. Here, a message is viewed as having an envelope and contents.

BinHex: It is a storage protocol. It translates a binary data file into an encoded text version using hexadecimal.

uuencode/uudecode: Unix-to-Unix encoding is an alternative method for converting raw binary data into a text representation. It is used for sending binary attachments via a text-based Internet mail system.

A.K.Vanwasi is GM (R&D) ITI Ltd. Naini, Allahabad. He can be reached at vanwasi_nni@itiltd.co.in


Copyright 2001: Indian Express Group (Mumbai, India). All rights reserved throughout the world. This entire site is compiled in Mumbai by The Business Publications Division of the Indian Express Group of Newspapers. Site managed by BPD