Network system design has evolved significantly over the last 20 years. Engineers have abandoned monolithic, "spaghetti-like", do-it-all programs in favor of layered protocol stacks and object-oriented communication systems. A stack-based architecture allows network designers to decompose functionality into multiple abstraction layers [2]. Each layer contains code that implements a protocol, that is, a set of rules and conventions for communication between multiple parties. Every protocol in a stack has a public interface through which it provides services to the layer directly above it. Each layer also has a peer interface that defines how data is exchanged between identical layers on two hosts. The underlying details of how these services are implemented remain hidden from other layers. For example, the July 2000 Connector column describes how the Internet Protocol (IP) works. Because the implementation details are encapsulated, changes can be made to a single layer without affecting other layers in the stack. As long as a layer's public interface does not change, the layers above it continue to work unmodified. In addition, the modular design of the protocol stack promotes code reuse and makes the system easier to develop and maintain.
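The layering idea described above can be illustrated with a minimal Python sketch. It is a toy model, not a real protocol implementation: the class name `Layer`, the layer names, and the bracketed text headers are all assumptions made for illustration. Each layer exposes `send` and `receive` as its public interface, and its header format plays the role of the peer interface.

```python
from typing import Optional


class Layer:
    """A toy protocol layer (hypothetical): adds a header on send, strips it on receive."""

    def __init__(self, name: str, below: Optional["Layer"] = None):
        self.name = name
        self.below = below  # the only layer whose public interface we call

    def _header(self) -> bytes:
        # Peer interface: the format both ends of this layer agree on.
        return f"[{self.name}]".encode()

    # Public interface: the service offered to the layer directly above.
    def send(self, payload: bytes) -> bytes:
        frame = self._header() + payload
        return self.below.send(frame) if self.below else frame

    def receive(self, wire: bytes) -> bytes:
        frame = self.below.receive(wire) if self.below else wire
        header = self._header()
        if not frame.startswith(header):
            raise ValueError(f"{self.name}: unexpected header")
        return frame[len(header):]


# Build a three-layer stack, lowest layer first.
link = Layer("link")
network = Layer("network", below=link)
transport = Layer("transport", below=network)

wire = transport.send(b"hello")       # each layer wraps the data in turn
delivered = transport.receive(wire)   # each layer unwraps its own header
```

Note that `transport` never inspects the headers added by `network` or `link`. Either lower layer could change its internal header format without any change to the code above it, provided `send` and `receive` keep the same signatures.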
However, a layered architectural design requires significant upfront planning. Determining how to allocate functions within a protocol stack is not always straightforward. Saltzer et al. write, "In a system that includes communications, one usually draws a modular boundary around the communication subsystem and defines a firm interface between it and the rest of the system. When doing so, it becomes apparent that there is a list of functions each of which might be implemented in any of several ways: by the communication subsystem, by its client, as a joint venture, or perhaps redundantly, each doing its own version" [4].
For example, error detection and correction can be performed at the data link, network, or transport layer. Errors can be discovered faster if error detection is performed at a lower level (e.g., data link layer). On the other hand, one can detect and correct a larger set of errors at the network level. Be that as it may, relying only on network layer error detection can lead to unreliable end-to-end connections because faulty network components can introduce errors that the end hosts cannot detect. Alternatively, one might implement error detection at multiple layers. However, redundancy that results from poor planning can unnecessarily burden a system.
The end-to-end argument suggests that certain functionality can only be implemented at higher levels in the protocol stack. Saltzer et al. were the first to explicitly define this design concept:
"(Certain functions) can completely and correctly be implemented only with the knowledge and help of the application standing at the end points of the communication system. Therefore, providing the questioned function as a feature of the communication system itself is not possible. (Sometimes an incomplete version of the function provided by the communication system may be useful as a performance enhancement.)" [4].
To clarify the above statement, let me briefly review one of the famous examples [4] used by advocates of the end-to-end argument. Back in the early 1980s, researchers at MIT were using an internetwork that consisted of several local networks connected by gateways. The network used a packet checksum on the hops from one gateway to the next. It was assumed that transmission between hops was the main source of errors. Application developers were convinced that the network provided reliable transmission, and did not perform any additional error checking. Unfortunately, data was left unprotected while it was stored in the gateways. Saltzer et al. write: "One gateway computer developed a transient error in which while copying data from an input to an output buffer a byte pair was interchanged, with a frequency of about one such interchange in every million bytes passed. Over a period of time many of the source files of an operating system were repeatedly transferred through the defective gateway. Some of these source files were corrupted by byte exchanges, and their owners were forced to the ultimate end-to-end error check: manual comparison with and correction from old listings." In this example, checking for errors on each hop was clearly not sufficient. A more comprehensive end-to-end check was needed to guarantee error-free communication.
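The failure in this example can be reproduced in a few lines of Python. The sketch below rests on several assumptions: CRC-32 stands in for the per-hop packet checksum, SHA-256 stands in for an application-level check, the function names are hypothetical, and the gateway swaps a byte pair on every transfer rather than once per million bytes. For simplicity, it also assumes the digest itself reaches the receiver intact.

```python
import hashlib
import zlib


def link_send(data: bytes) -> bytes:
    """Hop-level protection: append a CRC-32 that covers the data on the wire."""
    return data + zlib.crc32(data).to_bytes(4, "big")


def link_receive(frame: bytes) -> bytes:
    """Verify and strip the hop-level checksum."""
    data, crc = frame[:-4], int.from_bytes(frame[-4:], "big")
    if zlib.crc32(data) != crc:
        raise ValueError("link checksum mismatch")
    return data


def faulty_gateway(frame: bytes) -> bytes:
    """Receive a frame correctly, corrupt it in memory, then forward it."""
    data = bytearray(link_receive(frame))    # arrives intact, checksum verified
    data[0], data[1] = data[1], data[0]      # byte pair swapped during buffer copy
    return link_send(bytes(data))            # a fresh checksum now covers bad data


def end_to_end_transfer(data: bytes):
    digest = hashlib.sha256(data).digest()   # computed by the sending application
    received = link_receive(faulty_gateway(link_send(data)))  # every hop check passes
    intact = hashlib.sha256(received).digest() == digest      # receiver's own check
    return received, intact


original = b"source file contents"
received, intact = end_to_end_transfer(original)
```

Every hop-level check succeeds, so no exception is raised, yet `intact` is false. Only the check performed by the end points covers the time the data spends inside the gateway.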
The User Datagram Protocol (UDP) and the Transmission Control Protocol (TCP) are examples of end-to-end protocols. That is, they are implemented at the end hosts and not inside the network infrastructure. Both protocols offer multiplexing functionality, which enables more than one process to use the network simultaneously [2]. However, UDP and TCP differ in the kind of service they offer to the application layer protocols.
UDP offers a connectionless, unreliable datagram service. It ensures that corrupted packets are not delivered to the application layer, but makes no effort to request a retransmission of a corrupted or missing segment. It assumes that the application layer will provide error correction if it is necessary.
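The "detect and drop" behavior can be sketched as follows. The checksum function implements the 16-bit ones' complement Internet checksum, which UDP uses. The `deliver` helper is hypothetical and simplified: a real UDP checksum also covers a pseudo-header and the UDP header, which are omitted here.

```python
from typing import Optional


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                            # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
    return ~total & 0xFFFF


def deliver(payload: bytes, checksum: int) -> Optional[bytes]:
    """Hand the payload to the application, or silently drop it if corrupted.

    Hypothetical helper: nothing is sent back to the sender, so any
    retransmission has to be arranged by the application itself.
    """
    if internet_checksum(payload) != checksum:
        return None
    return payload


good = deliver(b"hello", internet_checksum(b"hello"))
dropped = deliver(b"jello", internet_checksum(b"hello"))   # first byte corrupted in transit
```

Here `good` holds the payload and `dropped` is `None`. The receiver learns nothing about the lost datagram unless the application-level protocol notices the gap.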
TCP, on the other hand, offers a reliable, connection-oriented, byte-stream service. TCP assumes that the underlying network infrastructure is unreliable and could lose or duplicate segments during their transmission. TCP makes every effort to ensure that the application protocol receives its data in order, without duplicates, exactly as it was sent. In addition, TCP supports flow control and congestion control. Flow control ensures that the sender does not send so much data that it overruns the receiver's buffer. Congestion control ensures that the sender does not transmit more data into the network than the network can handle.
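The two limits combine in a simple way: at any moment the sender may have no more unacknowledged data outstanding than the smaller of the two windows. The sketch below shows only this rule. The function and parameter names (`cwnd` for the congestion window, `rwnd` for the receiver's advertised window) follow common usage but are assumptions here, and the algorithms that adjust the windows over time are left out.

```python
def sendable_bytes(cwnd: int, rwnd: int, in_flight: int) -> int:
    """Bytes the sender may transmit now.

    cwnd      -- congestion window: the sender's estimate of what the network can carry
    rwnd      -- receive window: buffer space advertised by the receiver
    in_flight -- bytes already sent but not yet acknowledged
    """
    window = min(cwnd, rwnd)           # both limits must be respected
    return max(0, window - in_flight)  # never negative if a window shrinks


# The receiver is the bottleneck: 4000-byte window, 1000 bytes outstanding.
limited_by_receiver = sendable_bytes(cwnd=8000, rwnd=4000, in_flight=1000)

# The network is the bottleneck and the window is already full.
limited_by_network = sendable_bytes(cwnd=2000, rwnd=4000, in_flight=2500)
```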
Unfortunately, early versions of TCP suffered from buggy and inefficient code, due in part to the increased protocol complexity. Many software developers opted instead to use UDP and write their own reliability operations at the application level. These customizations were possible due to the layered architecture of the TCP/IP protocol stack. TCP implementations improved with time and became more efficient and reliable. Most well-known application-level protocols, such as the Simple Mail Transfer Protocol (SMTP), the Hypertext Transfer Protocol (HTTP), and the File Transfer Protocol (FTP), use TCP to transfer data. In addition, applications that once used only UDP (e.g., NFS) now support TCP too [5].
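To give a sense of what "writing their own reliability operations" involves, here is a minimal stop-and-wait sketch in Python. It is an illustration under stated assumptions, not the scheme any particular application used: `LossyChannel` is a hypothetical stand-in for an unreliable datagram service that drops every n-th datagram, and sender and receiver run in one loop so the example stays self-contained.

```python
class LossyChannel:
    """Hypothetical unreliable datagram service: drops every n-th datagram."""

    def __init__(self, drop_every: int):
        self.drop_every = drop_every
        self.count = 0

    def deliver(self, datagram):
        self.count += 1
        if self.count % self.drop_every == 0:
            return None          # datagram lost in transit
        return datagram


def send_reliably(messages, channel, max_tries=5):
    """Stop-and-wait: retransmit each message until its acknowledgment arrives."""
    received = []                # what the receiving application sees
    expected_seq = 0             # receiver state: next sequence number wanted
    for seq, msg in enumerate(messages):
        for _ in range(max_tries):
            arrived = channel.deliver((seq, msg))
            if arrived is None:
                continue         # no ack will come: sender times out and retransmits
            rseq, rmsg = arrived
            if rseq == expected_seq:
                received.append(rmsg)    # new data, pass it up in order
                expected_seq += 1
            # A duplicate (rseq < expected_seq) is discarded but still acknowledged.
            ack = channel.deliver(("ack", rseq))
            if ack is not None:
                break            # sender saw the ack, move to the next message
        else:
            raise TimeoutError(f"message {seq} not acknowledged")
    return received


lost_data = send_reliably(["a", "b", "c"], LossyChannel(drop_every=3))
lost_acks = send_reliably(["a", "b", "c"], LossyChannel(drop_every=4))
```

With `drop_every=3` the losses fall on data datagrams, and with `drop_every=4` they fall on acknowledgments, which forces the receiver to discard duplicates. In both runs the application receives each message exactly once and in order. Even this small sketch needs sequence numbers, acknowledgments, retransmission, and duplicate suppression, which suggests why a mature TCP implementation became the easier choice.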
Because developers can leverage the existing functionality provided by TCP (or UDP), applications can be developed more quickly and easily. For example, HTTP remained a simple, lightweight protocol because it was built on top of TCP. It is interesting to note that HTTP/1.0 did not use TCP connections efficiently. Instead, the architects of HTTP decided to release a working version quickly and then refine its performance in a subsequent version. The flexibility of a layered architecture allows such development cycles. Moreover, Reed notes that "the e-mail and web infrastructure that permeates the world economy would not have been possible if they hadn't been built according to the end-to-end principle" [3].
The end-to-end argument can be applied to areas other than network system design. Transmeta has recently introduced a new family of microprocessors, named Crusoe, especially designed for mobile computing devices. The Crusoe microprocessors use conventional CMOS (Complementary Metal-Oxide Semiconductor) technology and are fully x86 compatible. Yet these microprocessors differ from conventional chips in their power management capabilities: they continuously adjust their clock speed and voltage to supply only the power that is required, thereby conserving battery life.
The Crusoe microchips are interesting not simply because of their functionality -- though their 700 MHz speed is indeed impressive. Rather, what is particularly noteworthy about these chips is the manner in which they were designed. The conventional approach to microchip design involves adding new features to the hardware. This approach has been followed by Intel, AMD, and others, leading to a Complex Instruction Set Computer (CISC) architecture [1], which has yet to deliver chips that are both high-performing and power-efficient. Transmeta took an alternative and more innovative approach to the problem: they implemented an end-to-end solution that included both hardware and software.
The chip itself is rather limited in its capabilities, and is encapsulated by a software layer, which expands upon its functions. This design permits a significant reduction in the number of power-consuming transistors on the actual chip. The software layer, called Code Morphing Software, implements all the necessary x86 functionality and provides an interface to the layers above it (i.e., the operating system and end-user applications). As a result, many highly sophisticated yet rarely used functions are not hardwired on the chip and instead are implemented at higher levels [6].
The proliferation of mobile devices and the introduction of efficient wireless communication protocols have placed new requirements on existing protocols. Current protocols must evolve to meet the needs of this new Internet. Although the networking community has not yet resolved a number of issues, the end-to-end argument may again prove helpful in creating efficient, reliable, and flexible open systems.
Consider, for example, the problem of providing reliable data delivery to the application layer in an Internet that includes wired and wireless networks. Today, wireless links tend to be found at the last hop of a communication channel, where a low-level error detection and correction mechanism may be sufficient to ensure reliable delivery. Wireless links are expected, though, to continue to proliferate and may eventually appear anywhere in an internetwork. In that case, reliability cannot be efficiently ensured via a hop-by-hop solution. An end-to-end mechanism is clearly needed.
One may argue that including TCP in the limited memory of a wireless phone may not be technically feasible or economically reasonable, and that a simpler transport protocol, like UDP, should be used instead. Or, perhaps, one can avoid the transport layer altogether. In light of the principles of layered system design and the end-to-end argument, one should be cautious about these approaches.
This article explored the benefits of the end-to-end argument in layered network architectures. The end-to-end argument is by no means an absolute rule, but it can lead to better systems when applied properly. When end-to-end reliability is needed, the end-to-end argument should be applied without further ado. Even when other approaches are tempting, network system designers should not overlook the power of this argument. Finally, the end-to-end argument applies to other system design areas, such as computer architecture, with remarkable results.