Draft for Comment

The Birth of the Internet: An Architectural Conception for Solving the Multiple Network Problem

by Ronda Hauben
rh120@columbia.edu

Section V of Computer Science and the Role of Government in Creating the Internet: ARPA/IPTO (1962-1986)

"The traditional method of routing information through the common-carrier switched network establishes a dedicated path for each conversation. With present technology, the time required for this task is on the order of seconds. For `voice communication,' that overhead time is negligible, but in the case of many short transmissions, such as may occur between computers, that time is excessive. Therefore, ARPA decided to build a new kind of digital communication system employing wideband leased lines and `message switching,' wherein a path is not established in advance and each message carries an address. In this domain the project portends a possible major change in the character of data communication services in the United States."

F.E. Heart, R.E. Kahn, S.M. Ornstein, W.R. Crowther and D.C. Walden, "The Interface Message Processor for the ARPA Computer Network", AFIPS Conference Proceedings, Vol 35, Spring Joint Computer Conference, 1970, pg. 551

"It is not enough to just draw pictures and show arrows. You have to figure out what kind of algorithms would you use and what information gets passed back and forth. It is one thing when you plug a socket into the wall and electrons flow; it is another thing when you have to figure out for every electron which direction it takes, and how many of them, how many per unit time; what happens if that machine is down, and these buffers are full....It wasn't quite theoretical, it was more of a system design issue, an architectural issue to a protocol design, in part. The whole field of protocols and architectures was really in its very early infancy back then...."

Robert Kahn, Babbage Institute Interview (Judy E. O'Neill), 24 April 1990

"The Internet as we now know it embodies a key underlying technical idea, namely of open architecture networks. In this approach, the choice of any individual network technology was not dictated by a particular architecture but rather could be selected freely by the provider and made to internetwork with the other networks through a meta-level `Internetworking Architecture'."

From Leiner, et al., A Brief History of the Internet, February 20, 1998
http://www.isoc.org/internet/history/brief.html

----------

I - Communication Science and Cybernetics as Prelude

In his book published in 1949, Norbert Wiener predicted that research done on communications and control theory during WWII would lead to important scientific breakthroughs in the postwar period. He writes:

Many perhaps do not realize that the present age is ready for a significant turn in the development toward far greater heights than we have ever anticipated. The point of departure may well be the recasting and unifying of the theories of control and communication in the machine and animal on a statistical basis. The philosophy of this is contained in my book Cybernetics.

Norbert Wiener, Extrapolation, Interpolation and Smoothing of Stationary Time Series, The Technology Press of Massachusetts Institute of Technology and John Wiley & Sons, 1949, pg. v.

How Wiener's prediction has proven true will be the subject of the latter part of this paper. But first it is important to review some of the early conceptual foundations of cybernetics to understand the developments we are seeing in the creation and spread of the Internet.
These are not isolated developments but a major contribution to the field of communications science and engineering which developed in the early post-WWII period. Wiener describes communications engineering as "the study of messages and their transmission, whether these messages be sequences of dots and dashes, as in Morse Code, or the teletypewriter, or sound-wave patterns, as in the telephone or phonograph, or patterns representing visual images as in telephoto service and television." (1)

Emphasizing the process of communication that is at the heart of communications engineering, Wiener writes:

The message to be transmitted is represented by some array of measurable quantities distributed in time. In other words, by coding or the use of voice or scanning, the message to be transmitted is developed into a time series. This time series is then subjected to transmission by an apparatus which carries it through a succession of stages, at each of which the time series appears by transformation as a new time series. These operations, although carried out by electrical or mechanical or other such means, are in no way essentially different from the operations computationally carried out by the time-series statistician with slide rule and computing machine.

Ibid.

Describing the essential nature of communications and of communications engineering, Wiener identifies the transmission of messages as the variable information that is to be transmitted. He writes: "The transmission of a single fixed item of information is of no communication value." (Ibid.) That is because, as Wiener observes, "We must have a repertory of possible messages and over this repertory a measure of the probability of these messages." (Ibid.)

Claude Shannon was instrumental in making the transition from treating the communication process as guesswork to putting the study of communication on a scientific footing. (2) In describing the parts of a communications process, Shannon provides a helpful diagram listing the essential components and their interrelationship. The components that Shannon includes are an information source, a transmitter, a communications channel, a receiver, and a destination. He describes how a message is introduced by an information source. A transmitter changes the message into a signal that can be communicated, transforming the message in the process. The signal is transmitted through a channel to a receiver. The receiver operates on the signal to restore the message so that it can be conveyed to its destination. However, noise can also be introduced into the channel and can affect the signal. Noise is defined as something statistical but unpredictable. Restoring the message from the signal in the presence of noise can be a difficult or even unachievable task.

Following is the diagram Shannon developed to represent the communication process:

    [information source] -> [transmitter] -> [channel] -> [receiver] -> [destination]
                                signal           ^       received signal
                                                 |
                                          [noise source]

Claude E. Shannon, "Communication in the Presence of Noise", Proceedings of the I.R.E., vol 37, pp. 10-21, January 1949.

Karl W. Deutsch, a social scientist, learned about communication theory from Norbert Wiener in the late 1940s. Deutsch wrote a chapter on communication theory in his book Nationalism and Social Communication. He enumerates how communication via different media, such as radio, television, or phonograph records, involves certain common aspects.
Identifying these common aspects is part of the field of communications engineering. Deutsch writes:

Communications engineering transfers information. It does not transfer events; it transfers a patterned relationship between events. When a spoken message is transmitted through a sequence of mechanical vibrations of the air and of a membrane, thence through electric processes in a broadcasting station and through radio waves, thence through electric and mechanical processes in a receiver and recorder to a set of grooves on the surface of a disk, and if finally played and made audible to a listener -- what has been transferred through this chain of processes, or channel of communication is something that has remained unchanged, invariant, over this whole sequence of processes. It is not matter, nor any one of the particular processes, nor any major amount of energy, since relays and electronic tubes make the qualities of the signal independent from a considerable range of energy inputs.

Karl W. Deutsch, Nationalism and Social Communication, The M.I.T. Press, Cambridge, MA, 1953, 1966, pg. 93.

Deutsch shows how the process of taking a photograph relies on a similar process of representing the pattern of events on a different medium. He writes:

The same applies to the sequence of processes from the distribution of light reflected from a rock to the distribution of chemical changes on a photographic film, and further, to the distribution of black and white dots on a printing surface, or the distribution of electric "yes" or "no" impulses in picture telegraphy or television. What is transmitted here is neither light rays nor shadows, but information, the patterns of relationships between them.

Ibid.

Deutsch also explains the concept of a "state description" of the event to be communicated, and then the concept of information as that which the state descriptions of an event on different media have in common. He writes:

In the second group of examples, we could describe the state of the rock in terms of the distribution of light and dark points on the surface. This would be a "state description" of the rock at a particular time. If we then describe the state of the film after exposure in terms of the distribution of the dark grains of silver deposited on it and of the remaining clear spaces, we should get another state description. Each of the two state descriptions would have been taken from a quite different physical object -- a rock and a film -- but a large part of these two state descriptions would be identical, whether we compared them point by point or by mathematical terms. There would again be a great deal of identity between these two descriptions and several others, such as the description of the distribution of black and white dots on the printing surface, or of the electric "yes" and "no" impulses in the television circuits, or of the light and dark points on the television screen. The extent of the physical possibility of transferring and reproducing these patterns corresponds to the extent that there is "something" unchanging in all the relevant state descriptions of the physical processes by which this transmission is carried on. That "something" is information -- those aspects of the state description of each physical process which all these processes had in common.

Ibid, pg 93-94.

The transmission of this "information" is at the heart of how communications is carried out and of the subject matter that communications engineers consider. (3)
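Shannon's chain of components can be made concrete in a short sketch. The following is a minimal illustration in Python; the bit-level encoding, the noise rate, and all function names are assumptions made for this example, not anything from Shannon's paper. It shows a message developed into a time series (here, bits), a channel that corrupts it statistically, and a receiver that attempts to restore the original message:

    import random

    def transmitter(message):
        # Develop the message into a time series of bits (the "signal").
        return [int(b) for ch in message.encode("ascii")
                for b in format(ch, "08b")]

    def channel(signal, noise_rate=0.01):
        # Noise is statistical but unpredictable: each bit may be flipped.
        return [bit ^ 1 if random.random() < noise_rate else bit
                for bit in signal]

    def receiver(signal):
        # Attempt to restore the message from the (possibly corrupted) signal.
        chunks = [signal[i:i + 8] for i in range(0, len(signal), 8)]
        return "".join(chr(int("".join(map(str, c)), 2)) for c in chunks)

    message = "a repertory of possible messages"       # information source
    restored = receiver(channel(transmitter(message)))  # destination
    print(restored)  # with noise present, this may differ from the original

With the noise rate set to zero the message is restored exactly; as the rate rises, extracting the message from the signal becomes unreliable, which is the problem the error control techniques discussed later in this paper address.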
In the process, noise or other disturbances can be introduced, making it more difficult to extract a message from the signal that has been transmitted. But being able to generalize the process by which communications occurs over diverse kinds of media means that it is possible to approach communications as a science rather than as guesswork.

II - Computers Need a Means of Communication

To understand the connection between the Internet and the scientific foundation it is built on, it is important to explore the early development of the Internet and the conception that gave it birth. Computers need a means of communication different from the circuit switching technology and the theory developed for telephone communication. While telephone technology is appropriate for the continuous flow of voice communication, it is not well matched to the bursty nature of computer data. The creation of the ARPANET was an effort to solve the problem of this mismatch. (4)

Voice transmission via the telephone is based on opening up a circuit, maintaining that circuit for the entire phone call, and then closing the circuit when the call ends. Computers, however, send data in finite spurts, with periods when no data is transmitted at all. Since computer communication occurs in such finite bursts of data, rather than as a continuous stream, the time spent setting up a circuit is expensive overhead. It is therefore inefficient to maintain a dedicated circuit for one user's computer communication, or to create a circuit as frequently as would be needed for data transmission, unless the setup and teardown can be done quickly relative to the transmission time. Instead of circuits, store and forward technology, which depends on relaying like that used for telegraphy, has been found to be a more appropriate choice. Breaking a message up into segments, or packets, is more in accord with how a computer stores, processes and then transfers data in finite length segments than the practice of creating a dedicated circuit for voice communication. (5)

The ARPANET was designed to demonstrate the concept of packet switching and to explore its use for resource sharing. It was developed by building a store and forward computer communications network on top of long haul telephone lines. At a symposium in 1970, Robert Kahn, who was responsible for the network design as part of the BBN IMP team, explains:

As most of you may know, the ARPA Computer Network is a distributed store and forward network that interconnects large time-sharing computers via interface message processors (IMPs) developed by Bolt, Beranek and Newman, Inc. (BBN) and wideband leased circuits supplied by the telephone company.... The implementation began about two years ago and is continuing today. As of December 1970, the ARPA Network contains 13 IMPs and 14 wideband circuits.

Robert Kahn, "Terminal Access to the ARPA Computer Network," Courant Computer Science Symposium, November 30 - December 1, 1970, in Computer Networks, Prentice Hall, Englewood Cliffs, 1972, pg. 148.

Since computer communication required overcoming certain basic technical limitations of telephone technology circa 1960-1970, there was a need to develop a whole new body of experience to learn how the results of communication science could be utilized for computer communications.
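The scale of the mismatch can be made concrete with a rough calculation. The figures below are illustrative assumptions (a circuit setup time on the order of seconds, as the Heart et al. quotation at the head of this paper notes, a 50 kilobit/second leased line like the ARPANET's, and one short burst of data); only the form of the comparison matters, not the exact numbers:

    # Illustrative comparison: per-message circuit setup vs. message switching.
    setup_time = 2.0       # seconds to establish a dedicated path (assumed)
    line_rate = 50_000     # bits/second, a 50 kilobit/second leased line
    burst_size = 8_000     # bits in one short computer-to-computer burst (assumed)

    transmission_time = burst_size / line_rate            # 0.16 seconds
    useful_fraction = transmission_time / (setup_time + transmission_time)
    print(f"useful fraction of line time: {useful_fraction:.1%}")  # about 7%

    # For a voice call lasting minutes, the same 2-second setup is negligible;
    # for bursty computer data it dominates.  Message switching avoids the
    # setup entirely: each message carries an address and is forwarded node
    # by node as capacity becomes available.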
The development of the ARPANET provided the vital experience in this process of creating a functioning computer communications network appropriate to the nature of the computer. Not only did the developing network provide a means of demonstrating a new form of communications technology; it also, and more importantly, provided the environment for developing this new form of technology and the concepts that would guide its development. Discussing this important aspect of ARPANET development, Kahn writes:

The network provides a means for studying store and forward data communication networks. As such, the net serves as a vehicle for experimentation, for measurement and for modeling. The network is also serving to clarify the issues that relate computation and communication.

Ibid., pg. 148

It was necessary to explore the nature of computation and of communication in designing a computer communications network. The researchers also had to understand the nature of the relationship between these two fields. Kahn observes that "the introduction of a message-switched distributed communications network is....a service offering of a radically different nature...." from the circuit switching used for the telephone system. (Kahn, pg. 149) This new technology was greeted by a hostile reaction from those involved with the development of telephone technology, a reaction similar to that of those involved with telegraphy following the introduction of the telephone. (6)

Kahn describes how the goal of sharing human and computer resources inspired those developing the ARPANET:

The ARPA Network was envisioned by many people as a step in the direction of achieving greater human-to-human communication by means of increased technical inspiration among the different participants in the network.

Ibid., pg. 148-149

The ARPANET was developed on a store and forward model. Describing this design, Frank Heart, Bob Kahn, Severo Ornstein, Will Crowther and Dave Walden, participants on the BBN IMP team, write:

In a nationwide computer network, economic considerations also mitigate against a wideband leased line configuration that is topologically fully connected. In a non-fully connected network, messages must normally traverse several network nodes in going from source to destination. The ARPA Network is designed on this principle and, at each node, a copy of the message is stored until it is safely received at the following node. The network is thus a store and forward system and as such must deal with problems of routing, buffering, synchronization, error control, reliability, and other related issues.

F.E. Heart, R.E. Kahn, S.M. Ornstein, W.R. Crowther and D.C. Walden, "The Interface Message Processor for the ARPA Computer Network", AFIPS Conference Proceedings, Vol 35, Spring Joint Computer Conference, 1970, pg. 551

The ARPANET architecture was built on a design that separated the Host computers in the network from a subnetwork of computer processors called Interface Message Processors (IMPs). By late 1969 a four-node test network was installed, and the questions to be answered were: What would be the relationship between the Hosts and the subnetwork? What functions would be assigned to each? (7) The responsibility for communication functionality was assigned to the IMP subnetwork, which was designed so that its "essential task is to transfer bits reliably from a source location to a specified destination." (8)
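The store and forward principle Heart et al. describe -- each node holding its copy of a message until it is safely received at the following node -- can be suggested in a brief sketch. The linear path, the node names, and the retry limit below are illustrative assumptions for this example, not the IMP design:

    # A minimal store-and-forward relay over an assumed path of nodes.  Each
    # node keeps its stored copy until the next node confirms safe receipt.

    def transmit(packet, src, dst):
        # Stand-in for a real circuit; here every transmission succeeds.
        print(f"{src} -> {dst}: {packet}")
        return True                     # i.e., dst acknowledged receipt

    def deliver(packet, path, max_tries=3):
        for node, next_node in zip(path, path[1:]):
            for attempt in range(max_tries):
                if transmit(packet, node, next_node):
                    break               # safe to discard the stored copy
            else:
                raise RuntimeError(f"link {node} -> {next_node} failed")

    deliver("HELLO", ["UCLA", "SRI", "UTAH"])  # hypothetical three-hop path

In the ARPANET this hop-by-hop reliability was the IMP subnetwork's responsibility; the division of labor with the Hosts is taken up next.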
To the Host computers was assigned the task of establishing a "Host-to-Host protocol and the enormous problem of planning to communicate between different computers." (9) This latter issue was separated from the problems of the design of the communications subnetwork.

Significant progress was made in sorting out the research issues and in gaining the experience needed to explore them. For example, tools were created to monitor the performance of the network. (10) Problems were also identified, like the one called "reassembly lockup" described in an article by Kahn and Crowther on flow control in the ARPANET. (11)

In 1970, the IMP team predicted that network growth would have an explosive effect on others around the world. They wrote:

On a more global level, we anticipate an explosive growth of message switched computer networks, not just for the interactive pooling of resources, but for the simple convenience and economies to be obtained for many classes of digital data communication. We believe that the capabilities inherent in the design of even the present subnet have broad application to other data communication problems of government and private industry.

Heart et al, pg. 566.

III - The Problem of Intercommunication of Dissimilar Packet Networks

By 1972, the impact of computer networking was creating a new challenge for those interested in computer communications: the problem that Kahn called the Multiple Network Problem. A number of researchers around the world, like Louis Pouzin in France and Donald Davies in Great Britain, were already exploring how to build packet switching networks that would conform to their national and local needs. The problem of how to communicate across the boundaries of dissimilar networks was on the horizon.

In the article "Resource-Sharing Computer Communications Networks", published in the Proceedings of the IEEE in November 1972, Kahn considered adjustments at the planning stage of the networks to make interconnection possible. (pg. 1407) This would require agreements at the design stage of the networks and could exclude those networks that had already been developed. The ARPANET, for example, had not been designed with the aim of communicating with other networks. The conception that gave birth to the ARPANET required all those interested in computer networking to become a component part of the ARPANET. The development of Cyclades in France and the NPL network in Great Britain, however, demonstrated that those designing packet switching networks had their own technical, administrative and political needs and goals to serve. It wasn't feasible for them either to become part of the ARPANET or to wait until a common plan for interconnection was decided upon before developing their networks.

It was becoming ever more urgent that those designing packet switching networks determine how to solve the problem of the interconnection of dissimilar networks. Since it was not appropriate to require all networks to await a common decision on the design parameters they would need to adopt to connect with other networks, nor to require that all new networks become a component part of the ARPANET, a different approach was needed. Recognizing the nature of the problem, Kahn proposed a technical solution.

Shortly after the successful ICCC '72 ARPANET demonstration, Kahn left BBN and went to work at the Information Processing Techniques Office (IPTO) at ARPA.
Joining IPTO as a program manager, Kahn set out to develop certain projects and also took over responsibility for one that had already been funded. A new initiative was to create a ground-based packet radio network. An existing initiative was to create a satellite-based packet switching network. (12)

Focusing on radio broadcast technology, Kahn led an effort to create a ground packet radio network. This kind of packet communications network was of particular interest to the Department of Defense, with its need for mobile communications. Kahn planned to build on the experience gained by researchers at the University of Hawaii who created AlohaNet. AlohaNet had demonstrated that packet radio technology was feasible for a one-hop system. (13) Kahn's objective was to create a multinode ground packet radio network (PRNET) where each node could be mobile. In parallel, he sought to create a packet satellite network (SATNET) utilizing Intelsat satellites.

Thinking about how to create a ground packet radio network, Kahn realized that it would be desirable (indeed necessary) for users on PRNET to be able to access the computational resources on the ARPANET. The packet satellite network was mainly intended for transit to European sites, but there was still the problem of connecting (in both directions) to computer resources there as well. How then to link up these three packet networks: two based on radio transmission, and the third on shared point-to-point leased lines from the telephone company?

At first, Kahn considered creating a local protocol for PRNET to make it possible to use the ARPANET Host-to-Host protocol, NCP (Network Control Protocol). However, the limitations of NCP meant that this would not be an option. These limitations included:

1) NCP addressing had no way of addressing Hosts on other networks.

2) NCP required a reliable IMP subnetwork to transmit packets to their destination. But other networks could not be counted on to provide reliable end-end communication.

3) Error control on an end-end basis was essential, but did not exist in NCP.

Thus dissimilar and possibly unreliable networks required an architecture and protocol design different from those that created the ARPANET. Once there are other networks, the challenge becomes: How to provide for their interconnection, and how to ensure end-end communication for the attached host computers? Is there a new architecture and protocol design which can support resource sharing across the boundaries of dissimilar networks? Simplifying the problem leads to the question: How to transmit computer data messages across dissimilar networks without requiring changes to the participating networks? What protocol design will allow for the diversity that will exist in the administrative, political, and technical aspects of dissimilar packet switching networks?

Before exploring this problem, it will be helpful to briefly discuss the role that protocols play in communications. Consider a situation where there is a desire to have communication between people in two different countries who speak different languages, language A and language B. For communication to take place, there must be a way to translate the meaning conveyed in language A for those who speak language B, and vice versa. ARPANET researchers had a similar problem to solve.
Their effort to connect dissimilar computers is in some ways like the situation of people who speak different languages and yet want to communicate. To understand the problem it is helpful to look at how a letter can be mailed across national and language boundaries. The postal services in the different countries do not have to agree on anything about the contents of the letter. But they do have to agree upon a format for the address on the letter. We could say that the addressing format is the convention agreed to as a protocol, making it possible to send a written letter across national and language boundaries.

In a similar way, it was necessary for the researchers working on the ARPANET to have a protocol that would allow them to send messages to dissimilar computers. These computers were made by different manufacturers according to different specifications. They used different operating systems, different programming languages, different character sets and different word sizes. Researchers from the different sites were given the task of creating a common protocol to make it possible to transmit messages from one computer to another over the ARPANET. (14) They formed the Network Working Group. The protocol they created is described in the article "Host-Host Communication Protocol in the ARPA Network" by Carr, Crocker, and Cerf. They called the protocol the Network Control Protocol (NCP). Programmers at each ARPANET site then had to write a program to embed the protocol in the operating system of the participating Host computers. The program was called the Network Control Program.

Along with the work done by the Host computers to transmit messages on the ARPANET was the role played by the IMP subnetwork. To transmit a message on the ARPANET, a Host computer sent the message to its nearby IMP. The local IMP broke the message into packets and transmitted the packets through the subnetwork to the IMP connected to the destination Host computer. The message was then reassembled by the destination IMP and delivered to the destination Host computer. With this division of functionality between the IMP subnetwork and the Host computers, the Host-to-Host protocol used for the ARPANET was more like a device driver than a protocol that specified communications functions. Communications functions include breaking a message into segments or packets, providing for flow control, routing the packets, error control, reassembling the message from the packets, and so forth. (15)

The ARPANET solved the difficult problem of communication in a network with dissimilar computers and dissimilar operating systems. However, when the objective is to share resources across the boundaries of dissimilar networks, the problems to be solved are compounded. Different networks mean that there can be different packet sizes to accommodate, different network parameters such as different communication media rates, different buffering and signaling strategies, different ways of routing packets, and different propagation delays. Dissimilar networks can also have different error control techniques and different ways of determining the status of network components. (16)

IV - Creating an Architecture for the Internet

Though Kahn originally considered the possibility of seeking changes to each of the constituent networks to solve the Multiple Network Problem, he soon recognized the advantage of an architecture that would directly accommodate the diversity of networks. (17)
To do so he conceived of a meta-level architecture independent of the underlying network technology. The means of achieving this was to design a protocol to be embedded in the operating system of Host computers on each participating network. The protocol would also specify how black boxes or "gateways" would interface between networks and how they would participate in routing packets through dissimilar networks.

Describing the thinking that went into solving the Multiple Network Problem, "A Brief History of the Internet" outlines the origin of the conception that Kahn would call open architecture networking. The article explains:

The Internet was based on the idea that there would be multiple independent networks of rather arbitrary design, beginning with the ARPANET as the pioneering packet switching network, but soon to include packet satellite networks, ground-based packet radio networks and other networks. The Internet as we now know it embodies a key underlying technical idea, namely that of open architecture networking. In this approach, the choice of any individual network technology was not dictated by a particular network architecture but rather could be selected freely by a provider and made to interwork with the other networks through a meta-level "Internetwork Architecture." Up until that time there was only one general method for federating networks. This was the traditional circuit switching method where networks would interconnect at the circuit level, passing individual bits on a synchronous basis along a portion of an end-to-end circuit between a pair of end locations. Recall that Kleinrock had shown in 1961 that packet switching was a more efficient switching method. Along with packet switching, special purpose interconnection arrangements between networks were another possibility. While there were other limited ways to interconnect different networks, they required that one be used as a component of the other, rather than acting as a "peer" of the other in offering end-to-end service.

Barry M. Leiner, Vinton G. Cerf, David D. Clark, Robert E. Kahn, Leonard Kleinrock, Daniel C. Lynch, Jon Postel, Larry G. Roberts, and Stephen Wolff, A Brief History of the Internet, pg. 4. http://www.isoc.org/internet/history/brief.html

To create an environment where the networks would be peers of each other, rather than one a component of the other, a protocol had to be designed to embody this open architecture concept. Such a protocol would make it possible to communicate across the boundaries of dissimilar packet switching networks. The challenge of accommodating dissimilar networks is at once a conceptual and an architectural problem. Kahn recognized the need for a communications protocol to transmit packets from one network and reformat them as needed for transmission through successive networks. This would require black boxes or gateway computers, with software that would provide the interfaces between the dissimilar networks and would route the packets to their destination. (18) There would also need to be software to carry out the functions required by the protocol. Appropriate software modules, and perhaps other modifications to allow efficient performance, would have to be embedded in the operating systems of the host computers in each of the participating networks, and gateways would have to be introduced between them.
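The role of such a gateway can be suggested in a short sketch. Everything below is an illustrative assumption rather than the actual gateway design: a two-part (network, host) address, so that Hosts on other networks can be named at all (the capability NCP lacked), a table of per-network maximum packet sizes, and a gateway that reformats packets, here by fragmenting them, as it passes them toward the next network:

    # Hypothetical per-network limits on packet data size (units arbitrary).
    NET_MAX = {"PRNET": 2000, "SATNET": 1000, "ARPANET": 8000}

    def gateway(packet, next_net):
        # Reformat a packet for the next network, fragmenting if its data
        # exceeds that network's maximum size.  The (network, host) address
        # travels with every fragment so later gateways can keep routing it.
        limit = NET_MAX[next_net]
        data, dest = packet["data"], packet["dest"]
        return [{"dest": dest, "offset": i, "data": data[i:i + limit]}
                for i in range(0, len(data), limit)]

    message = {"dest": ("ARPANET", 42), "data": "X" * 2500}  # assumed format
    for fragment in gateway(message, "SATNET"):
        print(fragment["dest"], fragment["offset"], len(fragment["data"]))

Consistent with the ground rules discussed below, a gateway of this kind retains no information about the flows of packets passing through it.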
The design for such a protocol would be a guide to creating the specification standard for the software and hardware that each network would agree to implement to become part of an internetwork communications system. The standards, or agreements to cooperate, would be set out in the protocol.

The research creating the ARPANET had developed the conception of networking protocols and the need for such protocols. Robert Metcalfe is the inventor of Ethernet, the most widespread technology used for local area networking. In his PhD thesis, he reviews the technical experience gained from developing the ARPANET and ALOHANET, and describes the role of protocols in developing computer networking. He writes:

The ways in which processes organize their (local and remote) cooperation are called "protocols". We use the word to refer to a set of agreements among communicating processes relating to (1) rendezvous (who and where), (2) format (what and how), and (3) timing (when) of data and control exchanges.

Robert M. Metcalfe, Packet Communication, Peer-to-Peer Communication, Inc., San Jose, 1996, pg. 100.

Metcalfe notes what these areas include:

(...) at least four problem areas in which protocol agreements must be made: (1) routing, (2) flow, (3) congestion, and (4) security.

Ibid.

An internetworking protocol would need to be a communications protocol. As such, it would specify the software and hardware to do flow control and error checking, to break a message into packets in the sending Host computers, to provide for packet reassembly in the destination Host computers, to provide a means of addressing computers on other packet networks, and to carry out other needed functions. The protocol would also specify the role of, and software for, the gateways.

Metcalfe enumerates some of the issues that Kahn had identified in creating the architecture for the Internet and the protocol that would make an Internet possible. Metcalfe writes:

Among these issues were optimal packet and message size, message fragmentation and reassembly, flow and congestion control, naming, addressing, and routing, store-and-forward delay, error control, and the texture of interprocess communication.

Metcalfe, pg. xx

Before he left BBN in 1972, Kahn had written a memo setting out his thinking about a communications-oriented set of operating system principles, titled "Communications Principles for Operating Systems". (19) Metcalfe refers to the memo as influential in his thinking about protocol development.

Elaborating the notion of open architecture, the authors of "A Brief History of the Internet" write:

In an open architecture network, the individual networks may be separately designed and developed and each may have its own unique interface which it may offer to users and/or other providers, including other Internet providers. Each network can be designed in accordance with the specific environment and user requirements of that network. There are generally no constraints on the types of network that can be included or on their geographic scope, although certain pragmatic considerations will dictate what makes sense to offer.

Leiner et al, pg. 4

The ground rules Kahn worked out to guide the creation and design of an open architecture environment include:

o Each distinct network would have to stand on its own and no internal changes could be required to any such network to connect it to the Internet.

o Communication would be on a best effort basis. If a packet didn't make it to the final destination, it would shortly be retransmitted from the source.
o Black boxes would be used to connect the networks; these would later be called gateways and routers. There would be no information retained by the gateways about the individual flows of packets passing through them, thereby keeping them simple and avoiding complicated adaptation and recovery from various failure modes.

o There would be no global control at the operations level.

Leiner et al.

All of these ground rules represent a significantly different conceptual approach from that used on the ARPANET. The ARPANET required any computer system sharing resources with other computers on its network to become a component part of it. Communications on the IMP subnetwork were via dedicated logical links, and once a packet was sent to the IMP subnetwork its transmission was guaranteed by an error-free transmission system. The IMPs carried out the interface function between the communications subnetwork and the Host computers. The IMP subnetwork was a complex rather than a simple system.

The ground rules for an open architecture network environment are such that all networks are welcome to join in the interconnection on a peer basis, rather than one as a component of the other. Messages are to be broken into packets, and the packets retransmitted until there is an acknowledgement of their successful transmission; this simplifies the error detection process and provides the beginning of a flow control mechanism. Black boxes are to be used as gateways, but their functions are to be limited so they can be kept simple. No one entity is to be allowed to establish control at the operational level of the participating networks.

Communications theory provides a conceptual model for understanding the ground rules Kahn set out. It models the transmission of messages by breaking the messages up and transmitting them via some transmission medium such as radio waves, electrical wires, satellites, or telephone wires. The signals created in this process are transmitted over a communication channel such as air, wire, and so forth. During this process signals can be combined with noise. Then the receiver tries to extract the message from the signal.

The Shannon communication model puts the burden for communication on the means of transmitting a message by putting it into a format that is appropriate for the communication channel. The receiver extracts the message from the signal at the other end of the communication link. The Shannon model thus separates the transmitting and receiving functions from the message. Similarly, messages can be considered independently of the technology being used for their transmission. In his conception of the architecture for the Internet, Kahn put the burden for message transmission on the protocol. The protocol would specify how the message would be transmitted and received, rather than requiring changes in the design of the participating networks. All a network would need to do to become part of the communications system would be to interface with one or more gateways already connected to the system. Instead of an architecture like the ARPANET's, where each new computer system had to get permission to connect, the open architecture environment supports a cooperative relationship among participating networks.
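The retransmit-until-acknowledged discipline in these ground rules, and the sliding window flow control discussed below, can be suggested in a brief sketch. The window size, the loss rate, and the go-back-to-the-oldest-unacknowledged-packet policy are illustrative assumptions for this example, not the mechanism of any particular protocol:

    import random

    # Best-effort delivery: packets may be lost, and the source retransmits
    # each one until the destination's acknowledgement (the feedback signal)
    # is received.  A window limits how many packets are outstanding at once.

    def send_message(packets, window=4, loss_rate=0.2):
        base = 0                            # oldest unacknowledged packet
        while base < len(packets):
            for seq in range(base, min(base + window, len(packets))):
                delivered = random.random() > loss_rate  # unreliable network
                print(f"packet {seq}: {'acked' if delivered else 'lost'}")
                if not delivered:
                    break                   # resend from base on next pass
                if seq == base:
                    base += 1               # positive feedback: slide window
        print("all packets acknowledged; message can be reassembled")

    send_message([f"segment {i}" for i in range(6)])

The feedback loop is the cybernetic element the following discussion points to: acknowledgements arriving from the destination let the window slide forward, and their absence holds the sender back.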
The Internet architecture makes it possible to accommodate the political, administrative and technical dissimilarity of the participating networks. Both the communications model developed by Shannon and the communications model of the Internet provide for an information source, a transmitter, a channel, a receiver, and the reconstitution of the message. The difference is that the message in Shannon's model has to be extracted from the signal, and at times the noise introduced in the channel makes it impossible to extract the message from the signal. In the Internet model, the message is broken into segments, called packets, and the packets are transmitted until there is an acknowledgement that they have been successfully received and reassembled.

The problem of noise in the Shannon model is not solved in the Internet model. However, an error detection process using a feedback mechanism is generally adequate for the accurate transmission of messages. Cybernetics has contributed the ability to continue the transmission of the segments of a message until there is confirmation of successful transmission. And a sliding window mechanism, as sketched above, forms the basis for a flow control mechanism that adjusts itself to the feedback received about the successful transmission of packets. More packets are sent when there is feedback that packets are successfully being received by the destination Host computers; fewer packets are transmitted when feedback about successful transmission is not received.

The creation of a functioning computer communications system that makes communication and resource sharing possible among an ever growing system of interconnected networks is a significant advance in communications science, and an important conceptual achievement. The Internet is a worldwide communications system demonstrating that the successful transmission of messages in the form of computer data is indeed theoretically and practically feasible. The Internet, the achievement of 40 years of scientific research applying the science of communication theory to the new phenomenon of computer data communications, is a precious gift that has been presented to the people of the world. This is the kind of surprising achievement that Norbert Wiener predicted would be possible for those who would build on the achievements of communications scientists in the post-World War II period.

How this development has been achieved, how a protocol was designed to embody an internetwork architecture, and how PRNET and SATNET were developed and linked with the ARPANET will be the subject of the next section of this paper.

(to be continued)

----------------

Footnotes

1) Norbert Wiener, Extrapolation, Interpolation and Smoothing of Stationary Time Series, The Technology Press of Massachusetts Institute of Technology and John Wiley & Sons, 1949, pg. 2.

2) See chapter 6, "Cybernetics, Time-sharing, Human-Computer Symbiosis and Online Communities: Creating a Supercommunity of Online Communities", in Netizens: On the History and Impact of Usenet and the Internet, IEEE Computer Society Press, Los Alamitos, 1997, pg. 76-95. Also see Claude Elwood Shannon, Collected Papers, edited by N.J.A. Sloane and Aaron D. Wyner, IEEE Press, 1992.
3) Deutsch also describes the process by which noise is introduced into the communications process and the problem it presents to the communications engineer. He writes:

To the extent that the last state description in such a sequence differs from the first, information has been lost or distorted during its passage through the channel. This amount of lost information can be measured. We can measure it in very refined ways, as in telephone or television engineering, where a message is broken up into very many electric impulses or image points. The percentage of the impulses or image points at the other end is measured on a statistical basis, and their significance is evaluated in terms of the change each of them makes in the probability distribution of the picture which is already there. Or, we can measure it in simpler terms by breaking up a message into a few simple parts, and asking how many of these parts were transmitted within a given minimum standard of accuracy, and how drastically the probability of the picture at the other end was changed by the absence of the pieces which were lost. Refined or crude, more accurate or less, each of these methods would give us some quantitative measure of the fidelity of a communications channel in comparison with other channels. By either technique, we may derive a measure for the "efficiency" of a channel, as well as of the relative efficiency or "complementarity" of any parts or stages of the channel in relation to others. Other measures for the performance of a communications system, or for the complementarity of its parts, would be the speed at which information could be transmitted, or the range of different kinds of information that could be carried. Common to all of these approaches would be the fact that patterns of information can be measured in quantitative terms. They can be described in mathematical language, analyzed by science, or transmitted or processed on a practical industrial scale.

Ibid., pg. 94.

4) See also: "The traditional method of routing information through the common-carrier switched network establishes a dedicated path for each conversation. With present technology, the time required for this task is on the order of seconds. For `voice communication,' that overhead time is negligible, but in the case of many short transmissions, such as may occur between computers, that time is excessive. Therefore, ARPA decided to build a new kind of digital communication system employing wideband leased lines and `message switching,' wherein a path is not established in advance and each message carries an address. In this domain the project portends a possible major change in the character of data communication services in the United States." F.E. Heart, R.E. Kahn, S.M. Ornstein, W.R. Crowther and D.C. Walden, "The Interface Message Processor for the ARPA Computer Network", AFIPS Conference Proceedings, Vol 35, Spring Joint Computer Conference, 1970, pg. 551.

5) Kahn, Gronemeyer, Burchfiel, and Kunzelman describe the good match between packet switching and computers. They write:

Packet switching was originally designed to provide efficient network communication for "bursty" traffic and to facilitate computer network resource sharing. It is well known that the computer traffic generated by a given user is characterized by a very low duty cycle in which a short burst of data is sent or received followed by a longer quiescent interval after which additional traffic will again be present.
The use of dedicated circuits for this traffic would normally result in very inefficient usage of the communication channel. A packet of some appropriate size is also a natural unit of communication for computers. Processors store, manipulate and transfer data in finite length segments, as opposed to indefinite length streams. It is therefore natural that these internal segments correspond to the computer generated packets, although a segment could be sent as a sequence of one or more packets. Computer resource sharing techniques which exploit the capabilities inherent in packet communications are still primarily in the research stage, but significant progress has already been achieved in this area.

From Robert E. Kahn, Steven A. Gronemeyer, Jerry Burchfiel, Ronald C. Kunzelman, "Advances in Packet Radio Technology," Proceedings of the IEEE, Vol 66, no. 11, November 1978, p. 1468.

6) Robert Kahn, "Terminal Access to the ARPA Computer Network," Courant Computer Science Symposium, November 30 - December 1, 1970, in Computer Networks, Prentice Hall, Englewood Cliffs, 1972, pg. 148.

7) In their article "The Interface Message Processor for the ARPA Computer Network", the researchers from the BBN IMP team wrote: "This paper discusses the design of the subnet and describes the hardware, the software, and the predicted performance of the IMP. The issues of Host-to-Host protocol and network utilization are barely touched upon; these problems are currently being considered by the participating Hosts and may be expected to be a subject of technical interest for many years to come." (pg. 552)

The researchers elaborate: "The basic notion of a subnet leads directly to a series of questions about the relationship between the Hosts and the subnet. What tasks shall be performed by each? What constraints shall each place on the other? What dependence shall the subnet have on the Hosts? In considering these questions, we were guided by the following principles: (1) The subnet should function as a "communications system" whose essential task is to transfer bits reliably from a source location to a specified destination. Bit transmission should be sufficiently reliable and error free to obviate the need for special precautions (such as storage for retransmission) on the part of the Hosts; (2) The average transit time through the subnet should be under a half second to provide for convenient interactive use of remote computers; (3) The subnet operation should be completely autonomous. Since the subnet must function as a store and forward system, an IMP must not be dependent upon its local Host. The IMP must continue to operate whether the Host is functioning properly or not and must not depend upon a Host for buffer storage or other logical assistance such as program reloading. The Host computer must not in any way be able to change the logical characteristics of the subnet; this restriction avoids the mischievous or inadvertent modification of the communication system by an individual Host user; (4) Establishment of Host-to-Host protocol and the enormous problem of planning to communicate between different computers should be an issue separated from the subnet design." AFIPS Conference Proceedings, Vol 35, Spring Joint Computer Conference, 1970, pg. 552-553.

8) Ibid., pg. 552

9) Ibid., pg. 553

10) The researchers write: "Because the network is experimental in nature, considerable effort has been allocated to developing tools whereby the network can supply measures of its own performance.
The operational IMP program is capable of taking statistics on its own performance on a regular basis; this function may be turned on and off remotely. The various kinds of resulting statistics, which are sent via the network to a selected Host for analysis, include "snapshots", ten-second summaries, and packet arrival times. Snapshots are summaries of the internal status of queue lengths and routing information. A synchronization procedure allows these snapshots, which are taken every half second, to occur at roughly the same time in all network IMPs; a Host receiving such snapshot messages could presumably build up an instantaneous picture of overall network status. Ten-second summaries include such IMP-generated statistics as the number of processed messages of each kind, the number of retransmissions, the traffic to and from the local Host, and so forth; this statistical data is sent to a selected Host every ten seconds. In addition, a record of actual packet arrival times on modem lines allows for the modeling of line traffic. (As part of its research activity, the group at UCLA is acting as a network measurement center; thus, statistics for analysis will normally be routed to the UCLA Host.)

Perhaps the most powerful capability for network introspection is "tracing". Any Host message sent into the network may have a "trace bit" set in the leader. Whenever it processes a packet from such a message, the IMP keeps special records of what happens to that packet -- e.g., how long the packet is on various queues, when it comes in and leaves, etc. Each IMP that handles the traced packet generates special trace report messages that are sent to a specified Host; thus, a complete analysis of what has happened to that message can be made. When used in an orderly way, this tracing facility will aid in understanding at a very detailed level the behavior of routing algorithms and the behavior of the network under changing load conditions."

Ibid., pg. 557

11) Kahn and Crowther write: "This flow control technique is reliable when the destination IMP has a sufficiently large amount of reassembly buffer storage to hold arriving packets on all the links in use. However, when a limited amount of reassembly space is available, the subnet buffer storage can become filled if messages are sent into the net on a sufficient amount of links destined for a given host. Equivalently, the subnet buffers will become filled with backed-up messages if a sufficient number of messages arrive for a host (or hosts) at a faster rate than the host is accepting them. Deadlock conditions are known to be possible in systems that involve competition for limited resources, and precautions must be taken to prevent their occurrence (9)(10). A type of deadlock called "reassembly lockup" can occur in the subnet when reassembly space is unavailable to store incoming multipacket messages.... A simulation program that models an early version of the ARPANET was used to obtain some quantitative measures of this aspect of system performance. The simulation showed the occurrence of reassembly lockup for the simple case of eight-packet traffic on many links between a pair of hosts, as well as for other traffic patterns involving multipacket traffic among more than two hosts...."

Robert E. Kahn and William R. Crowther, "Flow Control in a Resource-Sharing Computer Network," in Proc. of the Second ACM IEEE Symposium on Problems in the Optimization of Data Communication System, Palo Alto, California, October 1971, pg. 541-542.
12) There was a third project, to develop a security system for networking relying on end-to-end security techniques, that Kahn also took on during this early period of his work at IPTO in early 1973. See Kahn, Babbage Institute Interview (Judy E. O'Neill), April 1990.

13) See N. Abramson, "The ALOHA SYSTEM - Another Alternative for Computer Communications," FJCC, 1970, pgs. 281-285. And Franklin F. Kuo and Norman Abramson, "Some Advances in Radio Communication for Computers," Proceedings of Compcon 73, IEEE Institute of Electrical and Electronics Engineers, 1973, pgs. 57-60.

14) See the description of the creation of this protocol in Chapter 7, "Behind the Net: The Untold Story of the ARPANET and Computer Science", in Netizens: On the History and Impact of Usenet and the Internet, IEEE Computer Society Press, Los Alamitos, CA, 1997, pg. 96-114. Also see C. Stephen Carr, Stephen D. Crocker and Vinton G. Cerf, "Host-Host communication protocol in the ARPA network", Proceedings of the AFIPS SJCC, 1970, pgs. 589-597.

15) Some helpful articles include, for example: F.E. Heart, R.E. Kahn, S.M. Ornstein, W.R. Crowther and D.C. Walden, "The Interface Message Processor for the ARPA Computer Network", AFIPS Conference Proceedings, Vol 35, Spring Joint Computer Conference, 1970, pg. 551-567; Robert E. Kahn and William R. Crowther, "Flow Control in a Resource-Sharing Computer Network," in Proc. of the Second ACM IEEE Symposium on Problems in the Optimization of Data Communication System, Palo Alto, California, October 1971, pp. 539-546; Robert E. Kahn, "Resource-Sharing Computer Communications Networks", Proceedings of the IEEE, November 1972, pg. 1397-1407; Robert Kahn, "Terminal Access to the ARPA Computer Network," Courant Computer Science Symposium, November 30 - December 1, 1970, in Computer Networks, Prentice Hall, Englewood Cliffs, 1972, pg. 147-166; Howard Frank, Robert Kahn, and Leonard Kleinrock, "Computer Communication Network Design - Experience with Theory and Practice", AFIPS Conference Proceedings, Vol. 40, Spring Joint Computer Conference, 1972, pg. 255-270.

16) See Vinton G. Cerf and Robert E. Kahn, "A Protocol for Packet Network Intercommunication", IEEE Transactions on Communications, vol Com-22, No. 5, May 1974, pg. 637.

17) Reference to be added.

18) In the article by Robert E. Kahn and William R. Crowther, "Flow Control in a Resource-Sharing Computer Network," there is a reference to a private communication from Licklider suggesting the possibility of a software interface: "A partition of flow control responsibility between IMPs and the hosts may increase the overall reliability of network operation. In general, however, the inherent operation of the interface between the host and the IMP need not be considered to be partitioned. A combined structure for the interface, which suggests a truly software interface, was pointed out by Licklider." See Proc. of the Second ACM IEEE Symposium on Problems in the Optimization of Data Communication System, Palo Alto, California, October 1971, pg. 540.

19) See R. Kahn, "Communications Principles for Operating Systems", internal BBN memorandum, January 1972.

Some Notes for an Appendix

Host - In the ARPANET, a time-sharing or batch processing computer.

Long Lines - The IMPs are interconnected by leased 50 kilobit/second circuits from AT&T.

Network - The collection of Hosts, IMPs and circuits forms the message switched resource sharing network.
Error Control - "Errors are primarily caused by noise in the communication circuits and are handled most simply by error detection and retransmission between each pair of IMPs along the transmission path." (On the ARPANET's means of error control, from "Computer Communication Network Design - Experience with Theory and Practice", SJCC, 1972, pg. 260.)

--------------

Last updated April 30, 2000
version 1.07

part I   http://www.columbia.edu/~rh120/other/arpa_ipto.txt
part II  http://www.columbia.edu/~rh120/other/basicresearch.txt
part III http://www.columbia.edu/~rh120/other/centers-excellence.txt
part IV  http://www.columbia.edu/~rh120/other/computer-communications.txt
part V   http://www.columbia.edu/~rh120/other/birth_internet.txt