Some Principles of the Internet and A Simple Explanation of the Functioning of the Internet Via an Introduction to TCP/IP

Jay Hauben
jrh29@columbia.edu

A) Some Principles of the Internet

Twenty-five years ago there was no Internet. Today more than 120,000 packet switching networks with very many different characteristics, interconnecting more than 50,000,000 computers as nodes, comprise this communications system used by over 150,000,000 people worldwide. Yet the Internet is still young. It is likely to keep expanding for many years to come. There are many aspects of Internet technology, such as the assumed unreliability at the internetwork level, that are unique and that distinguish it from other telecommunications network technologies like the telephone system. Also, the scaling of the Internet to meet the expected increase in demand for its use is in no way assured. Therefore, it is a system worth studying.

Packet switching networks appeared in the 1970s as a consequence of the development in the 1960s of the time-sharing mode of computer operation. Greater efficiency in the use of stand-alone computers was achieved when computer processing time was parceled out in round-robin fashion. Where processing time had formerly been parceled out in batches to one user's job at a time (called batch processing), new operating systems were designed that could offer each user a set of small time slots, one at a time in turn, creating the successful illusion that each of the simultaneous users was the sole user. The operation of such systems suggested that two time-sharing computers could be connected, each appearing to the other as just another user. A cross-country hookup of such systems was attempted in 1965 using slow telephone lines. The result was a success for long-distance time-sharing computer networking, but the call setups and teardowns created time delays that were unacceptable for actual use of such a network. The problem was that computer data is often bursty or is a message of minimal size, as when a single keystroke is sent to solicit a response. Therefore computer data communication over normal telephone lines required frequent call setups or wasteful quiet times.

A solution suggested by queueing theory and other lines of reasoning was packet switching as opposed to circuit switching. Data to be communicated from a number of sessions could be broken into small packets which would be transmitted interspersed, each routed to its destination separately without a path being set up for each packet. Queueing theory, especially the work of Leonard Kleinrock, predicted that interspersed demands utilizing common resources would be efficient.
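To make the packet idea concrete, here is a minimal sketch in Python. It is illustrative only and not anything from the early networks themselves; the field names, packet size and session data are invented. It shows each session's data being cut into small, self-describing pieces that can be interleaved with another session's pieces on a shared line.

    from itertools import zip_longest

    def packetize(session_id, data, max_payload=4):
        """Break one session's data into small, self-describing packets."""
        packets = []
        for seq, start in enumerate(range(0, len(data), max_payload)):
            packets.append({
                "session": session_id,   # which conversation this piece belongs to
                "seq": seq,              # position within that conversation
                "payload": data[start:start + max_payload],
            })
        return packets

    # Two sessions share one line; their packets are interspersed, so neither
    # session holds the line idle while the other has nothing to send.
    line = [p for pair in zip_longest(packetize("A", "SINGLE KEYSTROKE"),
                                      packetize("B", "BULK FILE DATA"))
            for p in pair if p is not None]

    for pkt in line:
        print(pkt)

Because each packet carries its own addressing and sequence information, no line has to be reserved for any one conversation, which is the efficiency that queueing theory predicted.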
Packet switching experimentation was initiated in Europe and the US in the early 1970s. Best known of the early packet switched computer networks were the ARPANET in the US, Cyclades in France, and the National Physical Laboratory network in the UK. The ARPANET designers and researchers succeeded in achieving resource sharing among time-shared computers manufactured by different vendors and using different operating systems, character sets, etc. The computers were located at universities and military-related research laboratories. The ARPANET was funded and encouraged by the Advanced Research Projects Agency (ARPA), a civilian agency of the US Department of Defense. ARPA also funded and encouraged packet switching experimentation using ground-based radio receivers and transmitters and using satellites.

Encouraged by the success of the ARPANET, commercial networks like Tymnet and Telenet were established. In Europe a number of packet switched network experiments were undertaken. Just as isolated time-shared computers had suggested networking, so too the existence of isolated packet switching networks suggested some sort of interconnection. Robert Kahn in the US and Louis Pouzin in France were among the first to consider what needed to be done to create such a meta-network of networks. Pouzin developed the concept of a Catenet, and Kahn at ARPA developed the Internetting Project. The goal of the Catenet concept and the Internetting project was to develop an effective technology to interconnect the packet switched data networks that were beginning to emerge from the experimental stage. Both rejected the alternative of integrating all networks into a single unified system spanning many physical media. The latter might have produced better integration and performance, but it would have limited the autonomy and continued experimental development of the new network technologies. Also, the developing networks were under different political or economic administrations, and it is not likely they could have been enticed to give up their autonomy and voluntarily join together as parts of a single network.

Kahn had been involved in trying to solve a problem of great complexity: could a ground-based packet radio network be developed that would allow even mobile transmitters and receivers? The complexity was that radio communication is prone to fading, interference, obstruction of line-of-sight by local terrain, or blackouts, as when a vehicle travels through a tunnel. The radio link is in itself unreliable for data communication. Crucial, therefore, to the success of such a packet radio network would be an end-to-end mechanism that could call for retransmissions and employ other techniques so that a reliable communication service could be provided despite the unreliability of the underlying link level.

Pouzin had worked on the time-sharing experiments at MIT in the 1960s. He was impressed by the successful way individual users were 'networked' on a single time-sharing computer and then by how these computers themselves were networked. He looked for the essence of packet switching networks to find the clue to how they could be interconnected. He saw many features which were not mandatory to packet switching, such as virtual circuits, end-to-end acknowledgments, large buffer allocations, etc. He felt that any end-to-end function which users might desire could be implemented at the user interface. The Catenet need only provide a basic service: packet transport.

How then to achieve an effective interconnection of packet switched networks? If the interconnection was to include packet radio networks, the resulting internet would have at least some unreliable links. Should packet radio networks and others that could not offer reliable network service be excluded? Kahn's answer was that the new interconnection should be open to all packet switching and even other data networks. That was the first principle of the Internet that was to emerge: open architecture networking --- the interconnection of as many current and future networks as possible by requiring the least possible from each. Each network would be based on the network technology dictated by its own purpose and achieved via its own architectural design.
Networks would not be federated into circuits that formed a reliable end-to-end path, passing individual bits on a synchronous basis. Instead, a new "Internetworking Architecture" would view networks as peers in helping offer an end-to-end service independent of path and of the unreliability or failure of any links.

"Four ground rules were critical to Kahn's early thinking:

* Each distinct network would have to stand on its own and no internal changes could be required to any such network to connect it to the Internet.

* Communications would be on a best effort basis. If a packet didn't make it to the final destination, it would shortly be retransmitted from the source.

* Black boxes would be used to connect the networks; these would later be called gateways and routers. There would be no information retained by the gateways about the individual flows of packets passing through them, thereby keeping them simple and avoiding complicated adaptation and recovery from various failure modes.

* There would be no global control at the operations level."

(from "A Brief History of the Internet" at http://www.isoc.org/internet/history/brief.html)

Pouzin and his colleagues developed similar ground rules and applied them in the development of the Cyclades network and its interconnection with the National Physical Laboratory (NPL) in London in August 1974, with the European Space Agency (ESA) in Rome in October 1975, and with the European Informatics Network (EIN) in June 1976. They were the first to implement a packet service which did not assume any interdependence between packets. Each packet was treated as a separate entity, moving from source to destination according to the conditions prevailing at each moment of its travel. Dynamic updating of routing at the gateways, and retransmissions because of congestion or link or node failures, sometimes caused packets to arrive at their destinations out of order, duplicated, or missing from a sequence. The gateways were programmed to make an effort to keep the packets moving toward their destinations, but no guarantee of success was built into them. Such a best effort transmission service is called a datagram service.

In the past, out of sequence packets, packet duplication and packet loss were considered at least a burden if not serious problems, so communication switches were designed to prevent them. The French team succeeded in producing transport layer mechanisms to rectify these events. In that way they brought substantial simplicity, cost reduction and generality to the service that their gateways provided. This was a second Internet principle: do as much as possible above the internetwork level. This came to be called the end-to-end principle. It provided for successful communication under almost any condition except the total failure of the whole system. Another way to state this principle is that the only information about a communication session (state information) would be kept at the end points. Intermediate failures could not destroy such information, and communication disrupted by such failures could be continued when the packets began to arrive again at the destination.

In October 1972, Kahn had organized a large public demonstration of the ARPANET at the International Computer Communications Conference (ICCC72) in Washington, DC. This was the first international public demonstration of packet switching network technology. Researchers were there from Europe, Asia, and North America.
At that meeting, an International Network Working Group (INWG) was established to share experiences and to be a forum to help work out standards and protocols. In 1973-74 it was adopted by the international professional organization for information processing, the International Federation for Information Processing (IFIP), as its Working Group 6.1 (IFIP WG 6.1). Donald Davies from the UK, Pouzin and Kahn knew of each other's work and the work of others who were considering these problems by attending and presenting papers at meetings of IFIP WG 6.1 and by sharing their work with each other on a regular basis. This is an early example of a long tradition in the networking world of openness and collaboration. It was to become a third principle of the Internet: open and public documentation and standards and protocol development.

In 1973, Kahn brought Vinton Cerf into the work on internetting. The ARPA project gave rise to a proposed general solution to the internetting problem, with specifications for what was needed in common on the end computers and the gateways so that the interconnection would be successful. The set of such specifications is called a communication protocol. The protocol at the time was called the Transmission Control Protocol (TCP). Cerf and Kahn first shared their thinking in a formal way at a meeting of INWG members in Brighton, England in September 1973, at a conference sponsored by NATO. What emerged was a reliable, sequenced data stream delivery service provided at the end points despite the unreliability of the underlying internetwork level. But the first implementation resulted only in a virtual circuit internetwork service. For some network services such virtual circuits were too restrictive. At the time, Danny Cohen, who was working on packet voice delivery, argued that TCP functionality should be split between what was required end-to-end, like reliability and flow control, and what was required hop-by-hop to get from one network to another via gateways. Cohen felt packet voice needed timeliness more than it needed reliable delivery. This led to the reorganization of the original TCP into two protocols: the simple Internet Protocol (IP), which provided only for addressing, fragmentation and forwarding of individual packets, and a separate TCP concerned with recovery from lost packets. This brought the internetting work into line with the success of the Cyclades datagram service.

A major boost to the use of what became known as TCP/IP was its adoption by the US Department of Defense (DOD). The DOD funded work that incorporated TCP/IP into modifications of the Unix operating system being made at the University of California at Berkeley. When this version was distributed, much of the computer science community in the US and around the world began to have TCP/IP capability built into their operating systems. This was a great boost for broad adoption of the Internet. It is also another example of the principle of free and open documentation, in this case source code. In 1983 the DOD required all users of the ARPANET to adopt TCP/IP, further ensuring that it would be broadly implemented.

A key element of the design of IP is the capability at each gateway to break packets too large for the next network into fragments, each a datagram in its own right, that will fit in that network's frames. These fragments then travel along as ordinary datagrams until they are reassembled at the destination host.
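A simplified sketch of this fragmentation step follows, in Python. It is illustrative only: real IP carries an identification field, a more-fragments flag and offsets counted in 8-octet units, and the field names, addresses and sizes below are invented for clarity.

    def fragment(datagram, next_network_mtu):
        """Split a datagram's payload into pieces small enough for the next network.
        Each fragment is a datagram in its own right; reassembly happens only at
        the destination host, never at an intermediate gateway."""
        payload = datagram["payload"]
        if len(payload) <= next_network_mtu:
            return [datagram]                  # it fits: forward it unchanged
        fragments = []
        for offset in range(0, len(payload), next_network_mtu):
            piece = payload[offset:offset + next_network_mtu]
            fragments.append({
                "src": datagram["src"],
                "dst": datagram["dst"],
                "id": datagram["id"],          # lets the destination group the pieces
                "offset": offset,              # where this piece belongs in the whole
                "more": offset + len(piece) < len(payload),
                "payload": piece,
            })
        return fragments

    big = {"src": "129.77.19.140", "dst": "192.0.2.7", "id": 1, "payload": "X" * 1400}
    for frag in fragment(big, 576):            # the next network accepts only small frames
        print(frag["offset"], frag["more"], len(frag["payload"]))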
By allowing for fragmentation, IP makes it possible for networks that handle large packets and networks that handle small packets to coexist on the same Internet. This is an example of applying the open architecture principle. Allowing fragmentation relieves the necessity of specifying a minimum or a maximum packet size (although in practice such limits do exist). Leaving the reassembly until the destination minimizes the requirements on the gateways/routers. Schemes that would eliminate fragmentation from future versions of IP should be carefully scrutinized, because they may render obsolete under-resourced networks that could not adapt to the mandated packet sizes. That would violate the open architecture principle.

From one point of view, that of the most value for the whole of society, the highest-order feature a communications system can provide is universal connectivity. This has been, up until the present, the guiding vision and goal of the Internet pioneers. Leonard Kleinrock has argued that "as the system resources grow in size to satisfy an ever increasing population of users" gains in efficiency occur (Queueing Systems, Volume II, p. 275). This is an example of the law of large numbers, which suggests that the more resources and users there are, the more sharing there is. This results in a greater level of efficient utilization of resources without increased delays in delivery. So far the scaling of the Internet has conformed to the law of large numbers, and the Internet provides such a convenient and efficient communications system that its users use it more than the other communications systems available to them. The desire for connectivity also grows with the Internet's growth, as does its value, since each new connection adds connectivity for those who have already been connected as well. This is an example of a regenerative system.

In its first 25 years (1973-1998) the Internet has grown to provide communication to 2.5% of the world's people. This is a spectacular technical and social accomplishment. But much of the connectivity is concentrated in a few parts of the world (North America, Europe and parts of Asia). The web of the Internet's connectivity is also still sparse, even in North America. Often there are too few alternative paths, so that even where total bandwidth is sufficient the communication service available has uncomfortably long delays. In my opinion the top priority for the Internet technical community is to find ways of continuing the growth and scaling of the connectivity provided by the Internet. But the Internet is a very complex technology. To achieve the necessary further scaling, the Internet will require a large pool of well-supported, talented and highly educated scientists and engineers who have studied the principles and unique features of the Internet. They will need to work collaboratively, online and off, to hold each other to the principles as they seek solutions to current and future problems. Then the Internet has a chance of reaching the goal of universal connectivity.

B) A Simple Explanation of the Functioning of the Internet Via an Introduction to TCP/IP

I. Introduction

The Internet as we know it in 1998, although vast, is still a new and developing communications technology. It is based on a number of ingenious engineering accomplishments, first of which is the Transmission Control Protocol and Internet Protocol suite, known as TCP/IP. The elements that comprise the Internet are computers and networks of computers.
Being physical entities, these require, in order to perform reliably, careful design based on solid engineering principles. The Internet itself is more than the sum of its elements. It too requires careful and evolving design, based on principles similar to those for computers and networks and on some unique to the Internet.

II. The Internet

The Internet is the successful interconnecting of many different networks to give the illusion of being one big computer network. What the networks have in common is that they all use packet switching technology, or at least can carry packets of data from one computer to another. On the other hand, each of the connected networks may have its own addressing mechanism, packet size, speed, etc. Any computer on the connected networks, no matter what its operating system or other characteristics, can communicate via the Internet if it has software implemented on it that conforms to the set of protocols which resulted from open research funded by the Advanced Research Projects Agency (ARPA) of the United States Department of Defense in the late 1970s. That set of protocols is built around the Internet Protocol (IP) and the Transmission Control Protocol (TCP). Informally, the set of protocols is called TCP/IP (pronounced by saying the names of the letters, T-C-P-I-P).

The Internet Protocol is the common agreement to have software on every computer on the Internet add a bit of additional information to each of the packets that it sends out. Without such software a computer cannot be connected to the Internet, even if Internet traffic passes over the network that the computer is attached to. A packet that has the additional information required by IP is called an IP datagram. To each IP datagram the computer adds its own network addressing information. The whole package is called a network frame. It is network frames containing IP datagrams, rather than ordinary packets, that a computer must send onto its local packet switching network in order to communicate with a computer on another network via the Internet. If the communication is between computers on the same network, the network information is enough to deliver the frame to its intended destination computer. If the communication is intended for a computer on a different network, the network information directs the frame to the closest computer that serves to connect the local network with a different network. Such a special-purpose computer is called a router (sometimes a gateway). It is such routers that make internetworking possible.

The Internet is not a single giant network of computers. It is over one hundred thousand networks interconnected by routers. A router is a high speed, electronic, digital computer very much like all the other computers in use today. What makes a router special is that it has all the hardware and connections necessary to be able to connect to and communicate on two or more different networks. It also has the software to create and interpret network frames for each network it is attached to. In addition it must have the capabilities required by IP. It must have software that can remove network information from the network frames that come to it and read the IP information in the datagrams. Based on the IP information it can add new network information to create an appropriate network frame and send it out on that other network.
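The encapsulation and re-wrapping just described can be sketched as follows. This Python sketch uses invented field names rather than the real IP or link-layer header formats; it only shows a packet gaining IP information to become a datagram, the datagram gaining local addressing to become a network frame, and a router re-wrapping the unchanged datagram for the next network.

    def make_ip_datagram(payload, src_ip, dst_ip):
        """Add the IP information that lets a packet travel between networks."""
        return {"src_ip": src_ip, "dst_ip": dst_ip, "payload": payload}

    def make_network_frame(datagram, frame_src, frame_dst):
        """Wrap the datagram in the local network's own addressing."""
        return {"frame_src": frame_src, "frame_dst": frame_dst, "data": datagram}

    datagram = make_ip_datagram("hello", src_ip="129.77.19.140", dst_ip="192.0.2.7")

    # On the local network the frame is addressed either to the destination itself
    # (same network) or to the nearest router (different network).
    frame = make_network_frame(datagram, frame_src="host-a", frame_dst="router-1")

    # A router strips the frame, reads only the IP information, and re-wraps the
    # unchanged datagram in a new frame for the next network it is attached to.
    forwarded = make_network_frame(frame["data"], frame_src="router-1", frame_dst="router-2")
    print(forwarded)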
But how does a router know where to send an IP datagram? The entire process of Internet communication requires that each computer participating in the Internet have a unique digital address. The unique addresses of the source and destination are part of the IP information added to packets to make IP datagrams. The unique number assigned to a computer is its Internet Protocol or IP address. The IP address is a binary string of 32 digits. Therefore the Internet can provide communication among 2 to the 32nd power, or about 4 billion 300 million, computers (two unique addresses for every three people in the world). Internet addresses are written, for example, like 129.77.19.140. Each such address has two parts, a network ID and a host ID. In this example 129.77 (the network ID) identifies that this computer is part of a particular university network and 19.140 (the host ID) identifies which particular computer it is.

A router's IP software examines the IP information to determine the destination network from the network ID of the destination address. Then the software consults a routing table to pick the next router to send the IP datagram to, so that it takes the "shortest" path. A path is short only if it is active and not congested. Ingenious software programs called routing daemons send and receive short messages among adjacent routers characterizing the condition of each path. These messages are analyzed and the routing table is continually updated. In this way IP datagrams pass from router to router, over different networks, until they reach a router connected to their destination network. That router puts into the network frame the network information that delivers the datagram to its destination computer. The IP datagram is unchanged by this whole process. Each router has placed the IP datagram, along with next-router information, into the next network frame. When the IP datagram finally reaches its destination, it carries no record of how it got there, and different packets from the same source may have taken different paths to the same destination.

IP as described above requires nothing of the interconnected networks except that they are packet switching networks with IP-compliant routers. If a transmitting network uses a very small frame size, the IP software can even fragment an IP datagram into a few smaller ones to fit the network's frame size. It is this minimum requirement by the Internet Protocol that makes it possible for a great variety of networks to participate in the Internet. But this minimum requirement also results in little or no error detection. IP arranges for a best-effort process but has no guarantee of reliability. The remainder of the TCP/IP set of protocols adds a sufficient level of reliability to make the Internet useful.

There are problems that IP does not solve. For example, interspersed network frames from many computers can sometimes arrive faster than a router can route them. A small backlog of data can be stored on most routers, but if too many frames keep arriving some must be discarded. This possibility was anticipated. On most computers on the Internet, other than routers, software behaving according to the Transmission Control Protocol (TCP) is installed. When IP datagrams arrive at the destination computer, the TCP-compliant software scans the IP information put into the IP datagrams at the source. From this information the software can put the packets, if they are all there, back together again. If there are duplications, the software will discard all but the first copy of each such packet to have arrived.
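A simplified sketch of this destination-side work follows. Real TCP numbers the individual bytes of a stream and uses a sliding window; the sketch below uses whole-packet sequence numbers, which is enough to show duplicates being discarded, data being put back in order, and a gap being detected.

    def reassemble(arrivals, expected_count):
        """Discard duplicates, deliver the in-order prefix, and report what is missing."""
        received = {}
        for pkt in arrivals:
            if pkt["seq"] in received:
                continue                       # duplicate: keep only the first copy
            received[pkt["seq"]] = pkt["payload"]
        deliverable = []
        for seq in range(expected_count):      # deliver only up to the first gap;
            if seq not in received:            # later data waits for a retransmission
                break
            deliverable.append(received[seq])
        missing = [seq for seq in range(expected_count) if seq not in received]
        return "".join(deliverable), missing

    arrivals = [                               # out of order, one duplicate, one lost
        {"seq": 1, "payload": "CK"},
        {"seq": 0, "payload": "PA"},
        {"seq": 1, "payload": "CK"},
        {"seq": 3, "payload": "TS"},
    ]
    data, missing = reassemble(arrivals, expected_count=4)
    print(data, "still missing:", missing)     # -> PACK still missing: [2]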
But what if some IP datagrams have been lost? As a destination computer receives data, the TCP software sends a short message back over the Internet to the original source computer specifying what data has arrived. Such a message is called an "acknowledgment". Every time the TCP and IP software send out data, the TCP software starts a timer (sets a number and decreases it periodically using the computer's internal clock) and waits for an acknowledgment. If an acknowledgment arrives first, the timer is canceled. If the timer expires before an acknowledgment is received back, the TCP software retransmits the data. In this way missing data can usually be replaced at the destination computer in a reasonable time.

To achieve efficient data transfer the timeout interval cannot be preset. It needs to be longer for more distant destinations and for times of greater network congestion, and shorter for closer destinations and times of normal network traffic. TCP automatically adjusts the timeout interval, and the size of its sliding window, based on the rate of acknowledgments it receives back. This ability to dynamically adjust the timeout interval contributes greatly to the success of the Internet.

Having been designed together and engineered to perform two separate but related and needed tasks, TCP and IP complement each other. IP makes possible the travel of packets over different networks, but neither it nor the routers are concerned with data loss or data reassembly. The Internet is possible because so little is required of the intervening networks. TCP makes the Internet reliable by detecting and correcting duplications, out of order arrival and data loss, using an acknowledgment and timeout mechanism with dynamically adjusted timeout intervals.
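A simplified sketch of such an adaptive timer follows. Production TCP implementations use a more refined calculation (tracking the variance of the round-trip time as well as its average); the smoothing constant and the safety multiplier below are illustrative assumptions, not the standard values.

    class RetransmitTimer:
        def __init__(self, initial_rtt=1.0, alpha=0.875, multiplier=2.0):
            self.srtt = initial_rtt        # smoothed round-trip-time estimate (seconds)
            self.alpha = alpha             # weight given to past measurements
            self.multiplier = multiplier   # safety factor above the estimate

        def record_ack(self, measured_rtt):
            """Fold each new round-trip measurement into the running estimate,
            so the timeout tracks current network conditions."""
            self.srtt = self.alpha * self.srtt + (1 - self.alpha) * measured_rtt

        def timeout(self):
            return self.multiplier * self.srtt

    timer = RetransmitTimer()
    for rtt in (0.8, 0.9, 2.5, 2.4):       # congestion sets in; round trips lengthen
        timer.record_ack(rtt)
        print(f"estimate {timer.srtt:.2f}s, retransmit after {timer.timeout():.2f}s")

As the acknowledgments slow down, the estimate and the timeout grow, so the sender waits longer before retransmitting instead of flooding an already congested path.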
III. Conclusion

The Internet is a wonderful engineering achievement. Since January 1, 1983, the cutoff date for the old ARPANET protocols, TCP/IP technology has successfully dealt with tremendous increases in usage and in the speed of the connecting computers. This is a testament to the success of the TCP/IP protocol design and implementation process. Douglas Comer highlighted the features of this process as follows:

* TCP/IP protocol software and the Internet were designed by talented dedicated people.

* The Internet was a dream that inspired and challenged the research team.

* Researchers were allowed to experiment, even when there was no short-term economic payoff. Indeed, Internet research often used new, innovative technologies that were expensive compared to existing technologies.

* Instead of dreaming about a system that solved all problems, researchers built the Internet to operate efficiently.

* Researchers insisted that each part of the Internet work well in practice before they adopted it as a standard.

* Internet technology solves an important, practical problem; the problem occurs whenever an organization has multiple networks.

(from The Internet Book)

The high speed, electronic, digital, stored program controlled computer and the TCP/IP Internet are major historic breakthroughs in engineering technology. Every such breakthrough in the past, like the printing press, the steam engine, the telephone and the airplane, has had profound effects on human society. The computer and the Internet have already begun to have such effects, and this promises to be just the beginning. In the long run, despite the growing pains and dislocations, every great technological breakthrough serves to make possible a more fulfilling and comfortable life for more people. The computer and the Internet have the potential to speed up this process, although it may take a hard fight for most people to experience any of the improvement. We live, however, in a time of great invention and great potential.

The TCP/IP Internet is a major historical achievement. It provides human society with a new global communications technology with great promise and potential. This Internet has sustained unprecedented growth both in the number of its users and in the volume of messages it handles daily. In the 15 years since the cutover from the NCP ARPANET to the TCP/IP Internet, the Internet has proven itself founded on solid principles. But there can be setbacks and false steps. As proposals for further development of the Internet are made, it would be proper to expect that they reaffirm and build on the proven principles. But there is, for example, research currently being undertaken to "make IP more reliable." Since the principle of minimal requirements on component networks is IP's strength, such research, if implemented, would be a fundamental change for the Internet. By forgoing reliability, IP has made possible the interconnection of the most diverse of networks. To require greater reliability at the IP level could impose undue conformity on the component networks. That would be a backwards step. As today's Internet is developed and improved, the principles of TCP and IP will in all likelihood play crucial roles in that development.

---------------------------------------------------------------

Bibliography

Carpenter, B. RFC 1958: Architectural Principles of the Internet. June 1996.

Cerf, Vinton G. and Robert Kahn. "A Protocol for Packet Network Intercommunication". IEEE Transactions on Communications, Vol. COM-22, No. 5. May 1974.

Cerf, Vinton G. IEN 48: The Catenet Model for Internetworking. July 1978. http://lwp.ualg.pt/htbin/ien/ien48.html

Clark, David D. "The Design Philosophy of the DARPA Internet Protocols". Proceedings of SIGCOMM '88, ACM CCR Vol. 18, No. 4. August 1988.

Comer, Douglas E. Internetworking with TCP/IP, Vol. I: Principles, Protocols, and Architecture, 2nd Edition. Englewood Cliffs, NJ. Prentice Hall. 1991.

Comer, Douglas E. The Internet Book: Everything You Need to Know about Computer Networking and How the Internet Works. Englewood Cliffs, NJ. Prentice Hall. 1995.

Davies, D.W., D.L.A. Barber, W.L. Price and C.M. Solomonides. Computer Networks and Their Protocols. Chichester. John Wiley & Sons. 1979.

Hauben, Michael and Ronda Hauben. Netizens: On the History and Impact of Usenet and the Internet. Los Alamitos, CA. IEEE Computer Society Press. 1997.

Kleinrock, Leonard. Queueing Systems, Volume II: Computer Applications. New York. John Wiley and Sons. 1976.

Leiner, Barry M., et al. "A Brief History of the Internet" at http://www.isoc.org/internet/history/brief.html

Lynch, Daniel C. and Marshall T. Rose, Editors. Internet System Handbook. Reading, MA. Addison-Wesley. 1993.

Pouzin, Louis. "A Proposal for Interconnecting Packet Switching Networks". Proceedings of EUROCOMP. Brunel University. May 1974. Pages 1023-36.

Pouzin, L., Ed. The Cyclades Computer Network. Amsterdam. North Holland. 1982.

Stevens, W. Richard. TCP/IP Illustrated, Vol. 1: The Protocols. Reading, MA. Addison-Wesley. 1994.

-----------------------------------------------------------------