Internet tutorial

The educational technology and digital learning wiki

Introduction

Learning goals
  • Learn about Internet infrastructure and its major components
Prerequisites
  • None
Moving on
Level and target population
  • Anyone who is interested in what "Internet" means
Remarks
  • This is just a "know that" tutorial. It should help people distinguish between the service and infrastructure layers.

“The Internet is a standardized, global system of interconnected computer networks that connects millions of people. The system uses the Internet Protocol Suite (TCP/IP) standard rules for data representation, signaling, authentication, and error detection. It is a network of networks that consists of millions of private and public, academic, business, and government networks of local to global scope that are linked by copper wires, fiber-optic cables, wireless connections, and other technologies. The Internet carries a vast array of information resources and services, most notably the inter-linked hypertext documents of the World Wide Web (WWW) and the infrastructure to support electronic mail, in addition to popular services such as video on demand, online shopping, online gaming, exchange of information from one-to-many or many-to-many by online chat, online social networking, online publishing, file transfer, file sharing and Voice over Internet Protocol (VoIP) or teleconferencing, telepresence person-to-person communication via voice and video.” (Wikipedia retrieved 18:15, 27 August 2009 (UTC)).

RFC 1941 (Frequently Asked Questions for Schools) written in 1996 (!) defines the Internet in the following way: “The Internet is a large and rapidly growing worldwide network comprised of smaller computer networks, all linked by a common protocol, that enables computers of different types to exchange information. The networks are owned by countless commercial, research, government, and education organizations and individuals. The Internet allows the almost 5 million computers and countless users of the system to collaborate easily and quickly either in pairs or in groups. Users are able to discover and access people and information, distribute information, and experiment with new technologies and services. The Internet has become a major global infrastructure used for education, research, professional learning, public service, and business.” (retrieved 18:15, 27 August 2009 (UTC)).

Technically speaking and much simplified: “The Internet ('Net) is a network of networks. Basically it is made from computers and cables. What Vint Cerf and Bob Kahn did was to figure out how this could be used to send around little "packets" of information. As Vint points out, a packet is a bit like a postcard with a simple address on it. If you put the right address on a packet, and gave it to any computer which is connected as part of the Net, each computer would figure out which cable to send it down next so that it would get to its destination. That's what the Internet does. It delivers packets - anywhere in the world, normally well under a second. [...] Lots of different sort of programs use the Internet: electronic mail, for example, was around long before the global hypertext system I invented and called the World Wide Web ('Web). Now, videoconferencing and streamed audio channels are among other things which, like the Web, encode information in different ways and use different languages between computers ("protocols") to provide a service.” (TBL FAQ, retrieved 18:15, 27 August 2009 (UTC)).

Often people confuse the Internet with the World Wide Web. The Internet is a global hardware and software infrastructure that provides connectivity between computers. In contrast, the Web is just one of the services communicated via the Internet, i.e. the collection of interconnected documents and other resources, linked by hyperlinks and URLs.

A short history of the Internet

The Internet is now almost 40 years old. “The origins of the Internet reach back to the 1960s when the United States funded research projects of its military agencies to build robust, fault-tolerant and distributed computer networks. This research and a period of civilian funding of a new U.S. backbone by the National Science Foundation spawned worldwide participation in the development of new networking technologies and led to the commercialization of an international network in the mid 1990s,...” (Wikipedia retrieved 18:15, 27 August 2009 (UTC))

The sixties - networking architecture

Licklider (1960) wrote "Man-Computer Symbiosis": “Man-computer symbiosis is an expected development in cooperative interaction between men and electronic computers. It will involve very close coupling between the human and the electronic members of the partnership. The main aims are 1) to let computers facilitate formulative thinking as they now facilitate the solution of formulated problems, and 2) to enable men and computers to cooperate in making decisions and controlling complex situations without inflexible dependence on predetermined programs.”

  • 1962: Invention of modern networking architecture (packets): A message can be broken down into packets, each structured like: sender - receiver - message
  • 1969: First trial of Arpanet (the future Internet)

In December 1969, the first version of Arpanet (the future Internet) went online. It connected four computers at four sites (UCLA, Stanford Research Institute, UCSB, and the University of Utah). Bob Kahn of BBN (Cambridge, MA), the company that built the network's packet switches, played a leading role in the system design.

  • Telnet (TELecommunication NETwork): Remote connection to another computer over an internet-like network. Telnet (as well as more modern secure versions) is still being used to remotely log into servers. In addition, because HTTP (the web protocol) is text-based, a Telnet client can be used to talk to a web server by hand.
The seventies
The first Internet protocols, but also mail and forums as a general idea.
  • 1971: First versions of the File Transfer Protocol (FTP), revised in 1980 and 1985. Still popular (but consider using SFTP or SCP instead, since FTP is inherently insecure).
  • 1972: Ray Tomlinson (BBN) created the first e-mail program
  • 1973: TCP, Transmission Control Protocol: According to Vint Cerf's FAQ: “During 1973, we developed the concepts underlying the Internet and prepared a preliminary paper in September of that year that we presented to the International Network Working Group (INWG). In December 1974 the first full draft of TCP was produced.”
  • 1978: TCP/IP Protocol - Addition of IP, the Internet Protocol: TCP/IP, the main technical pillar of the Internet, emerged in mid-to-late 1978 in nearly final form and was standardized in 1981 (RFC 791 and RFC 793). “The Internet protocol suite is the set of communications protocols that implements the protocol stack on which the Internet and many commercial networks run. It is also referred to as the TCP/IP protocol suite, which is named after two of the most important protocols in it: the Transmission Control Protocol (TCP) and the Internet Protocol (IP), which were also the first two networking protocols defined.”
The 80’s - coexistence of several different networks
  • UseNet (Unix), BitNet (DEC/IBM), BBS (dial-up servers run from home), etc.
The nineties - emergence of the web as the dominant Internet service
  • 1991: Gopher. The University of Minnesota developed Gopher, which is named after the university's mascot and is also a pun on "go fer". Gopher was a user-friendly server that allowed administrators to build menus to access local or remote files and services (e.g. phone directories, library interfaces).
  • 1989-1992: Tim Berners-Lee et al. at CERN invented the WWW, the World-Wide Web, an Internet service that relies on HTTP and HTML (proposed in 1989, publicly announced in 1991)
  • 1995: The world discovers the WWW. The Internet goes commercial. Also, Microsoft enters the game.
  • 1998: XML, a simple but fairly universal markup language, was defined. Soon after, the first XML-based networking application standards emerged (e.g. SOAP and XML-RPC). Today, there exist hundreds of XML "languages", including a few dozen major ones like XHTML.
  • 2007: 500 million connected computers, hundreds of protocols and services

An overview of the Internet architecture

The Internet is a complex infrastructure of hardware and software. “The responsibility for the architectural design of the Internet software systems has been delegated to the Internet Engineering Task Force (IETF). The IETF conducts standard-setting work groups, open to any individual, about the various aspects of Internet architecture. Resulting discussions and final standards are published in a series of publications each of which is called a Request for Comment (RFC), freely available on the IETF web site. The principal methods of networking that enable the Internet are contained in specially designated RFCs that constitute the Internet Standards.” (Wikipedia, retrieved 18:15, 27 August 2009 (UTC))

Figure: TCP/IP stack operating on two hosts connected via two routers, and the corresponding layers used at each hop

Internet protocols and methods are grouped in four layers (RFC 1122):

  1. the Application Layer,
  2. the Transport Layer,
  3. the Internet Layer, and
  4. the Link Layer.

Other standards organizations and textbooks define layers for networking technology in a different way. The Open Systems Interconnection (OSI) Reference Model, for instance, distinguishes between 7 layers which, from top to bottom, are the Application, Presentation, Session, Transport, Network, Data-Link, and Physical Layers. (OSI Model, Wikipedia).
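
The idea of layering can be illustrated with a toy sketch. It assumes that each layer simply puts a label in front of whatever the layer above hands down. The bracketed labels and the function names `encapsulate` and `decapsulate` are invented for this illustration and are not real header formats.

```python
# Toy illustration of encapsulation in the four-layer model (RFC 1122).
# The bracketed "headers" are invented labels, not real protocol headers.

LAYERS = ["Application", "Transport", "Internet", "Link"]  # top to bottom


def encapsulate(payload: str) -> str:
    """Wrap the payload with one made-up header per layer, top to bottom."""
    data = payload
    for layer in LAYERS:
        data = f"[{layer}]{data}"
    return data


def decapsulate(frame: str) -> str:
    """Strip the headers in reverse order, as the receiving host would."""
    data = frame
    for layer in reversed(LAYERS):
        header = f"[{layer}]"
        if not data.startswith(header):
            raise ValueError(f"expected {layer} header")
        data = data[len(header):]
    return data


frame = encapsulate("GET /index.html")
print(frame)               # the link-layer label ends up outermost
print(decapsulate(frame))  # the original payload comes back
```

On the sending host the data travels down the stack and gains one header per layer. On the receiving host the headers are removed in the opposite order.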

The link layer

The link layer can be decomposed into a physical and a data link layer.

(1) The Physical Layer is composed of various sorts of cables and wireless channels that connect computers through Network Interface Cards (NICs) and in most cases through other hardware like repeaters, hubs, bridges, switches and routers. The physical layer transmits raw bits as opposed to logical data packets.

(2) The data link layer transfers data between adjacent network nodes in a wide area network or between nodes on the same local area network segment.

Local area networks (LANs) cover a relatively small physical area (e.g. a group of buildings). LANs typically use Ethernet networking technology, which defines a variety of wiring and signaling standards. Each connected device is given a 48-bit Media Access Control (MAC) address that is used to specify both the destination and the source of each data packet. Each computer's networking card should have a unique identifier. Ethernet exists in several speed grades: Fast Ethernet (100 Mb/s, 1995) may still be around in older buildings or areas; Gigabit Ethernet (since 1998) is used in modern buildings, between buildings and organizations, and in the backbones of most networks; 10 Gigabit Ethernet (2002) is used between major modern hubs; and 100 Gigabit Ethernet (2006) is the most recent.
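
To make the "48-bit address" concrete, here is a small sketch that converts between the usual written form of a MAC address (six hexadecimal bytes) and the underlying number. The helper names `parse_mac` and `format_mac` and the example address are made up for illustration.

```python
# Sketch: a MAC address is a 48-bit number, usually written as six hex bytes.


def parse_mac(text: str) -> int:
    """Convert 'aa:bb:cc:dd:ee:ff' (or with '-') into a 48-bit integer."""
    parts = text.replace("-", ":").split(":")
    if len(parts) != 6:
        raise ValueError("a MAC address has six bytes")
    value = 0
    for part in parts:
        byte = int(part, 16)
        if not 0 <= byte <= 0xFF:
            raise ValueError(f"byte out of range: {part}")
        value = (value << 8) | byte
    return value


def format_mac(value: int) -> str:
    """Convert a 48-bit integer back to colon-separated hex notation."""
    if not 0 <= value < 2**48:
        raise ValueError("a MAC address must fit in 48 bits")
    return ":".join(f"{(value >> shift) & 0xFF:02x}" for shift in range(40, -1, -8))


mac = parse_mac("00:1A:2B:3C:4D:5E")  # made-up example address
print(mac.bit_length() <= 48)
print(format_mac(mac))
```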

In addition, there exist various wireless protocols that use radio waves (e.g. several WiFi standards, cellular systems or Bluetooth) or infrared.

Wide area networks (WANs) cover a broad area (i.e., any network whose communications links cross metropolitan, regional, or national boundaries) and connect local area networks (LANs). Internet service providers typically use WANs to connect organizations' LANs to the Internet.

Home networks can be connected to the rest of the world through several methods, in particular DSL (e.g. ADSL) over telephone lines.

Over the same physical layer, one can run several kinds of data links. (E.g. the now extinct AppleTalk local networking system was based on "LocalTalk".)

The Internet (Network) layer

In essence, the network layer is responsible for end to end (source to destination) packet delivery, whereas the data link layer is responsible for node to node (hop to hop) packet delivery. (Wikipedia)

The best known protocol is the Internet Protocol (IP). It breaks down a message into packets and can send them through several nodes across a heterogeneous network (e.g. a mix of Ethernet and Wi-Fi).

IP provides an unreliable service, i.e. data can arrive corrupted or out of order, or be lost altogether. Errors must be repaired at the next level, e.g. with TCP.
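
The following sketch simulates this behaviour in plain Python, without touching a real network. It assumes a much simplified packet (a dictionary with sender, receiver, sequence number and data) and only models reordering, not loss or corruption. The function names and the addresses (taken from ranges reserved for documentation) are illustrative assumptions.

```python
import random


def to_packets(src, dst, message, size=8):
    """Split a message into small packets, each carrying addresses and a sequence number."""
    return [
        {"src": src, "dst": dst, "seq": i, "data": message[start:start + size]}
        for i, start in enumerate(range(0, len(message), size))
    ]


def unreliable_network(packets, seed=1):
    """Simulate best-effort delivery: packets may arrive in a different order."""
    shuffled = packets[:]
    random.Random(seed).shuffle(shuffled)
    return shuffled


def reassemble(packets):
    """What a higher layer such as TCP has to do: restore order using sequence numbers."""
    return "".join(p["data"] for p in sorted(packets, key=lambda p: p["seq"]))


message = "The Internet delivers packets."
packets = to_packets("192.0.2.1", "198.51.100.7", message)
arrived = unreliable_network(packets)
print([p["seq"] for p in arrived])  # arrival order, possibly scrambled
print(reassemble(arrived))          # original message restored
```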

have to insert something about addressing and packet structure

The Transport Layer

The Transport Layer is the second highest layer in the four- and five-layer TCP/IP reference models. It serves the application layer above it and makes requests to the network layer below it. It usually turns the unreliable and very basic service provided by the network layer into a more powerful one, e.g. it can ensure that data arrive in the right order or request that lost data be sent again.

The best known transport protocols are TCP and UDP.

  • The Transmission Control Protocol (TCP) can create connections between two hosts (computers), over which they can exchange streams of data using so-called stream sockets. The protocol guarantees reliable and in-order delivery of data from sender to receiver. It can distinguish data for multiple connections by concurrent applications, e.g. you can at the same time surf the Web, receive e-mail and be connected to a virtual world. Typically, HTTP (World Wide Web) servers use TCP, which in turn runs on top of IP. The combination of both is called TCP/IP.
  • The User Datagram Protocol (UDP), sometimes informally called Universal Datagram Protocol or Unreliable Datagram Protocol, sends short messages known as datagrams. It does not provide the reliability and ordering that TCP does: datagrams may arrive out of order, appear duplicated, or go missing without notice. However, this makes UDP faster and more efficient for many lightweight or time-sensitive purposes, e.g. video streaming.
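
The difference between the two protocols shows up directly in how a program uses them. The sketch below uses Python's standard `socket` module on the local machine only (address 127.0.0.1), so it assumes that loopback networking is available. The helper name `tcp_echo_once` and the messages are made up for illustration.

```python
import socket
import threading


def tcp_echo_once(server_sock):
    """Accept one TCP connection and echo back whatever arrives until the peer closes."""
    conn, _addr = server_sock.accept()
    with conn:
        while True:
            chunk = conn.recv(1024)
            if not chunk:
                break
            conn.sendall(chunk)


# TCP: a stream socket; a connection is set up before any data flows.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0: let the operating system pick a free port
server.listen(1)
port = server.getsockname()[1]
worker = threading.Thread(target=tcp_echo_once, args=(server,))
worker.start()

with socket.create_connection(("127.0.0.1", port)) as client:
    client.sendall(b"hello over TCP")
    client.shutdown(socket.SHUT_WR)  # signal that we are done sending
    tcp_reply = b""
    while True:
        chunk = client.recv(1024)
        if not chunk:
            break
        tcp_reply += chunk
worker.join()
server.close()

# UDP: datagram sockets; no connection, each message stands alone.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)  # UDP gives no delivery guarantee, so do not wait forever
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello over UDP", receiver.getsockname())
udp_data, _ = receiver.recvfrom(1024)
sender.close()
receiver.close()

print(tcp_reply, udp_data)
```

With TCP the program reads from a stream until the other side closes it. With UDP it sends and receives individual datagrams and has to cope by itself with a message that never arrives, which is why the receiver sets a timeout.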

The Application Layer

At this level, there are standards that define messages and data formats understood by specific applications running at each end of the communication.

Examples (there are many more!)

  • Hypertext Transfer Protocol (HTTP), the underlying communication protocol of the World Wide Web that specifies how a web server and a web browser talk to each other.
  • File Transfer Protocol - FTP
  • Simple Mail Transfer Protocol - SMTP (the de facto standard for e-mail transmissions across the Internet)
  • Dynamic Host Configuration Protocol - DHCP (allows a device, e.g. your computer at home or sometimes at work, to request and obtain an IP address from a server which has a list of addresses available for assignment)
  • Real-time Transport Protocol - RTP (see video streaming)
  • Simple Object Access Protocol (SOAP), one of the protocols that define how servers can talk to each other by sending XML-based messages.
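
As an illustration of what such an application-level message format looks like, the sketch below builds a minimal HTTP/1.1 request and reads the first line of a response. It does not contact any server. The function names are made up, and the sample response is hand-written rather than captured from a real site.

```python
def build_http_request(host, path="/"):
    """Build a minimal HTTP/1.1 GET request as the bytes a browser would send."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # an empty line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response):
    """Split the first line of an HTTP response into version, status code and reason."""
    first_line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = first_line.split(" ", 2)
    return version, int(code), reason


request = build_http_request("example.org", "/index.html")
print(request.decode("ascii"))

# A hand-written sample response, not captured from a real server.
sample_response = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
print(parse_status_line(sample_response))
```

The request is plain text: a request line, a few header lines, and an empty line. Everything below the application layer only sees it as bytes to be delivered.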

An overview table of some protocols

The Internet Protocol Suite
  • Application Layer: BGP, DHCP, DNS, FTP, GTP, HTTP, IMAP, IRC, Megaco, MGCP, NNTP, Time Protocol, NTP, POP, RIP, RPC, RTP, RTSP, SDP, SIP, SMTP, SNMP, SOAP, SSH, Telnet, TLS/SSL, XMPP
  • Transport Layer: TCP, UDP, DCCP, SCTP, RSVP, ECN
  • Internet Layer: Internet Protocol (IP) (IPv4, IPv6), ICMP, ICMPv6, IGMP, IPsec
  • Link Layer: ARP, RARP, NDP, OSPF, Tunnels (L2TP), PPP, Media Access Control (Ethernet, MPLS, DSL, ISDN, FDDI)

As you can see, HTTP, i.e. the transfer protocol for the web, is in the application layer. HTML is not part of the Internet Protocol Suite, since HTML just encodes contents in a certain way.

The OSI framework has two major components: an abstract model of networking (the Basic Reference Model, or seven-layer model) and a set of concrete protocols.

OSI Model
Group         Data unit        Layer            Function
Host layers   Data             7. Application   Network process to application
Host layers   Data             6. Presentation  Data representation and encryption
Host layers   Data             5. Session       Interhost communication
Host layers   Segment          4. Transport     End-to-end connections and reliability (TCP)
Media layers  Packet/Datagram  3. Network       Path determination and logical addressing (IP)
Media layers  Frame            2. Data link     Physical addressing (MAC & LLC)
Media layers  Bit              1. Physical      Media, signal and binary transmission

Let's have a closer look at the three upper layers, whose data unit is simply called "Data":

  • Session controls the dialogues/connections (sessions) between computers
  • Presentation concerns syntax and semantics, e.g. Abstract Syntax Notation One (ASN.1) or XML
  • Application includes services like FTP, HTTP etc. (i.e. services used by application programs)

Application layers, services and ports

Application-layer protocols and services usually rely on transport protocols like TCP and UDP. These transport layer protocols specify a source and a destination port number. Common services have specific ports assigned to them (HTTP has port 80; FTP has port 21; etc.).

“In computer networking, a port is an application-specific or process-specific software construct serving as a communications endpoint used by Transport Layer protocols of the Internet Protocol Suite, such as Transmission Control Protocol (TCP) and User Datagram Protocol (UDP). A specific port is identified by its number, commonly known as the port number, the IP address it is associated with, and the protocol used for communication. [...] Applications implementing common services will normally listen on specific port numbers which are defined by convention for use with the given protocol — see list of TCP and UDP port numbers
The concept of ports can be readily explained with an analogy: think of IP addresses as the street address of an apartment building, and the port number as the number of a particular apartment within that building. If a letter (a data packet) is sent to the apartment building (IP) without an apartment number (port number) on it, then nobody knows whom (which service) it is intended for. In order for the delivery to be successful, the sender needs to include an apartment number along with the address to ensure the letter gets to the right domicile.”
(Wikipedia, retrieved 18:15, 27 August 2009 (UTC)).
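
The apartment analogy can be sketched as a simple lookup. The table below contains the two ports mentioned above (HTTP 80, FTP 21) plus two other conventional assignments (SMTP 25, DNS 53). The function name `deliver`, the wording of the messages and the example address are assumptions made for this illustration.

```python
# Conventional port assignments: HTTP and FTP are mentioned in the text,
# SMTP (25) and DNS (53) are added as further well-known examples.
WELL_KNOWN_PORTS = {
    ("tcp", 80): "HTTP",
    ("tcp", 21): "FTP",
    ("tcp", 25): "SMTP",
    ("udp", 53): "DNS",
}


def deliver(protocol, ip_address, port):
    """Pick the service for an incoming packet, like finding an apartment in a building."""
    service = WELL_KNOWN_PORTS.get((protocol, port))
    if service is None:
        return f"{ip_address}: nobody listens on {protocol} port {port}"
    return f"{ip_address}: handed to the {service} service"


print(deliver("tcp", "192.0.2.10", 80))
print(deliver("tcp", "192.0.2.10", 21))
print(deliver("udp", "192.0.2.10", 9999))
```

The IP address identifies the building (the host), while the pair of transport protocol and port number identifies the apartment (the service) inside it.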

There are many kinds of Internet services. Some of the more popular end-user protocols concern:

  • File transfer and sharing protocols, e.g. FTP, SFTP and FTPS or WebDAV
  • HTTP and HTTPS (the Web transfer protocols)
  • User authentication and directory access, e.g. LDAP and X.500
  • Domain name mapping, i.e. the Domain Name System (DNS)
  • (Several) protocols that make e-mail work, e.g. SMTP, IMAP, POP3
  • Remote work, e.g. Telnet, Rlogin, RDP (Microsoft) and SSH
  • Remote procedure calls, e.g. RPC
  • Network file systems, e.g. NFS (Unix), SMB (Windows) and NCP (Novell)
  • Real-time streaming for streaming media and video conferencing, e.g. RTP and RTCP
