Learn Linux, 101: Fundamentals of internet protocols
TCP/IP network fundamentals for your Linux system
In this tutorial, learn about TCP/IP network fundamentals for your Linux system. Learn to:
- Understand network masks and Classless Inter-Domain Routing (CIDR) notation.
- Know the differences between private and public dotted quad IP addresses.
- Understand common Transmission Control Protocol (TCP) and User Datagram Protocol (UDP) ports and services such as 20, 21, 22, 23, 25, 53, 80, 110, 123, 139, 143, 161, 162, 389, 443, 465, 514, 636, 993, 995.
- Know the differences between and major features of UDP, TCP and Internet Control Message Protocol (ICMP).
- Know the major differences between IPv4 and IPv6.
- Know the basic features of IPv6.
Networking in Linux
In today’s world, computer networking enables information sharing, research and commerce across the country or across the world. Research Kenyan National Parks. Sure. Buy a Swiss train ticket. No problem. Email someone down the street or on another continent. See photos from outer space. Computer networking makes all this and much more possible.
This tutorial helps you prepare for Objective 109.1 in Topic 109 of the Linux System Administrator (LPIC-1) exam 102. The objective has a weight of 4. This tutorial reflects the Version 5.0 objectives as updated on October 29, 2018.
To get the most from the tutorials in this series, you need a basic knowledge of Linux and a working Linux system on which you can practice the commands that are covered in this tutorial. Sometimes, different versions of a program format output differently. So, your results might not always look exactly like the listings and figures that are shown here. The examples in this tutorial come from Fedora 29, and Ubuntu 16.04.6 LTS.
Communication in layers
In the early days of computer networking, communication was mostly between computers and terminal devices. Teletype or typewriter-like terminals usually used an asynchronous protocol running at speeds as low as 110 bits per second (bps). Each character transmitted was framed with start and stop bits. Remote job entry terminals transferred larger quantities of data as card images or print files and used synchronous protocols such as binary synchronous communication (BSC). These transferred blocks of data with each block framed by special start-of-block and end-of-block characters. Typical speeds for such devices were between 2400 bps and 9600 bps. Communication protocols were proprietary and were built into each device in a way that made general interoperation difficult.
In the mid-1970s, IBM introduced Synchronous Data Link Control (SDLC) as the link protocol for Systems Network Architecture (SNA). In contrast to earlier link protocols, SDLC had the following advantages:
- Bit-oriented: A special octet (binary ‘01111110’) delimited each transmitted frame. Inside the frame, an additional 0 bit was inserted after every five consecutive 1 bits so that six consecutive 1 bits could never occur within the frame, thus allowing the full range of 256 octets to be transmitted as data.
- Polled or point-to-point operation: The same protocol supports either polled operation with multiple slave devices on a line controlled by a master device, or direct connection between two peers on a line.
- Full-duplex or half-duplex: Data transmission can occur in both directions at the same time or alternating as in older protocols.
- Error checking: SDLC provides better error checking and correction than earlier protocols. It uses a 16-bit frame check sequence (FCS) calculated as a cyclic redundancy check (CRC).
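The bit-oriented framing described in the list above can be illustrated with a minimal Python sketch. It assumes the usual bit-stuffing rule of inserting a 0 after every five consecutive 1 bits and represents bits as strings of '0' and '1' for readability. The function names are hypothetical and are not part of any SDLC implementation.

```python
FLAG = "01111110"  # frame delimiter octet


def bit_stuff(bits):
    """Insert a 0 after every run of five consecutive 1 bits."""
    out = []
    run = 0
    for bit in bits:
        out.append(bit)
        run = run + 1 if bit == "1" else 0
        if run == 5:
            out.append("0")  # stuffed bit, removed again by the receiver
            run = 0
    return "".join(out)


def bit_unstuff(bits):
    """Remove the stuffed 0 that follows five consecutive 1 bits.

    Assumes the input was produced by bit_stuff (no error handling).
    """
    out = []
    run = 0
    skip_next = False
    for bit in bits:
        if skip_next:
            skip_next = False  # drop the stuffed 0
            run = 0
            continue
        out.append(bit)
        run = run + 1 if bit == "1" else 0
        if run == 5:
            skip_next = True
    return "".join(out)


def make_frame(payload_bits):
    """Delimit the stuffed payload with the flag octet on both sides."""
    return FLAG + bit_stuff(payload_bits) + FLAG
```

Because the stuffed payload can never contain six consecutive 1 bits, a payload that happens to equal the flag pattern is transmitted as 011111010 and cannot be mistaken for a frame boundary.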
In 1979 the International Organization for Standardization (ISO) standardized SDLC as High-Level Data Link Control (HDLC) and later provided further extensions. This became level 2 of the ISO 7-layer model for open systems interconnection (OSI). The four lower layers are shown in Table 1.
Table 1. Four lower layers of OSI
|Layer||Name||Purpose|
|1||Physical||Electrical and mechanical aspects of data transmission such as voltage levels and connector pin assignments.|
|2||Link or Data-Link||Transmission of data frames between a pair of devices.|
|3||Network||Establish paths between end devices such as computers. Choose among alternate routes and break messages into packets that can be reassembled after transmission over a level 2 link.|
|4||Transport||Data transfer and flow control between two computers.|
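The Internet Protocol (IP) is a layer 3 protocol. TCP and UDP are layer 4 protocols. ICMP messages are carried inside IP datagrams, but ICMP is treated as an integral part of IP and so is usually placed at layer 3. This tutorial shows you more about these networking parts.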
Internet Protocol Version 4 – IPv4
Internet Protocol Version 4, better known briefly as IPv4, has been the networking workhorse for many years. Every device in an IPv4 network has a 4-byte address. It is customary to write these in dotted quad notation, where each byte is represented as a decimal number and the numbers are separated by dots. Examples are: 192.168.1.5, 22.214.171.124, or 126.96.36.199.
Subnets, classes, masks, and CIDR
A 32-bit address allows for approximately 4,000,000,000 addresses. This seems like a lot. Originally this was divided so that the high-order 8 bits represented a network number and the remaining 24 bits represented the local address within the network. This was fine when only a few large networks existed, such as the ARPANET. Fortunately, all existing networks were numbered below 64 by 1981 when the address range was redefined to have classes. The then existing networks had 0 for the high-order bit and were designated as class A networks, allowing up to 127 such networks. Eventually five classes were defined as shown in Table 2.
Table 2. Network classes
|Class||High-order bits||Network bits||Remaining address bits||Number of networks||Hosts per network|
|A||0||8||24||128 (2^7)||16,777,216 (2^24)|
|B||10||16||16||16,384 (2^14)||65,536 (2^16)|
|C||110||24||8||2,097,152 (2^21)||256 (2^8)|
|D (multicast)||1110||Not specified||Not specified||Not specified||Not specified|
|E (Reserved or experimental)||1111||Not specified||Not specified||Not specified||Not specified|
The top few bits of a network address defined the class and therefore how many remaining bits made up the host address. Traffic for a device in a particular network was routed first to that network and then to the device, so the main internet routers only needed the network component of the IP address for routing. The device component could be masked off (with logical AND) to obtain just the network address part using a network mask. For example, 255.0.0.0 is the network mask for a Class A network while 255.255.255.0 is the network mask for a Class C network.
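The masking step described above can be sketched in a few lines of Python using the standard ipaddress module to convert between dotted quad strings and 32-bit integers. The function names and the sample address 10.20.30.40 are illustrative assumptions, not part of the original text.

```python
import ipaddress


def network_part(address, mask):
    """Return the network address obtained by ANDing address with mask."""
    addr_bits = int(ipaddress.IPv4Address(address))
    mask_bits = int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(addr_bits & mask_bits))


def host_part(address, mask):
    """Return the device (host) component as an integer."""
    addr_bits = int(ipaddress.IPv4Address(address))
    mask_bits = int(ipaddress.IPv4Address(mask))
    # Invert the mask and keep the result within 32 bits
    return addr_bits & ~mask_bits & 0xFFFFFFFF


# Class C style mask: the first three bytes identify the network
print(network_part("192.168.1.5", "255.255.255.0"))
# Class A style mask: only the first byte identifies the network
print(network_part("10.20.30.40", "255.0.0.0"))
```

A router that only needs to reach the right network can work with the result of network_part and ignore the host component entirely.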
Many organizations needed more addresses than a Class C network, but allocating them a whole Class B network gave them far more addresses than they actually needed. With fax machines, printers, scanners, computers, and many other devices all needing an address, pressure on the IPv4 address allocation system increased.
In 1993 the Internet Engineering Task Force (IETF) introduced Classless Inter-Domain Routing (CIDR). In this model, high-order bits still define the network number and low-order bits define the device or host address within the network. Instead of fixed sizes, CIDR specifies the number of bits that represent the network component. This is appended to the dotted-quad notation as a slash (/) followed by a bit count. So a former class A network address might now be specified as 188.8.131.52/8, while a class C one might now be 192.168.1.0/24. An address such as 192.168.1.0/22 has 22 bits for the network component and 10 bits for the host address, making for a maximum of 1024 addresses in the network. The address masks for these three examples are 255.0.0.0, 255.255.255.0, and 255.255.252.0. Address masks are always a sequence of 1 bits followed by a sequence of 0 bits for a total of 32 bits.
Thus the former classes were divided into subnets where each subnet could more accurately match the requirements of the organization using it.
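As a rough illustration of the CIDR arithmetic above, the following Python sketch derives a mask from a prefix length by hand and then cross-checks it with the standard ipaddress module. The helper names are hypothetical. Passing strict=False is an assumption made so that an address with host bits set, such as 192.168.1.0/22, is accepted.

```python
import ipaddress


def prefix_to_mask(prefix_length):
    """Build a 32-bit mask: prefix_length 1 bits followed by 0 bits."""
    mask_bits = (0xFFFFFFFF << (32 - prefix_length)) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(mask_bits))


def describe(cidr):
    """Summarize a CIDR block using the ipaddress module."""
    # strict=False tolerates host bits set in the address part
    net = ipaddress.ip_network(cidr, strict=False)
    return {
        "network": str(net.network_address),
        "mask": str(net.netmask),
        "prefix": net.prefixlen,
        "addresses": net.num_addresses,
    }


for cidr in ("192.168.1.0/24", "192.168.1.0/22"):
    print(cidr, describe(cidr))
```

Note that the /22 block containing 192.168.1.0 actually starts at 192.168.0.0, because the two low-order bits of the third byte belong to the host component.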
Today, IPv4 address ranges are mostly managed by large entities, such as ARIN or the Department of Defense in the United States, or by regional organizations such as Latin America and Caribbean Network Information Centre (LACNIC) or Réseaux IP Européens Network Coordination Centre (RIPE NCC). The Internet Assigned Numbers Authority (IANA) has the overall control of assignment to these entities. My Internet service provider (ISP) provides me an IPv4 address from the 184.108.40.206/19 range.
Even when an organization has a large range of IP addresses, the organization may choose to have internal subnets to ease routing.
Private and public addresses
So far, we have seen an IPv4 address scheme that connects all the devices on the internet. As noted, devices such as printers and scanners need an IP address, but this is typically only needed within a particular home or enterprise. Many LAN devices fit into this category too. Three particular ranges have been reserved by IETF and IANA as private or non-routable addresses. The ranges are:
- 10.0.0.0/8 – 10.0.0.0 to 10.255.255.255 with 16,777,216 addresses and a mask of 255.0.0.0
- 172.16.0.0/12 – 172.16.0.0 to 172.31.255.255 with 1,048,576 addresses and a mask of 255.240.0.0
- 192.168.0.0/16 – 192.168.0.0 to 192.168.255.255 with 65,536 addresses and a mask of 255.255.0.0
These ranges can be split into subnets for internal routing purposes. If you have a home router, you will probably use a range such as 192.168.0.0/24 or 192.168.1.0/24 which are both subnets of the 192.168.0.0/16 range. For example, my router defaults to using the 192.168.1.0/24 range (mask 255.255.255.0). So, I am only using a subnet of the 192.168.0.0/16 range.
These private address ranges cannot be used on the public internet and routers will not forward packets onto the public internet if they have private addresses.
One other range that you will see on any computer but will never see on the internet is the local loopback range 127.0.0.0/8. Usually you will see this as 127.0.0.1.
Finally, IANA has reserved the 100.64.0.0/10 range for use in carrier grade network address translation (NAT). This address range should not appear on the public internet nor should it be used for private networks.
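The special ranges described in this section can be checked programmatically. The following is a minimal sketch using the standard ipaddress module. The table of ranges is copied from the text above, while the labels and the function name are illustrative choices. Anything outside these ranges is reported as "other" rather than "public", because additional reserved ranges exist.

```python
import ipaddress

# Ranges taken from the text; the labels are illustrative
SPECIAL_RANGES = [
    ("private", ipaddress.ip_network("10.0.0.0/8")),
    ("private", ipaddress.ip_network("172.16.0.0/12")),
    ("private", ipaddress.ip_network("192.168.0.0/16")),
    ("loopback", ipaddress.ip_network("127.0.0.0/8")),
    ("carrier-grade NAT", ipaddress.ip_network("100.64.0.0/10")),
]


def classify(address):
    """Return the label of the first special range containing address."""
    addr = ipaddress.ip_address(address)
    for label, network in SPECIAL_RANGES:
        if addr in network:
            return label
    return "other"


for sample in ("192.168.1.25", "172.31.255.255", "172.32.0.0", "127.0.0.1"):
    print(sample, classify(sample))
```

The 172.32.0.0 sample shows why the /12 boundary matters: it is the first address after the 172.16.0.0/12 private range.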
You can use the ip command to see your IPv4 addresses as shown in Listing 1, first on Fedora 29, then on Ubuntu 16.04.6 LTS.
Listing 1. Using the ip command
# Fedora
[ian@attic5-f29 ~]$ ip -4 -br addr
lo               UNKNOWN        127.0.0.1/8
enp9s0           UP             192.168.1.25/24
virbr0           DOWN           192.168.122.1/24
# Ubuntu
ian@attic-u16:~$ ip -4 -br addr
lo               UNKNOWN        127.0.0.1/8
enp2s0           UP             192.168.1.24/24
Link local addresses
When Dynamic Host Configuration Protocol (DHCP) is not available and manual configuration is not done, the system may do zero configuration networking on an Ethernet network by assigning an address pseudo randomly from the range 169.254.0.0/16.
Internet Control Message Protocol – ICMP
The Internet Protocol (IP) is a host-to-host datagram protocol. Datagrams may pass through many network nodes, and delivery is not guaranteed. In the event of errors, gateways need to communicate errors between themselves or possibly report back to the originating host. ICMP is used for this purpose. ICMP uses basic IP support as if it were a higher-level protocol, but ICMP is actually an integral part of IP that is implemented by every IP module.
ICMP messages can be sent for purposes such as:
- A datagram cannot reach its destination
- A gateway does not have enough buffers to forward a datagram
- A gateway suggests a shorter route for a host to send traffic
- A gateway detects that a host is unreachable
- A host wants to check the route to a remote host or check that it is alive
The ping command is used to check if another host is alive, while the traceroute command is used to interrogate the route to that host. Listing 2 shows an example of each command.
Listing 2. Using ping and traceroute
[ian@attic5-f29 ~]$ ping cybershields.com
PING cybershields.com (220.127.116.11) 56(84) bytes of data.
64 bytes from s10.lookwhois.com (18.104.22.168): icmp_seq=1 ttl=49 time=84.5 ms
64 bytes from s10.lookwhois.com (22.214.171.124): icmp_seq=2 ttl=49 time=84.9 ms
64 bytes from s10.lookwhois.com (126.96.36.199): icmp_seq=3 ttl=49 time=87.2 ms
64 bytes from s10.lookwhois.com (188.8.131.52): icmp_seq=4 ttl=49 time=91.1 ms
^C
--- cybershields.com ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 6ms
rtt min/avg/max/mdev = 84.521/86.922/91.115/2.652 ms
[ian@attic5-f29 ~]$ traceroute cybershields.com
traceroute to cybershields.com (184.108.40.206), 30 hops max, 60 byte packets
 1 _gateway (192.168.1.1) 0.259 ms 0.368 ms 0.407 ms
 2 * * *
 3 cpe-174-111-105-205.triad.res.rr.com (220.127.116.11) 45.354 ms 45.375 ms 45.390 ms
 4 cpe-024-025-041-002.ec.res.rr.com (18.104.22.168) 23.077 ms 22.143 ms 23.021 ms
 5 * * *
 6 * bu-ether11.atlngamq46w-bcr00.tbone.rr.com (22.214.171.124) 35.248 ms bu-ether14.atlngamq46w-bcr00.tbone.rr.com (126.96.36.199) 28.788 ms
 7 0.ae0.pr0.atl20.tbone.rr.com (188.8.131.52) 27.737 ms 0.ae1.pr0.atl20.tbone.rr.com (184.108.40.206) 19.741 ms 0.ae2.pr0.atl20.tbone.rr.com (220.127.116.11) 21.050 ms
 8 te1-6.bbr01.tl01.atl01.networklayer.com (18.104.22.168) 28.073 ms 29.402 ms 29.581 ms
 9 * * *
10 * * *
11 ae1.dar01.dal13.networklayer.com (22.214.171.124) 47.021 ms ae1.dar01.dal10.networklayer.com (126.96.36.199) 39.878 ms ae1.dar01.dal13.networklayer.com (188.8.131.52) 53.891 ms
12 * ae16.cbs01.dr01.dal04.networklayer.com (184.108.40.206) 50.829 ms *
13 ae2.cbs01.cs01.lax01.networklayer.com (220.127.116.11) 71.911 ms 81.297 ms *
14 * * *
15 * * *
16 ae24.bbr01.eq01.sjc02.networklayer.com (18.104.22.168) 100.518 ms 86.327 ms 86.202 ms
17 ae5.dar01.sjc01.networklayer.com (22.214.171.124) 88.142 ms ae6.dar02.sjc01.networklayer.com (126.96.36.199) 85.816 ms ae5.dar01.sjc01.networklayer.com (188.8.131.52) 87.915 ms
18 a1.76.1732.ip4.static.sl-reverse.com (184.108.40.206) 84.691 ms po2.fcr02.sr02.sjc01.networklayer.com (220.127.116.11) 87.973 ms a1.76.1732.ip4.static.sl-reverse.com (18.104.22.168) 85.636 ms
19 s10.lookwhois.com (22.214.171.124) 87.216 ms * 83.498 ms
TCP and UDP
Transmission Control Protocol (TCP) and User Datagram Protocol (UDP) are transport layer protocols that applications on different hosts use to communicate with each other over an IP network.
TCP is a robust end-to-end protocol that provides for sessions, guaranteed and in-order delivery of packets, and recovery from errors such as packet loss. It is used where reliable end-to-end communication is needed, for example:
- Hypertext Transfer Protocol (HTTP)
- Email services such as Simple Mail Transfer Protocol (SMTP), Internet Message Access Protocol (IMAP), or Post Office Protocol Version 3 (POP3).
- File transfers using File Transfer Protocol (FTP) or secure FTP (SFTP)
- Terminal applications such as Secure Shell (SSH) or Telnet.
UDP is a connectionless protocol that delivers packets from one host to another where reliability is not required. Accordingly, it has shorter headers, no recovery, and no acknowledgments when compared to TCP. Its uses include:
- Live video streaming where occasional frame loss is not a problem
- Voice or audio transmission, such as for Voice over IP (VoIP) applications
- Domain Name System (DNS) where multiple requests might be sent out for a particular query, but one reply may suffice.
- Trivial File Transfer Protocol (TFTP) for nodes booting over a local area network.
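To make the connectionless nature of UDP concrete, here is a minimal Python sketch that sends a single datagram over the loopback interface. The function name is hypothetical, and binding to port 0 (letting the kernel choose a free port) is an assumption made to keep the example self-contained. Note that there is no listen, accept, or connect step, and nothing in the protocol itself confirms delivery.

```python
import socket


def udp_roundtrip(message):
    """Send one datagram over loopback and return (data, source address)."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))   # port 0: the kernel picks a free port
    receiver.settimeout(2.0)          # give up if the datagram never arrives
    port = receiver.getsockname()[1]

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # No connection setup: the destination is given with each datagram
    sender.sendto(message, ("127.0.0.1", port))

    data, source = receiver.recvfrom(1024)
    sender.close()
    receiver.close()
    return data, source


data, source = udp_roundtrip(b"Hello, UDP!")
print("received", data, "from", source)
```

An application that needs a reply or a retry, as DNS does, has to implement that logic itself on top of UDP.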
TCP and UDP ports and services
Applications and services use TCP and UDP to connect to other applications on the same or different hosts. A two-byte address called a port is used to distinguish which application should receive an incoming request. Ports therefore can range from 0 to 65535.
Applications such as FTP, HTTP, or SSH generally use well-known port numbers. These are assigned by IANA, and the current IANA policy is to assign the same UDP and TCP port number to a given application, even if the application uses only TCP or only UDP. Linux systems maintain a list of ports and the services that use them in the file /etc/services. The beginning of this file on my Fedora 29 system is shown in Listing 3.
Listing 3. /etc/services
# /etc/services:
# $Id: services,v 1.49 2017/08/18 12:43:23 ovasik Exp $
#
# Network services, Internet style
# IANA services version: last updated 2016-07-08
#
# Note that it is presently the policy of IANA to assign a single well-known
# port number for both TCP and UDP; hence, most entries here have two entries
# even if the protocol doesn't support UDP operations.
# Updated from RFC 1700, ``Assigned Numbers'' (October 1994). Not all ports
# are included, only the more common ones.
#
# The latest IANA port assignments can be gotten from
# https://www.iana.org/assignments/port-numbers
# The Well Known Ports are those from 0 through 1023.
# The Registered Ports are those from 1024 through 49151
# The Dynamic and/or Private Ports are those from 49152 through 65535
#
# Each line describes one service, and is of the form:
#
# service-name port/protocol [aliases ...] [# comment]
tcpmux          1/tcp                   # TCP port service multiplexer
tcpmux          1/udp                   # TCP port service multiplexer
rje             5/tcp                   # Remote Job Entry
rje             5/udp                   # Remote Job Entry
echo            7/tcp
echo            7/udp
discard         9/tcp           sink null
discard         9/udp           sink null
Each line contains up to three fields separated by white space plus an optional comment introduced by a # character. Networking programs should check this file to get the port number (and protocol) for services they provide. The three fields are:
- service-name, which is the friendly name the service is known by. It is case sensitive. The client program often has the same name.
- port/protocol, which is the port number (in decimal) to use for this service followed by a / and the protocol type, typically tcp or udp. Additional protocols are listed in
- aliases, which is an optional space or tab separated list of other names for this service. These are case sensitive.
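The line format described in the list above is simple enough to parse by hand. The following Python sketch, with a hypothetical function name, splits one line into the three fields and discards the comment. It assumes well-formed input and is not how any particular networking program reads the file.

```python
def parse_services_line(line):
    """Parse one /etc/services line into a dict, or None if it has no entry."""
    # Everything after the first '#' is a comment
    text = line.split("#", 1)[0]
    fields = text.split()          # split on any run of spaces or tabs
    if len(fields) < 2:
        return None                # blank line or comment-only line
    port, protocol = fields[1].split("/", 1)
    return {
        "name": fields[0],
        "port": int(port),
        "protocol": protocol,
        "aliases": fields[2:],     # may be empty
    }


print(parse_services_line("discard 9/tcp sink null"))
print(parse_services_line("tcpmux 1/tcp # TCP port service multiplexer"))
```

In practice, Python programs usually call socket.getservbyname with a service name and protocol (for example "ssh" and "tcp") instead of parsing the file themselves. The answer depends on the services database of the system where it runs.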
Common TCP and UDP ports and services that are part of the LPI objectives are shown in Table 3.
Table 3. Common TCP and UDP ports and services
|Port||Service (aliases)||Description|
|21||ftp (fsp, fspd)||21 is registered to ftp, but also used by fsp|
|22||ssh||Secure Shell (SSH) Protocol|
|53||domain||Domain name server (DNS)|
|80||http||World Wide Web HyperText Transfer Protocol (HTTP)|
|110||pop3||Post Office Protocol (POP) version 3|
|123||ntp||Network Time Protocol (NTP) – UDP|
|139||netbios-ssn||NETBIOS session service|
|143||imap||Interim Mail Access Protocol v2|
|161||snmp||Simple Net Management Protocol (SNMP)|
|162||snmptrap (snmp-trap)||Traps for SNMP|
|389||ldap||Lightweight Directory Access Protocol (LDAP)|
|443||https||HTTP protocol over TLS/SSL|
|465||urd (smtps)||URL Rendesvous Directory for SSM; Message Submission over TLS protocol (deprecated)|
|514||shell (cmd)||Like exec, but automatic authentication is performed as for login server|
|636||ldaps||LDAP over SSL|
|1993||snmp-tcp-port||CISCO SNMP TCP port|
|995||pop3s||POP-3 over SSL|
Ports from 0 to 1023 are well-known ports. These can only be opened by processes running with root authority (id 0) so that network nodes connecting can have some level of trust in the system they connect to.
Ports from 1024 through 49151 are registered ports. They are registered with IANA and used by non-root applications. For example, port 1058 is registered for IBM AIX Network Installation Manager (NIM).
Ports from 49152 through 65535 may be dynamically assigned or used for private purposes. In practice, systems may limit the maximum port allowed for local use and also use some of the registered ports range for this purpose. Use the sysctl command or the cat command to display the range on your system as shown in Listing 4.
Listing 4. Determining port range
[ian@attic5-f29 ~]$ sysctl net.ipv4.ip_local_port_range
net.ipv4.ip_local_port_range = 32768    60999
[ian@attic5-f29 ~]$ cat /proc/sys/net/ipv4/ip_local_port_range
32768   60999
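The three IANA port ranges and the local port range from Listing 4 can be compared with a short Python sketch. The function names are hypothetical, and the range string is passed in as text (in the format shown in Listing 4) rather than read from /proc, so the example runs on any system.

```python
def port_category(port):
    """Classify a port number using the IANA ranges described above."""
    if not 0 <= port <= 65535:
        raise ValueError("port must fit in two bytes (0 to 65535)")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "dynamic/private"


def parse_local_port_range(text):
    """Parse the two numbers printed by ip_local_port_range."""
    low, high = (int(value) for value in text.split())
    return low, high


low, high = parse_local_port_range("32768\t60999")
for port in (22, 1058, 48134, 50000):
    local = low <= port <= high
    print(port, port_category(port), "in local range" if local else "")
```

With the values from Listing 4, a port such as 48134 falls in the IANA registered range yet is still eligible for dynamic local assignment, which is the overlap the paragraph above describes.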
I finish this section with a simple Python example derived from the TCP Communication example found on the Python Wiki. A server listens for connections on port 50000 of the loopback address and prints the connecting host and port, then echoes back any data it receives with an “Echo: ” prefix. The client connects to the loopback address and sends a “Hello, World!” message, then prints the reply from the server framed in angle brackets. The server code is shown in Listing 5.
Listing 5. Server code
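#!/usr/bin/env python3
import socket

IP_ADDRESS = '127.0.0.1'
PORT = 50000
BUFFER_SIZE = 1024

# Create a TCP (stream) socket and listen on the loopback address
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.bind((IP_ADDRESS, PORT))
sock.listen(1)

# accept() returns a new socket plus the client's (host, port) tuple
conn, address = sock.accept()
print("Connection from: " + address[0] + " Port: " + str(address[1]))
while True:
    data = conn.recv(BUFFER_SIZE)
    if not data:
        break
    print("received data: " + data.decode())
    # Sockets carry bytes, so the reply is built as bytes
    conn.sendall(b"Echo: " + data)
conn.close()
sock.close()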
The client code is shown in Listing 6.
Listing 6. Client code
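#!/usr/bin/env python3
import socket

IP_ADDRESS = '127.0.0.1'
PORT = 50000
BUFFER_SIZE = 1024
MESSAGE = "Hello, World!"

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect((IP_ADDRESS, PORT))
# Sockets carry bytes, so encode the text before sending
sock.sendall(MESSAGE.encode())
data = sock.recv(BUFFER_SIZE)
sock.close()
# Print the reply framed in angle brackets, as in Listing 8
print("received data: <" + data.decode() + ">")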
I started the server in one window and then started the client in another. The output is shown in Listing 7 and Listing 8.
Listing 7. Server output
[ian@attic5-f29 ~]$ ./port-server.py
Connection from: 127.0.0.1 Port: 48134
received data: Hello, World!
Listing 8. Client output
[ian@attic5-f29 ~]$ ./port-client.py
received data: <Echo: Hello, World!>
Note in the server output that the client has been assigned port 48134. If you run the scenario again, you will probably see a different port.
Network Address Translation
In IPv4 networks, you have seen that some address ranges are reserved for private use. These addresses are not routable on the general internet. Most small office or home routers use addresses such as 192.168.1.x. The ISP may provide an address dynamically, say 126.96.36.199. Network Address Translation (NAT) is a process where a router can accept connection requests from the private side of the network and translate the address (and possibly the initiating port number) so that it appears to the public network as if it came from a port on the ISP assigned address. This allows connections from the private network to the public network, which works well for most applications. However, when a connection request (or datagram) from the public network arrives at the router, it can only be forwarded if the router has a table indicating which private host and port traffic arriving on a particular port should be sent to. Besides allowing private hosts to reach public hosts, this scheme also provides a level of shielding to the private network from malicious traffic that may be scanning for potential targets. Note that NAT is not part of the current LPI requirements.
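The translation table at the heart of NAT can be modeled with two dictionaries. The sketch below is a deliberately simplified illustration, not how any real router is implemented. The class name, the starting port 40000, and the public address 203.0.113.7 (taken from a range reserved for documentation) are all assumptions made for the example.

```python
class NatTable:
    """Toy model of port-based network address translation."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outbound = {}  # (private_ip, private_port) -> public_port
        self.inbound = {}   # public_port -> (private_ip, private_port)

    def translate_outbound(self, private_ip, private_port):
        """Map a private source to a port on the public address."""
        key = (private_ip, private_port)
        if key not in self.outbound:
            self.outbound[key] = self.next_port
            self.inbound[self.next_port] = key
            self.next_port += 1
        return (self.public_ip, self.outbound[key])

    def translate_inbound(self, public_port):
        """Return the private destination, or None if no mapping exists."""
        return self.inbound.get(public_port)


nat = NatTable("203.0.113.7")
print(nat.translate_outbound("192.168.1.25", 48134))
print(nat.translate_inbound(40000))
print(nat.translate_inbound(80))   # unsolicited traffic has no mapping
```

The None result for port 80 corresponds to the shielding effect mentioned above: without a table entry, the router has nowhere to forward traffic that arrives from the public side.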
Internet Protocol Version 6 – IPv6
Internet Protocol Version 6, better known briefly as IPv6, was developed by the IETF in the mid-1990s to address the limitations of IPv4. Before the widespread use of NAT, IPv4 addresses were expected to run out imminently. IPv6 became a Draft Standard in 1998 and was finally ratified as an Internet Standard in 2017.
IPv6 uses a 128-bit address. This allows 2^128, or approximately 3.4 × 10^38 addresses. As with IPv4, the actual number is smaller. Several ranges are reserved for special use or excluded completely.
In contrast to the decimal dotted quad notation used for IPv4, IPv6 addresses use eight groups of four hexadecimal digits, with groups being separated by colons (for example, fe80:0000:0000:0000:efaa:4539:a6ba:27b9). Leading zeros can be omitted from a group, and one run of consecutive all-zero groups can be replaced by a double colon (::). So this example could be written as fe80:0:0:0:efaa:4539:a6ba:27b9 or just fe80::efaa:4539:a6ba:27b9. Another example of an abbreviated address is ::ffff:188.8.131.52. And if you check your own loopback address using the ip command, you will see your loopback address as ::1. Try ping -6 ::1 to see an example as in Listing 9.
Listing 9. Ping with IPv6
[ian@attic5-f29 ~]$ ping -6 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.100 ms
64 bytes from ::1: icmp_seq=2 ttl=64 time=0.079 ms
64 bytes from ::1: icmp_seq=3 ttl=64 time=0.079 ms
64 bytes from ::1: icmp_seq=4 ttl=64 time=0.082 ms
64 bytes from ::1: icmp_seq=5 ttl=64 time=0.083 ms
^C
--- ::1 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 84ms
rtt min/avg/max/mdev = 0.079/0.084/0.100/0.012 ms
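The abbreviation rules for IPv6 addresses described before Listing 9 can be explored with the standard ipaddress module. This is a small sketch using the example address from the text. The helper names are hypothetical.

```python
import ipaddress


def shorten(address):
    """Return the shortest form: leading zeros dropped, one zero run as ::."""
    return ipaddress.IPv6Address(address).compressed


def expand(address):
    """Return all eight groups written out with four hex digits each."""
    return ipaddress.IPv6Address(address).exploded


full = "fe80:0000:0000:0000:efaa:4539:a6ba:27b9"
print(shorten(full))
print(expand("::1"))

# All three spellings from the text denote the same 128-bit value
same = (ipaddress.IPv6Address(full)
        == ipaddress.IPv6Address("fe80:0:0:0:efaa:4539:a6ba:27b9")
        == ipaddress.IPv6Address("fe80::efaa:4539:a6ba:27b9"))
print(same)
```

Comparing IPv6Address objects rather than strings avoids treating two spellings of the same address as different.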
Many other utilities, including traceroute, have a -6 or -4 option to specify whether the command should use IPv6 or IPv4. The traceroute6 command is equivalent to
IPv4 and IPv6 comparison
As noted, IPv6 addresses are 128 bits rather than the 32 bits of IPv4. So, the IPv6 address space is unlikely to be exhausted.
Security, in the form of IPsec, was originally designed into the IPv6 protocol and mandated. RFC 6434 changed this requirement to a "should" rather than a "must", as it was unclear that IPsec would become the dominant security choice among many other possibilities. RFC 6434 recommends IPsec with IKEv2 key management.
IPv6 has a simpler packet header than IPv4. Rarely used fields have been moved to extensions and the fields that remain are of fixed length. This simplifies processing in high speed routing.
IPv6 hosts can configure themselves automatically if connected to an IPv6 network. This process is known as Stateless Address Autoconfiguration (SLAAC), and it can take the place of the DHCP processes of IPv4.
Mobile devices have a home address that can be reached by the home router. While away from home, packets sent to the home router for the device can be routed to the current location.
If you want to change the socket connection examples to use IPv6, remember to change the address to “::1” and also change AF_INET to AF_INET6. There are other options you can specify for IPv6 sockets, but this should get you started.
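One way to see how little changes between the two address families is to write the echo exchange as a function that takes the family and loopback address as parameters. The following is a minimal sketch, not the author's listings. The function name is hypothetical, and it uses a background thread and a kernel-assigned port (port 0) so that the server and client can run in a single process.

```python
import socket
import threading


def echo_roundtrip(family, host, message):
    """Run a one-shot echo server and client over the given address family."""
    server = socket.socket(family, socket.SOCK_STREAM)
    server.bind((host, 0))            # port 0: the kernel picks a free port
    server.listen(1)
    # getsockname() returns a 2-tuple for IPv4 and a 4-tuple for IPv6;
    # the port is the second element in both cases
    port = server.getsockname()[1]

    def serve_once():
        conn, _ = server.accept()
        with conn:
            data = conn.recv(1024)
            conn.sendall(b"Echo: " + data)

    worker = threading.Thread(target=serve_once)
    worker.start()
    with socket.socket(family, socket.SOCK_STREAM) as client:
        client.connect((host, port))
        client.sendall(message.encode())
        reply = client.recv(1024).decode()
    worker.join()
    server.close()
    return reply
```

Calling echo_roundtrip(socket.AF_INET, "127.0.0.1", "Hello, World!") exercises the IPv4 path. On a system with IPv6 enabled, echo_roundtrip(socket.AF_INET6, "::1", "Hello, World!") should follow the same steps over the IPv6 loopback address.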
This concludes your introduction to Topic 109: Networking fundamentals.