Protocol that makes it possible to:
- assign addresses to network interfaces.
- find a path from one computer to another, possibly passing through many routers.
- fragment and reassemble messages that are too large to be transmitted reliably over a channel.
Main spec: http://tools.ietf.org/html/rfc791, from 1981!
As of 2014, the dominant version is IPv4, but IPv6 is already specified and has many implementations.
One of the major limitations of IPv4 is the small number of addresses, which was underestimated at the creation of IPv4, making such addresses a valuable resource.
IPv4 addresses have 4 bytes, while IPv6 addresses have 16, making the number of IPv6 addresses stratospherically large at around 10^38. Compare that to Avogadro's constant at around 10^23: it is unlikely that IPv6 addresses will become a valuable resource in the foreseeable future.
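The orders of magnitude above follow directly from the address lengths. A minimal Python check, where the constant and variable names are only illustrative:

```python
# Address-space sizes implied by the address lengths.
IPV4_BITS = 4 * 8    # 4-byte addresses
IPV6_BITS = 16 * 8   # 16-byte addresses

ipv4_count = 2 ** IPV4_BITS
ipv6_count = 2 ** IPV6_BITS

print(ipv4_count)           # 4294967296, about 4.3 * 10^9
print(f"{ipv6_count:.2e}")  # 3.40e+38
```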
Unless noted otherwise, the term IP in this project refers either to IPv4 alone or to both IPv4 and IPv6. Anything specific to IPv6 is explicitly written as IPv6.
The version number 5 (IPv5) was already taken by a revision of some old protocol: http://archive.oreilly.com/pub/post/what_ever_happened_to_ipv5.html
Learn what the IP header contains:
http://en.wikipedia.org/wiki/IPv4_header#Header
Fields by increasing interest / ease to understand ratio:
-
Version (4 bits)
Indicates the protocol version.
Value 4 for IPv4.
-
Internet header length (IHL) (4 bits)
Length of the IP header only (no data) in 4 byte units.
-
Total length (2 bytes)
Total length of header + body in bytes
This must be transmitted as the length is variable.
-
Source and destination IP addresses (4 bytes each)
-
Time to live (TTL) (1 byte)
Decremented whenever the packet passes through a router.
If it reaches 0, the router does not forward the packet, because it has already traveled for too long, and signals this to the sender via an ICMP Time Exceeded message with Code = 0.
This prevents lost packets from circulating forever on the network.
-
Header checksum (2 bytes)
-
Protocol (1 byte)
Number that identifies the protocol contained in the IP data, for example TCP or UDP.
In this way, the receiver knows how to interpret the data inside of the IP packet.
The numbers are assigned by IANA and listed in its Assigned Internet Protocol Numbers registry.
-
Flags (3 bits)
Used for IP fragmentation.
-
Fragment offset (13 bits)
Used for IP fragmentation.
-
Identification (2 bytes)
Used for IP fragmentation.
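To make the field layout concrete, here is a minimal Python sketch that decodes the fixed 20-byte part of an IPv4 header with the standard struct module. The function name parse_ipv4_header and the hand-built sample header are assumptions made for illustration; header options are ignored, and the checksum in the sample is left at 0 rather than computed.

```python
import socket
import struct

def parse_ipv4_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header (options are ignored)."""
    (ver_ihl, _tos, total_length, identification, flags_frag,
     ttl, protocol, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_ihl >> 4,              # high 4 bits of the first byte
        "ihl_bytes": (ver_ihl & 0x0F) * 4,    # IHL is stored in 4-byte units
        "total_length": total_length,
        "identification": identification,
        "flags": flags_frag >> 13,            # top 3 bits: 0, DF, MF
        "fragment_offset": flags_frag & 0x1FFF,
        "ttl": ttl,
        "protocol": protocol,                 # e.g. 6 = TCP, 17 = UDP, 1 = ICMP
        "checksum": checksum,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }

# Hand-built sample: version 4, IHL 5, DF set, TTL 64, protocol 6 (TCP).
sample = struct.pack(
    "!BBHHHBBH4s4s", 0x45, 0, 40, 0x1234, 0x4000, 64, 6, 0,
    socket.inet_aton("192.156.0.2"), socket.inet_aton("192.157.0.3"),
)
header = parse_ipv4_header(sample)
print(header)
```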
A unique address that identifies a host, for example a separate workstation.
IPv4 addresses are 4 bytes long.
Each computer knows its IP address.
This can be set in two ways:
-
statically and manually: admin enters those values for each computer.
They never change.
-
dynamically and automatically, typically via DHCP.
Best method on a local network to avoid IP clashes.
The most common notation by far is the dotted decimal:
192.156.0.1
With the advent of CIDR, a new notation was introduced that also specifies the subnet mask in one go:
192.156.0.1/24
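As a sketch of what the two notations encode, Python's standard ipaddress module can split the /24 form into address, mask and network. The variable name iface is only illustrative.

```python
import ipaddress

iface = ipaddress.ip_interface("192.156.0.1/24")

print(iface.ip)        # the address alone, in dotted decimal
print(int(iface.ip))   # the same 4 bytes read as one 32-bit integer
print(iface.netmask)   # the subnet mask implied by /24
print(iface.network)   # the network the address belongs to
```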
Address classes are an old concept, but people still use them to refer to certain mask sizes and to the internal reserved addresses.
Each address has two parts: network part and host part.
Before CIDR, how many bytes belonged to the network part and how many to the host part was determined only by the class of the IP, which is given by the leading bits of the address. Only the following classes existed:
- A: starts with 0
- B: starts with 10
- C: starts with 110
- D: starts with 1110
- E: all others
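The class only depends on the leading bits of the first byte, so it can be read off with a few shifts. A minimal sketch, with ipv4_class as a made-up helper name:

```python
def ipv4_class(address: str) -> str:
    """Return the historical class (A to E) from the leading bits of the first byte."""
    first = int(address.split(".")[0])
    if (first >> 7) == 0b0:
        return "A"
    if (first >> 6) == 0b10:
        return "B"
    if (first >> 5) == 0b110:
        return "C"
    if (first >> 4) == 0b1110:
        return "D"
    return "E"

for sample in ("10.0.0.1", "172.16.0.1", "192.168.0.1", "224.0.0.1", "240.0.0.1"):
    print(sample, ipv4_class(sample))
```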
Some IP ranges in classes A, B and C were reserved for LAN usage.
If your address is in those ranges, the router's proxy server knows that you are asking about an internal one.
The most common range, especially in home networks, is the class C one:
192.168.0.1 through 192.168.255.254
subnet mask 255.255.255.0
CIDR added the possibility to use any network / host split.
- 2 networks
- 3 bytes for the network part
- 1 router
- 3 computers on each network
Things could look like:
+---------------+   +---------------+   +---------------+
|  Computer 1   |   |  Computer 2   |   |  Computer 3   |
|---------------|   |---------------|   |---------------|
|  192.156.0.2  |   |  192.156.0.3  |   |  192.156.0.4  |
+---------------+   +---------------+   +---------------+
        |                   |                   |
        |                   |                   |
        +-------------------+-------------------+
                            |
                            |
                    +---------------+
                    |  192.156.0.1  |
                    |---------------|
                    |    Router     |
                    |---------------|
                    |  192.157.0.1  |
                    +---------------+
                            |
                            |
        +-------------------+-------------------+
        |                   |                   |
        |                   |                   |
+---------------+   +---------------+   +---------------+
|  Computer 4   |   |  Computer 5   |   |  Computer 6   |
|---------------|   |---------------|   |---------------|
|  192.157.0.2  |   |  192.157.0.3  |   |  192.157.0.4  |
+---------------+   +---------------+   +---------------+
Note how the router also has one or more local IPs.
In this example, there are 2 local networks: for a computer on one to communicate with a computer on the other, the traffic must first pass through the router.
When the router sees an address that is not in the current network, e.g. Computer 4 wants to send to Computer 1, it is able to determine to which network to send it. TODO how.
RFC 1918: https://en.wikipedia.org/wiki/IP_address#Private_addresses
If you use a router, your entire network has a single IP seen from the outside (WAN), and an internal IP for each interface seen on the private local network (LAN).
Your external IP may or may not change, depending on how your ISP operates. Most ISPs modify home users' IPs from time to time to have more flexibility, but many give you the same external IP for at least several hours, making it possible for you to use it for simple development.
This only matters if you want external computers to make requests to you, e.g. to expose a web server to the outside world. Replies to requests you make already know where to be routed back to from information sent with the request.
If you are going to use your external IP behind a router, you will need to enable port forwarding.
Get LAN IPs for all interfaces on current computer:
ifconfig | grep -B1 "inet addr" | awk '{ if ( $1 == "inet" ) { print $2 } else if ( $2 == "Link" ) { printf "%s:" ,$1 } }' | awk -F: '{ print $1 ": " $3 }'
Each interface has its own IP. Therefore, a single computer can have multiple LAN IPs: one for the wired connection, one for the Wi-Fi, one loopback localhost, etc.
The server on the router is called a proxy server.
Internal IPs may be assigned automatically via the DHCP protocol.
http://serverfault.com/questions/115503/why-are-class-c-ip-addresses-preferred-in-private-networks
curl ipecho.net/plain
curl ifconfig.me
- http://askubuntu.com/questions/95910/command-for-determining-my-public-ip
- http://unix.stackexchange.com/questions/22615/how-can-i-get-my-external-ip-address-in-bash
By default, if an unsolicited request (a TCP SYN or a UDP datagram) is made to most routers from the outside, it is simply dropped.
This means that you cannot run a publicly reachable web server from behind the router.
To avoid this, you must set up port forwarding, which specifies that traffic to certain ports of a given protocol (TCP or UDP) will be sent to a given LAN IP.
It seems only possible to do this if the local IP is static TODO confirm.
Classless Inter-Domain Routing: the number of network bits is part of the address.
http://tools.ietf.org/html/rfc1517, 1993.
Name of the analogous concept in IPv6, restricted to the by far dominant case of a mask made only of leading 1s.
In theory, IPv4 subnet masks could be any 4 byte pattern as the name suggests.
ifconfig wlan0 | sed -nr 's/.*Mask:([^ ]*)/\1/p'
Internal IPs have two parts: network and computer.
The length of the network part may vary between networks.
The length is given by the subnet mask, e.g.:
255.255.255.0
1111.1111.1111.0000
means that 12 first bits are network.
255.255.0.0
means that 8 first bits are network
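The relation between a dotted-decimal mask and the number of network bits can be checked mechanically. The sketch below assumes masks made only of leading 1s; prefix_length and split_address are made-up helper names, not part of any library.

```python
import ipaddress

def prefix_length(mask: str) -> int:
    """Count the leading 1 bits of a dotted-decimal subnet mask."""
    bits = format(int(ipaddress.IPv4Address(mask)), "032b")
    ones = bits.rstrip("0")
    if "0" in ones:
        raise ValueError(f"mask {mask} is not a run of leading ones")
    return len(ones)

def split_address(address: str, mask: str) -> tuple:
    """Return the (network part, host part) of an address as dotted decimals."""
    addr = int(ipaddress.IPv4Address(address))
    m = int(ipaddress.IPv4Address(mask))
    network = ipaddress.IPv4Address(addr & m)                # keep the masked bits
    host = ipaddress.IPv4Address(addr & ~m & 0xFFFFFFFF)     # keep the remaining bits
    return str(network), str(host)

print(prefix_length("255.255.255.0"))
print(prefix_length("255.255.0.0"))
print(split_address("192.168.0.2", "255.255.255.0"))
```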
All computers in the same network must have the same subnet mask and the same network part, but different computer parts.
Each network (formally network segment) is run by a single router #TODO confirm
It is cumbersome for humans to write the subnet mask with 4 integers, so a simpler notation was introduced. E.g.:
255.255.255.0/24
means:
- address: 255.255.255.0
- subnet mask: 24 leading ones
Specified in http://tools.ietf.org/html/rfc6890#section-2.2.2 (2013), which determines current best practices and links to the many other RFCs that define the individual addresses.
Some special addresses were specified in RFCs separate from the main IP specs.
http://serverfault.com/questions/135267/what-is-the-network-address-x-x-x-0-used-for
If the entire host part is zero, then the address is used to refer to the network itself.
It is used when several networks, each one on a different router, must speak to each other.
0.0.0.0
as the destination in the routing table marks the default route.
If no network matches a request, it is sent to the gateway of this route.
This is the address you get automatically redirected to by the router if the address you gave cannot be found on the local network.
Find default gateway:
route -n
or for programmatic usage:
route -n | awk '$1 == "0.0.0.0" { print $2 }'
If a program is listening to 0.0.0.0, then it is listening on all interfaces:
- localhost 127.0.0.1
- local IP address of Wi-Fi or Ethernet, e.g. 192.168.0.2 (retrieved with ifconfig)
So if you do:
nc -l 0.0.0.0 8000
it will catch all of:
echo a | nc 0.0.0.0 8000
echo a | nc 127.0.0.1 8000
echo a | nc 192.168.0.2 8000
On netstat, this value appears in the Foreign Address column of a listening socket when no program is connected to it.
The same goes for writing:
nc -l 127.0.0.1 8000 &
echo a | nc 0.0.0.0 8000
http://superuser.com/a/802793/128124
In IPv6, :: means 0:0:0:0:0:0:0:0, and is the unspecified address.
For example, it appears in the netstat Foreign Address column of a listening socket when no program is connected to it.
http://tools.ietf.org/html/rfc1122#section-3.2.1.3 1989
127.0.0.1 always refers to the current host.
localhost is the common hostname associated with the loopback address.
Specified in: http://tools.ietf.org/html/rfc919 1984.
The host part of the address is entirely composed of 1s.
The broadcast address means talking to all computers on a given network at once instead of a single computer.
TODO: simple example with an utility that does something magic with it?
http://en.wikipedia.org/wiki/Link-local_address
TODO what is it? My interpretation: addresses that belong to the current network segment and never get forwarded by the router.
Used by zeroconf.
-
network part: 192.168.3
Broadcast is: 192.168.3.255
-
network part: 192.168.234
Broadcast is: 192.168.234.255
-
network part: 10
Broadcast is: 10.255.255.255
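The broadcast examples above can be reproduced by setting every host bit of the network address to 1. A minimal sketch, where broadcast_address is a made-up helper and the networks are written in CIDR notation:

```python
import ipaddress

def broadcast_address(network_cidr: str) -> str:
    """Set all host bits of the network address to 1."""
    net = ipaddress.ip_network(network_cidr)
    host_bits = 32 - net.prefixlen
    value = int(net.network_address) | ((1 << host_bits) - 1)
    return str(ipaddress.IPv4Address(value))

for cidr in ("192.168.3.0/24", "192.168.234.0/24", "10.0.0.0/8"):
    print(cidr, "->", broadcast_address(cidr))
```

The standard library exposes the same value as ipaddress.ip_network(...).broadcast_address; the manual version is only there to show the bit operation.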
The .1 address is not special, but in home networks it is often already taken by the router's inner interface.
This is why your addresses may start at .2.
If a router is to forward a packet to another interface, the MTU of that interface may be smaller than that of the previous one, so that the packet cannot be transmitted whole.
In this case, IP provides a fragmentation mechanism to split up the packet into smaller ones.
TCP always attempts to avoid IP fragmentation, so that if TCP is being used, it is unlikely that fragmentation will happen. It is therefore easier to observe IP fragmentation for UDP packets.
Reassembly of fragments is meant to happen only at the final destination.
The fields used from the IP header are:
-
flags (3 bits)
| 0 | DF | MF |
-
0: reserved, always set to 0
-
DF: don't fragment.
If a packet is too large for the next interface, the router first checks the DF flag.
If DF = 1, routers should not fragment, and notify the source via an ICMP Destination Unreachable message with Code = 4 (fragmentation needed and DF set).
If DF = 0, fragmentation can occur, and transmission continues.
-
MF: more fragments.
If 0, this is the last fragment.
If 1, there are more fragments to come.
A common technique is to send the last fragment first, which allows the destination to know in advance how large a buffer it will need.
-
fragment offset (13 bits)
How many 8 byte units of IP data come before this fragment in the original datagram.
The first fragment always has this field equal to 0.
If fragmentation occurs, the following fragments contain how much IP data comes before them.
-
identification (2 bytes)
It is set by the sender with a different number for each IP datagram.
If fragmentation occurs, each fragment's IP header gets the same identification number so that the fragments can be reassembled afterwards.
Identification numbers can be the same across multiple source IPs, but for a single source IP they are unique.
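How the offset and MF fields evolve over a fragmented datagram can be sketched as follows. This is a simplified model under stated assumptions: a fixed 20-byte header without options, fragments emitted in order, and a made-up function name fragment. It does not build real packets.

```python
def fragment(payload_len: int, mtu: int, header_len: int = 20) -> list:
    """Split payload_len bytes of IP data into fragments that fit the MTU.

    Returns (offset_in_8_byte_units, data_length, more_fragments) tuples.
    """
    # Every fragment except the last must carry a multiple of 8 bytes,
    # because the offset field counts 8-byte units.
    max_data = (mtu - header_len) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments = []
    sent = 0
    while sent < payload_len:
        size = min(max_data, payload_len - sent)
        more = 1 if sent + size < payload_len else 0   # MF = 0 only on the last one
        fragments.append((sent // 8, size, more))
        sent += size
    return fragments

# 4000 bytes of IP data over a link with a 1500-byte MTU.
for offset, size, mf in fragment(4000, 1500):
    print(f"offset={offset} data={size} MF={mf}")
```

All fragments produced this way would share the identification value of the original datagram, which is what lets the destination group them for reassembly.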
Network Interface Cards
Hardware that does network communication.
Mostly come built into the motherboard today.
Each router has at least 2 NICs: one external and one internal.
ifconfig | perl -ne '/^(\S+)/ && print $1 . "\n"'
AKA:
- physical address
- hardware address
- media access control address
- BIA: burnt in address
Unchangeable address of each NIC.
Unique across and within vendors.
6 bytes: the first 3 identify the vendor, the last 3 the individual device.
Colon separated notation. Ex: 0C:21:B8:47:5F:96.
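The 3 + 3 byte split can be illustrated with a small parser for the colon separated notation. The helper name split_mac is an assumption; the sketch only splits the string and does not look the vendor up anywhere.

```python
def split_mac(mac: str) -> tuple:
    """Split a colon-separated MAC address into its first and last 3 bytes."""
    octets = mac.split(":")
    if len(octets) != 6 or any(len(o) != 2 for o in octets):
        raise ValueError(f"not a colon-separated MAC address: {mac!r}")
    raw = bytes(int(o, 16) for o in octets)   # raises ValueError on non-hex digits
    first_half = ":".join(f"{b:02x}" for b in raw[:3])
    second_half = ":".join(f"{b:02x}" for b in raw[3:])
    return first_half, second_half

vendor_part, device_part = split_mac("0C:21:B8:47:5F:96")
print(vendor_part, device_part)
```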
Get MAC addresses of my computer:
ifconfig
Or for programmatic usage:
ifconfig | sed -nr 's/([^ ]*) .*HWaddr (.*)/\1 \2/p'
Get MAC addresses of computers I have already talked to in the LAN:
timeout 3 ping 192.168.1.3
arp -a | sed -nr 's/([^ ]*) .*at (.*)/\1 \2/p'
Internet Control Message Protocol.
Part of the IP protocol.
Contains several types of information for routers about the IP traffic.
Lives in the Internet layer, but is encapsulated inside an IP datagram just like TCP and UDP:
| ICMP data |
| IP data | IP header |
Its number in the IP protocol field is 1, so you may guess that this is a very important protocol.
Structure of the ICMP header
- Type 1B
- Code 1B
- Checksum 2B
- Variable part 4B
ICMP can also contain an optional extra data section after the variable part.
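The header layout above can be exercised by building an echo request in memory. In this sketch the 4-byte variable part is filled with a 2-byte identifier and a 2-byte sequence number, as echo messages do, and the checksum is the standard Internet checksum (RFC 1071). The function names are assumptions, and nothing is sent on the network.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                       # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                        # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Type 8, Code 0, Checksum, then the 4-byte variable part and optional data."""
    # The checksum is computed with the checksum field itself set to 0.
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload

packet = build_echo_request(identifier=1, sequence=1, payload=b"ping")
print(packet.hex())
```

A receiver can validate such a message by running the same checksum over the whole packet: a correct message yields 0.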
Examples of what ICMP can do:
-
echo request
The server responds with an echo reply carrying the same data.
Used for testing the network.
Used by the ping utility.
-
destination unreachable
Datagram cannot be transmitted further by a router.
TODO when does this happen? This is not the TTL = 0 case, since that is covered by Time Exceeded.
-
source quench
Router tells source to lower its sending speed because the network is overloaded.
-
redirect
Router tells source not to make that request to it again, since there is an obviously better choice of router to make the request to.
Typically happens on a LAN with 2 routers, if a host sends a request to the wrong router.
-
router discovery
Source asks any routers on the LAN to identify themselves.
-
time exceeded
-
if code = 0: TTL reached 0.
Router notifies source of that via ICMP.
Used by traceroute.
-
if code = 1: reassembly time exceeded.
IP datagrams can be fragmented.
If the first fragment arrives, but the second takes too long, the receiver discards the first to make room for other requests, and notifies the source like this.
-
packet too big (PTB)
Sent by router to source if it receives a packet that is larger than the MTU and the IP header has DF = 1.
Internet Group Management Protocol.
Similar to ICMP in some senses:
- wrapped inside IP
- IP control purposes
However IGMP is used to control multicasts only.
IP protocol number: 2.
http://tools.ietf.org/html/rfc4291 2006.
Terminal 1:
nc -l6 ::1 8000
Terminal 2:
echo a | nc -6 ::1 8000
Congratulations, your first explicit IPv6 connection!
IPv6 is less popular than IPv4 as of 2014, but its popularity is increasing, and it is likely that it will dominate in a few years.
A few metrics:
- percentage of Google requests on IPv6: https://www.google.com/intl/en/ipv6/statistics.html
Rules documented at: http://tools.ietf.org/html/rfc5952
Full addresses are:
- written in hex. The characters a-f must be lowercase.
- grouped every 2 bytes by :
- leading 0's MUST be removed. 0000 becomes 0.
- the longest sequence of :0:0:0: MUST be converted to ::. Can only be done once, or would lead to ambiguity.
- a single :0: shall not be converted to ::, only multiple 0's like :0:0:.
The rules that make addresses shorter lead to the canonical representation of the address: they are not just optional.
Good:
2001:db8::ff00:42:8329
which means in a fuller notation:
2001:0db8:0000:0000:0000:ff00:0042:8329
Bad, uppercase:
2001:DB8:0:0:0:FF00:42:8329
Bad, leading 0's:
2001:0db8::ff00:42:8329
Bad, 0's not compressed:
2001:db8:0:0:0:ff00:42:8329
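Python's standard ipaddress module applies the same shortening rules, so it can be used to normalize the "bad" spellings above. The helper names canonical and full are only illustrative.

```python
import ipaddress

def canonical(address: str) -> str:
    """Lowercase hex, no leading zeros, longest run of zero groups replaced by ::."""
    return ipaddress.IPv6Address(address).compressed

def full(address: str) -> str:
    """All 8 groups written out with 4 hex digits each."""
    return ipaddress.IPv6Address(address).exploded

for spelling in ("2001:DB8:0:0:0:FF00:42:8329",
                 "2001:0db8::ff00:42:8329",
                 "2001:db8:0:0:0:ff00:42:8329"):
    print(spelling, "->", canonical(spelling))

print(full("2001:db8::ff00:42:8329"))
```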
Fancy stuff: http://stackoverflow.com/questions/16561692/android-exception-handling-best-practice
Routers send packets from one place to another.
Most routers will have at least two network interfaces, one on each network.
If you want to play around with routers, you should get your hands dirty and do some router configuring.
Routers are generally configured through a browser.
First you must make a wired connection to the router.
You must enter the IP address of your router. This is fixed and supplied by the router manufacturer. A common address is the first address of the range: 192.168.0.1 for class C.
You must then enter a username and a password. A default will be supplied by the manufacturer, such as admin / admin, or admin / password. This can be changed once you have logged in.
Great LAN routing example: http://en.wikipedia.org/wiki/Default_gateway
Routing tables say: if the request should go to a given network, send it to a given interface.
0.0.0.0 is the default if no other is found.
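One common way to implement such a lookup is longest-prefix matching: among all entries whose network contains the destination, pick the most specific one, with 0.0.0.0/0 matching everything as a fallback. The sketch below uses a hypothetical table; the interface names eth0, eth1 and wan0 and the function pick_interface are assumptions, and real routers use far more efficient data structures.

```python
import ipaddress

# Hypothetical routing table: (destination network, outgoing interface).
ROUTES = [
    (ipaddress.ip_network("192.156.0.0/24"), "eth0"),
    (ipaddress.ip_network("192.157.0.0/24"), "eth1"),
    (ipaddress.ip_network("0.0.0.0/0"), "wan0"),   # default route, matches anything
]

def pick_interface(destination: str) -> str:
    """Return the interface of the most specific route that contains the destination."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, iface) for net, iface in ROUTES if dest in net]
    # The longest prefix is the most specific match; the default route has length 0.
    _net, iface = max(matches, key=lambda route: route[0].prefixlen)
    return iface

for target in ("192.156.0.3", "192.157.0.2", "8.8.8.8"):
    print(target, "->", pick_interface(target))
```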
Routers have two interfaces each: inside and outside.
TODO: what is it?
- 8.8.8.8 and 8.8.4.4: http://en.wikipedia.org/wiki/Google_Public_DNS