<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<!-- Set the viewport width to device width for mobile -->
<meta name="viewport" content="width=device-width" />
<title>How the Internet Works: Domain Names and Routing</title>
<link rel="stylesheet" href="http://lmontopo.github.io/theme/css/normalize.css" />
<link rel="stylesheet" href="http://lmontopo.github.io/theme/css/foundation.min.css" />
<link rel="stylesheet" href="http://lmontopo.github.io/theme/css/style.css" />
<link rel="stylesheet" href="http://lmontopo.github.io/theme/css/pygments.css" />
<script src="http://lmontopo.github.io/theme/js/modernizr.js"></script>
</head>
<body>
<!-- Nav Bar -->
<nav>
<div class="top-bar">
<div class="row">
<div class="large-9 large-centered columns">
<h1><a href="http://lmontopo.github.io">The L Blog</a></h1>
</div>
</div>
</div>
<!-- Show menu items and pages -->
<div class="row">
<div class="large-9 columns">
<ul class="button-group navigation">
</ul>
</div>
</div>
</nav>
<!-- End Nav -->
<!-- Main Page Content and Sidebar -->
<div class="row">
<!-- Main Blog Content -->
<div class="large-9 columns">
<article>
<header>
<h3 class="article-title"><a href="http://lmontopo.github.io/how-the-internet-works-2.html" rel="bookmark"
title="Permalink to How the Internet Works: Domain Names and Routing">How the Internet Works: Domain Names and Routing</a></h3>
</header>
<h6 class="subheader" title="2015-09-14T05:00:00-04:00">Mon 14 September 2015
</h6>
<p>This blog post is a continuation of my previous post <a href="http://lmontopo.github.io/how-the-internet-works.html">How the Internet Works</a>. Here, I hope to answer the following questions: </p>
<ul>
<li>
<p><strong>Since we don't usually specify the IP address of the computer we wish to connect with, how does our computer successfully connect with it?</strong> </p>
</li>
<li>
<p><strong>Assuming the IP address of the destination computer is known, how does a request from my computer actually make it to the computer with this IP? How does it know where that computer is?</strong></p>
</li>
</ul>
<h2>The Domain Name System</h2>
<p>A <strong>domain name</strong> is a human-readable address of a computer on the internet. Domain names are used all the time for activities that require the internet, including sending emails and browsing web pages. The part of a URL<sup id="fnref:1"><a class="footnote-ref" href="#fn:1" rel="footnote">1</a></sup> that appears after the protocol but before the file path is the domain name. For example, <code>lmontopo.github.io</code> is the domain name in the URL of this blog post: <code>https://lmontopo.github.io/how-the-internet-works-2.html</code>. </p>
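<p>To make this concrete, here is a minimal Python sketch that pulls the domain name out of a URL using the standard library's <code>urllib.parse</code>. The helper name <code>domain_name_of</code> is my own invention for illustration.</p>

```python
from urllib.parse import urlsplit


def domain_name_of(url: str) -> str:
    """Return the domain name part of a URL, without protocol, port, or path."""
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError(f"no domain name found in {url!r}")
    return host


url = "https://lmontopo.github.io/how-the-internet-works-2.html"
print(domain_name_of(url))  # lmontopo.github.io
```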
<p>The term <strong>Domain Name System (DNS)</strong> refers to the internet service that translates domain names into IP addresses. It consists of a distributed<sup id="fnref:2"><a class="footnote-ref" href="#fn:2" rel="footnote">2</a></sup> database of IP address and domain name pairs. Many servers around the world, known as <strong>DNS servers</strong>, work together to translate domain names into IP addresses. There are two kinds of DNS servers, caching servers and authoritative name servers: </p>
<ul>
<li>
<p><strong>Authoritative Servers</strong> are responsible for maintaining a database of the IP addresses of the domains within their 'authority'. <em>What does this mean?</em> Well, these servers are organized in a hierarchy that matches the structure of the domain names themselves. Domain names consist of one or more parts separated by dots. The rightmost part is the <em>top-level domain</em>. Examples of top-level domains include <code>com</code>, <code>org</code> and <code>edu</code>. Directly left of the top-level domain is the second-level domain, preceded by any further subdomains, until the leftmost word, the <em>host</em>. An authoritative DNS server handles a specific level in this hierarchy: it knows the IP addresses of the hosts directly under it, and for subdomains that have their own authoritative servers it knows where to find those servers. So, for example, an authoritative DNS server with authority over the <code>github.io</code> domain would know the IP address of host <code>lmontopo.github.io</code>.</p>
</li>
<li>
<p><strong>Caching Servers</strong> are your computer's first point of contact with the DNS. Your computer is automatically<sup id="fnref:3"><a class="footnote-ref" href="#fn:3" rel="footnote">3</a></sup> configured with the IP address of a caching DNS server (usually provided by your Internet Service Provider) so that your computer knows who to contact for DNS resolution. These caching servers communicate back and forth with authoritative name servers on your behalf until they find one that can resolve the supplied domain name into an IP address. </p>
</li>
</ul>
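<p>The hierarchy hidden in a domain name can be sketched in a few lines of Python. This is only an illustration: the function name <code>domains_most_specific_first</code> is made up, and it simply lists every domain a name belongs to, from the full name up to the top-level domain.</p>

```python
def domains_most_specific_first(domain_name: str) -> list[str]:
    """List the domains a name belongs to, from the full name up to the top-level domain."""
    parts = domain_name.rstrip(".").split(".")
    # Dropping the leftmost part each time moves one level up the hierarchy.
    return [".".join(parts[i:]) for i in range(len(parts))]


print(domains_most_specific_first("lmontopo.github.io"))
# ['lmontopo.github.io', 'github.io', 'io']
```

<p>Each entry after the first names a level of the hierarchy that some authoritative server could be responsible for.</p>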
<p><strong>OK, I know this stuff can be confusing, but if you've made it this far, bear with me!</strong></p>
<p><em>Let's go through an example to better understand how this works: Suppose we request the IP address of the domain name 'www.wikipedia.org'</em>:</p>
<p>As previously explained, our computer will first send a request to the DNS server (a caching server) specified in its network configuration settings. Let's call this caching server Jane, since we'll be talking about it a lot. Our computer will ask Jane something like “Can you tell me the IP address of www.wikipedia.org?”. Usually the DNS server provided by our ISP will have a cache<sup id="fnref:4"><a class="footnote-ref" href="#fn:4" rel="footnote">4</a></sup> of frequently resolved domain names. Assuming Jane has one, she'll search her cache for this domain name. If Jane finds it, she'll get back to us straight away with the desired IP address. Otherwise Jane will strip the host from the domain name and search her cache for a server with authority over the remaining name: <code>wikipedia.org</code>. If Jane finds this server's IP in her cache, she'll send our original request (“Can you tell me the IP address of www.wikipedia.org?”) to this server.</p>
<p>For now, let's assume Jane's cache doesn't contain <code>wikipedia.org</code>. Then Jane will search her cache again, this time for a server with authority over the <code>org</code> domain. (Notice that each time, Jane strips the leftmost word off the domain name.) Finally, if Jane doesn't find this server's IP either, she'll resort to contacting the <strong>DNS Root Servers</strong>. Root servers are the authoritative DNS servers that reside at the top of the hierarchy, right above the top-level DNS servers. <em>(Caching servers, like Jane, are pre-configured with a list of root servers so that they always have someone to contact.)</em></p>
<p>In this case the root server would respond with a list of IPs of servers for the <code>org</code> domain. Jane would then contact one of the servers from this list and ask it to resolve <code>www.wikipedia.org</code>. The <code>org</code> DNS server would give Jane some IPs of DNS servers with authority over the <code>wikipedia.org</code> domain. Jane could then contact one of these servers, any of which could resolve <code>www.wikipedia.org</code>. When Jane finally gets the translation, she sends us back our desired IP address. Jane will also keep (at least temporarily) a cache of the IP addresses that were just discovered: the IPs of the <code>org</code> domain servers, the IPs of the <code>wikipedia.org</code> domain servers and the IP address of <code>www.wikipedia.org</code>. The following picture demonstrates this process:<sup id="fnref:5"><a class="footnote-ref" href="#fnref:5" rel="footnote">5</a></sup></p>
<p><img alt="Diagram of an example DNS lookup" src="../images/example_lookup.png" /></p>
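<p>For readers who like code, here is a toy Python model of Jane looking up <code>www.wikipedia.org</code>. It is a sketch under heavy assumptions, not how real DNS software is written: there is no networking, the <code>SERVERS</code> table and the <code>CachingServer</code> class are invented for this example, and every IP address is made up (taken from ranges reserved for documentation).</p>

```python
# Toy model of the lookup walk-through. Every IP address here is made up.
ROOT_SERVER = "198.51.100.1"

# Hypothetical authoritative servers, keyed by their own IP address.
# "referrals" maps a domain to a server with authority over it;
# "addresses" maps a full domain name to its final IP address.
SERVERS = {
    "198.51.100.1": {"referrals": {"org": "198.51.100.2"}, "addresses": {}},
    "198.51.100.2": {"referrals": {"wikipedia.org": "198.51.100.3"}, "addresses": {}},
    "198.51.100.3": {
        "referrals": {},
        "addresses": {
            "www.wikipedia.org": "203.0.113.10",
            "en.wikipedia.org": "203.0.113.11",
        },
    },
}


class CachingServer:
    def __init__(self):
        self.address_cache = {}                # domain name -> IP address
        self.zone_servers = {"": ROOT_SERVER}  # domain -> server IP; "" is the root
        self.contacted = []                    # servers asked, in order

    def closest_known_server(self, name):
        parts = name.split(".")
        # Strip the leftmost word each time: wikipedia.org, then org, then the root.
        for i in range(1, len(parts) + 1):
            zone = ".".join(parts[i:])
            if zone in self.zone_servers:
                return self.zone_servers[zone]

    def resolve(self, name):
        if name in self.address_cache:
            return self.address_cache[name]
        server_ip = self.closest_known_server(name)
        while True:
            self.contacted.append(server_ip)
            server = SERVERS[server_ip]
            if name in server["addresses"]:
                self.address_cache[name] = server["addresses"][name]
                return self.address_cache[name]
            for zone, next_ip in server["referrals"].items():
                if name.endswith("." + zone):
                    self.zone_servers[zone] = next_ip  # remember for next time
                    server_ip = next_ip
                    break
            else:
                raise LookupError(f"cannot resolve {name!r}")


jane = CachingServer()
print(jane.resolve("www.wikipedia.org"))  # 203.0.113.10
print(jane.contacted)                     # root, then org, then wikipedia.org server
jane.resolve("en.wikipedia.org")          # starts at the cached wikipedia.org server
print(jane.contacted[-1])                 # 198.51.100.3
```

<p>Notice that the second lookup skips the root and <code>org</code> servers entirely, because Jane remembered who has authority over <code>wikipedia.org</code>.</p>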
<p><em>Cool Trick: you can actually bypass the whole domain name resolution process by typing in the IP address of the computer you wish to communicate with (although a website that shares its IP address with other sites may not load this way).</em></p>
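<p>A program can tell the two cases apart before deciding whether to ask a DNS server at all. Here is a small sketch using Python's standard <code>ipaddress</code> module. The function name <code>needs_dns_lookup</code> is hypothetical, and <code>192.0.2.44</code> is a made-up documentation address.</p>

```python
import ipaddress


def needs_dns_lookup(host: str) -> bool:
    """Return True if host is a domain name, False if it is already an IP address."""
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return True
    return False


print(needs_dns_lookup("www.wikipedia.org"))  # True
print(needs_dns_lookup("192.0.2.44"))         # False
```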
<h2>Routing</h2>
<p>Once we know the IP address of the computer we want to talk to, how does our request make its way to that computer? </p>
<p>The answer is <strong>routers</strong>.</p>
<p>Routers sit between networks and control the flow of messages between them. The main responsibilities of a router are to ensure messages get sent where they are intended, and to ensure messages don't get sent into networks where they weren't intended. But how does a router do this? </p>
<p>Well, each router usually knows about the IP addresses within its sub-networks but not about the IP addresses in networks above it (similar to DNS!). At a high level, we can imagine that a request arrives at its destination IP address by first travelling up this hierarchy of routers until it reaches a router that knows about the destination's IP address. At this point the request would begin to travel back down the chain in the direction of the desired computer. </p>
<p><strong>Although this gives us a nice idea of how routing works, in reality there's a bit more to it.</strong> The picture I painted seems to imply that the router at the ‘top’ of this chain is all-knowing (i.e. it knows the IP address of every computer below it), and that is not the case. Each router has a configuration table, often called a routing table, which lists patterns of IP addresses and rules to follow based on those patterns. The router reads the destination address of a message and matches it against the patterns in its table. When the router finds the pattern that the IP fits, it follows the related instructions to send the message in a specific direction. Before actually sending anything in that direction, the router checks whether the direction is satisfactory (i.e. traffic is flowing there and nothing seems broken). If all is well, it goes ahead. Otherwise it checks the configuration table for an alternate route. Because many different routes exist between two computers on the internet, a request made to google.com one minute might take a completely different route from a request made the next minute. And that's part of the beauty of routing: it's adaptable and can respond to changing conditions.</p>
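<p>Here is a rough Python sketch of that table lookup. The table contents, the link names and the function <code>choose_route</code> are all invented for illustration. I'm also assuming the common rule that the most specific matching pattern wins (often called longest-prefix matching), with earlier table entries preferred when two patterns are equally specific.</p>

```python
import ipaddress

# Hypothetical configuration table: (pattern of IP addresses, direction to send in).
ROUTES = [
    ("203.0.113.0/24", "link-a"),
    ("203.0.113.0/24", "link-b"),  # alternate route for the same pattern
    ("0.0.0.0/0", "uplink"),       # matches everything: the default direction
]


def choose_route(destination, routes, links_up):
    """Pick a direction for destination, skipping links that are not working."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(pattern), link)
        for pattern, link in routes
        if dest in ipaddress.ip_network(pattern) and link in links_up
    ]
    if not matches:
        return None
    # The most specific pattern wins; max() keeps the earliest entry on a tie.
    return max(matches, key=lambda match: match[0].prefixlen)[1]


print(choose_route("203.0.113.7", ROUTES, {"link-a", "link-b", "uplink"}))  # link-a
print(choose_route("203.0.113.7", ROUTES, {"link-b", "uplink"}))            # link-b
print(choose_route("198.51.100.9", ROUTES, {"link-a", "link-b", "uplink"})) # uplink
```

<p>The second call shows the adaptability described above: when <code>link-a</code> is down, the same destination is sent out a different way.</p>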
<p>To do their job, many routers (including the one in a typical home network) also perform <strong>Network Address Translation</strong> (NAT). Remember last time how I mentioned that your personal computer's public IP is your router's IP? Well, your router has to translate all of the traffic coming from your local network, re-addressing it with its own IP address. Then the router has to keep track of which incoming messages (all of which are addressed to <em>its</em> IP address!) are meant for which computer in its local network. Even more complicated, routers may have to deal with situations where an IP address used in your local network overlaps with a public IP address somewhere on the internet. In some cases this might cause issues, but other times your router may successfully create a 'lookup table' to keep track of which computer is which, and translate IP addresses to compensate.</p>
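<p>The 'lookup table' idea can be sketched in Python as well. This is a simplified model, not a real router: the class <code>NatRouter</code> is invented, all addresses are made up, and I'm assuming the router tells connections apart by giving each one its own port number, which this post has not covered.</p>

```python
class NatRouter:
    """Toy NAT: re-addresses outgoing traffic and remembers who sent what."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.by_public_port = {}  # public port -> (private IP, private port)
        self.by_private = {}      # (private IP, private port) -> public port

    def outbound(self, private_ip, private_port):
        """Return the (public IP, public port) an outgoing message is re-addressed with."""
        key = (private_ip, private_port)
        if key not in self.by_private:
            self.by_private[key] = self.next_port
            self.by_public_port[self.next_port] = key
            self.next_port += 1
        return (self.public_ip, self.by_private[key])

    def inbound(self, public_port):
        """Find which local computer a reply sent to the router is meant for."""
        return self.by_public_port.get(public_port)


router = NatRouter("203.0.113.5")
print(router.outbound("192.168.0.10", 5000))  # ('203.0.113.5', 40000)
print(router.outbound("192.168.0.11", 5000))  # ('203.0.113.5', 40001)
print(router.inbound(40001))                  # ('192.168.0.11', 5000)
```

<p>Both local computers appear to the outside world as <code>203.0.113.5</code>, and the table is what lets the router hand each reply to the right one.</p>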
<h3>Thanks Again</h3>
<p>Well, that's it (for now!) on how the internet works. I hope that if you've made it this far you've enjoyed the journey. As always, feel free to reach out to me if you have any questions! I am no expert, but if there's something I can help you understand I'd be happy to do so!</p>
<div class="footnote">
<hr />
<ol>
<li id="fn:1">
<p>URL stands for Uniform Resource Locator and refers to the web address used to locate a web page on the World Wide Web. <a class="footnote-backref" href="#fnref:1" rev="footnote" title="Jump back to footnote 1 in the text">↩</a></p>
</li>
<li id="fn:2">
<p>The word <em>distributed</em> in this context just means that the database is stored over many computers, as opposed to a single database on a single computer.  <a class="footnote-backref" href="#fnref:2" rev="footnote" title="Jump back to footnote 2 in the text">↩</a></p>
</li>
<li id="fn:3">
<p>Of course, you can override this setting if you'd like to use some other DNS server. <a class="footnote-backref" href="#fnref:3" rev="footnote" title="Jump back to footnote 3 in the text">↩</a></p>
</li>
<li id="fn:4">
<p>A <strong>cache</strong> is just a fancy word for data that is stored on a machine. Usually this data is the result of a computation or process that it previously performed, so that if it needs the result again it can provide it more quickly.  <a class="footnote-backref" href="#fnref:4" rev="footnote" title="Jump back to footnote 4 in the text">↩</a></p>
</li>
<li id="fn:5">
<p>I got this great image from the Wikipedia page on the Domain Name System. You can find it <a href="https://en.wikipedia.org/wiki/Domain_Name_System#/media/File:An_example_of_theoretical_DNS_recursion.svg">here</a>. <a class="footnote-backref" href="#fnref:5" rev="footnote" title="Jump back to footnote 5 in the text">↩</a></p>
</li>
</ol>
</div>
<p class="subheader">Category: <a href="http://lmontopo.github.io/category/blog.html">Blog</a>
Tagged:
<a href="http://lmontopo.github.io/tag/web.html">Web </a>
</p>
</article>
</div>
<!-- End Main Content -->
<!-- Sidebar -->
<aside class="large-3 columns">
<h5 class="sidebar-title">Site</h5>
<ul class="side-nav">
<li><a href="http://lmontopo.github.io/archives.html">Archives</a></li>
<li><a href="http://lmontopo.github.io/tags.html">Tags</a></li>
</ul>
<h5 class="sidebar-title">Categories</h5>
<ul class="side-nav">
<li><a href="http://lmontopo.github.io/category/about.html">About</a></li>
<li><a href="http://lmontopo.github.io/category/blog.html">Blog</a></li>
</ul>
</aside> <!-- End Sidebar -->
</div> <!-- End Main Content and Sidebar -->
<!-- Footer -->
<footer class="row">
<div class="large-12 columns">
<hr />
<div class="row">
<div class="large-6 columns">
<!-- <p>The L Blog by Leta Montopoli</p> -->
</div>
</div>
</div>
</footer>
</body>
</html>