박성범 Simon Park

The very concrete principle of how the internet works

Connecting to Google on a campus network

This article walks through what happens when a laptop on a campus network connects to Google (www.google.com). Everything here is simplified or assumed, from KT (short for Korea Telecom) and Google Fiber to Google’s network layout and each node’s IP and MAC addresses. Proxies and caches are omitted as well so the full exchange of packets between the client and the server can be followed from end to end.

There are several access points (APs) on the campus network, and the APs are connected to switches. The switches are connected to a gateway router, and the router is connected to the network of an ISP (Internet Service Provider) such as SK Broadband or KT.

Campus network 68.85.2.0/24

        +----+   +--------+    /------\    +--------+
Node ---| AP |---| Switch |---| Router |---| Switch |--- ...
        +----+   +--------+    \------/    +--------+
          |           |            |
         ...         ...           |
                                   |
###################################|#################
KT network 68.80.0.0/13            |
                                   |
                               /------\
                       ... ---| Router |
                               \------/
                                   |
                                  ...

Getting started: WLAN, DHCP, UDP, and IP

First, the laptop powers on and scans for access points (APs). Nearby APs periodically broadcast beacon frames to every node within range to announce their presence, so the client can inspect the list of APs and choose one to connect to. This method, where the client listens for the APs’ beacons, is called passive scanning. The opposite method, where the client broadcasts probe request frames and waits for APs to answer, is called active scanning.

+-----------------+        +----------------+   +----+
| Client (Laptop) |<---+---| SSID: iptime01 |---| AP |
+-----------------+    |   | Security: WPA2 |   +----+
                       |   +----------------+
              Node <---+
                       |
              Node <---+

Once the client selects iptime01, it sends an association request frame to the AP. The AP replies with an association response frame, and the connection is complete. Wi-Fi communication is based on the IEEE 802.11 standard.

Because this is the client’s first time connecting to the campus network, it doesn’t have an IP address yet. DHCP (Dynamic Host Configuration Protocol) now runs to assign one to the client. The DHCP server that hands out IP addresses can run on the AP or on the router.

The client’s operating system creates a DHCP DISCOVER message and places it in a UDP segment with source port 68 and destination port 67. The client has not been assigned an IP address yet, so both the source address of the IP datagram (src) and the message’s yiaddr field are 0.0.0.0. Because the client does not know the DHCP server’s address either, the datagram is sent to the broadcast address 255.255.255.255 (dest) and reaches every node connected to the AP.

The frame carrying the DHCP DISCOVER message has the broadcast address FF:FF:FF:FF:FF:FF as its destination MAC address, so it is delivered to every device connected to the AP. Its source MAC address is the client’s own (00:16:D3:23:68:8A).

+-------------------+   +---------------------------+    +----+
| Client (Laptop)   |---| src: 0.0.0.0, 68          |--->| AP | 68.85.2.9
| 00:16:D3:23:68:8A |   | dest: 255.255.255.255, 67 |    +----+ 00:1F:57:21:A8:3C
+-------------------+   | DHCPDISCOVER              |      |
                        | yiaddr: 0.0.0.0           |      +---> Node
                        | transaction ID: 654       |      |
                        +---------------------------+      +---> Node
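
The DISCOVER message in the diagram can be sketched as raw bytes. This is a minimal sketch, assuming the standard BOOTP/DHCP field layout; the function name build_dhcp_discover is made up for illustration, and the code only builds the UDP payload without sending it.

```python
import struct

DHCP_MAGIC_COOKIE = bytes([99, 130, 83, 99])  # marks the start of the DHCP options


def build_dhcp_discover(client_mac: str, transaction_id: int) -> bytes:
    """Build the UDP payload of a DHCP DISCOVER message (hypothetical helper)."""
    mac = bytes.fromhex(client_mac.replace(":", ""))
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1,               # op: 1 = request sent by a client
        1,               # htype: 1 = Ethernet
        6,               # hlen: a MAC address is 6 bytes long
        0,               # hops
        transaction_id,  # xid, echoed by the server in its reply
        0,               # secs
        0x8000,          # flags: ask the server to broadcast its reply
        bytes(4),        # ciaddr: 0.0.0.0, the client has no address yet
        bytes(4),        # yiaddr: 0.0.0.0, filled in later by the server
        bytes(4),        # siaddr
        bytes(4),        # giaddr
        mac,             # chaddr, zero-padded to 16 bytes by struct
        bytes(64),       # sname (unused here)
        bytes(128),      # file (unused here)
    )
    # Option 53 (message type) with length 1 and value 1 (DISCOVER), then end (255).
    options = bytes([53, 1, 1, 255])
    return header + DHCP_MAGIC_COOKIE + options


discover = build_dhcp_discover("00:16:D3:23:68:8A", 654)
print(len(discover), int.from_bytes(discover[4:8], "big"))
```

To actually send it, the payload would go out on a UDP socket bound to port 68 with broadcast enabled, addressed to 255.255.255.255 port 67.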

When the AP receives the frame, it extracts the client’s MAC address (00:16:D3:23:68:8A) and the IP datagram. Because the datagram’s destination is the broadcast address, the AP passes it up to its own upper layers: the UDP segment is demultiplexed to port 67, where the DHCP server is listening.

A DHCP server running on the AP assigns IP addresses from a CIDR (Classless Inter-Domain Routing) block. Every IP address in use on the campus network belongs to KT’s address block. The DHCP server creates a DHCP OFFER message that includes the IP address to assign to the client, 68.85.2.101. The message travels to the client over UDP in a frame whose source MAC address is the AP’s (00:1F:57:21:A8:3C) and whose destination MAC address is the client’s (00:16:D3:23:68:8A).

+-------------------+    +---------------------------+   +----+
| Client (Laptop)   |<---| src: 68.85.2.9, 67        |---| AP | 68.85.2.9
| 00:16:D3:23:68:8A |    | dest: 255.255.255.255, 68 |   +----+ 00:1F:57:21:A8:3C
+-------------------+    | DHCPOFFER                 |
                         | yiaddr: 68.85.2.101       |
                         | transaction ID: 654       |
                         | DHCP server ID: 68.85.2.9 |
                         | Lifetime: 3600 secs       |
                         +---------------------------+
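
That the offered address really falls inside KT’s block can be checked with Python’s ipaddress module. This is only an illustrative check of the article’s assumed addresses; the /24 prefix length of the campus subnet is taken from the campus network label and is an assumption like everything else here.

```python
import ipaddress

kt_block = ipaddress.ip_network("68.80.0.0/13")  # KT's block in this article
client = ipaddress.ip_address("68.85.2.101")     # address offered to the client
ap = ipaddress.ip_address("68.85.2.9")           # the AP running the DHCP server

# The /24 subnet that contains the offered address (prefix length assumed).
campus_subnet = ipaddress.ip_interface("68.85.2.101/24").network

print(campus_subnet)                        # 68.85.2.0/24
print(client in kt_block, ap in kt_block)   # True True
print(campus_subnet.subnet_of(kt_block))    # True
print(kt_block[0], kt_block[-1])            # 68.80.0.0 68.87.255.255
```

A /13 prefix fixes the first 13 bits, so KT’s block spans the second octet from 80 through 87, which is why 68.85.2.x belongs to it.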

After receiving the DHCP OFFER response, the client broadcasts a DHCP REQUEST message containing its own configuration values based on the DHCP OFFER. If it received several DHCP OFFER messages, it chooses one of them.

+-------------------+   +---------------------------+    +----+
| Client (Laptop)   |---| src: 0.0.0.0, 68          |--->| AP | 68.85.2.9
| 00:16:D3:23:68:8A |   | dest: 255.255.255.255, 67 |    +----+ 00:1F:57:21:A8:3C
+-------------------+   | DHCPREQUEST               |      |
                        | yiaddr: 68.85.2.101       |      +---> Node
                        | transaction ID: 655       |      |
                        | DHCP server ID: 68.85.2.9 |      +---> Node
                        | Lifetime: 3600 secs       |
                        +---------------------------+

When the DHCP server on the AP receives the DHCP REQUEST, it broadcasts a DHCP ACK message approving the requested configuration values.

+-------------------+        +---------------------------+   +----+
| Client (Laptop)   |<---+---| src: 68.85.2.9, 67        |---| AP | 68.85.2.9
| 68.85.2.101       |    |   | dest: 255.255.255.255, 68 |   +----+ 00:1F:57:21:A8:3C
| 00:16:D3:23:68:8A |    |   | DHCPACK                   |
+-------------------+    |   | yiaddr: 68.85.2.101       |
                         |   | transaction ID: 655       |
                Node <---+   | DHCP server ID: 68.85.2.9 |
                         |   | Lifetime: 3600 secs       |
                Node <---+   +---------------------------+

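The client extracts the DHCP ACK message and records its own IP address along with the other settings the server supplies, such as the DNS server’s IP address and the default gateway’s. An assigned IP address is only leased: the lease lasts for the lifetime given in the message (3600 seconds here), and if the client does not renew it before it expires, the address returns to the pool and the client has to obtain a new one.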
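
The four-message exchange and the lease can be sketched as a toy server. This is a simplified model, not a real DHCP implementation: the class ToyDhcpServer, its method names, and the two-address pool are all hypothetical, and time is passed in as a plain number of seconds.

```python
class ToyDhcpServer:
    """Hands out addresses from a pool and tracks when each lease expires."""

    def __init__(self, server_id, pool, lease_secs=3600):
        self.server_id = server_id
        self.free = list(pool)   # addresses not yet leased
        self.offered = {}        # MAC -> address offered but not yet confirmed
        self.leases = {}         # MAC -> (address, expiry time in seconds)
        self.lease_secs = lease_secs

    def discover(self, mac):
        """DISCOVER -> OFFER: reserve an address for this MAC."""
        address = self.free.pop(0)
        self.offered[mac] = address
        return {"type": "OFFER", "yiaddr": address, "server_id": self.server_id}

    def request(self, mac, address, server_id, now):
        """REQUEST -> ACK: confirm the lease if the request matches our offer."""
        if server_id != self.server_id:
            # The client picked another server: withdraw our offer silently.
            withdrawn = self.offered.pop(mac, None)
            if withdrawn is not None:
                self.free.append(withdrawn)
            return None
        if self.offered.get(mac) != address:
            return {"type": "NAK"}
        del self.offered[mac]
        self.leases[mac] = (address, now + self.lease_secs)
        return {"type": "ACK", "yiaddr": address, "lifetime": self.lease_secs}

    def expire(self, now):
        """Return addresses whose lease ran out without renewal to the pool."""
        for mac, (address, expiry) in list(self.leases.items()):
            if expiry <= now:
                del self.leases[mac]
                self.free.append(address)


server = ToyDhcpServer("68.85.2.9", ["68.85.2.101", "68.85.2.102"])
mac = "00:16:D3:23:68:8A"
offer = server.discover(mac)
ack = server.request(mac, offer["yiaddr"], offer["server_id"], now=0)
print(offer["yiaddr"], ack["type"])  # 68.85.2.101 ACK
server.expire(now=3600)
print(server.leases)                 # {}
```

The server ID in the REQUEST is what lets several DHCP servers on the same network tell which offer the client accepted.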

Still getting started: Ethernet, DNS, and ARP

When the user types a URL like www.google.com into the browser, Google’s home page appears only after several more steps. First, the client has to find the IP address of www.google.com, and DNS (Domain Name System) is what translates the domain name into an IP address.

For the DNS query, the client sends a frame containing a DNS query message to a local DNS server on the campus network. A DNS server can also run on the router, but in this case a separate local DNS server is in operation.

If there is no local DNS server, the query is sent to the ISP’s DNS server. In that case, if the ISP wants to block access to a specific site, it can reply not with the IP address of the requested domain name but with the IP address of warning.or.kr.

Right now, the client does not know the local DNS server’s MAC address. It uses ARP (Address Resolution Protocol) to find it.

The client creates an ARP query message that includes the local DNS server’s IP address and places it in a frame addressed to the broadcast MAC address (FF:FF:FF:FF:FF:FF). The frame is sent to the AP, which passes it on over Ethernet to the switch it is connected to. Because the frame is a broadcast, the switch sends it out on every port, so it reaches every device attached to the switch. Ethernet is the communication standard for building wired LANs (Local Area Networks), and it is defined by IEEE 802.3.

+-------------------+   +------------------------------+    +----+    +--------+     /------\
| Client (Laptop)   |---| Sender HA: 00-16-D3-23-68-8A |--->| AP |--->| Switch |--->| Router | 68.85.2.1
| 68.85.2.101       |   | Sender IP: 68.85.2.101       |    +----+    +--------+     \------/  00:22:6B:45:1F:1B
| 00:16:D3:23:68:8A |   | Target HA: 00-00-00-00-00-00 |                |    |
+-------------------+   | Target IP: 68.85.2.2         |                |    |     +------------------+
                        +------------------------------+                |    +---->| Local DNS Server | 68.85.2.2
                                                                        |          +------------------+
                                                                        |
                                                                        +---------> Node

Because the local DNS server’s MAC address is not known, the value of the Target HA field is set to 00-00-00-00-00-00.
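
The ARP query shown above can be sketched as bytes as well. This is a minimal sketch assuming the standard ARP layout for IPv4 over Ethernet; the helper names mac_bytes and build_arp_request are made up for illustration, and nothing is sent on the wire.

```python
import socket
import struct


def mac_bytes(mac: str) -> bytes:
    """Accept either 00:16:D3:... or 00-16-D3-... notation."""
    return bytes.fromhex(mac.replace(":", "").replace("-", ""))


def build_arp_request(sender_mac: str, sender_ip: str, target_ip: str) -> bytes:
    """Build the 28-byte ARP payload for an IPv4-over-Ethernet query."""
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # hardware type: Ethernet
        0x0800,                       # protocol type: IPv4
        6,                            # hardware address length
        4,                            # protocol address length
        1,                            # operation: 1 = request, 2 = reply
        mac_bytes(sender_mac),        # Sender HA
        socket.inet_aton(sender_ip),  # Sender IP
        bytes(6),                     # Target HA: unknown, so all zeros
        socket.inet_aton(target_ip),  # Target IP
    )


arp = build_arp_request("00-16-D3-23-68-8A", "68.85.2.101", "68.85.2.2")

# Ethernet header: broadcast destination, client source, EtherType 0x0806 (ARP).
ethernet_header = (
    mac_bytes("FF:FF:FF:FF:FF:FF")
    + mac_bytes("00:16:D3:23:68:8A")
    + struct.pack("!H", 0x0806)
)
frame = ethernet_header + arp
print(len(arp), len(frame))  # 28 42
```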

When the local DNS server receives the frame containing the ARP query message, it prepares an ARP reply containing its own MAC address. The ARP reply is placed in an Ethernet frame addressed to the client’s MAC address (00:16:D3:23:68:8A) and sent to the switch, which forwards it toward the client through the AP.

+-------------------+    +----+    +--------+    +------------------------------+   +------------------+
| Client (Laptop)   |<---| AP |<---| Switch |<---| Sender HA: 00-07-89-1C-43-2F |---| Local DNS Server | 68.85.2.2
| 68.85.2.101       |    +----+    +--------+    | Sender IP: 68.85.2.2         |   +------------------+ 00:07:89:1C:43:2F
| 00:16:D3:23:68:8A |                            | Target HA: 00-16-D3-23-68-8A |
+-------------------+                            | Target IP: 68.85.2.101       |
                                                 +------------------------------+
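
How the switch treats the broadcast query and the unicast reply differently can be sketched with a toy learning switch. This is a simplified model; the class LearningSwitch and the port names are hypothetical.

```python
BROADCAST = "FF:FF:FF:FF:FF:FF"


class LearningSwitch:
    """A toy Ethernet switch that learns which port each MAC address is on."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out on."""
        self.mac_table[src_mac] = in_port  # learn where the sender is
        if dst_mac != BROADCAST and dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]  # known unicast: one port only
        # Broadcast or unknown destination: flood to every other port.
        return [port for port in self.ports if port != in_port]


switch = LearningSwitch(ports=["ap", "router", "dns", "node"])
client, dns = "00:16:D3:23:68:8A", "00:07:89:1C:43:2F"

query_ports = switch.receive("ap", client, BROADCAST)  # ARP query from the client
reply_ports = switch.receive("dns", dns, client)       # ARP reply from the DNS server
print(query_ports)  # ['router', 'dns', 'node']
print(reply_ports)  # ['ap']
```

The reply goes out on a single port because the switch learned the client’s location from the query that passed through a moment earlier.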

The client receives the frame containing the ARP reply message and extracts the local DNS server’s MAC address (00:07:89:1C:43:2F) from the ARP reply message. Now that the client knows the local DNS server’s MAC address, it can send a DNS query message.

The client’s operating system creates a DNS query message and puts the string www.google.com into the question section of the message. The message is placed in a UDP segment with destination port 53, and the segment is placed in an IP datagram whose destination IP address is the local DNS server’s (68.85.2.2). The datagram in turn is placed in a frame addressed to the local DNS server’s MAC address. The client sends the frame to the AP, the AP forwards it to the switch, and the switch sends it to the local DNS server connected to it.

+-------------------+   +---------------------------------------+    +----+    +--------+    +------------------+
| Client (Laptop)   |---| Identification, Flags (OP, Query type,|--->| AP |--->| Switch |--->| Local DNS Server | 68.85.2.2
| 68.85.2.101       |   | AA, TC, RD, RA, Response Type)        |    +----+    +--------+    +------------------+ 00:07:89:1C:43:2F
| 00:16:D3:23:68:8A |   | Questions (Query name, type, class )  |
+-------------------+   | Answers, Authority, Additional Info   |
                        +---------------------------------------+
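
The query message in the diagram can be sketched as follows. This is a minimal sketch assuming the standard DNS message format; the helper names and the query ID 0x1234 are arbitrary choices for illustration, and the code builds the message without sending it.

```python
import struct


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    encoded = b""
    for label in name.rstrip(".").split("."):
        encoded += bytes([len(label)]) + label.encode("ascii")
    return encoded + b"\x00"


def build_dns_query(name: str, query_id: int) -> bytes:
    """Build a DNS query asking for the A record of name."""
    header = struct.pack(
        "!HHHHHH",
        query_id,  # Identification, copied into the reply
        0x0100,    # Flags: standard query with RD (recursion desired) set
        1,         # number of questions
        0,         # number of answers
        0,         # number of authority records
        0,         # number of additional records
    )
    question = encode_name(name) + struct.pack("!HH", 1, 1)  # type A, class IN
    return header + question


query = build_dns_query("www.google.com", query_id=0x1234)
print(len(query))    # 32
print(query[12:28])  # b'\x03www\x06google\x03com\x00'
```

The answer, authority, and additional sections are empty in a query, so the message is just the 12-byte header followed by one question.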

Still getting started: intra-domain routing to the DNS server

The local DNS server looks at the DNS query message and tries to find the IP address of www.google.com. If the local DNS server already knows that IP address, it can respond to the client immediately. But right now it does not know the IP address of www.google.com, so it sends a DNS query to a root DNS server. There are only 13 root DNS server addresses in the world (a.root-servers.net through m.root-servers.net), but each one is served by many mirror instances operated in many countries, including Korea.

Because the root DNS server is on an external network, the packet has to pass through the gateway router. At this point the local DNS server does not know the router’s MAC address, so it uses ARP, following the same process the client used earlier to learn the local DNS server’s MAC address.

+-------------------+   +------------------------------+    +--------+     /------\
| Local DNS Server  |---| Sender HA: 00:07:89:1C:43:2F |--->| Switch |--->| Router | 68.85.2.1
| 68.85.2.2         |   | Sender IP: 68.85.2.2         |    +--------+     \------/  00:22:6B:45:1F:1B
| 00:07:89:1C:43:2F |   | Target HA: 00-00-00-00-00-00 |      |    |
+-------------------+   | Target IP: 68.85.2.1         |      |    |
                        +------------------------------+      |    +-----> Node
                                                              |
                                                              |
                                                              +----------> Node

The router receives the frame, extracts the IP datagram containing the DNS query, and looks up the datagram’s destination address in its forwarding table to decide which router on KT’s network should receive it. The IP datagram is encapsulated in a link-layer frame, and the frame is sent over the link between the campus router and a KT router.
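
The forwarding-table lookup works by longest-prefix matching, which can be sketched like this. The table entries and interface names below are hypothetical; only the address ranges come from this article.

```python
import ipaddress

# Hypothetical forwarding table: (destination prefix, outgoing interface).
FORWARDING_TABLE = [
    (ipaddress.ip_network("68.85.2.0/24"), "campus-lan"),
    (ipaddress.ip_network("0.0.0.0/0"), "kt-uplink"),  # default route
]


def lookup(destination: str) -> str:
    """Return the interface of the most specific prefix containing destination."""
    address = ipaddress.ip_address(destination)
    matches = [(net, iface) for net, iface in FORWARDING_TABLE if address in net]
    _, interface = max(matches, key=lambda entry: entry[0].prefixlen)
    return interface


print(lookup("68.85.2.101"))     # campus-lan
print(lookup("172.217.25.228"))  # kt-uplink
```

Every address matches the default route 0.0.0.0/0, so a more specific prefix wins whenever one exists.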

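After receiving the frame, a router in KT’s network extracts the IP datagram, checks the destination IP address, and forwards the datagram according to its own forwarding table. If the local DNS server is set up to pass its queries to KT’s DNS server rather than resolve them itself, the datagram is delivered to that server, which looks for a record matching the requested www.google.com.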

Campus network 68.85.2.0/24

+-------------------+    +--------+     /------\
| Local DNS Server  |<-->| Switch |<-->| Router | 68.85.2.1
| 68.85.2.2         |    +--------+     \------/  00:22:6B:45:1F:1B
| 00:07:89:1C:43:2F |                       ^
+-------------------+                       |
                                            |
############################################|#####
KT network 68.80.0.0/13                     |
                                            v
                  +---------------+     /-------\
                  | KT DNS Server |<-->| Routers |
                  +---------------+     \-------/

If KT’s own DNS server has no matching record either, the query has to go to a root DNS server. Each router along the way decides on the output interface and forwards the datagram carrying the DNS query toward the root DNS server’s network. The root DNS server extracts the query message, sees that the top-level domain (TLD) of www.google.com is .com, and responds with the IP address of a .com TLD server.

Campus network 68.85.2.0/24

+-------------------+    +--------+     /------\
| Local DNS Server  |<-->| Switch |<-->| Router | 68.85.2.1
| 68.85.2.2         |    +--------+     \------/  00:22:6B:45:1F:1B
| 00:07:89:1C:43:2F |                       ^
+-------------------+                       |
                                            |
############################################|#####
KT network 68.80.0.0/13                     |
                                            v
                                        /-------\
                                       | Routers |
                                        \-------/
                                            ^
                                            |
############################################|#####
Root DNS Server network                     |
                                            v
                +-----------------+     /-------\
                | Root DNS Server |<-->| Routers |
                +-----------------+     \-------/

After receiving the response, the local DNS server sends a DNS query to the .com TLD server. The .com TLD server looks for a DNS resource record matching the name. It does not hold the IP address of www.google.com itself, so it responds with the IP address of Google’s authoritative name server for google.com.

Campus network 68.85.2.0/24

+-------------------+    +--------+     /------\
| Local DNS Server  |<-->| Switch |<-->| Router | 68.85.2.1
| 68.85.2.2         |    +--------+     \------/  00:22:6B:45:1F:1B
| 00:07:89:1C:43:2F |                       ^
+-------------------+                       |
                                            |
############################################|#####
KT network 68.80.0.0/13                    ...
############################################|#####
TLD DNS Server network                      |
                                            v
             +--------------------+     /-------\
             | com TLD DNS Server |<-->| Routers |
             +--------------------+     \-------/

The local DNS server then sends the query to Google’s name server, which responds with the IP address of www.google.com using its data that maps hostnames to IP addresses. Because Google’s servers are assumed to be overseas, the query and the response have to cross a submarine cable connected to KT’s network and pass through the ISP network in the country where Google’s servers are located. (A map of submarine cables is available at TeleGeography Submarine Cable Map.) In the United States there are ISPs such as AT&T and Comcast; in this scenario Google uses the network of Google Fiber, the ISP it operates itself.

Campus network 68.85.2.0/24

+-------------------+    +--------+     /------\
| Local DNS Server  |<-->| Switch |<-->| Router | 68.85.2.1
| 68.85.2.2         |    +--------+     \------/  00:22:6B:45:1F:1B
| 00:07:89:1C:43:2F |                       ^
+-------------------+                       |
                                            |
############################################|#####
KT network 68.80.0.0/13                    ...
############################################|#####
Google Fiber network 172.80.0.0/13         ...
############################################|#####
Google network 172.217.0.0/19               |
                                            v
               +------------------+     /-------\
               | Auth. DNS Server |<-->| Routers |
               +------------------+     \-------/

The local DNS server sends a DNS reply containing this address back to the client, and the client extracts the IP address of www.google.com from the DNS message. Now the client is ready to contact the www.google.com server.
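
The chain of referrals from the root server down to Google’s name server can be sketched as a small simulation. This is a toy model with no real DNS traffic: the server addresses in ZONES are hypothetical (taken from a documentation address range), and only 172.217.25.228 comes from this article.

```python
# Hypothetical server addresses; only 172.217.25.228 comes from the article.
ROOT_SERVER = "198.51.100.1"
ZONES = {
    "198.51.100.1": {"referrals": {"com": "198.51.100.2"}},             # root
    "198.51.100.2": {"referrals": {"google.com": "198.51.100.3"}},      # .com TLD
    "198.51.100.3": {"answers": {"www.google.com": "172.217.25.228"}},  # Google
}


def ask(server: str, name: str) -> dict:
    """Simulate one DNS query: return an answer or a referral to another server."""
    zone = ZONES[server]
    if name in zone.get("answers", {}):
        return {"answer": zone["answers"][name]}
    for suffix, next_server in zone.get("referrals", {}).items():
        if name == suffix or name.endswith("." + suffix):
            return {"referral": next_server}
    return {}


def resolve(name: str, cache: dict) -> str:
    """Resolve name iteratively, the way the local DNS server does."""
    if name in cache:
        return cache[name]  # answered immediately from earlier lookups
    server = ROOT_SERVER
    while True:
        reply = ask(server, name)
        if "answer" in reply:
            cache[name] = reply["answer"]
            return reply["answer"]
        if "referral" not in reply:
            raise LookupError(f"no such name: {name}")
        server = reply["referral"]


cache = {}
print(resolve("www.google.com", cache))  # 172.217.25.228
```

After the first lookup the answer sits in the cache, which is why a local DNS server that already knows the address can respond immediately.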

Web client-server interaction: TCP and HTTP

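The client creates a TCP socket to send an HTTP GET message. Before it can send anything, it has to establish a TCP connection with www.google.com through the three-way handshake. The client first creates a TCP SYN segment with destination port 443 (HTTPS), places it in an IP datagram addressed to www.google.com, and sends the frame toward the switch. Because the server is outside the campus network, the frame’s destination MAC address is the gateway router’s (00:22:6B:45:1F:1B), which the client learns through ARP just as it did for the local DNS server.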

Routers on the campus network, KT’s network, Google Fiber’s network, and Google’s network each consult their forwarding tables and pass the datagram carrying the TCP SYN along toward the www.google.com web server. The forwarding-table entries that govern the inter-domain links between the ISPs’ networks and Google’s network are determined by BGP (Border Gateway Protocol).

Google’s actual network is surely more complex, with firewalls, load balancers, API gateways, and the like. But Google’s network architecture is not the core point here, so I simplified it.

Campus network 68.85.2.0/24

+-------------------+   +------------+    +----+    +--------+     /------\
| Client (Laptop)   |---| Seq Num: 1 |--->| AP |--->| Switch |--->| Router | 68.85.2.1
| 68.85.2.101       |   | SYN        |    +----+    +--------+     \------/  00:22:6B:45:1F:1B
| 00:16:D3:23:68:8A |   +------------+                                 |
+-------------------+                                                  |
                                                                      |
#######################################################################|#####
KT network 68.80.0.0/13                                               ...
#######################################################################|#####
Google Fiber network 172.80.0.0/13                                    ...
#######################################################################|#####
Google network 172.217.0.0/19                                          |
                                                                       v
                                            +----------------+     /-------\
                                            | Web Server     |<---| Routers |
                                            | 172.217.25.228 |     \-------/
                                            +----------------+

The TCP SYN segment that reaches www.google.com is extracted from the datagram and demultiplexed to the socket listening on port 443. Google’s HTTP server creates a connection socket for its TCP connection with the client, then generates a TCP SYN-ACK segment and sends it to the client.

Campus network 68.85.2.0/24

+-------------------+    +----+    +--------+     /------\
| Client (Laptop)   |<---| AP |<---| Switch |<---| Router | 68.85.2.1
| 68.85.2.101       |    +----+    +--------+     \------/  00:22:6B:45:1F:1B
| 00:16:D3:23:68:8A |                                 ^
+-------------------+                                 |
                                                      |
######################################################|#####
KT network 68.80.0.0/13                              ...
######################################################|#####
Google Fiber network 172.80.0.0/13                   ...
######################################################|#####
Google network 172.217.0.0/19                         |
                                                      |
          +----------------+   +------------+     /-------\
          | Web Server     |---| Seq Num: 5 |--->| Routers |
          | 172.217.25.228 |   | Ack Num: 2 |     \-------/
          +----------------+   | SYNACK     |
                               +------------+

The datagram carrying the TCP SYN-ACK segment arrives at the client. The operating system demultiplexes the segment to the TCP socket that was created earlier. After receiving the SYN-ACK segment, the client sends back a TCP ACK and the TCP connection is established.

Campus network 68.85.2.0/24

+-------------------+   +------------+    +----+    +--------+     /------\
| Client (Laptop)   |---| Seq Num: 2 |--->| AP |--->| Switch |--->| Router | 68.85.2.1
| 68.85.2.101       |   | Ack Num: 6 |    +----+    +--------+     \------/  00:22:6B:45:1F:1B
| 00:16:D3:23:68:8A |   | ACK        |                                 |
+-------------------+   +------------+                                 |
                                                                       |
#######################################################################|#####
KT network 68.80.0.0/13                                               ...
#######################################################################|#####
Google Fiber network 172.80.0.0/13                                    ...
#######################################################################|#####
Google network 172.217.0.0/19                                          |
                                                                       v
                                            +----------------+     /-------\
                                            | Web Server     |<---| Routers |
                                            | 172.217.25.228 |     \-------/
                                            +----------------+
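
The sequence and acknowledgment numbers in the three diagrams follow one rule, which can be sketched as below. The Segment class and the function name are made up for illustration; the initial sequence numbers 1 and 5 are the ones used in the diagrams.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    flags: str
    seq: int
    ack: Optional[int] = None


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three segments exchanged to open a TCP connection."""
    syn = Segment("SYN", seq=client_isn)
    # A SYN consumes one sequence number, so each side acknowledges ISN + 1.
    syn_ack = Segment("SYNACK", seq=server_isn, ack=syn.seq + 1)
    ack = Segment("ACK", seq=syn.seq + 1, ack=syn_ack.seq + 1)
    return [syn, syn_ack, ack]


handshake = three_way_handshake(client_isn=1, server_isn=5)
for segment in handshake:
    print(segment.flags, segment.seq, segment.ack)
```

Real TCP stacks pick unpredictable initial sequence numbers rather than small values like these.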

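Because Google’s web server uses HTTPS, a TLS handshake has to happen as well. The steps here follow the RSA key exchange used in TLS 1.2 and earlier; TLS 1.3 uses a Diffie-Hellman key exchange instead, but the goal of agreeing on a shared session key is the same. First the client generates a client random string and sends it to the server in a ClientHello message. The server receives the ClientHello and responds with a ServerHello message containing a server random string, followed by its SSL certificate.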

The client validates the SSL certificate it received from the server and confirms that the server it is connected to really belongs to www.google.com. It then sends the server a random string encrypted with the server’s public key. This string is called the premaster secret, and only the Google server’s private key can decrypt it.

The server receives the premaster secret from the client and decrypts it with its private key. The server and the client each generate a session key from the server random, the client random, and the premaster secret. The server’s session key and the client’s session key must be identical.

Finally, the client and server exchange Finished messages encrypted with the session key and complete the TLS handshake. Now the client and server can communicate over TLS with symmetric-key encryption using that session key.
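
Why both sides end up with the same key can be sketched with the key derivation function of TLS 1.2 (the PRF based on HMAC-SHA256 from RFC 5246). This is a sketch under assumptions: the RSA encryption of the premaster secret is left out, the random values are generated locally, and only the 48-byte master secret is derived, from which a real implementation would go on to expand the actual encryption keys.

```python
import hashlib
import hmac
import os


def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """Expand secret and seed into length bytes (P_SHA256 from RFC 5246)."""
    output = b""
    a = seed  # A(0)
    while len(output) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()  # A(i)
        output += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return output[:length]


def master_secret(premaster: bytes, client_random: bytes, server_random: bytes) -> bytes:
    """Derive the 48-byte master secret that both sides compute independently."""
    seed = b"master secret" + client_random + server_random
    return p_sha256(premaster, seed, 48)


client_random = os.urandom(32)
server_random = os.urandom(32)
premaster = os.urandom(48)  # in the article this travels RSA-encrypted

client_key = master_secret(premaster, client_random, server_random)
server_key = master_secret(premaster, client_random, server_random)
print(client_key == server_key, len(client_key))  # True 48
```

An eavesdropper sees both random values in the clear but not the premaster secret, so it cannot compute the same result.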

The client’s socket is ready to send data to www.google.com. The client’s browser creates an HTTP GET message and puts the URL into it. The HTTP GET message is written to the socket and becomes the payload of a TCP segment. The TCP segment is wrapped in a datagram and sent to www.google.com. In an actual packet capture, the contents of the message cannot be read because TLS encrypts them. (See How is HTTPS different?)

Campus network 68.85.2.0/24

+-------------------+   +----------------------+    +----+    +--------+     /------\
| Client (Laptop)   |---| GET / HTTP/2         |--->| AP |--->| Switch |--->| Router | 68.85.2.1
| 68.85.2.101       |   | Host: www.google.com |    +----+    +--------+     \------/  00:22:6B:45:1F:1B
| 00:16:D3:23:68:8A |   | ...                  |                                 |
+-------------------+   +----------------------+                                 |
                                                                                 |
#################################################################################|#####
KT network 68.80.0.0/13                                                         ...
#################################################################################|#####
Google Fiber network 172.80.0.0/13                                              ...
#################################################################################|#####
Google network 172.217.0.0/19                                                    |
                                                                                 v
                                                      +----------------+     /-------\
                                                      | Web Server     |<---| Routers |
                                                      | 172.217.25.228 |     \-------/
                                                      +----------------+

The HTTP server for www.google.com reads the HTTP GET message from the TCP socket and creates an HTTP response message. It places the requested content in the body of the HTTP response message and sends it over the TCP socket.

The datagram carrying the HTTP response message heads back to the campus network and arrives at the client. The client’s browser reads the HTTP response message from the socket, extracts the HTML from the body, and renders the web page.

Campus network 68.85.2.0/24

+-------------------+     +----+     +--------+      /------\
| Client (Laptop)   |<----| AP |<----| Switch |<----| Router | 68.85.2.1
| 68.85.2.101       |     +----+     +--------+      \------/  00:22:6B:45:1F:1B
| 00:16:D3:23:68:8A |                                    ^
+-------------------+                                    |
                                                         |
#########################################################|#####
KT network 68.80.0.0/13                                 ...
#########################################################|#####
Google Fiber network 172.80.0.0/13                      ...
#########################################################|#####
Google network 172.217.0.0/19                            |
                                                         |
+----------------+   +-------------------------+     /-------\
| Web Server     |---| HTTP/2 200 OK           |--->| Routers |
| 172.217.25.228 |   | Content-Type: text/html |     \-------/
+----------------+   | ...                     |
                     | <html>...</html>        |
                     +-------------------------+
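
The request and response messages can be sketched as plain bytes. This sketch uses the text format of HTTP/1.1 because it is easy to read, whereas HTTP/2 carries the same information in binary frames; the helper names are made up, and the response is a canned stand-in rather than something fetched from Google.

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # the blank line that ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw: bytes):
    """Split a raw HTTP/1.1 response into status code, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    status_code = int(status_line.split(" ", 2)[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers, body


request = build_get_request("www.google.com")

# A canned response standing in for what the server would send back.
canned = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>...</html>"
status, headers, body = parse_response(canned)
print(status, headers["content-type"], body)
```

In the real exchange these bytes would be written to and read from a TLS-wrapped socket, so none of them would be visible in a packet capture.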

At last, Google appears on the client’s browser screen.

References