Unicode = an international standard which assigns a number to roughly 150,000 letters and symbols.

UTF = Unicode Transformation Format

IP = Internet Protocol

DNS = Domain Name System

UDP = User Datagram Protocol

HTTP = Hypertext Transfer Protocol

TLS = Transport Layer Security

TCP = Transmission Control Protocol

The Work That Makes The Web

By Logan Lembke

P-r-o-m-p-t-z-i-n-e-.-c-o-m

The letters clack back and forth while I settle into another week of work. With a single, sturdy thwack of the Enter key, an avalanche of electricity hurtles across the planet. On most days, I’d yawn. Monday mornings have never been easy for me, and using the internet is just another mundane part of life. Nonetheless, behind this bleary-eyed morning ritual is a complex story of technological innovation.

First, the web browser has to take the link and stash it into the device’s memory.

This memory could be a mix of capacitors, transistors, and magnetic devices, but each kind stores data as sequences of on-and-off signals known as bits. As a result, the web browser has to use an encoding scheme to store the link as a sequence of bits.

The most common encoding schemes for mapping Unicode numbers to sequences of bits are UTF-16 and UTF-8.

UTF-16 is commonly used by Microsoft Windows.

UTF-8 is used in most other places.

Importantly, UTF-8 is backwards compatible with ASCII, an older encoding scheme first published in 1963.
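To make the encoding step concrete, here is a minimal Python sketch that turns the link into bits under both schemes. The helper name `to_bits` is my own, and the little-endian variant of UTF-16 is an assumption chosen for illustration; a real browser's internals will differ.

```python
# The domain typed at the start of the article; any text would work.
link = "promptzine.com"


def to_bits(data: bytes) -> str:
    """Render bytes as space-separated groups of eight bits."""
    return " ".join(f"{byte:08b}" for byte in data)


utf8 = link.encode("utf-8")
utf16 = link.encode("utf-16-le")  # little-endian, without a byte-order mark
ascii_bytes = link.encode("ascii")

print(to_bits(utf8[:2]))       # the bits for "pr" in UTF-8
print(len(utf8), len(utf16))   # UTF-16 needs two bytes per character here
print(utf8 == ascii_bytes)     # ASCII-only text is identical in UTF-8
```

Because every character in the link is plain ASCII, the UTF-8 bytes match the ASCII bytes exactly, which is the backwards compatibility mentioned above.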

Once the link has been stored into memory, the web browser needs to find the IP address of the machine responsible for hosting the webpage.

To accomplish this, the web browser queries the DNS via a resolver.

A DNS resolver is a server which takes in DNS requests, contacts other DNS servers, and responds back to the requests with the results it has found.

In order to contact the DNS resolver, the operating system needs to sandwich the domain name with a new set of bits to form a DNS request.

For example, the 16-bit field called “query type” is set to 0000000000000001 (the number 1) when searching for a website’s IPv4 address.

The format for DNS requests in use today was laid out in 1987 in Request for Comments (RFC) 1035.
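As a rough illustration of that sandwich, the sketch below builds a DNS request by hand in Python, following the layout in RFC 1035. The function name `build_dns_query` and the request ID `0x1234` are placeholders of mine; a real resolver library picks a random ID and handles many more cases.

```python
import struct


def build_dns_query(domain: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS request asking for a domain's IPv4 address."""
    # Header: ID, flags (0x0100 = "recursion desired"), one question,
    # and zero answer, authority, and additional records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)

    # The domain name is stored label by label, each prefixed with its
    # length, and closed off with a single zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in domain.split(".")
    ) + b"\x00"

    # Query type 1 (IPv4 address) and query class 1 (Internet),
    # each packed as a 16-bit number.
    trailer = struct.pack("!HH", 1, 1)

    return header + qname + trailer


query = build_dns_query("promptzine.com")
print(query.hex(" "))
```

The domain name ends up in the middle, with the header in front of it and the query type and class behind it.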

Now, the DNS request needs to be packaged up for delivery.

A UDP header is added to the front of the newly formed DNS request.

This header includes the port numbers (which help each machine route the data to the correct applications) and a field which lets the receiver know if the data has been transmitted correctly.
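A UDP header is only eight bytes long, so it is easy to sketch in Python. The function name `build_udp_header` and the source port 50000 are illustrative assumptions; in practice the operating system picks the source port and fills in the checksum. A checksum of zero is used here, which in IPv4 means "no checksum computed."

```python
import struct


def build_udp_header(src_port: int, dst_port: int, payload: bytes,
                     checksum: int = 0) -> bytes:
    """Pack the four 16-bit UDP header fields in network byte order."""
    length = 8 + len(payload)  # the header itself plus the data it carries
    return struct.pack("!HHHH", src_port, dst_port, length, checksum)


# A stand-in for a 32-byte DNS request; port 53 is the standard DNS port.
dns_request = bytes(32)
udp_header = build_udp_header(50000, 53, dns_request)
datagram = udp_header + dns_request
print(udp_header.hex(" "))
```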

Similarly, an IP header is tacked on to the front of the UDP header.

This new header contains the IP addresses of the DNS resolver and the device making the request. While the IP header contains a handful of other details, one important field is the “differentiated services” field. This field helps machines prioritize certain types of traffic (like video conferencing) over others.
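The IPv4 header can be sketched the same way. In this Python example the two addresses come from ranges reserved for documentation, and the function names and default values are my own assumptions rather than anything a particular operating system does. The identification and fragmentation fields are simply left at zero.

```python
import socket
import struct

IPV4_FORMAT = "!BBHHHBBH4s4s"  # 20 bytes, no options


def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, as used by the IPv4 header."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold any overflow back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_ipv4_header(src: str, dst: str, payload_len: int,
                      dscp: int = 0, ttl: int = 64) -> bytes:
    """Build a 20-byte IPv4 header for a UDP payload."""
    fields = [
        (4 << 4) | 5,            # version 4, header length of five 32-bit words
        dscp << 2,               # differentiated services in the upper six bits
        20 + payload_len,        # total length of header plus payload
        0,                       # identification (unused in this sketch)
        0,                       # flags and fragment offset
        ttl,                     # hops allowed before the packet is dropped
        socket.IPPROTO_UDP,      # protocol number 17
        0,                       # checksum placeholder
        socket.inet_aton(src),   # sender's address
        socket.inet_aton(dst),   # resolver's address
    ]
    fields[7] = internet_checksum(struct.pack(IPV4_FORMAT, *fields))
    return struct.pack(IPV4_FORMAT, *fields)


ip_header = build_ipv4_header("192.0.2.10", "198.51.100.53",
                              payload_len=40, dscp=46)
print(ip_header.hex(" "))
```

Setting `dscp=46` marks the packet with the value commonly used for latency-sensitive traffic; a DNS request would normally leave it at zero.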

The UDP header was first defined in 1979 in Internet Experiment Note (IEN) 88.

The layout for the IPv4 header can first be found in IEN 54 from 1978, but it wasn’t formalized until 1981 with RFC 791.

If the system is connected via a gigabit Ethernet cable, the IP packet must be wrapped up inside an Ethernet frame.

The Ethernet frame adds its own fields, including source and destination addresses, signals to help each receiver understand the rate at which the data is being transmitted, and optional tags which help segregate and prioritize communications.
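Here is a small Python sketch of the addressing part of an Ethernet frame, including the optional tag. The hardware addresses are made-up, locally administered values, and the function names are hypothetical. The timing signal at the start of the frame and the error-checking value at the end are normally added by the network card, so they are left out.

```python
import struct


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into six raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_ethernet_header(dst_mac: str, src_mac: str, ethertype: int = 0x0800,
                          vlan_id=None, priority: int = 0) -> bytes:
    """Build an Ethernet header, optionally with an 802.1Q tag."""
    header = mac_to_bytes(dst_mac) + mac_to_bytes(src_mac)
    if vlan_id is not None:
        # The tag holds a 3-bit priority and a 12-bit network identifier.
        tag_control = (priority << 13) | (vlan_id & 0x0FFF)
        header += struct.pack("!HH", 0x8100, tag_control)
    # 0x0800 tells the receiver that an IPv4 packet follows.
    return header + struct.pack("!H", ethertype)


plain = build_ethernet_header("02:00:00:00:00:02", "02:00:00:00:00:01")
tagged = build_ethernet_header("02:00:00:00:00:02", "02:00:00:00:00:01",
                               vlan_id=10, priority=5)
print(plain.hex(" "))
print(tagged.hex(" "))
```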

The most popular format for Ethernet frames dates back to 1982 in a joint publication from DEC, Intel, and Xerox.

The data is now ready to be sent.

Gigabit Ethernet works by sending eight bits of data at a time. Each set of eight bits is mapped to a set of four voltages, one per pair of wires. In turn, each of these voltages is set to take on one of five predetermined levels.

However, which set of voltages corresponds to which set of bit patterns changes over time.

While this makes determining what is happening on the wire more difficult, it helps prevent radio interference and aids error correction.
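A little arithmetic shows why four voltages with five levels each are enough to carry eight bits. The Python below only counts possibilities; it does not attempt the actual 802.3ab mapping, which is considerably more involved.

```python
LEVELS = 5          # voltage levels available on each wire pair
PAIRS = 4           # wire pairs in the cable
BITS_PER_GROUP = 8  # bits sent at a time

voltage_sets = LEVELS ** PAIRS        # distinct sets of four voltages
bit_patterns = 2 ** BITS_PER_GROUP    # distinct eight-bit patterns
spare_sets = voltage_sets - bit_patterns

# One gigabit per second, eight bits at a time.
groups_per_second = 1_000_000_000 // BITS_PER_GROUP

print(voltage_sets, bit_patterns, spare_sets)
print(groups_per_second)
```

With 625 voltage sets available and only 256 bit patterns to send, there is room to spare, which is part of what lets the mapping change over time and carry extra information for catching errors.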

In 1999, this method of sending data was standardized as 802.3ab by IEEE.

Next, the browser waits to hear back from the DNS resolver.

Once it obtains an IP address for the machine serving the webpage, it will create an HTTP request for the webpage and wrap it up in yet another set of protocols.

This time around, instead of UDP, the request will be wrapped up with TLS and TCP.
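The layering can be sketched with Python's standard `socket` and `ssl` modules: a TCP connection is opened first, TLS is wrapped around it, and the HTTP request travels inside. This is a simplified HTTP/1.1 example with function names of my own choosing; a modern browser negotiates newer HTTP versions and sends many more headers.

```python
import socket
import ssl


def build_http_request(host: str, path: str = "/") -> bytes:
    """Build a bare-bones HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # a blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(host: str, path: str = "/", port: int = 443) -> bytes:
    """Send the request over TLS over TCP and return the raw response."""
    context = ssl.create_default_context()
    # TCP comes first: a reliable connection to the server.
    with socket.create_connection((host, port), timeout=10) as tcp_sock:
        # TLS wraps the TCP connection to encrypt what is sent over it.
        with context.wrap_socket(tcp_sock, server_hostname=host) as tls_sock:
            tls_sock.sendall(build_http_request(host, path))
            chunks = []
            while True:
                chunk = tls_sock.recv(4096)
                if not chunk:
                    break
                chunks.append(chunk)
    return b"".join(chunks)


print(build_http_request("promptzine.com").decode("ascii"))
```

Calling `fetch("promptzine.com")` would perform the real exchange over the network; it is left uncalled here so the example runs offline.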

Finally, after transmitting the web request and waiting to hear back, the browser will begin displaying the page.

In our day-to-day lives, the complexity and technologies that make the web possible are all too easy to overlook. Hopefully, the next time you yawn and hit Enter to start your day, as I do, you’ll take a moment to admire the systems at work behind the scenes.
