HTTP/1.1 vs HTTP/2

Anshuman Pattnaik
12 min read · Sep 18, 2022

Introduction

HTTP, or Hypertext Transfer Protocol, is an application-layer protocol for transmitting hypermedia documents such as graphics, audio, video, plain text and hyperlinks on the World Wide Web. To obtain information from a target web server, the client makes a request and keeps the connection open until it receives a response. HTTP is a stateless protocol, which means the server does not retain information between two requests.

HTTP/1.1, standardized in 1997 in RFC 2068, remained the dominant version of the protocol until HTTP/2 was introduced in 2015 by the IETF (Internet Engineering Task Force). HTTP/2 was designed to decrease latency, increase speed and enable complete request and response multiplexing, which means that when the client opens a connection to the target web server, it can send multiple requests and receive multiple responses over a single line of communication.

Now, let’s dive deep into these two protocols to understand the main differences between HTTP/1.1 and HTTP/2 and the technical changes HTTP/2 has brought to the modern World Wide Web.

What is HTTP/1.1?

In 1989, Tim Berners-Lee developed a communication protocol for the World Wide Web called the Hypertext Transfer Protocol, which exchanges data between a client and a remote web server. When a client visits a website in a browser, for example www.google.com, the browser sends a request to the remote web server and waits until the server sends a response; in this exchange, the server answers with an HTML page.

To send a particular request, the client uses methods such as GET or POST, which are expressed in an HTTP message and processed by the remote web server.

GET /index.html HTTP/1.1
Host: www.google.com

The above HTTP request is an example of the GET method: it asks the host www.google.com for the resource /index.html, and the web server returns an HTML page that references further resources such as CSS, JavaScript, images, etc. These contents are not all returned to the client in a single request. To render the complete HTML page in the web browser, the client sends multiple additional requests to fetch all the resources from the target web server; once the web browser has received them all, it displays the complete web page on the screen.
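
To make this concrete, here is a minimal sketch in Python that sends the same GET request by hand over a raw TCP socket; it shows that an HTTP/1.1 exchange is nothing more than plain text over TCP (the Connection: close header is added here only to keep the example simple):

import socket

with socket.create_connection(("www.google.com", 80)) as sock:
    sock.sendall(b"GET /index.html HTTP/1.1\r\n"
                 b"Host: www.google.com\r\n"
                 b"Connection: close\r\n\r\n")
    response = b""
    while chunk := sock.recv(4096):
        response += chunk

# The first line of the response is the status line, e.g. b'HTTP/1.1 200 OK'
print(response.split(b"\r\n")[0])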

What is HTTP/2?

HTTP/2 is a major revision of HTTP/1.1, derived from the SPDY protocol, which Google originally developed to improve web page loading performance by reducing latency through techniques such as complete request and response multiplexing, efficient compression of HTTP header fields, request prioritization and server push.

HTTP/2 was developed by the HTTP Working Group (also called httpbis) of the Internet Engineering Task Force (IETF). The Working Group first submitted the HTTP/2 specification to the Internet Engineering Steering Group (IESG) in December 2014, and the IESG approved it on February 17, 2015. The specification for HTTP/2 was published as RFC 7540 on May 14, 2015. The protocol is supported by all major browsers, including Chrome, Firefox, Safari and Opera.

The architecture behind HTTP/2 is the binary framing layer, which sends all requests and responses in binary format while maintaining HTTP semantics such as methods and headers. When the client makes a request using a high-level API, it still constructs the messages in the traditional HTTP format, but it is essential to understand that all these messages are converted into binary at a lower level. By transforming messages from plain text to binary, HTTP/2 can deliver them faster and more efficiently.

By now, we understand the essential difference between these two protocols: both use the same HTTP semantics, such as headers, methods and verbs, but HTTP/1.1 sends messages as plain text, while HTTP/2 transforms them into binary. In the next sections, we will briefly discuss how HTTP/1.1 sends messages, what challenges this creates, and how HTTP/2 resolves them with the binary framing layer technique.

Head Of Line blocking in HTTP/1.1

In computer networking, head-of-line blocking (HOL blocking) is a performance-limiting phenomenon that occurs when a queue of packets is held up by the first packet in line. For example, when the client or browser has restrictions on how it may access a server, a new request must wait for the previous request to complete before it can be processed.

In HTTP/1.1, HOL blocking occurs when a client opens a TCP connection and sends multiple requests over the same connection without waiting for responses. HTTP/1.1 creates persistent connections by default, without requiring the Connection: keep-alive header, which allows multiple requests to be sent over the same TCP connection using the pipelining technique. Pipelining improves page load times, but it also creates a HOL blocking problem: responses must be returned in the order the requests were sent, so a single slow response delays every request and response queued behind it.
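
The one-at-a-time nature of an HTTP/1.1 connection is easy to observe even without pipelining. Below is a minimal sketch using Python's standard library (the host and paths are hypothetical): each request on the keep-alive connection must wait until the previous response has been read in full, so one slow response stalls everything behind it.

import http.client

conn = http.client.HTTPConnection("example.com")
for path in ("/", "/style.css", "/app.js"):  # hypothetical resources
    conn.request("GET", path)
    resp = conn.getresponse()
    body = resp.read()  # the connection is blocked until this completes
    print(path, resp.status, len(body))
conn.close()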

HTTP/2 fixes the HOL blocking issue with its binary framing layer, which improves connection efficiency and allows multiple streams over a single connection, increasing the flexibility of transmitting information to the desired destination. In the next section, we will learn more about the binary framing layer technique and its advantages.

HTTP/2 — Binary Framing Layer

The binary framing layer converts requests and responses into binary and breaks them into smaller chunks to form a bidirectional communication stream. HTTP/2 establishes a single TCP connection between the client and the server, and over that connection multiple streams of data are transferred between the two machines.

Each stream carries multiple messages in request/response form, which are divided into smaller units called frames. HTTP/2 can open multiple streams concurrently within a single connection and interleave frames from different streams on that connection. A stream is a bidirectional flow of frames between two machines, and streams can be shared between the client and the server.

The architecture behind HTTP/2 consists of a group of binary-encoded frames within a single communication channel, each tagged with a particular stream. Each frame is the smallest unit of communication and carries a specific type of data, for example HTTP headers or message payloads. Frames from different streams are interleaved and then reassembled via the stream identifier embedded in each frame's header. The interleaved requests and responses run in parallel without blocking the messages behind them, ensuring that no message has to wait for another to finish; this methodology is called multiplexing.
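
For a concrete picture of this framing, every HTTP/2 frame begins with a fixed 9-byte header (RFC 7540, Section 4.1) carrying the payload length, the frame type, flags and the stream identifier used to reassemble interleaved frames. A minimal Python sketch for decoding that header:

import struct

def parse_frame_header(header: bytes):
    hi, lo, frame_type, flags, stream_id = struct.unpack(">BHBBI", header[:9])
    length = (hi << 16) | lo       # 24-bit payload length
    stream_id &= 0x7FFFFFFF        # clear the reserved high bit
    return length, frame_type, flags, stream_id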

The multiplexing technique allows servers and clients to send concurrent requests and responses within a single TCP connection, giving more efficient connection management; it also reduces latency, improves network and bandwidth utilization and lowers operational cost throughout the network. To secure communication throughout the channel, the client and the server can reuse the same secured session for multiple requests and responses, because during the TLS or SSL handshake both machines agree on the keys used throughout the session; if the session ends, new keys are negotiated for further communication. This also improves the performance of the HTTPS protocol.
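
To see multiplexing from the client side, here is a minimal sketch using the third-party httpx library (installed with pip install 'httpx[http2]'); with http2=True, concurrent requests to the same origin are carried as separate streams over a single connection. The URL is just a placeholder:

import asyncio
import httpx

async def main():
    async with httpx.AsyncClient(http2=True) as client:
        urls = ["https://example.com/"] * 3  # hypothetical targets
        responses = await asyncio.gather(*(client.get(u) for u in urls))
        for r in responses:
            # http_version reports "HTTP/2" when the server negotiates it
            print(r.http_version, r.status_code)

asyncio.run(main())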

As a final thought, multiplexing with a binary framing layer improves the performance of TCP communication and solves several issues of the HTTP/1.1 protocol. But streams within a single channel still compete for the same resources, which can cause performance issues. In the next section, we will learn how stream prioritization optimizes application performance and manages resources.

HTTP/2 — Stream Prioritization

Stream prioritization is a technique that allows the relative weight of requests to be customized to optimize application performance and to resolve contention when requests compete for the same resource. Prioritization is worth understanding well, so that you can take full advantage of this HTTP/2 feature.

The binary framing layer organizes requests and responses into parallel data streams. To prioritize the responses to concurrent requests, each stream is assigned a weight between 1 and 256; the higher the number, the higher the priority. The client also uses an identifier to specify each stream's dependency on another stream, which makes it easy to tell which stream a given stream depends on. When no parent stream ID is specified, the stream is considered dependent on the implicit root stream.

The client constructs and communicates a prioritization tree by combining stream dependencies and their weights, showing how it would prefer to receive the responses. The server uses this information to prioritize stream processing and to control resource utilization, such as the allocation of CPU, memory and other resources. When response data is ready to send, the server allocates bandwidth so that high-priority responses are delivered to the client first. Streams that share the same parent should be allocated resources in proportion to their weights.

Let’s see a few examples to get hands-on with stream allocation.

Examples

The following examples are adapted from Ilya Grigorik's High Performance Browser Networking (O'Reilly), in the section on HTTP/2 stream prioritization.

If stream A has a weight of 12 and its sibling stream B has a weight of 4, then to determine the proportion of resources each stream should receive:

Sum all the weights: 4 + 12 = 16

Divide each stream weight by the total weight: A = 12/16, B = 4/16

Thus, stream A should receive three-quarters and stream B should receive one-quarter of available resources; stream B should receive one-third of the resources allocated to stream A. Let’s work through a few more hands-on examples:

Neither stream A nor stream B specifies a parent dependency; both are said to be dependent on the implicit “root stream”. A has a weight of 12, and B has a weight of 4. Thus, based on proportional weights: stream B should receive one-third of the resources allocated to stream A.

D depends on the root stream; C depends on D. Thus, D should receive full allocation of resources ahead of C. The weights are inconsequential because C’s dependency communicates a stronger preference.

D should receive full allocation of resources ahead of C; C should receive full allocation of resources ahead of A and B; stream B should receive one-third of the resources allocated to stream A.

D should receive full allocation of resources ahead of E and C; E and C should receive equal allocation ahead of A and B; A and B should receive a proportional allocation based on their weights.

As the above examples by Ilya Grigorik show, the combination of stream dependencies and weights is crucial for improving browsing performance in today's modern web applications.
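
To make the weight arithmetic from these examples concrete, here is a minimal Python sketch that computes the share each sibling stream should receive under a common parent:

def shares(weights):
    # Siblings split their parent's resources in proportion to their weights
    total = sum(weights.values())
    return {stream: weight / total for stream, weight in weights.items()}

print(shares({"A": 12, "B": 4}))  # {'A': 0.75, 'B': 0.25}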

HTTP/2 also allows the client to change these preferences at any moment: the browser can reallocate weights and update dependencies in response to user interaction, further enhancing the browsing experience.

Resource Inlining vs Server Push

In every web application, when a client sends a GET request to the target web server, the first response contains only the website's index page. To render the complete web page, the browser still needs to fetch additional resources such as CSS, JavaScript and other media files. So the client makes further requests to retrieve the other resources, which increases latency and page load time and consumes more bandwidth.

This section discusses the different methodologies HTTP/1.1 and HTTP/2 follow to improve web page loading time and reduce bandwidth consumption.

HTTP/1.1 — Resource Inlining

In HTTP/1.1, resource inlining is a technique that includes the required resources within the HTML document, so that the server can send them all in the response to the initial request. For example, if rendering the complete web page requires additional resources beyond index.html, inlining those resources in the index file reduces the total number of network calls. This technique is only feasible for text-based resources. However, inlining a large file into an HTML document can significantly increase the document's size and slow its transfer, delaying the response to the client.
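
As an illustration, here is a minimal sketch of resource inlining in Python; style.css and the output file name are hypothetical:

from pathlib import Path

css = Path("style.css").read_text()  # hypothetical local stylesheet
html = f"<html><head><style>{css}</style></head><body>...</body></html>"
Path("index_inlined.html").write_text(html)  # one response now carries both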

The major drawback of resource inlining is that the client cannot separate the inlined resources from the HTML document or cache them independently. And if every web page on the site inlines the same resource, each HTML document grows in size and page loading slows down across the site.

The resource inlining technique is therefore not optimal for decreasing web page loading time or increasing connection speed. In the next section, we will learn how HTTP/2 uses the server push methodology to optimize web page delivery.

HTTP/2 — Server Push

As we know, HTTP/2 can send multiple concurrent responses to a client's request. The target web server can send additional resources along with the HTML document before the client requests them; this process is called server push. With this technique, HTTP/2 maintains the separation between pushed resources and the HTML document, which resolves the drawback of resource inlining: the client can cache pushed resources independently or decline them altogether.

The technique works as follows: the server sends a PUSH_PROMISE frame to notify the client that it is going to push a resource. The frame includes only the headers of the message, so the client knows which resource the server will push before the data is sent. If the client has already cached the resource, it can decline the push by responding with an RST_STREAM frame. Because the client knows ahead of time which resources the server will send, it avoids issuing duplicate requests.

The client has complete control over server push: it can adjust the priority of pushed resources or even disable server push entirely, and whenever required it only needs to send a SETTINGS frame to change this HTTP/2 behaviour.
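
As a sketch of that control, the third-party h2 library (pip install h2) lets a client emit a SETTINGS frame that disables push by setting SETTINGS_ENABLE_PUSH to 0:

import h2.config
import h2.connection
from h2.settings import SettingCodes

config = h2.config.H2Configuration(client_side=True)
conn = h2.connection.H2Connection(config=config)
conn.initiate_connection()
conn.update_settings({SettingCodes.ENABLE_PUSH: 0})  # disable server push
wire_bytes = conn.data_to_send()  # bytes to write to the (TLS) socket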

Server push has many benefits, but it is still not supported consistently across web browsers, some of which disable critical pieces of the feature for the client, such as declining an already-cached resource or preventing duplicate resources. This technique should therefore be used based on the requirements of the web application.

To read more on web application optimization and server push, you can check out the PRPL pattern developed by Google.

Header Compression

Every HTTP exchange between the client and server carries HTTP headers. In HTTP/1.1 these headers are always sent as plain text, typically adding around 500–800 bytes per transfer, and if the headers contain cookies the size can grow to kilobytes. To reduce the size of HTTP messages, HTTP uses various compression techniques so that web application performance improves and messages transfer faster.

HTTP/1.1 — gzip compression

In HTTP/1.1, a program such as gzip is commonly used to compress CSS and JavaScript files and reduce the amount of data transferred. But the major problem in HTTP/1.1 is that the headers are always transferred in plain text, which increases the message size. When an API uses many features and resources, such as cookies and additional headers, the messages become heavy and latency increases.
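
A quick sketch with Python's standard library shows why body compression alone is not enough: the payload shrinks dramatically under gzip, while HTTP/1.1 header text always travels uncompressed.

import gzip

body = b"body { margin: 0; padding: 0; }\n" * 100  # hypothetical CSS payload
compressed = gzip.compress(body)
print(f"{len(body)} -> {len(compressed)} bytes")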

To solve the header overhead issue, HTTP/2 introduces a dedicated compression format, HPACK, which we will discuss in the next section.

HTTP/2 — HPACK Compression

In HTTP/2, header compression works differently: the headers are separated from the data and carried in distinct header and data frames. To compress the header frames, HTTP/2 uses the HPACK format. One of the algorithms behind HPACK is Huffman coding, which encodes the header metadata and reduces the size of the headers.

HPACK also keeps track of previously sent header metadata, so in a subsequent request only the fields that changed need to be encoded and transmitted.

Request 1
---------
:method: GET
:scheme: https
:authority: example.com
:path: /blog
accept: image/png
user-agent: Mozilla/5.0…

Request 2
---------
:method: GET
:scheme: https
:authority: example.com
:path: /blog/17885642
accept: image/png
user-agent: Mozilla/5.0…

In the above requests, only the :path header carries a different value between the two requests. So, for Request 2, HPACK only needs to encode the changed :path field; the remaining fields are referenced from the table of previously sent headers.
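
This delta behaviour can be observed with the third-party hpack package (pip install hpack), which implements an HPACK encoder and decoder; the header values below mirror the example above:

from hpack import Encoder

encoder = Encoder()
request_1 = [(":method", "GET"), (":scheme", "https"),
             (":authority", "example.com"), (":path", "/blog"),
             ("accept", "image/png"), ("user-agent", "Mozilla/5.0")]
request_2 = [(name, "/blog/17885642" if name == ":path" else value)
             for name, value in request_1]

block_1 = encoder.encode(request_1)
block_2 = encoder.encode(request_2)
# The second block is far smaller: unchanged fields are referenced
# from the encoder's dynamic table instead of being re-sent
print(len(block_1), len(block_2))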

To learn more about HPACK, you can refer to RFC 7541.

Conclusions

There are many significant differences between HTTP/1.1 and HTTP/2 in the features and techniques they use to improve web application performance. As we have seen, the two protocols take entirely different approaches to enhancing the web. HTTP/2 changes the web's architecture with methods such as multiplexing, stream prioritization, flow control, server push and header compression, taking web applications to the next level.

I hope you enjoyed reading this article and that it gave you an insight into the differences between HTTP/1.1 and HTTP/2. If you found it helpful, feel free to share it with your friends. If you think of anything I've missed, let me know with a comment below.
