Over the past two months attackers have been abusing a feature of the HTTP/2 web communication protocol that makes web application servers, load balancers, and web proxies vulnerable to distributed denial-of-service (DDoS) attacks of unprecedented scale. Google, AWS, Cloudflare, and other major cloud infrastructure providers, as well as web server vendors, had been working on mitigation strategies and patches in private groups until the weakness was disclosed today.
The newly dubbed HTTP/2 Rapid Reset DDoS attacks take advantage of the stream multiplexing capability of the HTTP/2 protocol, which allows multiple HTTP requests to be sent in parallel over the same TCP connection, and in particular the ability of clients to unilaterally reset those streams. The issue is tracked as CVE-2023-44487, and organizations should check whether their web server and load balancer providers have patches available or mitigation recommendations.
Stream multiplexing makes DDoS attacks more efficient
In the older HTTP/1.1 protocol, which is still supported by most servers and web clients, multiple requests can be sent over a single TCP connection, but they are sent serially and the server processes and responds to them in the order they were received.
In HTTP/2, requests are sent as streams made up of frames such as HEADERS or DATA, and multiple streams can be sent over the same TCP connection concurrently and out of order. Because each stream has an ID associated with it, the server always knows which stream a frame is part of and how to respond. This is known as stream multiplexing and allows for more efficient use of TCP connections, speeding up page load times.
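To illustrate what multiplexing looks like from a client’s point of view, here is a minimal Go sketch using the standard net/http client with the golang.org/x/net/http2 transport: the three requests are fired off concurrently, yet they all travel over a single TCP connection as separate streams. The host example.com and the resource paths are placeholders, not details from the reported attacks.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"sync"

	"golang.org/x/net/http2"
)

func main() {
	// An HTTP/2-capable transport: requests to the same host reuse one TCP
	// connection, with each request carried on its own multiplexed stream.
	client := &http.Client{Transport: &http2.Transport{}}

	paths := []string{"/", "/app.js", "/logo.png"} // placeholder resources
	var wg sync.WaitGroup
	for _, p := range paths {
		wg.Add(1)
		go func(path string) {
			defer wg.Done()
			resp, err := client.Get("https://example.com" + path)
			if err != nil {
				fmt.Println(path, "error:", err)
				return
			}
			defer resp.Body.Close()
			io.Copy(io.Discard, resp.Body) // drain the stream's DATA frames
			fmt.Println(path, resp.Proto, resp.Status)
		}(p)
	}
	wg.Wait()
}
```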
Imagine a modern web page that has a multitude of resources, third-party scripts, and images loaded from different locations. A browser accessing such a page over HTTP/2 will immediately start loading those resources in parallel, prioritizing those that are in the user’s view. If the user immediately clicks on a button and navigates away from the page, the browser can cancel those streams, even if the resources haven’t fully loaded or rendered, without closing the entire connection, and open new streams for the next page’s requests.
“Since late 2021, the majority of Layer 7 DDoS attacks we’ve observed across Google first-party services and Google Cloud projects protected by Cloud Armor have been based on HTTP/2, both by number of attacks and by peak request rates,” Google engineers said in a blog post explaining the new attack. “A primary design goal of HTTP/2 was efficiency, and unfortunately the features that make HTTP/2 more efficient for legitimate clients can also be used to make DDoS attacks more efficient.”
Bypassing concurrent stream limits with Rapid Resets
Since a server needs to consume CPU cycles and memory to process each frame and stream, the possibility of abusing concurrent streams to exhaust a server’s resources, and therefore cause a denial-of-service condition, has been obvious to the protocol’s developers from the start. That’s why they added a setting called SETTINGS_MAX_CONCURRENT_STREAMS, which the server communicates to clients at the start of a connection via a SETTINGS frame.
By default the value of this setting is unlimited, but the protocol designers recommend that it shouldn’t be lower than 100 to maintain efficient parallelism. Because of this, in practice, many clients don’t wait for the SETTINGS frame and simply assume a limit of 100, opening up to 100 streams from the start.
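As a concrete example of the server side of this exchange, the sketch below configures Go’s net/http server through golang.org/x/net/http2 so that its initial SETTINGS frame advertises a SETTINGS_MAX_CONCURRENT_STREAMS value of 100. The handler, listening address, and certificate file names are placeholders.

```go
package main

import (
	"log"
	"net/http"

	"golang.org/x/net/http2"
)

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("hello over " + r.Proto + "\n"))
	})

	srv := &http.Server{Addr: ":8443", Handler: mux}

	// MaxConcurrentStreams becomes the SETTINGS_MAX_CONCURRENT_STREAMS value
	// the server advertises in its initial SETTINGS frame; the HTTP/2 spec
	// recommends a value of at least 100.
	if err := http2.ConfigureServer(srv, &http2.Server{MaxConcurrentStreams: 100}); err != nil {
		log.Fatal(err)
	}

	// cert.pem and key.pem are placeholders for a real certificate and key.
	log.Fatal(srv.ListenAndServeTLS("cert.pem", "key.pem"))
}
```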
The issue comes with another feature called RST_STREAM, which stands for “reset stream.” This is a type of frame that a client can send to a server to indicate that a previously opened stream should be canceled. It allows the client to cancel in-flight requests for resources that are no longer needed, for example because the user clicked away from the page before a resource loaded. It is useful because it tells the server to stop responding to a previous request and not waste bandwidth.
However, there’s a catch. Once a client sends an RST_STREAM frame, the targeted stream no longer counts toward the maximum concurrent streams limit, so the client can immediately open a new stream after resetting a previous one. This means that even with a concurrent stream limit of 100, a client can open and reset hundreds of streams over the same TCP connection in quick succession.
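At the frame level, the pattern looks roughly like the Go sketch below, which uses the golang.org/x/net/http2 framer to pair every HEADERS frame with an immediate RST_STREAM carrying the CANCEL error code. The frames are written to an in-memory buffer purely to show the sequence; the host name, path, and iteration count are placeholders.

```go
package main

import (
	"bytes"
	"fmt"

	"golang.org/x/net/http2"
	"golang.org/x/net/http2/hpack"
)

func main() {
	var wire bytes.Buffer // stands in for the TCP connection in this sketch
	framer := http2.NewFramer(&wire, nil)

	var headerBlock bytes.Buffer
	encoder := hpack.NewEncoder(&headerBlock)

	streamID := uint32(1)
	for i := 0; i < 1000; i++ {
		headerBlock.Reset()
		encoder.WriteField(hpack.HeaderField{Name: ":method", Value: "GET"})
		encoder.WriteField(hpack.HeaderField{Name: ":scheme", Value: "https"})
		encoder.WriteField(hpack.HeaderField{Name: ":authority", Value: "example.com"})
		encoder.WriteField(hpack.HeaderField{Name: ":path", Value: "/"})

		// Open a stream...
		framer.WriteHeaders(http2.HeadersFrameParam{
			StreamID:      streamID,
			BlockFragment: headerBlock.Bytes(),
			EndHeaders:    true,
			EndStream:     true, // GET request with no body
		})
		// ...and cancel it right away. The stream no longer counts toward
		// SETTINGS_MAX_CONCURRENT_STREAMS, so the next one can open immediately,
		// while the server still pays the cost of setting up and tearing down state.
		framer.WriteRSTStream(streamID, http2.ErrCodeCancel)

		streamID += 2 // client-initiated stream IDs stay odd and increase
	}
	fmt.Printf("encoded 1000 request/cancel pairs in %d bytes\n", wire.Len())
}
```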
The server still needs to spend resources to process RST_STREAM frames. Even if it’s not much, with millions of requests it quickly adds up. Using this technique, attackers have managed to launch DDoS attacks of unprecedented scale against servers hosted by Google, Cloudflare, and AWS.
“When an HTTP/2 server is able to process client-sent RST_STREAM frames and tear down state quickly enough, such rapid resets do not cause a problem,” the Cloudflare engineers said in their report. “Where issues start to crop up is when there is any kind of delay or lag in tidying up. The client can churn through so many requests that a backlog of work accumulates, resulting in excess consumption of resources on the server.”
The largest HTTP/2 Rapid Reset attack seen by Google peaked at over 398 million requests per second (rps). By comparison, the biggest attack seen by the company in 2022 peaked at 46 million rps. The attack that hit Cloudflare in August peaked at 201 million rps, three times bigger than the largest DDoS attack the company had previously detected. That attack was launched from a botnet of only 22,000 computers, which is small compared to other botnets.
Multiple HTTP/2 DDoS attack variations
The attacks using the new HTTP/2 technique continue, and Google has seen multiple variants, some of which are probably in response to mitigations. For example, one attack variant opened and reset streams in batches, waiting before sending the RST_STREAM frames and then opening another batch. This is likely meant to defeat mitigations that rely on detecting high numbers of RST_STREAM frames over the same TCP connection and closing the connection as a response.
“These attacks lose the main advantage of the canceling attacks by not maximizing connection utilization, but still have some implementation efficiencies over standard HTTP/2 DDoS attacks,” the Google engineers said. “But this variant does mean that any mitigation based on rate-limiting stream cancellations should set fairly strict limits to be effective.”
Another variation doesn’t use RST_STREAM cancellations at all and instead tries to open as many concurrent streams as possible, ignoring the limit advertised by the server. The HTTP/2 standard says that in this case the streams over the limit should be invalidated by the server, but the full TCP connection should not be canceled. So this attack variation allows attackers to keep the request pipeline full at all times.
“We don’t expect that simply blocking individual requests is a viable mitigation against this class of attacks — instead the entire TCP connection needs to be closed when abuse is detected,” the Google engineers said.
Mitigations and patches for HTTP/2 DDoS attacks
The mitigation strategies against these attacks are not simple because there are legitimate uses for RST_STREAM cancellations, so each server owner needs to decide when abuse is taking place and how harsh the response should be, based on connection statistics and business logic. For example, if a TCP connection has more than 100 requests and the client cancels over 50% of them, the connection could be viewed as abusive. Responses could range from sending forceful GOAWAY frames to closing the TCP connection immediately.
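As a sketch of how such a heuristic might be expressed, assuming a server that exposes per-connection request and cancellation counters, the decision logic could look something like the Go snippet below. The 100-request and 50% thresholds mirror the example above and are illustrative, not a vendor recommendation.

```go
package main

import "fmt"

// connStats holds per-connection counters. How they get fed from the HTTP/2
// layer depends on the server implementation; this sketch only shows the
// decision logic.
type connStats struct {
	requests  int // streams the client has opened on this connection
	cancelled int // streams the client has reset with RST_STREAM
}

type action int

const (
	allow      action = iota // keep serving the connection normally
	sendGoAway               // send GOAWAY and stop accepting new streams
	closeConn                // drop the TCP connection immediately
)

func (s connStats) classify() action {
	if s.requests <= 100 {
		return allow // too little traffic to judge
	}
	ratio := float64(s.cancelled) / float64(s.requests)
	switch {
	case ratio > 0.8:
		return closeConn
	case ratio > 0.5:
		return sendGoAway
	default:
		return allow
	}
}

func main() {
	// Example: 400 streams opened, 300 of them reset by the client.
	s := connStats{requests: 400, cancelled: 300}
	names := []string{"allow", "send GOAWAY", "close connection"}
	fmt.Println("verdict:", names[s.classify()])
}
```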
Another response could be to temporarily block an offending IP address from accessing the service over HTTP/2 and relegate it to HTTP/1.x only. The problem with IP filters is that multiple clients can share the same IP address and not all of them might be malicious. By limiting requests to HTTP/1.x, the non-malicious clients behind a filtered IP will still be able to access the web service, even if they experience a performance downgrade.
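One way to implement that kind of per-IP downgrade, assuming a Go server that terminates its own TLS, is simply to stop offering the h2 protocol during ALPN negotiation for flagged addresses, as in the sketch below. The isFlagged function, the certificate files, and the listening address are placeholders.

```go
package main

import (
	"crypto/tls"
	"log"
	"net"
	"net/http"
)

// isFlagged stands in for whatever per-IP tracking decides that an address
// has been abusing stream cancellations; it is a placeholder, not a detector.
func isFlagged(ip string) bool { return false }

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("served over " + r.Proto + "\n"))
	})

	// cert.pem and key.pem are placeholders for a real certificate and key.
	cert, err := tls.LoadX509KeyPair("cert.pem", "key.pem")
	if err != nil {
		log.Fatal(err)
	}

	h1Only := &tls.Config{
		Certificates: []tls.Certificate{cert},
		NextProtos:   []string{"http/1.1"}, // no "h2": ALPN can only pick HTTP/1.1
	}

	tlsCfg := &tls.Config{
		Certificates: []tls.Certificate{cert},
		NextProtos:   []string{"h2", "http/1.1"}, // HTTP/2 offered by default
		GetConfigForClient: func(hello *tls.ClientHelloInfo) (*tls.Config, error) {
			ip, _, err := net.SplitHostPort(hello.Conn.RemoteAddr().String())
			if err == nil && isFlagged(ip) {
				return h1Only, nil // downgrade flagged addresses to HTTP/1.1
			}
			return nil, nil // nil keeps the default config with HTTP/2 enabled
		},
	}

	srv := &http.Server{Addr: ":8443", Handler: mux, TLSConfig: tlsCfg}
	log.Fatal(srv.ListenAndServeTLS("", ""))
}
```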
Developers of Nginx, a popular reverse proxy and load balancer, also provided mitigations that rely on features the server already implements, such as keepalive_requests, limit_conn, and limit_req. They also plan to release a patch over the coming days that will further limit the impact of such attacks.
Microsoft, AWS, F5, and other infrastructure companies, as well as developers of web server and load-balancing software, have posted mitigations or patches. Users can follow the official CVE-2023-44487 entry in the CVE tracker for links to updated vendor responses.