Web Application Firewalls (WAFs) are widely deployed classification engines designed to filter malicious HTTP traffic, providing a scalable layer of automated threat management against common web vulnerabilities such as SQL injection and Cross-Site Scripting (XSS).
However, their efficacy is fundamentally constrained by their reliance on syntactic pattern matching and the assumption of consistent request parsing across the web stack.
Over the past two decades, attackers have systematically exploited parsing discrepancies between intermediaries — such as reverse proxies, WAFs, and backend application frameworks — to evade detection.
This paper traces the evolution of these attacks from the early 2000s, when Linhart et al. first documented HTTP Request Smuggling (HRS), to James Kettle’s 2020 revival of the technique, to the recent WAFFLED study, which reveals how content-level parsing inconsistencies in multipart/form-data, application/json, and application/xml enable stealthy bypasses.
We then examine Bruno Mendes’ autonomous pentest, which weaponizes HTTP Parameter Pollution (HPP) to bypass even strict WAFs. We demonstrate a clear lineage: from request smuggling, to semantic smuggling, to structure-aware evasion — each exploiting deeper layers of parsing divergence. Looking ahead, we argue that AI-augmented, grammar-driven fuzzing will accelerate the discovery of such discrepancies, making parsing fidelity a critical security property. In response, we reaffirm strict RFC compliance and request normalization — exemplified by tools like HTTP-Normalizer — as the only sustainable defense.
This work unifies decades of offensive security research under a single principle:
Differential parsing is a first-class attack surface.
Web Application Firewalls (WAFs) have become ubiquitous in modern web infrastructure, acting as gatekeepers between clients and applications by filtering malicious traffic based on predefined rules. Despite their widespread adoption, WAFs are not immune to evasion. Traditional bypass techniques — such as payload obfuscation, encoding, or character insertion — have long been known and increasingly exploited [Pickles, 2025].
The real threat, however, has never been obfuscation — it has been differential parsing: the exploitation of inconsistencies in how different components in the request chain interpret HTTP syntax.
This attack surface is not new. As early as 2005, Linhart et al. documented HTTP Request Smuggling (HRS), demonstrating how discrepancies in Content-Length and Transfer-Encoding parsing could allow request smuggling past intermediaries [Linhart et al., 2005]. Just four years later, Stefano di Paola and Luca Carettoni introduced HTTP Parameter Pollution (HPP) at OWASP EU09, showing how duplicate parameters could be used to bypass input validation by exploiting inconsistent framework behavior [di Paola & Carettoni, 2009].
For years, these techniques remained niche — limited by the speed and creativity of human researchers. But in 2020, James Kettle reignited the field with his seminal work on HTTP Desync Attacks [Kettle, 2020]. Kettle did not invent HRS — but he reintroduced it, systematized it, and demonstrated its real-world impact against modern cloud architectures.
This paper traces that evolution:
We conclude by reaffirming HTTP-Normalizer as the only viable defense: enforce strict RFC compliance.
While Linhart et al. (2005) first documented HTTP Request Smuggling, their work remained largely theoretical and underappreciated in the broader security community. For over a decade, HRS was treated as a curiosity - an edge case of HTTP parsing, not a practical attack vector.
That changed in 2020, when James Kettle published HTTP Desync Attacks: Request Smuggling Reborn [Kettle, 2020]. Kettle’s work was not a discovery - it was a reintroduction and weaponization of HRS for the modern web.
Kettle’s key insight was that modern web stacks - cloud providers, CDNs, reverse proxies - had reintroduced the very parsing ambiguities that made HRS possible. He demonstrated this across:
His methodology was rigorous:
Kettle didn’t just show that HRS was possible - he showed it was widespread, exploitable, and devastating.
Kettle’s paper did more than expose vulnerabilities - it reframed WAF security. He showed that:
This insight laid the foundation for all subsequent work on parsing discrepancies, including WAFFLED and HPP.
As Kettle himself noted:
“The problem isn’t that people are bad at parsing HTTP - it’s that they’re inconsistent.”
The history of WAF evasion is not a series of isolated tricks - it is the unfolding of a single vulnerability class: differential parsing. At its core, every successful WAF bypass exploits a semantic gap between how a WAF interprets an HTTP request and how the backend framework or interpreter processes it.
This paper unifies three generations of attacks under this principle:
Each attack shifts focus deeper into the request structure, but all rely on the same mechanism: the WAF sees a benign request; the backend sees a malicious one.
| Layer | Attack Type | Parsing Discrepancy | Example | WAF Interpretation | Web Server Interpretation |
|---|---|---|---|---|---|
| Message Framing | HRS (Linhart/Kettle) | Content-Length vs Transfer-Encoding | Chunked smuggling | Sees one benign request | Sees two requests, the second one smuggled |
| Content Structure | WAFFLED | Boundary continuation (boundary*0=real-; boundary*1=boundary) | Multipart smuggling | Uses boundary=fake | Concatenates boundary*0 + boundary*1 → real-boundary |
| Parameter Semantics | HPP (Mendes) | Duplicate parameter handling | q=1'&q=alert(1)&q='2 | Scans each q in isolation | Concatenates → 1',alert(1),'2 → valid JS |
| Header Processing | HPACK Smuggling (Future) | HTTP/2 header compression | Compressed :path | Decodes to /admin | Decompresses to /etc/passwd |
| Cache Key Normalization | Cache Poisoning | Case sensitivity in query params | ?q=test vs ?Q=test | Treats as same key | Treats as different endpoints |
This table reveals a spectrum of parsing divergence, from syntactic framing to semantic execution. The attacker’s goal is always the same: to make the WAF parse less than the backend.
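The message-framing row can be made concrete with a small sketch. It applies two simplified framing models to the same bytes: one trusts Content-Length, the other trusts Transfer-Encoding: chunked. The host name, the smuggled path, and both parser functions are hypothetical and far simpler than any production HTTP parser; the sketch only shows where the two views diverge.

```python
"""Two framing models applied to the same bytes (a CL.TE-style ambiguity)."""

# Hypothetical smuggled prefix; host and path are illustrative only.
SMUGGLED = b"GET /admin HTTP/1.1\r\nX-Ignore: x"
BODY = b"0\r\n\r\n" + SMUGGLED
RAW = (
    b"POST / HTTP/1.1\r\n"
    b"Host: victim.example\r\n"
    b"Content-Length: " + str(len(BODY)).encode("ascii") + b"\r\n"
    b"Transfer-Encoding: chunked\r\n"
    b"\r\n"
) + BODY


def split_head(raw: bytes):
    """Return (lower-cased header dict, bytes following the blank line)."""
    head, _, rest = raw.partition(b"\r\n\r\n")
    headers = {}
    for line in head.decode("ascii").split("\r\n")[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers, rest


def frame_by_content_length(raw: bytes):
    """Parser that trusts Content-Length. Returns (body, leftover)."""
    headers, rest = split_head(raw)
    length = int(headers["content-length"])
    return rest[:length], rest[length:]


def frame_by_chunked(raw: bytes):
    """Parser that trusts Transfer-Encoding: chunked. Returns (body, leftover)."""
    _, rest = split_head(raw)
    body = b""
    while True:
        size_line, _, rest = rest.partition(b"\r\n")
        size = int(size_line.split(b";")[0], 16)
        if size == 0:
            # Consume the CRLF that ends the (empty) trailer section.
            _, _, rest = rest.partition(b"\r\n")
            return body, rest
        body += rest[:size]
        rest = rest[size + 2:]  # skip chunk data plus its trailing CRLF


cl_body, cl_left = frame_by_content_length(RAW)
te_body, te_left = frame_by_chunked(RAW)
print("Content-Length view:", cl_body, "| leftover:", cl_left)
print("Chunked view:       ", te_body, "| leftover:", te_left)
```

The Content-Length parser consumes everything as one body and has nothing left over, while the chunked parser stops at the zero-length chunk and leaves the smuggled bytes on the connection, where they would be read as the start of the next request.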
All three attacks - HRS, WAFFLED, HPP - are forms of semantic smuggling:
The evasion comes not from hiding the payload, but from structuring the request so that the WAF and backend disagree on what the payload is.
In a recent pentest, Bruno Mendes [Mendes, 2025] demonstrated how HPP can bypass even strict WAF configurations on an ASP.NET application:
GET /search?q=1'&q=alert(1)&q='2 HTTP/1.1
userInput = '1',alert(1),'2';
Like WAFFLED, this is not obfuscation — it’s protocol-level smuggling, where the attack is assembled only at the application layer.
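The inspection gap in this example can be reproduced with a few lines of Python. This is a minimal sketch, not a model of any real WAF: the `SIGNATURE` regex, `waf_per_value`, and `aspnet_join` are hypothetical names, and the only behavior assumed from the text is that the backend joins duplicate values with commas.

```python
import re
from urllib.parse import parse_qsl

# Toy signature: a call such as alert(1) sitting between two quoted strings.
# Real WAF rules are far richer; this only illustrates the inspection gap.
SIGNATURE = re.compile(r"'\s*,\s*\w+\(.*\)\s*,\s*'")


def waf_per_value(query: str) -> bool:
    """Return True if any single parameter value matches the signature."""
    pairs = parse_qsl(query, keep_blank_values=True)
    return any(SIGNATURE.search(value) for _, value in pairs)


def aspnet_join(query: str, name: str) -> str:
    """Emulate comma-joining of duplicate values for one parameter name."""
    pairs = parse_qsl(query, keep_blank_values=True)
    return ",".join(value for key, value in pairs if key == name)


query = "q=1'&q=alert(1)&q='2"
joined = aspnet_join(query, "q")

print("per-value inspection flags request:", waf_per_value(query))
print("value seen by the application:     ", joined)
print("joined value matches signature:    ", bool(SIGNATURE.search(joined)))
```

No single fragment matches the signature, yet the joined value does, which is the same disagreement the request above relies on.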
Historical Note: While this paper focuses on a recently published variant, HTTP Parameter Pollution (HPP) was first introduced as an inspection evasion method by Stefano di Paola and Luca Carettoni at the OWASP EU09 conference in 2009 [di Paola & Carettoni, 2009]. Their work laid the foundation for understanding how inconsistent parameter handling could be exploited, a principle now weaponized in modern WAF bypasses.
While HRS targeted message framing, WAFFLED [Jabiyev et al., 2024] shifts focus to content parsing - how WAFs and web frameworks interpret structured payloads like JSON, XML, and multipart forms.
WAFFLED observes that:
Crucially, the payload remains unmodified. The evasion comes not from obfuscation, but from manipulating the structure around it.
WAFFLED uses a grammar-based fuzzer (T-Reqs) to generate syntactically valid but semantically tricky requests. It mutates:
It then tests whether:
POST / HTTP/1.1
Host: victim.com
Content-Type: multipart/form-data; boundary=fake;boundary*0=real-;boundary*1=boundary
--fake
Content-Disposition: form-data; name="field1"
value1
--fake--
--real-boundary
Content-Disposition: form-data; name="xss"
<script>alert(1)</script>
--real-boundary--
This is not payload obfuscation - it’s parser confusion.
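A short sketch shows how the two boundary interpretations lead to different parts being inspected. Both resolver functions are hypothetical simplifications: `boundary_naive` reads only the plain `boundary` parameter, while `boundary_rfc2231` prefers the RFC 2231 continuation segments, which is the behavior the example above attributes to the backend.

```python
from typing import List, Optional, Tuple

CONTENT_TYPE = "multipart/form-data; boundary=fake;boundary*0=real-;boundary*1=boundary"
BODY = (
    "--fake\r\n"
    'Content-Disposition: form-data; name="field1"\r\n\r\n'
    "value1\r\n"
    "--fake--\r\n"
    "--real-boundary\r\n"
    'Content-Disposition: form-data; name="xss"\r\n\r\n'
    "<script>alert(1)</script>\r\n"
    "--real-boundary--\r\n"
)


def parse_params(content_type: str) -> List[Tuple[str, str]]:
    """Split a Content-Type value into (name, value) parameter pairs."""
    params = []
    for part in content_type.split(";")[1:]:
        name, _, value = part.strip().partition("=")
        if name:
            params.append((name.lower(), value.strip('"')))
    return params


def boundary_naive(content_type: str) -> Optional[str]:
    """Parser that only understands the plain `boundary` parameter."""
    for name, value in parse_params(content_type):
        if name == "boundary":
            return value
    return None


def boundary_rfc2231(content_type: str) -> Optional[str]:
    """Parser that prefers continuations (boundary*0, boundary*1, ...)."""
    pieces = {}
    for name, value in parse_params(content_type):
        if name.startswith("boundary*") and name[9:].isdigit():
            pieces[int(name[9:])] = value
    if pieces:
        return "".join(pieces[i] for i in sorted(pieces))
    return boundary_naive(content_type)


def part_bodies(body: str, boundary: str) -> List[str]:
    """Return the payload of each part delimited by `--boundary`."""
    bodies = []
    for chunk in body.split("--" + boundary)[1:]:
        if chunk.startswith("--"):  # closing delimiter
            break
        _, _, payload = chunk.partition("\r\n\r\n")
        bodies.append(payload.strip())
    return bodies


waf_parts = part_bodies(BODY, boundary_naive(CONTENT_TYPE))
backend_parts = part_bodies(BODY, boundary_rfc2231(CONTENT_TYPE))
print("WAF inspects:    ", waf_parts)
print("Backend receives:", backend_parts)
```

The payload bytes are identical in both views; only the choice of delimiter decides whether the script tag is ever inspected.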
The discovery of 1,207 bypasses in WAFFLED and the widespread failure on HPP suggest a universal WAF failure. But a closer look reveals a more nuanced truth: not all WAFs are equally vulnerable, and architecture determines resilience.
While many WAFs fail due to shallow, inline parsing, others resist these attacks by adopting a layered, protocol-aware parsing model.
The vulnerability often lies not in the WAF itself, but in the parsing behavior of the preceding reverse proxy or load balancer.
In most production architectures, the request chain is:
Client → [Reverse Proxy] → [WAF] → [Application]
The reverse proxy (e.g., NGINX, AWS ALB, Azure Application Gateway) terminates TLS, reassembles chunked requests, and normalizes headers and paths before forwarding the request to the WAF.
This creates a three-layer parsing chain, where:
If the proxy parses differently than the backend, the WAF inherits the gap - even if it parses correctly.
| Layer | Proxy Behaviour | WAF Consequence |
|---|---|---|
| Message Framing | Misreads Transfer-Encoding, fails to reassemble chunks | WAF sees a clean request; HRS bypass succeeds |
| Content Structure | Skips multipart/form-data parsing, treats body as opaque | WAF scans flat string; misses smuggled XSS |
| Parameter Semantics | Passes duplicate parameters unmodified | WAF sees fragments; backend concatenates (ASP.NET) → HPP bypass |
This explains why AWS WAF, Azure WAF, and NGINX App Protect — despite robust rule engines — are vulnerable to WAFFLED and HPP:
The proxy (ALB, Application Gateway, NGINX) does not normalize the request in a way that exposes the ambiguity to the WAF.
F5’s Advanced WAF (AWAF) exemplifies a different architectural philosophy: deep, multi-layered, protocol-aware parsing.
Rather than a single inspection pass, AWAF uses the Traffic Management Microkernel (TMM) to:
This layered approach allows AWAF to understand the structure of the request - not just scan it - making it inherently more resilient to content-level parsing discrepancies like those exploited in WAFFLED.
AWAF treats the HTTP request as a structured object, breaking it down into discrete, analyzable components:
As a result, WAFFLED-style attacks — where the WAF misparses the boundary and misses the payload — are unlikely to succeed. AWAF sees the same boundary the backend does, and inspects the same payload.
“Canonicalization is not trivial — it’s a critical part of security logic.”
— Peter Scheffler, F5 [Scheffler, 2025]
When it comes to HTTP Parameter Pollution (HPP), F5’s architecture separates security enforcement from semantic emulation - a distinction that is both technically sound and operationally critical.
F5 AWAF allows administrators to enforce a strict security policy on duplicate parameters via the WAF policy configuration:
This is the recommended default for most applications, as it:
As Peter Scheffler notes:
“If your WAF is making assumptions about how the app interprets paths — you’re flying blind.”
Blocking duplicate parameters is not a parsing decision — it’s a security decision. And in the absence of a known, required backend behavior, rejecting ambiguity is the only safe choice.
In cases where an application explicitly requires duplicate parameters (e.g., legacy ASP.NET apps), blocking is not an option. Here, F5’s architecture enables a second, advanced layer: semantic emulation via iRules.
Unlike the WAF policy — which operates on isolated parameters — an iRule can pre-process the request to mimic the backend’s parsing behavior before inspection.
This is not “parsing” — it’s emulation. And it’s where F5’s flexibility shines.
when HTTP_REQUEST {
    # Parameters whose duplicate values the backend joins with commas (ASP.NET behavior)
    set target_params [list "q" "search" "input"]

    # Variables live for the whole connection, so clear per-request state first
    unset -nocomplain merged seen

    # Split the query into raw name/value pairs on the first "=" of each pair
    set pairs [list]
    foreach pair [split [HTTP::query] "&"] {
        if { $pair eq "" } { continue }
        set idx [string first "=" $pair]
        if { $idx < 0 } {
            lappend pairs $pair ""
        } else {
            lappend pairs [string range $pair 0 [expr {$idx - 1}]] [string range $pair [expr {$idx + 1}] end]
        }
    }

    # First pass: collect every decoded value of each target parameter
    foreach {name value} $pairs {
        set decoded_name [URI::decode $name]
        if { [lsearch -exact $target_params $decoded_name] != -1 } {
            lappend merged($decoded_name) [URI::decode $value]
        }
    }

    # Second pass: rebuild the query, emitting each target parameter only once
    set new_parts [list]
    foreach {name value} $pairs {
        set decoded_name [URI::decode $name]
        if { [lsearch -exact $target_params $decoded_name] == -1 } {
            # Pass through other params unchanged
            lappend new_parts "$name=$value"
        } elseif { ![info exists seen($decoded_name)] } {
            set seen($decoded_name) 1
            # Concatenate with commas (ASP.NET behavior) and re-encode the result
            set combined_value [URI::encode [join $merged($decoded_name) ","]]
            lappend new_parts "$name=$combined_value"
        }
    }
    set new_query [join $new_parts "&"]

    # HTTP::query only returns the query string, so rewrite the full URI instead
    if { $new_query ne [HTTP::query] } {
        HTTP::uri "[HTTP::path]?$new_query"
        log local0. "HPP: Normalized query to '$new_query' (ASP.NET emulation)"
    }
}
This ensures that XSS rules inspect the final, concatenated value, not fragmented pieces.
Note: This iRule should be applied only to specific applications, not globally.
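Because iRules are awkward to test outside a BIG-IP, the same normalization logic can be prototyped offline first. The Python sketch below is an assumed equivalent, not F5 code: `emulate_aspnet_query` and `TARGET_PARAMS` are hypothetical names, and the comma-joining rule is the ASP.NET behavior described above.

```python
from urllib.parse import quote, unquote_plus

# Same illustrative parameter list as the iRule; adjust per application.
TARGET_PARAMS = frozenset({"q", "search", "input"})


def emulate_aspnet_query(query: str, targets=TARGET_PARAMS) -> str:
    """Collapse duplicate target parameters into one comma-joined value."""
    pairs = []
    for raw in query.split("&"):
        if raw:
            name, _, value = raw.partition("=")
            pairs.append((raw, unquote_plus(name), value))

    # First pass: collect every decoded value of each target parameter.
    merged = {}
    for _, key, value in pairs:
        if key in targets:
            merged.setdefault(key, []).append(unquote_plus(value))

    # Second pass: emit each target once, pass everything else through as-is.
    out, seen = [], set()
    for raw, key, _ in pairs:
        if key not in targets:
            out.append(raw)
        elif key not in seen:
            seen.add(key)
            joined = ",".join(merged[key])
            out.append(f"{quote(key, safe='')}={quote(joined, safe='')}")
    return "&".join(out)


normalized = emulate_aspnet_query("q=1'&lang=en&q=alert(1)&q='2")
print(normalized)
```

Running a corpus of known HPP payloads through a prototype like this, and checking that the normalized value triggers the expected signatures, is one way to validate the rule before it is attached to a virtual server.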
Even F5’s robust architecture cannot guarantee 100% immunity. Why?
Thus, while F5 AWAF represents a best-in-class defense, it remains a classification device — and all classifiers can be fooled by adversarial input.
But unlike many WAFs, F5 gives defenders the tools to close the gap:
As Peter Scheffler (F5) emphasizes:
“Canonicalization is not trivial — it’s a critical part of security logic.”
And in the age of AI-driven evasion, only a platform that supports both policy and emulation can hope to keep pace.
The evolution of WAF evasion — from HRS to WAFFLED to HPP — reveals a consistent pattern: the attacker wins by exploiting a semantic gap between how the WAF thinks the request will be parsed and how the backend actually parses it.
Traditional defenses have responded with stricter syntax enforcement (e.g., HTTP-Normalizer) and deeper parsing layers (e.g., F5 AWAF). These are essential, but they are still reactive and static — they rely on known RFCs, fixed parser logic, and manual rule updates.
The next frontier is not just deep parsing, but intelligent parsing — a new class of AI-augmented semantic parsers that go beyond syntax to understand, predict, and emulate how a backend will interpret a request.
Even the most robust WAFs today operate within a closed-world assumption: they parse based on:
This works well for known discrepancies, but fails when:
As Bruno Mendes’ hackbot demonstrated, heuristic-driven automation can bypass even ML-based WAFs in seconds by exploiting subtle parsing differences that static rules cannot anticipate.
Thus, the future of WAFs must move beyond pattern matching and syntax normalization to semantic reasoning.
An AI-augmented semantic parser is not a replacement for RFC compliance, but a cognitive layer that enhances it. It combines:
Its goal: to simulate not just what the backend will parse, but how and why.
| Capability | Description |
|---|---|
| Framework Behavior Modeling | Trained on open-source parser code (e.g., HttpUtility.ParseQueryString in ASP.NET), API documentation, and real-world traffic to predict how duplicate parameters, boundary continuations, or JSON field wrappers will be handled. |
| Cross-Format Generalization | Learns analogies across content types (e.g., boundary confusion in multipart ↔ delimiter confusion in JSON ↔ entity termination in XML). |
| Differential Parsing Prediction | Given a request, predicts whether the WAF and backend will interpret it differently - flagging high-risk requests before they reach the application. |
| Adaptive Normalization | Instead of applying one-size-fits-all rules, it normalizes based on the target framework (e.g., if the backend is ASP.NET, it simulates comma concatenation of q parameters). |
| Autonomous Fuzzing & Defense Co-Training | Uses reinforcement learning to generate adversarial requests, then uses the results to improve its own parsing logic - creating a self-hardening system. |
The breakthrough is not the attack — it’s the automation.
Large Language Models (LLMs), trained on:
…can now generate novel, RFC-adjacent bypasses with minimal prompting.
Prompt:
"Generate a multipart/form-data request that uses boundary*0 and boundary*1 continuation to define a real boundary that the WAF ignores but the backend concatenates. Use null bytes and linefeed removal to increase stealth. Target: Laravel backend, Cloudflare WAF."
AI Output:
POST /upload HTTP/1.1
Host: victim.com
Content-Type: multipart/form-data; boundary=fake\x00;boundary*0=real-;boundary*1=boundary
--fake
Content-Disposition: form-data; name="file"
innocent
--fake--
--real-boundary
Content-Disposition: form-data; name="xss"
<script>alert(1)</script>
--real-boundary--
Why it works:
Prompt:
"Create a JSON-based XSS bypass using HTTP Parameter Pollution. The backend is ASP.NET and reflects user input in a JavaScript context. Use duplicate data parameters to split the payload and avoid WAF detection. Include string concatenation and new Function() for obfuscation."
AI Output:
GET /api/search HTTP/1.1
Host: victim.com
Content-Type: application/json
{"data":"1'"}&{"data":"asd=window"}&{"data":"def='al'+'ert'"}&{"data":"asd[def](1)"}
When reflected:
userInput = '1',asd=window,def='al'+'ert',asd[def](1);
→ Executes alert(1).
Why it works:
Prompt:
"Design a hybrid attack that combines HTTP Request Smuggling, HTTP Parameter Pollution, and multipart boundary confusion. The goal is to smuggle a request that uses HPP to bypass a WAF and deliver an XSS via a multipart body with a crafted boundary. Target: AWS WAF with OWASP CRS 3.0, backend: Spring Boot."
AI Output (simplified):
POST / HTTP/1.1
Host: victim.com
Content-Length: 150
Transfer-Encoding: chunked
0
GET /search?q=1'&q=alert(1)&q='2 HTTP/1.1
Content-Type: multipart/form-data; boundary=fake;boundary*0=real-;boundary*1=boundary
--real-boundary
Content-Disposition: form-data; name="xss"
<script>alert(1)</script>
--real-boundary--
Why it works:
Bruno Mendes’ autonomous pentest engine demonstrated that AI-driven systems can bypass even "unbeatable" WAFs in seconds. His hackbot:
This is not hypothetical - it is the present.
And the prompt is the new exploit.
The principle of differential parsing extends beyond HTTP/1.1. As web protocols evolve, so do the parsing attack surfaces.
HTTP/2 uses HPACK compression to reduce header size. A WAF that decompresses headers differently than the backend could allow smuggling.
gRPC uses Protocol Buffers, which allow field name-to-number mapping. A WAF that doesn’t decode protobufs may miss malicious fields.
GraphQL allows complex nested queries. A WAF that parses only the outer structure may miss malicious arguments embedded in deeply nested fields.
CDNs and reverse proxies often normalize cache keys (e.g., lowercase query params). If the WAF and CDN disagree, an attacker could poison the cache.
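The cache-key case can be sketched in a few lines. Both functions are hypothetical: `cdn_cache_key` stands in for a cache that lowercases and sorts parameter names, and `backend_params` for an application that treats names as case-sensitive. No specific CDN or framework is being modeled.

```python
from urllib.parse import parse_qsl, urlencode


def cdn_cache_key(path: str, query: str) -> str:
    """Hypothetical CDN: lowercases parameter names and sorts them."""
    params = sorted((name.lower(), value) for name, value in parse_qsl(query))
    return f"{path}?{urlencode(params)}"


def backend_params(query: str) -> dict:
    """Hypothetical backend: parameter names are case-sensitive."""
    return dict(parse_qsl(query))


key_lower = cdn_cache_key("/search", "q=test")
key_upper = cdn_cache_key("/search", "Q=test")
print("same cache key:", key_lower == key_upper)
print("backend sees q for ?q=test:", backend_params("q=test").get("q"))
print("backend sees q for ?Q=test:", backend_params("Q=test").get("q"))
```

The two requests share one cache entry even though the backend produces different responses for them, so whichever response is cached first is served for both.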
Given these realities, we propose Parsing Fidelity as the key metric for evaluating WAFs:
Parsing Fidelity = The degree to which a WAF’s parsing model matches the backend framework’s behavior.
A WAF with high parsing fidelity:
This aligns with HTTP-Normalizer, but pushes it further: the WAF should not just normalize — it should emulate.
If the attack is differential parsing, the defense must be uniform parsing.
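One way to turn parsing fidelity into a number is to replay a corpus of requests through a model of the WAF's parser and a model of the backend's parser, then measure how often they agree. The sketch below assumes that definition (agreement rate over a corpus), which is an illustration rather than a metric taken from the cited studies. The two parser models and the corpus are hypothetical.

```python
from typing import Callable, Iterable, Optional
from urllib.parse import parse_qsl

Parser = Callable[[str], Optional[str]]


def parsing_fidelity(waf_view: Parser, backend_view: Parser,
                     corpus: Iterable[str]) -> float:
    """Fraction of corpus entries on which both parsers agree."""
    samples = list(corpus)
    if not samples:
        raise ValueError("corpus must not be empty")
    agree = sum(1 for s in samples if waf_view(s) == backend_view(s))
    return agree / len(samples)


def waf_last_wins(query: str) -> Optional[str]:
    """Hypothetical WAF model: the last value of q wins."""
    return dict(parse_qsl(query)).get("q")


def backend_comma_join(query: str) -> Optional[str]:
    """Hypothetical backend model: duplicate q values are comma-joined."""
    values = [v for k, v in parse_qsl(query) if k == "q"]
    return ",".join(values) if values else None


corpus = ["q=a", "q=a&q=b", "q=1'&q=alert(1)&q='2", "x=1"]
score = parsing_fidelity(waf_last_wins, backend_comma_join, corpus)
print(f"parsing fidelity: {score:.2f}")
```

On this toy corpus the two models agree only on requests without duplicate parameters. A score below 1.0 identifies exactly the inputs where an inspection gap exists.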
The WAFFLED study [Jabiyev et al., 2024] proposes HTTP-Normalizer, a proxy that:
This is not a new idea - it’s a return to first principles: if security depends on consistent parsing, then the only safe parser is one that adheres strictly to standards.
But HTTP-Normalizer is not just theory. It has real-world analogs — most notably, F5’s Advanced WAF (AWAF).
F5 AWAF, with its Traffic Management Microkernel (TMM) and deep, multi-layered parsing model, embodies the HTTP-Normalizer philosophy in practice:
In this sense, F5 AWAF is not just a WAF — it is a production-ready HTTP-Normalizer, enforcing parsing fidelity at scale.
HTTP-Normalizer operates as a pre-WAF proxy or integrated parsing layer that:
This ensures that:
As Peter Scheffler (F5) notes:
“Canonicalization is not trivial - it’s a critical part of security logic.”
And in a world of AI-driven evasion, only strict normalization can prevent exploitation.
All 1,207 WAFFLED bypasses rely on non-compliant or edge-case parsing — requests that are syntactically valid enough to pass through a lenient WAF, but structured to confuse its parser.
By enforcing strict RFC adherence, HTTP-Normalizer:
The result: all known WAFFLED-style bypasses are prevented — not because the WAF has better rules, but because the attack surface is removed before inspection.
Key Insight: The cost of parsing flexibility is security debt.
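A strict front-end check of this kind can be sketched as a function that lists the reasons a request should be refused. The rule set below is an assumption chosen to match the examples in this paper (conflicting framing headers, parameter continuations in Content-Type, NUL bytes); it is not the actual rule set of HTTP-Normalizer, and `strict_violations` is a hypothetical name.

```python
from typing import List, Tuple


def strict_violations(headers: List[Tuple[str, str]]) -> List[str]:
    """Return the reasons a strict gateway would reject these headers."""
    problems = []
    names = [name.lower() for name, _ in headers]

    if "content-length" in names and "transfer-encoding" in names:
        problems.append("both Content-Length and Transfer-Encoding present")

    for single in ("content-length", "content-type", "host"):
        if names.count(single) > 1:
            problems.append(f"duplicate {single} header")

    for name, value in headers:
        if "\x00" in value:
            problems.append(f"NUL byte in {name.lower()} value")
        if name.lower() == "content-type":
            params = [p.strip().partition("=")[0].lower()
                      for p in value.split(";")[1:] if p.strip()]
            if any("*" in p for p in params):
                problems.append("parameter continuation in Content-Type")
            if len(params) != len(set(params)):
                problems.append("repeated Content-Type parameter")
    return problems


smuggling_attempt = [
    ("Host", "victim.com"),
    ("Content-Type",
     "multipart/form-data; boundary=fake;boundary*0=real-;boundary*1=boundary"),
]
clean = [("Host", "victim.com"),
         ("Content-Type", "multipart/form-data; boundary=abc123")]

print(strict_violations(smuggling_attempt))
print(strict_violations(clean))
```

Rejecting on ambiguity rather than trying to guess the intended meaning keeps the check small, at the cost of refusing a few legitimate but unusual requests.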
HTTP-Normalizer does not address HPP by default, but it can be extended to do so.
RFC 3986 does not define how duplicate parameters should be handled. Different frameworks interpret them differently:
This means there is no single "correct" parsing behavior at the HTTP level — it’s framework-dependent.
However, HTTP-Normalizer (or a system like F5 AWAF) can be enhanced with a Parameter Normalization Policy:
| Policy | Description | Use case |
|---|---|---|
| Reject duplicates | Block any request with repeated parameters | Default, secure posture |
| Use first/last | Normalize to first or last occurrence | Legacy apps with array-like behavior |
| Simulate framework behavior | Concatenate with commas (ASP.NET), or build arrays (Node.js) | High-risk apps requiring deep emulation |
This would force uniformity between the WAF and the backend, closing the HPP gap.
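The policies in the table map naturally onto a small dispatch function. The sketch below is illustrative only: `DuplicatePolicy`, `DuplicateParameterError`, and `normalize_params` are hypothetical names, and the comma-join branch assumes the ASP.NET behavior described earlier.

```python
from enum import Enum
from typing import Dict
from urllib.parse import parse_qsl


class DuplicatePolicy(Enum):
    REJECT = "reject"          # block any request with repeated parameters
    FIRST = "first"            # keep the first occurrence
    LAST = "last"              # keep the last occurrence
    COMMA_JOIN = "comma_join"  # simulate ASP.NET-style concatenation


class DuplicateParameterError(ValueError):
    """Raised when a repeated parameter is seen under the REJECT policy."""


def normalize_params(query: str, policy: DuplicatePolicy) -> Dict[str, str]:
    """Reduce a query string to one value per name according to `policy`."""
    grouped: Dict[str, list] = {}
    for name, value in parse_qsl(query, keep_blank_values=True):
        grouped.setdefault(name, []).append(value)

    result = {}
    for name, values in grouped.items():
        if len(values) == 1:
            result[name] = values[0]
        elif policy is DuplicatePolicy.REJECT:
            raise DuplicateParameterError(f"repeated parameter: {name}")
        elif policy is DuplicatePolicy.FIRST:
            result[name] = values[0]
        elif policy is DuplicatePolicy.LAST:
            result[name] = values[-1]
        else:
            result[name] = ",".join(values)
    return result


query = "q=1'&q=alert(1)&q='2&lang=en"
print(normalize_params(query, DuplicatePolicy.FIRST))
print(normalize_params(query, DuplicatePolicy.COMMA_JOIN))
```

Whichever policy is chosen, the value handed to the inspection rules should be the one this function returns, so that the WAF and the backend are evaluating the same string.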
F5’s Advanced WAF is not just a WAF - it is a practical implementation of the HTTP-Normalizer concept.
It demonstrates that:
And while it doesn’t automatically emulate ASP.NET-style parameter concatenation, it provides the tools to do so:
This makes F5 AWAF a model for how HTTP-Normalizer could be deployed — not as a standalone tool, but as an integrated, enterprise-grade parsing layer.
HTTP-Normalizer — whether as a standalone tool or embodied in systems like F5 AWAF — can be deployed as:
The goal is the same: eliminate parsing ambiguity before it can be exploited.
The history of WAF evasion is a story of escalating parsing sophistication:
WAFFLED and HPP represent a pivotal moment: the realization that content and parameter parsing are as critical as message framing. They prove that thousands of real, exploitable bypasses exist not because of broken rules, but because of broken parsing.
The future will bring AI-driven fuzzing that automates the discovery of such gaps. But the defense is clear: enforce RFC compliance. Tools like HTTP-Normalizer show that all known bypasses are preventable.
As web protocols grow more complex, the only sustainable security strategy is simplicity through standardization. The era of lenient, forgiving parsers must end - before the next generation of AI-powered attacks makes it impossible.