0xbaDc0dE MEV Bot Hack Analysis


On September 27th, 2022, a smart contract MEV bot was hacked on the Ethereum blockchain, losing around 1,101 WETH, worth approximately $1.46m on the day of the exploit. Interestingly, the hack took place just 30 minutes after the MEV bot pulled off a famously profitable arbitrage that earned it 804 WETH, a profit of around $1m from a single transaction.

The MEV bot is known by the name 0xbad, which comes from the start of its blockchain address 0xbaDc0dE. These two transactions (the arbitrage and the hack) crowned a 75-day period of successful MEV transactions, “creating a beautiful display of on-chain karma”, as Rekt News would put it.

An interesting aspect of this hack is that, unlike in a typical DeFi protocol hack, the vulnerable smart contract’s source code is not verified and published on Etherscan. In other words, the high-level code used to develop the bot (in Solidity, Vyper, or any other language that compiles to EVM bytecode) is not available to us. Because of this, we’ll have to settle for on-chain investigation, namely looking into past transactions and the MEV bot’s compiled bytecode.

In this article, we will be analyzing the exploited vulnerability in the 0xbad smart contract without looking at any actual source code. We’ll analyze the massive profit arbitrage transaction and see what we can learn from there, but also try a few decompiling tools to help us reach the same conclusion as the hacker. At the end of this on-chain investigation, we will create our own version of the attack to drain all the MEV bot’s funds, testing it against a local fork. You can check the full PoC here.

This article was written by gmhacker.eth, an Immunefi Smart Contract Triager.

0xbad is a smart contract on Ethereum, which is labeled MEV Bot on Etherscan. What does that mean, though? A key feature of smart contracts is that they cannot execute by themselves; they need an external entity, an externally owned account (EOA), to start the transaction that triggers their code. So an MEV bot is some kind of off-chain software with blockchain monitoring capabilities, which under certain conditions triggers complex logic on-chain. Hence it needs a smart contract counterpart deployed on the blockchain.

Now, MEV (Maximal Extractable Value) is an entire rabbit hole. It’s the profit that can be extracted by reordering, inserting, or excluding transactions in a block. Some MEV activity is very beneficial to blockchains: arbitrage, for example, makes markets more efficient. But there are also forms of MEV that harm users, like frontrunning and sandwich attacks, among other more complex forms. In its most essential form, MEV is economic incentives being exploited throughout the blockchain dark forest.

Let’s go back to our bot. It is a smart contract on Ethereum, which is a public blockchain. What is stored publicly in the blockchain state is compiled bytecode, which is far less readable than the high-level source code it was compiled from. Usually, smart contracts in DeFi protocols have verified source code published on Etherscan. In the case of our bot, Etherscan presents us with the following image:

0xbad bytecode

The trained eye, knowledgeable about EVM bytecode, can spot various different things like the typical 0x60806040 present on Solidity-compiled bytecode. But it’s unquestionably less readable. This presents a challenge: how do we investigate the smart contract, and how did the hacker figure out there was a vulnerability in the code?

We will try using some decompilation tools, but first let’s see what the famous arbitrage transaction has to say about the MEV Bot.

We won’t dive deep into the way the bot managed to make the big profit arbitrage. Halborn gave a brief explanation on that subject. In short, a UniswapV2 user attempted to swap $1.85m worth of cUSDC for USDC, but due to a lack of liquidity, this swap brought about a huge loss, netting the user $500 in USDC. The 0xbad MEV bot went on to take advantage of this huge slippage and followed up with a complex transaction that netted around $1m in profit, as we’ve mentioned already. You can see the transaction call trace here.

Transaction call trace on samczsun’s viewer
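To see how a swap against a thin pool can return so little, here is a minimal sketch of the Uniswap V2 constant-product quote (with its 0.3% fee) in integer math. The reserves and trade size below are made-up numbers chosen only to illustrate the effect; they are not the actual reserves of the pool involved.

```python
def get_amount_out(amount_in: int, reserve_in: int, reserve_out: int) -> int:
    """Uniswap V2 style quote: constant product with a 0.3% fee, rounded down."""
    if amount_in <= 0 or reserve_in <= 0 or reserve_out <= 0:
        raise ValueError("amounts and reserves must be positive")
    amount_in_with_fee = amount_in * 997
    numerator = amount_in_with_fee * reserve_out
    denominator = reserve_in * 1000 + amount_in_with_fee
    return numerator // denominator


# Hypothetical thin pool: 100 units of the input token against 500 of the output token.
thin_reserve_in, thin_reserve_out = 100, 500

# A trade far larger than the pool can never receive more than the pool holds,
# no matter how much is paid in.
received = get_amount_out(1_850_000, thin_reserve_in, thin_reserve_out)
print(received, "<", thin_reserve_out)
```

Whatever the trader overpays stays in the pool as mispriced reserves, which is the imbalance an arbitrageur can then capture.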

One thing stands out immediately in this transaction: all of the logic that calls other protocols runs through a delegatecall to another contract. Looking at other 0xbad transactions, we can see this is a common pattern, which means the 0xbad contract itself works as a proxy to a more complex implementation contract. Unfortunately, that contract (we’ll call it 0xDd6B) is also not verified.

The transaction viewer also provides the sload operations that were done on 0xbad. As we can see, 0xDd6B’s address was loaded from storage slot 0x00. If we use Dedaub’s online contract decompiler, we get the following inferred storage layout:

0xbad’s inferred storage layout, using Dedaub’s online decompiler

We know 0xDd6B’s address is on slot 0x00, meaning we are dealing with the variable stor_0_0_19. If we analyze the decompiled code further, we see its usage:

Usage of delegatecall in 0xbad

As we expected from the transaction, the variable read from slot 0x00 was used as the delegatecall address. It is clear that 0xbad proxies an undefined function call to its implementation contract. There are also checks done against msg.sender and other variables in storage. 0xbad borrows contract logic from 0xDd6B to work on its own contract storage, so even though 0xDd6B’s logic might be more interesting to investigate, we still should look at other storage values on the contract.
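The proxy relationship described above can be pictured with a small toy model. This is only a sketch of the delegatecall idea, not a reconstruction of the bot: the class names, the `set_flag` function, and slot 5 are hypothetical, and the real contract checks more than just the owner.

```python
class Implementation:
    """Stands in for 0xDd6B: it holds logic only and has no state of its own."""

    @staticmethod
    def set_flag(storage: dict, value: int) -> None:
        # The logic writes to whatever storage it is handed (slot 5 is made up).
        storage[5] = value


class Proxy:
    """Stands in for 0xbad: it holds the storage and forwards unknown calls."""

    def __init__(self, implementation, owner: str):
        # Slot 0 holds the implementation, slot 1 the owner, as in the inferred layout.
        self.storage = {0: implementation, 1: owner}

    def fallback(self, msg_sender: str, function_name: str, *args):
        if msg_sender != self.storage[1]:
            raise PermissionError("caller not allowed")
        logic = getattr(self.storage[0], function_name)
        # delegatecall semantics: borrowed code, but the proxy's own storage.
        return logic(self.storage, *args)


proxy = Proxy(Implementation, owner="owner")
proxy.fallback("owner", "set_flag", 7)
print(proxy.storage[5])
```

The point of the model is that the state change lands in the proxy, which is why 0xbad’s storage is worth inspecting even though the interesting code lives in 0xDd6B.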

Snippet 1: Foundry test to read 0xbad storage

The above snippet is a Foundry test contract that will read 0xbad’s first 5 storage slots values, prior to the hack transaction. This is what we get:

Foundry storage reading test output

As expected, we see 0xDd6B’s address in slot 0x00. But we also see two other addresses. Slot 0x01 stores, as Dedaub’s decompiler intelligently labeled it, the contract owner: an externally owned account that is allowed to call functions on 0xbad. Slot 0x03 stores another address, which Etherscan recognizes: it’s dYdX’s SoloMargin contract. Interestingly, Dedaub’s decompiler gives that storage slot the name owner_3_0_19. Rightfully so, since it seems this address is also allowed to call functions on the MEV bot’s smart contract.
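As a side note on reading these slots: a storage slot is a 32-byte word, and an address occupies its 20 low-order bytes, which appears to be what the `_0_19` suffix in the decompiler’s variable names refers to. A minimal sketch of pulling an address out of a raw slot value follows; the slot value used here is a placeholder, not the real contents of 0xbad’s storage.

```python
def address_from_slot(word_hex: str) -> str:
    """Return the address held in the 20 low-order bytes of a 32-byte storage word."""
    stripped = word_hex[2:] if word_hex.startswith("0x") else word_hex
    word = bytes.fromhex(stripped)
    if len(word) != 32:
        raise ValueError("a storage slot is exactly 32 bytes")
    # The word is big-endian, so the low-order bytes are the last 20.
    return "0x" + word[-20:].hex()


# Placeholder slot value: 12 zero bytes of padding followed by a dummy address.
slot_word = "0x" + "00" * 12 + "11" * 20
print(address_from_slot(slot_word))
```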

It’s curious that the specific SoloMargin contract address has a special place on 0xbad’s storage. But the logic of this address having permission to call the contract should not be a surprise. An MEV bot usually leverages flash loans, so it needs to implement callbacks that those protocols will need to call. And this callback execution needs to work, so it stands to reason that 0xbad would allow the flash loan protocols it uses to execute its smart contract.

Dedaub’s decompiler doesn’t give us a lot of information on 0xDd6B’s logic. But when we use EtherVM’s online decompiler tool, we get a number of public methods on the smart contract. In particular, the tool manages to understand some of the function signatures, since they are common throughout the blockchain:

0xDd6B’s public methods, seen on EtherVM

We see a lot of function signatures but, as expected, the known ones are callbacks from DeFi protocols. That means all of those protocols are able to call our smart contract in some way, though the question remains whether there are more requirements for the transaction to proceed. Among the public methods, there’s callFunction(address,(address,uint256),bytes), which is dYdX’s SoloMargin callback function. This confirms that 0xbad is able to flash loan from dYdX. Or, from another point of view, dYdX’s SoloMargin contract can call 0xbad through its flash loan function.

To test whether there are more requirements involved, we can create a PoC where we will call SoloMargin’s flash loan function, but with 0xbad as the target address, effectively allowing us to execute 0xbad’s logic (present in 0xDd6B).
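The idea behind this test can be sketched as a toy model: if the bot only checks who is calling it directly, then anyone who can make an allowed protocol place that call inherits the protocol’s permission. The classes below are hypothetical stand-ins, and the check is deliberately simplified; it is an assumption that the real bot’s gatekeeping works this way.

```python
class ToySoloMargin:
    """Stand-in for a protocol that calls back into a contract chosen by its user."""

    address = "solo_margin"

    def operate(self, msg_sender: str, callee, data: bytes):
        # The protocol places the call itself, so the callee sees the protocol
        # as msg.sender, while the original sender is just an argument.
        return callee.call_function(self.address, msg_sender, data)


class ToyBot:
    """Stand-in for the bot: it only checks the immediate caller."""

    def __init__(self, owner: str, allowed_protocol: str):
        self.allowed = {owner, allowed_protocol}
        self.received = []

    def call_function(self, msg_sender: str, original_sender: str, data: bytes):
        if msg_sender not in self.allowed:
            raise PermissionError("caller not allowed")
        self.received.append((original_sender, data))
        return True


bot = ToyBot(owner="owner", allowed_protocol=ToySoloMargin.address)
relayed = ToySoloMargin().operate("attacker", bot, b"payload")
print(relayed, bot.received)
```

A direct call from the attacker fails the check, while the same payload relayed through the allowed protocol is accepted.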

Snippet 2: interfaces.sol implementing SoloMargin methods

The first thing we need to do is to declare the relevant SoloMargin functions in an interface. We will be doing an action of type ActionType.Call.

Snippet 3: Attacker.sol, the contract trying to execute 0xbad’s logic

The SoloMargin.operate input data needs to be properly set to allow for a call to 0xbad as if we were flash loaning to it, though no value is actually being flash loaned. Still, we need to pass our Attacker contract address as the owner of the account, otherwise dYdX will revert our transaction.
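To make the shape of that input easier to picture, here is a simplified model of the two arrays passed to operate. The enum order follows dYdX’s Solo `ActionType` as I recall it and should be checked against the dYdX source; the field selection is trimmed to what a Call action uses, and the account number 0 and the string addresses are placeholders.

```python
from dataclasses import dataclass
from enum import IntEnum


class ActionType(IntEnum):
    # Assumed order of dYdX Solo's ActionType enum; verify before relying on it.
    DEPOSIT = 0
    WITHDRAW = 1
    TRANSFER = 2
    BUY = 3
    SELL = 4
    TRADE = 5
    LIQUIDATE = 6
    VAPORIZE = 7
    CALL = 8


@dataclass(frozen=True)
class AccountInfo:
    owner: str   # must be the address calling operate
    number: int


@dataclass(frozen=True)
class ActionArgs:
    # Amount and market fields are omitted here because a Call does not move funds.
    action_type: ActionType
    account_id: int      # index into the accounts array
    other_address: str   # for a Call, the contract that receives callFunction
    data: bytes          # opaque payload handed to the callee


def build_call_operation(attacker: str, target: str, payload: bytes):
    """Build the accounts and actions arrays for a single Call action."""
    accounts = [AccountInfo(owner=attacker, number=0)]
    actions = [ActionArgs(ActionType.CALL, 0, target, payload)]
    return accounts, actions


accounts, actions = build_call_operation("0xAttacker", "0xBot", b"\x01\x02")
print(accounts, actions)
```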

Foundry test contract

Our PoC test just runs Attacker.attack against a local fork using the free public RPC aggregator provided by Ankr. We select the block number 15625423 as our fork block, 1 block before the first hack transaction. As we can see in the following image, our test didn’t work:

Foundry test output, showing the transaction being reverted

The good news is that dYdX’s SoloMargin calls 0xbad, and that call gets delegated to 0xDd6B’s callFunction callback. This confirms that SoloMargin has the right permissions to call the MEV bot. The bad news is that the transaction got reverted somewhere inside 0xDd6B’s callFunction logic. As it turns out, things aren’t as easy as we thought. Life is hard.

Still, we’re making some meaningful progress. Our next step is to investigate some of 0xbad’s past transactions where callFunction gets successfully called. Dedaub’s library allows us to filter transactions so that we only see the ones calling 0xDd6B’s callFunction. We managed to find a successful transaction doing this. 0xDd6B’s callback executes the following external functions:

1. WETH.allowance
2. An exchange function on a Curve Finance contract
3. USDT.allowance
4. The same exchange function on Curve
5. USDC.allowance
6. UniswapV2Router.swapExactTokensForTokens

Interestingly enough, if we select another transaction with the same relevant properties, callFunction executes the following functions:

1. WETH.allowance
2. A swap function with a Balancer’s Vault
3. wstETH.unwrap
4. stETH.allowance
5. An exchange function on a Curve Finance contract
6. WETH.transfer

We conclude that callFunction has various possible execution paths and is able to call different protocols depending on what is wanted. But how is the path determined? From a closer inspection of the execution of SoloMargin.operate, we see that there is encoded data in the data field of SoloMargin.ActionArgs, and that’s exactly what encodes the actions on 0xDd6B. As an example, here is a portion of the data for that first transaction calling Curve and Uniswap (from now on, we will call this transaction 0x8e56):

Some labels on the data field of a 0xbad transaction (0x8e56)

It seems 0xDd6B will have some logic that is encoded with possible hardcoded values (the initial WETH.allowance is not being fully encoded on the data field), but also some logic that is fully encoded (Curve.exchange is encoded on data, both the address, the function and the inputs). If there are no sanity checks on the possible addresses to be called and the function signatures, then one can encode the execution of any function for 0xDd6B’s logic to execute.

No other available decompiling tool was able to fully comprehend 0xDd6B’s logic, so we need to follow a small process of trial and error. If we pass the same data from 0x8e56, we get the error “UniswapV2Router: EXPIRED”. That’s because the deadline parameter on UniswapV2Router.swapExactTokensForTokens is no longer valid. So we change it to an arbitrary timestamp in the future: 2674487634 (9F697152 in hexadecimal).
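The conversion of the new deadline is easy to check. The sketch below also pads the value to a 32-byte word, which is how standard ABI encoding represents a uint256; whether the bot’s custom data layout stores the deadline that way is an assumption.

```python
from datetime import datetime, timezone

DEADLINE = 2674487634  # the future timestamp chosen in the text


def to_abi_word(value: int) -> str:
    """Left-pad an unsigned integer to a 32-byte big-endian word, as hex."""
    return value.to_bytes(32, "big").hex()


deadline_hex = format(DEADLINE, "X")
deadline_word = to_abi_word(DEADLINE)
deadline_year = datetime.fromtimestamp(DEADLINE, tz=timezone.utc).year

print(deadline_hex)
print(deadline_word)
print(deadline_year)
```

The year comes out far enough in the future that the deadline check will not fail on a fork of a 2022 block.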

If we run that once again, we get the error “Insufficient output amount”. This is from some check on 0xDd6B that is probably reverting the transaction if certain conditions are not met. Such condition values might also be encoded on the data field, but we can instead aim at changing all called addresses to our contract, and returning a value that would bypass requirements, like the values that the original 0x8e56 transaction returned.

First, let’s just replace those called addresses–Curve and UniswapV2–for our Attacker contract address. If everything works out, we expect it to revert on the first exchange call to us, since we haven’t coded that yet in our contract. But a very interesting thing happens already, that wasn’t happening before:
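One way to perform that substitution is to work on the raw payload bytes and swap each 20-byte address for our own. This is only a sketch: the addresses and the payload below are dummies, and the helper name is hypothetical. It works the same whether the address is packed or left-padded to 32 bytes, since only the 20 address bytes are touched.

```python
def replace_address(payload: bytes, old: str, new: str) -> bytes:
    """Replace every occurrence of one 20-byte address in a payload with another."""
    old_raw = bytes.fromhex(old[2:])
    new_raw = bytes.fromhex(new[2:])
    if len(old_raw) != 20 or len(new_raw) != 20:
        raise ValueError("addresses must be 20 bytes")
    if old_raw not in payload:
        raise ValueError("address not found in payload")
    # Same-length replacement keeps every other offset in the payload intact.
    return payload.replace(old_raw, new_raw)


# Dummy addresses and payload, for illustration only.
CURVE_POOL = "0x" + "aa" * 20
ATTACKER = "0x" + "cc" * 20
original = b"\x01" + bytes.fromhex("aa" * 20) + b"\x02"

patched = replace_address(original, CURVE_POOL, ATTACKER)
print(patched.hex())
```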

Test trace after replacing 0xDd6B called addresses with Attacker address

As it turns out, the MEV bot has a smart mechanism that will approve the caller for spending 0xbad’s funds if the allowance is 0! Because of that, 0xbad just approved our Attacker contract to spend all its WETH balance. Now, the only thing we need to do is implement the called functions and return values that will trick 0xbad into thinking that everything is fine with the transaction.
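The consequence of that behaviour can be shown with a toy token ledger. Everything here is a simplified model: the token class, the `bot_prepare_call` helper, and the approved amount (the maximum uint256) are assumptions made for illustration, since the article only shows that an approval large enough to cover the balance was granted.

```python
MAX_UINT256 = 2**256 - 1


class ToyToken:
    """Minimal ERC-20 style ledger with balances and allowances."""

    def __init__(self, balances: dict):
        self.balances = dict(balances)
        self.allowances = {}

    def allowance(self, owner: str, spender: str) -> int:
        return self.allowances.get((owner, spender), 0)

    def approve(self, owner: str, spender: str, amount: int) -> None:
        self.allowances[(owner, spender)] = amount

    def transfer_from(self, spender: str, owner: str, recipient: str, amount: int) -> None:
        if self.allowance(owner, spender) < amount:
            raise ValueError("insufficient allowance")
        if self.balances.get(owner, 0) < amount:
            raise ValueError("insufficient balance")
        self.allowances[(owner, spender)] -= amount
        self.balances[owner] -= amount
        self.balances[recipient] = self.balances.get(recipient, 0) + amount


def bot_prepare_call(token: ToyToken, bot: str, target: str) -> None:
    """Model of the observed behaviour: top up a zero allowance for the target."""
    if token.allowance(bot, target) == 0:
        token.approve(bot, target, MAX_UINT256)  # approved amount is an assumption


weth = ToyToken({"bot": 1101})
bot_prepare_call(weth, "bot", "attacker")  # target address came from attacker-controlled data
weth.transfer_from("attacker", "bot", "attacker", weth.balances["bot"])
print(weth.balances)
```

Once the target address is taken from untrusted data, the convenience approval becomes a grant to whoever supplied that data.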

Functions exchange and swapExactTokensForTokens on the Attacker contract

The exchange function will be called twice on our Attacker contract, once with _from = WETH and once with _from = USDT, so we make that distinction in our return value. Other than that, we simply return the exact same values that were returned by the actual calls in the 0x8e56 transaction.

Final form of Attacker.attack

After calling SoloMargin.operate, we expect to have enough allowance to transfer all 0xbad’s WETH to the Attacker contract, so we add that to our attack function, along with a few console logs. This is the final output of our PoC test, which confirms that we would successfully be able to steal all 0xbad’s WETH:

Test output of our final PoC

The 0xbad exploit was a particularly special hack in 2022. The attack stresses that proper security measures must be taken even in smart contracts without verified source code: if a contract holds valuable assets, hackers will try to break it.

We’ve learned that there are quite a few transaction viewers and decompiling tools that help us in such investigations, and they will only get better as time moves forward.

The actual attacker on the 0xbad hack used a different data payload to gain WETH allowance and then transferred those funds on another transaction. The principle used is roughly the same: tricking the bot into thinking that the transaction is being executed as expected. I decided to do my own investigation to follow the hacker’s mentality as closely as possible, instead of looking at the hack transaction that had successfully tricked the bot. After all, the hacker also didn’t have access to an already working data payload.

Though it wasn’t included in this writeup, I’d like to thank Jon Becker, the creator of the very promising Heimdall toolkit, who readily optimized the decompiler module in an attempt to better decompile 0xDd6B. Unfortunately, the bytecode was too complex to output a helpful decompilation, but future iterations promise improved results.

This is what our entire PoC looks like, with the addition of some helpful Foundry logs, and the code for the internal _buildData function.

Snippet 7: All code.