I am writing this article to share an interesting pickle deserialization vulnerability that I disclosed in Apache Airflow. I reported the finding to Airflow and received a bounty of $540 from IBB, the Internet Bug Bounty program organized by HackerOne (the purpose of IBB is to reward security research into vulnerabilities affecting a selected set of open source software projects, which includes Airflow). The vulnerability has since been published as CVE-2023-50943.
Without further ado, let me show you the vulnerable code first.
At first glance, lines 679 to 688 looked like simple logic: a core configuration key, enable_xcom_pickling, controls Airflow’s XCom pickling option. But when I looked closely at the if and else branches, I realized that both of them can reach the pickle deserialization code below:
return pickle.loads(result.value)

The only difference is that the else branch runs pickle.loads() only when json.loads() fails. Unfortunately, for a pickle payload that condition is always satisfied: a pickle-dumped string is not decodable as JSON, so a JSONDecodeError is thrown and the except code in lines 687–688 runs. As a result, the configuration key enable_xcom_pickling no longer provides any protection against pickle deserialization.
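The fall-through described above can be reproduced with a minimal sketch. This is not Airflow's source code: deserialize_value_sketch and its enable_xcom_pickling parameter are hypothetical stand-ins for the real method and the configuration lookup. Catching UnicodeDecodeError alongside JSONDecodeError is also an assumption, made so that the sketch handles binary pickle protocols, whose bytes fail already at the UTF-8 decoding step.

```python
import json
import pickle


def deserialize_value_sketch(raw: bytes, enable_xcom_pickling: bool):
    """Hypothetical stand-in for the branching logic described in the article."""
    if enable_xcom_pickling:
        # Pickling explicitly enabled: pickle.loads() is reached directly.
        return pickle.loads(raw)
    try:
        # Pickling disabled: JSON is tried first ...
        return json.loads(raw.decode("utf-8"))
    except (json.JSONDecodeError, UnicodeDecodeError):
        # ... but a pickle payload is not valid JSON, so this fallback
        # runs for it and pickle.loads() is reached anyway.
        return pickle.loads(raw)


# A harmless pickled dict stands in for attacker-controlled bytes.
payload = pickle.dumps({"k": [1, 2]})

# Even with the flag off, the payload is still unpickled.
result_flag_off = deserialize_value_sketch(payload, enable_xcom_pickling=False)
print(result_flag_off)
```

In this sketch the flag only changes the order in which the decoders are tried. Whoever controls the stored bytes still decides whether pickle.loads() runs.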
Before I write the Proof of Concept (PoC) code to show the impact, I will briefly explain how pickle deserialization can be exploited to achieve arbitrary command execution (ACE), as shown in the figure below:
As can be seen, pickle.loads() deserializes the pickle-dumped string using an abstract stack. The load_build() and load_reduce() functions read from that stack to construct function calls dynamically and execute them. This is dangerous because a malicious pickle dump can freely embed arbitrary strings as function names, leading to arbitrary command execution (ACE).
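The stack-driven call can be observed with a deliberately harmless sketch. The class name BenignGadget is hypothetical, and the callable it names is the built-in int rather than anything dangerous. The point is only that the pickle stream, not the loading program, chooses which callable runs during pickle.loads().

```python
import io
import pickle
import pickletools


class BenignGadget:
    """Hypothetical payload class; an attacker would name a dangerous callable."""

    def __reduce__(self):
        # (callable, args): the unpickler will call int("1337") on load.
        return (int, ("1337",))


blob = pickle.dumps(BenignGadget())

# Disassemble the stream to see the opcodes the unpickler will execute.
listing = io.StringIO()
pickletools.dis(blob, out=listing)
opcodes = listing.getvalue()
print(opcodes)

# Loading never rebuilds BenignGadget; it returns whatever the callable returned.
loaded = pickle.loads(blob)
print(type(loaded).__name__, loaded)
```

The disassembly should show the callable's name being pushed onto the stack, followed by a REDUCE opcode that pops the callable and its arguments and invokes it. Replacing int with a function that runs shell commands is what turns this mechanism into ACE.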