5 Methods I Use To Discover APIs



When working on a target, one of the most interesting parts to test is its API. APIs are dynamic: they get updated more often than other parts of the application and are responsible for much of the backend’s heavy lifting. In modern applications we usually see REST APIs, but also other forms like GraphQL and even SOAP.

When we first approach a target, we have a lot of research to do in order to understand its main functions and how they work behind the scenes. It’s always recommended to set aside some time just for reading about the target and its services. For example, if we are hacking a car rental application, a good first step would be to read about the company’s services (renting, selling, support, discounts, etc.). With knowledge of our target’s services, we can look for the corresponding functions in its application and try to break them.

In this writeup we will talk about methods for API reconnaissance that give us a good picture of our attack surface. We won’t dive into common API attacks here, since those will be covered in a different writeup.

For more content and hacking tips — follow me on X.

It’s important to say that it is good practice to “click every button” the app displays to you. The simplest way to understand a function is to use it and analyze the response we get.

And yet, the buttons on the screen don’t always cover all the APIs that the app contains. There are three main reasons to assume that we see only a partial picture of the APIs:

- We don’t know whether a user with a different privilege level has access to more APIs than we do
- There might be undocumented APIs that the developers never created a web interface for
- There might be old APIs whose web interface was removed by the developers, but which still function on the backend

Another good reason to do thorough API recon is that it is a great way to learn more about the underlying core of the application, and to expose secrets at the same time. Read more about it here.

Not always, but in many cases we’re hacking a target that runs a product, an app, that has available documentation of its API, like a Swagger or WSDL file. Usually the documentation is out there for the convenience of other developers who want to integrate the target’s API into their apps. But sometimes the documentation is publicly accessible for no reason other than excessive exposure.

In any case, this is a very useful piece of information. Not only does it map the application’s main API endpoints, it also explains how the API itself functions:

- What kind of data the specific endpoint expects to get (integer/string, JSON/XML, POST/PUT/GET, etc.)
- Required headers to send
- The response we should get back from the request
- Authentication level needed for the specific endpoint

Uber’s API docs — example for request and response

In the example above we can see how Uber provides its API documentation for other developers. Pay attention to the different headers in the request, which include parameters like the authorization code, client ID and client secret. These parameters might be crucial for calling the API correctly, and the documentation explains them well.

Another way of using the app’s API documentation is to find its Swagger/WSDL files. Not only can we read them to understand the API structure, we can also load these files into Postman and start playing with them!
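Once a Swagger/OpenAPI file is in hand, listing its endpoints takes only a few lines. The snippet below is a minimal sketch, assuming the file is an OpenAPI JSON document with the standard `paths` and `security` keys. The `SPEC` content and the `list_operations` helper are hypothetical and only serve the illustration.

```python
# Hypothetical, trimmed-down OpenAPI document; a real one would be loaded
# with json.load() from the Swagger file found on the target.
SPEC = {
    "security": [{"bearerAuth": []}],
    "paths": {
        "/v1/cars": {
            "get": {"security": []},   # explicitly public
            "post": {},                # inherits the global security rule
        },
        "/v1/cars/{id}": {
            "parameters": [{"name": "id", "in": "path"}],
            "delete": {},
        },
    },
}

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def list_operations(spec):
    """Return sorted (path, METHOD, needs_auth) tuples found in the spec."""
    ops = []
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-method keys such as "parameters"
            # Operation-level "security" overrides the global one;
            # an empty list means no authentication is declared.
            security = op.get("security", spec.get("security", []))
            ops.append((path, method.upper(), bool(security)))
    return sorted(ops)


for path, method, needs_auth in list_operations(SPEC):
    print(f"{method:7}{path:20}auth required: {needs_auth}")
```

Endpoints that declare no authentication are usually the first ones worth a closer look.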

Not every application has it, but finding and reading the documentation can save us time and give the answers to almost every question about the app.

If our target doesn’t have API docs, we can create our own documentation for the app without a lot of effort. Read more about it in this article: How to craft rogue API docs for a target when they don’t exist.

As we mentioned earlier, APIs are a very dynamic part of the application and tend to change from time to time. This means two important things:

- The developers constantly work on the APIs and probably use different tools to build, test and document the different API versions
- There is a good chance that there are older versions of the application’s API that we can find, and maybe they are less secure than the current version in production!

Let’s talk about a few OSINT tools that are easy to use and give us results pretty fast.

Google Dorking

The combination of Google’s advanced search options and some indicative keywords for APIs is one of the first things to start with when we approach a target. A quick Google dorking search may give us:

- Subdomains of the target related to APIs
- API documentation page of the target
- API endpoints, both old and current versions

Here’s an example of some results about Starbucks:

Starbucks API Google search

This simple search of course doesn’t map the entire API surface of Starbucks, but it gives us a few more leads to subdomains of the company that host its API.

Some more useful search forms:

site:target.com inurl:"/v1"
site:target.com inurl:"/api"
site:target.com inurl:"/graphql"
site:target.com intitle:"api*"
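When there are several domains in scope, the same search forms can be generated instead of typed by hand. This is a small sketch that only builds the query strings and the matching search URLs; the helper names `build_dorks` and `google_search_url` are made up for the example.

```python
from urllib.parse import quote_plus

# Dork templates taken from the search forms above; "{domain}" is filled in per target.
DORK_TEMPLATES = [
    'site:{domain} inurl:"/v1"',
    'site:{domain} inurl:"/api"',
    'site:{domain} inurl:"/graphql"',
    'site:{domain} intitle:"api*"',
]


def build_dorks(domain):
    """Return the dork strings for one target domain."""
    return [template.format(domain=domain) for template in DORK_TEMPLATES]


def google_search_url(dork):
    """Turn a dork into a search URL that can be opened in a browser."""
    return "https://www.google.com/search?q=" + quote_plus(dork)


for dork in build_dorks("target.com"):
    print(google_search_url(dork))
```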

WaybackMachine

One of the greatest tools out there for discovering API endpoints and harvesting some secrets at the same time is WaybackMachine. We all know that by searching for a URL there we can view the target’s page as it looked on a specific date. The magic is that we can also get a list of the archived URLs that were requested with GET. Pay attention to the image below:

Ryanair API endpoints in WaybackMachine

Just by searching for the company’s domain and filtering on the word “api”, we got back a few API endpoints, including GraphQL ones.

And if we look at more subdomains of the company, we will probably see more and more API endpoints. In many cases I’m able to find credentials like usernames, tokens, auth keys and JWTs in Wayback.

Valid user token found in WaybackMachine

Using these credentials, I can sometimes test post-auth API endpoints with different user permissions.

It’s also recommended to integrate GAU or Waymore into your recon automation to pull more API endpoints.
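For a quick look without installing anything, the archive can also be queried directly. The sketch below assumes the public Wayback CDX endpoint and its `url`, `output`, `fl`, `collapse` and `limit` parameters. The `API_HINT` pattern is my own guess at what “API-looking” means and should be tuned per target.

```python
import json
import re
from urllib.parse import urlencode
from urllib.request import urlopen

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"

# Hypothetical pattern for "API-looking" URLs: /api, /graphql or /v<number>.
API_HINT = re.compile(r"/(api|graphql|v\d+)(/|$|\?)", re.IGNORECASE)


def cdx_query_url(domain, limit=5000):
    """Build a CDX query that lists archived URLs under a domain."""
    params = {
        "url": f"{domain}/*",   # everything below the domain
        "output": "json",
        "fl": "original",       # only return the original URL column
        "collapse": "urlkey",   # drop repeated captures of the same URL
        "limit": str(limit),
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def filter_api_urls(urls):
    """Keep only URLs that look API-related, deduplicated and sorted."""
    return sorted({u for u in urls if API_HINT.search(u)})


def fetch_archived_urls(domain):
    """Query the archive (network call) and return the original URLs."""
    with urlopen(cdx_query_url(domain), timeout=30) as resp:
        rows = json.load(resp)
    return [row[0] for row in rows[1:]]  # the first row is the column header


# Usage (performs a network request):
# print(filter_api_urls(fetch_archived_urls("target.com")))
```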

Postman

This is one of the most common tools developers use to test their APIs. It removes the need for a frontend interface to send the requests, and it is much more convenient than just using curl.

Postman (from Postman.com) — useful tool for API testing

Postman is available as a SaaS application at postman.com and lets developers share projects to make teamwork easier. A Postman project, also known as a Postman collection, is usually supposed to be private. But in many cases you will see that collections are publicly open. The collections contain many details such as parameters, headers, body data, environment variables and authorization tokens.

If we are lucky enough to get our hands on a Postman collection of our target, it might be even better than finding the official API documentation page. What we see there are not just examples without real content; these are real requests sent by the developers of the app and real responses from the backend. We can learn a lot about the internal environment of the app and the underlying core of the target. And one of the best parts: credentials that usually have high permissions to query the backend!

Cookie and tokens in Postman collections

In the image above we can see a Postman collection that belongs to one of my clients. As you can see, this is an example of a POST request sent to localhost:8080, because it was sent in a staging environment. But it contains a token and a cookie that were still valid for use in a pentest on production.
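An exported collection is just JSON, so it is easy to pull out every request and flag the interesting headers. This is a rough sketch assuming the Postman Collection v2.x layout (nested `item` lists, each request with `method`, `url` and `header`). The `COLLECTION` sample and the `SENSITIVE_HEADERS` set are invented for the illustration.

```python
# Hypothetical sample; a real export would be loaded with json.load().
COLLECTION = {
    "info": {"name": "staging-api"},
    "item": [
        {
            "name": "Users",  # a folder: it has its own "item" list
            "item": [
                {
                    "name": "Create user",
                    "request": {
                        "method": "POST",
                        "url": {"raw": "http://localhost:8080/api/v1/users"},
                        "header": [
                            {"key": "Authorization", "value": "Bearer <redacted>"},
                            {"key": "Content-Type", "value": "application/json"},
                        ],
                    },
                }
            ],
        },
        {"name": "Health", "request": {"method": "GET", "url": "http://localhost:8080/health"}},
    ],
}

SENSITIVE_HEADERS = {"authorization", "cookie", "x-api-key"}


def walk_items(items):
    """Yield every request entry, descending into folders."""
    for item in items:
        if "item" in item:
            yield from walk_items(item["item"])
        elif "request" in item:
            yield item


def summarize(collection):
    """Return (method, url, sensitive_header_names) for each request."""
    rows = []
    for item in walk_items(collection.get("item", [])):
        req = item["request"]
        if isinstance(req, str):  # shorthand form: the request is just a URL
            rows.append(("GET", req, []))
            continue
        url = req.get("url", "")
        if isinstance(url, dict):
            url = url.get("raw", "")
        flagged = [
            h["key"] for h in req.get("header", [])
            if h.get("key", "").lower() in SENSITIVE_HEADERS
        ]
        rows.append((req.get("method", "GET"), url, flagged))
    return rows


for method, url, flagged in summarize(COLLECTION):
    print(method, url, flagged)
```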

GitHub

It’s not always the case, but if your target has a GitHub repository that is accessible to you, spending some time on the app’s code is always a good idea. With a few keywords, we’ll maximize our chances to find API endpoints and a detailed explanation for how they work.

Some common keywords for API:

/v1
/api
apikey
api_key
apidocs
api_secret
x-api-key
/graphql

As with Postman and WaybackMachine, on GitHub we also have a good chance of finding secrets and credentials that might be useful for the next steps of the engagement.
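If the repository can be cloned, the keyword search can run locally over every file. The snippet below is a simple sketch that uses the keywords above as I read them; `find_keywords` and `search_repo` are hypothetical helper names, and a plain substring match like this will produce noise that still needs manual review.

```python
import os

KEYWORDS = ["/v1", "/api", "apikey", "api_key", "apidocs",
            "api_secret", "x-api-key", "/graphql"]


def find_keywords(text, keywords=KEYWORDS):
    """Return (line_number, keyword, line) for every case-insensitive hit."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), 1):
        lowered = line.lower()
        for kw in keywords:
            if kw in lowered:
                hits.append((lineno, kw, line.strip()))
    return hits


def search_repo(root):
    """Walk a cloned repository and yield (path, line_number, keyword, line)."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d != ".git"]  # skip git internals
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as fh:
                    text = fh.read()
            except OSError:
                continue  # unreadable file, move on
            for lineno, kw, line in find_keywords(text):
                yield path, lineno, kw, line


# Usage, assuming the repo was cloned to ./target-repo:
# for hit in search_repo("./target-repo"):
#     print(*hit, sep=" | ")
```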

To send API requests from the frontend to the backend, the frontend app uses JavaScript for XHR/AJAX calls. This means that the API endpoints themselves should be mentioned in the client-side source code. In Firefox, if we open the DevTools (F12) and go to the Debugger tab (or the Sources tab in Chrome), we will see our target’s address and a small arrow next to it. Clicking on the small arrow reveals the resources of the frontend, including the JavaScript files.

TryHackMe’s JavaScript files

After finding the JavaScript files, we usually get a chunk of minified code, without new lines and spaces. Minification is done to improve performance and user experience. In this case we can use a JS prettifier, like this one. After that, just copy the code into a code editor like VSCode or Sublime and start searching for API requests.

To search for API calls in the code, we first need to understand the structure of the API calls in the app. Don’t hesitate to spend some time reading the different functions and variables you see. Search for keywords like API, v1, v2, user and other common words associated with APIs. Another thing to do is to search for HTTP methods (GET, POST, PUT, DELETE), which indicate that a request is being sent to the backend.
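The same keyword hunt can be scripted with two regular expressions: one for quoted strings that look like API paths, and one for client calls that reveal the HTTP method. This is only a sketch. The patterns and the `axios`/`$`/`http` client names are assumptions about how the target’s code is written, and every app will need its own adjustments.

```python
import re

# Quoted strings that look like API paths, e.g. "/api/v1/users" or '/graphql'.
PATH_RE = re.compile(r"""["'`](/(?:api|v\d+|graphql)[^"'`\s]*)["'`]""")

# Calls such as axios.post("...") or $.get("..."), to capture the HTTP method.
CALL_RE = re.compile(
    r"""(?:axios|http|\$)\.(get|post|put|patch|delete)\(\s*["'`]([^"'`]+)["'`]""",
    re.IGNORECASE,
)


def extract_paths(js_source):
    """Return the sorted, unique API-looking paths found in the source."""
    return sorted(set(PATH_RE.findall(js_source)))


def extract_calls(js_source):
    """Return (METHOD, url) pairs for recognizable client calls."""
    return [(method.upper(), url) for method, url in CALL_RE.findall(js_source)]


# Hypothetical minified snippet standing in for a real bundle.
sample = ('fetch("/api/v1/users");axios.post("/api/v1/orders",d);'
          'const q="/graphql";var x="/assets/app.css"')
print(extract_paths(sample))
print(extract_calls(sample))
```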

Also, if we want an automated tool we can use Katana. This is a nice crawler with many flags for customizing it to our target. One of the most important features of Katana is JavaScript parsing. With it, we can run over many JS files and within a few moments get a first picture of the APIs on the target. A good approach is to run Katana, review the output, and then run it again with a few more customizations for the specific web app.

By viewing the HTML and JavaScript of an application, we can map most of the API calls and even expose shadow APIs. In one of my last engagements, the user interface of the application exposed around 30 API calls, while after extracting the JavaScript code I was able to expose more than 140 API endpoints that couldn’t be found just by using the app.

Because shadow APIs have so little exposure, they are very rarely tested, and so they tend to have a higher potential for vulnerabilities.

So far we have discussed only passive ways to enrich our API surface, with minimal contact with the target itself. Passive actions still enable us to discover many, even most, of the API endpoints that exist on the backend. But what happens when there are API endpoints that are not supposed to be exposed to the app we are messing with? I’m not talking about shadow APIs, but about other endpoints that exist on the backend and are supposed to serve another application, like an internal or employee app.

For example, this case is relevant when a company develops several frontend applications (web and mobile) that get data from one backend (api.target.com), like one panel for clients and another one for the managers.

If the other app is not accessible to us, or we don’t even know it exists, fuzzing can help us find more endpoints to hack!

When it comes to API fuzzing, there are two important things we need to consider:

- Fuzzers/Scanners: the tool that sends the HTTP requests and filters the responses. We have to define in advance which responses are interesting for us.
- Wordlists: the content we fuzz for. A good wordlist is the difference between finding a vulnerability and wasting time running generic words.

Let’s talk about them.

Fuzzers

There are many tools these days that do a great job at API discovery through fuzzing. For simple GET requests with a list of endpoints we can always use tools like Burp Intruder, ffuf, GoBuster or Kiterunner, or even build our own fuzzer. In most cases, I find ffuf and Kiterunner to be great tools, not only in terms of speed, but also because of their useful features like filtering by size, status code, words and more. As for Kiterunner specifically, combined with the relevant lists from Assetnote, this tool is excellent for modern web apps (NodeJS, Flask, Rails, etc.).

Kiterunner’s wordlists from Assetnote

With a single command you can have a very good understanding of the API picture of your target:

./kr scan https://target.com -w ~/wordlists/routes-large.json
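For the “build our own fuzzer” option, the core loop is short: request each candidate path and drop the responses we declared uninteresting. The sketch below is not how ffuf or Kiterunner work internally. It is a bare-bones illustration, and the names `http_status`, `fuzz`, `ignore_status` and `ignore_sizes` are all made up for it.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def http_status(url, timeout=5.0):
    """Return (status, body_length) for a GET request, or None on network error."""
    try:
        with urlopen(Request(url, method="GET"), timeout=timeout) as resp:
            return resp.status, len(resp.read())
    except HTTPError as err:       # 4xx/5xx still carry a status and a body
        return err.code, len(err.read())
    except OSError:                # DNS failure, refused connection, timeout
        return None


def fuzz(base_url, words, fetch=http_status, ignore_status=(404,), ignore_sizes=()):
    """Try every word as a path and keep responses that pass the filters."""
    found = []
    for word in words:
        url = f"{base_url.rstrip('/')}/{word.lstrip('/')}"
        result = fetch(url)
        if result is None:
            continue
        status, size = result
        if status in ignore_status or size in ignore_sizes:
            continue  # filtered out as uninteresting
        found.append((url, status, size))
    return found


# Usage (performs network requests):
# for hit in fuzz("https://target.com/api", ["users", "admin", "internal"]):
#     print(hit)
```

Passing `fetch` as a parameter keeps the filtering logic testable offline and makes it easy to swap in a client with custom headers or authentication.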

Also, besides API endpoints we have to discover which parameters are accepted by the backend. Of course there are the “default” parameters of legitimate requests, but what if there are also “shadow parameters”? Maybe we can find a Mass Assignment vulnerability that we would have no other way to find except by fuzzing the target. For this task, I see Arjun as one of the best tools out there.

Arjun is a Python tool that sends requests (GET by default) to a given URL with a large number of different parameters. At the end, the tool gives us a list of valid parameters for further testing. I will write more about the usage of Arjun in a future writeup about hacking APIs.
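The general idea behind parameter discovery can be sketched in a few lines: send a batch of candidate parameters, compare the response with a baseline, and split the batch whenever something changes. To be clear, this is a simplified illustration of the technique and not Arjun’s actual implementation. The `make_get_probe` and `discover_params` names are hypothetical, and comparing only status and length will misfire on pages whose size changes between requests.

```python
from urllib.error import HTTPError
from urllib.parse import urlencode
from urllib.request import urlopen


def make_get_probe(url, timeout=5.0):
    """Return a probe that sends GET parameters and fingerprints the response."""
    def probe(params):
        target = f"{url}?{urlencode(params)}" if params else url
        try:
            with urlopen(target, timeout=timeout) as resp:
                return resp.status, len(resp.read())
        except HTTPError as err:
            return err.code, len(err.read())
    return probe


def discover_params(candidates, probe, chunk_size=50):
    """Return the candidates whose presence changes the response fingerprint."""
    baseline = probe({})
    found = []

    def narrow(chunk):
        if probe({name: "1" for name in chunk}) == baseline:
            return                      # nothing in this chunk has an effect
        if len(chunk) == 1:
            found.append(chunk[0])      # isolated a parameter that matters
            return
        mid = len(chunk) // 2
        narrow(chunk[:mid])
        narrow(chunk[mid:])

    for i in range(0, len(candidates), chunk_size):
        narrow(candidates[i:i + chunk_size])
    return found


# Usage (performs network requests):
# probe = make_get_probe("https://target.com/api/v1/users")
# print(discover_params(["id", "debug", "admin", "role"], probe))
```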

Wordlists

Using the right wordlist is the key to successful API pentesting. There are some great resources for this mission: SecLists, Assetnote, FuzzDB and more.

Lazy hackers use generic wordlists that simply contain a huge number of words without any specific purpose. The pros always try to get more specific wordlists according to the target. For example, if we know that our target is a car rental web app with a Django backend, we can combine a generic wordlist for Django with a custom wordlist for car rental. For the first wordlist we can use Assetnote:

Django wordlist from Assetnote

And for the second we can ask ChatGPT to generate a list of common API endpoints for car rental:

Car rental generated wordlist from ChatGPT
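Combining the two lists is mostly a matter of normalizing and deduplicating entries so the fuzzer doesn’t request the same path twice. Here is a minimal sketch; the sample entries are invented and the `merge_wordlists` helper is hypothetical.

```python
def merge_wordlists(*wordlists):
    """Merge wordlists in order, dropping blanks, comments and duplicates."""
    seen = set()
    merged = []
    for words in wordlists:
        for raw in words:
            word = raw.strip().lstrip("/")   # "/api/cars" and "api/cars" are the same path
            if not word or word.startswith("#"):
                continue
            key = word.lower()               # treat "API/" and "api/" as duplicates
            if key not in seen:
                seen.add(key)
                merged.append(word)
    return merged


# Hypothetical sample entries; in practice these would be read from files.
DJANGO_WORDS = ["admin/", "api/", "api/auth/login/", "static/"]
CAR_RENTAL_WORDS = ["/api/cars", "api/bookings", "API/", "# generated list", ""]

for word in merge_wordlists(DJANGO_WORDS, CAR_RENTAL_WORDS):
    print(word)
```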

Let’s say that our target is a delivery company. It has a web application where we can place an order, pay for it and maybe use some more features. But if the delivery company also has a mobile application, there’s a chance that some features are available only on mobile phones, like getting an exact location from GPS.

In this case, the APIs that appear in the JavaScript of the web application might not be the same as the API endpoints that exist in the APK file.

Mobile app pentesting is a big, separate topic, so we won’t get into much detail in this article, but we can use static analysis tools like JADX and MobSF to extract hardcoded API endpoints that reside in the APK.

There are mirror websites like APKPure from which we can download the APK to our machine in order to open it with analysis tools. It’s always recommended to use both static and dynamic analysis to map every call the app sends, but for a start it’s also very efficient to use tools like MobSF for a first glance.

MobSF is an automated analysis tool that takes an APK file and builds a report about the file’s internals:

Output from mobsf.live

From the report, we can get hardcoded URLs and domains that help us build a bigger picture of the target’s API surface.
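For a very rough first pass without any tooling, remember that an APK is a ZIP archive, so its entries can be scanned for plain-text URLs. This sketch is far cruder than what JADX or MobSF do: it only catches URLs stored as ASCII bytes and will miss strings that are encoded differently or built at runtime. The `urls_in_apk` helper and the demo archive are hypothetical.

```python
import io
import re
import zipfile

# Bytes pattern for http(s) URLs made of characters that are legal in a URL.
URL_RE = re.compile(rb"https?://[A-Za-z0-9._~:/?#\[\]@!$&'()*+,;=%-]+")


def urls_in_apk(apk):
    """Return sorted unique URLs found in any entry of the APK (path or file object)."""
    found = set()
    with zipfile.ZipFile(apk) as zf:
        for name in zf.namelist():
            for match in URL_RE.findall(zf.read(name)):
                found.add(match.decode("ascii"))
    return sorted(found)


# Demo with an in-memory archive standing in for a downloaded APK.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr(
        "classes.dex",
        b"\x00\x01https://api.target.example/v2/orders\x00junk "
        b"http://internal.target.example/health\x00",
    )
buf.seek(0)
print(urls_in_apk(buf))
```

Expect plenty of noise from SDK and schema URLs in a real APK, so filtering the output by the target’s domains is a sensible next step.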

When we’re hacking APIs, we must have a good understanding of how the application works, what features it offers and what the whole surface available to us is. By using the methods above, we can thoroughly build the picture of the application and have a great basis for hacking the app’s different APIs.

Hacking APIs is sometimes more like research, in which we have to use different tools, resources and even manual techniques in order to discover every piece of it. If we focus on a target for a long time (weeks or months), we will probably see its APIs changing and growing over time, and the best time to hack them is when they are brand new or still in the shadows.

In a future article I will also cover how to hack the APIs that we find with these methods.
