Traversing GitHub for secrets with automated tools such as gitrob (michenriksen) or GitGot (BishopFox) is great for a quick scan of potentially sensitive information hidden in a target’s GitHub environment. However, these tools, and automated secret discovery on GitHub in general, are far from perfect.
Often, secrets stored in a target’s GitHub environment are overlooked and never appear in the tool output because of the limitations of automated scanning (regex, entropy searches, etc.). Other times, so much output is produced that it is difficult to discern true secrets from a sea of false positives.
Enter GitDorker, an easy-to-use tool written in Python that uses a compiled list of GitHub dorks from various sources across the bug bounty community to perform manual dorking against a user-supplied query such as a GitHub organization, user, or domain name of the intended target. These dorks map out the potential surface for secret exposure by providing the user with a list of successful dorks, the number of results returned per dork, and a URL for each dork that gives easy access to manual searching of secrets across a target’s public GitHub environment.
You may view the current compiled list of more than 230 GitHub dorks, which I will continue to update, here:
https://github.com/obheda12/GitDorker/
Lastly, before I dive into the use cases and explanation of GitDorker, I’d like to thank Gwendell Le Coguic, who wrote “github-dorks”, on which I based GitDorker. He also has a fantastic repo of GitHub searching tools available here:
https://github.com/gwen001/github-search
In this post, I will outline use cases and demonstrate how to use GitDorker to map the attack surface for sensitive information exposure in your intended target’s GitHub environment. For this demonstration we will be using Tesla as our target.
To download GitDorker, run the following command in your terminal of choice:
git clone https://github.com/obheda12/GitDorker
To install the requirements use the following command:
pip3 install -r requirements.txt
Lastly, in order to use GitDorker, you must create a GitHub personal access token and supply it with the “-t” switch, or the “-tf” switch if using multiple tokens. You may follow the documentation below to create your own access token.
NOTE: GitHub rate limits accounts to 30 queries every minute. To stay within this limit, I have added a sleep after every 30 requests. There is also the option to supply a file of tokens, which GitDorker will round robin through when performing dorks, increasing the number of queries you can make per minute. For example, using the “-tf” switch with a file of 3 unique tokens from 3 unique accounts will increase your query limit from 30 per minute to 90 per minute. Using multiple tokens to avoid rate limits is highly encouraged.
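The round-robin idea in the note above can be sketched in a few lines of Python. This is only an illustration of the technique, not GitDorker's actual code: the function names (`queries_per_minute`, `assign_tokens`, `run_with_pacing`) and the `send` callback are hypothetical, and the 60-second pause is an assumption about how long to wait for the limit to reset.

```python
import itertools
import time

SEARCH_LIMIT_PER_TOKEN = 30  # queries per minute per account, as stated in the note


def queries_per_minute(tokens):
    """Combined per-minute budget when rotating across unique tokens."""
    return SEARCH_LIMIT_PER_TOKEN * len(set(tokens))


def assign_tokens(dorks, tokens):
    """Pair each dork with the next token in round-robin order."""
    if not tokens:
        raise ValueError("at least one token is required")
    rotation = itertools.cycle(tokens)
    return [(dork, next(rotation)) for dork in dorks]


def run_with_pacing(dorks, tokens, send, sleep=time.sleep):
    """Call send(dork, token) for every dork, pausing once the budget is spent.

    `send` is a hypothetical callback that performs one search request.
    """
    pairs = assign_tokens(dorks, tokens)
    budget = queries_per_minute(tokens)
    for i, (dork, token) in enumerate(pairs, start=1):
        send(dork, token)
        # Pause after each full batch, unless this was the last dork.
        if i % budget == 0 and i < len(pairs):
            sleep(60)  # assumed wait for the per-minute window to reset
```

With three tokens, `queries_per_minute` returns 90, which matches the arithmetic in the note: each additional account adds another 30 queries to the per-minute budget.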
We will first identify Tesla’s GitHub organization account name. A quick Google search for “tesla github” gives us the name “teslamotors”. We will use this information to perform dorks across Tesla’s GitHub environment.
To view the help menu of GitDorker, use the following command:
python3 GitDorker.py -h
We will be using the “-q” query switch to perform a GitHub query on Tesla’s GitHub organization account “teslamotors”. This query switch enters input into GitHub’s public search bar.
Utilizing GitHub’s advanced search functionality we can perform more complex queries to enhance our results. Linked below is an informational page of GitHub’s advanced search parameters when performing queries.
We will use the “org:” parameter in our query together with a list of dorks. For this demonstration I will be using the “demo_dorks” file and a token file containing 4 access tokens. I will also use the “-o” parameter to write my output to a CSV.
Note: There is an “-org” switch; however, for demonstration purposes I find the “-q” switch more conducive to understanding the dynamic functionality of GitDorker.
python3 GitDorker.py -tf tokensfile.txt -q org:teslamotors -d dorks/demo_dorks.txt -o tesla
Below is the resulting standard output in a terminal, along with the CSV output opened in Excel and filtered by number of results for easy analysis and manual searching of sensitive information.
As you can see, a URL of a custom search query is generated for each dork along with the number of results for reference. We will visit the link generated for the “ftp” dork and analyze our results.
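To make the link generation concrete, here is a minimal sketch of how a query and a dork can be combined into a search URL. The function names are hypothetical and this is not taken from GitDorker's source; it assumes the dork is simply appended to the query, that the browsable link uses GitHub's `https://github.com/search` page with `type=Code`, and that result counts come from the `total_count` field of GitHub's code search REST endpoint.

```python
from urllib.parse import urlencode


def build_search_query(target, dork):
    """Combine the user's query (e.g. 'org:teslamotors') with one dork."""
    return f"{target} {dork}"


def web_search_url(target, dork):
    """URL a person can open in the browser to review matches by hand."""
    params = {"q": build_search_query(target, dork), "type": "Code"}
    return "https://github.com/search?" + urlencode(params)


def api_search_url(target, dork):
    """Code search REST endpoint; its JSON response includes 'total_count'."""
    params = {"q": build_search_query(target, dork)}
    return "https://api.github.com/search/code?" + urlencode(params)


print(web_search_url("org:teslamotors", "ftp"))
# https://github.com/search?q=org%3Ateslamotors+ftp&type=Code
```

Keeping the browser URL and the API URL separate reflects the two things each dork yields: a count for triage and a link for manual review.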
Our query for the “ftp” dork produced a sizeable number of results, and given the size of Tesla’s GitHub environment it would take a long time to look through the code and commit history. While starting here could prove fruitful, it is advised to analyze dork results in order from the fewest results to the most, to optimize your time spent searching.
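The same fewest-first ordering can be done in code instead of a spreadsheet filter. The sketch below is illustrative only: `DorkResult` and `triage_order` are hypothetical names, and the sample counts are made up for the example rather than taken from the Tesla run.

```python
from typing import NamedTuple


class DorkResult(NamedTuple):
    dork: str
    count: int
    url: str


def triage_order(results):
    """Drop dorks with no hits, then sort from fewest results to most."""
    hits = [r for r in results if r.count > 0]
    return sorted(hits, key=lambda r: r.count)


# Made-up counts, for illustration only.
sample = [
    DorkResult("ftp", 120, "https://github.com/search?q=org%3Ateslamotors+ftp&type=Code"),
    DorkResult("connectionstring", 1, "https://github.com/search?q=org%3Ateslamotors+connectionstring&type=Code"),
    DorkResult("password", 0, "https://github.com/search?q=org%3Ateslamotors+password&type=Code"),
]

for result in triage_order(sample):
    print(result.count, result.dork, result.url)
```

Dorks with a handful of hits can be reviewed in minutes, so clearing them first leaves the large result sets for when there is time to dig through commit history.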
On the lower end of our results we see the “connectionstring” dork produce one result. Taking a look at the result we see the following:
There is a single result, with the dork “connectionstring” visible in the code. This is an ideal place to start looking into the commit history of the result. In our case, Tesla is a fairly mature target and does not have any sensitive information in this instance or in its commit history; however, less mature targets are much more likely to be hiding secrets.
Enabling a user to enumerate the potential locations of sensitive information exposure on GitHub makes manual GitHub dorking a far smoother process and greatly increases the likelihood of findings. The core purpose of GitDorker is to identify and map where sensitive information may be hiding, to enhance your manual search for sensitive information exposure and give you detailed insight into your target’s GitHub environment.
Below are additional use cases for GitDorker using Tesla as our example target:
Utilizing a domain as a search term
You may search on a domain or any other search term, per the GitHub search guidelines mentioned earlier in this post. For example, we will use “tesla.com” as our target domain.
python3 GitDorker.py -tf tokensfile.txt -q tesla.com -d dorks/demo_dorks.txt
Utilizing a user or multiple users as a target with threading
You may search on a user or a list of users for sensitive information in their repositories. For example, we will target the users listed on Tesla’s “teslamotors” GitHub page, provided in a file.
First we will visit Tesla’s “teslamotors” GitHub page and identify users on the People tab. Gwen001 has also written a script in his “github-search” repository, linked earlier, to scrape users automatically.
We will target the first 2 public users and put their GitHub usernames into a text file, one per line. For this example I am using a shorter list of dorks, since the number of queries is multiplied by the number of users.
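A short sketch shows why the query count grows so quickly. It assumes each username is turned into GitHub's `user:` search qualifier and paired with every dork; the helper names and the placeholder usernames are hypothetical, and GitDorker's own handling of the user file may differ.

```python
from itertools import product


def read_entries(text):
    """Parse newline-separated file contents, skipping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def user_queries(users, dorks):
    """One search query per (user, dork) pair."""
    return [f"user:{user} {dork}" for user, dork in product(users, dorks)]


# Placeholder contents standing in for a user file and a dork file.
users = read_entries("example-user-1\nexample-user-2\n")
dorks = read_entries("ftp\nconnectionstring\npassword\n")

queries = user_queries(users, dorks)
print(len(queries))  # one query for every user and dork combination
```

Since every query counts against the rate limit, a short dork list or several tokens keeps a multi-user run manageable.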
We will now run the following command to perform dorking on the GitHub users in our user file, using a thread count of 2 and a token file containing 4 unique GitHub access tokens.
python3 GitDorker.py -tf tokensfile.txt -uf userfile.txt -d dorks/*DORK_FILE*
These are only a few use cases. The advantages of GitDorker depend on how you choose to perform advanced querying to pinpoint sensitive information and gain further insight into a target’s public-facing GitHub environment.
To date, I’ve personally found success with this tool on bug bounty targets. If it works well for you, great! I’d love to hear about any wins or successes you have. If you think this modified tool sucks, then let me know how to improve it. I am more than open to feedback :)
Feel free to give me a follow; I plan to drop more tools and insights from my research in the future:
Follow Me:
github.com/obheda12
twitter.com/obheda12
Feel free to email me as well at obheda1@gmail.com for any questions, ideas, or if you’d like to collab :)