OSINT analysis — SpiderFoot & theharvester (Information Gathering)


The most accurate information about a person, host, or web server is usually the information they give out themselves. Think about it: if you want to learn something about a person, the most trustworthy places to look are their social media and other public internet presence — or you can simply approach them and ask directly. Broadly speaking, digging up publicly available information about a particular host on the internet is what we call OSINT.

OSINT stands for Open Source Intelligence — data available in the public domain that might reveal interesting information about your target. This includes DNS records, WHOIS data, web pages, passive DNS, spam blacklists, file metadata, and threat intelligence lists, as well as services like Shodan, Have I Been Pwned, and more.

Gathering information on a target without directly interacting with it is called passive recon (or passive OSINT); interacting with the target to get information is called active recon (or active OSINT).

Information gathered through passive recon is usually lower in both quantity and quality than what active recon yields, but active recon puts you at risk: because you are interacting with the target, the owner can trace the activity back to you, leaving you identified and possibly located if you are not careful.

There are various tools on the internet that can help you with OSINT, but the best of them is Google itself; we will discuss Google dorking in one of my upcoming articles. I have already posted about GHunt (Google mail OSINT) and mentioned many other OSINT tools in my Hacktoria contract challenges.

In the OSINT Analysis section of the Information Gathering tab in Kali Linux, there are basically two tools listed. Let's discuss each of them in detail.

SpiderFoot

SpiderFoot is a reconnaissance tool that automatically queries over 100 public data sources (OSINT) to gather intelligence on IP addresses, domain names, e-mail addresses, names and more. You simply specify the target you want to investigate, pick which modules to enable and then SpiderFoot will collect data to build up an understanding of all the entities and how they relate to each other.

SpiderFoot can be used offensively, i.e. as part of a black-box penetration test to gather information about the target, or defensively to identify what information you or your organization are freely providing for attackers to use against you.

SpiderFoot is a free and open-source framework, written in Python and available on GitHub, that automates the reconnaissance process. It works on OSINT principles and can act as a scanner for both active and passive scanning of a target. Its many modules can be used for domain footprinting; finding phone numbers, e-mail addresses, and even bitcoin addresses associated with a target; saving a summary of everything gathered; and generating graphs of the relationships discovered during a scan.

Spiderfoot can be used in terminal as well as with a GUI (Graphical user interface) in the web-browser.

If you don’t have SpiderFoot, you can simply clone it from GitHub or download it from the official website. Here’s how to clone it.

Open your terminal and change into your Downloads directory.

cd Downloads

Git clone the SpiderFoot repository from GitHub.

git clone https://github.com/smicallef/spiderfoot.git

Change into the cloned directory and list its files using the ls command.

cd spiderfoot && ls

Install the dependencies from the requirements file with this command.

pip3 install -r requirements.txt

After this completes, clear your terminal and list the files. Now it’s time to run the tool. Use the following command to run it.

python3 sf.py

The tool asks you to start the web server. Use the following command to start the web server and launch the tool.

python3 ./sf.py -l 127.0.0.1:5002

Now, open your browser and go to http://127.0.0.1:5002/

A page like this will appear. Now move on to New Scan and follow the instructions given about the target format.

Let’s try to scan google.com and name it as Test1. Scroll down and click on Run scan now.

And, the results will start appearing.

If you already have SpiderFoot installed on your system, all you have to do is enter the following command in your terminal.

spiderfoot -l 127.0.0.1:5052

Now, go to http://127.0.0.1:5052/

Here is the help section of SpiderFoot in command line interface.

┌──(scott㉿notebook)-[~]
└─$ spiderfoot -h
usage: sf.py [-h] [-d] [-l IP:port] [-m mod1,mod2,...] [-M] [-C scanID]
[-s TARGET] [-t type1,type2,...]
[-u {all,footprint,investigate,passive}] [-T] [-o {tab,csv,json}]
[-H] [-n] [-r] [-S LENGTH] [-D DELIMITER] [-f]
[-F type1,type2,...] [-x] [-q] [-V] [-max-threads MAX_THREADS]
SpiderFoot 4.0.0: Open Source Intelligence Automation.

options:
-h, --help show this help message and exit
-d, --debug Enable debug output.
-l IP:port IP and port to listen on.
-m mod1,mod2,... Modules to enable.
-M, --modules List available modules.
-C scanID, --correlate scanID
Run correlation rules against a scan ID.
-s TARGET Target for the scan.
-t type1,type2,... Event types to collect (modules selected
automatically).
-u {all,footprint,investigate,passive}
Select modules automatically by use case
-T, --types List available event types.
-o {tab,csv,json} Output format. Tab is default.
-H Don't print field headers, just data.
-n Strip newlines from data.
-r Include the source data field in tab/csv output.
-S LENGTH Maximum data length to display. By default, all data
is shown.
-D DELIMITER Delimiter to use for CSV output. Default is ,.
-f Filter out other event types that weren't requested
with -t.
-F type1,type2,... Show only a set of event types, comma-separated.
-x STRICT MODE. Will only enable modules that can
directly consume your target, and if -t was specified
only those events will be consumed by modules. This
overrides -t and -m options.
-q Disable logging. This will also hide errors!
-V, --version Display the version of SpiderFoot and exit.
-max-threads MAX_THREADS
Max number of modules to run concurrently.

I would also suggest reading the official SpiderFoot documentation so that you can understand how it works. One of the best parts about SpiderFoot is that it is an open-source project.

theHarvester

theHarvester is a simple to use, yet powerful tool designed to be used during the reconnaissance stage of a red team assessment or penetration test. It performs open source intelligence (OSINT) gathering to help determine a domain’s external threat landscape. The tool gathers names, emails, IPs, subdomains, and URLs by using multiple public resources.

┌──(scott㉿notebook)-[~]
└─$ theHarvester --help
*******************************************************************
* _ _ _ *
* | |_| |__ ___ /\ /\__ _ _ ____ _____ ___| |_ ___ _ __ *
* | __| _ \ / _ \ / /_/ / _` | '__\ \ / / _ \/ __| __/ _ \ '__| *
* | |_| | | | __/ / __ / (_| | | \ V / __/\__ \ || __/ | *
* \__|_| |_|\___| \/ /_/ \__,_|_| \_/ \___||___/\__\___|_| *
* *
* theHarvester 4.2.0 *
* Coded by Christian Martorella *
* Edge-Security Research *
* cmartorella@edge-security.com *
* *
*******************************************************************
usage: theHarvester [-h] -d DOMAIN [-l LIMIT] [-S START] [-p] [-s]
[--screenshot SCREENSHOT] [-v] [-e DNS_SERVER] [-r] [-n]
[-c] [-f FILENAME] [-b SOURCE]

theHarvester is used to gather open source intelligence (OSINT) on a company
or domain.

options:
-h, --help show this help message and exit
-d DOMAIN, --domain DOMAIN
Company name or domain to search.
-l LIMIT, --limit LIMIT
Limit the number of search results, default=500.
-S START, --start START
Start with result number X, default=0.
-p, --proxies Use proxies for requests, enter proxies in
proxies.yaml.
-s, --shodan Use Shodan to query discovered hosts.
--screenshot SCREENSHOT
Take screenshots of resolved domains specify output
directory: --screenshot output_directory
-v, --virtual-host Verify host name via DNS resolution and search for
virtual hosts.
-e DNS_SERVER, --dns-server DNS_SERVER
DNS server to use for lookup.
-r, --take-over Check for takeovers.
-n, --dns-lookup Enable DNS server lookup, default False.
-c, --dns-brute Perform a DNS brute force on the domain.
-f FILENAME, --filename FILENAME
Save the results to an XML and JSON file.
-b SOURCE, --source SOURCE
anubis, baidu, bevigil, binaryedge, bing, bingapi,
bufferoverun, censys, certspotter, crtsh, dnsdumpster,
duckduckgo, fullhunt, github-code, hackertarget,
hunter, intelx, omnisint, otx, pentesttools,
projectdiscovery, qwant, rapiddns, rocketreach,
securityTrails, sublist3r, threatcrowd, threatminer,
urlscan, virustotal, yahoo, zoomeye

Let’s try a theHarvester scan on the domain google.com, using DuckDuckGo as the source and limiting it to the first 100 results.

theHarvester -d google.com -l 100 -b duckduckgo

There you go! Here are the results.

[*] Hosts found: 13
---------------------
accounts.google.com:142.250.192.237
apis.google.com:142.250.194.174
encrypted.google.com:142.250.194.142
encrypted.google.com:142.250.194.14
myaccount.google.com:142.250.194.14
ogs.google.com:142.250.194.142
play.google.com:142.250.193.206
policies.google.com:142.250.206.110
support.google.com:142.250.195.14
support.google.com:142.250.193.14
www.google.com:142.250.193.196
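Output in this host:ip form is easy to post-process. Here is a small sketch — my own helper, not part of theHarvester — that groups result lines like the ones above by hostname:

```python
def parse_hosts(lines):
    """Group theHarvester 'host:ip' result lines into {hostname: [ips]}."""
    hosts = {}
    for line in lines:
        line = line.strip()
        if ":" not in line:
            continue  # skip banners and separator lines
        host, ip = line.rsplit(":", 1)
        hosts.setdefault(host, []).append(ip)
    return hosts

sample = [
    "encrypted.google.com:142.250.194.142",
    "encrypted.google.com:142.250.194.14",
    "www.google.com:142.250.193.196",
]
print(parse_hosts(sample))
# → {'encrypted.google.com': ['142.250.194.142', '142.250.194.14'],
#    'www.google.com': ['142.250.193.196']}
```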

Now, it’s time for some special tools that you can use in day-to-day life.

1. WhatsMyName — This tool allows you to enumerate usernames across many websites.

2. Check-Host — Check-Host is a modern online tool for website monitoring and checking the availability of hosts, DNS records, and IP addresses. It supports the latest technologies, such as internationalized domain names (both punycode and original formats) and IPv6 hostname records (also known as AAAA records).

3. CyberChef — CyberChef is a simple, intuitive web app for carrying out all manner of “cyber” operations within a web browser. These operations include simple encoding like XOR and Base64, more complex encryption like AES, DES and Blowfish, creating binary and hexdumps, compression and decompression of data, calculating hashes and checksums, IPv6 and X.509 parsing, changing character encodings, and much more.

4. Google Dorks — an OSINT data-gathering method using clever Google search queries with advanced operators.

5. Shodan — a search engine for online devices and a way to get insight into any weaknesses they may have.

6. Maltego — an OSINT tool for gathering information and bringing it all together for graphical correlation analysis.

7. Metasploit — a powerful penetration testing tool that can find network vulnerabilities and even be used to exploit them.

8. Recon-ng — an open-source web reconnaissance tool developed in Python that continues to grow as developers contribute to its capabilities.

9. Aircrack-ng — a Wi-Fi network security testing and cracking tool that can be used both defensively and offensively to find compromised networks.
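A few of the CyberChef-style operations mentioned above — Base64 encoding, single-byte XOR, and hashing — can be reproduced in a few lines of Python’s standard library. The input string and the XOR key 0x2A are arbitrary examples of my own:

```python
import base64
import hashlib

data = b"OSINT"

b64 = base64.b64encode(data).decode()        # Base64 encoding
xored = bytes(b ^ 0x2A for b in data)        # single-byte XOR
digest = hashlib.sha256(data).hexdigest()    # SHA-256 checksum

print(b64)          # → T1NJTlQ=
print(xored.hex())
print(digest)

# XOR with the same key twice restores the original bytes:
assert bytes(b ^ 0x2A for b in xored) == data
```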

At last, we have finished the OSINT Analysis part of the Information Gathering tab of Kali Linux. We have discussed what OSINT is, active and passive recon, all the tools pre-installed for this purpose on our Kali machine, and some special tools too.

Be safe, be secure and happy hacking :)
