Personally Identifiable Information (PII) refers to any sensitive data that could identify an individual, such as usernames, email addresses, phone numbers, and physical addresses.
Protecting PII is paramount to safeguarding individuals’ privacy and preventing identity theft, fraud, and other malicious activities. This article discusses a methodology for uncovering PII on websites using free and open-source tools.
At this point, many of you may have guessed that we are going to use the Wayback Machine for this purpose, but there is more to it. If the target has multiple subdomains, entering each one into the Wayback Machine by hand is not practical. This is where the tools below come in:
subfinder: A tool to discover (enumerate) subdomains of a target domain.
httprobe: A tool to check for live web servers on a list of hosts.
waybackurls: A tool to find archived URLs for a host using the Wayback Machine.
grep: A powerful command-line tool for searching text.
1. Subdomain Enumeration:
Command: subfinder -d example.in > examplesubdomain.txt
The initial step of our approach involves utilizing the “subfinder” tool to enumerate subdomains associated with a target domain. Taking the example of “example.in,” we execute the command “subfinder -d example.in > examplesubdomain.txt” to generate a list of subdomains and save the results to a file named examplesubdomain.txt.
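If you merge subfinder's results across several runs, or combine them with output from other enumeration tools, the list can contain duplicates. A quick cleanup pass keeps the later probing steps from doing repeated work. A minimal sketch, using hypothetical subdomain names for illustration:

```shell
# Hypothetical merged subdomain list (illustrative names only)
cat > examplesubdomain.txt <<'EOF'
blog.example.in
mail.example.in
blog.example.in
EOF

# Deduplicate and sort in place before probing
sort -u examplesubdomain.txt -o examplesubdomain.txt
cat examplesubdomain.txt
```

After this pass, each subdomain appears exactly once, so httprobe in the next step probes each host only one time.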
2. Separating the wheat (active web servers) from the chaff (inactive web servers):
Command: cat examplesubdomain.txt | httprobe > http.txt
We’re primarily interested in active web servers that might hold PII. Here’s where httprobe comes in: it takes the list of subdomains and reports which ones respond over HTTP or HTTPS. We pipe the contents of examplesubdomain.txt through httprobe to extract the live HTTP and HTTPS URLs, saving them in a file named http.txt for targeted investigation of active web resources.
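httprobe can list both the http:// and https:// variant of a host when it answers on both ports. If you want to examine one protocol at a time, the list is easy to split by scheme; the sample http.txt below is hypothetical:

```shell
# Hypothetical httprobe output for illustration
cat > http.txt <<'EOF'
http://blog.example.in
https://blog.example.in
https://mail.example.in
EOF

# Split live URLs by scheme for separate follow-up
grep '^https://' http.txt > https_live.txt
grep '^http://'  http.txt > http_live.txt
cat https_live.txt
```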
3. Wayback Machine Magic:
Command: cat http.txt | waybackurls > way.txt
The Wayback Machine is a goldmine for historical website data, and waybackurls leverages this archive to find archived URLs for the hosts listed in http.txt. By feeding the live URLs obtained earlier into this tool, we retrieve archived instances of web content, enabling a deeper examination of past site configurations and potential exposures. After executing the above command, the results are saved in a file named way.txt.
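waybackurls output is often very large and full of URLs that differ only in their query strings. Stripping everything after the “?” and deduplicating gives a quick overview of the distinct endpoints before keyword hunting. The way.txt contents below are hypothetical:

```shell
# Hypothetical waybackurls output
cat > way.txt <<'EOF'
https://example.in/login?user=a
https://example.in/login?user=b
https://example.in/home
EOF

# Strip query strings and deduplicate to see distinct endpoints
sed 's/?.*$//' way.txt | sort -u
```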
4. Searching for PII:
Now comes the exciting part — hunting for PII! We’ll use grep, a powerful text-searching tool. Here are some commands to get you started:
Search for URLs containing email addresses:
Command: cat way.txt | grep @gmail
Search for URLs containing keywords related to login or account information:
Command: cat way.txt | grep password
You can further refine your search using additional keywords like “mail”, “id”, “phone”, “mobile”, “invoice”, and “pay”.
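The individual grep calls above can also be combined into a single case-insensitive pass with grep -E, saving the matches for later review. A minimal sketch, where the contents of way.txt and the output filename filtered.txt are hypothetical:

```shell
# Hypothetical archived-URL list
cat > way.txt <<'EOF'
https://example.in/home
https://example.in/reset?email=user@gmail.com
https://example.in/pay/invoice?id=42
EOF

# One case-insensitive pass over all keywords of interest
grep -Ei 'password|@gmail|mail|phone|mobile|invoice|pay' way.txt > filtered.txt
cat filtered.txt
```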
Redirect the grep output to a file (for example, cat way.txt | grep password > filtered.txt) and then try accessing the filtered URLs. With luck, you may find a PII leak right away, which you can report and earn a bounty for.
There are multiple ways to patch PII leakage.
Data Encryption: Encrypt PII both in transit and at rest to prevent unauthorized access. Utilize strong encryption algorithms and ensure keys are managed securely.
Access Control: Implement strict access controls to limit who can view and modify PII. Use role-based access control (RBAC) to grant permissions based on job roles and responsibilities.
User Authentication and Authorization: Implement multi-factor authentication (MFA) and strong password policies to prevent unauthorized access to systems containing PII. Additionally, enforce least-privilege principles to restrict access to only those who need it.
NOTE: Make sure to test only on sites where testing is permitted, and carefully read and follow the site's testing guidelines.