Machine learning technique detects phishing sites based on markup visualization


Ben Dickson 08 September 2021 at 10:50 UTC

Researchers showcase new method for improving the detection of fake websites

Machine learning models trained on the visual representation of website code can help improve the accuracy and speed of detecting phishing websites.

This is according to a paper by security researchers at the University of Plymouth and the University of Portsmouth, UK.

The researchers aim to address the shortcomings of existing detection methods, which are either too slow or not accurate enough.

Turning web code into images

The technique developed by the researchers uses “binary visualization” libraries to transform the markup and code of web pages into images.

Using this method, they created a dataset of images of legitimate and phishing websites.
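As a rough illustration of the idea, the following sketch lays out a page's raw bytes as a grayscale pixel grid. It is a simplified stand-in under stated assumptions, not the researchers' tool: the row-by-row layout, the default width of 64 pixels, and the helper names `bytes_to_grid` and `grid_to_pgm` are all hypothetical choices made for this example.

```python
import math
from typing import List


def bytes_to_grid(data: bytes, width: int = 64) -> List[List[int]]:
    """Lay out raw bytes row by row as 0-255 grayscale pixel values.

    Assumption: a plain row-major layout; the last row is zero-padded.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    rows = max(1, math.ceil(len(data) / width))
    padded = data.ljust(rows * width, b"\x00")
    return [list(padded[r * width:(r + 1) * width]) for r in range(rows)]


def grid_to_pgm(grid: List[List[int]]) -> bytes:
    """Serialize the grid as a binary PGM (P5) image, a simple grayscale format."""
    height, width = len(grid), len(grid[0])
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    return header + bytes(value for row in grid for value in row)
```

Calling `grid_to_pgm(bytes_to_grid(markup))` on a page's markup bytes yields an image that most viewers can open, so two pages with different code produce visibly different pixel patterns.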

Figure: Visual differences between the legitimate PayPal login page and a phishing equivalent (Barlow et al., 'A Novel Approach to Detect Phishing Attacks using Binary Visualisation and Machine Learning')

The dataset was then used to train a machine learning model to classify legitimate and phishing websites based on the differences in their binary visualization.
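To make the training step concrete, here is a minimal sketch of fitting a binary classifier on labeled feature vectors. It uses a single logistic unit trained by stochastic gradient descent, which is the smallest possible stand-in for a neural network and not the architecture from the paper. The function names, hyperparameters, and the two-value toy vectors in the usage note are assumptions made purely for illustration.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights: Sequence[float], bias: float, x: Sequence[float]) -> float:
    """Return a score in (0, 1); higher means 'more likely phishing'."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_logistic(
    samples: Sequence[Sequence[float]],
    labels: Sequence[int],
    epochs: int = 200,
    lr: float = 0.5,
) -> Tuple[List[float], float]:
    """Fit one logistic unit with plain stochastic gradient descent.

    labels: 1 for phishing, 0 for legitimate (an assumed convention).
    """
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = predict(weights, bias, x) - y
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias
```

With made-up toy vectors such as `[[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]` and labels `[1, 1, 0, 0]`, the fitted unit scores the first pair above 0.5 and the second pair below it. In practice the inputs would be pixel values from the binary visualization images rather than two hand-picked numbers.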

To test a new website, the target webpage’s code is transformed through binary visualization and run through the trained model.
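The test-time flow described above might look like the following sketch. The fixed feature length of 1024, the 0.5 decision threshold, and the names `to_fixed_features` and `classify_page` are hypothetical. The `model` argument is any callable that returns a phishing score between 0 and 1; it stands in for the trained network, whose actual interface the article does not describe.

```python
from typing import Callable, List, Sequence


def to_fixed_features(markup: bytes, size: int = 1024) -> List[float]:
    """Resample the page bytes to a fixed-length vector scaled to [0, 1].

    Assumption: nearest-neighbor sampling, so pages of any length map to
    the same input size a model would expect.
    """
    if not markup:
        return [0.0] * size
    n = len(markup)
    return [markup[i * n // size] / 255.0 for i in range(size)]


def classify_page(
    markup: bytes,
    model: Callable[[Sequence[float]], float],
    threshold: float = 0.5,
) -> str:
    """Visualize the page code as features, score it, and label the result."""
    score = model(to_fixed_features(markup))
    return "phishing" if score >= threshold else "legitimate"
```

Keeping the model behind a plain callable means the same pipeline works whether the scoring runs in a browser extension, on a mail gateway, or in a test harness with a dummy function.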

Accurate detection of phishing websites

According to the researchers’ experiments, the model reached 94% accuracy in detecting phishing websites. Because it uses a very small neural network, it can run on user devices and provide near-real-time results.

“We have tested the technique with actual phishing and legit sites,” Stavros Shiaeles, one of the paper’s co-authors, told The Daily Swig.

This is not the first time that binary visualization and machine learning have been used in cybersecurity. In 2019, Shiaeles, who is a cybersecurity lecturer at the University of Portsmouth, co-authored research on another technique that used ML and binary visualization to detect malware with promising results.

After testing the phishing website detection system, the team is now taking the next step to make the technique ready for adoption.

“We are working on a new extended method and we are trying to apply for a patent,” Shiaeles said. “Based on the results we initially have I don't see the point not to be adopted. The accuracy is 100%.”
