The UIT Information Security Laboratory

Phishing is a major cybersecurity threat that is becoming increasingly dangerous and sophisticated, especially during a global pandemic, when Internet users depend heavily on remote work and online communication. The challenge grows as high-tech crime increases alongside the rapid development of science and technology. Machine learning-based approaches to phishing detection have been explored and applied with significant results. However, visual similarity-based phishing detection techniques are hampered by the complexity of the visual feature extraction process. In this work, we propose a method that uses transfer learning with pre-trained image classification models to extract features, which then serve as input to machine learning algorithms for phishing website detection. Our method takes advantage of research results in computer vision, opening up prospects for their application in information security. In our experiments, we tested five deep learning image classification models combined with eleven machine learning algorithms on the task of classifying phishing websites based on visual similarity. The experimental results show that VGG16 works well with several machine learning algorithms on this task. Combining CNN image classification models with machine learning algorithms is therefore a promising approach to detecting phishing websites based on visual similarity.
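
The two-stage idea described above (a frozen pre-trained image model produces feature vectors, and a separate classifier is trained on them) can be illustrated with a minimal sketch. It is not the implementation evaluated in this work. The function names, the 224x224 RGB input size, the choice of global average pooling, and the nearest-centroid classifier (a simple stand-in for the machine learning algorithms, which are not named here) are all assumptions made for illustration. The VGG16 loader assumes TensorFlow is installed and downloads ImageNet weights on first use.

```python
from typing import Callable, Dict, Hashable, List, Sequence

Vector = List[float]
Extractor = Callable[[object], Vector]


def load_vgg16_extractor() -> Extractor:
    """Return a frozen VGG16 feature extractor (assumes TensorFlow is installed)."""
    import numpy as np
    from tensorflow.keras.applications import VGG16
    from tensorflow.keras.applications.vgg16 import preprocess_input

    # include_top=False drops the ImageNet classification head; pooling="avg"
    # reduces the last convolutional feature map to one 512-dimensional vector.
    model = VGG16(weights="imagenet", include_top=False, pooling="avg")
    model.trainable = False  # transfer learning: the backbone is not retrained

    def extract(image) -> Vector:
        # Assumption: `image` is an RGB array of shape (224, 224, 3), values 0-255.
        batch = np.expand_dims(np.asarray(image, dtype="float32"), axis=0)
        return model.predict(preprocess_input(batch), verbose=0)[0].tolist()

    return extract


class NearestCentroid:
    """Illustrative stand-in classifier: predicts the class with the closest mean vector."""

    def fit(self, features: Sequence[Vector], labels: Sequence[Hashable]) -> "NearestCentroid":
        sums: Dict[Hashable, Vector] = {}
        counts: Dict[Hashable, int] = {}
        for vec, label in zip(features, labels):
            acc = sums.setdefault(label, [0.0] * len(vec))
            for i, value in enumerate(vec):
                acc[i] += value
            counts[label] = counts.get(label, 0) + 1
        self.centroids = {
            label: [s / counts[label] for s in acc] for label, acc in sums.items()
        }
        return self

    def predict(self, features: Sequence[Vector]) -> List[Hashable]:
        def sq_dist(a: Vector, b: Vector) -> float:
            return sum((x - y) ** 2 for x, y in zip(a, b))

        return [
            min(self.centroids, key=lambda label: sq_dist(vec, self.centroids[label]))
            for vec in features
        ]


def train_detector(screenshots, labels, extractor: Extractor) -> NearestCentroid:
    """Stage 1: extract features with the frozen model. Stage 2: fit the classifier."""
    features = [extractor(image) for image in screenshots]
    return NearestCentroid().fit(features, labels)


def classify(detector: NearestCentroid, screenshots, extractor: Extractor) -> List[Hashable]:
    """Label new screenshots using the same extractor that was used for training."""
    return detector.predict([extractor(image) for image in screenshots])
```

With TensorFlow available, `train_detector(screenshots, labels, load_vgg16_extractor())` would train on labeled website screenshots. Because the extractor is passed in as a plain callable, a different pre-trained model or classifier can be swapped in without changing the pipeline, which mirrors the model-by-algorithm comparison described above.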

Leveraging Deep Learning Image Classifiers for Visual Similarity-based Phishing Website Detection