IEEE Access, a Q1 journal (ranked in the top 25% of the most prestigious journals in Engineering and Computer Science), recently published the research article “A study on adversarial sample resistance and defense mechanism for multimodal learning-based phishing website detection” by a group of Information Security students from the Faculty of Computer Networks and Communications (Khoa MMT-TT), University of Information Technology (Trường ĐH CNTT).
This scientific work builds an attack framework and a defense mechanism for multimodal models that detect phishing websites. The students carried it out at the Information Security Laboratory (PTN ATTT) as part of their graduation thesis (KLTN) over the past year.
Paper title:
- A study on adversarial sample resistance and defense mechanism for multimodal learning-based phishing website detection
Students:
- Võ Quang Minh - Honors Program in Information Security (ATTT), class of 2020
- Bùi Tấn Hải Đăng - Honors Program in Information Security (ATTT), class of 2020
Supervisors:
- MSc. Phan Thế Duy
- Dr. Phạm Văn Hậu
Abstract:
In recent years, the advancement of Artificial Intelligence (AI) has significantly impacted various fields, particularly cybersecurity. However, current approaches to combating cyber threats, such as phishing attacks, remain limited by their inability to address evolving vulnerabilities in online systems effectively. Despite this challenge, extensive research has demonstrated the efficacy of learning-based models, notably Machine Learning (ML), Ensemble Learning (EL), and Deep Learning (DL), in developing defensive mechanisms against these threats. However, these methods encounter challenges when dealing with adversarial examples (AEs). Multimodal models (MMs) have emerged as a promising approach to address this issue. Despite their potential, there is a notable lack of research employing multimodal techniques for phishing website detection (PWD), especially in the context of adversarial websites. To tackle this challenge, this paper assesses 15 learning-based models, particularly multimodal ones, for phishing and adversarial detection, aiming to enhance their defense capabilities. Due to the scarcity of adversarial website examples, training and testing models are limited. Therefore, this study proposes an innovative attack framework AWG - Adversarial Website Generation that employs Generative Adversarial Networks (GAN) and transfer-based black box attacks to create AEs. This framework closely mirrors real-world attack scenarios, ensuring high effectiveness and realism. Finally, we present defense strategies with straightforward implementation and high effectiveness to enhance the resistance of models. The models underwent training and testing on a dataset collected from reputable sources such as OpenPhish, PhishTank, Phishing Database, and Alexa. This approach was chosen to ensure the dataset’s diversity and relevance to reflect real-world conditions.
Experimental results highlight that the Generator’s effectiveness is demonstrated by a domain structure generation rate exceeding 90%. Moreover, AEs generated by this Generator effectively bypass most state-of-the-art ML, DL, and EL models with an evasion rate of up to 88%. Notably, the Support Vector Machine (SVM) model is the most vulnerable, with a detection rate of only 10.02%. On the other hand, the Multimodal model Shark-Eyes demonstrates outstanding resistance against AEs, with a detection rate of up to 99%. Upon applying our defense strategy, the resistance of models is significantly boosted, with all detection rates surpassing 90%. These findings underscore the robustness of our methods and pave the way for further exploration into advanced attack and defense strategies in the context of phishing website detection and adversarial attacks.
Full text of the paper: https://ieeexplore.ieee.org/document/10620285