Recent advancements in Artificial Intelligence (AI) have greatly impacted cybersecurity, particularly the detection of phishing websites. Traditional methods struggle to keep pace with evolving vulnerabilities, whereas research shows that Machine Learning (ML), Ensemble Learning (EL), and Deep Learning (DL) are effective in building defenses. However, these methods remain vulnerable to adversarial examples (AEs). Multimodal models (MMs) are a promising solution, yet there is a significant lack of research applying multimodal techniques to phishing website detection (PWD) against adversarial websites. To tackle this challenge, this paper assesses 15 learning-based models, particularly multimodal ones, for phishing and adversarial detection, aiming to enhance their defense capabilities. Because adversarial websites are scarce, the training and testing of models are limited. Therefore, this study proposes an innovative attack framework, AWG (\textbf{A}dversarial \textbf{W}ebsite \textbf{G}eneration), which employs Generative Adversarial Networks (GANs) and transfer-based black-box attacks to create AEs. This framework closely mirrors real-world attack scenarios, ensuring high effectiveness and realism. Finally, we present defense strategies that are straightforward to implement and highly effective at strengthening model resistance. The models were trained and tested on a dataset collected from reputable sources such as OpenPhish, PhishTank, Phishing Database, and Alexa, chosen to ensure the dataset's diversity and its relevance to real-world conditions. Experimental results show that the Generator achieves a domain structure generation rate exceeding 90\%. Moreover, AEs produced by this Generator bypass most state-of-the-art ML, DL, and EL models, with an evasion rate of up to 88\%.
Notably, the Support Vector Machine (SVM) model is the most vulnerable, with a detection rate of only 10.02\%. In contrast, the multimodal model Shark-Eyes demonstrates outstanding resistance against AEs, with a detection rate of up to 99\%. After our defense strategy is applied, the resistance of the models improves significantly, with all detection rates surpassing 90\%. These findings underscore the robustness of our methods and pave the way for further exploration of advanced attack and defense strategies in phishing and adversarial website detection.