Context

Tabnabbing attacks exploit how users behave in web browsers: the attacker alters the content of an inactive tab so that it appears to be a legitimate page, which can lead the user to disclose data or take unintended actions.

Approach

This research evaluates the effectiveness of reinforcement learning (RL) in detecting tabnabbing attacks at the web browser level, presenting a proactive defense mechanism against this cyber threat.

Dataset

A publicly available dataset (Phishpedia) was used to train the model. The dataset contains both phishing and legitimate websites.

  • 48,046 data instances: 19,388 legitimate (about 40%) and 28,658 phishing (about 60%)

Features of the model

  • Title
  • IP address in domain
  • URL length
  • Presence of '@' in the URL
  • Redirection using the '//' symbol
  • Prefix or suffix separated by the '-' symbol
  • Subdomains and multiple subdomains in the URL
  • Favicon URL
  • Percentage of request URLs
  • Percentage of links in <meta>, <script>, and <link> tags
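
As an illustration of how the lexical URL features in the list above might be computed, here is a minimal sketch. The function name `url_features`, the dictionary keys, and the exact rules (for example, counting dots to estimate subdomains) are assumptions made for this example and are not taken from the study.

```python
import re
from urllib.parse import urlparse


def url_features(url: str) -> dict:
    """Extract a few lexical URL features (hypothetical helper, illustrative rules)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""

    # Host made only of four dot-separated number groups: treat as a raw IPv4 address.
    is_ip = bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host))

    # Everything after the scheme separator; a further "//" hints at a redirect.
    rest = url.split("://", 1)[-1]

    # Naive subdomain estimate: dots beyond "domain.tld".
    # This ignores multi-part public suffixes such as "co.uk".
    subdomains = 0 if is_ip else max(host.count(".") - 1, 0)

    return {
        "ip_in_domain": is_ip,
        "url_length": len(url),
        "has_at": "@" in url,
        "double_slash_redirect": "//" in rest,
        "hyphen_in_domain": "-" in host,
        "subdomain_count": subdomains,
    }


if __name__ == "__main__":
    print(url_features("http://user@evil-site.example.com//redirect"))
```

The page-level features (title, favicon URL, request URLs, and links in tags) require fetching and parsing the page itself and are left out of this sketch.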

A random forest was used to rank the initial feature set by importance, and the five most important features were selected.
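
One common way to perform this kind of selection is to fit a random forest and keep the features with the highest impurity-based importances. The sketch below uses scikit-learn on synthetic stand-in data. The helper `top_k_features`, the feature names `f0` to `f9`, and the hyperparameters are assumptions for illustration, not the study's actual configuration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def top_k_features(X, y, names, k=5, seed=0):
    """Return the k feature names with the highest random forest importance."""
    forest = RandomForestClassifier(n_estimators=100, random_state=seed)
    forest.fit(X, y)
    # Sort feature indices by importance, largest first, and keep the top k.
    order = np.argsort(forest.feature_importances_)[::-1][:k]
    return [names[i] for i in order]


# Synthetic stand-in data: 10 features, of which only f0 and f3 determine the label.
rng = np.random.default_rng(0)
X = rng.random((400, 10))
y = (X[:, 0] + X[:, 3] > 1.0).astype(int)
names = [f"f{i}" for i in range(10)]

selected = top_k_features(X, y, names, k=5)

if __name__ == "__main__":
    print(selected)
```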

The RL agent was trained with the Deep Q-Network (DQN) algorithm, which combines Q-learning with a deep neural network that approximates the Q-value (action value) of each state-action pair.
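
The core of DQN is the temporal-difference target that the network's Q-value estimates are regressed toward. The sketch below computes that target for a small batch with NumPy. The two-action setup (0 = legitimate, 1 = phishing), the reward values, and the discount factor are assumptions for illustration; the source does not specify the state, action, or reward design.

```python
import numpy as np


def dqn_targets(rewards, next_q_values, dones, gamma=0.99):
    """Standard DQN target: r + gamma * max_a' Q(s', a'), with no bootstrap on terminal steps."""
    rewards = np.asarray(rewards, dtype=float)
    next_q_values = np.asarray(next_q_values, dtype=float)
    dones = np.asarray(dones, dtype=float)
    # (1 - done) zeroes out the bootstrapped term when the episode has ended.
    return rewards + gamma * (1.0 - dones) * next_q_values.max(axis=1)


# Hypothetical batch of two transitions with two actions each
# (assumed: +1 reward for a correct classification, -1 for an incorrect one).
rewards = [1.0, -1.0]
next_q = [[0.5, 2.0], [3.0, 1.0]]  # Q(s', a') for each next state
dones = [0, 1]                     # second transition ends its episode

targets = dqn_targets(rewards, next_q, dones, gamma=0.9)

if __name__ == "__main__":
    print(targets)
```

In a full DQN implementation, `next_q_values` would come from a separate target network and the targets would feed a regression loss on the online network's predictions for the actions actually taken.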

Results

  • During the testing phase, the RL agent achieved an accuracy of 83% in classifying websites as phishing or legitimate
  • Some false negatives were present, i.e., some phishing websites were classified as legitimate
  • The AUC-ROC of 0.85 (85%) indicates that the agent separates phishing from legitimate websites well above the chance level of 0.5
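
To make the reported metrics concrete, the following sketch shows how accuracy, the false-negative count, and AUC-ROC can be computed from labels and scores. The toy labels and scores are invented purely for illustration and are unrelated to the study's results. Phishing is assumed to be the positive class (label 1).

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn), with label 1 (phishing) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def roc_auc(y_true, scores):
    """AUC-ROC as the probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    # Ties between a positive and a negative score count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (illustrative only): 1 = phishing, 0 = legitimate.
y_true = [1, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.7, 0.3, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
auc = roc_auc(y_true, scores)

if __name__ == "__main__":
    print(f"accuracy={accuracy:.2f}, false negatives={fn}, AUC-ROC={auc:.3f}")
```

A false negative here is a phishing page scored below the decision threshold, which is the costliest error type for a browser-level defense because the attack goes unflagged.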

References