Identifying Phishing URLs using Cosine Similarity

Authors

  • Bhawna Sharma
  • Parvinder Singh

Abstract

Phishing is one of the serious problems facing the digital world, and it causes financial losses for businesses and individuals. Detecting phishing attacks with high accuracy has always been a difficult problem. A phishing site closely resembles its corresponding legitimate site in appearance, in order to deceive users into believing that they are browsing the right site. In this article, we introduce a cosine-similarity-based phishing detection technique that calculates the cosine similarity between test vectors and training vectors. A higher cosine similarity value indicates that the two vectors are more similar. The proposed technique is highly efficient. We test our technique using 100 URLs in the testing dataset and 300 URLs in the training dataset. Experiments show that the proposed technique classified the test data with 98.7% accuracy.
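The comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not state how URLs are turned into vectors or how similarity scores are mapped to a class, so everything beyond the cosine-similarity formula itself is an assumption made for illustration: the five URL features (number of dots, presence of "@", presence of an IP address, use of HTTPS, number of hyphens), the sample vectors, the helper name `classify`, and the rule of assigning the label of the most similar training vector.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Cosine similarity is undefined for a zero vector; treat as no similarity.
        return 0.0
    return dot / (norm_a * norm_b)


def classify(test_vec, training):
    """Assumed decision rule: return the label of the most similar training vector."""
    best_label, best_score = None, -1.0
    for vec, label in training:
        score = cosine_similarity(test_vec, vec)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Hypothetical feature vectors:
# [num_dots, has_at_symbol, has_ip_address, uses_https, num_hyphens]
training = [
    ([1, 0, 0, 1, 0], "legitimate"),
    ([4, 1, 1, 0, 2], "phishing"),
]

label, score = classify([3, 1, 0, 0, 2], training)
print(label, round(score, 3))
```

With these made-up vectors, the test URL is far closer in direction to the phishing example than to the legitimate one, so it receives the phishing label. Because cosine similarity depends only on the angle between vectors and not on their length, it favors features on comparable scales; a feature with a much larger range than the others would dominate the score.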

Published

2020-02-09

Section

Articles