Phishing Email and URL Detection using Machine learning and Deep learning

M, Somesha

Please use this identifier to cite or link to this item: https://idr.l3.nitk.ac.in/jspui/handle/123456789/17720

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Pais, Alwyn Roshan	-
dc.contributor.author	M, Somesha	-
dc.date.accessioned	2024-04-29T09:57:05Z	-
dc.date.available	2024-04-29T09:57:05Z	-
dc.date.issued	2023	-
dc.identifier.uri	http://idr.nitk.ac.in/jspui/handle/123456789/17720	-
dc.description.abstract	The research thesis attempts to address the issue of email phishing, which poses a serious risk to businesses and corporations. Through the use of social engineering strategies, email phishing assaults persuade users to divulge personal data that can be exploited to access their digital assets. Despite the presence of several defenses, the Anti-Phishing Working Group survey reveals that the present approaches to phishing attack detection are still insufficient and ineffective. This underlines the requirement for a more effective system to identify phishing emails and offer greater protection against such assaults to the end user. Many machine learning based techniques exist for detecting phishing emails, but they use a large number of heuristics to classify an email. To overcome the disadvantages of existing schemes, we have presented an efficient word embedding cum machine learning framework to classify emails. The presented technique uses only four email header based heuristics (i.e. From, Return-path, Subject, and Message-ID). The model achieved a significant accuracy of 99.50% using the FastText-CBOW algorithm in combination with the Random Forest classifier. Although machine learning based techniques achieved significant accuracy, it is advisable to use deep learning models whenever sufficient data is available. We have presented an efficient deep learning model called "DeepEPhishNet" for the classification of emails. The presented model, based on FastText-SkipGram with a Deep Neural Network (DNN), achieved a significant accuracy of 99.52%, TPR of 99.38%, TNR of 99.92%, F-Score of 99.68%, Precision of 99.97%, and MCC of 98.71%. The above methods make use of only four email header based heuristics for the classification. To study the contribution of the email body in the detection of phishing emails, we have presented an efficient model using transformers. The presented model achieved an accuracy of 99.51% using open source datasets. The body of the email might contain phishing URLs, which may lead to a phishing attack. To address this, we have presented an efficient deep learning based model for phishing URL detection. The accuracies achieved by the DNN, LSTM, and CNN models are 99.52%, 99.57%, and 99.43%, respectively. Overall, this research thesis presents efficient techniques for detecting phishing emails and URLs using word embedding, deep learning, and machine learning classifiers.	en_US
dc.language.iso	en	en_US
dc.publisher	National Institute Of Technology Karnataka Surathkal	en_US
dc.title	Phishing Email and URL Detection using Machine learning and Deep learning	en_US
dc.type	Thesis	en_US
Appears in Collections:	1. Ph.D Theses

Files in This Item:

File	Description	Size	Format
187105-CO004-Somesha M.pdf		8.95 MB	Adobe PDF
