FPGA-Based Network Traffic Classification Using Machine Learning

Elnawawy, Mohammed

dc.contributor.advisor	Shanableh, Tamer
dc.contributor.advisor	Sagahyroon, Assim
dc.contributor.author	Elnawawy, Mohammed
dc.date.accessioned	2020-01-20T07:09:58Z
dc.date.available	2020-01-20T07:09:58Z
dc.date.issued	2019-11
dc.identifier.other	35.232-2019.45
dc.identifier.uri	http://hdl.handle.net/11073/16556
dc.description	A Master of Science thesis in Computer Engineering by Mohammed Elnawawy entitled, “FPGA-Based Network Traffic Classification Using Machine Learning”, submitted in November 2019. Thesis advisor is Dr. Tamer Shanableh and thesis co-advisor is Dr. Assim Sagahyroon. Soft copy is available (Thesis, Approval Signatures, Completion Certificate, and AUS Archives Consent Form).	en_US
dc.description.abstract	Traffic classification is the process of associating network traffic with the application or group of applications that generated it. It is an essential part of network management for datacentres and network operators because of its role in traffic shaping, bandwidth allocation, and cybersecurity. Researchers have investigated several techniques for classifying traffic accurately, with methods based on machine learning achieving encouraging results. In this work, we conduct several experiments using naïve Bayes, support vector machine, k-nearest neighbour, and random forest classifiers on two publicly available traffic datasets. The first dataset was collected in an uncontrolled environment that resembles real network behaviour, whereas the second was captured in a highly controlled environment. In these experiments, we examine the classifiers' performance in terms of classification accuracy and F-score. We also assess the suitability of the extracted features using feature selection techniques. Moreover, we determine the optimal percentage of packets within a flow that needs to be considered when extracting flow-level features. We observe that considering a larger number of packets improves classification performance but increases the processing delay. We therefore argue that using 60% of the packets in a flow is a good compromise between classification performance and processing time. Several graphs are generated during each experiment to investigate the effect of varying each parameter on the classification performance. The results of our experiments indicate that random forest outperforms all other algorithms, achieving a maximum accuracy of 98.5% and an F-score of 0.932.
Finally, since software-based classifiers are usually slow and hence incapable of coping with the increasing amount of traffic within congested networks, we implement a highly pipelined random forest classifier on a Field-Programmable Gate Array (FPGA). The implementation exploits the parallel architecture of the FPGA to accelerate this time-consuming task. The implemented design achieves an average throughput of 163.24 Gbps, which is more than twice the maximum throughput reported in previous work. This enables datacentres to achieve efficient online traffic classification given the dynamic nature of modern networks.	en_US
dc.description.sponsorship	College of Engineering	en_US
dc.description.sponsorship	Department of Computer Science and Engineering	en_US
dc.language.iso	en_US	en_US
dc.relation.ispartofseries	Master of Science in Computer Engineering (MSCoE)	en_US
dc.subject	Traffic classification	en_US
dc.subject	Machine learning	en_US
dc.subject	Random forest	en_US
dc.subject	Feature extraction	en_US
dc.subject	FPGA	en_US
dc.subject	Field-Programmable Gate Array (FPGA)	en_US
dc.title	FPGA-Based Network Traffic Classification Using Machine Learning	en_US
dc.type	Thesis	en_US

Files in this item

Name: 35.232-2019.45a Mohammed Elnaw ...
Size: 17.58Mb
Format: PDF

Name: 35.232-2019.45a Mohammed Elnaw ...
Size: 2.386Mb
Format: PDF

This item appears in the following Collection(s)

Masters Theses
