A Master of Science thesis in Computer Engineering by Murad Mohammad Qasaimeh entitled "FPGA-based Parallel Hardware Architecture for Real-time Object Classification," submitted in June 2014. The thesis advisor is Dr. Tamer Shanableh and the co-advisor is Dr. Assim Sagahyroon. Both soft and hard copies of the thesis are available.
Object detection is one of the most important tasks in computer vision. It has applications in many fields, such as face detection, video surveillance and traffic sign recognition. Most of these applications are subject to real-time performance constraints. However, current implementations of object detection algorithms are computationally intensive and fall far short of real-time performance. The problem is further aggravated in embedded systems, where most of these applications are deployed. The high computational complexity makes implementing an embedded object detection system with real-time performance a challenging task. Consequently, there is a strong need for dedicated hardware architectures capable of delivering high detection accuracy within an acceptable processing time given the available hardware resources. The presented work investigates the feasibility of implementing an object detection system on a Field Programmable Gate Array (FPGA) platform as a candidate solution for achieving real-time performance in embedded applications. A parallel hardware architecture that accelerates the execution of three algorithms is proposed. The algorithms are: Scale Invariant Feature Transform (SIFT) feature extraction, Bag of Features (BoF) and Support Vector Machine (SVM). The proposed architecture exploits different forms of parallelism inherent in these algorithms to meet real-time constraints. A prototype of the proposed architecture is implemented on an FPGA platform and evaluated using two benchmark datasets. On average, the feature extraction algorithm achieved a speedup of 55.06 times over its pure software implementation. The speedup achieved for the classification algorithm was 6.64 times. The difference in classification accuracy between our architecture and the software implementation was less than 3%.
In comparison to existing hardware solutions, our proposed hardware architecture can detect an additional 380 SIFT features in real time. Additionally, the hardware resources utilized by our architecture are less than those required by existing solutions.
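For readers unfamiliar with the classification stages named in the abstract, the following is a minimal software sketch of how a Bag of Features histogram can feed an SVM decision. It is an illustration only and not the thesis's hardware design: the function names, the toy 2-dimensional descriptors (standard SIFT descriptors are 128-dimensional), the two-word codebook, and the use of a linear SVM with hand-picked weights are all assumptions made for the example.

```python
def nearest_word(descriptor, codebook):
    """Return the index of the codebook entry closest to the descriptor
    (squared Euclidean distance)."""
    best_index, best_dist = 0, float("inf")
    for i, word in enumerate(codebook):
        dist = sum((d - w) ** 2 for d, w in zip(descriptor, word))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


def bof_histogram(descriptors, codebook):
    """Assign each descriptor to its nearest visual word and return the
    normalized word-frequency histogram."""
    hist = [0.0] * len(codebook)
    for descriptor in descriptors:
        hist[nearest_word(descriptor, codebook)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def svm_decide(histogram, weights, bias):
    """Linear SVM decision (an assumption for this sketch):
    the sign of w . x + b, returned as +1 or -1."""
    score = sum(w * h for w, h in zip(weights, histogram)) + bias
    return 1 if score >= 0 else -1


# Hypothetical toy data: two visual words and four 2-D descriptors.
codebook = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [9.0, 10.0], [10.0, 9.0], [0.0, 1.0]]

histogram = bof_histogram(descriptors, codebook)
label = svm_decide(histogram, weights=[1.0, -1.0], bias=0.1)
```

In this sketch each descriptor's nearest-word search is independent of the others, and the distance to each codebook entry can be computed independently as well. Loops of this kind are natural candidates for the parallel evaluation that a hardware architecture can provide.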