Show simple item record

dc.contributor.advisor: Shanableh, Tamer
dc.contributor.author: Hassan, Mahitab Alaaeldin
dc.date.accessioned: 2017-06-15T07:23:38Z
dc.date.available: 2017-06-15T07:23:38Z
dc.date.issued: 2017-05
dc.identifier.other: 35.232-2017.19
dc.identifier.uri: http://hdl.handle.net/11073/8879
dc.description: A Master of Science thesis in Computer Engineering by Mahitab Alaaeldin Hassan entitled, "Predicting Compression Modes and Split Decisions for HEVC Video Coding Using Machine Learning Techniques," submitted in May 2017. Thesis advisor is Dr. Tamer Shanableh. Soft and hard copy available. [en_US]
dc.description.abstract: The High Efficiency Video Coding (HEVC) standard delivers a substantial improvement in video compression efficiency at the expense of increased computational complexity. This improvement is primarily due to the introduction of flexible quadtree-based partitioning structures for motion estimation (ME) and image transformation. However, finding the optimum coding structure requires an exhaustive rate-distortion optimization (RDO) process, which is what drives up the computational complexity. In this thesis, we propose a set of early termination algorithms that reduce HEVC encoding complexity by predicting both the split decisions of Coding Units (CUs) and the coding modes of Prediction Units (PUs). A video sequence-dependent approach is used, in which frames belonging to the video being encoded are used to generate a classification model. At each CU depth level, features representing the given CU are extracted from both the current and previously encoded CUs. The feature vectors (FVs) are then used to generate dimensionality reduction and classification models. These models are in turn used at each coding depth to predict the split and mode decisions of subsequent CUs. In this work, we use stepwise regression, random forest feature importance, and Principal Component Analysis (PCA) for dimensionality reduction. Moreover, polynomial networks, random forests, and J48 decision trees are used for classification. Using seventeen video sequences with four different spatial resolution classes, the proposed solution is assessed in terms of classification accuracy, Bjøntegaard Delta bitrate (BD-rate), BD Peak Signal-to-Noise Ratio (BD-PSNR), and computational complexity reduction (CCR). On average, the CU early termination scheme achieved a CCR of 38.5% with an average classification accuracy of 78.1% at a negligible cost of 0.539% and -0.021 dB in terms of BD-rate and BD-PSNR, respectively. The PU early termination scheme attained an overall CCR of 20.9% with an average classification accuracy of 86.5% at the cost of a BD-rate of 0.248% and a BD-PSNR of -0.01 dB. When jointly implemented, an overall CCR of 50.1% was achieved with a BD-rate increase of 2% and a BD-PSNR decrease of 0.079 dB. [en_US]
dc.description.sponsorship: College of Engineering [en_US]
dc.description.sponsorship: Department of Computer Science and Engineering [en_US]
dc.language.iso: en_US [en_US]
dc.relation.ispartofseries: Master of Science in Computer Engineering (MSCoE) [en_US]
dc.subject: Video coding [en_US]
dc.subject: HEVC (High Efficiency Video Coding) [en_US]
dc.subject: Machine learning [en_US]
dc.subject.lcsh: Video compression [en_US]
dc.subject.lcsh: Machine learning [en_US]
dc.title: Predicting Compression Modes and Split Decisions for HEVC Video Coding Using Machine Learning Techniques [en_US]
dc.type: Thesis [en_US]

