Department of Computer Science and Engineering
http://hdl.handle.net/11073/8816

Two-Stage Deep Learning Solution for Continuous Arabic Sign Language Recognition Using Word Count Prediction and Motion Images
Shanableh, Tamer
http://hdl.handle.net/11073/25399
2023-01-01
Recognition of continuous sign language is challenging because the number of words in a sentence and their boundaries are unknown during the recognition stage. This work proposes a two-stage solution in which the number of words in a sign language sentence is predicted in the first stage. The sentence is then temporally segmented accordingly, and each segment is represented as a single image using a novel solution that entails summation of frame differences computed with motion estimation and compensation. This results in a single image representation per sign language word, referred to as a motion image. CNN transfer learning is used to convert each of these motion images into a feature vector, which is used for either model generation or sign language recognition. As such, two deep learning models are generated: one for predicting the number of words per sentence and the other for recognizing the meaning of the sign language sentences. The proposed approach of predicting the number of words per sentence and then splitting the sentence into equal segments worked well, even though each motion image can contain traces of preceding or succeeding words. This byproduct of the proposed solution is in fact advantageous, as it puts words into context, which helps explain the excellent sign language recognition rates reported. It is shown that bidirectional LSTM layers result in the most accurate models for both stages. In the experimental results section we use an existing dataset that contains 40 sentences generated from 80 sign language words. The experiments revealed that the proposed solution achieved word and sentence recognition rates of 97.3% and 92.6%, respectively. The percentage increases over the best results reported in the literature for the same dataset are 1.8% and 9.1% for word and sentence recognition, respectively.
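To make the motion-image idea concrete, the sketch below accumulates motion-compensated frame differences over one temporal segment, and splits a sentence into equal segments based on the predicted word count. This is a minimal illustration under stated assumptions, not the paper's implementation: OpenCV's dense Farneback optical flow stands in for the block-based motion estimation and compensation described in the abstract, and the names motion_image and segment_motion_images are introduced here.

    import cv2
    import numpy as np

    def motion_image(frames):
        """Sum motion-compensated frame differences over one segment.

        Farneback dense optical flow is a stand-in for the paper's
        block-based motion estimation and compensation.
        """
        prev = cv2.cvtColor(frames[0], cv2.COLOR_BGR2GRAY)
        acc = np.zeros(prev.shape, dtype=np.float64)
        for frame in frames[1:]:
            cur = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            # Estimate flow from the current frame back to the previous
            # one, so that remapping the previous frame yields a
            # motion-compensated prediction of the current frame.
            flow = cv2.calcOpticalFlowFarneback(cur, prev, None,
                                                0.5, 3, 15, 3, 5, 1.2, 0)
            h, w = cur.shape
            gx, gy = np.meshgrid(np.arange(w), np.arange(h))
            map_x = (gx + flow[..., 0]).astype(np.float32)
            map_y = (gy + flow[..., 1]).astype(np.float32)
            compensated = cv2.remap(prev, map_x, map_y, cv2.INTER_LINEAR)
            # Accumulate the residual, i.e. the frame difference that
            # remains after motion compensation.
            acc += np.abs(cur.astype(np.float64) - compensated)
            prev = cur
        # Normalize the accumulated residuals into a single 8-bit image,
        # ready for feature extraction by a pretrained CNN.
        return cv2.normalize(acc, None, 0, 255,
                             cv2.NORM_MINMAX).astype(np.uint8)

    def segment_motion_images(frames, predicted_word_count):
        """Split a sentence into equal temporal segments, one per
        predicted word, and build one motion image per segment."""
        segments = np.array_split(np.arange(len(frames)),
                                  predicted_word_count)
        return [motion_image([frames[i] for i in seg]) for seg in segments]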
Video-Based Recognition of Human Activity Using Novel Feature Extraction Techniques
Issa, Obada; Shanableh, Tamer
http://hdl.handle.net/11073/25298
2023-06-05
This paper proposes a novel approach to activity recognition in which videos are compressed using video coding to generate feature vectors based on compression variables. We propose to eliminate the temporal domain of the feature vectors by computing the mean and standard deviation of each variable across all video frames; thus, each video is represented by a single feature vector of 67 variables. As for the motion vectors, we eliminate their temporal domain by projecting their phases using PCA, representing each video by a single feature vector with a length equal to the number of frames in the video. Consequently, complex classifiers such as LSTMs can be avoided and classical machine learning techniques can be used instead. Experimental results on the JHMDB dataset yielded average classification accuracies of 68.8% and 74.2% when using the projected phases of motion vectors and the video coding feature variables, respectively. The advantage of the proposed solution is the use of feature vectors with low dimensionality together with simple machine learning techniques.
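The two temporal-elimination steps can be sketched as follows, assuming a per-frame matrix of coding variables and a per-frame matrix of motion-vector phases (hypothetical shapes, since the exact variables come from the codec's bit stream and are not listed in the abstract). Note that concatenating per-variable means and standard deviations gives twice the raw variable count, whereas the paper reports a final 67-variable vector; the exact composition is an assumption here. The PCA projection onto a single principal component yields one value per frame, matching the abstract's description.

    import numpy as np
    from sklearn.decomposition import PCA

    def coding_feature_vector(frame_features):
        """Collapse an (n_frames, n_vars) matrix of per-frame coding
        variables into one video-level vector by concatenating the
        per-variable mean and standard deviation across frames."""
        return np.concatenate([frame_features.mean(axis=0),
                               frame_features.std(axis=0)])

    def projected_mv_phases(phases):
        """Project an (n_frames, n_blocks) matrix of motion-vector
        phase angles onto the first principal component, yielding a
        vector whose length equals the number of frames."""
        return PCA(n_components=1).fit_transform(phases).ravel()

Either vector can then be fed to a classical classifier such as an SVM or a random forest, which is what allows sequence models like LSTMs to be avoided.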
Static Video Summarization Using Video Coding Features with Frame-level Temporal Sub-Sampling and Deep Learning
Issa, Obada; Shanableh, Tamer
http://hdl.handle.net/11073/25249
2023-01-01
Digital video content is abundant due to the cloud's phenomenal growth and the volume of security footage; it is therefore essential to summarize these videos in data centers. This paper offers innovative approaches to the problem of key-frame extraction for the purpose of video summarization. Our approach uses feature variables extracted from the bit streams of coded videos, followed by optional stepwise regression for dimensionality reduction. Once the features are extracted and reduced in dimensionality, we apply innovative frame-level temporal sub-sampling techniques followed by training and testing using deep learning architectures. The frame-level temporal sub-sampling techniques are based on cosine similarity and PCA projections of the feature vectors. We create three different learning architectures utilizing LSTM networks, 1D-CNN networks, and random forests. The four most popular video summarization datasets, namely TVSum, SumMe, OVP, and VSUMM, are used to evaluate the accuracy of the proposed solutions in terms of precision, recall, F-score, and computational time. The proposed solutions, when trained and tested on all subjective user summaries, achieved F-scores of 0.79, 0.74, 0.88, and 0.81, respectively, on the aforementioned datasets, showing clear improvements over prior studies.
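As an illustration of the cosine-similarity variant of the frame-level temporal sub-sampling, the sketch below keeps a frame only when its feature vector has drifted sufficiently from the last kept frame. The function name and the threshold value are assumptions introduced for illustration; the abstract does not specify them.

    import numpy as np

    def cosine_subsample(frame_features, threshold=0.98):
        """Temporally sub-sample frames by cosine similarity.

        frame_features: (n_frames, n_vars) matrix of per-frame features
        extracted from the coded bit stream. A frame is kept only if
        its cosine similarity to the most recently kept frame falls
        below `threshold` (an assumed hyper-parameter). Returns the
        indices of the kept frames.
        """
        kept = [0]  # always keep the first frame
        for i in range(1, len(frame_features)):
            a = frame_features[kept[-1]]
            b = frame_features[i]
            sim = a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
            if sim < threshold:
                kept.append(i)
        return kept

The surviving frames (and their features) are then passed to the LSTM, 1D-CNN, or random-forest stage for key-frame selection.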
Assessing test suites of extended finite state machines against model and code based faults
El-Fakih, Khaled; Alzaatreh, Ayman; Turker, Uraz Cengiz
http://hdl.handle.net/11073/25061
2021-01-01
Tests can be derived from extended finite state machine (EFSM) specifications considering the coverage of single-transfer faults, all transitions using a transition tour, all-uses, edge-pair, and prime path with side trips. We provide novel empirical assessments of the effectiveness of these test suites. The first assessment determines, for each pair of test suites, whether there is a difference between the pair in covering EFSM faults of six EFSM specifications; if the difference is significant, we determine which test suite outperforms the other. The second assessment is similar to the first but is carried out against code faults of 12 Java implementations of the specifications. In addition, two assessments are provided to determine whether test suites have better coverage of certain classes of EFSM (or code) faults than others. The evaluation uses proper data transformation of mutation scores and p-value adjustments for controlling the Type I error due to multiple tests. Furthermore, we show that subsuming mutants have an impact on the mutation scores of both EFSM and code faults; accordingly, we use a score that removes them so as not to invalidate the obtained results. The assessments show that all-uses tests were outperformed by all other tests; transition tours outperformed both edge-pair and prime path with side trips; and single-transfer fault tests outperformed all other test suites. Similar results were obtained over the considered EFSM and code fault domains, and there were no significant differences between the test suites' coverage of different classes of EFSM and code faults.
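To make the mutation-score adjustment concrete, here is a minimal sketch, assuming a Boolean kill matrix (tests x mutants) as input. It uses one common formulation of subsumption, in which a killed mutant subsumes another when every test that kills the former also kills the latter; the paper's exact definition and procedure may differ. Names such as dominator_mutants and suite_score are introduced here for illustration.

    import numpy as np

    def dominator_mutants(kill_matrix):
        """Indices of non-redundant (subsuming/dominator) mutants.

        kill_matrix: Boolean (n_tests, n_mutants) array where entry
        (i, j) is True iff test i kills mutant j. Subsumed and
        duplicate mutants are dropped so they do not inflate mutation
        scores; never-killed mutants are excluded as potentially
        equivalent.
        """
        kill_sets = [frozenset(np.flatnonzero(col)) for col in kill_matrix.T]
        keep = []
        for j, kj in enumerate(kill_sets):
            if not kj:
                continue  # never killed: excluded
            # Drop j if a strictly harder-to-kill mutant subsumes it,
            # or if an earlier mutant has an identical kill set.
            strictly_subsumed = any(ks and ks < kj for ks in kill_sets)
            duplicate = any(kill_sets[k] == kj for k in range(j))
            if not (strictly_subsumed or duplicate):
                keep.append(j)
        return keep

    def suite_score(kill_matrix, suite_tests, mutants):
        """Fraction of the given mutants killed by the given test
        suite (rows `suite_tests` of the kill matrix)."""
        sub = kill_matrix[np.ix_(suite_tests, mutants)]
        return sub.any(axis=0).mean()

A suite's adjusted score is then suite_score(kill_matrix, suite_tests, dominator_mutants(kill_matrix)), computed separately over EFSM mutants and code mutants before the statistical comparisons.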