dc.contributor.advisor: Zualkernan, Imran
dc.contributor.author: Yousuf, Tasneem
dc.date.accessioned: 2018-06-05T06:52:02Z
dc.date.available: 2018-06-05T06:52:02Z
dc.date.issued: 2018-05
dc.identifier.other: 35.232-2018.12
dc.identifier.uri: http://hdl.handle.net/11073/9357
dc.description: A Master of Science thesis in Computer Engineering by Tasneem Yousuf entitled, “A Methodology of Rule Discovery from Large-scale Multi-tier Noisy Educational Data”, submitted in May 2018. Thesis advisor is Dr. Imran Ahmed Zualkernan. Soft and hard copy available. [en_US]
dc.description.abstract: The recent availability of very large amounts of educational data in digital format often leads to data overload, where it is difficult to determine important trends and patterns beyond those provided by traditional statistical techniques. Educational data mining (EDM) has emerged in response. Association mining is an EDM technique that is well known for discovering relationships in data with high scale and velocity but low variety and veracity. This analysis can be performed at the micro-level (e.g., for teachers), the meso-level (e.g., for cohorts of schools), or the macro-level (e.g., at the region, province, or country level). This thesis proposes a methodology for applying association mining to multi-tier, sparse, and error-ridden educational data. The methodology uses rule templates and is organized around the four analytical dimensions of people, process, environment, and outcomes. It defines Extract, Transform, and Load (ETL) processes for this type of data and shows how data from lower levels is aggregated into baskets at higher levels. The proposed methodology was applied to data collected from a large-scale continuous professional development (CPD) process for 2,613 teachers in a developing country. The methodology was used to mine interesting rules, whose quality was evaluated using the objective metrics of Support, Confidence, and Lift. The minimum Confidence for each level was set to 0.85. Micro-level analysis (n = 2,613 teachers) yielded few or no rules, with a very low mean Support of 0.00345 (SD = 0.00214) and a mean Lift of 6.98 (SD = 4.63). The situation remained largely the same at the meso-level (n = 1,391 schools), with a mean Support of 0.0059 (SD = 0.00051) and a mean Lift of 5.46 (SD = 3.23). The results were significantly better at the macro-level (n = 59 clusters), with a mean Support of 0.089 (SD = 0.021) and a mean Lift of 5.925 (SD = 2.5).
The mined rules revealed several anomalies and fidelity violations in the CPD process at various levels. The methodology was also useful in identifying small groups of teachers (6-8 teachers), schools (8-10 schools), and clusters (4-7 clusters) with common characteristics that can be targeted for further attention to help improve the CPD process. [en_US]
dc.description.sponsorship: College of Engineering [en_US]
dc.description.sponsorship: Department of Computer Science and Engineering [en_US]
dc.language.iso: en_US [en_US]
dc.relation.ispartofseries: Master of Science in Computer Engineering (MSCoE) [en_US]
dc.subject: educational analytics [en_US]
dc.subject: association mining [en_US]
dc.subject: rule discovery [en_US]
dc.subject: Apriori [en_US]
dc.subject: market basket analysis [en_US]
dc.subject: developing countries [en_US]
dc.title: A Methodology of Rule Discovery from Large-scale Multi-tier Noisy Educational Data [en_US]
dc.type: Thesis [en_US]
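
The abstract evaluates rules with Support, Confidence, and Lift and describes aggregating lower-level data into baskets at higher levels. The sketch below illustrates the standard definitions of those three metrics on toy data. It is not the thesis's implementation: the item labels, the function names, and the set-union aggregation in `aggregate_baskets` are all assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

Basket = FrozenSet[str]


def support(baskets: List[Basket], itemset: Iterable[str]) -> float:
    """Fraction of baskets that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(1 for b in baskets if items <= b) / len(baskets)


def confidence(baskets: List[Basket], antecedent: Iterable[str],
               consequent: Iterable[str]) -> float:
    """Support(A and C) / Support(A); 0.0 if the antecedent never occurs."""
    a = frozenset(antecedent)
    sup_a = support(baskets, a)
    if sup_a == 0:
        return 0.0
    return support(baskets, a | frozenset(consequent)) / sup_a


def lift(baskets: List[Basket], antecedent: Iterable[str],
         consequent: Iterable[str]) -> float:
    """Confidence(A -> C) / Support(C); 0.0 if the consequent never occurs."""
    sup_c = support(baskets, consequent)
    if sup_c == 0:
        return 0.0
    return confidence(baskets, antecedent, consequent) / sup_c


def aggregate_baskets(
    records: Iterable[Tuple[str, Set[str]]]
) -> Dict[str, Basket]:
    """Merge lower-level item sets into one basket per higher-level group.

    Assumption: aggregation is a plain set union keyed by group id
    (e.g. teacher item sets merged into a school basket).
    """
    merged: Dict[str, Set[str]] = defaultdict(set)
    for group_id, items in records:
        merged[group_id] |= items
    return {group: frozenset(items) for group, items in merged.items()}


# Hypothetical toy baskets; the labels are invented for this example.
baskets = [
    frozenset({"attended_workshop", "improved_score"}),
    frozenset({"attended_workshop", "improved_score"}),
    frozenset({"improved_score"}),
    frozenset({"no_visit"}),
]

rule = ({"attended_workshop"}, {"improved_score"})
print("support:", support(baskets, rule[0] | rule[1]))
print("confidence:", confidence(baskets, *rule))
print("lift:", lift(baskets, *rule))
```

A rule would pass the abstract's threshold when its Confidence is at least 0.85. A Lift above 1 indicates that the antecedent and consequent co-occur more often than they would if independent.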

