Target Sequence Clustering

Shih, Benjamin

doi:10.1184/R1/6723488.v1

thesis

Posted on 2011-12-01, authored by Benjamin Shih

Researchers have discovered many successful algorithms and methodologies for solving problems at the intersection of machine learning and education research. This umbrella category, “educational data mining,” has enjoyed a series of successes that span the research process, from post-hoc data analysis that generates models to the use of those models in successful educational interventions. However, most of these successes have arisen from the use of pre-existing psychological and educational constructs (e.g., guessing) and thus from the use of semi-supervised or fully-supervised machine learning algorithms. Algorithms for novel discovery, also known as unsupervised clustering, have enjoyed significantly fewer successes in this domain, partially because education data exhibit unique, complex structure.

This thesis is a mixture of algorithm development, simulation, and experimentation on real-world data, all designed to define and test a novel paradigm for clustering in education (and a range of other domains). This paradigm, target clustering, revolves around the inclusion of high-level targets, such as student learning from pre-test to post-test. This approach differs from other existing machine learning approaches in that it is designed completely, from the initial concept to the final execution, for solving educational research problems, taking advantage of the structural complexities that are problematic for other algorithms. This thesis includes a range of data sets drawn from a variety of research domains, but does not include new data from experiments in the psychological sense. However, the thesis includes analysis of methodology, results, and implications from an educational research perspective and relies entirely on education data and research problems.

History

Date

2011-12-01

Degree Type

Dissertation

Department

Machine Learning

Degree Name

Doctor of Philosophy (PhD)

Advisor(s)

Richard Scheines, Ken Koedinger

Keywords

Machine Learning

Licence

In Copyright
