## Mixed Membership Distributions with Applications to Modeling Multiple Strategy Usage

This dissertation examines two related questions: How do mixed membership models work? And can mixed membership be used to model how students use multiple strategies to solve problems?

Mixed membership models have been used in thousands of applications from text and image processing to genetic microarray analysis. Yet these models are crafted on a case-by-case basis because we do not yet understand the larger class of mixed membership models.

The work presented here addresses this gap and examines two different aspects of the general class of models. First, I establish that categorical data are a special case that allows for a different interpretation of mixed membership than in the general case. Second, I present a new identifiability result that characterizes equivalence classes of mixed membership models that produce the same distribution of data. These results provide a strong foundation for building a model that captures how students use multiple strategies.

How to assess which strategies students use is an open question. Most psychometric models either do not model strategies at all, or they assume that each student uses a single strategy on all problems, even if they allow different students to use different strategies. The problem is that this is not what students do. Students switch strategies. Even on the very simplest of arithmetic problems, students use different strategies on different problems, and experts use a different mixture of strategies than novices do.

Assessing which strategies students use is an important part of assessing student knowledge, yet the concept of ‘strategy’ can be ill-defined. I use the Knowledge-Learning-Instruction framework to define a strategy as a particular type of integrative knowledge component. I then look at two different ways to model how students use multiple strategies.

I combine cognitive diagnosis models with mixed membership models to create a multiple strategies model. This new model allows students to switch strategies from problem to problem, and allows us to estimate both the strategies that students are using and how often each student uses each strategy. I demonstrate this model on a modestly sized assessment of least common multiples.
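To make the idea of problem-by-problem strategy switching concrete, the following is a minimal simulation sketch, not the model specification used in the dissertation. It assumes each student has a membership vector giving how often they use each strategy, and that each strategy has its own probability of a correct answer on each problem. The names `simulate_student`, `membership`, and `p_correct`, and all numeric values, are hypothetical and chosen only for illustration.

```python
import random

def simulate_student(membership, p_correct, rng):
    """Simulate one student's strategy choices and responses.

    membership: list of strategy-usage proportions for this student (sums to 1).
    p_correct:  p_correct[s][j] is the assumed probability that strategy s
                yields a correct answer on problem j (illustrative values).
    rng:        a random.Random instance, passed in for reproducibility.
    """
    n_strategies = len(membership)
    n_problems = len(p_correct[0])
    chosen, responses = [], []
    for j in range(n_problems):
        # The student picks a strategy independently for each problem,
        # with probabilities given by the membership vector.
        s = rng.choices(range(n_strategies), weights=membership, k=1)[0]
        # The response depends only on the chosen strategy and the problem.
        correct = rng.random() < p_correct[s][j]
        chosen.append(s)
        responses.append(int(correct))
    return chosen, responses

# Hypothetical example: two strategies, four problems.
rng = random.Random(0)
membership = [0.7, 0.3]            # uses strategy 0 about 70% of the time
p_correct = [
    [0.9, 0.9, 0.2, 0.2],          # strategy 0 works well on problems 0-1
    [0.3, 0.3, 0.8, 0.8],          # strategy 1 works well on problems 2-3
]
chosen, responses = simulate_student(membership, p_correct, rng)
```

In this sketch, estimation would run in the opposite direction: given only the observed responses, one would infer both the per-strategy response probabilities and each student's membership vector.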

Lastly, I present an analysis of the different strategies that students use to estimate numerical magnitude. Three smaller results come out of this analysis. First, the analysis illustrates the limits of the general mixed membership model. The properties of mixed membership models developed in this dissertation show that, without serious changes to the model, it cannot describe the variation between students that is present in this data set. Second, I develop an exploratory data analysis method for summarizing functional data. Finally, this analysis demonstrates that existing psychological theory for how children estimate numerical magnitude is incomplete. There is more variation between students than is captured by current theoretical models.