Adaptive Testing

PWardell86 edited this page Nov 11, 2025 · 17 revisions

Overview

We are using the Python library adaptivetesting.

The idea is that every user has an unknown, real "ability score" $\theta$. The goal is to estimate this real ability score by asking the questions that give the most information about it. After each response, the estimate is updated with a scoring method such as Maximum Likelihood Estimation (MLE).

There are also many kinds of models, each with different parameters. The most common are 1PL, 2PL, and 3PL.

In the 3PL model, each question has three parameters:

  • $a$: Discrimination - Essentially how much information you get from a response. High discrimination means it is easy to tell the user's ability from this question.
  • $b$: Difficulty - A $\theta$ (ability score) such that a user with that ability score has a 50% chance of getting it right (ignoring guessing).
  • $c$: Guessing - The probability that a user answers correctly by chance alone.

1PL uses only $b$; 2PL uses $a$ and $b$.
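The three models can be written as one item response function, where the 3PL parameters reduce to 2PL (with $c = 0$) and 1PL (with $a = 1$, $c = 0$). A minimal sketch in plain Python (the function name `p_correct` is just illustrative, not part of any library):

```python
import math

def p_correct(theta, a=1.0, b=0.0, c=0.0):
    """3PL probability of a correct response.

    With c=0 this reduces to 2PL; with a=1 and c=0, to 1PL.
    """
    return c + (1 - c) / (1 + math.exp(-a * (theta - b)))

# A user whose ability equals the difficulty answers a 1PL item
# correctly with probability 0.5.
print(p_correct(theta=0.0, b=0.0))  # 0.5
```

Note that with a nonzero guessing parameter the curve is floored at $c$: even a very low-ability user answers correctly with probability at least $c$.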

For MacFAST we are starting off with a 1PL model. Each question's difficulty comes from the proportion $p$ of students who answered it correctly in the past. To convert this proportion to the model's $b$ we use $b = \theta_m - \ln(\frac{p}{1 - p})$, where $\theta_m$ is the mean ability score for the "population". This follows from solving $p = P[\textnormal{correct} \mid \theta_m] = (1 + e^{-(\theta_m - b)})^{-1}$ for $b$. We start off with the assumption $\theta_m = 0$.
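The conversion above is a one-liner; a small sketch (the helper name `difficulty_from_p` is our own, not from adaptivetesting):

```python
import math

def difficulty_from_p(p, theta_m=0.0):
    """Convert a historical proportion-correct p into a 1PL difficulty b.

    Derived from p = 1 / (1 + exp(-(theta_m - b))).
    """
    return theta_m - math.log(p / (1 - p))

# A question half the population got right sits at the mean ability:
print(difficulty_from_p(0.5))   # 0.0
# An easier question (75% correct) gets a lower difficulty, -ln(3):
print(difficulty_from_p(0.75))  # -1.0986...
```

Note this requires $0 < p < 1$; items that everyone (or no one) answered correctly have no finite $b$ and would need to be clipped or excluded.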

Components of CAT

  1. Item bank calibrated with IRT
  2. Ability score starting point (Typically 0)
  3. Item selection algorithm
  4. Scoring method
  5. Termination criterion
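The five components above fit together in a short loop. This is a hedged sketch in plain Python, not the adaptivetesting API: it assumes a 1PL item bank (a list of $b$ values), maximum-information item selection, a simple grid-search MLE for scoring, and a fixed test length as the termination criterion. All function names here (`run_cat`, `estimate_theta`, etc.) are illustrative.

```python
import math
import random

def p_correct(theta, b):
    # 1PL item response function
    return 1 / (1 + math.exp(-(theta - b)))

def item_information(theta, b):
    # Fisher information of a 1PL item: p * (1 - p),
    # maximized when difficulty matches the current ability estimate.
    p = p_correct(theta, b)
    return p * (1 - p)

def estimate_theta(responses):
    # Maximum-likelihood estimate over a coarse grid of ability values.
    # responses: list of (b, answered_correctly) pairs.
    grid = [x / 10 for x in range(-40, 41)]  # -4.0 .. 4.0
    def log_lik(theta):
        return sum(
            math.log(p_correct(theta, b)) if correct
            else math.log(1 - p_correct(theta, b))
            for b, correct in responses
        )
    return max(grid, key=log_lik)

def run_cat(item_bank, true_theta, max_items=10):
    theta = 0.0                      # 2. starting point
    remaining = list(item_bank)      # 1. calibrated item bank (b values)
    responses = []
    for _ in range(max_items):       # 5. termination: fixed length
        # 3. pick the most informative remaining item
        b = max(remaining, key=lambda d: item_information(theta, d))
        remaining.remove(b)
        # simulate the user's answer from their (unknown) true ability
        correct = random.random() < p_correct(true_theta, b)
        responses.append((b, correct))
        theta = estimate_theta(responses)   # 4. MLE scoring
    return theta

random.seed(0)
bank = [x / 4 for x in range(-8, 9)]  # difficulties -2.0 .. 2.0
print(run_cat(bank, true_theta=1.0))
```

A real implementation (e.g. via adaptivetesting) would use a proper numerical estimator and a variance-based stopping rule instead of the grid search and fixed length used here.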

Item Selection Algorithm

TODO if we care about this

Scoring Method

TODO if we care about this

Minimal Example

CAT-Example
