LensKit Demo Experiment

This repository contains a demo experiment that shows how to run LensKit on public data sets, following current best practices for moderately sized experiments.

Layout

The experiment is scripted with DVC and is laid out in several components:

- lkdemo is a Python package containing support code (e.g. log configurations) and algorithm definitions. Two files are of particular interest:
  - lkdemo/algorithms.py defines the algorithms we can train, each with a sensible default configuration.
  - lkdemo/datasets.py defines the data sets, so that any supported data set can be loaded into the format LensKit expects in a uniform fashion.
- data contains data files and controls.
- data-split contains cross-validation splits, produced by split-data.py. To save disk space, these splits contain only the test files; the train files can be obtained with lkdemo.datasets.ds_diff, as seen in run-algo.py.
- runs contains the results of running LensKit train/test runs.
- Various Python scripts run individual pieces of the analysis. They use docopt to parse their arguments, so their docstrings double as comprehensive usage docs.
- Jupyter notebooks analyze the results. They are parameterized and run with Papermill, so the same notebook can analyze different data sets.
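
The idea behind storing only the test files can be sketched as a set difference: the train data is everything in the full data set that was not held out. The snippet below is a minimal illustration under assumed names (Rating, train_from_test) and an assumed (user, item) key; it is not the implementation of lkdemo.datasets.ds_diff, which may work differently.

```python
from typing import Iterable, NamedTuple


class Rating(NamedTuple):
    """Hypothetical rating record, used only for this illustration."""
    user: int
    item: int
    rating: float


def train_from_test(all_ratings: Iterable[Rating], test: Iterable[Rating]) -> list[Rating]:
    """Return every rating whose (user, item) pair is absent from the test split."""
    held_out = {(r.user, r.item) for r in test}
    return [r for r in all_ratings if (r.user, r.item) not in held_out]


# Toy data: four ratings, two of which are held out for testing.
ratings = [
    Rating(1, 10, 4.0),
    Rating(1, 11, 3.5),
    Rating(2, 10, 5.0),
    Rating(2, 12, 2.0),
]
test_split = [Rating(1, 11, 3.5), Rating(2, 12, 2.0)]

train_split = train_from_test(ratings, test_split)
print(train_split)
```

Because train and test partition the full data set, storing the test files alone is enough to rebuild each fold.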

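To illustrate how one notebook can serve several data sets, the sketch below builds a Papermill command line that renders the eval-report.ipynb template into a per-data-set copy. The parameter name dataset and the helper papermill_command are assumptions for illustration; the actual parameter names are defined in the notebook and the DVC stages.

```python
import shlex


def papermill_command(template: str, dataset: str, param: str = "dataset") -> list[str]:
    """Build a `papermill INPUT OUTPUT -p NAME VALUE` command for one data set.

    `param` is an assumed parameter name, not necessarily the one the notebook uses.
    """
    stem = template.removesuffix(".ipynb")
    output = f"{stem}.{dataset}.ipynb"  # e.g. eval-report.ml-100k.ipynb
    return ["papermill", template, output, "-p", param, dataset]


cmd = papermill_command("eval-report.ipynb", "ml-100k")
print(shlex.join(cmd))
```

In this repository the DVC pipeline takes care of running the notebooks, so a command like this is mainly useful for understanding or debugging a single report.
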
Setup

This experiment's dependencies are specified in pyproject.toml and locked in uv.lock for use with uv. To set up, run:

$ uv sync

This will create a virtual environment in .venv/, which you can activate with:

$ . ./.venv/bin/activate

Running

The dvc program controls runs of individual steps, including downloading data. For example, to download the ML-20M data set and recommend with ALS, run:

dvc repro runs/dvc.yaml:ml20m@ALS

To re-run the whole experiment:

dvc repro

To reproduce results on one data set:

dvc repro eval-report-ml100k
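
The targets used above follow DVC's general form of an optional dvc.yaml path, a stage name, and an optional item after "@" for stages that are expanded over a list. The sketch below takes such a target apart to make the pieces explicit. The names DvcTarget and parse_target are hypothetical, and the parsing is deliberately simplified (for instance, it does not handle paths that themselves contain a colon).

```python
from typing import NamedTuple, Optional


class DvcTarget(NamedTuple):
    """Hypothetical container for the parts of a `dvc repro` target."""
    file: Optional[str]  # dvc.yaml that defines the stage, if given
    stage: str           # stage name
    item: Optional[str]  # expanded-stage item after "@", if given


def parse_target(target: str) -> DvcTarget:
    """Split `[file:]stage[@item]` into its parts (simplified)."""
    file, colon, rest = target.rpartition(":")
    stage, at, item = rest.partition("@")
    return DvcTarget(file if colon else None, stage, item if at else None)


print(parse_target("runs/dvc.yaml:ml20m@ALS"))
print(parse_target("eval-report-ml100k"))
```

Read this way, runs/dvc.yaml:ml20m@ALS selects the ALS item of the ml20m stage defined in runs/dvc.yaml, while eval-report-ml100k names a stage in the top-level dvc.yaml.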

Extending

The various dvc.yaml files control the run. Read them to see how to modify and extend the experiment!

You will probably want to consult the DVC user guide.
