GitHub - ofirnachum/pyltr: Python learning to rank (LTR) toolkit

pyltr

pyltr is a Python learning-to-rank toolkit with ranking models, evaluation metrics, data wrangling helpers, and more.

This software is licensed under the BSD 3-clause license (see LICENSE.txt).

Example

Import pyltr:

import pyltr

Load a LETOR dataset (e.g. MQ2007):

with open('train.txt') as trainfile, \
        open('vali.txt') as valifile, \
        open('test.txt') as evalfile:
    TX, Ty, Tqids, _ = pyltr.data.letor.read_dataset(trainfile)
    VX, Vy, Vqids, _ = pyltr.data.letor.read_dataset(valifile)
    EX, Ey, Eqids, _ = pyltr.data.letor.read_dataset(evalfile)
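
For orientation, a LETOR-style text file holds one document per line: a relevance label, a qid:<id> token, index:value feature pairs, and an optional trailing # comment. The sketch below parses one such line under that assumed layout. parse_letor_line is a hypothetical helper and the sample line is made up; this is not pyltr's own parser.

```python
def parse_letor_line(line):
    """Parse one LETOR-style line into (target, qid, features, comment)."""
    # Everything after the first '#' is treated as a free-text comment.
    body, _, comment = line.partition('#')
    tokens = body.split()
    target = float(tokens[0])              # relevance label
    qid = tokens[1].split(':', 1)[1]       # "qid:10" -> "10"
    features = {}
    for token in tokens[2:]:
        index, value = token.split(':', 1)
        features[int(index)] = float(value)
    return target, qid, features, comment.strip()


# Made-up sample line, for illustration only.
sample = "2 qid:10 1:0.5 2:0.0 3:1.25 #docid = D1"
target, qid, features, comment = parse_letor_line(sample)
```

Rows belonging to the same query share a qid, which is why the loader returns query ids alongside the feature matrix and targets.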

Train a LambdaMART model, using a validation set for early stopping and trimming:

metric = pyltr.metrics.NDCG(k=10)

# Only needed if you want to perform validation (early stopping & trimming)
monitor = pyltr.models.monitors.ValidationMonitor(
    VX, Vy, Vqids, metric=metric, stop_after=250)

model = pyltr.models.LambdaMART(
    metric=metric,
    n_estimators=1000,
    learning_rate=0.02,
    max_features=0.5,
    query_subsample=0.5,
    max_leaf_nodes=10,
    min_samples_leaf=64,
    verbose=1,
)

model.fit(TX, Ty, Tqids, monitor=monitor)
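
The early stopping and trimming idea can be sketched independently of pyltr. The sketch assumes stop_after means "stop once this many consecutive iterations bring no improvement on the validation metric", and that trimming keeps only the estimators up to the best iteration. Both are assumptions about the semantics, and early_stopping_point is a hypothetical helper, not part of pyltr.

```python
def early_stopping_point(validation_scores, stop_after):
    """Return (best_iteration, iterations_run) for per-iteration scores.

    Training is considered stopped once `stop_after` iterations in a row
    fail to beat the best validation score seen so far.
    """
    best_score = float('-inf')
    best_iter = -1
    for i, score in enumerate(validation_scores):
        if score > best_score:
            best_score, best_iter = score, i
        elif i - best_iter >= stop_after:
            return best_iter, i + 1
    return best_iter, len(validation_scores)


# Made-up validation scores, one per boosting iteration.
scores = [0.30, 0.35, 0.40, 0.39, 0.38, 0.41, 0.40, 0.40, 0.39, 0.38]
best_iter, n_run = early_stopping_point(scores, stop_after=3)

# "Trimming": keep only the estimators up to and including the best one.
n_estimators_kept = best_iter + 1
```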

Evaluate model on test data:

Epred = model.predict(EX)
print('Random ranking:', metric.calc_mean_random(Eqids, Ey))
print('Our model:', metric.calc_mean(Eqids, Ey, Epred))
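
To make the evaluation step concrete, here is a minimal pure-Python sketch of a per-query mean NDCG@k, i.e. the kind of quantity a call such as metric.calc_mean(Eqids, Ey, Epred) reports. It assumes the pow2 gain (2^rel - 1), a log2 rank discount, and a score of 0 for queries with no relevant documents; pyltr's own conventions may differ, and all function names here are hypothetical.

```python
import math
from collections import OrderedDict


def dcg_at_k(relevances, k):
    """DCG over the top-k relevances, with pow2 gain and log2 discount."""
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(targets, scores, k):
    """NDCG@k for a single query."""
    order = sorted(range(len(targets)), key=lambda i: -scores[i])
    ranked = [targets[i] for i in order]
    ideal_dcg = dcg_at_k(sorted(targets, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0  # assumed convention for queries with no relevant docs
    return dcg_at_k(ranked, k) / ideal_dcg


def mean_ndcg(qids, targets, scores, k=10):
    """Group rows by query id, then average the per-query NDCG@k."""
    groups = OrderedDict()
    for qid, target, score in zip(qids, targets, scores):
        per_query = groups.setdefault(qid, ([], []))
        per_query[0].append(target)
        per_query[1].append(score)
    values = [ndcg_at_k(t, s, k) for t, s in groups.values()]
    return sum(values) / len(values)


# Made-up data: query 'a' is ranked perfectly, query 'b' is inverted.
qids = ['a', 'a', 'a', 'b', 'b']
targets = [2, 0, 1, 0, 1]
scores = [0.9, 0.2, 0.5, 0.8, 0.3]
result = mean_ndcg(qids, targets, scores, k=10)
```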

Features

Below are some of the features currently implemented in pyltr.

Models

LambdaMART (pyltr.models.LambdaMART)
- Validation & early stopping
- Query subsampling
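
Query subsampling differs from ordinary row subsampling in that whole queries are drawn, so every document of a selected query stays together. A rough sketch of the idea follows, assuming a fraction such as query_subsample=0.5 means "use about half of the queries per boosting round". subsample_queries is a hypothetical helper, not pyltr's implementation.

```python
import random


def subsample_queries(qids, fraction, rng):
    """Return the row indices belonging to a random subset of whole queries."""
    unique = list(dict.fromkeys(qids))      # unique qids, first-seen order
    n_keep = max(1, int(round(fraction * len(unique))))
    kept = set(rng.sample(unique, n_keep))
    return [i for i, qid in enumerate(qids) if qid in kept]


# Made-up query ids for eight rows spread over four queries.
qids = ['q1', 'q1', 'q2', 'q2', 'q2', 'q3', 'q4', 'q4']
rows = subsample_queries(qids, fraction=0.5, rng=random.Random(0))
kept_qids = {qids[i] for i in rows}
```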

Metrics

(N)DCG (pyltr.metrics.DCG, pyltr.metrics.NDCG)
- pow2 and identity gain functions
ERR (pyltr.metrics.ERR)
- pow2 and identity gain functions
(M)AP (pyltr.metrics.AP)
Kendall's Tau (pyltr.metrics.KendallTau)
AUC-ROC -- Area under the ROC curve (pyltr.metrics.AUCROC)
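
As an illustration of one of the listed metrics, average precision (AP) for a single query can be sketched as below; MAP is then the mean of AP over queries. The sketch assumes any target greater than 0 counts as relevant and returns 0 when a query has no relevant documents. It is a textbook formulation with a hypothetical function name, not necessarily what pyltr.metrics.AP does internally.

```python
def average_precision(targets, scores):
    """Average precision of one query's ranking induced by `scores`."""
    order = sorted(range(len(targets)), key=lambda i: -scores[i])
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if targets[i] > 0:                  # assumed relevance threshold
            hits += 1
            precision_sum += hits / rank    # precision at this rank
    return precision_sum / hits if hits else 0.0


# Made-up example: relevant documents land at ranks 1 and 3.
ap = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```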

Data Wrangling

Data loaders (e.g. pyltr.data.letor.read_dataset)
Query groupers and validators (pyltr.util.group.check_qids, pyltr.util.group.get_groups)
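
Grouping and validation of query ids can be pictured as follows. The sketch assumes that rows of the same query must be contiguous and that a grouper yields one (qid, start, end) range per query. The names iter_query_groups and check_contiguous_qids are hypothetical stand-ins, not the signatures of pyltr.util.group.

```python
def iter_query_groups(qids):
    """Yield (qid, start, end) per contiguous run of equal qids; end is exclusive."""
    start = 0
    for i in range(1, len(qids) + 1):
        if i == len(qids) or qids[i] != qids[start]:
            yield qids[start], start, i
            start = i


def check_contiguous_qids(qids):
    """Raise ValueError if any query id appears in more than one run."""
    seen = set()
    for qid, _, _ in iter_query_groups(qids):
        if qid in seen:
            raise ValueError('rows for query %r are not contiguous' % (qid,))
        seen.add(qid)


groups = list(iter_query_groups(['a', 'a', 'b', 'c', 'c', 'c']))
```

Each (start, end) pair can then be used to slice the feature matrix, targets, and predictions for one query at a time.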

Running Tests

Use the run_tests.sh script to run all unit tests.

Building Docs

cd into the docs/ directory and run make html. Docs are generated in the docs/_build directory.

Latest commit: a7c167c, "enable using weighted sums of metrics", Ofir Nachum, Apr 27, 2016 (45 commits)

Name	Last commit message	Last commit date
docs	Add ROC doc generation	Aug 25, 2015
pyltr	enable using weighted sums of metrics	Apr 27, 2016
.gitignore	Add sphinx documentation	Aug 24, 2015
LICENSE.txt	Initial commit	Aug 18, 2015
README.rst	fix readme	Apr 24, 2016
run_tests.sh	set PYTHONPATH for running tests	Aug 24, 2015
setup.py	Add package dependencies	Aug 24, 2015
