
Commit b224389

fix some old docs
1 parent 376dd0c · commit b224389


2 files changed: +34 -46 lines changed


.vscode/ltex.hiddenFalsePositives.en-US.txt (+1)

@@ -3,3 +3,4 @@
 {"rule":"EN_A_VS_AN","sentence":"^\\QThis method returns an \\E(?:Dummy|Ina|Jimmy-)[0-9]+\\Q containing the\nrecommended items.\\E$"}
 {"rule":"EN_A_VS_AN","sentence":"^\\QYou can optionally specify candidate items with an \\E(?:Dummy|Ina|Jimmy-)[0-9]+\\Q\nparameter to \\E(?:Dummy|Ina|Jimmy-)[0-9]+\\Q (it takes an \\E(?:Dummy|Ina|Jimmy-)[0-9]+\\Q), or a list\nlength with \\E(?:Dummy|Ina|Jimmy-)[0-9]+\\Q (you can also bake a default list length into the pipeline\nwhen you call \\E(?:Dummy|Ina|Jimmy-)[0-9]+\\Q).\\E$"}
 {"rule":"POSSESSIVE_APOSTROPHE","sentence":"^\\QWhere older versions of LensKit used Pandas data frames and series as the primary\ndata structures for interfacing with components\\E$"}
+{"rule":"A_NNS","sentence":"^\\QThe\n\\E(?:Dummy|Ina|Jimmy-)[0-9]+\\Q method returns a results object\ncontianing the metrics for individual lists, the global metrics, and easy access\n(through \\E(?:Dummy|Ina|Jimmy-)[0-9]+\\Q) to summary statistics\nof per-list metrics, optionally grouped by keys such as model name.\\E$"}

docs/guide/documenting.rst (+33 -46)

@@ -2,9 +2,10 @@ Documenting Experiments
 =======================
 
 .. todo::
+
     This chapter needs to be rewritten for :ref:`2025.1`.
 
-When publishing results — either formally, through a venue such as ACM Recsys,
+When publishing results — either formally, through a venue such as ACM RecSys,
 or informally in your organization, it's important to clearly and completely
 specify how the evaluation and algorithms were run.
 
@@ -19,19 +20,10 @@ Common Evaluation Problems Checklist
 This checklist is to help you make sure that your evaluation and results are
 accurately reported.
 
-* Pass `include_missing=True` to :py:meth:`~lenskit.topn.RecListAnalysis.compute`. This
-  operation defaults to `False` for compatiability reasons, but the default will
-  change in the future.
-
-* Correctly fill missing values from the evaluation metric results. They are
-  reported as `NaN` (Pandas NA) so you can distinguish between empty lists and
-  lists with no relevant items, but should be appropraitely filled before
-  computing aggregates.
-
-* Pass `k` to :py:meth:`~lenskit.topn.RecListAnalysis.add_metric` with the
-  target list length for your experiment. LensKit cannot reliably detect how
-  long you intended to make the recommendation lists, so you need to specify the
-  intended length to the metrics in order to correctly account for it.
+* Pass `k` to your ranking metrics with the target list length for your
+  experiment. LensKit cannot reliably detect how long you intended to make the
+  recommendation lists, so you need to specify the intended length to the
+  metrics in order to correctly account for it.
 
 Reporting Algorithms
 ~~~~~~~~~~~~~~~~~~~~
@@ -50,17 +42,17 @@ algorithn peformance but not behavior.
 
 For example:
 
-+------------+-------------------------------------------------------------------------------+
-| Algorithm | Hyperparameters |
-+============+===============================================================================+
-| ItemItem | :math:`k_\mathrm{max}=20, k_\mathrm{min}=2, s_\mathrm{min}=1.0\times 10^{-3}` |
-+------------+-------------------------------------------------------------------------------+
-| ImplicitMF | :math:`k=50, \lambda_u=0.1, \lambda_i=0.1, w=40` |
-+------------+-------------------------------------------------------------------------------+
++------------------+-------------------------------------------------------------------------------+
+| Algorithm | Hyperparameters |
++============+=====================================================================================+
+| ItemKNNScorer | :math:`k_\mathrm{max}=20, k_\mathrm{min}=2, s_\mathrm{min}=1.0\times 10^{-3}` |
++------------------+-------------------------------------------------------------------------------+
+| ImplicitMFScorer | :math:`k=50, \lambda_u=0.1, \lambda_i=0.1, w=40` |
++------------------+-------------------------------------------------------------------------------+
 
 If you use a top-N implementation other than the default
-:py:class:`~lenskit.basic.TopNRanker`, or reconfigure its candidate
-selector, also clearly document that.
+:py:class:`~lenskit.basic.TopNRanker`, or reconfigure its candidate selector,
+also clearly document that.
 
 Reporting Experimental Setup
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -74,12 +66,13 @@ without modification, report:
 
 - The splitting function used.
 - The number of partitions or test samples.
+- The timestamp or fraction used for temporal splitting.
 - The number of users per sample (when using
-  :py:class:`~lenskit.splitting.sample_users`) or records per sample (when using
-  :py:class:`~lenskit.splitting.sample_records`).
+  :py:func:`~lenskit.splitting.sample_users`) or records per sample (when using
+  :py:func:`~lenskit.splitting.sample_records`).
 - When using a user-based strategy (either
-  :py:class:`~lenskit.splitting.crossfold_users` or
-  :py:class:`~lenskit.splitting.sample_users`), the test rating selection
+  :py:func:`~lenskit.splitting.crossfold_users` or
+  :py:func:`~lenskit.splitting.sample_users`), the test rating selection
   strategy (class and parameters), e.g. ``SampleN(5)``.
 
 Any additional pre-processing (e.g. filtering ratings) should also be clearly
@@ -92,29 +85,23 @@ automated reporting is not practical.
 Reporting Metrics
 ~~~~~~~~~~~~~~~~~
 
-Reporting the metrics themelves is relatively straightforward. The
-:py:meth:`lenskit.topn.RecListAnalysis.compute` method will return a data frame
-with a metric score for each list. Group those by algorithm and report the
-resulting scores (typically with a mean).
+Reporting the metrics themselves is relatively straightforward. The
+:py:meth:`lenskit.bulk.RunAnalysis.measure` method returns a results object
+contianing the metrics for individual lists, the global metrics, and easy access
+(through :meth:`~lenskit.bulk.RunAnalysis.list_summary`) to summary statistics
+of per-list metrics, optionally grouped by keys such as model name.
 
-The following code will produce a table of algorithm scores for hit rate, nDCG
-and MRR, assuming that your algorithm identifier is in a column named ``algo``
+The following code will produce a table of algorithm scores for hit rate, NDCG
+and MRR, assuming that your algorithm identifier is in a column named ``model``
 and the target list length is in ``N``::
 
-    rla = RecListAnalysis()
-    rla.add_metric(topn.hit, k=N)
-    rla.add_metric(topn.ndcg, k=N)
-    rla.add_metric(topn.recip_rank, k=N)
-    scores = rla.compute(recs, test, include_missing=True)
-    # empty lists will have na scores
-    scores.fillna(0, inplace=True)
+    rla = RunAnalysis()
+    rla.add_metric(Hit(k=N))
+    rla.add_metric(NDCG(k=N))
+    rla.add_metric(RecipRank(k=N))
+    results = rla.measure(recs, test)
     # group by agorithm
-    algo_scores = scores.groupby('algorithm')[['hit', 'ndcg', 'recip_rank']].mean()
-    algo_scores = algo_scores.rename(columns={
-        'hit': 'HR',
-        'ndcg': 'nDCG',
-        'recip_rank': 'MRR'
-    })
+    model_metrics = results.list_summary('model')
 
 You can then use :py:meth:`pandas.DataFrame.to_latex` to convert ``algo_scores``
 to a LaTeX table to include in your paper.
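
As a quick sanity check on the updated example, here is a minimal end-to-end sketch assembled from the snippets above. The import locations (``lenskit.bulk`` for ``RunAnalysis``, ``lenskit.metrics`` for ``Hit``, ``NDCG``, and ``RecipRank``), the placeholder inputs ``recs`` and ``test``, and the assumption that ``list_summary`` returns a pandas ``DataFrame`` are unverified guesses inferred from the docs, not confirmed against the released 2025.1 API::

    # Sketch only: the module paths below are assumptions inferred from the
    # documentation diff above, not verified against the 2025.1 packages.
    from lenskit.bulk import RunAnalysis
    from lenskit.metrics import Hit, NDCG, RecipRank

    N = 20  # target recommendation list length for the experiment

    rla = RunAnalysis()
    rla.add_metric(Hit(k=N))
    rla.add_metric(NDCG(k=N))
    rla.add_metric(RecipRank(k=N))

    # `recs` (the generated recommendation lists) and `test` (the held-out data)
    # come from earlier steps of the experiment and are not shown here.
    results = rla.measure(recs, test)

    # Summary statistics of the per-list metrics, grouped by model name;
    # assumed here to behave like a pandas DataFrame.
    model_metrics = results.list_summary('model')

    # Render the summary for the paper using pandas.DataFrame.to_latex().
    print(model_metrics.to_latex())

If ``list_summary`` already reports per-model summary statistics, the manual ``groupby``/``rename`` step from the old example is no longer needed, which is what the final hunk above removes.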
