Commit 0da128f

fix tasks list (#906)
* Update registry.py
  Refix the tasks list command.
* Update registry.py
  feat(tasks): add suite filtering and fix community task discovery
  - Revert the community task path fix
  - Add --suites parameter to lighteval tasks list
  - Default to core suites only to prevent overwhelming output
  - Add dependency checking for multilingual tasks
* Update main_tasks.py
  feat(tasks): add suite filtering and fix community task discovery
  - Revert the community task path fix
  - Add --suites parameter to lighteval tasks list
  - Default to core suites only to prevent overwhelming output
  - Add dependency checking for multilingual tasks
* Update aimo_evals.py
  Fix `community` tasks loading error due to an unexpected keyword argument 'metric'.
* Update german_rag_evals.py
  Fix `community` tasks loading error due to an unexpected keyword argument 'metric'.
* Update oz_evals.py
  Fix `community` tasks loading error due to an unexpected keyword argument 'metric'.
* Update serbian_eval.py
  Fix `community` tasks loading error due to an unexpected keyword argument 'metric'.
* Update turkic_evals.py
  Fix `community` tasks loading error due to an unexpected keyword argument 'metric'.
* Update turkic_evals.py
  Remove the import of `MetricUseCase` because it raises the following error: `Failed to load community tasks from turkic_evals: cannot import name 'MetricUseCase' from 'lighteval.metrics.utils.metric_utils' (/~/repos/lighteval-070825/lighteval/src/lighteval/metrics/utils/metric_utils.py) (registry.py:70)`
* Update registry.py
  Add 'all' as an option for the `--suites` argument to simplify listing both core and optional suites.
* Update main_tasks.py
  Add 'all' as an option for the `--suites` argument to simplify listing both core and optional suites.
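The suite filtering described in the commit message (core suites by default, explicit selection via `--suites`, and `all` as a shortcut) could look roughly like the sketch below. This is not the code from registry.py or main_tasks.py: the suite names in `CORE_SUITES` and `OPTIONAL_SUITES`, the helpers `resolve_suites` and `filter_tasks`, the comma-separated argument format, and the `suite|task` naming scheme are all assumptions made for illustration.

```python
from typing import Iterable, Optional

# Hypothetical suite groupings; the real lists live in registry.py and may differ.
CORE_SUITES = ("lighteval", "leaderboard")
OPTIONAL_SUITES = ("community", "multilingual")


def resolve_suites(suites_arg: Optional[str]) -> list[str]:
    """Turn a (assumed comma-separated) --suites value into a list of suite names."""
    if not suites_arg:
        # No argument given: fall back to the core suites only.
        return list(CORE_SUITES)
    requested = [s.strip() for s in suites_arg.split(",") if s.strip()]
    if "all" in requested:
        # 'all' expands to both core and optional suites.
        return list(CORE_SUITES) + list(OPTIONAL_SUITES)
    return requested


def filter_tasks(task_names: Iterable[str], suites: list[str]) -> list[str]:
    """Keep tasks whose assumed 'suite|task' name belongs to a selected suite."""
    selected = set(suites)
    return sorted(name for name in task_names if name.split("|", 1)[0] in selected)
```

With these assumptions, calling `filter_tasks(names, resolve_suites(None))` lists only core-suite tasks, while `resolve_suites("all")` widens the listing to the optional suites as well.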
1 parent 222a31c commit 0da128f

7 files changed: +194 -74 lines changed

community_tasks/aimo_evals.py

Lines changed: 1 addition & 1 deletion
@@ -49,7 +49,7 @@ def aimo_prompt(line, task_name: str = None):
 evaluation_splits=["train"],
 few_shots_split="train",
 few_shots_select="sequential",
-metric=[Metrics.quasi_exact_match_math],
+metrics=[Metrics.quasi_exact_match_math],
 generation_size=2048,
 stop_sequence=None,
 )
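The "unexpected keyword argument 'metric'" error that this rename fixes can be reproduced with any Python class whose constructor only accepts `metrics`. The sketch below uses a hypothetical dataclass, `TaskConfigSketch`, as a stand-in; it is not lighteval's real task config class, and the metric is represented by a plain string rather than a `Metrics` member.

```python
from dataclasses import dataclass, field


@dataclass
class TaskConfigSketch:
    """Hypothetical stand-in for a task config; not lighteval's real class."""

    name: str
    metrics: list = field(default_factory=list)


def try_build(**kwargs):
    """Return the config, or the TypeError message if a keyword is rejected."""
    try:
        return TaskConfigSketch(**kwargs)
    except TypeError as exc:
        return str(exc)


# Old spelling: the generated __init__ has no 'metric' parameter, so this fails.
old_style = try_build(name="aimo", metric=["quasi_exact_match_math"])

# New spelling: matches the field name, so the config is built normally.
new_style = try_build(name="aimo", metrics=["quasi_exact_match_math"])
```

Here `old_style` ends up holding the `TypeError` message, while `new_style` is a populated config object, which mirrors why each community task file needed the one-word change.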

community_tasks/german_rag_evals.py

Lines changed: 4 additions & 4 deletions
@@ -162,7 +162,7 @@ def prompt_fn_context_question_match(line, task_name: str = None):
 evaluation_splits=["test"],
 few_shots_split="test",
 few_shots_select="sequential",
-metric=[Metrics.loglikelihood_acc],
+metrics=[Metrics.loglikelihood_acc],
 version=1,
 )

@@ -179,7 +179,7 @@ def prompt_fn_context_question_match(line, task_name: str = None):
 evaluation_splits=["test"],
 few_shots_split="test",
 few_shots_select="sequential",
-metric=[Metrics.loglikelihood_acc],
+metrics=[Metrics.loglikelihood_acc],
 version=1,
 )

@@ -197,7 +197,7 @@ def prompt_fn_context_question_match(line, task_name: str = None):
 evaluation_splits=["test"],
 few_shots_split="test",
 few_shots_select="sequential",
-metric=[Metrics.loglikelihood_acc],
+metrics=[Metrics.loglikelihood_acc],
 version=1,
 )

@@ -214,7 +214,7 @@ def prompt_fn_context_question_match(line, task_name: str = None):
 evaluation_splits=["test"],
 few_shots_split="test",
 few_shots_select="sequential",
-metric=[Metrics.loglikelihood_acc],
+metrics=[Metrics.loglikelihood_acc],
 version=1,
 )

community_tasks/oz_evals.py

Lines changed: 1 addition & 1 deletion
@@ -78,7 +78,7 @@ def prompt_fn_oz_eval_task(line, task_name: str = None):
 evaluation_splits=["test"],
 few_shots_split=None,
 few_shots_select=None,
-metric=[Metrics.loglikelihood_acc],
+metrics=[Metrics.loglikelihood_acc],
 version=0,
 )
