This repository was archived by the owner on Jun 3, 2025. It is now read-only.

Conversation

@shubhra commented Aug 11, 2023

Example:

numactl -C0-15 python deepsparse/src/deepsparse/transformers/eval_downstream.py \
        <model_path> \
        --num-cores 16 \
        --dataset openai_humaneval \
        --humaneval-method pass_at_k \
        --engine deepsparse \
        --start 0 \
        --max-samples 2 
  • This will create a subset of the HumanEval dataset starting at index 0 (--start) and pick 2 samples (--max-samples) to run the evaluation on.
  • If the --benchmark-humaneval flag is supplied, the evaluation runs on a pre-selected smaller subset of the dataset containing 11 samples, and --start and --max-samples are ignored.
  • Set --humaneval-method to perplexity to evaluate perplexity instead of pass@k (see the example after the note below).
  • Add --n-solutions <n> to specify the number of solutions generated per task (default: 1); pass@k is then estimated from those solutions, as sketched below.
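
For reference, pass@k is conventionally computed with the unbiased estimator from the Codex/HumanEval paper (Chen et al., 2021). Below is a minimal sketch of that estimator, assuming n corresponds to --n-solutions above; it is not necessarily the exact code used in this PR:

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (Chen et al., 2021).

    n: total solutions sampled for a task (assumed to be --n-solutions)
    c: number of those solutions that pass the task's unit tests
    k: the k in pass@k
    """
    if n - c < k:
        return 1.0  # every size-k draw contains at least one passing solution
    # Numerically stable form of 1 - C(n-c, k) / C(n, k)
    result = 1.0
    for i in range(n - c + 1, n + 1):
        result *= 1.0 - k / i
    return 1.0 - result

# e.g. 1 passing solution out of 2 sampled gives pass@1 = 0.5
assert pass_at_k(n=2, c=1, k=1) == 0.5

The running-product form avoids computing the large binomial coefficients in C(n-c, k) / C(n, k) directly, which overflow quickly as n grows.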

Note: Remove numactl -C0-15 if you don't need to specify which cores to run on.
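
For example, a perplexity run over the same two samples only needs the method flag swapped in the command above:

numactl -C0-15 python deepsparse/src/deepsparse/transformers/eval_downstream.py \
        <model_path> \
        --num-cores 16 \
        --dataset openai_humaneval \
        --humaneval-method perplexity \
        --engine deepsparse \
        --start 0 \
        --max-samples 2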

@shubhra shubhra marked this pull request as draft August 11, 2023 14:56
@shubhra shubhra changed the title from "Changes to support pass at k evaluation on the HumanEval dataset" to "Changes to support pass@k evaluation on the HumanEval dataset" Aug 11, 2023
@jeanniefinks (Member) commented
Per the main README announcement, DeepSparse is being deprecated by June 2, 2025. Closing the PR as work has been suspended; thank you for the inputs and support!

@jeanniefinks jeanniefinks deleted the shubhra/humaneval_pass@k branch May 29, 2025 23:52