[BUG] Sampler crashes with some vocabulary sizes? #778

@hidoba

Description

OS

Linux

GPU Library

CUDA 12.x

Python version

3.12

Pytorch version

2.6

Model

No response

Describe the bug

I'm trying to run sampling on my own data, and it fails for most small vocabulary sizes (the C extension crashes).

Reproduction steps

Minimal example:

import torch
from exllamav2.ext import exllamav2_ext as ext_c, none_tensor
import random

VOCAB_SIZE = 40001  # 40000 works, 40001 doesn't

# Generate uniform random logits between -10 and 10
logits = (torch.rand(1, 1, VOCAB_SIZE, dtype=torch.float32) * 20) - 10

output_tokens = torch.empty((1, 1), dtype=torch.long)
output_probs = torch.empty((1, 1), dtype=torch.float)

output_ktokens = none_tensor
output_kprobs = none_tensor

m = ext_c.sample_basic(
    logits,                 # logits
    0.8,                    # temp_scale
    50,                     # top_k
    0.8,                    # top_p
    0,                      # top_a
    0,                      # min_p
    0,                      # tfs
    0,                      # typical
    random.random(),        # random_num
    output_tokens,          # output_tokens_tensor
    output_probs,           # output_probs_tensor
    output_kprobs,          # output_kprobs_tensor
    output_ktokens,         # output_ktokens_tensor
    none_tensor,            # logit_filter
    False,                  # mirostat
    [],                     # mirostat_mu
    1.5,                    # mirostat_tau
    0.1,                    # mirostat_eta
    1,                      # temp
    none_tensor,            # xtc_mask
    0,                      # xtc_probability
    0,                      # xtc_threshold
    0,                      # min_temp
    0,                      # max_temp
    0,                      # temp_exponent
    0,                      # smoothing_factor
    0                       # skew
)
print(f"Sampling finished. Output token: {output_tokens.item()}")

Expected behavior

I expect the call to complete successfully and print the sampled output token.

Logs

double free or corruption (!prev)
Aborted (core dumped)

Additional context

Exllamav2 Version: 0.2.9+cu124.torch2.6.0
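
Possibly relevant observation: 40000 is divisible by 32 while 40001 is not, so the crash may be related to an alignment assumption in the extension's sampling buffers. This is only a guess; as a hypothetical workaround sketch (the alignment value of 32 and the padding approach are assumptions, not confirmed behavior of exllamav2), one could pad the logits with -inf up to the next multiple before calling `sample_basic`, so the padded positions can never be sampled:

```python
import math

def padded_vocab_size(vocab_size: int, multiple: int = 32) -> int:
    # Round vocab_size up to the next multiple of `multiple`.
    # Hypothetical: pad the last logits dimension from vocab_size to this
    # value with float("-inf") so the extra positions have zero probability.
    return multiple * math.ceil(vocab_size / multiple)
```

For example, `padded_vocab_size(40001)` gives 40032, while `padded_vocab_size(40000)` stays 40000. In the repro above this would mean concatenating a `(1, 1, 31)` tensor filled with `float("-inf")` onto `logits` along the last dimension before the `sample_basic` call.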

Acknowledgements

  • I have looked for similar issues before submitting this one.
  • I understand that the developers have lives and my issue will be answered when possible.
  • I understand the developers of this program are human, and I will ask my questions politely.
