SpecPrefill labeled 'for MoE models' but code has no MoE gating #1045

@gojack10

The admin UI (_modal_model_settings.html) and the model_settings.py docstrings describe SpecPrefill as "for MoE/hybrid models", but the implementation in patches/specprefill.py contains no MoE-specific logic. It uses architecture-agnostic query extractors, including _llama_extract_queries for standard dense transformers, and neither the scheduler nor the engine checks whether a model is MoE.

This is misleading for users with dense models who may skip the feature based on the label.
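
For anyone who wants to spot-check the claim, here is a minimal sketch of a keyword scan that could be run over patches/specprefill.py and the scheduler/engine sources. The helper name `find_moe_references` and the keyword list are my own assumptions, not anything from the codebase, so an empty result only suggests (it does not prove) that no MoE gating exists.

```python
import re

# Hypothetical keyword list (an assumption, not taken from the project);
# extend it with whatever identifiers the codebase actually uses for MoE.
MOE_PATTERN = re.compile(
    r"(?<![a-z])moe(?![a-z])|num_experts|mixture.of.experts",
    re.IGNORECASE,
)


def find_moe_references(source: str) -> list[tuple[int, str]]:
    """Return (line_number, stripped_line) for each line matching an MoE keyword."""
    return [
        (lineno, line.strip())
        for lineno, line in enumerate(source.splitlines(), start=1)
        if MOE_PATTERN.search(line)
    ]
```

To use it, read a file's text (for example with `pathlib.Path(...).read_text()`) and pass it in; any returned lines are candidates to inspect by hand, since a keyword hit in a comment or docstring is not gating logic.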

Fix: PR #1044 removes the "MoE" qualifier from all 3 locations.
