1 parent 3416a45 commit b6cd9cc
xllm/core/common/global_flags.cpp
@@ -390,6 +390,9 @@ DEFINE_string(reasoning_parser,
 // --- qwen3 reranker config ---
 DEFINE_bool(enable_qwen3_reranker, false, "Whether to enable qwen3 reranker.");
 
-DEFINE_bool(enable_prefetch_weight,
-            false,
-            "Whether to enable prefetch weight.");
+DEFINE_bool(
+    enable_prefetch_weight,
+    false,
+    "Whether to enable prefetch weight,only applicable to Qwen3-dense model."
+    "The default prefetching ratio for gateup weight is 40%."
+    "If adjustments are needed, e.g. export PREFETCH_COEFFOCIENT=0.5");
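The new help text says the gate-up prefetch ratio defaults to 40% and can be overridden through the PREFETCH_COEFFOCIENT environment variable. The commit does not show how that variable is read, so the following is only a minimal sketch of one way such an override could be parsed. The helper name `prefetch_ratio_from_env`, the fallback behaviour on bad input, and the accepted range of [0, 1] are assumptions for illustration, not code from xllm.

```cpp
#include <cstdlib>

// Hypothetical helper (not from the xllm sources): returns the prefetch ratio
// taken from the environment, or the documented 40% default when the variable
// is unset or unusable.
double prefetch_ratio_from_env(const char* name = "PREFETCH_COEFFOCIENT",
                               double fallback = 0.4) {
  const char* raw = std::getenv(name);
  if (raw == nullptr || *raw == '\0') {
    return fallback;  // variable unset or empty: keep the default
  }
  char* end = nullptr;
  const double value = std::strtod(raw, &end);
  // Assumption: reject unparsable text, trailing characters, and any ratio
  // outside [0, 1] rather than clamping it.
  if (end == raw || *end != '\0' || !(value >= 0.0 && value <= 1.0)) {
    return fallback;
  }
  return value;
}
```

With `export PREFETCH_COEFFOCIENT=0.5` in the environment, this sketch would return 0.5; with the variable unset it would return 0.4. The variable name is spelled exactly as it appears in the help string.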