
[misc] improve diagnostics for missing processor on multi-modal templates #10553

Open
gaurav0107 wants to merge 2 commits into hiyouga:main from gaurav0107:fix/10539-qwen3-5-finetune-processor-was-not-found

@gaurav0107

Summary

Fixes #10539 (and helps several similar reports: #10447, #10193, #9385, #9182, #8780).

When a multi-modal template (e.g. qwen3_5, qwen3_vl, glm4v) is selected but AutoProcessor.from_pretrained fails or returns a non-Processor instance, model/loader.py silently sets processor = None. Later, mm_plugin._validate_input raises:

ValueError: Processor was not found, please check and update your model file.

This message gives users no idea what to do. The actual root cause is almost always one of:

  • An incomplete HF cache (e.g. missing preprocessor_config.json),
  • A transformers version that does not yet support the model's processor class, or
  • The wrong template choice (a multi-modal template selected for text-only fine-tuning).
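
The first cause in the list above can be checked before training starts. The following is a minimal sketch, assuming the model lives in a local directory. The helper name `missing_processor_files` and the `EXPECTED_FILES` tuple are hypothetical and not part of LLaMA-Factory or transformers; only `preprocessor_config.json` is named in this PR.

```python
from pathlib import Path

# Only preprocessor_config.json is named in this PR; extend the tuple
# for models that need additional processor files (an assumption).
EXPECTED_FILES = ("preprocessor_config.json",)


def missing_processor_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected processor files that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in expected if not (root / name).is_file()]
```

An empty list means every expected file is present; a non-empty list names the files to re-download.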

Changes

  • src/llamafactory/model/loader.py
    • Promote info_rank0 -> warning_rank0 for the silent processor-load failure so users actually see the underlying exception.
    • Log a warning (instead of debug) when a loaded processor is dropped because its class name does not contain "Processor", including the dropped class name for diagnosis.
  • src/llamafactory/data/mm_plugin.py
    • Replace the four generic "please check and update your model file." errors with actionable guidance: verify the model files are complete, upgrade transformers, or pick a text-only template if multi-modal inputs are not needed.
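
As an illustration of what "actionable guidance" could look like, here is a hedged sketch of a message builder. The function name `build_missing_processor_message` and the exact wording are assumptions for illustration; they are not the strings added by this PR.

```python
def build_missing_processor_message(model_name_or_path, template):
    """Compose an error that names the three remedies listed in this PR."""
    return (
        f"Processor was not found for `{model_name_or_path}` "
        f"(template `{template}`). Possible fixes: "
        "(1) verify the model files are complete "
        "(e.g. `preprocessor_config.json`), "
        "(2) upgrade `transformers` to a version that supports this model, or "
        "(3) pick a text-only template if multi-modal inputs are not needed."
    )
```

Including the model path and template name in the message lets a user match the error to their own configuration without reading the traceback.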

Behavior

  • No change on the success path.
  • Only diagnostic improvements in the failure path. No tests pin the prior message strings (verified by grep across tests/ and tests_v1/).

Verification

  • uvx ruff check src/llamafactory/data/mm_plugin.py src/llamafactory/model/loader.py -> all checks passed
  • uvx ruff format --check src/llamafactory/data/mm_plugin.py src/llamafactory/model/loader.py -> already formatted
  • python3 tests/check_license.py scripts src tests tests_v1 -> exit 0

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request improves error messages and logging when multi-modal processors fail to load. In mm_plugin.py, ValueError messages are expanded with detailed troubleshooting steps, and in loader.py, loading failures and dropped processors are now logged as warnings with additional context. The review feedback highlights two key improvements: nesting the try-except blocks in loader.py to prevent unhandled exceptions from sibling except blocks, and expanding the processor check in mm_plugin.py to cover video and audio tokens as well as image tokens.

Comment on lines 109 to 115
     except Exception as e:
-        logger.info_rank0(f"Failed to load processor: {e}.")
+        logger.warning_rank0(
+            f"Failed to load processor: {e}. "
+            "If your model expects a multi-modal processor, please verify the model files are complete "
+            "(e.g. `preprocessor_config.json`) and that your `transformers` version supports this model."
+        )
         processor = None

Contributor

medium

In Python, an exception raised inside an except block (such as the except ValueError: block on lines 103-108) will not be caught by a sibling except Exception as e: block of the same try statement. If the second AutoProcessor.from_pretrained call fails, the exception will propagate and crash the application instead of being caught and logged with the new diagnostic warning.

To ensure all processor loading failures are gracefully caught and logged, the inner try/except ValueError should be nested inside an outer try/except Exception block.
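
The language behavior described above can be reproduced in isolation. This is a small stand-alone sketch with stand-in exceptions; the function name `load_with_sibling_handlers` is hypothetical and no LLaMA-Factory code is involved.

```python
def load_with_sibling_handlers():
    try:
        raise ValueError("first attempt failed")
    except ValueError:
        # The retry runs inside this handler. If it raises, the sibling
        # `except Exception` clause below does not apply to it.
        raise RuntimeError("second attempt failed")
    except Exception as e:
        return f"caught: {e}"


try:
    outcome = load_with_sibling_handlers()
except RuntimeError as e:
    outcome = f"escaped: {e}"

print(outcome)  # escaped: second attempt failed
```

Only one `except` clause of a given `try` statement handles an exception raised in the `try` body, so an error raised while that clause is running propagates to the caller.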

Author

Good catch. Fixed in e19bce0 by nesting the inner except ValueError retry inside an outer try so any failure from the second AutoProcessor.from_pretrained call is now caught by the outer except Exception as e and logged via warning_rank0. Thanks for the review.
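
The nested structure described in this reply can be sketched as follows. This is not the code from e19bce0: `load_processor`, `loader`, `first_kwargs`, and `retry_kwargs` are hypothetical names, and the standard `logging` module stands in for `warning_rank0`.

```python
import logging

logger = logging.getLogger(__name__)


def load_processor(loader, first_kwargs, retry_kwargs):
    """Try loader twice; return None (after a warning) if both attempts fail."""
    try:
        try:
            return loader(**first_kwargs)
        except ValueError:
            # Retry with different arguments. A failure here now reaches
            # the outer handler instead of propagating to the caller.
            return loader(**retry_kwargs)
    except Exception as e:
        logger.warning("Failed to load processor: %s.", e)
        return None
```

With this shape, a non-`ValueError` failure on the first call and any failure on the retry both end in the same logged warning and a `None` processor.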

Comment thread src/llamafactory/data/mm_plugin.py Outdated
@@ -175,16 +175,31 @@ def _validate_input(
)

if self.image_token is not None and processor is None:

Contributor

medium

If the model/template is video-only or audio-only (where self.image_token is None but self.video_token or self.audio_token is set) and the processor fails to load, this check will be skipped. The user will then receive a more specific (and potentially misleading) error message like Video processor was not found or Audio feature extractor was not found instead of the comprehensive diagnostic message for the missing processor.

Checking if any of the multi-modal tokens are expected when processor is None provides a much more robust and helpful diagnostic experience.

Suggested change
-        if self.image_token is not None and processor is None:
+        if (self.image_token is not None or self.video_token is not None or self.audio_token is not None) and processor is None:

Author

Good catch. Fixed in e19bce0 by checking all three modality tokens (image, video, audio). Video-only/audio-only templates with a missing processor will now surface the comprehensive diagnostic instead of the more specific sub-processor error. Thanks for the review.
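
A stand-alone sketch of the widened guard, assuming a plugin object that exposes the three token attributes named in the review. `PluginTokens`, `expects_multimodal`, and `validate_processor` are hypothetical stand-ins, and the error text is abbreviated rather than copied from the PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PluginTokens:
    """Stand-in for the token attributes of a multi-modal plugin."""
    image_token: Optional[str] = None
    video_token: Optional[str] = None
    audio_token: Optional[str] = None


def expects_multimodal(tokens: PluginTokens) -> bool:
    """True if the template declares any image, video, or audio token."""
    declared = (tokens.image_token, tokens.video_token, tokens.audio_token)
    return any(token is not None for token in declared)


def validate_processor(tokens: PluginTokens, processor) -> None:
    """Raise the general diagnostic when any modality needs a processor."""
    if expects_multimodal(tokens) and processor is None:
        raise ValueError("Processor was not found. (abbreviated diagnostic)")
```

A video-only template with `processor=None` now hits this guard first, while a text-only template with no tokens set passes through unchanged.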

…ates

When a multi-modal template (e.g. qwen3_5, qwen3_vl, glm4v) is selected but
AutoProcessor.from_pretrained fails or returns a non-Processor instance, the
processor is silently set to None and _validate_input later raises a generic
"Processor was not found, please check and update your model file." error.

Users have repeatedly reported this with no clue what to fix
(see hiyouga#10539, hiyouga#10447, hiyouga#10193, hiyouga#9385, hiyouga#9182, hiyouga#8780).

This change:
- Promotes the silent processor-load failure log from info to warning so
  users actually see the underlying error.
- Adds the dropped processor's class name to the existing info log when a
  loaded processor is dropped because its name does not contain "Processor"
  (kept at info to avoid noise for legitimate text-only loads of repos that
  also expose multi-modal files).
- Replaces the four "please check and update your model file" errors in
  mm_plugin._validate_input with actionable guidance: verify model files
  are complete, upgrade transformers, or use a text-only template.

No behavior change on the success path; only diagnostics in the failure path.
@gaurav0107 gaurav0107 force-pushed the fix/10539-qwen3-5-finetune-processor-was-not-found branch from 2881d87 to 9bac771 June 6, 2026 13:34
@gaurav0107 gaurav0107 marked this pull request as ready for review June 6, 2026 13:34
- loader.py: nest the inner ValueError retry inside the outer Exception
  handler so a failure on the second AutoProcessor.from_pretrained call
  is caught and logged instead of propagating.
- mm_plugin.py: include video_token and audio_token in the "processor is
  None" guard so video-only/audio-only templates also surface the
  comprehensive diagnostic instead of a misleading sub-processor error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

llama-factory 0.9.5: Qwen3.5 fine-tuning raises ValueError: Processor was not found, please check and update your model file.
