Fix: Replace hardcoded local path for CLIP text model with Hugging Face identifier #203

Open
uttkxrrsh wants to merge 1 commit into Vchitect:master from uttkxrrsh:patch-1

Description

This PR addresses an issue in the YOLO-World XL configuration file (yolo_world_v2_xl_vlpan_bn_2e-3_100e_4x8gpus_obj365v1_goldg_train_lvis_minival.py) where the text_model_name was hardcoded to a local absolute path (/mnt/petrelfs/...).

The hardcoded path breaks the configuration for anyone who clones the repository and runs it on a local machine or a different compute cluster: the path does not exist there, so loading fails with a FileNotFoundError.

Changes Made

  • Restored 'openai/clip-vit-base-patch32' as the default text_model_name. The Hugging Face transformers library can then fetch and cache the model weights automatically, so the configuration works out of the box for new users.
  • Commented out the local paths. Users on isolated compute nodes (e.g., jobs launched via Slurm) can still override the variable in their own setup to point to a pre-downloaded checkpoint.
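
The override described above could be sketched roughly as follows. This is a standalone illustration and not part of this PR: the environment variable name `YOLO_WORLD_TEXT_MODEL` and the helper `resolve_text_model_name` are hypothetical, and whether such a helper can live directly inside the config file depends on how the config loader treats function definitions.

```python
import os

# Hugging Face Hub identifier restored as the portable default in this PR.
DEFAULT_TEXT_MODEL = "openai/clip-vit-base-patch32"

# Hypothetical environment variable name, chosen for illustration only.
TEXT_MODEL_ENV_VAR = "YOLO_WORLD_TEXT_MODEL"


def resolve_text_model_name(default=DEFAULT_TEXT_MODEL, env_var=TEXT_MODEL_ENV_VAR):
    """Return a configured local checkpoint directory, or the Hub identifier."""
    override = os.environ.get(env_var, "").strip()
    if not override:
        # No override set: fall back to the identifier that transformers
        # can download and cache on its own.
        return default
    if not os.path.isdir(override):
        # Fail early with a message that names the offending setting.
        raise FileNotFoundError(
            f"{env_var} points to a missing directory: {override}"
        )
    return override


text_model_name = resolve_text_model_name()
```

On a node without network access, a user would export the variable to a directory holding the pre-downloaded checkpoint before launching the job. Everyone else gets the Hub identifier with no extra setup.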

Testing

  • Verified that setting text_model_name = 'openai/clip-vit-base-patch32' initializes HuggingCLIPLanguageBackbone correctly without a pre-existing local directory.
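
A check along these lines can be reproduced outside the training pipeline with a small smoke test. The sketch below is an approximation, not the code used for this PR: it assumes the backbone loads its tokenizer and text encoder through the transformers `from_pretrained` API, and the function name `smoke_test_text_model` and the sample prompts are made up for illustration.

```python
def smoke_test_text_model(model_name, prompts=("person", "dog")):
    """Load a CLIP text encoder by name and return its embedding shape."""
    # Imported lazily so that importing this module does not require
    # transformers to be installed.
    from transformers import AutoTokenizer, CLIPTextModelWithProjection

    # With a Hub identifier, from_pretrained downloads the files on first
    # use and reads them from the local cache afterwards.
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = CLIPTextModelWithProjection.from_pretrained(model_name)

    # Encode a couple of class prompts and inspect the projected embeddings.
    inputs = tokenizer(list(prompts), padding=True, return_tensors="pt")
    outputs = model(**inputs)
    return tuple(outputs.text_embeds.shape)
```

Calling `smoke_test_text_model('openai/clip-vit-base-patch32')` on a machine with network access and both transformers and PyTorch installed should complete without any local checkpoint directory. The first dimension of the returned shape equals the number of prompts.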
