SigLIP training/fine-tune #2383
Replies: 1 comment 6 replies
@alexisdrakopoulos I'm biased, but I'd typically check this repo out locally (I have many different instances of it) and hack away. It's set up so that if you're in the root of the repo you can run the scripts and they'll reference the timm module in the local path. You can still do a [...]

Transforms and data loading are in https://github.com/huggingface/pytorch-image-models/tree/main/timm/data ... aug pipelines are built in transforms_factory.py, and you can add to or change the stack, override it with your own, switch to an albumentations stack, etc.

That said, for SigLIP fine-tuning you can get pretty darn good results using the script as is to fine-tune on existing datasets on the HF hub, etc. Without doing really any special augs (just the default ImageNet base), you can get a pretty good fine-tune with these models because they are so strong.

There are a lot of other similarly capable encoders. In the past week I just added the aimv2 and pali2 encoders (not in a pip release yet); there are also the pali 1 encoders, the siglip models, and quite a number of CLIP encoders trained on different datasets.
Hi,
It's been a while since I've worked on CV problems. I saw some trainer scripts, but they're all really high level. Do you have any advice or lower-level Python scripts for training SigLIP?
I'd like control over data-loading/augmentations.