
BIT Image processor support. #149

Closed
narendra9079 opened this issue Feb 27, 2025 · 3 comments · Fixed by #151
Comments

@narendra9079
Contributor

narendra9079 commented Feb 27, 2025

I want to use the https://huggingface.co/Xenova/dinov2-small model with fastembed, but it is not supported: the load_preprocessor function only handles the CLIPImageProcessor and ConvNextFeatureExtractor image processors, while this DINO model uses BitImageProcessor, so I was getting an error.

Are there any plans to support BitImageProcessor in fastembed? The dinov2 model is lightweight and its accuracy is also good.

Also, could you explain why BitImageProcessor is not currently supported?

@narendra9079
Contributor Author

```rust
"BitImageProcessor" => {
    if config["do_convert_rgb"].as_bool().unwrap_or(false) {
        transformers.push(Box::new(ConvertToRGB));
    }

    if config["do_resize"].as_bool().unwrap_or(false) {
        let size = config["size"].clone();
        let shortest_edge = size["shortest_edge"].as_u64();
        let (height, width) = (size["height"].as_u64(), size["width"].as_u64());

        if let Some(shortest_edge) = shortest_edge {
            let size = (shortest_edge as u32, shortest_edge as u32);
            transformers.push(Box::new(Resize {
                size,
                resample: FilterType::CatmullRom,
            }));
        } else if let (Some(height), Some(width)) = (height, width) {
            let size = (height as u32, width as u32);
            transformers.push(Box::new(Resize {
                size,
                resample: FilterType::CatmullRom,
            }));
        } else {
            return Err(anyhow!(
                "Size must contain either 'shortest_edge' or 'height' and 'width'."
            ));
        }
    }

    if config["do_center_crop"].as_bool().unwrap_or(false) {
        let crop_size = config["crop_size"].clone();
        let (height, width) = if crop_size.is_u64() {
            let size = crop_size.as_u64().unwrap() as u32;
            (size, size)
        } else if crop_size.is_object() {
            (
                crop_size["height"]
                    .as_u64()
                    .map(|height| height as u32)
                    .ok_or(anyhow!("crop_size height must be contained"))?,
                crop_size["width"]
                    .as_u64()
                    .map(|width| width as u32)
                    .ok_or(anyhow!("crop_size width must be contained"))?,
            )
        } else {
            return Err(anyhow!("Invalid crop size: {:?}", crop_size));
        };
        transformers.push(Box::new(CenterCrop {
            size: (width, height),
        }));
    }

    transformers.push(Box::new(PILToNDarray));

    if config["do_rescale"].as_bool().unwrap_or(true) {
        let rescale_factor = config["rescale_factor"].as_f64().unwrap_or(1.0f64 / 255.0);
        transformers.push(Box::new(Rescale {
            scale: rescale_factor as f32,
        }));
    }

    if config["do_normalize"].as_bool().unwrap_or(false) {
        let mean = config["image_mean"]
            .as_array()
            .ok_or(anyhow!("image_mean must be contained"))?
            .iter()
            .map(|value| {
                value
                    .as_f64()
                    .map(|num| num as f32)
                    .ok_or(anyhow!("image_mean must be float"))
            })
            .collect::<Result<Vec<f32>>>()?;
        let std = config["image_std"]
            .as_array()
            .ok_or(anyhow!("image_std must be contained"))?
            .iter()
            .map(|value| {
                value
                    .as_f64()
                    .map(|num| num as f32)
                    .ok_or(anyhow!("image_std must be float"))
            })
            .collect::<Result<Vec<f32>>>()?;
        transformers.push(Box::new(Normalize { mean, std }));
    }
}
```

I added this code to the load_preprocessor function to support BitImageProcessor. With this change I am able to use the https://huggingface.co/Xenova/dinov2-small model. Can you please share your opinion on this?
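For reference, the resize-size resolution in the branch above can be exercised in isolation. Below is a standalone sketch (the `resolve_resize_size` helper is hypothetical, not part of fastembed) mirroring the precedence in the match arm: `shortest_edge` wins if present, otherwise both `height` and `width` are required, otherwise it is an error:

```rust
// Hypothetical helper mirroring the size resolution in the
// BitImageProcessor match arm: `shortest_edge` takes precedence;
// otherwise both `height` and `width` must be present.
fn resolve_resize_size(
    shortest_edge: Option<u64>,
    height: Option<u64>,
    width: Option<u64>,
) -> Result<(u32, u32), String> {
    if let Some(edge) = shortest_edge {
        // Square resize from the shortest edge (dinov2-style configs).
        Ok((edge as u32, edge as u32))
    } else if let (Some(h), Some(w)) = (height, width) {
        // Explicit height/width pair (CLIP-style configs).
        Ok((h as u32, w as u32))
    } else {
        Err("Size must contain either 'shortest_edge' or 'height' and 'width'.".to_string())
    }
}

fn main() {
    // shortest_edge present: it wins, producing a square size.
    assert_eq!(resolve_resize_size(Some(256), None, None), Ok((256, 256)));
    // No shortest_edge, but height and width given.
    assert_eq!(resolve_resize_size(None, Some(224), Some(224)), Ok((224, 224)));
    // Neither form present: an error, as in the match arm.
    assert!(resolve_resize_size(None, None, None).is_err());
    println!("ok");
}
```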

@Anush008
Owner

Anush008 commented Mar 4, 2025

Cool. Please contribute if you're interested.

@narendra9079
Contributor Author

Can you please check whether this is correct? When I tested it, it was working. I made these changes according to the processor_config file of the dino model, similar to how the CLIP image processor config is handled in fastembed.

@Anush008 Anush008 linked a pull request Mar 11, 2025 that will close this issue