# UI Screenshot Classification by Fine-Tuning CLIP

This project implements a UI screenshot classification system by fine-tuning OpenAI's CLIP (Contrastive Language-Image Pre-Training) model. It classifies UI screenshots into 28 app categories, such as Sports, Travel, and Dating.
## Project Structure

- `.gitattributes`: Git attributes file for managing large files.
- `categories.pkl`: Pickle file containing the list of unique class names.
- `clip_finetuned.pth`: The fine-tuned CLIP model weights (large file, managed with Git LFS).
- `finetuning-clip-notebook.py`: Script for fine-tuning the CLIP model.
- `README.md`: This file, containing the project documentation.
- `streamlit_classification.py`: Streamlit app for deploying the classification model.
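Because `clip_finetuned.pth` is tracked with Git LFS, a fresh clone may contain only a small pointer file instead of the actual weights. The standard Git LFS commands fetch the real file:

```bash
git lfs install
git lfs pull
```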
## Features

- Fine-tuned CLIP model for UI screenshot classification
- 28 different app categories for classification
- Streamlit web application for easy image upload and classification
- Visualization of top 5 predictions with probabilities
## Installation

1. Clone this repository:

   ```bash
   git clone https://github.com/yourusername/UI-Screenshot-Classification-by-fine-tuning-CLIP.git
   cd UI-Screenshot-Classification-by-fine-tuning-CLIP
   ```

2. Install the required dependencies:

   ```bash
   pip install -r requirements.txt
   ```
Note: A `requirements.txt` file is not included in the repository yet; create one listing all the necessary dependencies.
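As a plausible starting point, assuming the scripts rely on PyTorch, OpenAI's CLIP package, and Streamlit (pin versions as appropriate for your environment):

```text
torch
torchvision
ftfy
regex
tqdm
git+https://github.com/openai/CLIP.git
streamlit
Pillow
```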
## Fine-Tuning the Model

To fine-tune the CLIP model on your own dataset:

1. Prepare your dataset in the format expected by the `finetuning-clip-notebook.py` script.
2. Run the fine-tuning script:

   ```bash
   python finetuning-clip-notebook.py
   ```

The script fine-tunes the model and saves the weights to `clip_finetuned.pth`.
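For orientation, here is a minimal sketch of what such a fine-tuning loop can look like with the OpenAI `clip` package. The model variant, prompt template, dataset loader, and hyperparameters below are assumptions for illustration, not necessarily what `finetuning-clip-notebook.py` does:

```python
import pickle

import clip  # pip install git+https://github.com/openai/CLIP.git
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the base model; ViT-B/32 is an assumption about the variant used.
model, preprocess = clip.load("ViT-B/32", device=device, jit=False)
model.float()  # train in fp32 to avoid fp16 gradient issues

with open("categories.pkl", "rb") as f:
    categories = pickle.load(f)  # list of the 28 class names

# Prompt template is an assumption; the actual script may use a different one.
text_tokens = clip.tokenize(
    [f"a screenshot of a {c} app" for c in categories]
).to(device)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-6)


def train_epoch(loader: DataLoader) -> None:
    """One pass over a loader yielding (preprocessed image batch, label indices)."""
    model.train()
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        image_feats = F.normalize(model.encode_image(images), dim=-1)
        text_feats = F.normalize(model.encode_text(text_tokens), dim=-1)
        # Score each image against all 28 category prompts.
        logits = model.logit_scale.exp() * image_feats @ text_feats.t()
        loss = F.cross_entropy(logits, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()


# Build `loader` from your own dataset (left out here), then:
# train_epoch(loader)
# torch.save(model.state_dict(), "clip_finetuned.pth")
```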
## Running the App

To run the classification app locally:

1. Ensure you have the fine-tuned model (`clip_finetuned.pth`) in the project directory.
2. Run the Streamlit app:

   ```bash
   streamlit run streamlit_classification.py
   ```

3. Open your web browser and navigate to the provided local URL (typically `http://localhost:8501`).
4. Upload a UI screenshot image to get the classification results.
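The core of such an app can be sketched as follows. This assumes the weights were saved as a state dict and reuses the hypothetical prompt template from the fine-tuning sketch; it is not a copy of `streamlit_classification.py`:

```python
import pickle

import clip
import streamlit as st
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"


@st.cache_resource
def load_model():
    # Assumes clip_finetuned.pth holds a state dict for ViT-B/32.
    model, preprocess = clip.load("ViT-B/32", device=device, jit=False)
    model.load_state_dict(torch.load("clip_finetuned.pth", map_location=device))
    model.eval()
    with open("categories.pkl", "rb") as f:
        categories = pickle.load(f)
    return model, preprocess, categories


model, preprocess, categories = load_model()
text_tokens = clip.tokenize(
    [f"a screenshot of a {c} app" for c in categories]
).to(device)

uploaded = st.file_uploader("Upload a UI screenshot", type=["png", "jpg", "jpeg"])
if uploaded:
    image = Image.open(uploaded).convert("RGB")
    st.image(image, caption="Uploaded screenshot")
    with torch.no_grad():
        image_input = preprocess(image).unsqueeze(0).to(device)
        image_feats = model.encode_image(image_input)
        text_feats = model.encode_text(text_tokens)
        image_feats /= image_feats.norm(dim=-1, keepdim=True)
        text_feats /= text_feats.norm(dim=-1, keepdim=True)
        probs = (model.logit_scale.exp() * image_feats @ text_feats.t()).softmax(dim=-1)
    # Show the top 5 predictions with their probabilities.
    top_probs, top_idx = probs[0].topk(5)
    for p, i in zip(top_probs.tolist(), top_idx.tolist()):
        st.write(f"{categories[i]}: {p:.1%}")
```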
## Live Demo

You can try out the UI screenshot classification tool online at: [Placeholder for Streamlit Share link]
## Performance

The current model achieves:

- Accuracy: 65%
- F1 score: 62%

Note: These metrics represent the initial performance; further hyperparameter tuning and experimentation could improve these results.
## Limitations

- The current model's performance (65% accuracy, 62% F1 score) leaves room for improvement.
- Due to time and computational constraints, extensive hyperparameter tuning has not been performed.
- The model was trained on a subset of 50,000 samples from the RICO-SCA dataset; training on the full dataset might yield better results.
## Future Work

- Experiment with different CLIP model variants (e.g., ViT-L/14).
- Perform more extensive hyperparameter tuning.
- Train on the full RICO-SCA dataset or incorporate additional UI screenshot datasets.
- Implement data augmentation techniques to improve model generalization (see the sketch after this list).
- Explore ensemble methods or other advanced techniques to boost performance.
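As a concrete illustration of the augmentation idea, a pipeline along these lines could replace CLIP's default preprocessing at training time. The specific transforms and parameters are illustrative assumptions; the `Normalize` values are CLIP's standard image statistics:

```python
from torchvision import transforms

# A possible training-time augmentation pipeline for UI screenshots.
augment = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=(0.48145466, 0.4578275, 0.40821073),
        std=(0.26862954, 0.26130258, 0.27577711),
    ),
])
```

Horizontal flips are deliberately omitted here, since mirrored UI text would produce unrealistic screenshots.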
## Acknowledgments

- This project uses the OpenAI CLIP model.
- The RICO-SCA dataset was used for training and evaluation.
- Streamlit was used to create the web application interface.