AI-Powered Image Captioning App

This project is an AI-powered image captioning app that allows users to upload images and generates descriptive captions using Salesforce’s BLIP model. It integrates modern web development tools like ReactJS, Flask, and Hugging Face’s AI models to deliver an interactive and responsive experience.

Features

Image Upload: Upload images directly from your device.
AI-Generated Captions: Uses BLIP to produce human-like captions.
Responsive UI: Built with ReactJS for a clean and modern interface.

Tech Stack

Frontend

ReactJS with Vite for fast development and HMR.
Axios for API communication.

Backend

Flask for serving the API.
Hugging Face’s Transformers for AI model inference.

AI Model

BLIP (Bootstrapped Language-Image Pretraining) from Salesforce.

Installation and Setup

Clone the Repository

Start by cloning the repository from GitHub:

git clone git@github.com:allanninal/image-captioning-app.git
cd image-captioning-app

Backend Setup

Navigate to the backend folder:
```
cd backend
```

Create and activate a virtual environment:

python -m venv venv
source venv/bin/activate  # Linux/Mac
venv\Scripts\activate     # Windows

Install the dependencies from requirements.txt:
```
pip install -r requirements.txt
```
Run the Flask server:
```
python app.py
```

Frontend Setup

Navigate to the frontend folder:
```
cd frontend
```
Install dependencies:
```
npm install
```
Run the React development server:
```
npm run dev
```

Usage

Upload an Image: Use the app interface to upload an image.
Generate Captions: Click the "Generate Caption" button to see AI-generated captions for the uploaded image.
Explore Outputs: View descriptive and context-aware captions directly on the app.

How It Works

The React frontend allows users to upload images and sends them to the Flask backend.
Flask processes the image using Hugging Face's BLIP model and returns a caption.
The caption is displayed in the frontend, providing users with a seamless experience.

What’s Next?

Add drag-and-drop functionality for image uploads.
Generate multiple captions for a single image.
Enhance styling with Tailwind CSS or Material-UI.
Deploy the app to platforms like AWS, Heroku, or Vercel.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Support

If you find this project helpful, consider supporting me on Ko-fi:
ko-fi.com/allanninal

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!