Welcome to my comprehensive portfolio website featuring an interactive machine learning model collection and professional showcase.
This is a full-stack React application that serves as both my personal portfolio and a demonstration platform for various machine learning and AI capabilities. The site showcases my work as a Machine Learning Engineer while providing interactive demos of cutting-edge AI models.
The chat functionality has been re-architected to use a powerful, server-hosted Large Language Model (LLM) powered by a Python FastAPI backend. This replaces the previous implementation that used Transformers.js for client-side inference.
- **FastAPI Backend**: A new backend service built with FastAPI hosts a Hugging Face text-generation model (e.g., `mistralai/Mistral-7B-Instruct-v0.2`).
- **GPU-Powered**: The backend is designed for deployment on a GPU server for high-performance inference, with Docker support included.
- **Streaming API**: The backend exposes a `/chat` endpoint with streaming support, allowing for real-time, token-by-token responses in the UI.
- **Decoupled Frontend**: The React frontend now communicates with the FastAPI backend, making it a lightweight and highly scalable client application.
This project now consists of two main parts: the React frontend and the FastAPI backend.
The backend runs a Hugging Face model on a Python server. It requires a machine with a GPU for optimal performance.
Prerequisites:
- Python 3.10+
- An NVIDIA GPU with CUDA drivers installed.
Instructions:
1. Navigate to the backend directory:

   ```bash
   cd backend-fastapi
   ```

2. Create and activate a virtual environment:

   ```bash
   python -m venv venv
   source venv/bin/activate
   ```

3. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

4. Configure the model: create a `.env` file in the `backend-fastapi` directory and specify the Hugging Face model you want to use:

   ```
   MODEL_ID="mistralai/Mistral-7B-Instruct-v0.2"
   ```

5. Run the backend server:

   ```bash
   uvicorn main:app --host 0.0.0.0 --port 8000
   ```

   The server will download the model on the first run, which may take some time.
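How the backend resolves `MODEL_ID` can be sketched as follows. This is an assumption about the configuration logic (the real `main.py` may use `python-dotenv` instead); the point is the precedence: environment variable first, then the `.env` file, then a default.

```python
# Sketch of MODEL_ID resolution (hypothetical helper, not the actual main.py):
# environment variable > .env file > built-in default.
import os

DEFAULT_MODEL = "mistralai/Mistral-7B-Instruct-v0.2"

def load_model_id(env_path: str = ".env") -> str:
    """Return the Hugging Face model ID to serve."""
    if "MODEL_ID" in os.environ:
        return os.environ["MODEL_ID"]
    try:
        with open(env_path) as f:
            for line in f:
                line = line.strip()
                if line.startswith("MODEL_ID="):
                    # Strip optional surrounding quotes, as in the .env example above.
                    return line.split("=", 1)[1].strip().strip('"')
    except FileNotFoundError:
        pass
    return DEFAULT_MODEL
```

Keeping the default in code means the server still starts when no `.env` is present, while the `.env` file lets you swap models without touching the source.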
1. Navigate to the project root directory:

   ```bash
   cd /path/to/MLAI-Stack
   ```

2. Install dependencies:

   ```bash
   npm install
   ```

3. Run the frontend development server:

   ```bash
   npm run dev
   ```

The frontend will be available at `http://localhost:5173` and will connect to the backend running on port 8000.
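On the client side, consuming the stream amounts to decoding byte chunks as they arrive (the React frontend does this with `fetch` and a `ReadableStream`; this Python helper is an illustrative stand-in). One subtlety worth showing: a multi-byte UTF-8 character can be split across chunk boundaries, so an incremental decoder is needed.

```python
# Sketch of incremental decoding for a streamed response body. The chunk
# source is an assumption (e.g. requests' resp.iter_content(chunk_size=None));
# the decoder correctly handles UTF-8 characters split across chunks.
import codecs
from typing import Iterable, Iterator

def decode_stream(chunks: Iterable[bytes]) -> Iterator[str]:
    """Yield text pieces from a stream of UTF-8 byte chunks."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            yield text
    # Flush any bytes still buffered at end of stream.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail
```

Each yielded piece would be appended to the chat transcript, giving the token-by-token effect described above.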
The FastAPI backend can be easily deployed as a Docker container, which is the recommended approach for production.
Prerequisites:
- Docker
- NVIDIA Container Toolkit (for GPU support)
Instructions:
1. Navigate to the backend directory:

   ```bash
   cd backend-fastapi
   ```

2. Build the Docker image:

   ```bash
   docker build -t llm-backend .
   ```

3. Run the Docker container with GPU access:

   ```bash
   docker run --gpus all -p 8000:8000 -v ./huggingface:/app/huggingface llm-backend
   ```

   - `--gpus all` provides the container with access to all available GPUs.
   - `-v ./huggingface:/app/huggingface` mounts a local directory into the container to cache the Hugging Face models, preventing re-downloads on container restarts.
Visit the live application to explore the interactive ML demos and portfolio content.
Carlos Gonzalez Rivera
- Email: cargonriv@pm.me
- LinkedIn: Connect with me
- Portfolio: Live demos and project showcase
This project demonstrates the intersection of machine learning research, software engineering, and user experience design. Each component is built with production-quality standards and serves as both a functional tool and a learning resource.