FlexAI Experiments

This repository provides a set of experiments designed for you to try out and explore FlexAI. They range from running your very first training job, to fine-tuning language, diffusion, and text-to-speech models with techniques like QLoRA and LoRA, to integrating FlexAI with other platforms such as experiment trackers.

Getting Started

Prerequisites

FlexAI CLI: Install the FlexAI CLI by following the steps shown in the Installing the FlexAI CLI guide.

The Experiments

The following table lists the experiments available in this repository. Each experiment walks you through a specific use case and includes the required code, along with detailed instructions for running it on FlexAI:

| No. | Section | Description |
|-----|---------|-------------|
| 1 | A Simple Training Job on FlexAI | Step-by-step guide to get your first Training Job on FlexAI running |
| 2 | A Simple Distributed Data Parallel (DDP) Training Job on FlexAI | Demonstrates that you only need to add two flags to start a DDP Training Job |
| 3 | Resuming a Training Job from a Checkpoint | Learn how to resume a Training Job from a previously saved checkpoint |
| 4 | Streaming Large Datasets During a Training Job | Train a model on a large dataset using streaming |
| 5 | Training Job & Experiment Tracking | Using Weights and Biases with FlexAI for experiment tracking |
| 6 | Fine-Tuning a Language Model with QLoRA | Fine-tune a causal language model efficiently using QLoRA |
| 7 | Fine-Tuning a Diffusion Model with LoRA | Fine-tune a diffusion model efficiently using LoRA |
| 8 | Fine-Tuning a Text-to-Speech Model | Fine-tune a text-to-speech (TTS) model |
| 9 | Fine-Tuning a Language Model using Flash Attention | Fine-tune a causal language model efficiently using the flash-attn package |
| 10 | Train and Serve a French LLM on FlexAI with LlamaFactory | Fine-tune Qwen2.5-7B on French data using LlamaFactory and deploy it as a production-ready inference endpoint |
| 11 | Train and Serve Language Models with Axolotl | Fine-tune language models on domain-specific data using the Axolotl framework with custom dataset configurations and FSDP |
| 12 | Reinforcement Learning Fine-Tuning with EasyR1 | Train language models with RL using the GRPO and DAPO algorithms for enhanced reasoning capabilities |
| 13 | GRPO Training on Vision-Language Models | Fine-tune vision-language models using GRPO reinforcement learning with HuggingFace TRL, LoRA, vLLM, and DeepSpeed ZeRO-3 |
| 14 | Language Model Evaluation with LM-Evaluation-Harness | Comprehensive evaluation of language models across 300+ tasks and benchmarks using the LM-Evaluation-Harness framework |
| 15 | RAG Application with LangChain and FlexAI Inference Endpoints | Interactive interface for users to ask questions based on provided documents using Retrieval-Augmented Generation |
| 16 | Speech-to-Text Application Using FlexAI Inference Endpoints | Interactive interface for recording audio messages and receiving transcriptions |
| 17 | Multi-Agent Orchestration with LangGraph | Build a multi-agent system where specialized AI agents work together under a central supervisor |
| 18 | Text-to-Image Inference with FlexAI Endpoints | Deploy and use Stable Diffusion 3.5 Large for high-quality image generation via FlexAI inference endpoints |
| 19 | Text-to-Audio Inference with FlexAI Endpoints | Deploy and use Stable Audio Open 1.0 for high-quality audio generation via FlexAI inference endpoints |
| 20 | Text-to-Speech Inference with FlexAI Endpoints | Deploy and use the Kokoro model for natural voice synthesis via FlexAI inference endpoints |
| 21 | Text-to-Video Inference with FlexAI Endpoints | Deploy and use Wan2.2-T2V-A14B for high-quality video generation via FlexAI inference endpoints |
| 22 | Object Detection and Computer Vision with Ultralytics YOLO11 | Train and deploy state-of-the-art object detection, segmentation, and pose estimation models using YOLO11 |

Keep in mind

Notes on HuggingFace Accelerate Integration

This repository includes experiments that utilize the HuggingFace Accelerate library for efficient training.

The FlexAI CLI simplifies running training scripts by automatically determining the appropriate execution method:

  • python: Used for single-accelerator training.
  • torchrun: Automatically used for multi-accelerator distributed training.
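For illustration, these are the two launch commands the CLI chooses between under the hood; the script name `train.py` and the accelerator count are placeholder assumptions, not FlexAI defaults:

```bash
# Single-accelerator job: plain Python interpreter
python train.py

# Multi-accelerator DDP job: torchrun spawns one process per accelerator
torchrun --nproc_per_node=8 train.py
```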

If you're accustomed to using the accelerate launch command, you can seamlessly run the same scripts on FlexAI without modification. Simply provide the script to FlexAI, and it will handle execution.

As highlighted in the Accelerate documentation, the accelerate launch command is optional: Accelerate's functionality is integrated directly into your script, which keeps it compatible with other launchers such as torchrun.
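To make this concrete, here is a minimal sketch of a launcher-agnostic training script built around the Accelerator class; the toy model, optimizer, and data are illustrative placeholders, not taken from this repository:

```python
# train.py (hypothetical name) -- runs unchanged under `python` or `torchrun`
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

accelerator = Accelerator()  # detects single- vs multi-process context at runtime

# Toy model, optimizer, and data standing in for a real training setup
model = torch.nn.Linear(16, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
dataset = TensorDataset(torch.randn(256, 16), torch.randn(256, 1))
dataloader = DataLoader(dataset, batch_size=32)

# prepare() places everything on the right device(s) and wraps the model for DDP when needed
model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

model.train()
for inputs, targets in dataloader:
    optimizer.zero_grad()
    loss = F.mse_loss(model(inputs), targets)
    accelerator.backward(loss)  # replaces loss.backward()
    optimizer.step()
```

Launched as `python train.py` the script trains on a single accelerator; launched as `torchrun --nproc_per_node=N train.py` the same file runs distributed data parallel training.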

Note

Unlike accelerate launch, torchrun does not use the YAML configuration file generated by accelerate config.

If your training setup relies on specific configurations from the YAML file, you may need to adjust your script to explicitly define these settings using the Accelerator class.

Defining these settings in code ensures consistent execution across environments while keeping your script flexible for different training setups.
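For example, here is a minimal sketch of passing such settings directly to the Accelerator constructor; the specific values (bf16 mixed precision, 4 gradient-accumulation steps) are illustrative assumptions rather than values required by FlexAI or this repository:

```python
from accelerate import Accelerator

# Settings that would otherwise come from the `accelerate config` YAML,
# declared in code so they also take effect when launching with torchrun.
accelerator = Accelerator(
    mixed_precision="bf16",          # e.g. mixed_precision: bf16 in the YAML
    gradient_accumulation_steps=4,   # e.g. gradient_accumulation_steps: 4 in the YAML
)
```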
