This repository provides a set of different experiments designed for you to try out and explore FlexAI. These experiments range from running your very first training job, to fine-tuning language, diffusion, and text-to-speech models using techniques like QLoRA and LoRA, as well as integrating FlexAI with other platforms, such as experiment trackers.
FlexAI CLI: Install the FlexAI CLI by following the Installing the FlexAI CLI guide.
The following table lists the experiments available in this repository. Each experiment is designed to walk you through a specific use case and contains its required code, as well as detailed instructions on how to run it on FlexAI:
| No. | Section | Description |
|---|---|---|
| 1 | A Simple Training Job on FlexAI | Step-by-step guide to get your first Training Job on FlexAI running |
| 2 | A Simple Distributed Data Parallel (DDP) Training Job on FlexAI | Demonstrates that you only need to add two flags to start a DDP Training Job |
| 3 | Resuming a Training Job from a Checkpoint | Learn how to resume a Training Job from a previously saved checkpoint |
| 4 | Streaming Large Datasets During a Training Job | Train a model on a large dataset using streaming |
| 5 | Training Job & Experiment Tracking | Using Weights and Biases with FlexAI for experiment tracking |
| 6 | Fine-Tuning a Language Model with QLoRA | Fine-tune a causal language model efficiently using QLoRA |
| 7 | Fine-Tuning a Diffusion Model with LoRA | Fine-tune a diffusion model efficiently using LoRA |
| 8 | Fine-Tuning a Text-to-Speech Model | Fine-tune a text-to-speech (TTS) model |
| 9 | Fine-Tuning a Language Model using Flash Attention | Fine-tune a causal language model efficiently using the flash-attn package |
| 10 | Train and Serve a French LLM on FlexAI with LlamaFactory | Fine-tune Qwen2.5-7B on French data using LlamaFactory and deploy as a production-ready inference endpoint |
| 11 | Train and Serve Language Models with Axolotl | Fine-tune language models on domain-specific data using Axolotl framework with custom dataset configurations and FSDP |
| 12 | Reinforcement Learning Fine-Tuning with EasyR1 | Train language models with RL using GRPO and DAPO algorithms for enhanced reasoning capabilities |
| 13 | GRPO Training on Vision-Language Models | Fine-tune vision-language models using GRPO reinforcement learning with HuggingFace TRL, LoRA, vLLM, and DeepSpeed ZeRO-3 |
| 14 | Language Model Evaluation with LM-Evaluation-Harness | Comprehensive evaluation of language models across 300+ tasks and benchmarks using the LM-Evaluation-Harness framework |
| 15 | RAG Application with LangChain and FlexAI Inference Endpoints | Interactive interface for users to ask questions based on provided documents using Retrieval-Augmented Generation |
| 16 | Speech-to-Text Application Using FlexAI Inference Endpoints | Interactive interface for recording audio messages and receiving transcriptions |
| 17 | Multi-Agent Orchestration with LangGraph | Build a multi-agent system where specialized AI agents work together under a central supervisor |
| 18 | Text-to-Image Inference with FlexAI Endpoints | Deploy and use Stable Diffusion 3.5 Large for high-quality image generation via FlexAI inference endpoints |
| 19 | Text-to-Audio Inference with FlexAI Endpoints | Deploy and use Stable Audio Open 1.0 for high-quality audio generation via FlexAI inference endpoints |
| 20 | Text-to-Speech Inference with FlexAI Endpoints | Deploy and use the Kokoro model for natural voice synthesis via FlexAI inference endpoints |
| 21 | Text-to-Video Inference with FlexAI Endpoints | Deploy and use Wan2.2-T2V-A14B for high-quality video generation via FlexAI inference endpoints |
| 22 | Object Detection and Computer Vision with Ultralytics YOLO11 | Train and deploy state-of-the-art object detection, segmentation, and pose estimation models using YOLO11 |
This repository includes experiments that utilize the HuggingFace Accelerate library for efficient training.
The FlexAI CLI simplifies running training scripts by automatically determining the appropriate execution method:
- `python`: used for single-accelerator training.
- `torchrun`: automatically used for multi-accelerator distributed training.
If you're accustomed to using the accelerate launch command, you can seamlessly run the same scripts on FlexAI without modification. Simply provide the script to FlexAI, and it will handle execution.
As highlighted in the Accelerate documentation, the accelerate launch command is optional: Accelerate's functionality is integrated directly into your script, making it compatible with other launchers like torchrun.
Note
Unlike accelerate launch, torchrun does not use the YAML configuration file generated by accelerate config.
If your training setup relies on specific configurations from the YAML file, you may need to adjust your script to explicitly define these settings using the Accelerator class.
By doing so, you ensure seamless execution across different environments while maintaining flexibility for various training setups.