Awesome LLMOps
An awesome & curated list of the best LLMOps tools for developers.
Contribute
Contributions are most welcome. Please adhere to the contribution guidelines.
- Table of Contents
- Model
- Serving
- LLMOps
- Search
- Code AI
- Training
- Data
- Large Scale Deployment
- Performance
- AutoML
- Optimizations
- Federated ML
- Awesome Lists
- Alpaca
- Code and documentation to train Stanford's Alpaca models, and generate the data.
- BELLE
- A 7B large language model fine-tuned on a 34B Chinese character corpus, based on LLaMA and Alpaca.
- Bloom
- BigScience Large Open-science Open-access Multilingual Language Model
- dolly
- Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform
- Falcon 40B
- Falcon-40B-Instruct is a 40B-parameter causal decoder-only model built by TII, based on Falcon-40B and fine-tuned on a mixture of Baize. It is made available under the Apache 2.0 license.
- FastChat (Vicuna)
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and FastChat-T5.
- GLM-6B (ChatGLM)
- An open bilingual pre-trained model based on the GLM architecture; with quantization, it can run on consumer-level GPUs.
- GLM-130B (ChatGLM)
- An Open Bilingual Pre-Trained Model (ICLR 2023)
- GPT-NeoX
- An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
- Luotuo
- A Chinese LLM based on LLaMA and fine-tuned with Stanford Alpaca, Alpaca LoRA, and Japanese-Alpaca-LoRA.
- StableLM
- StableLM: Stability AI Language Models
- disco-diffusion
- A frankensteinian amalgamation of notebooks, models and techniques for the generation of AI Art and Animations.
- midjourney
- Midjourney is an independent research lab exploring new mediums of thought and expanding the imaginative powers of the human species.
- segment-anything (SAM)
- Produces high-quality object masks from input prompts such as points or boxes, and can be used to generate masks for all objects in an image.
- stable-diffusion
- A latent text-to-image diffusion model
- stable-diffusion v2
- High-Resolution Image Synthesis with Latent Diffusion Models
- bark
- Bark is a transformer-based text-to-audio model created by Suno. Bark can generate highly realistic, multilingual speech as well as other audio - including music, background noise and simple sound effects.
- whisper
- Robust Speech Recognition via Large-Scale Weak Supervision
- Alpaca-LoRA-Serve
- Alpaca-LoRA as Chatbot service
- DeepSpeed-MII
- MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
- FlexGen
- Running large language models on a single GPU for throughput-oriented scenarios.
- Flowise
- Drag & drop UI to build your customized LLM flow using LangchainJS.
- llama.cpp
- Port of Facebook's LLaMA model in C/C++
- Modelz-LLM
- OpenAI compatible API for LLMs and embeddings (LLaMA, Vicuna, ChatGLM and many others)
- whisper.cpp
- Port of OpenAI's Whisper model in C/C++
- x-stable-diffusion
- Real-time inference for Stable Diffusion - 0.88s latency. Covers AITemplate, nvFuser, TensorRT, FlashAttention.
- BentoML
- The Unified Model Serving Framework
- Mosec
- A machine learning model serving framework with dynamic batching and pipelined stages, provides an easy-to-use Python interface.
- TFServing
- A flexible, high-performance serving system for machine learning models.
- Torchserve
- Serve, optimize and scale PyTorch models in production
- Triton Server (TRTIS)
- The Triton Inference Server provides an optimized cloud and edge inferencing solution.
- langchain-serve
- Serverless LLM apps on Production with Jina AI Cloud
- Deepchecks
- Tests for Continuous Validation of ML Models & Data. Deepchecks is a Python package for comprehensively validating your machine learning models and data with minimal effort.
- Evidently
- Evaluate and monitor ML models from validation to production.
- Great Expectations
- Always know what to expect from your data.
- whylogs
- The open standard for data logging
- Arize-Phoenix
- ML observability for LLMs, vision, language, and tabular models.
- deeplake
- Stream large multimodal datasets to achieve near 100% GPU utilization. Query, visualize, and version control data. Access data without needing to recompute embeddings for model fine-tuning.
- GPTCache
- Creates a semantic cache to store responses from LLM queries.
- Haystack
- Quickly compose applications with LLM Agents, semantic search, question-answering and more.
- langchain
- Building applications with LLMs through composability
- LangFlow
- An effortless way to experiment and prototype LangChain flows with drag-and-drop components and a chat interface.
- LlamaIndex
- Provides a central interface to connect your LLMs with external data.
- promptfoo
- Open-source tool for testing & evaluating prompt quality. Create test cases, automatically check output quality and catch regressions, and reduce evaluation cost.
- Weights & Biases (Prompts)
- A suite of LLMOps tools within the developer-first W&B MLOps platform. Use W&B Prompts for visualizing and inspecting LLM execution flow, tracking inputs and outputs, viewing intermediate results, and securely managing prompts and LLM chain configurations.
- xTuring
- Build and control your personal LLMs with fast and efficient fine-tuning.
- ZenML
- Open-source framework for orchestrating, experimenting and deploying production-grade ML solutions, with built-in langchain & llama_index integrations.
- Dify
- An open-source framework that aims to enable developers (and even non-developers) to quickly build useful applications based on large language models, ensuring they are visual, operable, and improvable.
- AquilaDB
- An easy to use Neural Search Engine. Index latent vectors along with JSON metadata and do efficient k-NN search.
- Chroma
- the open source embedding database
- Jina
- Build multimodal AI services via cloud native technologies · Neural Search · Generative AI · Cloud Native
- Marqo
- Tensor search for humans.
- Milvus
- Vector database for scalable similarity search and AI applications.
- Pinecone
- The Pinecone vector database makes it easy to build high-performance vector search applications. Developer-friendly, fully managed, and easily scalable without infrastructure hassles.
- pgvector
- Open-source vector similarity search for Postgres.
- pgvecto.rs
- Vector database plugin for Postgres, written in Rust, specifically designed for LLMs.
- Qdrant
- Vector Search Engine and Database for the next generation of AI applications. Also available in the cloud
- txtai
- Build AI-powered semantic search applications
- Vald
- A Highly Scalable Distributed Vector Search Engine
- Vearch
- A distributed system for embedding-based vector retrieval
- Weaviate
- Weaviate is an open source vector search engine that stores both objects and vectors, allowing for combining vector search with structured filtering with the fault-tolerance and scalability of a cloud-native database, all accessible through GraphQL, REST, and various language clients.
- CodeGen
- CodeGen is an open-source model for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.
- CodeT5
- Open Code LLMs for Code Understanding and Generation.
- fauxpilot
- An open-source alternative to GitHub Copilot server
- tabby
- Self-hosted AI coding assistant. An open-source, on-prem alternative to GitHub Copilot.
- code server
- Run VS Code on any machine anywhere and access it in the browser.
- conda
- OS-agnostic, system-level binary package manager and ecosystem.
- Docker
- Moby is an open-source project created by Docker to enable and accelerate software containerization.
- envd
- 🏕️ Reproducible development environment for AI/ML.
- Jupyter Notebooks
- The Jupyter notebook is a web-based notebook environment for interactive computing.
- Kurtosis
- A build, packaging, and run system for ephemeral multi-container environments.
- alpaca-lora
- Instruct-tune LLaMA on consumer hardware
- LMFlow
- An Extensible Toolkit for Finetuning and Inference of Large Foundation Models
- Lora
- Using Low-rank adaptation to quickly fine-tune diffusion models.
- peft
- State-of-the-art Parameter-Efficient Fine-Tuning.
- p-tuning-v2
- An optimized prompt tuning strategy achieving comparable performance to fine-tuning on small/medium-sized models and sequence tagging challenges. (ACL 2022)
- QLoRA
- Efficient finetuning approach that reduces memory usage enough to finetune a 65B parameter model on a single 48GB GPU while preserving full 16-bit finetuning task performance.
- Accelerate
- 🚀 A simple way to train and use PyTorch models with multi-GPU, TPU, mixed-precision.
- Apache MXNet
- Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler.
- Caffe
- A fast open framework for deep learning.
- ColossalAI
- An integrated large-scale model training system with efficient parallelization techniques.
- DeepSpeed
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
- Horovod
- Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
- Jax
- Autograd and XLA for high-performance machine learning research.
- Kedro
- Kedro is an open-source Python framework for creating reproducible, maintainable and modular data science code.
- Keras
- Keras is a deep learning API written in Python, running on top of the machine learning platform TensorFlow.
- LightGBM
- A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
- MegEngine
- MegEngine is a fast, scalable and easy-to-use deep learning framework, with auto-differentiation.
- metric-learn
- Metric Learning Algorithms in Python.
- MindSpore
- MindSpore is a new open source deep learning training/inference framework that could be used for mobile, edge and cloud scenarios.
- Oneflow
- OneFlow is a performance-centered and open-source deep learning framework.
- PaddlePaddle
- Machine Learning Framework from Industrial Practice.
- PyTorch
- Tensors and Dynamic neural networks in Python with strong GPU acceleration.
- PyTorchLightning
- The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate.
- XGBoost
- Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library.
- scikit-learn
- Machine Learning in Python.
- TensorFlow
- An Open Source Machine Learning Framework for Everyone.
- VectorFlow
- A minimalist neural network library optimized for sparse data and single machine environments.
- Aim
- an easy-to-use and performant open-source experiment tracker.
- ClearML
- Auto-Magical CI/CD to streamline your ML workflow. Experiment Manager, MLOps and Data-Management
- Guild AI
- Experiment tracking, ML developer tools.
- MLRun
- Machine Learning automation and tracking.
- Kedro-Viz
- Kedro-Viz is an interactive development tool for building data science pipelines with Kedro. Kedro-Viz also allows users to view and compare different runs in the Kedro project.
- LabNotebook
- LabNotebook is a tool that allows you to flexibly monitor, record, save, and query all your machine learning experiments.
- Sacred
- Sacred is a tool to help you configure, organize, log and reproduce experiments.
- Weights & Biases
- A developer-first, lightweight, user-friendly experiment tracking and visualization tool for machine learning projects, streamlining collaboration and simplifying MLOps. W&B excels at tracking LLM-powered applications, featuring W&B Prompts for LLM execution flow visualization, input and output monitoring, and secure management of prompts and LLM chain configurations.
- Manifold
- A model-agnostic visual debugging tool for machine learning.
- netron
- Visualizer for neural network, deep learning, and machine learning models.
- OpenOps
- Bring multiple data streams into one dashboard.
- TensorBoard
- TensorFlow's Visualization Toolkit.
- TensorSpace
- Neural network 3D visualization framework for building interactive and intuitive models in browsers, with support for pre-trained deep learning models from TensorFlow, Keras, and TensorFlow.js.
- dtreeviz
- A python library for decision tree visualization and model interpretation.
- Zetane Viewer
- ML models and internal tensors 3D visualizer.
- Zeno
- AI evaluation platform for interactively exploring data and model outputs.
- ArtiVC
- A version control system to manage large files.
- Dolt
- Git for Data.
- DVC
- Data Version Control | Git for Data & Models | ML Experiments Management.
- Delta-Lake
- Storage layer that brings scalable, ACID transactions to Apache Spark and other engines.
- Pachyderm
- Pachyderm is a version control system for data.
- Quilt
- A self-organizing data hub for S3.
- JuiceFS
- A distributed POSIX file system built on top of Redis and S3.
- LakeFS
- Git-like capabilities for your object storage.
- Lance
- Modern columnar data format for ML implemented in Rust.
- Piperider
- A CLI tool that allows you to build data profiles and write assertion tests for easily evaluating and tracking your data's reliability over time.
- LUX
- A Python library that facilitates fast and easy data exploration by automating the visualization and data analysis process.
- Featureform
- The Virtual Feature Store. Turn your existing data infrastructure into a feature store.
- FeatureTools
- An open source python framework for automated feature engineering
- Upgini
- Free automated data & feature enrichment library for machine learning. It automatically searches through thousands of ready-to-use features from public and community-shared data sources and enriches your training dataset with only the accuracy-improving features.
- Feast
- An open source feature store for machine learning.
- ClearML
- Auto-Magical CI/CD to streamline your ML workflow. Experiment Manager, MLOps and Data-Management.
- MLflow
- Open source platform for the machine learning lifecycle.
- MLRun
- An open MLOps platform for quickly building and managing continuous ML applications across their lifecycle.
- ModelFox
- ModelFox is a platform for managing and deploying machine learning models.
- Kserve
- Standardized Serverless ML Inference Platform on Kubernetes
- Kubeflow
- Machine Learning Toolkit for Kubernetes.
- PAI
- Resource scheduling and cluster management for AI.
- Polyaxon
- Machine Learning Management & Orchestration Platform.
- Primehub
- An effortless infrastructure for machine learning built on the top of Kubernetes.
- Seldon-core
- An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models
- Weights & Biases
- A lightweight and flexible platform for machine learning experiment tracking, dataset versioning, and model management, enhancing collaboration and streamlining MLOps workflows. W&B excels at tracking LLM-powered applications, featuring W&B Prompts for LLM execution flow visualization, input and output monitoring, and secure management of prompts and LLM chain configurations.
- Airflow
- A platform to programmatically author, schedule and monitor workflows.
- aqueduct
- An Open-Source Platform for Production Data Science
- Argo Workflows
- Workflow engine for Kubernetes.
- Flyte
- Kubernetes-native workflow automation platform for complex, mission-critical data and ML processes at scale.
- Kubeflow Pipelines
- Machine Learning Pipelines for Kubeflow.
- LangFlow
- An effortless way to experiment and prototype LangChain flows with drag-and-drop components and a chat interface.
- Metaflow
- Build and manage real-life data science projects with ease!
- Ploomber
- The fastest way to build data pipelines. Develop iteratively, deploy anywhere.
- Prefect
- The easiest way to automate your data.
- VDP
- An open-source unstructured data ETL tool to streamline the end-to-end unstructured data processing pipeline.
- ZenML
- MLOps framework to create reproducible pipelines.
- Kueue
- Kubernetes-native Job Queueing.
- PAI
- Resource scheduling and cluster management for AI (Open-sourced by Microsoft).
- Slurm
- A Highly Scalable Workload Manager.
- Volcano
- A Cloud Native Batch System (Project under CNCF).
- Yunikorn
- Light-weight, universal resource scheduler for container orchestrator systems.
- dvc
- Data Version Control | Git for Data & Models | ML Experiments Management
- ModelDB
- Open Source ML Model Versioning, Metadata, and Experiment Management
- MLEM
- A tool to package, serve, and deploy any ML model on any platform.
- ormb
- Docker for Your ML/DL Models Based on OCI Artifacts
- ONNX-MLIR
- Compiler technology to transform a valid Open Neural Network Exchange (ONNX) graph into code that implements the graph with minimum runtime support.
- TVM
- Open deep learning compiler stack for CPUs, GPUs and specialized accelerators.
- octoml-profile
- octoml-profile is a Python library and cloud service designed to provide the simplest experience for assessing and optimizing the performance of PyTorch models on cloud hardware with state-of-the-art ML acceleration technology.
- scalene
- a high-performance, high-precision CPU, GPU, and memory profiler for Python
- Archai
- A platform for Neural Architecture Search (NAS) that allows you to generate efficient deep networks for your applications.
- autoai
- A framework to find the best performing AI/ML model for any AI problem.
- AutoGL
- An autoML framework & toolkit for machine learning on graphs
- AutoGluon
- AutoML for Image, Text, and Tabular Data.
- automl-gs
- Provide an input CSV and a target field to predict, generate a model + code to run it.
- autokeras
- AutoML library for deep learning.
- Auto-PyTorch
- Automatic architecture search and hyperparameter optimization for PyTorch.
- auto-sklearn
- an automated machine learning toolkit and a drop-in replacement for a scikit-learn estimator.
- Dragonfly
- An open source python library for scalable Bayesian optimisation.
- Determined
- scalable deep learning training platform with integrated hyperparameter tuning support; includes Hyperband, PBT, and other search methods.
- DEvol (DeepEvolution)
- a basic proof of concept for genetic architecture search in Keras.
- EvalML
- An open source python library for AutoML.
- FEDOT
- AutoML framework for the design of composite pipelines.
- FLAML
- Fast and lightweight AutoML.
- Goptuna
- A hyperparameter optimization framework, inspired by Optuna.
- HpBandSter
- a framework for distributed hyperparameter optimization.
- HPOlib2
- a library for hyperparameter optimization and black box optimization benchmarks.
- Hyperband
- open source code for tuning hyperparams with Hyperband.
- Hypernets
- A General Automated Machine Learning Framework.
- Hyperopt
- Distributed Asynchronous Hyperparameter Optimization in Python.
- hyperunity
- A toolset for black-box hyperparameter optimisation.
- Katib
- Katib is a Kubernetes-native project for automated machine learning (AutoML).
- Keras Tuner
- Hyperparameter tuning for humans.
- learn2learn
- PyTorch Meta-learning Framework for Researchers.
- Ludwig
- A toolbox built on top of TensorFlow that lets you train and test deep learning models without writing code.
- MOE
- a global, black box optimization engine for real world metric optimization by Yelp.
- Model Search
- a framework that implements AutoML algorithms for model architecture search at scale.
- NASGym
- a proof-of-concept OpenAI Gym environment for Neural Architecture Search (NAS).
- NNI
- An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
- Optuna
- A hyperparameter optimization framework.
- Pycaret
- An open-source, low-code machine learning library in Python that automates machine learning workflows.
- Ray Tune
- Scalable Hyperparameter Tuning.
- REMBO
- Bayesian optimization in high-dimensions via random embedding.
- RoBO
- a Robust Bayesian Optimization framework.
- scikit-optimize (skopt)
- Sequential model-based optimization with a scipy.optimize interface.
- Spearmint
- a software package to perform Bayesian optimization.
- TPOT
- one of the very first AutoML methods and open-source software packages.
- Torchmeta
- A Meta-Learning library for PyTorch.
- Vega
- An AutoML algorithm tool chain by Huawei Noah's Ark Lab.
- FeatherCNN
- FeatherCNN is a high performance inference engine for convolutional neural networks.
- Forward
- A library for high performance deep learning inference on NVIDIA GPUs.
- NCNN
- ncnn is a high-performance neural network inference framework optimized for the mobile platform.
- PocketFlow
- Uses AutoML to do model compression.
- TensorFlow Model Optimization
- A suite of tools that users, both novice and advanced, can use to optimize machine learning models for deployment and execution.
- TNN
- A uniform deep learning inference framework for mobile, desktop and server.
- EasyFL
- An Easy-to-use Federated Learning Platform
- FATE
- An Industrial Grade Federated Learning Framework
- FedML
- The federated learning and analytics library enabling secure and collaborative machine learning on decentralized data anywhere at any scale. Supporting large-scale cross-silo federated learning, cross-device federated learning on smartphones/IoTs, and research simulation.
- Flower
- A Friendly Federated Learning Framework
- Harmonia
- Harmonia is an open-source project that aims to develop systems, infrastructure, and libraries to ease the adoption of federated learning (FL) for research and production use.
- TensorFlow Federated
- A framework for implementing federated learning
- Awesome Argo
- A curated list of awesome projects and resources related to Argo
- Awesome AutoDL
- Automated Deep Learning: Neural Architecture Search Is Not the End (a curated list of AutoDL resources and an in-depth analysis)
- Awesome AutoML
- Curating a list of AutoML-related research, tools, projects and other resources
- Awesome AutoML Papers
- A curated list of automated machine learning papers, articles, tutorials, slides and projects
- Awesome Federated Learning Systems
- A curated list of Federated Learning Systems related academic papers, articles, tutorials, slides and projects.
- Awesome Federated Learning
- A curated list of federated learning publications, re-organized from Arxiv (mostly)
- awesome-federated-learningacc
- All the materials you need for federated learning: blogs, videos, papers, software, etc.
- Awesome Open MLOps
- This is the Fuzzy Labs guide to the universe of free and open source MLOps tools.
- Awesome Production Machine Learning
- A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
- Awesome Tensor Compilers
- A list of awesome compiler projects and papers for tensor computation and deep learning.
- kelvins/awesome-mlops
- A curated list of awesome MLOps tools.
- visenger/awesome-mlops
- An awesome list of references for MLOps - Machine Learning Operations
- currentslab/awesome-vector-search
- A curated list of awesome vector search frameworks/engines, libraries, cloud services, and research papers on vector similarity search.