The SuperSONIC project implements server infrastructure for inference-as-a-service applications in large high-energy physics (HEP) and multi-messenger astrophysics (MMA) experiments. The server infrastructure is designed for deployment on Kubernetes clusters equipped with GPUs.
Currently, SuperSONIC supports the following functionality:
- GPU inference-as-a-service via Nvidia Triton Inference Server
- Load balancing across many GPUs via Envoy Proxy
- Load-based autoscaling via KEDA
- Monitoring via Prometheus, Grafana, and OpenTelemetry
- Rate limiting
- Token-based authentication
Kubernetes cluster, ideally with access to GPUs, but CPUs are enough for a minimal deployment.
Helm
Helm is a package manager for Kubernetes. To install Helm on your machine, follow the official instructions at https://helm.sh/docs/intro/install/.
Custom Resource Definitions (CRDs) – not needed for minimal deployment
- Prometheus CRDs
If you are using an established Kubernetes cluster (e.g. at an HPC center), there is a good chance that these CRDs are already installed. Otherwise, a cluster admin can use the following commands:
How to install Prometheus CRDs
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
kubectl create namespace monitoring
helm install prometheus-operator prometheus-community/kube-prometheus-stack --namespace monitoring --set prometheusOperator.createCustomResource=false --set defaultRules.create=false --set alertmanager.enabled=false --set prometheus.enabled=false --set grafana.enabled=false
- KEDA CRDs (only if using autoscaling)
How to install KEDA CRDs
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
kubectl create namespace keda
helm install keda kedacore/keda --namespace keda
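Before installing anything, it can help to check whether the CRDs are already present on the cluster. The sketch below wraps `kubectl get crd` in two hypothetical helper functions (`crd_installed` and `check_crds` are illustrative names, not part of SuperSONIC). The CRD names in the usage comment are standard ones shipped by the Prometheus Operator and KEDA; which CRDs SuperSONIC actually requires is an assumption here.

```shell
# Hypothetical helper: succeed if the named CRD is registered in the cluster.
crd_installed() {
  kubectl get crd "$1" >/dev/null 2>&1
}

# Hypothetical helper: print an "installed"/"missing" line for each CRD name.
check_crds() {
  for crd in "$@"; do
    if crd_installed "$crd"; then
      echo "$crd: installed"
    else
      echo "$crd: missing"
    fi
  done
}

# Example usage (requires kubectl access to the cluster; CRD names are assumptions):
# check_crds servicemonitors.monitoring.coreos.com scaledobjects.keda.sh
```

If every CRD is reported as installed, the corresponding installation commands above can likely be skipped.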
If you are installing SuperSONIC for the first time, proceed to the Minimal deployment section below.
If you already have a functional values.yaml and/or installed SuperSONIC previously, use the following installation commands:
helm repo add fastml https://fastmachinelearning.org/SuperSONIC
helm repo update
helm install <release-name> fastml/supersonic -n <namespace> -f <values.yaml>
To construct the values.yaml file for your application, follow the Configuration guide.
The full list of configuration parameters is available in the Configuration reference.
1. Install cvmfs-csi plugin to load models from CVMFS
For an example installation, we will use CMS models loaded from CVMFS. SuperSONIC also supports other types of model repositories, including an arbitrary Persistent Volume, an NFS volume, or S3 storage.
The cvmfs-csi plugin makes it easy to mount CVMFS into a Kubernetes cluster by creating a new storage class. A Persistent Volume created with this storage class will have the CVMFS contents visible inside.
A cluster admin can use the following commands to install cvmfs-csi:
kubectl create namespace cvmfs-csi
helm install -n cvmfs-csi cvmfs-csi oci://registry.cern.ch/kubernetes/charts/cvmfs-csi --values cvmfs/values-cvmfs-csi.yaml
kubectl apply -f cvmfs/cvmfs-storageclass.yaml -n cvmfs-csi
2. Install SuperSONIC with minimal configuration
The minimal deployment will install only a single CPU-based Triton server and an Envoy Proxy.
We will use values/values-minimal.yaml as our minimal configuration file.
helm repo add fastml https://fastmachinelearning.org/SuperSONIC
helm repo update
helm install <release-name> fastml/supersonic -n <namespace> -f values/values-minimal.yaml
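After the install command returns, one way to confirm that the Triton server and Envoy Proxy actually started is to query the release status and wait for the pods to become Ready. This is a minimal sketch: `wait_for_release` is a hypothetical helper name, the 300s default timeout is arbitrary, and it assumes the namespace contains only pods from this release (completed pods from other jobs never become Ready and would make the wait time out).

```shell
# Hypothetical helper: show the Helm release status, then block until all pods
# in the namespace are Ready or the timeout expires.
# Assumes the namespace holds only pods that belong to this release.
wait_for_release() {
  release="$1"
  namespace="$2"
  timeout="${3:-300s}"
  helm status "$release" -n "$namespace" || return 1
  kubectl wait --for=condition=Ready pods --all -n "$namespace" --timeout="$timeout"
}

# Example usage (requires helm and kubectl access to the cluster):
# wait_for_release <release-name> <namespace> 600s
```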
3. Deploy a test job to run inferences
To test your SuperSONIC installation, we will create a small Nvidia Performance Analyzer job, which will send a single inference request with random input data to the Envoy Proxy endpoint.
- In tests/perf-analyzer-job.yaml, edit the following parameters to match your deployment:
  metadata:
    namespace: <namespace>
  In the perf_analyzer command:
  -u <release-name>.<namespace>.svc.cluster.local:8001
- Submit the job to your Kubernetes cluster:
  kubectl apply -n <namespace> -f tests/perf-analyzer-job.yaml
- Track job performance and inspect logs:
  kubectl get pods -l job-name=perf-analyzer-job -n <namespace>
  kubectl logs <pod-name> -n <namespace>
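As an alternative to looking up the pod name by hand, the job can be waited on and its logs fetched by job name. The sketch below assumes the Job is named `perf-analyzer-job` (inferred from the `job-name` label in the command above, not verified against tests/perf-analyzer-job.yaml). `wait_and_show_logs` is a hypothetical helper name, and the default timeout is arbitrary.

```shell
# Hypothetical helper: wait for the test job to complete, then print its logs.
# The job name "perf-analyzer-job" is an assumption inferred from the
# job-name label used in the commands above.
wait_and_show_logs() {
  namespace="$1"
  timeout="${2:-300s}"
  if kubectl wait --for=condition=complete job/perf-analyzer-job -n "$namespace" --timeout="$timeout"; then
    kubectl logs job/perf-analyzer-job -n "$namespace"
  else
    echo "perf-analyzer-job did not complete within $timeout" >&2
    return 1
  fi
}

# Example usage (requires kubectl access to the cluster):
# wait_and_show_logs <namespace> 600s
```

A non-zero exit status means the job failed or timed out, in which case `kubectl describe job perf-analyzer-job -n <namespace>` may help locate the cause.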
Installing from the GitHub repository may be useful for testing unreleased features.
git clone https://github.com/fastmachinelearning/SuperSONIC.git
cd SuperSONIC
git checkout <branch-or-commit>
helm dependency build helm/supersonic
helm install <release-name> helm/supersonic -n <namespace> -f <your-values.yaml>
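When working from an unreleased branch, it may be worth linting the chart and rendering its manifests locally before installing. This is a sketch built on the standard `helm lint` and `helm template` commands. `preview_chart` is a hypothetical helper name, and it assumes it is run from the root of the cloned repository after `helm dependency build`.

```shell
# Hypothetical helper: lint the local chart and render its manifests to stdout
# without installing anything into the cluster.
# Assumes the current directory is the root of the cloned SuperSONIC repository.
preview_chart() {
  release="$1"
  namespace="$2"
  values="$3"
  helm lint helm/supersonic -f "$values" || return 1
  helm template "$release" helm/supersonic -n "$namespace" -f "$values"
}

# Example usage:
# preview_chart <release-name> <namespace> <your-values.yaml> > rendered.yaml
```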
Site | CMS | ATLAS | IceCube |
---|---|---|---|
Purdue Geddes | ✅ | - | - |
Purdue Anvil | ✅ | - | - |
NRP Nautilus | ✅ | ✅ | ✅ |
UChicago | - | ✅ | - |
UW–Madison | ⏳ | - | - |
Dmitry Kondratyev, Benedikt Riedel, Yuan-Tang Chou, Miles Cochran-Branson, Noah Paladino, David Schultz, Mia Liu, Javier Duarte, Philip Harris, and Shih-Chieh Hsu
SuperSONIC: Cloud-Native Infrastructure for ML Inferencing
In Practice and Experience in Advanced Research Computing 2025: The Power of Collaboration (PEARC '25)
Association for Computing Machinery, New York, NY, USA. Article 29, 1–5. 2025.
https://doi.org/10.1145/3708035.3736049