10 changes: 10 additions & 0 deletions config/manifests/gateway/kubvernor/gateway.yaml
@@ -0,0 +1,10 @@
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
name: inference-gateway
spec:
gatewayClassName: kubvernor-inference-gateway
listeners:
- name: http
port: 80
protocol: HTTP
20 changes: 20 additions & 0 deletions config/manifests/gateway/kubvernor/httproute.yaml
@@ -0,0 +1,20 @@
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
name: llm-route
spec:
parentRefs:
- group: gateway.networking.k8s.io
kind: Gateway
name: inference-gateway
rules:
- backendRefs:
- group: inference.networking.x-k8s.io
kind: InferencePool
name: vllm-llama3-8b-instruct
matches:
- path:
type: PathPrefix
value: /
timeouts:
request: 300s
39 changes: 39 additions & 0 deletions site-src/guides/index.md
@@ -262,6 +262,42 @@ A cluster with:
kubectl get httproute llm-route -o yaml
```

=== "Kubvernor Rust API Gateway"

[Kubvernor Rust API Gateway](https://github.com/kubvernor/kubvernor) is a highly experimental project and is not ready for production, but it supports v0.5.1 of the Inference Extension spec.

1. Requirements
- Rust and Cargo installed

2. Run Kubvernor Rust API Gateway as documented in the [README](https://github.com/kubvernor/kubvernor/blob/main/README.md)


3. Deploy the Gateway

```bash
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/kubvernor/gateway.yaml
```

4. Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
```bash
$ kubectl get gateway inference-gateway
NAME CLASS ADDRESS PROGRAMMED AGE
inference-gateway kubvernor-inference-gateway <MY_ADDRESS> True 22s
```

5. Deploy the HTTPRoute

```bash
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/kubvernor/httproute.yaml
```

6. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:

```bash
kubectl get httproute llm-route -o yaml
```
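The two confirmation steps above can also be scripted instead of read by eye. The following is a minimal sketch, not part of Kubvernor or this project: the helper names `wait_for_gateway`, `route_condition`, and `route_is_ready` and the 120s timeout are hypothetical, and it assumes `kubectl` is pointed at the cluster where `inference-gateway` and `llm-route` were applied.

```bash
# Sketch only: helper names and the timeout are hypothetical, chosen for illustration.

# Block until the Gateway reports Programmed=True, then print its first address.
wait_for_gateway() {
  local name="${1:-inference-gateway}"
  kubectl wait --for=condition=Programmed "gateway/${name}" --timeout=120s >/dev/null || return 1
  kubectl get gateway "${name}" -o jsonpath='{.status.addresses[0].value}'
}

# Print the status of one HTTPRoute condition type, one value per parent Gateway.
route_condition() {
  local name="$1" type="$2"
  kubectl get httproute "${name}" \
    -o jsonpath="{.status.parents[*].conditions[?(@.type==\"${type}\")].status}"
}

# Succeed only if Accepted and ResolvedRefs are True for every parent.
route_is_ready() {
  local name="${1:-llm-route}" type status
  for type in Accepted ResolvedRefs; do
    status="$(route_condition "${name}" "${type}")"
    # An empty result or any non-True value means the route is not ready.
    case "${status}" in
      ""|*False*|*Unknown*) return 1 ;;
    esac
  done
  return 0
}
```

For example, `wait_for_gateway && route_is_ready && echo "route ready"` prints the Gateway address and then `route ready` only when both checks pass. HTTPRoute conditions are nested under `status.parents`, which is why the sketch reads them with a JSONPath filter rather than `kubectl wait --for=condition=...`.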



### Deploy the InferencePool and Endpoint Picker Extension

@@ -404,3 +440,6 @@ A cluster with:
```bash
kubectl delete ns kgateway-system
```
=== "Kubvernor"

No further cleanup is needed.
7 changes: 7 additions & 0 deletions site-src/implementations/gateways.md
@@ -8,13 +8,15 @@ This project has several implementations that are planned or in progress:
* [Google Kubernetes Engine][4]
* [Istio][5]
* [Kgateway][6]
* [Kubvernor Rust API Gateway][7]

[1]:#agentgateway
[2]:#alibaba-cloud-container-service-for-kubernetes
[3]:#envoy-ai-gateway
[4]:#google-kubernetes-engine
[5]:#istio
[6]:#kgateway
[7]:#kubvernor-rust-api-gateway

## Agentgateway

@@ -93,3 +95,8 @@ Issue](https://github.com/istio/istio/issues/55768).
gateway that can run [independently](https://gateway-api-inference-extension.sigs.k8s.io/guides/#__tabbed_3_3), as an [Istio waypoint](https://kgateway.dev/blog/extend-istio-ambient-kgateway-waypoint/),
or within your [llm-d infrastructure](https://github.com/llm-d-incubation/llm-d-infra) to improve accelerator (GPU)
utilization for AI inference workloads.

## Kubvernor Rust API Gateway
[Kubvernor Rust API Gateway][krg] is an open-source, highly experimental implementation of an API controller written in Rust. Kubvernor currently supports Envoy Proxy. The project aims to be as generic as possible so that it can be used to manage and deploy different gateways (Envoy, Nginx, HAProxy, etc.). Kubvernor Rust API Gateway is conformant with Gateway-API-Inference-Extension v0.5.1.

[krg]:https://github.com/kubvernor/kubvernor