Set Up the Gateway
API: modelplane.ai/v1alpha1 · InferenceGateway
The InferenceGateway sets up the control plane’s front door: one unified,
OpenAI-compatible address that every ModelService is exposed through, routing
each request on to the inference cluster serving it.
The InferenceGateway is a singleton: create exactly one, named default, on
your Modelplane control plane. It fronts every inference cluster in the fleet, so
you don’t create one per cluster.
The backend field selects which gateway implementation the InferenceGateway
runs on. Traefik is the only supported value today.
On a cloud cluster with a native LoadBalancer controller, the gateway’s Service
gets an external address on its own. On kind or bare-metal, where there’s no such
controller, set spec.traefik.loadBalancer: MetalLB and give it an address pool
in spec.traefik.metallb.addressPool so the gateway gets an IP. See the example
below.
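For comparison, a cloud-cluster manifest might look like the sketch below. It is the example at the end of this page with the load balancer settings dropped, wrapped in a shell function so it can be piped to kubectl. The function name cloud_gateway_manifest is hypothetical, and dropping the metallb block together with the loadBalancer field is an assumption rather than something the API reference confirms.

```shell
# Print an InferenceGateway manifest for a cluster with a native LoadBalancer
# controller. Assumption: the MetalLB address pool is omitted as well, since
# nothing would consume it.
cloud_gateway_manifest() {
  cat <<'EOF'
apiVersion: modelplane.ai/v1alpha1
kind: InferenceGateway
metadata:
  name: default
spec:
  backend: Traefik
  traefik:
    version: "40.2.0"
EOF
}

# Usage against a live control plane:
#   cloud_gateway_manifest | kubectl apply -f -
```

Keeping the manifest in a function makes it easy to inspect the output before applying it.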
Once the gateway is ready, read its external address from status.address:
kubectl get ig default -o jsonpath='{.status.address}'
That address is the host of every ModelService URL
(http://<address>/<namespace>/<service>), so it’s what you hand to ML teams.
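The lookup and the URL pattern above can be scripted. The following is a minimal sketch, not an official tool: the function names gateway_address and modelservice_url are hypothetical, as are the ml-team namespace and llama service in the usage comment. It also assumes status.address stays empty until the gateway is ready, which this page implies but does not state outright.

```shell
# Poll status.address until it is non-empty, then print it.
# $1: maximum number of attempts (default 30)
# $2: seconds to sleep between attempts (default 5)
gateway_address() {
  attempts="${1:-30}"
  delay="${2:-5}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Treat a failed lookup the same as an empty address and retry.
    address="$(kubectl get ig default -o jsonpath='{.status.address}' 2>/dev/null || true)"
    if [ -n "$address" ]; then
      printf '%s\n' "$address"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "gateway address not available after $attempts attempts" >&2
  return 1
}

# Build a ModelService URL in the form http://<address>/<namespace>/<service>.
modelservice_url() {
  printf 'http://%s/%s/%s\n' "$1" "$2" "$3"
}

# Usage against a live control plane (namespace and service are placeholders):
#   address="$(gateway_address)" && modelservice_url "$address" ml-team llama
```

Polling rather than reading the field once avoids handing out an empty host while the gateway's Service is still waiting for its external address.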
Example
# The InferenceGateway creates a unified, OpenAI-compatible endpoint on the
# control plane cluster. It installs Traefik Proxy and creates a Gateway that
# routes traffic to model replicas on remote inference clusters.
#
# Create one InferenceGateway per control plane. It must be named "default".
#
# For kind or bare-metal clusters, set loadBalancer to MetalLB and configure an
# address pool. For cloud clusters with native LoadBalancer support, omit the
# loadBalancer field entirely.
apiVersion: modelplane.ai/v1alpha1
kind: InferenceGateway
metadata:
  name: default
spec:
  backend: Traefik
  traefik:
    version: "40.2.0"
    # Remove the loadBalancer section if your cluster supports LoadBalancer
    # services natively (e.g. GKE, EKS).
    loadBalancer: MetalLB
    metallb:
      addressPool: "172.18.255.200-172.18.255.250"