
Set Up the Gateway


API: modelplane.ai/v1alpha1 · InferenceGateway

The InferenceGateway sets up the control plane’s front door: one unified, OpenAI-compatible address that every ModelService is exposed through, routing each request on to the inference cluster serving it.

The InferenceGateway is a singleton: create exactly one, named default, on your Modelplane control plane. It fronts every inference cluster in the fleet, so you don’t create one per cluster.

The spec.backend field selects which gateway implementation backs the InferenceGateway. Traefik is the only supported value today.

On a cloud cluster with a native LoadBalancer controller, the gateway’s Service gets an external address on its own. On kind or bare-metal, where there’s no such controller, set spec.traefik.loadBalancer: MetalLB and give it an address pool in spec.traefik.metallb.addressPool so the gateway gets an IP. The manifest in the Example section shows this MetalLB configuration.
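
For the cloud case, the manifest shrinks to the backend and version alone. The sketch below is a minimal illustration, not an official template: the helper name cloud_gateway_manifest is hypothetical, the version value is copied from this page's example, and it assumes that omitting loadBalancer also means dropping the metallb block.

```shell
# Hypothetical helper: print an InferenceGateway manifest for a cluster with a
# native LoadBalancer controller (e.g. GKE, EKS).
# Assumption: leaving out both loadBalancer and metallb is what "omit the
# loadBalancer field" amounts to on such clusters.
cloud_gateway_manifest() {
  cat <<'EOF'
apiVersion: modelplane.ai/v1alpha1
kind: InferenceGateway
metadata:
  name: default
spec:
  backend: Traefik
  traefik:
    version: "40.2.0"
EOF
}
```

To apply it, pipe the output to kubectl: cloud_gateway_manifest | kubectl apply -f -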

Once the gateway is ready, read its external address from status.address:

kubectl get ig default -o jsonpath='{.status.address}'
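
The address may not be set immediately after the resource is created. A small polling loop around the same query can help in scripts. This is a sketch under assumptions: it treats an empty status.address as "not ready yet", and the function name, attempt count, and delay are hypothetical choices rather than anything Modelplane prescribes.

```shell
# Hypothetical helper: poll the InferenceGateway until status.address is set.
#   $1  maximum number of attempts (default 30, an arbitrary choice)
#   $2  seconds to sleep between attempts (default 5, an arbitrary choice)
# Prints the address on success; returns 1 if it never appears.
wait_for_gateway_address() {
  attempts="${1:-30}"
  delay="${2:-5}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Same query as above. Errors (for example, the resource does not exist
    # yet) are treated the same as an empty address.
    address="$(kubectl get ig default -o jsonpath='{.status.address}' 2>/dev/null || true)"
    if [ -n "$address" ]; then
      printf '%s\n' "$address"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "gateway address not available after $attempts attempts" >&2
  return 1
}
```

For example, GATEWAY_ADDRESS="$(wait_for_gateway_address)" captures the address once it appears, or fails after roughly 150 seconds with the defaults.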

That address is the host of every ModelService URL (http://<address>/<namespace>/<service>), so it’s what you hand to ML teams.
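
To make the URL pattern concrete, the sketch below assembles a ModelService URL from its three parts. The function name and the sample namespace and service names are hypothetical. Only the http://<address>/<namespace>/<service> shape comes from this page.

```shell
# Hypothetical helper: build a ModelService URL from the gateway address,
# the ModelService's namespace, and its name.
model_service_url() {
  address="$1"
  namespace="$2"
  service="$3"
  printf 'http://%s/%s/%s\n' "$address" "$namespace" "$service"
}
```

With an address of 172.18.255.200 and a ModelService named llama in namespace ml-team (both made up for illustration), model_service_url 172.18.255.200 ml-team llama prints http://172.18.255.200/ml-team/llama.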

Example

# The InferenceGateway creates a unified, OpenAI-compatible endpoint on the
# control plane cluster. It installs Traefik Proxy and creates a Gateway that
# routes traffic to model replicas on remote inference clusters.
#
# Create one InferenceGateway per control plane. It must be named "default".
#
# For kind or bare-metal clusters, set loadBalancer to MetalLB and configure an
# address pool. For cloud clusters with native LoadBalancer support, omit the
# loadBalancer field entirely.
apiVersion: modelplane.ai/v1alpha1
kind: InferenceGateway
metadata:
  name: default
spec:
  backend: Traefik
  traefik:
    version: "40.2.0"

    # Remove the loadBalancer section if your cluster supports LoadBalancer
    # services natively (e.g. GKE, EKS).
    loadBalancer: MetalLB
    metallb:
      addressPool: "172.18.255.200-172.18.255.250"