GKECluster Custom Resource
A GKECluster provisions a GKE cluster with dedicated node pools for GPU inference and system workloads. It outputs secrets containing the cluster kubeconfig and a GCP service account key that consumers can use to target the cluster.
#Metadata
#Example
Manifest
apiVersion: infrastructure.modelplane.ai/v1alpha1
kind: GKECluster
metadata:
  name: inference-us-central
  namespace: platform
spec:
  project: my-gcp-project
  region: us-central1
  nodePools:
    - name: system
      role: System
      machineType: e2-standard-4
      nodeCount: 2
    - name: gpu-h100
      role: GPU
      machineType: a3-highgpu-8g
      diskSizeGb: 500
      nodeCount: 0
      maxNodeCount: 4
      zones: [us-central1-a]
      gpu:
        acceleratorType: nvidia-h100-80gb
        acceleratorCount: 8
#Spec
GKEClusterSpec defines the desired state of GKECluster.
GKE cluster Kubernetes version. Must be a version supported by the REGULAR release channel.
VPC networking configuration. Defaults are suitable for standalone clusters. Override when VPC-peering multiple clusters to avoid CIDR collisions.
nodePools[].diskSizeGb: Boot disk size in GB.
nodePools[].gpu: GPU configuration. Required when role is GPU.
nodePools[].machineType: GCE machine type (e.g. e2-standard-4, a2-highgpu-8g, g2-standard-48).
nodePools[].maxNodeCount: Maximum number of nodes for autoscaling.
Minimum number of nodes for autoscaling. Set to 1 or higher for pools that must always be available.
nodePools[].name: Unique name for this node pool. Used as a suffix in the GKE NodePool resource name.
nodePools[].nodeCount: Initial number of nodes.
nodePools[].role: Determines what workloads this pool runs. System pools host controllers, gateways, and infrastructure. GPU pools host inference workloads and are tainted to exclude non-GPU pods.
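Because GPU pools are tainted, an inference pod needs a matching toleration as well as a GPU resource request. The pod below is a minimal sketch of that pairing. It assumes the pool carries GKE's standard GPU taint (nvidia.com/gpu=present:NoSchedule), which this reference does not confirm; the pod name and image are hypothetical placeholders.

```yaml
# Sketch only: assumes GPU pools use GKE's standard GPU taint.
# The pod name and container image are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: inference-example
spec:
  tolerations:
    - key: nvidia.com/gpu        # assumed taint key on GPU pools
      operator: Exists
      effect: NoSchedule
  containers:
    - name: server
      image: example.com/inference-server:latest   # hypothetical image
      resources:
        limits:
          nvidia.com/gpu: 8      # matches acceleratorCount in the example manifest
```

If the pools use a different taint key or effect, adjust the toleration to match what is reported on the nodes.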
project: GCP project ID where the cluster will be created.
region: GCP region for the cluster (e.g. us-central1, europe-west4).
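To illustrate how the node pool sizing fields might combine, the fragment below sketches two GPU pools: one that starts empty and relies on autoscaling, and one that keeps a node available at all times. It reuses field names from the example manifest. The name minNodeCount is an assumption for the minimum-nodes field described above and may differ in the actual schema; the pool names are hypothetical.

```yaml
# Fragment of spec.nodePools. Pool names are hypothetical.
nodePools:
  - name: gpu-burst
    role: GPU
    machineType: a3-highgpu-8g
    nodeCount: 0            # initial node count of zero
    maxNodeCount: 4         # upper bound for autoscaling
    gpu:
      acceleratorType: nvidia-h100-80gb
      acceleratorCount: 8
  - name: gpu-warm
    role: GPU
    machineType: a3-highgpu-8g
    nodeCount: 1            # start with one node
    minNodeCount: 1         # ASSUMED field name: lower bound for autoscaling
    maxNodeCount: 4
    gpu:
      acceleratorType: nvidia-h100-80gb
      acceleratorCount: 8
```

The second pool trades idle GPU cost for availability, which is the case the minimum-nodes description calls out.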
#Status
Observed ModelCache RWX storage state.
Name of the Modelplane-managed ReadWriteMany StorageClass composed on this cluster for ModelCache PVCs. ModelCache reads this to target the cache PVC.
Key within the Secret that holds the credential data.
Name of the Secret.
The type of credential this secret contains. Kubeconfig contains a kubeconfig file with the cluster endpoint and CA certificate. GCPServiceAccountKey contains a GCP service account JSON key that can authenticate to the cluster via GKE IAM.
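As a sketch of how a consumer might use the Kubeconfig credential, the pod below mounts the Secret as a file and points KUBECONFIG at it. The Secret name, key, pod name, and image are all hypothetical; substitute the Secret name and key reported in the GKECluster status. It also assumes the Secret lives in the same namespace as the consuming pod, which this reference does not state.

```yaml
# Sketch only: Secret name, key, and image are hypothetical placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: cluster-client
  namespace: platform            # assumed to match the Secret's namespace
spec:
  containers:
    - name: kubectl
      image: example.com/kubectl:latest        # hypothetical image
      command: ["kubectl", "get", "nodes"]
      env:
        - name: KUBECONFIG
          value: /etc/gke/kubeconfig
      volumeMounts:
        - name: kubeconfig
          mountPath: /etc/gke
          readOnly: true
  volumes:
    - name: kubeconfig
      secret:
        secretName: inference-us-central-kubeconfig   # hypothetical; use the name from status
        items:
          - key: kubeconfig      # hypothetical; use the key from status
            path: kubeconfig
```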