# Modelplane Documentation

> Modelplane is the open source control plane for AI model serving. It extends Crossplane to manage AI inference across a fleet of GPU clusters.

## Documentation

- [Overview](https://v0-1.docs.modelplane.ai/overview/): What Modelplane is, why it exists, and how it works.
- [Deploy a Model](https://v0-1.docs.modelplane.ai/models/model-deployment/): Deploy a model to the fleet, from a single pod to disaggregated prefill and decode.
- [Get started](https://v0-1.docs.modelplane.ai/getting-started/): A guided tour of Modelplane, from an empty control plane to a model served across regions.
- [Installation](https://v0-1.docs.modelplane.ai/getting-started/installation/): Stand up the Modelplane control plane on a local kind cluster.
- [Qwen3-8B](https://v0-1.docs.modelplane.ai/examples/qwen3-8b/): An 8.2B dense chat model on a single NVIDIA L4.
- [Set Up the Gateway](https://v0-1.docs.modelplane.ai/platform/inference-gateway/): Unified OpenAI-compatible endpoint on the control plane cluster.
- [Why Modelplane](https://v0-1.docs.modelplane.ai/overview/why/): The problem Modelplane solves and how it compares to the alternatives.
- [Build the platform](https://v0-1.docs.modelplane.ai/getting-started/build-the-platform/): Set up the gateway, give the control plane cloud credentials, and provision your first GPU cluster.
- [Define Hardware Classes](https://v0-1.docs.modelplane.ai/platform/inference-class/): Hardware recipe defining GPU type, count, and provisioning for a node pool.
- [Expose a Model](https://v0-1.docs.modelplane.ai/models/model-service/): Expose model endpoints via a unified OpenAI-compatible URL.
- [How it schedules](https://v0-1.docs.modelplane.ai/architecture/scheduling/): How Modelplane places a deployment's replicas across the fleet, and the limits of that placement.
- [How Modelplane works](https://v0-1.docs.modelplane.ai/overview/how-it-works/): The architecture, the resources, and what happens when you deploy a model.
- [Qwen3-Coder-480B](https://v0-1.docs.modelplane.ai/examples/qwen3-coder/): A 480B code MoE, multi-node BF16 over EFA or single-node FP8 on SGLang.
- [Cache Model Weights](https://v0-1.docs.modelplane.ai/models/model-cache/): Stage model weights on cluster storage before serving.
- [Deploying a model](https://v0-1.docs.modelplane.ai/getting-started/deploying-a-model/): Declare what your model needs and serve it behind a unified endpoint.
- [Kimi-K2](https://v0-1.docs.modelplane.ai/examples/kimi-k2/): A 1T MoE served prefill/decode disaggregated across two H200 nodes.
- [Register a Cluster](https://v0-1.docs.modelplane.ai/platform/inference-cluster/): A Kubernetes cluster registered with Modelplane for model serving.
- [FAQ](https://v0-1.docs.modelplane.ai/overview/faq/): Short answers to the questions practitioners ask about Modelplane first.
- [Glossary](https://v0-1.docs.modelplane.ai/overview/glossary/): Terms used throughout the Modelplane docs and what they mean.
- [AI tools](https://v0-1.docs.modelplane.ai/overview/ai-tools/): Connect AI assistants and coding agents to the Modelplane docs through MCP, Markdown, and llms.txt.
- [Architecture](https://v0-1.docs.modelplane.ai/architecture/): How Modelplane is built, the Crossplane foundation, the composition-function model, and the choices behind them.
- [Llama-3.1-8B](https://v0-1.docs.modelplane.ai/examples/llama-3.1-8b/): An 8B dense chat model on a single NVIDIA L4.
- [Route to External Providers](https://v0-1.docs.modelplane.ai/models/model-endpoint/): A reachable inference endpoint, composed per replica or created manually for external providers.
- [Scale the platform](https://v0-1.docs.modelplane.ai/getting-started/scale-the-platform/): Grow from one cluster to a multi-region fleet.
- [Supported Providers](https://v0-1.docs.modelplane.ai/platform/providers/): The clouds and neoclouds Modelplane runs on today, and the Crossplane providers it grows into.
- [API Reference](https://v0-1.docs.modelplane.ai/reference/): Every Modelplane API type, grouped by Platform, Models, and Composed.
- [Scale the model](https://v0-1.docs.modelplane.ai/getting-started/scale-the-model/): Serve the model from two regions behind a single endpoint.
- [Clean up](https://v0-1.docs.modelplane.ai/getting-started/clean-up/): Tear down everything you created during the tour.
- [EKSCluster](https://v0-1.docs.modelplane.ai/reference/eksclusters/)
- [GKECluster](https://v0-1.docs.modelplane.ai/reference/gkeclusters/)
- [ServingStack](https://v0-1.docs.modelplane.ai/reference/servingstacks/)

## Resources

- Full documentation as one file: https://v0-1.docs.modelplane.ai/llms-full.txt
- Connect an AI assistant (MCP, Markdown): https://v0-1.docs.modelplane.ai/overview/ai-tools/
- GitHub: https://github.com/modelplaneai/modelplane