Most enterprise AI teams can tell you which agents are running in production. Few can tell you what those agents are doing in practice: who they’re acting on behalf of, what data they touched, what each call cost, or whether any policy was violated. Databricks built Unity AI Gateway to make that visible and governable.
The Databricks Unity AI Gateway was announced in April 2026 and expanded at Data + AI Summit 2026 in June. It extends Unity Catalog’s proven governance model from data assets to AI interactions at runtime. Traditional governance tells you who can access a model. Unity AI Gateway tells you what the agent did with that access, and whether it followed the rules. Kanerika has deployed this governance model in production for regulated enterprises. See how it worked in our real-time compliance AI agent case study.
In this article, we’ll cover what Databricks Unity AI Gateway is, how it evolved from the original AI Gateway, its four governance pillars, how MCP governance works in practice, who benefits most, how to get started, and how Kanerika helps enterprises implement it on Databricks.
Key Takeaways

- Unity AI Gateway extends Unity Catalog’s governance from data assets to the runtime behavior of AI agents, models, and MCP servers.
- It governs enterprise AI across four pillars: cost controls and smart routing, unified asset governance, runtime guardrails, and end-to-end observability.
- On-behalf-of (OBO) execution ties every agent action to the requesting user’s identity, closing the shared service account gap that affects most production agentic deployments.
- MCP governance is the most underutilized capability in the current release, giving teams full policy control and audit visibility over every tool call an agent makes to external systems.
- Unity AI Gateway is currently in Beta on both AWS and Azure Databricks, with no usage charges during the Beta period.
- Kanerika is a Databricks Consulting Partner with production deployments of governed AI agents across regulated industries.
What Is Databricks Unity AI Gateway?

Unity AI Gateway is Databricks’ governance solution for enterprise AI. Built on Unity Catalog, it extends governance beyond data and model assets to the runtime interactions between AI agents, models, MCP servers, and enterprise tools. Where Unity Catalog governs what exists and who can access it, Unity AI Gateway governs what happens during each interaction.
The distinction is operational. Access controls tell you whether an agent can call a model. Unity AI Gateway tells you what the agent said, what it received, what it spent, and whether every attached policy was followed, across every call, in real time.
1. How It Evolved from the Original AI Gateway

The original Databricks AI Gateway was a traffic management layer for model serving endpoints. It handled rate limiting and routing, and teams could cap how many requests a user made. But the gateway had no content evaluation, no identity-aware controls, and no policy engine.
Unity AI Gateway replaces that with a full governance stack. It evaluates request and response content against defined policies, enforces identity-aware controls using the requesting user’s actual permissions, and logs every interaction to Unity Catalog for audit and compliance. For teams comparing this to standalone options, our breakdown of LLM gateway architectures covers where Unity AI Gateway sits relative to other approaches.
2. Where It Fits in the Databricks Stack

Unity Catalog has governed enterprise data since 2021 through a unified permissions model, lineage tracking, and audit logging. Unity AI Gateway sits directly above it, extending that same infrastructure into the AI interaction layer.
For enterprises already on Unity Catalog, the integration is immediate. The same RBAC model, the same policy engine, and the same audit tables now cover every agent call, model invocation, and tool interaction, with no parallel system to configure or maintain separately. Kanerika’s work on Databricks data lineage shows how Unity Catalog’s lineage model extends naturally into this runtime governance layer.
The Four Governance Pillars of Unity AI Gateway

Unity AI Gateway organizes enterprise AI governance across four pillars. Each addresses a different problem teams face when scaling agents into production.

1. AI Cost Controls and Smart Routing

AI spend without controls becomes a finance problem fast. Developers run agents in loops, teams route tasks to frontier models when cheaper options would perform equally well, and the bill arrives at month-end with no clear attribution. Unity AI Gateway addresses this before costs compound.

Spend controls operate at four levels: per-user, per-use-case, per-workspace, and per-account. Hard spend caps stop new requests automatically when a threshold is reached, as outlined in Databricks’ official Unity AI Gateway announcement. All spend is tracked in Unity Catalog system tables with full cost attribution by user, workspace, model, and provider, including actual dollar costs rather than raw token counts.
Smart routing pairs with spend controls to direct requests to the most cost-appropriate model based on task complexity and quality requirements. Simpler tasks go to less expensive models automatically. When a provider hits rate limits or becomes unavailable, the gateway fails over to a configured backup without manual intervention.
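The routing behavior described above can be illustrated with a small sketch. This is not Databricks’ routing algorithm, which the announcement does not detail; it is a minimal model that assumes each task carries a complexity score from 1 to 10. The model names, prices, and the `pick_model` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRoute:
    name: str            # hypothetical endpoint name
    cost_per_1k: float   # illustrative dollars per 1K tokens
    max_complexity: int  # highest task complexity (1-10) this model handles well


def pick_model(routes, complexity, unavailable=frozenset()):
    """Return the cheapest available model that can handle the task."""
    candidates = [
        r for r in routes
        if r.max_complexity >= complexity and r.name not in unavailable
    ]
    if not candidates:
        raise RuntimeError("no available model can handle this task")
    return min(candidates, key=lambda r: r.cost_per_1k)


# Illustrative routing table, cheapest first.
ROUTES = [
    ModelRoute("small-model", 0.10, 3),
    ModelRoute("mid-model", 0.50, 7),
    ModelRoute("frontier-model", 3.00, 10),
]
```

In this sketch, a simple task goes to `small-model`; if that model is passed in `unavailable` (rate-limited or down), the same call falls over to `mid-model` with no change to the caller.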
| Budget Level | What It Controls | Action When Limit Is Reached |
|---|---|---|
| Per-user | Individual developer or operator spend | Alert, then hard cap |
| Per-use-case | Named agent workflow or application | Alert, then hard cap |
| Per-workspace | Team or environment budget | Alert, then hard cap |
| Per-account | Total organizational AI spend | Hard cap, requests stopped |
Budget enforcement works at every level of the organization, from individual developers to the full account.
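To make the layered budget check concrete, here is a minimal sketch of how a request might be tested against every budget level before it runs. It is an illustration only: the `Budget` class, the `check_request` helper, and the 80% alert threshold are our assumptions, not part of the Unity AI Gateway API, and for simplicity the sketch applies the same alert rule at every level.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    scope: str                 # e.g. "user:alice", "workspace:dev", "account"
    limit_usd: float           # hard cap for this scope
    spent_usd: float = 0.0
    alert_ratio: float = 0.8   # assumed alert threshold (80% of the cap)


def check_request(budgets, cost_usd):
    """Return (decision, scope), where decision is "deny", "alert", or "allow".

    A request is denied, and nothing is charged, if it would push any
    budget past its hard cap. Otherwise every budget is charged.
    """
    for b in budgets:
        if b.spent_usd + cost_usd > b.limit_usd:
            return "deny", b.scope
    alert_scope = None
    for b in budgets:
        b.spent_usd += cost_usd
        if alert_scope is None and b.spent_usd >= b.alert_ratio * b.limit_usd:
            alert_scope = b.scope
    return ("alert", alert_scope) if alert_scope else ("allow", None)
```

Passing the per-user, per-workspace, and per-account budgets in one list means the narrowest exhausted budget is the one reported, which is usually the most useful scope to surface in an alert.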
2. Unified Asset Governance Across Models, Agents, and MCP Servers

Everything Unity AI Gateway governs must first be registered in Unity Catalog as a securable object. This includes Databricks-hosted foundation models, external providers (OpenAI, Anthropic, Google, and others), MCP servers, agent endpoints, and skills.
Registration is one-time per asset. Once registered, access is controlled with standard Unity Catalog GRANT and REVOKE statements, the same syntax used for tables and volumes. Managed MCP services for Google Drive, Jira, Slack, and GitHub are available by default. External MCP servers register through HTTP connections with tool-level filtering configured at registration time. Teams already familiar with Databricks security configuration will recognize this as an extension of the access control model already in place.
The result is a single governed inventory of every AI asset the organization uses, with permissions, lineage, and audit coverage applied consistently. Teams building this into a broader data governance program will find it inherits Unity Catalog’s existing access and lineage infrastructure rather than requiring a parallel system.
3. Runtime Guardrails for Safety and Compliance

Databricks ships built-in guardrail templates for the most common enterprise risks. PII detection and redaction masks emails, SSNs, and phone numbers before they reach external models. Content safety blocks toxic output with sensitivity-tunable filters. Prompt injection detection catches jailbreak attempts, and a hallucination guard validates responses against grounding sources.
Each guardrail applies to requests, responses, or both, and is configured per endpoint. A customer-facing chatbot and an internal coding agent can carry different guardrail profiles without additional infrastructure.
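As a rough picture of what request-side PII redaction does, the sketch below masks emails, SSNs, and US-style phone numbers before a prompt leaves the organization. It is a simplified regex illustration, not how the built-in Databricks guardrail is implemented (the source does not describe its internals); the patterns and the `redact` helper are assumptions for demonstration.

```python
import re

# Deliberately simple patterns; real detectors cover many more formats.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def redact(text):
    """Replace each detected PII value with a [LABEL] tag.

    Returns the redacted text and the list of PII types that were found,
    which is the kind of detail a guardrail decision log would record.
    """
    found = []
    for label, pattern in PATTERNS.items():
        text, count = pattern.subn(f"[{label}]", text)
        if count:
            found.append(label)
    return text, found
```

For example, `redact("Mail jo@example.com or call 555-123-4567")` returns the text with `[EMAIL]` and `[PHONE]` in place of the raw values.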
Beyond the built-in templates, teams define custom guardrails using their own system prompt and a chosen model. A financial services firm might flag responses referencing specific regulatory thresholds. A healthcare organization might block requests that include diagnosis codes in certain contexts. Because these are prompt-based rather than regex-based, they adapt to natural language complexity without ongoing rule maintenance. All guardrail decisions are logged in Unity Catalog tables, creating a single audit record across access, content, cost, and compliance.
4. End-to-End Observability and Payload Logging

A single agent response might involve three model calls, four MCP tool invocations, and writes to two external systems, all in under a second. Understanding what happened requires tracing the full chain from trigger to outcome.
Every request through Unity AI Gateway can be logged as a structured record in Unity Catalog inference tables. These capture the request payload, response, model called, tokens used, latency, cost, and both user and agent identity. Being Delta tables, teams can query them with SQL and join them with business data for monitoring dashboards or compliance exports.
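The kind of query this enables can be sketched with a local stand-in. The example below uses SQLite in place of a Delta table so it runs anywhere; the table name `inference_log` and its columns are hypothetical and do not reflect the actual inference table schema, which should be checked in the Databricks documentation.

```python
import sqlite3

# In-memory stand-in for an inference table (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE inference_log (
        user_id TEXT, agent_id TEXT, model TEXT,
        total_tokens INTEGER, latency_ms INTEGER, cost_usd REAL
    )
""")
conn.executemany(
    "INSERT INTO inference_log VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("alice", "support-bot", "small-model", 1200, 310, 0.25),
        ("alice", "support-bot", "frontier-model", 4000, 900, 1.50),
        ("bob", "code-agent", "frontier-model", 8000, 1500, 3.00),
    ],
)


def spend_by_user(connection):
    """Aggregate call count and dollar cost per user, highest spend first."""
    rows = connection.execute("""
        SELECT user_id, COUNT(*) AS calls, SUM(cost_usd) AS total_cost
        FROM inference_log
        GROUP BY user_id
        ORDER BY total_cost DESC
    """)
    return rows.fetchall()
```

The same `GROUP BY` shape extends to per-model or per-agent breakdowns, and a join on `user_id` against an HR or cost-center table would give the business attribution described above.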
Unity AI Gateway also streams AI activity, audit logs, and runtime telemetry into Lakewatch, Databricks’ security monitoring layer, for threat detection. For organizations already using Databricks MLflow, inference table data correlates directly with model traces, giving teams end-to-end provenance from raw data through to agent output. This connects directly to the broader discipline of AI agent observability, which covers how enterprises trace agent behavior in production.
MCP Governance: The Feature Most Teams Are Missing

Model Context Protocol servers are now the standard way to connect AI agents to enterprise tools. They are also one of the most significant governance gaps in production AI deployments today, and the feature most teams skip when evaluating Unity AI Gateway.

1. Why MCP Servers Create a New Governance Problem

When an agent calls an MCP server to access Salesforce, GitHub, or an internal API, it typically does so through a shared service account with elevated permissions. The audit log shows the service account, not the person who triggered the action. There’s no policy layer evaluating what the agent is asking for, or whether the request is within bounds.
At scale, this creates three compounding problems. Security teams have no visibility into what agents are accessing through MCP. Compliance teams cannot produce audit trails that satisfy regulators. Platform teams have no way to restrict agent behavior based on the identity of the human user behind the request. The broader implications of ungoverned agent behavior are covered in our analysis of agentic AI risks in enterprise environments.
2. How Unity AI Gateway Closes That Gap

Unity AI Gateway sits between the agent and every MCP server call. Before execution, it evaluates each call against all attached service policies, checking the requesting user’s identity, the agent’s permissions, the specific tool being invoked, and the call parameters. The gateway can allow, deny, transform, or route the call to a human for approval.
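The four possible outcomes are easier to see in code. The following is a conceptual sketch of a pre-execution policy check, not the gateway’s actual policy engine; the `ToolCall` type, the `evaluate` function, the tool names, and the rule that caps a `limit` parameter at 100 are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    user: str                  # human who triggered the workflow
    agent: str                 # agent making the call
    tool: str                  # MCP tool being invoked
    params: dict = field(default_factory=dict)


def evaluate(call, allowed_tools, approval_tools):
    """Return (decision, call) for one tool call.

    allowed_tools maps each user to the set of tools they may invoke;
    approval_tools is the set of tools that always need human sign-off.
    """
    if call.tool not in allowed_tools.get(call.user, set()):
        return "deny", call
    if call.tool in approval_tools:
        return "needs_approval", call
    if call.params.get("limit", 0) > 100:
        # Transform: cap an oversized query rather than rejecting it.
        capped = ToolCall(call.user, call.agent, call.tool,
                          {**call.params, "limit": 100})
        return "transform", capped
    return "allow", call
```

The ordering matters: identity is checked first, so a user without access never reaches the approval queue, and the transform step only runs on calls that would otherwise be allowed.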
The more consequential change is on-behalf-of (OBO) user execution. When a user triggers an agent workflow, the agent calls MCP servers using that user’s exact permissions rather than a shared service account. If the user cannot access a record, the agent cannot either, regardless of what the service principal technically allows. Every action maps to both the agent and the real user in the audit trail. For regulated industries, that is the difference between an audit-ready system and one that fails on first review.
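A small sketch shows why OBO changes the outcome. Everything here is hypothetical (the ACLs, the record names, and the `call_tool_obo` helper); the point is only that the permission check uses the requesting user’s grants and deliberately ignores what the shared service account could reach.

```python
# Hypothetical ACLs: which records each identity may read.
USER_ACL = {"alice": {"acct-1"}, "bob": {"acct-1", "acct-2"}}
SERVICE_ACL = {"acct-1", "acct-2", "acct-3"}  # broad shared service account


def call_tool_obo(user, agent, record, audit_log):
    """Run a tool call with the requesting user's permissions only.

    SERVICE_ACL is never consulted: under OBO, the agent cannot exceed
    what the human behind the request is allowed to see.
    """
    allowed = record in USER_ACL.get(user, set())
    # Record both identities whether or not the call is allowed.
    audit_log.append(
        {"user": user, "agent": agent, "record": record, "allowed": allowed}
    )
    if not allowed:
        raise PermissionError(f"{user} cannot access {record}")
    return f"contents of {record}"
```

Here the same agent succeeds when bob requests `acct-2` and fails when alice does, even though the service account could read it in both cases, and the audit log names the human each time.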
Payload logging for MCP calls captures the full request and response as system tables in Unity Catalog, queryable alongside the rest of the audit trail. This feature is currently in a gated beta requiring direct Databricks enrollment.
Unity AI Gateway vs Governing AI Without It

| Capability | Without Unity AI Gateway | With Unity AI Gateway |
|---|---|---|
| Model access control | Per-team API key management | Unity Catalog privilege grants |
| MCP governance | Shared service accounts, fragmented audit | OBO execution, full payload logging |
| Guardrails | Custom-built per application | Centralized, configurable per endpoint |
| Spend visibility | End-of-month billing | Real-time per-user, per-workspace tracking |
| Audit trail | Split across providers and tools | Unified in Unity Catalog inference tables |
| Failover | Manual or custom logic | Automatic across configured backup models |
Unity AI Gateway does not replace models or agents. It governs the layer between them and the organization.
Who Benefits Most from Unity AI Gateway

1. Data and AI Platform Teams

Platform teams carry accountability for how AI is deployed across the organization but rarely have the tooling to enforce consistent behavior across every team’s agents and models. Unity AI Gateway gives them a single control plane: one place to register AI assets, set policies, track spend, and pull audit logs, without coordinating separately with every team running agents on Databricks.

2. Security and Compliance Teams

Security teams working on enterprise data governance face a new challenge: AI agents are now among the most privileged actors in the enterprise, and most governance frameworks were built before agents existed. Unity AI Gateway extends the same trust boundaries Unity Catalog applies to data into every model call, tool invocation, and MCP interaction. Compliance teams get queryable audit logs, guardrail enforcement records, and identity-traced actions in one place.

3. Engineering Teams Running Coding Agents at Scale

Coding agents like Cursor, Claude Code, and GitHub Copilot are being adopted fast in enterprise engineering organizations. Each connects to MCP servers to access code repositories, CI/CD systems, internal APIs, and documentation. Without governance, each agent creates its own OAuth tokens, its own tool connections, and its own audit trail, or more commonly, none at all. Databricks AgentBricks provides related context on how Databricks structures agent infrastructure at the platform level.
Unity AI Gateway addresses this through a central MCP registry. Every coding agent is routed through the same governance layer, with the same spend controls, identity enforcement, and payload logging applied consistently. Engineering teams keep the flexibility to use the tools they prefer; platform and security teams get the visibility they require.
How to Get Started with Unity AI Gateway

Unity AI Gateway is currently in Beta on both AWS and Azure Databricks. Account admins must enable it through the Previews page in the account console before workspace users can access it. AWS GovCloud and Azure Government regions are currently excluded from the Beta. The full configuration path is documented at docs.databricks.com.

1. Registering LLM Endpoints and MCP Servers

Each AI service must be registered as a Unity Catalog securable object before the gateway can govern it. Databricks-hosted foundation models are available by default. External models require a Unity Catalog connection object pointing to the provider’s API, and MCP servers register through HTTP connections with tool-level filtering configured at registration time.
Once registered, access is controlled with standard Unity Catalog GRANT and REVOKE statements. Teams who have worked through Databricks security configuration will recognize this as an extension of the access control model already in place.
2. Setting Spend Caps and Service Policies

Budget policies are configured through the Budgets section of account settings, with Unity AI Gateway selected as the resource type. Set thresholds at whichever combination of per-user, per-use-case, per-workspace, and per-account levels the organization requires.
Service policies are written as Unity Catalog functions and attached to registered services. Built-in guardrails (PII, content safety, prompt injection) attach directly. Custom guardrails require a system prompt and a model selection. All policies apply to both the registered service and any agents calling it through the gateway.
    -- Grant a team access to a registered model endpoint
    GRANT EXECUTE ON FUNCTION ai_gateway.endpoints.claude_sonnet
      TO `[email protected]`;

    -- Grant a principal EXECUTE on the PII redaction policy function
    GRANT EXECUTE ON FUNCTION ai_gateway.policies.pii_redaction
      TO `[email protected]`;

Both statements run in the Unity Catalog SQL editor. No separate admin console or API call is required; the same permission model used for Delta tables applies here.
3. Enabling Payload Logging for Audit Trails

Inference table logging is enabled per endpoint in the gateway configuration. Once active, every request and response is stored as a Delta table in Unity Catalog, queryable by identity, cost, latency, and content. Lakewatch integration is enabled at the account level and streams AI activity automatically once configured. MCP payload logging is in a gated beta; contact the Databricks account team to enroll.
How Kanerika Helps Enterprises Implement Unity AI Gateway on Databricks

At Kanerika, AI agent deployments begin with governance design, before any agent code runs. Every engagement starts with defining what the agent can access, under what conditions, and how its actions are audited. This approach is grounded in the same principles covered in our overview of enterprise data governance and extended to the AI interaction layer.

As a Databricks Consulting Partner, Kanerika has deployed AI agents across regulated industries where compliance requirements are a baseline. The firm has production systems running across manufacturing, financial services, logistics, and healthcare, with governance embedded at the infrastructure layer from day one.

For organizations building on Databricks and working toward production-ready agentic AI, Kanerika brings the implementation depth to design AI governance structures that hold under real operational conditions. Kanerika holds ISO 27001 and SOC 2 Type II certifications, is recognized as an Everest Group Top Aspirant in the Data and AI Services PEAK Matrix 2025, and maintains a 98% client retention rate across 100+ enterprise clients.
A regulated enterprise managing over one million expert consultations globally needed policy checks running against data operations in real time, before they executed. Manual batch-cycle review was too slow, and violations were surfacing hours after the fact.
Challenge

- Compliance violations detected on batch review schedules, not in real time
- Manual teams unable to match the volume of data operations across distributed systems
- No enforcement layer between data access events and downstream processes

Solution

Kanerika built a real-time compliance and risk detection AI agent that intercepted every relevant data operation and assessed it against policy rules in real time. Stateful context tracked risk patterns across full sessions. Governance was embedded at the infrastructure layer, applied before actions executed rather than audited after the fact.

Results

- 60% reduction in time spent screening negative news
- 3x faster expert vetting process
- 70% decrease in backlog cases
- 40% reduction in compliance event delays
Wrapping Up

Unity AI Gateway closes a real gap. Most enterprises can govern who accesses their data. Few can govern what their AI agents do with that access: what they called, what they said, what it cost, and whether any policy held. By extending Unity Catalog’s governance to the runtime layer, Databricks gives organizations a path to AI deployment that can be audited, controlled, and trusted.
The system is still in Beta, with some features requiring direct enrollment, particularly MCP payload logging. But the core architecture is production-ready for teams willing to configure it correctly before agents go live. The enterprises that build the governance foundation now will have far less to retrofit when the regulatory environment around agentic AI firms up.
FAQs

1. What is Databricks Unity AI Gateway?

Databricks Unity AI Gateway is a centralized governance and management layer for AI models and endpoints within the Databricks Data Intelligence Platform. It enables organizations to securely access, monitor, and manage both proprietary and open-source AI models through a single interface. By applying consistent governance, authentication, and usage policies, it simplifies enterprise AI operations while helping organizations maintain security, compliance, and operational control.

2. Why do organizations use Databricks Unity AI Gateway?

Organizations use Databricks Unity AI Gateway to simplify AI governance across multiple teams, applications, and model providers. Instead of managing security and access controls separately for each AI service, the gateway provides a unified management layer. This improves visibility into AI usage, strengthens compliance, and makes it easier to scale AI initiatives while maintaining consistent governance standards across the enterprise.

3. How does Databricks Unity AI Gateway improve AI governance?

Databricks Unity AI Gateway improves governance by centralizing authentication, authorization, monitoring, and policy enforcement for AI models. Administrators can define who has access to specific models, monitor API usage, and maintain detailed audit logs for compliance. These capabilities help organizations reduce security risks, improve accountability, and ensure AI systems operate within established governance and regulatory frameworks.

4. Can Databricks Unity AI Gateway work with external AI models?

Yes. Databricks Unity AI Gateway supports both Databricks-hosted models and external foundation model providers, allowing organizations to manage diverse AI environments through a single governance layer. This flexibility enables enterprises to adopt the best models for different use cases while maintaining consistent security, monitoring, and access policies regardless of where the models are hosted.

5. What are the benefits of using Databricks Unity AI Gateway?

Databricks Unity AI Gateway offers several benefits, including centralized AI governance, stronger security, simplified model management, usage monitoring, and improved compliance. It also helps organizations standardize AI operations across departments while providing greater visibility into model performance and consumption. As AI adoption grows, these capabilities make it easier to manage AI resources efficiently and responsibly.

6. How does Databricks Unity AI Gateway help control AI costs?

Databricks Unity AI Gateway provides detailed visibility into AI model usage, making it easier for organizations to understand consumption patterns and identify high-cost workloads. Administrators can track API requests, monitor endpoint activity, and optimize how models are used across teams. These insights help businesses allocate resources more effectively while preventing unnecessary AI spending as deployments scale.

7. Who should use Databricks Unity AI Gateway?

Databricks Unity AI Gateway is designed for enterprises building and managing AI applications at scale. It is particularly valuable for AI engineers, data engineers, platform administrators, security teams, and governance leaders responsible for ensuring secure model access, regulatory compliance, and operational consistency. Organizations deploying multiple AI models or supporting multiple business units benefit the most from its centralized management capabilities.

8. How can organizations successfully implement Databricks Unity AI Gateway?

Successful implementation starts with a strong governance strategy and a well-managed Unity Catalog environment. Organizations should establish clear access controls, define AI usage policies, configure monitoring and audit capabilities, and integrate the gateway into existing AI workflows. Regular reviews of model usage, performance, and compliance help ensure the gateway continues to support secure, scalable, and efficient enterprise AI operations over time.