Cross-organizational data sharing has a problem most teams do not talk about: most of it still runs on workarounds such as scheduled exports to S3, FTP transfers with shared passwords, and replicated datasets that are stale before the recipient runs a single query. These approaches are brittle, ungoverned, and nearly impossible to revoke once the data is out.
Databricks Delta Sharing was built to replace them. Donated to the Linux Foundation in 2021, it is an open protocol that lets data providers share live data directly from their lakehouse without copying files, replicating pipelines, or handing over storage credentials. The recipient reads from the source. The provider stays in control.
This article covers how the protocol works, when to use each sharing model, what Unity Catalog adds to the picture, and where Delta Sharing hits its limits in production.
Key Takeaways
Databricks Delta Sharing is an open protocol for sharing live data across organizations without copying or replicating it.
Three sharing models exist: Databricks-to-Databricks, Open Sharing, and a self-managed open source server, each suited to different recipient setups.
Unity Catalog is not required to use Delta Sharing, but it adds centralized access control, audit logging, and AI asset sharing.
Shareable assets now include Delta tables, views, volumes, notebooks, AI models, and materialized views.
Delta Sharing carries real limitations: performance degrades on high-volume tables, egress costs accumulate cross-region, and lineage does not cross the sharing boundary.
For enterprises already on Databricks, Delta Sharing can replace most partner data exchange workflows without building a separate API layer.
Why Enterprise Data Sharing Breaks Down at Scale
Most enterprise data sharing still runs on copies. A team exports a dataset, transfers it to a partner, and both sides treat that snapshot as the source of truth. Here is where that model consistently fails:
Stale data by design: The copy is outdated the moment it lands. When something changes upstream, nothing propagates the change, so the only fix is another export cycle.
No governance after hand-off: Once data leaves the lakehouse, the provider loses visibility into how it is used. There are no access controls, no audit logs, and no revocation mechanism.
APIs do not scale for bulk reads: Custom APIs built for real-time access require dedicated engineering, rate limiting, and versioning. For terabyte-scale datasets, a REST API is the wrong abstraction. It is designed for request-response patterns, not bulk analytical reads.
Databricks Delta Sharing fits where these approaches break down: it provides governed, real-time data sharing at scale without requiring the recipient to be on any particular platform.
How Databricks Delta Sharing Works
Delta Sharing is an open protocol developed by Databricks and donated to the Linux Foundation in 2021. It lets organizations share live data directly from their lakehouse with any recipient, on any platform, without copying files or handing over storage credentials. The way it achieves this comes down to how data access actually works under the hood.
When a recipient queries a shared table, no data is copied or staged along the way. The Delta Sharing server issues a short-lived, pre-signed URL pointing to the files in the provider’s cloud storage, and the recipient’s compute reads directly from that location. The diagram below shows that request path end to end.
Three objects make this work:
Provider: the organization or team that owns the data and defines what gets shared.
Share: a named, read-only collection of assets, including Delta tables, views, partitions, volumes, notebooks, and AI models.
Recipient: a named object in the provider’s Unity Catalog representing the person or team receiving access. The recipient type (Databricks or open) determines which authentication model and asset set apply.
Because the data never leaves the provider’s storage, the recipient always reads the latest version. There is no refresh cycle, no replication lag, and no synchronization job. Cox Automotive Europe uses this model to share governed data with subsidiaries without copying or replicating across environments.
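The short-lived, pre-signed URL is the piece that lets a recipient read files without ever holding storage credentials. The sketch below is a toy model of that idea using only the Python standard library. The signing key, the `storage.example.com` host, the file path, and the HMAC scheme are all assumptions made for illustration; real cloud providers use their own signing formats, and this is not how any of them implement it.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical secret held by the storage layer. Real object stores use
# their own signing schemes; this HMAC is only a stand-in for the concept.
SIGNING_KEY = b"demo-only-secret"


def presign(path: str, ttl_seconds: int, now: float) -> str:
    """Return a URL that grants read access to `path` until it expires."""
    expires = int(now) + ttl_seconds
    message = f"{path}:{expires}".encode()
    signature = hmac.new(SIGNING_KEY, message, hashlib.sha256).hexdigest()
    query = urlencode({"expires": expires, "sig": signature})
    return f"https://storage.example.com{path}?{query}"


def is_valid(url: str, now: float) -> bool:
    """Check the signature and the expiry, as the storage layer would."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    expires = int(params["expires"][0])
    message = f"{parsed.path}:{expires}".encode()
    expected = hmac.new(SIGNING_KEY, message, hashlib.sha256).hexdigest()
    signature_ok = hmac.compare_digest(expected, params["sig"][0])
    return signature_ok and now < expires


issued_at = 1_700_000_000.0
url = presign("/tables/orders/part-0001.parquet", ttl_seconds=900, now=issued_at)
```

The point of the design is that the URL itself is the credential: it names one file, it expires on its own, and changing the path invalidates the signature, so the provider never has to hand out anything longer-lived.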
Databricks-to-Databricks vs Open Sharing
Databricks Delta Sharing offers three sharing models. The right choice depends on whether the recipient is on Databricks, what assets need to be shared, and how much governance infrastructure the provider has in place.
When Databricks-to-Databricks Sharing Makes Sense
Databricks-to-Databricks (D2D) sharing requires both sides to have Unity Catalog-enabled workspaces. Setup uses a metastore sharing identifier that uniquely identifies the recipient’s Unity Catalog metastore. No credential files or tokens are exchanged manually.
D2D supports the broadest asset set and provides the deepest governance integration, with Unity Catalog managing access, auditing, and lineage on both sides.
When Open Sharing Makes Sense
Open sharing lets providers share tabular data with recipients on any platform, including Power BI, pandas, the Python delta-sharing connector, Apache Spark, Tableau, and custom applications. The recipient receives a credential file with a Delta Sharing endpoint and bearer token.
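To make the credential file concrete, here is a minimal sketch of what a recipient might do with it. The field names follow the Delta Sharing profile file format; the endpoint, token, file location, and the `logistics.shipments` schema and table names are placeholders invented for this example.

```python
import json
import tempfile
from pathlib import Path

# Field names follow the Delta Sharing profile file format. The endpoint and
# token values are placeholders, not real credentials.
profile = {
    "shareCredentialsVersion": 1,
    "endpoint": "https://sharing.example.com/delta-sharing/",
    "bearerToken": "<token-from-provider>",
    "expirationTime": "2026-12-31T00:00:00.0Z",
}


def table_url(profile_path: Path, share: str, schema: str, table: str) -> str:
    """Build the `<profile>#<share>.<schema>.<table>` address connectors expect."""
    return f"{profile_path}#{share}.{schema}.{table}"


# Write the profile to disk the way a downloaded credential file would sit.
profile_path = Path(tempfile.mkdtemp()) / "config.share"
profile_path.write_text(json.dumps(profile, indent=2))

loaded = json.loads(profile_path.read_text())
url = table_url(profile_path, "partner_supply_data", "logistics", "shipments")

# With the open source connector installed (pip install delta-sharing),
# the recipient could then read the table into pandas:
#   import delta_sharing
#   df = delta_sharing.load_as_pandas(url)
```

Because the bearer token in this file is the recipient's entire identity, it should be stored and rotated like any other secret.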
Unity Catalog governs the provider side, but the recipient’s environment is uncontrolled. Asset support is narrower: tables, views, partitions, and change data feeds only. For teams evaluating whether Databricks vs Snowflake is the right platform, the sharing model is one of the key differentiators.
Comparing the Two Models
| Dimension | Databricks-to-Databricks | Open Sharing |
| --- | --- | --- |
| Recipient platform | Databricks (Unity Catalog) | Any platform |
| Credential handling | Metastore identifier, no manual tokens | Bearer token or OIDC federation |
| Supported assets | Tables, views, volumes, notebooks, AI models, materialized views | Tables, views, partitions, change data feeds |
| Governance depth | Full Unity Catalog on both sides | Provider side only |
| Audit support | Both provider and recipient | Provider side only |
| Setup effort | Medium (metastore identifier exchange) | Low (credential file download) |
| Best for | Internal teams, trusted partners on Databricks | External partners, mixed-platform consumers |
| Key limitation | Recipient must be on Unity Catalog-enabled workspace | Narrower asset support; token management on recipient side |
The third model, a self-managed open source Delta Sharing server, allows sharing from any platform to any platform without Databricks. The Databricks documentation does not cover setting up this path; the server is maintained as a separate open source project, and the provider manages the infrastructure independently.
Unity Catalog’s Role in Secure Delta Sharing
Unity Catalog is not a prerequisite for Databricks Delta Sharing, but the gap between a Unity Catalog deployment and the open source Delta Sharing alternative is significant. Here is what Unity Catalog adds.
1. Centralized Access Control for Shared Assets
Access is managed through GRANT and REVOKE SQL commands, the same commands that govern the rest of the lakehouse. Shares and recipients are securable objects in the catalog. Permissions are auditable, consistent, and manageable through standard data admin tooling.
2. Shares as Unity Catalog Securable Objects
A share functions like a catalog object with its own privilege model. Admins grant access to a specific recipient, revoke it when a partnership ends, and track every access event in the audit log. A financial services firm sharing market data can demonstrate exactly who accessed what and when, without building a separate data governance layer.
3. Row and Column-Level Control Through Views
A provider creates a view that filters rows, masks columns, or applies dynamic access policies based on recipient identity, then shares that view instead of the underlying table. This is the standard pattern for compliance-constrained data sharing.
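A minimal sketch of the view pattern, written as a small Python helper that assembles the SQL text. The helper name, the catalog, schema, table, and column names, and the choice of hashing as the masking method are all assumptions for illustration; a real policy might redact, truncate, or tokenize instead.

```python
def build_shared_view_sql(view, source, keep, masked, row_filter):
    """Return a CREATE VIEW statement that filters rows and hashes columns.

    `keep` columns pass through unchanged, `masked` columns are replaced by
    a SHA-256 digest, and `row_filter` becomes the WHERE clause.
    """
    select_list = list(keep) + [f"sha2({col}, 256) AS {col}" for col in masked]
    return (
        f"CREATE VIEW {view} AS\n"
        f"SELECT {', '.join(select_list)}\n"
        f"FROM {source}\n"
        f"WHERE {row_filter}"
    )


# Hypothetical object names; substitute the provider's own catalog layout.
stmt = build_shared_view_sql(
    view="sales.shared.orders_emea",
    source="sales.core.orders",
    keep=["order_id", "order_date", "amount"],
    masked=["customer_email"],
    row_filter="region = 'EMEA'",
)
```

The provider would then add the view, not `sales.core.orders`, to the share, so the recipient never sees the unfiltered rows or the raw email column.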
4. Audit, Lineage, and Usage Tracking
Unity Catalog logs every access event for shared tables. Providers query system tables to see which recipients accessed which assets, how often, and from which compute. That is enough for most compliance audit requirements without a separate logging infrastructure.
One structural limit worth noting: lineage does not cross the sharing boundary. Downstream pipelines built on a shared table are not visible in the provider’s catalog graph. Teams with end-to-end lineage requirements need to plan for this explicitly. The same issue arises in data mesh architectures, where governance has to span domain boundaries.
What Teams Can Share with Databricks Delta Sharing
The asset scope for Databricks Delta Sharing has expanded since its 2021 launch. Delta Sharing also powers Databricks Marketplace, where organizations can publish and access datasets, AI models, and notebooks without platform lock-in. As of early 2026, what is shareable depends on the sharing model in use.
Delta tables are the foundation of what gets shared. Any Delta table in Unity Catalog can be added to a share. Change data feed (CDF) support lets recipients query only incremental changes rather than scanning the full table each time. This is one of the Databricks lakehouse architecture features that most directly benefits sharing workflows.
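To show why the change data feed matters for recipients, here is a small sketch of replaying change rows onto a local result instead of rescanning the full table. The `_change_type` and `_commit_version` columns are the ones Delta's change data feed emits; the `id` and `status` columns and the sample rows are made up for the example.

```python
def apply_changes(snapshot, changes, key="id"):
    """Apply change data feed rows to a local snapshot keyed by `key`."""
    result = dict(snapshot)
    # Replay in commit order so that later versions win.
    for row in sorted(changes, key=lambda r: r["_commit_version"]):
        kind = row["_change_type"]
        # Strip the CDF metadata columns to recover the data row itself.
        data = {k: v for k, v in row.items() if not k.startswith("_")}
        if kind in ("insert", "update_postimage"):
            result[data[key]] = data
        elif kind == "delete":
            result.pop(data[key], None)
        # update_preimage rows carry the old values and are skipped here.
    return result


snapshot = {
    1: {"id": 1, "status": "open"},
    2: {"id": 2, "status": "open"},
}
changes = [
    {"id": 1, "status": "open", "_change_type": "update_preimage", "_commit_version": 5},
    {"id": 1, "status": "closed", "_change_type": "update_postimage", "_commit_version": 5},
    {"id": 2, "status": "open", "_change_type": "delete", "_commit_version": 6},
    {"id": 3, "status": "open", "_change_type": "insert", "_commit_version": 6},
]
current = apply_changes(snapshot, changes)
```

A recipient that maintains a downstream aggregate can process four change rows here rather than rereading every row of the table, which is also what keeps the provider's egress bill down.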
Views are the standard pattern for fine-grained access. A provider can filter rows, mask PII columns, or scope access by recipient before sharing.
| Asset Type | D2D Sharing | Open Sharing |
| --- | --- | --- |
| Delta tables | Yes | Yes |
| Views and filtered datasets | Yes | Yes |
| Partitions and change data feeds | Yes | Yes |
| Volumes (non-tabular files) | Yes | No |
| AI models | Yes | No |
| Notebooks | Yes | No |
| Materialized views and streaming tables | Yes (GA Sept 2025) | No |
The asset gap between D2D and open sharing is usually the deciding factor. Teams sharing AI models or raw files with internal partners need D2D. Teams sharing tabular data with external parties on mixed platforms go with open sharing.
Implementation Plan for Databricks Delta Sharing
Moving from evaluation to a working Databricks Delta Sharing deployment is a governance project as much as a technical one. The steps below cover what a solid rollout looks like in practice.
| Readiness Area | What to Confirm |
| --- | --- |
| Unity Catalog status | Provider workspace enabled; assets registered in catalog |
| Data classification | Tables classified as public, internal, confidential, or restricted |
| Ownership model | Share owner, access approver, and revocation owner named |
| Share architecture | Shares scoped to business purpose, not a single all-data share |
| Recipient type | D2D or open sharing selected per recipient platform |
| Monitoring setup | System table queries or alerts configured for access anomalies |
Having clear answers in all six areas before starting configuration prevents the most common failure modes: ungoverned access, orphaned recipients, and compliance gaps discovered after a credential file has already gone out.
How to Set Up a Share: Four Steps
Provider-side setup in Unity Catalog follows a consistent four-step pattern. Each command maps to one distinct action, none are optional, and order matters.
| # | Command | What It Does |
| --- | --- | --- |
| 1 | CREATE SHARE | Creates a named container for the assets you want to expose. Name it after the business relationship it serves, not the data inside it: partner_supply_data is more auditable than logistics_tables_v2. |
| 2 | ALTER SHARE ... ADD TABLE | Adds a Delta table or view to the share. For anything touching PII or restricted columns, expose a view with row filters or column masks applied; never share the raw table directly. |
| 3 | CREATE RECIPIENT | Registers the external party. For open sharing, generates a credential file they use to authenticate. For Databricks-to-Databricks sharing, uses the recipient’s metastore identifier instead, so no credential file changes hands. |
| 4 | GRANT SELECT ON SHARE | Connects the recipient to the share. They can query from their own compute immediately after. Revoke access at any time with a single REVOKE command; no credential rotation is needed. |
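Once access is granted to a Databricks-to-Databricks recipient, the share appears in the recipient’s Catalog Explorer under “Shared with me”, where it can be mounted as a catalog. Open sharing recipients connect with their credential file instead.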
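The four commands can be sketched as a small Python helper that emits the SQL in order. The helper names, the recipient name `acme_carrier`, and the table name are hypothetical; `partner_supply_data` is the share name used in the table above. In a Databricks notebook each statement could be passed to `spark.sql(...)`, though that step is left out here so the sketch runs anywhere.

```python
def share_setup_statements(share, tables, recipient, sharing_id=None):
    """Return the provider-side SQL statements in the order they must run."""
    statements = [f"CREATE SHARE IF NOT EXISTS {share}"]
    # This sketch only adds tables; a view would use ADD VIEW instead.
    statements += [f"ALTER SHARE {share} ADD TABLE {table}" for table in tables]
    if sharing_id:
        # Databricks-to-Databricks: bind the recipient to its metastore identifier.
        statements.append(
            f"CREATE RECIPIENT IF NOT EXISTS {recipient} USING ID '{sharing_id}'"
        )
    else:
        # Open sharing: the recipient authenticates with a credential file.
        statements.append(f"CREATE RECIPIENT IF NOT EXISTS {recipient}")
    statements.append(f"GRANT SELECT ON SHARE {share} TO RECIPIENT {recipient}")
    return statements


def revoke_statement(share, recipient):
    """Return the single statement that cuts off a recipient's access."""
    return f"REVOKE SELECT ON SHARE {share} FROM RECIPIENT {recipient}"


# Hypothetical names for the recipient and the shared table.
setup = share_setup_statements(
    share="partner_supply_data",
    tables=["logistics.shared.shipments"],
    recipient="acme_carrier",
)
offboard = revoke_statement("partner_supply_data", "acme_carrier")
```

Keeping the statements in one reviewed function, rather than typing them ad hoc, gives the share owner and the revocation owner a single place to audit what was granted.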
Step 1: Confirm Unity Catalog Readiness
Databricks began enabling Unity Catalog automatically for new workspaces in November 2023, rolling the change out gradually across accounts. Older workspaces require a manual upgrade. Confirm the provider workspace is Unity Catalog-enabled and that the assets intended for sharing are already registered in the catalog before planning anything else.
Step 2: Define Provider and Recipient Ownership
Before configuring anything, three roles need to be named explicitly: a share owner who controls what gets added, an access approver who signs off on new recipients, and a revocation owner who handles partner offboarding. These should be job titles tied to actual people, not team names or shared inboxes.
A practical pattern: the data domain lead owns the share, the data steward approves recipients, and the security team owns revocation. Whatever the structure, document it before the first credential file goes out. Discovering that no one knows who can revoke access is a compliance problem that surfaces at the worst time.
Step 3: Classify Data Before Sharing
Before adding a table to a share, classify it: public, internal, confidential, or restricted. This classification work is foundational to any data governance framework. Confidential and restricted datasets should be shared only through views with row filters or column masks applied. This step is often skipped and creates compliance exposure later.
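The classification rule above is simple enough to encode as a pre-share check. This is a sketch only: the four levels come from the text, while the function name, the return strings, and the decision to block unclassified tables outright are assumptions about how a team might enforce it.

```python
from enum import Enum


class Classification(Enum):
    PUBLIC = "public"
    INTERNAL = "internal"
    CONFIDENTIAL = "confidential"
    RESTRICTED = "restricted"


# Levels that may only leave the lakehouse behind a filtering or masking view.
VIEW_ONLY = {Classification.CONFIDENTIAL, Classification.RESTRICTED}


def sharing_mode(classification):
    """Decide how a table may be added to a share, given its classification."""
    if classification is None:
        # Assumed policy: an unclassified table is never shared.
        return "blocked: classify first"
    if classification in VIEW_ONLY:
        return "view only"
    return "table or view"


# Hypothetical table inventory; None marks a table nobody has classified yet.
inventory = {
    "store_locations": Classification.PUBLIC,
    "customer_orders": Classification.CONFIDENTIAL,
    "legacy_export": None,
}
decisions = {name: sharing_mode(level) for name, level in inventory.items()}
```

Running a check like this before every ALTER SHARE makes the skipped-classification failure mode visible at the moment it would otherwise slip through.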
Step 4: Design Shares by Business Purpose
Resist the pattern of one large share with everything in it. Scope shares to a business relationship or data product. This pattern maps directly to data mesh principles, where each domain owns and controls what it exposes. A supplier share, a partner analytics share, and an internal cross-region share should be three separate objects, each with its own recipient list and review cadence.
Step 5: Test Access with a Limited Recipient Group
Validate the full read path with a single test recipient before rolling out broadly. Confirm the credential setup works, the data is accessible in the recipient’s preferred tool, and access events appear in the provider’s Unity Catalog system tables.
Step 6: Monitor Usage and Refine Policies
Set up alerts on unusual access patterns. A recipient querying a full table at high frequency is a signal that the sharing design or the recipient’s query pattern needs attention. Review the recipient list quarterly and revoke access for inactive relationships.
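Both monitoring checks described above can be sketched over a list of access events. The event fields (`recipient`, `table`, `full_scan`, `time`), the threshold of three full scans, and the 90-day idle window standing in for "quarterly" are all assumptions; a real deployment would derive the events from the provider's Unity Catalog system tables, whose schema is not reproduced here.

```python
from collections import Counter
from datetime import datetime, timedelta


def flag_heavy_readers(events, threshold):
    """Return (recipient, table) pairs with at least `threshold` full scans."""
    counts = Counter(
        (e["recipient"], e["table"]) for e in events if e["full_scan"]
    )
    return sorted(pair for pair, n in counts.items() if n >= threshold)


def inactive_recipients(recipients, events, as_of, max_idle=timedelta(days=90)):
    """Return recipients with no access at all, or none within `max_idle`."""
    last_seen = {}
    for e in events:
        previous = last_seen.get(e["recipient"])
        if previous is None or e["time"] > previous:
            last_seen[e["recipient"]] = e["time"]
    return sorted(
        r for r in recipients
        if r not in last_seen or as_of - last_seen[r] > max_idle
    )


# Hypothetical access events for three registered recipients.
events = [
    {"recipient": "carrier_a", "table": "shipments", "full_scan": True, "time": datetime(2026, 1, 10, 8)},
    {"recipient": "carrier_a", "table": "shipments", "full_scan": True, "time": datetime(2026, 1, 10, 9)},
    {"recipient": "carrier_a", "table": "shipments", "full_scan": True, "time": datetime(2026, 1, 10, 10)},
    {"recipient": "carrier_b", "table": "shipments", "full_scan": False, "time": datetime(2025, 9, 1, 12)},
]
recipients = ["carrier_a", "carrier_b", "carrier_c"]

heavy = flag_heavy_readers(events, threshold=3)
stale = inactive_recipients(recipients, events, as_of=datetime(2026, 1, 15))
```

The first list feeds an alert; the second feeds the quarterly review, where each stale recipient is either revoked or explicitly re-approved.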
Before You Deploy: Limitations Worth Knowing
Most coverage of Databricks Delta Sharing focuses on what it can do. The deployment considerations below come from production use cases and should be part of any enterprise evaluation.
Performance on high-volume tables: Delta Sharing suits analytical access patterns, not high-throughput operational reads. For large tables with many recipients running varied queries simultaneously, response times degrade. A Databricks SQL Warehouse is the more reliable path for those workloads.
Egress costs across regions and clouds: Providers pay cloud egress when a recipient reads data from a share in a different region. For high-volume cross-region sharing, this cost compounds quickly. Cloudflare R2 eliminates egress charges and is worth evaluating for those architectures.
Lineage stops at the sharing boundary: Downstream pipelines built on a shared table do not appear in the provider’s Unity Catalog lineage graph. Teams with end-to-end lineage requirements for compliance reporting need to account for this gap explicitly.
Tables with row-level security (RLS) or column masks cannot be shared directly: The workaround is to create a view that applies the policy, then share the view instead.
The open source self-managed server sacrifices governance depth: Without Unity Catalog, there is no centralized access control, audit logging, or AI asset sharing. This path works for simple tabular exchange but is not suited to enterprise compliance requirements.
Firewall and network restrictions: Delta Sharing delivers data via pre-signed URLs pointing to the provider’s cloud storage. If the recipient’s network blocks outbound access to the provider’s storage endpoint, reads fail with 403 errors. In strict enterprise environments, recipients need stable egress IPs through VNet-injected workspaces with a NAT Gateway, or Serverless compute with Network Connectivity Configurations. Providers can further restrict access using per-recipient IP access lists in Unity Catalog. For the most locked-down setups, Private Link to the storage layer is the cleanest solution, though it adds infrastructure overhead on both sides.
None of these is a dealbreaker, but they need to be part of the architecture conversation before the first share goes live.
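For the egress point in the list above, a back-of-the-envelope estimate is often enough to decide whether the cost deserves design attention. The sketch below multiplies out read volume; every number in the example, including the per-GB rate, is a hypothetical input and not a quoted cloud price.

```python
def monthly_egress_cost(gb_per_read, reads_per_day, recipients, usd_per_gb, days=30):
    """Estimate monthly provider-side egress for cross-region reads."""
    total_gb = gb_per_read * reads_per_day * recipients * days
    return total_gb * usd_per_gb


# Hypothetical workload: 5 recipients each scanning 50 GB four times a day.
# The 0.02 USD/GB rate is an illustrative placeholder, not a real price.
cross_region = monthly_egress_cost(
    gb_per_read=50, reads_per_day=4, recipients=5, usd_per_gb=0.02
)

# The same workload on storage that does not charge for egress.
zero_egress = monthly_egress_cost(
    gb_per_read=50, reads_per_day=4, recipients=5, usd_per_gb=0.0
)
```

Under these assumed inputs the workload moves 30,000 GB a month, so the bill scales linearly with each added recipient or refresh; incremental reads through the change data feed reduce the `gb_per_read` term directly.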
From Delta Sharing Evaluation to Deployment: Kanerika’s Approach
In many cases, enterprises running Databricks have the platform in place but lack the governance architecture to use Delta Sharing safely. In practice, Unity Catalog may be under-configured, data classification policies may not exist, and sharing designs tend to be ad hoc rather than structured around business relationships. As a result, governed data sharing remains out of reach even for teams that have already invested in the platform.
Kanerika is a registered Databricks Consulting Partner with implementation experience across data engineering, lakehouse architecture, and governed data sharing for enterprise clients in financial services, retail, and manufacturing.
One pattern we see consistently: organizations that have been running Databricks for years still have not done formal data classification. Delta Sharing surfaces that gap immediately, because a share cannot be scoped responsibly without knowing which tables are confidential versus internal versus public. In practice, the Delta Sharing rollout becomes the trigger for a data governance project the team was not planning for. Building that classification work into the implementation timeline upfront, rather than discovering it mid-deployment, is the single biggest factor in whether a Delta Sharing rollout stays on schedule.
In one engagement, a logistics client managing data exchange with multiple carrier partners migrated from a manual CSV export process to a Delta Sharing architecture on Databricks. Partner data latency dropped from a 3-day export cycle to near-real-time access. The recurring export job was eliminated, and for the first time the compliance team had an auditable access log showing exactly which carrier accessed which dataset and when. The client’s previous process made that kind of audit trail impossible to produce. (Anonymized case study)
To that end, Kanerika’s implementation practice covers Unity Catalog readiness assessment, data classification and governance policy design, share and recipient architecture, and ongoing monitoring setup. For enterprises evaluating whether Delta Sharing fits their partner data exchange model, Kanerika runs structured readiness assessments before any technical configuration begins.
Case Study: Modernizing Retail Analytics Infrastructure with Databricks
Challenges:
Previously, a national retail corporation ran analytics across siloed systems with no unified data layer, forcing teams to reconcile inconsistent reports manually before any decision could be made. Legacy infrastructure could not scale to meet growing data volumes across store operations, inventory, and supplier data streams. Data sharing with external partners relied on manual exports, creating governance blind spots and stale datasets that reached partners days after generation.
Solution:
Kanerika migrated the client’s fragmented analytics infrastructure to a unified Databricks lakehouse, consolidating structured and unstructured data sources into a single governed environment. Unity Catalog was configured to manage access control, data classification, and audit logging across internal teams and external data consumers. Delta Sharing was implemented to replace manual export workflows, giving approved partners governed, live access to datasets without replication or credential hand-off.
Results:
All distributed on-premise data was consolidated into a single, governed Databricks cloud platform, giving business users consistent access for the first time.
Legacy hardware, backup systems, and on-premise maintenance overhead were fully retired, freeing IT capacity for strategic work.
Unity Catalog delivered role-based access control, centralized authentication, and data lineage tracking across all workspaces.
Phased migration execution with parallel system availability meant zero business downtime, with legacy systems decommissioned only after full validation.
Wrapping Up
Ultimately, Databricks Delta Sharing removes the architecture tax that copy-based data sharing imposes on enterprise teams. However, deploying it well requires governance decisions that most teams underestimate: data classification, share design, and recipient ownership all need to be settled before the first credential file goes out.
For organizations already invested in Databricks and Unity Catalog, Delta Sharing is the most direct path to governed, live data access for partners.
FAQs
What is Databricks Delta Sharing?
Databricks Delta Sharing is an open protocol for securely sharing live data from a Databricks lakehouse with external teams, without copying or replicating it. Developed by Databricks and donated to the Linux Foundation in 2021, it lets data providers share Delta tables, views, volumes, AI models, and notebooks with recipients on any computing platform through a token-based, read-only access model.
How does Databricks Delta Sharing work?
A provider defines a share (a read-only collection of data assets) and assigns recipients to it. When a recipient queries a shared table, the Delta Sharing server verifies access through Unity Catalog, then issues a short-lived, pre-signed URL pointing to the data files in the provider’s cloud storage. The recipient reads directly from storage without the data being moved or copied.
Does Delta Sharing copy data?
No. Delta Sharing is a zero-copy protocol. The data stays in the provider’s cloud storage at all times. Recipients receive temporary, read-only access to the files through pre-signed URLs. There is no replication, no export, and no copy persisted in the recipient’s environment unless the recipient chooses to save query results.
Does the recipient need Databricks to use Delta Sharing?
No. Open sharing allows recipients on any computing platform to access shared data using standard connectors for Apache Spark, pandas, Power BI, Tableau, and Excel, among others. Databricks-to-Databricks sharing requires the recipient to have a Unity Catalog-enabled Databricks workspace, but open sharing has no platform requirement.
What is the difference between Delta Sharing and Unity Catalog?
Delta Sharing is the protocol that governs how data is accessed and exchanged across organizational boundaries. Unity Catalog is the governance layer inside Databricks that manages access control, lineage, and audit logging. Delta Sharing can run without Unity Catalog using the open source server, but Unity Catalog adds centralized governance, AI asset sharing, and deeper audit support to any Delta Sharing deployment.
Is Delta Sharing secure for enterprise data sharing?
Delta Sharing authenticates open sharing recipients with bearer tokens or OIDC federation, grants file access through short-lived pre-signed URLs, encrypts data in transit with TLS, and records access events in Unity Catalog audit logs. In the D2D model, identity is handled through metastore identifiers rather than manually issued credentials. Providers can revoke recipient access at any time, and the read-only model prevents recipients from modifying data.
Can Delta Sharing work across cloud platforms?
Yes. Delta Sharing is designed for cross-cloud and cross-region access. A provider on AWS can share data with a recipient running Spark on Google Cloud, or a Power BI user on Azure. When sharing across cloud regions, providers should account for egress costs. Using Cloudflare R2 as the underlying storage eliminates egress charges for cross-region architectures.
What are the main limitations of Databricks Delta Sharing?
The key limitations are performance on high-volume tables, egress costs for cross-region sharing, and lineage that stops at the sharing boundary. Tables with row-level security or column masks cannot be shared directly and must be exposed through views. The open source self-managed server lacks Unity Catalog governance and is not suited for enterprise compliance requirements.