Databricks is a privately held, San Francisco-based data intelligence platform that releases open-weights models — most notably DBRX (132B-parameter MoE), MPT, and Dolly — under custom commercial licences. As a US-incorporated entity with no first-party inference endpoint on Stav, it is fully exposed to CLOUD Act and FISA Section 702 compulsion, which EU regulated-sector customers must account for in any Transfer Impact Assessment when using hosted Databricks APIs; self-hosted deployment of DBRX weights in an EEA facility removes this exposure on the inference path. Its compliance posture is strong on the platform side — SOC 2 Type II, ISO 27001, ISO 27701, a customer DPA with 2021 SCCs, and a public HackerOne bug bounty — but it has not signed the EU GPAI Code of Practice and carries an active class-action copyright lawsuit (In re Mosaic LLM Litigation, N.D. Cal. No. 3:24-cv-01451, motion to dismiss denied April 2026) that directly implicates the training-data transparency obligations of EU AI Act Article 53.
Databricks is US-incorporated and fully exposed to both the CLOUD Act (18 U.S.C. § 2713) and FISA Section 702. US authorities can compel access to customer data held anywhere globally. EU regulated-sector customers (finance, healthcare, government) must conduct a Transfer Impact Assessment and cannot rely on SCCs alone without supplementary measures, particularly for hosted Databricks API use cases.
Active class-action copyright lawsuit (In re Mosaic LLM Litigation, N.D. Cal. No. 3:24-cv-01451) covering the MPT and DBRX model families. Motion to dismiss denied by Judge Charles R. Breyer on April 21, 2026; the case is in active discovery. Potential statutory damages of up to $150,000 per work across ~196,000 titles. Described by legal observers as 'bet-the-company litigation.' Databricks has not yet asserted a fair use defence, unlike Meta and Anthropic, which both prevailed on that argument in similar cases.
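To put the damages figures in this finding in perspective, the sketch below multiplies the two numbers quoted above. It is a theoretical ceiling only, assuming every one of the approximately 196,000 titles were awarded the full $150,000 maximum; it is not a prediction of any actual award. The constant and function names are hypothetical, chosen for illustration.

```python
# Figures taken from the finding above; the title count is approximate.
MAX_STATUTORY_DAMAGES_PER_WORK_USD = 150_000  # per-work ceiling cited above
APPROX_TITLES = 196_000                       # "~196,000 titles"


def statutory_ceiling(per_work_usd: int, titles: int) -> int:
    """Return the theoretical maximum: every title awarded the per-work ceiling."""
    return per_work_usd * titles


ceiling_usd = statutory_ceiling(MAX_STATUTORY_DAMAGES_PER_WORK_USD, APPROX_TITLES)
print(f"Theoretical ceiling: ${ceiling_usd / 1e9:.1f} billion")
# prints "Theoretical ceiling: $29.4 billion"
```

Real exposure depends on class certification, the number of works actually at issue, and the per-work amount a court would award, none of which are known from the sources cited here.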
Databricks has not signed the EU GPAI Code of Practice as of June 2026, despite GPAI obligations being legally applicable since August 2025. Major peers (Google, OpenAI, Microsoft, Anthropic, Mistral) have signed. Commission enforcement powers activate August 2, 2026. Non-signatories face heightened scrutiny and must prove compliance via alternative adequate means — but no alternative compliance roadmap has been published by Databricks.
Training data provenance for the MPT series is disputed: the RedPajama/Books3 dataset used was later removed from HuggingFace for copyright infringement. Full training dataset composition for DBRX is not publicly disclosed, directly limiting EU AI Act Article 53 transparency assessment and GPAI Code of Practice Chapter 2 (Copyright) compliance. No public opt-out or consent mechanism described for training data.
CVE-2024-49194 (High severity): a JDBC driver vulnerability allowing potential remote code execution via JNDI injection, discovered and patched (in version 2.6.40) in late 2024. Separately, in 2024, Databricks' internal development environment (GitHub repos, one Okta environment) was accessed by a bug bounty researcher. Both incidents were handled transparently, but they indicate that meaningful exposure vectors exist in the software supply chain.
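For teams auditing their own deployments against this finding, a minimal sketch of a version check follows. It assumes the installed JDBC driver version is available as a plain dotted numeric string; how that string is obtained (build manifest, dependency lockfile, driver metadata) is outside the sketch. The helper names `parse_version` and `is_patched` are hypothetical and not part of any Databricks tooling.

```python
# First fixed release, per the finding above.
PATCHED_VERSION = (2, 6, 40)


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version string such as '2.6.38' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def is_patched(installed: str) -> bool:
    """True when the installed version is at or above the patched release."""
    # Tuples compare element by element, so (2, 6, 38) < (2, 6, 40) < (2, 7, 0).
    return parse_version(installed) >= PATCHED_VERSION


for version in ("2.6.38", "2.6.40", "2.7.0"):
    status = "patched" if is_patched(version) else "VULNERABLE"
    print(f"{version}: {status}")
```

Version strings carrying non-numeric suffixes would need extra parsing before this comparison applies.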
No named EU Data Protection Officer found in public-facing Databricks documentation. Only a general privacy@databricks.com contact is confirmed. A dedicated DPO may be required under GDPR Article 37 given the scale of personal data processing across Databricks' large European customer base.
DBRX is released under the custom Databricks Open Model Licence with an Acceptable Use Policy and gated access on DBRX Base. This is commercially usable but materially less permissive than Apache 2.0 or MIT. EU customers in regulated sectors should review the AUP for sector-specific restrictions before deploying in regulated production workflows.
Stav AI Act assessment
Editorial assessment, not legal advice. Stav's risk ratings, scores, and verdicts are our own analysis of publicly available information and may be incomplete or out of date. Verify independently before making compliance or procurement decisions.
ISO 27001:2013, ISO 27701, and SOC 2 Type II certifications confirmed; HITRUST CSF for Azure Databricks. Public HackerOne bug bounty programme with 260+ valid submissions from ~150 researchers, supplemented by private programme and third-party penetration testing.
GDPR DPA available with 2021 SCCs incorporated; self-serve DPA amendment process; subprocessor list published; EU–US Data Privacy Framework participation; EU Data Act Addendum published — demonstrating proactive engagement with EU digital regulation beyond GDPR.
Model cards published for DBRX Base and DBRX Instruct with architecture, training data summary (12T tokens), benchmark results, context length, and use limitations. LLM Foundry training framework and Unity Catalog governance framework open-sourced on GitHub. Proactive public security incident disclosure via blog and CVE bulletins.
Deep open-source heritage as creators of Apache Spark and Delta Lake; LLM Foundry, Unity Catalog, and MegaBlocks open-sourced on GitHub. Annual Data + AI Summit with confirmed 2026 edition. Founded by UC Berkeley academics with a strong published research heritage.
Actively scaling: completed Series L fundraise in December 2025; acquired Neon in 2025; launched Agent Bricks, Lakebase, Genie Code, and Lakewatch in 2025–2026; partnerships with Anthropic, OpenAI, Google Cloud, Microsoft Azure, and AWS. Founder-led, operationally stable, with confirmed 2026 annual summit.
Named major European regulated-sector customers (Santander Bank Polska, Rabobank, Raiffeisen, Erste Group, ABN AMRO) actively using the Databricks platform — suggesting a demonstrated track record of operating within EU financial services regulatory environments.
Privacy policy review
Creator profile
Databricks is a privately held, San Francisco-based data intelligence company founded by academics from UC Berkeley, known for pioneering the lakehouse architecture and for releasing DBRX, a 132B-parameter open-weights model under a custom commercial licence. It is fully CLOUD Act exposed as a US-incorporated entity, meaning EU customers in regulated sectors must account for potential US government access to data processed on the platform. While Databricks maintains a robust compliance programme — SOC 2 Type II, ISO 27001, ISO 27701, a public GDPR DPA with 2021 SCCs, and an active HackerOne bug bounty — an ongoing copyright class-action lawsuit (In re Mosaic LLM Litigation) over training data practices for the MPT and DBRX model families represents a material legal risk that EU-regulated buyers should monitor.
Stav editorial summary
Databricks is a United States entity. Training data and weights produced under United States jurisdiction are covered by the CLOUD Act.
Exposed on training. Inference is unaffected when hosted on Stav infrastructure inside the EEA.
Stav compliance has not yet scored Databricks. Scores are published once the policy review and infrastructure assessment complete.
Findings
Citations gathered when the Compliance Curator last reviewed this creator’s public-facing documents. Grouped by source so the picture stays auditable.
“The safeguards we use include implementing the European Commission’s Standard Contractual Clauses for transfers of personal information from the EEA o...”
“SCCs: Databricks’ customer DPA incorporates the 2021 Standard Contractual Clauses to transfer customer personal data to countries outside Europe where...”
“The weights of the base model (DBRX Base) and the finetuned model (DBRX Instruct) are available on Hugging Face under an open license. ”
Databricks, Inc. is a private company founded in 2013 by Ali Ghodsi, Ion Stoica, Matei Zaharia, Patrick Wendell, Reynold Xin, Andy Konwinski, and Arsa...
Headquartered in San Francisco, with offices around the world, Databricks is on a mission to simplify and democratize data and AI, helping data and AI...
As of April 2025, Databricks' leadership includes: Ali Ghodsi - Co-founder and CEO · Matei Zaharia - Co-founder and Chief Technologist · Naveen Z...
In December 2025, Databricks raised more than $4 billion in a Series L funding round at a $134 billion valuation.
DBRX Base is a mixture-of-experts (MoE) large language model trained from scratch by Databricks.
Wed 29 Apr 2026 // 18:05 UTC · Databricks cannot shake a class action lawsuit targeting its LLM, which several book authors contend was created with a...
The case, O’Nan v. Databricks, Inc., in the United States District Court for the Northern District of California, seeks damages for the authors and an...
Other executives include Matei Zaharia, Co-Founder, Chief Technology Officer, Board Member; Hatim Shafique, Chief Operating Officer and 21 others.
Use of DBRX is governed by the Databricks Open Model License and the Databricks Open Model Acceptable Use Policy.
Before you can download DBRX's weights and tokenizer, you'll need to visit the DBRX Hugging Face page and accept the license agreement. Note...
Databricks is ISO 27001:2013 certified.
Responsible disclosure culture demonstrated: 2024 internal environment access was publicly disclosed via official blog before external reporting; CVE-2024-49194 disclosed via public knowledge base bulletin. Transparent handling of security incidents is a positive signal for regulated-sector buyers.
Published safeguards & certifications
“We perform penetration testing through a combination of our in-house offensive security team, qualified third-party penetration testers and a year-rou...”
“In 2025, Databricks acquired a serverless database startup, Neon, for around $1 billion.”
“The Databricks Bug Bounty Program enlists the help of the hacker community at HackerOne to make Databricks more secure. HackerOne is the #1 hacker-pow...”
“A vulnerability in the Databricks JDBC Driver could potentially allow remote code execution (RCE) by triggering a JNDI injection via a JDBC URL parame...”
“Aug. 19, 2025, 11:14 PM UTC ... Software firm Databricks Inc. succeeded in fending off new claims from a group of authors alleging its latest AI model...”
“A federal judge in California has allowed authors in a copyright suit against Databricks and its AI subsidiary Mosaic ML to broaden their case to incl...”
As classified under Regulation (EU) 2024/1689.
Provider of GPAI model (general-purpose).