TL;DR
Databricks AI/BI Genie is Databricks' native agentic analytics surface — an LLM writes SQL against Unity Catalog tables, grounded by author-curated instructions, example queries, and certified SQL functions. Mitzu is an agentic product analytics platform. The Analytics Agent assembles funnel, retention, segmentation, journey, and cohort specifications; a deterministic query engine turns them into SQL. Both run on Databricks — Mitzu connects to Unity Catalog natively, alongside Snowflake, BigQuery, ClickHouse, Redshift, Postgres, Trino and others.
Use this comparison to evaluate tools through an agentic analytics lens: which platform enables an AI data analyst workflow with trusted SQL and a trusted semantic layer, not just faster dashboarding on top of Databricks.
Databricks has been pushing AI/BI Genie as the native conversational analytics surface on the lakehouse. That makes "Databricks Genie vs Mitzu" a fair question to ask: both run on the warehouse, both promise an agentic analytics workflow, and Databricks-using teams sit squarely in Mitzu's ICP. The honest framing is that they sit at different layers. Genie is general-purpose agentic SQL on Unity Catalog tables. Mitzu is agentic product analytics on the warehouse — narrower category, deterministic engine, semantic layer specialised for funnels, retention, journeys, and cohorts. They are complementary, and Mitzu connects to Databricks as a first-class warehouse.
What is Databricks Genie?
Databricks AI/BI Genie — surfaced through Genie Spaces — is the conversational layer of Databricks' AI/BI product. Business users ask questions in natural language; an LLM translates them into SQL, runs the SQL on a SQL warehouse, and returns answers with auto-generated visualisations. Data access is governed by Unity Catalog: row filters and column masks enforce per-user permissions, and users only see data they are allowed to access.
A Genie Space is configured by a domain expert who registers Unity Catalog datasets and curates a knowledge store: table descriptions, column synonyms, JOIN relationships, example SQL queries, and parameterised SQL functions. When the agent's response exactly matches one of those parameterised examples or functions, it is marked as a Trusted asset to signal verified accuracy. Genie is documented as a "compound AI system" that filters this curated context plus chat history into the LLM prompt that produces each query.
- Natural-language to SQL over Unity Catalog tables, with auto-generated visualisations.
- Knowledge store — table descriptions, column synonyms, JOIN relationships, example queries, SQL functions curated by domain experts.
- Trusted assets — responses that exactly match a parameterised example query or SQL function are flagged as verified.
- Unity Catalog governance — row filters, column masks, and per-user SELECT privileges automatically applied.
- Embeddable via APIs into apps like Microsoft Teams, Slack, and Glean; also available inside AI/BI Dashboards for ad-hoc follow-ups.
- Structured data only — per the Databricks documentation, Genie does not answer questions about unstructured documents (PDFs, Word files).
Genie is general-purpose by design. The same architecture handles sales analytics, finance, supply chain, customer success, and product usage equally — methodology lives in whatever SQL the LLM produces, helped by whatever example queries the domain expert curated. It does not ship native funnel, retention, segmentation, journey, or cohort primitives.
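To make the knowledge-store workflow concrete, here is a sketch of the kind of parameterised Unity Catalog SQL function a domain expert might register in a Genie Space as a trusted asset. The catalog, schema, table, and column names are illustrative assumptions, not taken from any real Space:

```sql
-- Hypothetical trusted asset: a parameterised SQL table function a domain
-- expert could register in Unity Catalog for a Genie Space. All names here
-- (analytics.certified.*, analytics.events, properties.channel) are invented.
CREATE OR REPLACE FUNCTION analytics.certified.signups_by_channel(days INT)
  RETURNS TABLE (channel STRING, signups BIGINT)
  COMMENT 'Distinct signups per acquisition channel over the trailing N days.'
  RETURN
    SELECT properties.channel AS channel,
           count(DISTINCT user_id) AS signups
    FROM analytics.events
    WHERE event_name = 'signup'
      AND event_time >= current_timestamp() - make_interval(0, 0, 0, days)
    GROUP BY properties.channel;
```

When a question resolves to a curated function like this with a bound parameter, Genie can flag the answer as a Trusted asset; anything that does not match falls back to LLM-authored SQL.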
What is Mitzu?
Mitzu is an agentic product analytics platform that runs on your data warehouse and answers behavioural questions through natural-language conversation, without writing SQL. The category is narrower than general agentic analytics — Mitzu is specialised for product, growth and marketing behavioural questions on event data.
Mitzu meets users in three places: the in-app Analytics Agent, the Slack Agent in any public or private channel, and a remote MCP server that exposes Mitzu's capabilities to any MCP-compatible agent (Claude, Cursor, ChatGPT, custom). Setup is handled by a Configuration Agent that scans the warehouse, recognises common event schemas (Segment, Snowplow, Firebase, GA4, custom), maps user and group identifiers, and builds the semantic layer automatically. Databricks is one of the supported warehouses — see Product Analytics with Mitzu and Databricks.
The trust differentiator: Mitzu's agent does not write SQL. It assembles structured analysis specifications — funnel steps with a conversion window, retention cohorts and return events, segmentation filters with sampled property values, journey definitions — and a deterministic query engine turns those specifications into SQL. The same specification produces the same SQL every time. Methodology errors that LLMs reliably make (a funnel without a window, a retention chart that double-counts, a cohort defined wrong) are guard-railed by the engine, not by prompt engineering or hand-curated example queries.
Databricks Genie vs Mitzu: side-by-side
| | Databricks AI/BI Genie | Mitzu |
|---|---|---|
| Category | Agentic SQL / general analytics on the lakehouse | Agentic product analytics on the warehouse |
| Who writes the SQL | LLM, grounded in Unity Catalog metadata + curated example queries / SQL functions | Deterministic query engine, from a typed analysis specification |
| Grounding | Knowledge store: table descriptions, column synonyms, JOIN relationships, example queries, parameterised SQL functions — curated by domain experts | Auto-built product-analytics semantic layer (events, properties, entities, sampled values) |
| Setup model | Domain expert curates a Genie Space — instructions, examples, synonyms, JOINs | Configuration Agent scans the warehouse and builds the semantic layer automatically |
| Methodology primitives | None native — LLM composes ad-hoc SQL per question, helped by example queries | Funnel, retention, segmentation, journey, cohort as first-class primitives |
| Where it runs | Databricks only (Unity Catalog + SQL warehouse) | Databricks, Snowflake, BigQuery, ClickHouse, Redshift, Athena, Trino/Presto, Postgres, Firebolt, Starburst, MS Fabric |
| Surfaces | Genie Spaces web UI, mobile, embedded (Teams, Slack, Glean), AI/BI Dashboards | In-app Analytics Agent, Slack Agent, remote MCP server |
| Governance | Unity Catalog — row filters, column masks, per-user SELECT enforcement | Inherits warehouse governance; SQL is reviewable for every answer |
| Trust signal | Responses are marked "Trusted" when they exactly match a curated example query or SQL function | Engine output is deterministic — same specification, same SQL, every time |
| Best for | General-purpose lakehouse analytics across any domain (sales, finance, support, ops, product…) | Product, growth, and marketing behavioural questions where methodology must be right |
SQL examples: the same question, two paths
Take a typical product analytics question: "What is our 7-day signup-to-activation conversion rate, broken down by acquisition channel, for the last 30 days?"
Databricks Genie: SQL the LLM might generate
```sql
-- Plausible Genie output against a Delta table in Unity Catalog.
-- Looks reasonable; methodology depends on the curated instructions and example queries.
WITH signups AS (
SELECT user_id,
min(event_time) AS signup_at,
first(properties.channel) AS channel
FROM analytics.events
WHERE event_name = 'signup'
AND event_time >= current_timestamp() - INTERVAL 30 DAYS
GROUP BY user_id
),
activations AS (
SELECT user_id, min(event_time) AS activated_at
FROM analytics.events
WHERE event_name = 'activated'
AND event_time >= current_timestamp() - INTERVAL 37 DAYS
GROUP BY user_id
)
SELECT s.channel,
count(*) AS signups,
count_if(a.activated_at <= s.signup_at + INTERVAL 7 DAYS) AS activated_in_7d,
round(activated_in_7d / signups * 100, 1) AS conv_pct
FROM signups s
LEFT JOIN activations a USING (user_id)
GROUP BY s.channel
ORDER BY signups DESC;
```

Reads cleanly, but the methodology is doing a lot of work in the prompt and the curated example queries. A different prompt run, a slightly different schema, or a missing example for this exact shape can yield: a window measured against the wrong anchor, an activation that pre-dates the signup counted as a conversion, channel attribution joined off the wrong row when a user has multiple signups, or a window that quietly slips because the LLM conflated the lookback with the conversion window. None of these are SQL bugs — they are methodology choices an LLM is making implicitly, every time. The Trusted asset signal helps when an answer matches a parameterised example, but the long tail of behavioural questions rarely matches one exactly.
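To make one of those failure modes concrete: the `count_if` above accepts an activation that happened before the signup. A minimal corrected final SELECT, reusing the same `signups` and `activations` CTEs, bounds the window on both sides:

```sql
-- Sketch of a corrected final SELECT over the same CTEs as above. The window
-- is bounded below (strictly after the signup) and above (within 7 days), so
-- a pre-signup activation is never counted as a conversion.
SELECT s.channel,
       count(*) AS signups,
       count_if(a.activated_at >  s.signup_at
            AND a.activated_at <= s.signup_at + INTERVAL 7 DAYS) AS activated_in_7d,
       round(count_if(a.activated_at >  s.signup_at
                  AND a.activated_at <= s.signup_at + INTERVAL 7 DAYS)
             / nullif(count(*), 0) * 100, 1) AS conv_pct
FROM signups s
LEFT JOIN activations a USING (user_id)
GROUP BY s.channel
ORDER BY signups DESC;
```

The fix is a single predicate; the point is that an LLM has to remember it on every run, while a deterministic engine applies it structurally.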
Mitzu: SQL from a deterministic engine
The Mitzu agent does not write the SQL. It assembles a funnel specification — roughly `{ first_event: "signup", subsequent_events: ["activated"], conversion_window: "7d", breakdown: "channel", date_range: "last_30_days" }` — and the deterministic engine emits the same SQL every time:

```sql
-- Engine output for a 2-step funnel with a 7-day conversion window,
-- broken down by channel, for the last 30 days. Same spec → same SQL.
WITH step_1 AS (
SELECT user_id,
min(event_time) AS step_1_at,
first(properties.channel) AS channel
FROM analytics.events
WHERE event_name = 'signup'
AND event_time >= current_timestamp() - INTERVAL 30 DAYS
AND event_time < current_timestamp()
GROUP BY user_id
),
step_2 AS (
SELECT s1.user_id,
s1.channel,
min(e.event_time) AS step_2_at
FROM step_1 s1
INNER JOIN analytics.events e
ON e.user_id = s1.user_id
AND e.event_name = 'activated'
AND e.event_time > s1.step_1_at
AND e.event_time <= s1.step_1_at + INTERVAL 7 DAYS
GROUP BY s1.user_id, s1.channel
)
SELECT s1.channel AS channel,
count(DISTINCT s1.user_id) AS step_1_users,
count(DISTINCT s2.user_id) AS step_2_users,
round(count(DISTINCT s2.user_id)
/ nullif(count(DISTINCT s1.user_id), 0) * 100, 1) AS conv_pct
FROM step_1 s1
LEFT JOIN step_2 s2 USING (user_id)
GROUP BY channel
ORDER BY step_1_users DESC;
```

The conversion window is enforced strictly (activation must be after signup and within 7 days). Distinct users prevent double-counting. Channel comes from the signup row, so attribution is consistent. The engine has been generating this shape of SQL in production for years; the agent's job is to assemble the specification, not to author the query.
The SQL is shown to the analyst as a verification artifact — not the agent's authored work.
Retention: a second example
Consider "Weekly retention of users who signed up in March, returning event = `feature_used`, eight weeks out." Genie can attempt the SQL, but the methodology — cohort time-bucketing, return-event scoping, the inclusive/exclusive treatment of week zero — depends on whether a sufficiently similar example query was curated in the Genie Space. Mitzu's agent assembles a retention specification — { cohort_event: "signup", cohort_window: "2026-03", return_event: "feature_used", granularity: "week", periods: 8 } — and the deterministic engine produces the same cohort SQL every time, with week-zero and DISTINCT user handling already correct.
Advantages and trade-offs
Databricks Genie
| Strengths | Trade-offs |
|---|---|
| Native Databricks surface — no extra vendor, no data movement, Unity Catalog governance applied automatically. | Lakehouse-only — Genie runs against Unity Catalog, so teams with data in Snowflake, BigQuery, ClickHouse or elsewhere need a different tool there. |
| General-purpose by design — the same Genie Space can handle sales, finance, ops, support and product questions equally. | The LLM authors SQL — methodology errors on funnels, retention, cohorts and journeys are easy to make and hard to spot in a chat reply. |
| Knowledge store gives domain experts a place to encode synonyms, JOIN paths, example queries, and parameterised SQL functions. | Reliability moves with the quality of the curated knowledge store — instructions, examples and SQL functions need ongoing maintenance. |
| Embeddable in Teams, Slack, Glean, and AI/BI Dashboards — meets users where they already work. | Trusted assets only fire on exact matches with a curated example or SQL function — the long tail of behavioural questions usually doesn't qualify. |
| Strong fit when Databricks already powers a broad analytics surface and you want one chat interface across all of it. | Charts and visualisations are LLM-generated rather than driven by a typed methodology layer; consistency across questions is not guaranteed. |
Mitzu
| Strengths | Trade-offs |
|---|---|
| The agent does not write SQL — a deterministic query engine does, from a typed specification. Same input, same SQL, same answer. | Narrower scope — Mitzu is built for product, growth and marketing behavioural questions, not classic BI dashboarding or financial reporting. |
| Auto-built semantic layer specialised for product analytics — events, event properties, entities, dimension properties and sampled filter values. No hand-authored YAML, no curated example queries to maintain. | Requires event data already in the warehouse. Companies without a warehouse, or with events trapped in a third-party tool that will not export, are not the fit. |
| Funnel, retention, segmentation, journey and cohort are first-class primitives. | Open-ended statistical exploration belongs in a notebook (Hex, Deepnote, Jupyter), not in Mitzu. |
| Warehouse-agnostic — runs on Databricks, Snowflake, BigQuery, ClickHouse, Redshift, Athena, Trino/Presto, Postgres, Firebolt, Starburst and MS Fabric. | Self-hosted deployment is available on the Enterprise tier; the lower tiers are SaaS. |
| Three surfaces share one semantic layer: in-app Analytics Agent, Slack Agent, and a remote MCP server for any external agent. | — |
| Per-editor seat pricing with unlimited events; warehouse compute stays under the customer's control. | — |
Capability scorecard
Where each tool stands on the capabilities that matter for product analytics work.
| Capability | Databricks Genie | Mitzu |
|---|---|---|
| Runs on the customer's warehouse | ✅ | ✅ |
| Multi-warehouse support (Snowflake, BigQuery, ClickHouse, Redshift, Trino, Postgres…) | ❌ Databricks only | ✅ |
| Unity Catalog governance (row filters, column masks) | ✅ | ✅ via warehouse permissions |
| Self-hosted deployment | ✅ inside Databricks | ✅ Enterprise tier |
| Deterministic SQL engine (agent does not write SQL) | ❌ | ✅ |
| Auto-built semantic layer specialised for product analytics | ❌ | ✅ |
| Native funnel methodology | ❌ | ✅ |
| Native retention methodology | ❌ | ✅ |
| Native segmentation, journey and cohort primitives | ❌ | ✅ |
| Sampled property values for filters | ❌ | ✅ |
| Reviewable SQL surfaced for every answer | ✅ | ✅ |
| Curated example queries / trusted assets workflow | ✅ | ❌ unnecessary |
| MCP server for external agents | ❌ | ✅ Remote MCP |
| Slack agent | ✅ via embed API | ✅ native |
| Embedded in BI dashboards | ✅ AI/BI Dashboards | ❌ |
| General-purpose across any analytics domain | ✅ | ❌ Product analytics only |
UI differences
Genie surfaces analytics inside the Databricks workspace. A Genie Space is the unit of configuration: domain experts curate datasets, instructions, example queries and SQL functions, and end users open the Space to ask questions. Answers come back as tables and LLM-generated visualisations, with the underlying SQL inspectable and Trusted-asset badges when applicable. Genie can also be embedded into AI/BI Dashboards as an ad-hoc follow-up panel and surfaced through APIs into Teams, Slack, and Glean.
Mitzu surfaces analytics across three places, all backed by the same semantic layer. The in-app Analytics Agent runs alongside dedicated funnel, retention, segmentation, journey and cohort UIs — the chat is one of several ways to assemble the same analysis specification. The Slack Agent handles questions in any channel, with thread context shared into the agent. The remote MCP server exposes the same capabilities to external agents (Claude, Cursor, ChatGPT, custom) so Mitzu can act as the trusted product-analytics backend for an agent the customer already runs.
When to choose Databricks Genie, Mitzu, or both?
These are layers, not substitutes. Genie gives Databricks teams a general agentic interface to Unity Catalog. Mitzu gives those same teams a product-analytics-specialised agent on top of the same warehouse. The right choice depends on what shape of question dominates your team's analytics workload.
- Choose Databricks Genie when your analytics surface is broad and cross-domain on the lakehouse, you want Unity Catalog governance applied to chat-based answers, and you have the analyst cycles to curate Genie Spaces with instructions, example queries, and SQL functions.
- Choose Mitzu when product, growth or marketing teams need to ask diagnostic behavioural questions — why did week-2 retention drop, did the new pricing page move trial-to-paid, which onboarding step has the highest drop-off — and you want methodology guard-rails the LLM cannot break.
- Run both when Databricks is the system of record for a wide analytics surface and product analytics is one of several question types. Let Genie handle the long tail of cross-domain lakehouse questions and let Mitzu specialise in the behavioural layer.
FAQ
Does Mitzu work with Databricks?
Yes. Databricks is a first-class supported warehouse. Mitzu reads Delta tables and dbt-modelled tables in place — no data movement, no per-event pricing. See Product Analytics with Mitzu and Databricks for a walk-through, and Top 5 Product Analytics tools for Databricks for the broader landscape.
Does Databricks Genie replace Mixpanel, Amplitude, or other product analytics tools?
Not by itself. Genie is general-purpose agentic analytics on Unity Catalog data. For product analytics methodology specifically — funnels with conversion windows, retention cohorts, journey trees, segmentation with sampled filter values — you either add a layer like Mitzu, or build that methodology yourself in curated example queries and SQL functions and rely on the LLM to compose it correctly each time.
Can I use Genie for funnels and retention?
An LLM can absolutely write a funnel or retention query against a Delta table. Whether the methodology is right depends on the prompt, the curated example queries, and the day. The risk is not that the SQL fails to run — it usually runs — but that it answers the wrong question (window measured wrong, double-counted users, attribution joined off the wrong row). A deterministic engine that owns the methodology removes that class of error.
How does Genie handle hallucinations?
Genie mitigates them with curated grounding (table descriptions, column synonyms, JOIN relationships) and by flagging responses as Trusted when they exactly match a parameterised example query or SQL function. That works well for repeat questions covered by curated examples. The remaining surface — questions whose shape isn't covered by an example — still depends on the LLM authoring the SQL correctly.
Where does the data live in either tool?
Inside Databricks for Genie; inside the customer's warehouse (Databricks, Snowflake, BigQuery, ClickHouse, and others) for Mitzu. Both architectures are warehouse-native and neither moves data into a vendor silo. Compliance, data residency, and cost control all stay on the customer's side of the line.
Related reading
- Product Analytics with Mitzu and Databricks
- Top 5 Product Analytics tools for Databricks
- ClickHouse AI vs Mitzu
- ChatGPT vs AI Analytics Agent
- Warehouse Native vs First-Generation Product Analytics
- Agentic Analytics Platforms Compared
- Mitzu Semantic Layer
- Mitzu Product Analytics