
Steep vs Mitzu: Agentic Metrics on the Semantic Layer vs Agentic Product Analytics on the Warehouse

Two semantic-layer-grounded agents — one BI-shaped, one product-analytics-shaped — and when to reach for which.

Steep is an AI analytics platform that ships governed metrics from dbt MetricFlow or Cube. Mitzu is agentic product analytics with an auto-built semantic layer and a deterministic query engine. Compare architecture, methodology, SQL examples, and when to choose which.

István Mészáros

Co-founder & CEO

May 14, 2026
10 min read

TL;DR

Steep is a metrics-first AI analytics platform: governed metrics defined in dbt MetricFlow or Cube, surfaced through an LLM agent that runs on the semantic model. Mitzu is an agentic product analytics platform. The Analytics Agent assembles funnel, retention, segmentation, journey, and cohort specifications; a deterministic query engine turns them into SQL. Both replace dashboard sprawl with an agent grounded in a semantic layer. The shape of that semantic layer is the divide — BI metrics-and-dimensions in Steep, product-analytics primitives in Mitzu.

Use this comparison to evaluate two semantic-layer-grounded agents through an agentic analytics lens: which platform fits a BI-style metrics workflow, and which fits product, growth, and marketing behavioural analysis with trusted methodology and trusted SQL.

Steep and Mitzu both replace dashboard sprawl with an LLM agent grounded in a semantic layer. They are easy to confuse from the outside — both promise faster answers, trusted definitions, and a chat interface over warehouse data. The honest framing is that they sit at different layers and ground their agents in different shapes of semantic model. Steep is metrics-first agentic analytics, built on a BI semantic layer (dbt MetricFlow or Cube). Mitzu is agentic product analytics, built on an auto-configured semantic layer specialised for funnels, retention, journeys and cohorts, with a deterministic query engine underneath. They are complementary, and the rest of this post is the case for when each is the right tool.

What is Steep?

Steep positions itself as "AI analytics for faster insights and zero chaos" — a modern alternative to traditional dashboard-based BI. The product's core idea is that organisations should ship a governed metrics catalogue, not a sprawl of duplicate dashboards. Three pillars: built on metrics, powered by AI, designed for engagement. Customers cited on the homepage include Framer, Voi, Juni, Bounce and Philadelphia Inquirer.

The semantic layer is the foundation. Teams either define metrics in dbt MetricFlow via dbt Cloud, or in Cube, and Steep publishes them to end users. Metrics, dimensions and entities are modelled once in code and reused across reports and the AI agent. Steep also exposes its own native semantic layer for teams that don't want to maintain MetricFlow or Cube directly.

Steep AI is the agent surface. The product blog describes it as helping teams "move from question to answer faster while staying grounded in the semantic model defined by your data team." The agent can "connect metrics and dimensions, compare segments and time periods, and return clear answers grounded in the business logic" the data team established. The Steep API exposes metric metadata and a query endpoint so external LLMs and agents can use the same governed definitions.

  • Metrics catalogue — central place for metric, dimension, and entity definitions, governed by the data team.
  • dbt MetricFlow + Cube integrations — bring an existing semantic layer; or define metrics directly in Steep's native layer.
  • Steep AI — natural-language agent grounded in the semantic model, with metric links, reasoning steps, and clarifying questions surfaced in the chat.
  • Reports and exploration — interactive reporting over governed metrics, ad-hoc breakdowns, time-period comparisons.
  • Metrics API — metadata and query endpoints so any LLM, agent, or downstream app can consume the same metric definitions.
  • Permissions, smart caching, SSO/SCIM — table stakes governance for a BI tool, on the Business and Enterprise tiers.

Pricing (at time of writing): Pro $15 per seat per month (free up to 3 seats), Business $25 per seat per month, Enterprise custom. All plans include the Steep AI agent. Steep is positioned as a BI-replacement: dashboards out, metrics catalogue + agent in.
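Steep's docs describe metric metadata and query endpoints, but the exact request shape isn't reproduced here — so the payload below is a hypothetical illustration (invented field names, no real endpoint) of the contract such an API implies: an external agent queries governed metric names, never raw tables.

```python
import json

# Hypothetical request body for a governed-metrics query endpoint.
# Field names are invented for illustration; consult the vendor's
# API reference for the real schema.
query = {
    "metric": "weekly_active_users",
    "dimensions": ["channel"],
    "time_range": {"start": "2026-04-14", "end": "2026-05-14"},
    "comparison": "previous_period",
}
body = json.dumps(query)  # what an external LLM agent would POST
```

The point is the contract, not the fields: the agent only ever references metric and dimension names the data team governs.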

What is Mitzu?

Mitzu is an agentic product analytics platform that runs on your data warehouse and answers behavioural questions through natural-language conversation, without writing SQL. The category is narrower than general agentic analytics — Mitzu is specialised for product, growth and marketing behavioural questions on event data.

Mitzu meets users in three places: the in-app Analytics Agent, the Slack Agent in any public or private channel, and a remote MCP server that exposes Mitzu's capabilities to any MCP-compatible agent (Claude, Cursor, ChatGPT, custom). Setup is handled by a Configuration Agent that scans the warehouse, recognises common event schemas (Segment, Snowplow, Firebase, GA4, custom), maps user and group identifiers, and builds the semantic layer automatically. No hand-authored YAML, no weeks of MetricFlow modelling before the first answer.
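As a rough sketch of the schema-recognition step — the column fingerprints and matching rule below are illustrative assumptions, not Mitzu's actual implementation — a configuration agent can classify an events table by which well-known column set it contains:

```python
# Illustrative only: classify a warehouse events table by its column
# fingerprint, the way a configuration agent might recognise schemas.
KNOWN_SCHEMAS = {
    "segment":  {"anonymous_id", "user_id", "event", "timestamp"},
    "snowplow": {"domain_userid", "event_name", "collector_tstamp"},
    "ga4":      {"user_pseudo_id", "event_name", "event_timestamp"},
}

def classify_table(columns: set) -> str:
    """Return the known schema whose full fingerprint the table contains,
    or 'custom' if none matches completely."""
    best, best_overlap = "custom", 0
    for name, fingerprint in KNOWN_SCHEMAS.items():
        overlap = len(columns & fingerprint)
        if overlap == len(fingerprint) and overlap > best_overlap:
            best, best_overlap = name, overlap
    return best
```

A table that matches no fingerprint falls back to "custom", where user and group identifiers would be mapped interactively rather than inferred.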

The trust differentiator: Mitzu's agent does not write SQL. It assembles structured analysis specifications — funnel steps with a conversion window, retention cohorts and return events, segmentation filters with sampled property values, journey definitions — and a deterministic query engine turns those specifications into SQL. The same specification produces the same SQL every time. Methodology errors that LLMs reliably make (a funnel without a window, a retention chart that double-counts, a cohort defined wrong) are guard-railed by the engine, not by prompt engineering.
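A minimal sketch of the specification-plus-guard-rail idea — field names and validation rules here are illustrative, not Mitzu's real schema. The agent fills in a typed spec, and invalid methodology is rejected before any SQL exists:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FunnelSpec:
    """Illustrative analysis specification the agent assembles."""
    steps: tuple                  # ordered event names, e.g. ("signup", "activated")
    conversion_window: str        # mandatory: a funnel without a window is invalid
    breakdown: str = ""           # optional property to break down by
    date_range: str = "last_30_days"

    def __post_init__(self):
        # Guard-rails live in the engine, not in the prompt.
        if len(self.steps) < 2:
            raise ValueError("a funnel needs at least two steps")
        if not self.conversion_window:
            raise ValueError("a conversion window is mandatory")

spec = FunnelSpec(steps=("signup", "activated"),
                  conversion_window="7d", breakdown="channel")
```

Because the engine only consumes validated specs, the classic LLM mistakes — a window-less funnel, a one-step funnel — fail loudly instead of producing plausible-looking numbers.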

Steep vs Mitzu: side-by-side

  • Category — Steep: agentic BI / metrics-first analytics; Mitzu: agentic product analytics on the warehouse.
  • Semantic layer shape — Steep: BI semantic layer of metrics, dimensions and entities (via dbt MetricFlow, Cube, or Steep native); Mitzu: product analytics semantic layer of events, event properties, entities, dimension properties and sampled property values.
  • Semantic layer authoring — Steep: hand-authored (MetricFlow YAML, Cube schemas, or Steep's UI); Mitzu: auto-built by the Configuration Agent scanning the warehouse.
  • Who composes the query — Steep: Steep AI assembles metric / dimension queries against the semantic layer; Mitzu: the deterministic query engine, from a typed analysis specification the agent assembles.
  • Methodology primitives — Steep: metrics + dimensions + time + comparisons; Mitzu: funnel, retention, segmentation, journey and cohort as first-class primitives.
  • Where the data lives — Steep: customer warehouse (queries executed there); Mitzu: customer warehouse (ClickHouse, Snowflake, BigQuery, Databricks, Redshift, Athena, Trino/Presto, Postgres, Firebolt, Starburst, MS Fabric).
  • Surfaces — Steep: web app plus Metrics API for external agents; Mitzu: in-app Analytics Agent, Slack Agent, remote MCP server.
  • Pricing model — Steep: per seat (Pro $15, Business $25, Enterprise custom); Mitzu: per editor seat with unlimited events, with warehouse compute staying under the customer's control.
  • Best for — Steep: cross-domain metrics (revenue, finance, marketing performance, operational KPIs); Mitzu: product, growth and marketing behavioural questions where methodology must be right.

SQL examples: the same question, two paths

Take a typical product analytics question: "What is our 7-day signup-to-activation conversion rate, broken down by acquisition channel, for the last 30 days?"

Steep: SQL composed through the semantic layer

Steep's agent grounds itself in metric and dimension definitions. To answer this question, the metrics catalogue would need at least two metrics — signups and activations_within_7d_of_signup — and a channel dimension on the user (or signup) entity. Once those exist, the agent composes a query roughly like:

-- The shape of query a metrics-first agent composes against
-- a BI semantic layer once the right metrics and dimensions exist.
WITH signups AS (
  SELECT user_id,
         min(event_time)            AS signup_at,
         any(properties['channel']) AS channel
  FROM events
  WHERE event_name = 'signup'
    AND event_time >= now() - INTERVAL 30 DAY
  GROUP BY user_id
),
activations_7d AS (
  SELECT s.user_id, s.channel
  FROM signups s
  WHERE EXISTS (
    SELECT 1 FROM events a
    WHERE a.user_id    = s.user_id
      AND a.event_name = 'activated'
      AND a.event_time BETWEEN s.signup_at AND s.signup_at + INTERVAL 7 DAY
  )
)
SELECT s.channel,
       count(*)                                                  AS signups,
       count(a.user_id)                                          AS activated_7d,  -- non-NULL only for activated users
       round(count(a.user_id) * 100.0 / nullIf(count(*), 0), 1)  AS conv_pct
FROM signups s
LEFT JOIN activations_7d a USING (user_id)
GROUP BY s.channel
ORDER BY signups DESC;

Three things to notice. First: the question depends on metric definitions that someone has to author and maintain in MetricFlow, Cube, or Steep's native layer — including a custom metric for activation-within-window. Second: dbt MetricFlow and Cube are BI-shaped semantic layers; they express metrics, dimensions, and joins, not funnel, retention or cohort primitives. The data team carries the methodology, encoded as derived metrics.

Third: when a new question arrives — change the window from 7 days to 14, or split by week-of-signup cohort — somebody has to author another metric. Methodology lives upstream in code.

Mitzu: SQL from a deterministic engine

The Mitzu agent does not write the SQL. It assembles a funnel specification — roughly: { first_event: "signup", subsequent_events: ["activated"], conversion_window: "7d", breakdown: "channel", date_range: "last_30_days" } — and the deterministic engine emits the same SQL every time:

-- Engine output for a 2-step funnel with a 7-day conversion window,
-- broken down by channel, for the last 30 days. Same spec → same SQL.
WITH step_1 AS (
  SELECT user_id,
         min(event_time)             AS step_1_at,
         any(properties['channel'])  AS channel
  FROM events
  WHERE event_name = 'signup'
    AND event_time >= now() - INTERVAL 30 DAY
    AND event_time <  now()
  GROUP BY user_id
),
step_2 AS (
  SELECT s1.user_id,
         s1.channel,
         min(e.event_time) AS step_2_at
  FROM step_1 s1
  INNER JOIN events e
    ON e.user_id = s1.user_id
   AND e.event_name = 'activated'
   AND e.event_time >  s1.step_1_at
   AND e.event_time <= s1.step_1_at + INTERVAL 7 DAY
  GROUP BY s1.user_id, s1.channel
)
SELECT s1.channel                                AS channel,
       count(DISTINCT s1.user_id)                AS step_1_users,
       count(DISTINCT s2.user_id)                AS step_2_users,
       round(count(DISTINCT s2.user_id)
             / nullIf(count(DISTINCT s1.user_id), 0) * 100, 1) AS conv_pct
FROM step_1 s1
LEFT JOIN step_2 s2 USING (user_id)
GROUP BY channel
ORDER BY step_1_users DESC;

The conversion window is enforced strictly (activation must happen after signup and within 7 days). Distinct users prevent double-counting. Channel comes from the signup row, so attribution is consistent. Change the window to 14 days, or break down by week-of-signup cohort, and the agent re-assembles the specification — no new metric to author, no extra MetricFlow YAML to merge.

The same deterministic engine has been generating this shape of SQL in Mitzu's UI for years.
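The determinism claim is easy to state in code. This toy generator — a deliberately trivial stand-in, not Mitzu's engine — shows the property that matters: the spec is the only input, so the same spec always yields byte-identical SQL, and changing the window changes exactly one clause.

```python
def funnel_sql(first_event: str, next_event: str, window_days: int) -> str:
    """Toy deterministic generator: same inputs, same SQL, every time."""
    return (
        f"-- funnel: {first_event} -> {next_event}, {window_days}-day window\n"
        f"-- ... CTEs elided ...\n"
        f"AND e.event_time <= s1.step_1_at + INTERVAL {window_days} DAY"
    )

# Same spec, same SQL; a 14-day spec differs only in the window clause.
assert funnel_sql("signup", "activated", 7) == funnel_sql("signup", "activated", 7)
assert "INTERVAL 14 DAY" in funnel_sql("signup", "activated", 14)
```

An LLM free-writing SQL offers no such guarantee: the same prompt can produce a different join strategy, a different window, or no window at all.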

Retention: a question Steep can't answer natively

"Week-by-week return rate for users who signed up in each of the last eight weeks." A classic cohort retention question. dbt MetricFlow has no retention primitive — every cohort week and every return-week pair would need an explicit metric. Cube's data model is similar in shape. In Steep, this question pushes the problem upstream into either dozens of derived metrics, or a notebook. In Mitzu, the spec is one line:

-- Engine output for cohort retention: weekly cohorts, weekly return,
-- 8 cohorts back, ClickHouse-flavoured SQL. Same spec → same SQL.
WITH cohorts AS (
  SELECT user_id,
         toMonday(min(event_time)) AS cohort_week
  FROM events
  WHERE event_name = 'signup'
    AND event_time >= toMonday(now()) - INTERVAL 56 DAY
  GROUP BY user_id
),
returns AS (
  SELECT c.cohort_week,
         c.user_id,
         dateDiff('week', c.cohort_week, toMonday(e.event_time)) AS week_n
  FROM cohorts c
  INNER JOIN events e
    ON e.user_id = c.user_id
   AND e.event_time >= c.cohort_week
   AND e.event_name IN ('session_start','core_action')
  GROUP BY c.cohort_week, c.user_id, week_n
)
SELECT cohort_week,
       -- ClickHouse's -If combinator, in place of ANSI count(DISTINCT ...) FILTER
       uniqExactIf(user_id, week_n = 0) AS week_0,
       uniqExactIf(user_id, week_n = 1) AS week_1,
       uniqExactIf(user_id, week_n = 2) AS week_2,
       uniqExactIf(user_id, week_n = 3) AS week_3,
       uniqExactIf(user_id, week_n = 4) AS week_4
FROM returns
GROUP BY cohort_week
ORDER BY cohort_week;
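The post doesn't print the one-line spec itself, so here is a hedged guess at its shape — the field names are invented for illustration and the real specification model may differ:

```python
# Illustrative one-line retention specification (invented field names):
retention_spec = {"cohort_event": "signup",
                  "return_events": ["session_start", "core_action"],
                  "granularity": "week",
                  "cohorts_back": 8}
```

Everything the SQL above encodes — weekly cohorts, the return-event set, eight cohorts back — is carried by the spec, which is what makes the output reproducible.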

Same point for journey trees (top-three paths from event A within N steps), heatmaps (hour-of-day × day-of-week engagement), churn analysis (users present in week N, absent in N+1), impact analysis (exposed vs control around a release date) and deep-dive investigations ("why did week-2 retention drop in November?" — broken into many tool calls and synthesised). All of these are first-class in Mitzu's specification model. In Steep's BI semantic layer, each one is a derived metric to author and maintain, or a notebook to write outside the tool.

Advantages and trade-offs

Steep

  • Strength: the metrics catalogue is the centre of gravity — finance, revenue, marketing performance and operational KPIs all live in one governed place. Trade-off: methodology is upstream in MetricFlow / Cube / Steep's native layer — new question types often mean new metric definitions, authored by the data team.
  • Strength: integrates cleanly with an existing dbt + dbt Semantic Layer stack via dbt Cloud, and with Cube for teams already running Cube. Trade-off: BI semantic layer shape — metrics, dimensions, joins. Funnels, retention, journeys and cohorts are not native concepts.
  • Strength: Steep AI grounds answers in trusted metric definitions, so descriptive questions stay consistent across the team. Trade-off: diagnostic and behavioural questions (why did retention drop, which onboarding step leaks, did the campaign move conversion) require derived metrics or land outside the tool.
  • Strength: the Metrics API exposes the same definitions to external LLMs, agents, and downstream apps. Trade-off: setup expects a data team able to model and maintain MetricFlow or Cube — a strong fit for orgs already doing it, friction for orgs starting from raw events.
  • Strength: per-seat pricing is straightforward and approachable; the Pro tier is free up to 3 seats. Trade-off: the range of native warehouse connectors and product-analytics depth are narrower than tools that specialise in event data.

Mitzu

  • Strength: the agent does not write SQL — a deterministic query engine does, from a typed specification. Same input, same SQL, same answer. Trade-off: narrower scope — Mitzu is built for product, growth and marketing behavioural questions, not classic BI dashboarding or financial reporting.
  • Strength: an auto-built semantic layer specialised for product analytics — events, event properties, entities, dimension properties and sampled filter values. No hand-authored YAML. Trade-off: requires event data already in the warehouse. Companies without a warehouse, or with events trapped in a third-party tool that will not export, are not the fit.
  • Strength: funnel, retention, segmentation, journey and cohort are first-class primitives — including impact analysis, churn analysis, heatmaps and multi-step deep-dive investigations. Trade-off: open-ended statistical exploration belongs in a notebook (Hex, Deepnote, Jupyter), not in Mitzu.
  • Strength: warehouse-agnostic — runs on ClickHouse, Snowflake, BigQuery, Databricks, Redshift, Athena, Trino/Presto, Postgres, Firebolt, Starburst and MS Fabric. Trade-off: self-hosted deployment is available on the Enterprise tier; the lower tiers are SaaS.
  • Strength: three surfaces share one semantic layer — the in-app Analytics Agent, the Slack Agent, and a remote MCP server for any external agent.
  • Strength: per-editor seat pricing with unlimited events; warehouse compute stays under the customer's control.

Capability scorecard

Where each tool stands on the capabilities that matter for product analytics work, alongside the BI capabilities that matter for a metrics catalogue.

  • Runs on the customer's warehouse — Steep: ✅; Mitzu: ✅
  • Multi-warehouse support (Snowflake, BigQuery, Databricks, Redshift, ClickHouse, Trino, Postgres…) — Steep: ✅ major databases; Mitzu: ✅
  • Self-hosted deployment — Steep: ❌; Mitzu: ✅ Enterprise tier
  • Deterministic query engine (agent does not write SQL) — Steep: ❌; Mitzu: ✅
  • Auto-built semantic layer (no manual YAML) — Steep: ❌ MetricFlow / Cube / hand-authored; Mitzu: ✅
  • Semantic layer specialised for product analytics — Steep: ❌ BI-shaped (metrics + dimensions); Mitzu: ✅
  • Native funnel methodology with conversion window — Steep: ❌; Mitzu: ✅
  • Native retention methodology with cohorts — Steep: ❌; Mitzu: ✅
  • Native journey / path analysis — Steep: ❌; Mitzu: ✅
  • Native segmentation with sampled property values — Steep: ❌; Mitzu: ✅
  • Native heatmaps + churn + impact analysis primitives — Steep: ❌; Mitzu: ✅
  • Cohort definitions stored and reusable in the semantic layer — Steep: ❌; Mitzu: ✅
  • Reviewable SQL surfaced for every answer — Steep: ✅; Mitzu: ✅
  • Slack agent — Steep: ❌; Mitzu: ✅
  • Remote MCP server for external agents — Steep: ❌; Mitzu: ✅
  • Metrics API for external consumers — Steep: ✅; Mitzu: ❌ (remote MCP server instead)
  • Governed metric catalogue across finance, ops, marketing — Steep: ✅; Mitzu: ❌ product-analytics-only
  • dbt MetricFlow integration — Steep: ✅; Mitzu: ❌
  • Cube integration — Steep: ✅; Mitzu: ❌

UI differences in practice

Steep's interface is metric-catalogue-first: a sidebar of governed metrics, a report canvas for breakdowns and comparisons, and a chat surface for Steep AI. The chat shows the reasoning steps, the metrics and dimensions it referenced, and a clarifying question when the prompt is ambiguous. Mitzu's interface is agent-first: the Analytics Agent lives at the centre of the workspace, with the semantic layer, saved insights, dashboards, and cohorts as browseable context. Each agent reply renders a chart or table built from an analysis specification — and exposes that specification, the cohort it ran against, and the SQL the engine generated.

The two reflect the underlying division: Steep is a metrics catalogue with an agent on top; Mitzu is a product-analytics methodology layer with an agent as the primary surface.

When to choose Steep, Mitzu, or both?

These are different layers, not direct substitutes. The shape of question that dominates your team's analytics workload should drive the choice.

  • Choose Steep when the bulk of analytics questions are descriptive metrics across the business — revenue, MRR / ARR, finance, marketing performance, operational KPIs — your team already maintains (or is willing to maintain) a dbt MetricFlow or Cube semantic layer, and a single governed metrics catalogue with an agent on top is the goal.
  • Choose Mitzu when product, growth, or marketing teams need to ask diagnostic behavioural questions (why did week-2 retention drop, which onboarding step has the highest drop-off, did the new pricing page move trial-to-paid, what does the top journey look like after a sign-up) and you want methodology guard-rails the LLM cannot break.
  • Run both when Steep is the metrics catalogue for the broader business and Mitzu is the behavioural layer for product analytics — same warehouse, two complementary agents grounded in two different semantic layers.

FAQ

Does Steep do product analytics out of the box?

Steep can answer descriptive metric questions on event data once those metrics are defined in MetricFlow, Cube, or its native semantic layer. Funnel-with-conversion-window, cohort retention, journey trees, segmentation with sampled property values, and impact analysis are not native primitives of a BI semantic layer. They have to be expressed as derived metrics, or land outside the tool. Mitzu is built specifically for that question shape.

Can Mitzu replace Steep for finance and ops dashboards?

Mitzu is purpose-built for product, growth and marketing behavioural questions on event data. It is not a general BI tool for finance, FP&A or operational KPIs across the business. A team running Steep for the metrics catalogue across the org would still want Steep for that surface — and add Mitzu for the behavioural layer on top of the same warehouse.

Both tools talk about a semantic layer. Aren't they the same?

The word is the same; the shape isn't. A BI semantic layer (dbt MetricFlow, Cube, LookML, what Steep relies on) expresses metrics, dimensions, entities and joins. A product-analytics semantic layer (Mitzu's) expresses events, event properties, entities, dimension properties, and sampled property values, plus cohort and insight context. The first is built for slicing metrics; the second is built for funnel, retention, journey, segmentation and cohort questions on behavioural data.

What warehouses does each tool support?

Steep advertises connectors to all major databases on its pricing page, with extra database connectors gated to the Enterprise tier. Mitzu supports ClickHouse, Snowflake, BigQuery, Databricks, Redshift, Athena, Trino/Presto, Postgres, Firebolt, Starburst and MS Fabric natively. Both keep queries running in the customer's warehouse — neither requires data egress.

Does Mitzu have a metrics API like Steep?

Mitzu exposes its capabilities to external agents through a remote MCP server (Model Context Protocol). Any MCP-compatible agent — Claude, Cursor, ChatGPT, custom — can use Mitzu as its product-analytics backend. It is not a generic metrics API in the Steep shape, but it covers the same use case: bring the agent into another surface without losing the governed methodology.
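MCP is JSON-RPC 2.0 under the hood, so an external agent's call into a tool server looks roughly like the request below. The tool name and arguments are invented for illustration — a real client would first discover the server's actual tool catalogue via a `tools/list` request:

```python
import json

# A JSON-RPC 2.0 "tools/call" request, the shape MCP clients use to
# invoke a server-side tool. Tool name and arguments are illustrative.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "run_funnel",
        "arguments": {"steps": ["signup", "activated"], "window": "7d"},
    },
}
wire = json.dumps(request)  # sent over the MCP transport (HTTP or stdio)
```

Whatever the tool surface, the methodology guard-rails stay server-side: the external agent supplies a specification, not SQL.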


Key Takeaways

  • Steep grounds its agent in a BI-shaped semantic layer (metrics, dimensions, entities) defined upstream in dbt MetricFlow or Cube.
  • Mitzu's semantic layer is auto-built by a Configuration Agent that scans the warehouse — no hand-authored YAML, no weeks of metric modelling.
  • Funnels, retention, journeys and cohorts are first-class primitives in Mitzu, executed by a deterministic query engine — not SQL or DSL the LLM has to compose correctly each time.
  • Both tools run on the customer's warehouse and surface reviewable queries. Neither requires data egress.

About the Author

István Mészáros

Co-founder & CEO

LinkedIn: https://www.linkedin.com/in/imeszaros/

Co-founder and CEO of Mitzu. Passionate about product analytics and helping companies make data-driven decisions.


How to get started with Mitzu

Start analyzing your product data in three simple steps

Connect your data warehouse

Securely connect Mitzu to your existing data warehouse in minutes.

Define your events

Map your product events and user properties with our intuitive interface.

Start analyzing

Create funnels, retention charts, and user journeys without writing SQL.