
RudderStack & Mitzu: Revolutionizing Warehouse Data Insight

You can set up RudderStack with Mitzu.io and BigQuery in under 10 minutes and gain deep insights into event tracking and revenue linkage.

István Mészáros

July 24, 2024

In this short post, I will show you how to set up RudderStack with BigQuery for Mitzu warehouse-native product analytics.

Introduction to Mitzu.io and RudderStack

RudderStack CDP: Empowering Data-Driven Organizations

If you don't know RudderStack, it is a customer data platform (CDP) that gathers and enriches usage data collected from your website or application and loads it into your data warehouse.

RudderStack is one of the few CDPs that is warehouse-native. Being warehouse-native has many benefits for organizations of any size:

Full ownership and transparency over your data
Complete "customer 360" view accessible with SQL
Secure as all data is accessible only by you and your colleagues
Cost-effective

Mitzu: Unlocking Powerful Product Analytics Without SQL

Mitzu is a warehouse-native product analytics application directly connecting to your data warehouse. It autogenerates native SQL queries executed on top of your data warehouse. The generated SQL queries can answer various business questions in product and marketing analytics. The top business question categories are:

User segmentation
Conversion rates (marketing or feature adoption)
Retention rates
User journeys

Benefits of using Mitzu's warehouse-native product analytics (PA) compared to traditional PA tools like Mixpanel or Amplitude:

Accessing all data that is present in the data warehouse, not just product usage logs
Enhanced data security
The most cost-effective approach for product analytics
No vendor lock-in
Fits well into a company data strategy once data is centralized in a data warehouse

Synergy in Action: Leveraging RudderStack and Mitzu Together

The main point of RudderStack is to collect all usage analytics events in a single location, such as a data warehouse or data lake. You have all the data required to make good product and marketing decisions if you already maintain your data warehouse.

The only missing piece of the puzzle is enabling all product managers and marketers to access that data. Since product managers and marketing specialists are typically not fluent in SQL or Python, they need specialized tools to get insights from it.

Enter Mitzu, a tool that enables non-technical employees to access the data warehouse and learn key insights from the product data. I have also collected 5 other tools that work with RudderStack.

Step-by-Step Guide: Setting Up RudderStack with BigQuery

In this tutorial, I will use BigQuery as a data warehouse. BigQuery is a great data warehouse with a generous free tier. It is also extremely simple to set up and maintain.

Configuring BigQuery data warehouse

The prerequisite for this is to have a GCP account.

Here is a great guide on how to set up BigQuery and a storage bucket in GCP for RudderStack.

https://cloud.google.com/bigquery/docs/bigquery-web-ui

This should take less than 10 minutes to set up. If you didn't have a data warehouse before, you have one now 😊.

Here is an example of how your BigQuery will look after connecting it to RudderStack.

Data collection with RudderStack

RudderStack has many data sources and SDKs to collect data from your websites or mobile applications. This post will not cover how to set up tracking with RudderStack. I will only cover how to move data to BigQuery.

Let's imagine we already have two data sources:

Marketing page events (Website source)
Application event logs from the server side (Application server events)

We also have two destinations:

Our newly created BigQuery data warehouse
Postgres DB - this is just for demonstration of how you can multicast events in RudderStack

This setup will collect all events from our marketing website and the server side of our application. It will then ingest this data to both BigQuery and, after some filtering, Postgres.
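To make the multicast idea concrete, here is a minimal sketch in plain Python of fanning events out to two destinations, one of which applies a filter. This is not RudderStack's API. The event names, the `route_events` helper, and the rule that drops `heartbeat` events for Postgres are all assumptions made up for illustration.

```python
from typing import Callable, Dict, List

Event = Dict[str, str]


def route_events(
    events: List[Event],
    destinations: Dict[str, Callable[[Event], bool]],
) -> Dict[str, List[Event]]:
    """Send each event to every destination whose filter accepts it."""
    delivered: Dict[str, List[Event]] = {name: [] for name in destinations}
    for event in events:
        for name, accepts in destinations.items():
            if accepts(event):
                delivered[name].append(event)
    return delivered


# Hypothetical events from the two sources described above.
events = [
    {"source": "website", "event": "page_viewed"},
    {"source": "server", "event": "subscription_started"},
    {"source": "server", "event": "heartbeat"},
]

# BigQuery receives everything; Postgres receives a filtered subset.
destinations = {
    "bigquery": lambda e: True,
    "postgres": lambda e: e["event"] != "heartbeat",  # assumed filter rule
}

delivered = route_events(events, destinations)
print({name: len(batch) for name, batch in delivered.items()})
```

In RudderStack itself you configure the connections and filtering in the product rather than in your own code. The sketch only shows the shape of the data flow.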

If your configuration is correct, you should be able to see a similar chart on the Events Trend tab in RudderStack.

Configuring the BigQuery destination is easy. You can find all the information here. Once the connection is set up, your BigQuery dataset should, after a while, look something like this:

Each event type tracked on the landing page or in the application is stored in its own table. Each table is partitioned by the events' timestamp, enabling efficient data processing.

Seamless Integration: Connecting Mitzu to Your RudderStack Data

All events in our BigQuery data warehouse can be processed with SQL or Python. You can connect Mitzu to the BigQuery dataset that contains our event tables with a few clicks.

First, set up the connection to BigQuery from Mitzu. You can reuse the same service account created for RudderStack integration for simplicity.

Next, add the entire dataset as a schema to Mitzu. Navigate to the Event tables tab and hit Add schema. This will ensure that new events are synced to Mitzu with a single click.

Note that RudderStack creates these important columns in all tables:

user_id - contains the user's identifier
timestamp - contains the event's timestamp
_PARTITIONTIME - specific to BigQuery; use this column as the date partition field. BigQuery can use it to optimize product analytics queries.
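As an illustration of how these three columns work together, here is a minimal Python sketch that builds a BigQuery SQL string counting daily distinct users, with a filter on _PARTITIONTIME so that BigQuery only scans the relevant partitions. The table name `my_project.rudderstack.tracks` and the helper `build_daily_users_query` are hypothetical, and this is not the SQL that Mitzu generates.

```python
from datetime import date


def build_daily_users_query(table: str, start: date, end: date) -> str:
    """Build a BigQuery SQL string counting distinct users per day.

    The _PARTITIONTIME filter limits the scan to partitions in [start, end).
    Values are interpolated directly, so only pass trusted input.
    """
    return (
        "SELECT DATE(timestamp) AS day, COUNT(DISTINCT user_id) AS users\n"
        f"FROM `{table}`\n"
        f"WHERE _PARTITIONTIME >= TIMESTAMP('{start.isoformat()}')\n"
        f"  AND _PARTITIONTIME < TIMESTAMP('{end.isoformat()}')\n"
        "GROUP BY day\n"
        "ORDER BY day"
    )


# Hypothetical fully qualified table name: project.dataset.table
query = build_daily_users_query(
    "my_project.rudderstack.tracks", date(2024, 7, 1), date(2024, 8, 1)
)
print(query)
```

You could paste the printed query into the BigQuery console after swapping in your own project and dataset names.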

Hint: Make sure only the tables, and not the views, are synced. You can do this with the following regular expression: .*(?<!view)$
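If you want to check what this regular expression keeps before using it, here is a small Python sketch. The table names are examples only, and it assumes that Mitzu's regex matching behaves like Python's `re` module for this pattern, which I have not verified.

```python
import re

# The pattern from the hint: any name that does NOT end with "view".
TABLES_ONLY = re.compile(r".*(?<!view)$")

# Example names; views are assumed to carry a "_view" suffix.
names = ["pages", "pages_view", "tracks", "tracks_view", "signup_completed"]

tables = [name for name in names if TABLES_ONLY.fullmatch(name)]
print(tables)  # ['pages', 'tracks', 'signup_completed']
```

Note that the negative lookbehind rejects any name ending in "view", so a real table called, for example, overview would be excluded too.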

Once you are done, select the newly added schema and hit Fetch Selected. You should see something like this, with all tables from the BigQuery dataset listed.

Next, click Save & update event catalog. This will ensure that all tables are represented as product events inside Mitzu.

For your landing page events, RudderStack creates a table called pages, which is not a typical event table. This table may only have anonymous_id as the user ID, so you need to reconfigure it by selecting the table and clicking the Configure table button.

Unleashing Insights: Product Analytics with Mitzu and RudderStack

Once your tables are listed and you have saved and updated your event catalog, you can head to the "insights" page.

I will show you some examples of what you can do with Mitzu and the data in BigQuery.

Start with segmentation:

Create your first funnel insight:
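Conceptually, a funnel insight counts how many users who performed a first step went on to perform a second step afterwards. The following is a minimal sketch of that calculation in plain Python over in-memory events. Mitzu does this with generated SQL in the warehouse, so this is not its implementation. The event names, the tuple layout, and the `funnel_conversion` helper are made up for illustration.

```python
from datetime import datetime


def funnel_conversion(events, first_step, second_step):
    """Share of users who did first_step and did second_step afterwards.

    events is an iterable of (user_id, event_name, timestamp) tuples.
    """
    first_seen = {}    # user_id -> time of their first first_step event
    converted = set()  # users who did second_step after first_step
    for user_id, name, ts in sorted(events, key=lambda e: e[2]):
        if name == first_step:
            first_seen.setdefault(user_id, ts)
        elif name == second_step and user_id in first_seen:
            converted.add(user_id)
    return len(converted) / len(first_seen) if first_seen else 0.0


events = [
    ("u1", "page_viewed", datetime(2024, 7, 1, 10, 0)),
    ("u1", "signed_up", datetime(2024, 7, 1, 10, 5)),
    ("u2", "page_viewed", datetime(2024, 7, 1, 11, 0)),
    ("u3", "signed_up", datetime(2024, 7, 2, 9, 0)),    # before any page view
    ("u3", "page_viewed", datetime(2024, 7, 2, 9, 30)),
    ("u4", "page_viewed", datetime(2024, 7, 3, 8, 0)),
    ("u4", "signed_up", datetime(2024, 7, 3, 8, 10)),
]

rate = funnel_conversion(events, "page_viewed", "signed_up")
print(rate)  # 0.5
```

Here u1 and u4 convert, u2 never signs up, and u3 signs up before viewing a page, so 2 of the 4 users who entered the funnel converted.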

Measure user retention:
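To show what a retention number means, here is a minimal plain-Python sketch that computes the share of users who were active again N weeks after their first event. It is only an illustration of the concept, not how Mitzu computes retention. The `week_n_retention` helper and the sample data are hypothetical.

```python
from datetime import date


def week_n_retention(events, week):
    """Fraction of users with activity `week` weeks after their first event.

    events is a list of (user_id, day) tuples, where day is a date.
    """
    first_day = {}  # user_id -> date of the user's first event
    for user_id, day in sorted(events, key=lambda e: e[1]):
        first_day.setdefault(user_id, day)
    retained = {
        user_id
        for user_id, day in events
        if (day - first_day[user_id]).days // 7 == week
    }
    return len(retained) / len(first_day) if first_day else 0.0


events = [
    ("u1", date(2024, 7, 1)),
    ("u1", date(2024, 7, 9)),   # 8 days later: week 1
    ("u2", date(2024, 7, 1)),
    ("u2", date(2024, 7, 3)),   # 2 days later: still week 0
    ("u3", date(2024, 7, 2)),
    ("u3", date(2024, 7, 10)),  # 8 days later: week 1
    ("u4", date(2024, 7, 5)),
]

week_1 = week_n_retention(events, week=1)
print(week_1)  # 0.5
```

Two of the four users (u1 and u3) come back in the second week after their first event, which gives a week 1 retention of 50%.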

Discover user journeys:

From Setup to Insights: Mastering Warehouse-Native Analytics

In this post, I covered how to set up RudderStack with BigQuery for warehouse-native event tracking. This setup created separate tables in BigQuery for each event tracked with RudderStack, making it ideal for SQL and Python-based analytics.

This setup would also work with other data warehouses like ClickHouse, Databricks, or Snowflake. I used BigQuery because it is very simple to set up.

I then connected Mitzu to BigQuery and indexed the event tables in the dataset. This process took less than 10 minutes in total.

Lastly, I demonstrated Mitzu's product analytics capabilities over BigQuery event tables.

We have segmented our users and created a simple funnel insight. After that, we measured our users' retention and visualized their journey in the application.
