•

Product Analytics with Star Schema and Mitzu.io

Mitzu can help you unlock the power of star schemas for product analytics. It can also simplify your data structure and enhance your reporting capabilities.

István Mészáros

June 18, 2024

•

5 min read

Share this post

Clickhouse and Mitzu warehouse-native integration

Table of contents

Star Schema: Simplifying Data Warehousing for Product Analytics

In data warehousing, star schema is an excellent model for organizing all the information about your business. Its simplicity and performance capabilities, particularly in product analytics, make it a go-to structure for analysts and business intelligence professionals.

The star schema's design, characterized by central fact tables connected to surrounding dimension tables, streamlines the querying process and improves insight retrieval speed.
Whether you're tracking sales performance, understanding customer behavior, or analyzing product trends, the star schema provides a robust framework that supports detailed and high-performing analytics.
This blog post explores how you can use star schema in warehouse native product analytics.

Visualizing Star Schema: A Practical Example

Before diving deeper into this topic, let me demonstrate a simple star schema data model.

Here, we have one fact table and two dimension tables. (See below)

Fact: page_events - this table contains simulated visitor logs from a non-existing website
Dimension: dim_countries - this table contains country-level information
Dimension: dim_items - this table contains product information that is presented on the website

Dimension Tables:

Fact Table:

Notice that the table has country_code and item_id columns as foreign keys.
In the fact table, we have two event types:

page_visit
element_clicked_on_page

The element_clicked_on_page event has a foreign key item_id that shows which item was clicked on the page. We have information about the items in the corresponding dimension table.

Overcoming Star Schema Limitations in Growing Organizations

Although the star schema is a great model for organizing information, data analysts can easily explore dimensions and facts due to its deduplicated format. Joining facts to dimensions is straightforward for data analysts on a small scale.
However, as the company and the complexity of the data grows, a significant challenge surfaces in the data organization: too many fact and dimension tables lead to knowledge-sharing issues, slowing business intelligence and data analytics.‍

This is especially a problem when it comes to ad-hoc questions. Answering simple questions for the management with SQL will take longer as the business grows in complexity.

Enhancing Data Access: The Power of Denormalization

Data analysts are slow to answer questions primarily because each SQL query requires research on which dimension table holds the required information and how to join to that table.
One potential solution to this is the denormalization of datasets.
In practice, this means creating so-called enriched fact tables where each dimension is pre-joined to the fact table.
This way, the fact tables are enriched with information from different dimensions.

This simple SQL query creates a VIEW for a denormalized fact table.


create or replace view page_events_extended as 
select 
    t1.*, 
    d1.item_name, 
    d1.cost as item_cost,
    d2.country_name,
    d2.currency as original_currency,
    d2.continent
from page_events t1
left join dim_items d1 on d1.item_id = t1.item_id
left join dim_countries d2 on d2.iso_country_code = t1.user_country_code

Notice we use left joins, assuming not all foreign keys are present.
The benefit of this table is that everything is already in place. Data analysts do not need to do any research.

Streamlining Analytics with Warehouse-Native Solutions

These enriched fact tables are ultimately great for quick ad hoc analytics. For example, an account manager might want to answer a question about the sales performance of a particular business segment. Another example is a product manager who might want to know which features in the app convert the most users to paying customers. Most business questions originate from non-technical people in a company. However, answering them requires special skills like SQL knowledge. Delegating questions to data analysts is, in fact, the main contributor to slow decision-making.
Warehouse native product analytics tools enable the non-technical folks in the company to answer their questions themself without SQL knowledge.
In the following section, I will walk through enabling Mitzu’s warehouse-native product analytics using enriched fact tables.

Implementing Mitzu.io: A Step-by-Step Guide to Self-Served Analytics

Mitzu is a warehouse-native product analytics solution that connects directly to your enriched fact tables and enables self-served product analytics.
With Mitzu, non-technical users can create user segmentation, funnels, and retention insights with a handful of clicks.

Step 1.

Connect Mitzu to your data warehouse. In the example below, you can see Snowflake.

Step 2.

List your enriched fact tables. Click save & update event catalog.

Step 3.

Start analyzing your data without SQL.

‍

Key Takeaways: Optimizing Star Schema for Efficient Product Analytics

This post discussed star schema's benefits and potential problems in product analytics. We focused on the usability of facts and dimension datasets with ad-hoc behavioral analytics SQL queries. We also discussed reducing the time to insight by introducing enriched fact tables.
‍

Last, we demonstrated how improving organizational efficiency is possible by introducing a warehouse-native product analytics solution that connects directly to the enriched fact tables.

Unbeatable solution for all of your analytics needs

Get started with Mitzu for free and power your teams with data!

Book a Demo Start for Free

Blogs for your growth

View all

Data Analytics

Product Analytics with Star Schema and Mitzu.io

Star Schema: Simplifying Data Warehousing for Product Analytics