1. The Problem: Fact-to-Fact Joins
In a data warehouse, fact tables are very large (millions/billions of rows).
Sometimes analysts want to combine two different metrics from different fact tables. For example:
-
Sales_Fact
with measures like sales_amount, units_sold -
Inventory_Fact
with measures like on_hand_quantity
If you write a query like:
👉 This is a fact-to-fact join.
It’s problematic because:
-
Both fact tables are huge → performance disaster.
-
Grain may not align (Sales is at transaction grain, Inventory is at daily snapshot grain).
-
Double-counting and incorrect aggregations can happen.
2. The Multipass SQL Solution
Instead of directly joining fact tables, we use multipass SQL:
-
First pass: Aggregate each fact table independently to the right level of detail (usually using conformed dimensions).
-
Second pass: Join the aggregated results (much smaller) together.
Example:
✅ Now the join is between small aggregated tables, not massive fact tables.
✅ Avoids mis-grain joins because aggregation happens first.
✅ More efficient and semantically correct.
3. When to Use Multipass SQL
-
When users want to combine metrics from different processes (sales, inventory, shipments, returns, etc.).
-
When fact tables are at different grains (transaction-level vs. snapshot-level).
-
When fact-to-fact join would otherwise cause performance or correctness problems.
4. Related Design Principle
-
This is part of the "separate fact tables, conformed dimensions" strategy in dimensional modeling.
-
OLAP tools (like MicroStrategy, BusinessObjects, Cognos) often generate multipass SQL automatically behind the scenes to support cross-fact analysis.
👉 In short:
Multipass SQL means aggregating each fact table separately, then combining the smaller result sets, instead of directly joining giant fact tables.
It’s all about performance and avoiding mis-grain mistakes.