BUS220 - Business Intelligence and Analytics | Week 4
Oleh Omelchenko
2026-04-08
From clean data to decisions
Your VP sends you a Slack message
VP of Sales · 10:23 AM
How are we doing?

What do you show them?

Option A: $2,300,000
Is this good?

Option B: $2,300,000 (+15% YoY, 102% of target)
Now you can act on it. A metric needs a comparison — a trend, a benchmark, or a target.
What does this dashboard tell you?
| Metric       | Value |
|--------------|-------|
| ARPDAU       | $0.42 |
| D7 Retention | 18%   |
| LTV/CAC      | 1.3   |
| MAU          | 824K  |
| K-factor     | 0.6   |
| DAU/MAU      | 0.31  |
If your audience needs a glossary to read the dashboard, the dashboard has failed. A metric that people can’t discuss in a meeting won’t change anyone’s behavior.
What makes a good metric?
The Lean Analytics framework gives four properties:
Comparative
Can be measured against time periods, segments, or benchmarks. “Conversion increased from 2% to 3%” is more useful than “conversion is 3%.”
Understandable
If people can’t remember it or discuss it, they won’t act on it. The best metrics fit in one sentence.
A ratio or rate
Ratios normalize for scale. Profit margin lets you compare a 10-person store with a 200-person store. Absolute profit doesn’t.
Behavior-changing
If the number moves, someone should do something different. If nobody’s behavior changes, why are you tracking it?
Vanity metrics feel good but don’t help
A vanity metric makes you feel good but doesn’t change what you do.
Classic examples:
Total registered users (includes everyone who signed up and left)
Page views (one person hitting refresh looks like engagement)
Number of followers (how many actually do something when you post?)
The actionable alternative is usually a ratio: not “total users” but “percent of users who are active this week.”
The car salesman test
A car dealer tells his customer: “You’ll get a satisfaction survey call. Can you give us a 5?”
The dealer spends more time asking for a good rating than providing a good experience. The satisfaction score becomes a vanity metric — it measures gaming, not satisfaction.
— Lean Analytics Ch 2, via Seth Godin
KPIs: the metrics that run the business
A Key Performance Indicator is a metric an organization monitors regularly to evaluate performance against a goal.
Not every metric is a KPI. A dashboard with 30 KPIs has zero KPIs — if everything is key, nothing is.
| Role            | They probably watch                         |
|-----------------|---------------------------------------------|
| CEO             | Revenue, margin, customer count, churn      |
| VP Sales        | Pipeline value, conversion rate, deal size  |
| VP Operations   | Delivery time, error rate, cost per unit    |
| Product Manager | Active users, feature adoption, NPS         |
The VP of Sales and the VP of Operations need different dashboards because they make different decisions.
Metrics come in pairs
Leading vs. lagging
A leading metric predicts the future. Number of qualified leads in the sales pipeline predicts next quarter’s revenue.
A lagging metric reports the past. Customer churn tells you people left — but by the time you measure it, they’re already gone.
Both are useful. Leading metrics give you time to act.
Correlated vs. causal
Ice cream sales and drowning deaths are correlated. Both go up in summer. Banning ice cream won’t save anyone.
Finding a correlation is useful — it helps you predict. Finding a cause is better — it lets you intervene. Causation requires controlled experiments, not just dashboards.
BI Detective
Can you figure out what the numbers actually mean?
BI Detective: Dashboard Investigation
Olena Koval, VP Operations — Divvy Bikeshare, Chicago
Hey! November numbers look bad — rides down 32%, average duration down 12%. Can you figure out what’s going on before our meeting today?
Open the game: Course website → Activities → Week 4 BI Detective. You have 15 minutes.
Work through the tabs left to right. Answer the questions to build your case.
Debrief: what did the investigation reveal?
The ride drop is seasonal, not a crisis
Nov 2023 had 7.6% more rides than Nov 2022. The Oct→Nov drop happens every year when it gets cold. The VP compared adjacent months in a seasonal business — the wrong baseline.
Casual riders drove the entire drop
Casual ridership fell 44%; member ridership fell only 27%. The overall average duration also dropped — not because anyone rides shorter, but because the long-ride group (casuals ride ~2x longer) shrank in the mix.
Dead ends taught us where NOT to look
Hourly patterns, bike types, stations — all dropped uniformly. When every value in a dimension moves the same way, that dimension doesn’t explain the change.
The Thursday trap
Thursday barely dropped (-1%). Resilient day?
No — October had 4 Thursdays, November had 5. More days = more rides. On a per-Thursday basis, rides still dropped sharply, in line with the rest of the week.
Always check whether your comparison periods have the same calendar structure.
From what to how
You now know what makes a good metric. The rest of the lecture is about how to compute them.
Column-level calculations
A new value for every row — like adding a formula column in a spreadsheet.
| Student | Course     | Region | Modules Done | Total Modules | Completed? |
|---------|------------|--------|--------------|---------------|------------|
| Anna    | Data 101   | EU     | 8            | 8             | 1          |
| Ben     | Data 101   | US     | 5            | 8             | 0          |
| Clara   | SQL Basics | EU     | 6            | 6             | 1          |
| Dan     | SQL Basics | US     | 3            | 6             | 0          |
| Eva     | SQL Basics | EU     | 6            | 6             | 1          |
IF [Modules Done] = [Total Modules] THEN 1 ELSE 0 END — each row gets its own result. The new column can then be summed, averaged, or counted in any grouping.
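The slide shows the Tableau formula; the same row-level idea can be sketched in plain Python (data copied from the table above — the dictionary field names are our own, not Tableau's):

```python
# Row-level calculation: each record gets its own result,
# like IF [Modules Done] = [Total Modules] THEN 1 ELSE 0 END above.
students = [
    {"student": "Anna",  "course": "Data 101",   "region": "EU", "done": 8, "total": 8},
    {"student": "Ben",   "course": "Data 101",   "region": "US", "done": 5, "total": 8},
    {"student": "Clara", "course": "SQL Basics", "region": "EU", "done": 6, "total": 6},
    {"student": "Dan",   "course": "SQL Basics", "region": "US", "done": 3, "total": 6},
    {"student": "Eva",   "course": "SQL Basics", "region": "EU", "done": 6, "total": 6},
]

for row in students:
    # One result per row -- the new column can later be summed, averaged, or counted.
    row["completed"] = 1 if row["done"] == row["total"] else 0

print([r["completed"] for r in students])  # [1, 0, 1, 0, 1]
```

Note that the loop never looks at other rows: a column-level calculation depends only on the row it sits in.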
Dynamic calculations
The same formula, different results — depending on how the view is grouped.
Overall:

| Students | Completed | Rate |
|----------|-----------|------|
| 5        | 3         | 60%  |

Grouped by Course:

| Course     | Students | Completed | Rate |
|------------|----------|-----------|------|
| Data 101   | 2        | 1         | 50%  |
| SQL Basics | 3        | 2         | 67%  |

Grouped by Region:

| Region | Students | Completed | Rate |
|--------|----------|-----------|------|
| EU     | 3        | 3         | 100% |
| US     | 2        | 0         | 0%   |
SUM([Completed]) / COUNT([Student]) — one formula, three different answers. The grouping determines what gets aggregated. This is how pivot tables, Tableau, and SQL all work.
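A minimal sketch of the same mechanism in Python, using the slide's data (the `completion_rate` helper and its `key` parameter are our own illustration, not a Tableau or SQL API):

```python
# One formula -- SUM(completed) / COUNT(student) -- three answers,
# depending on how the rows are grouped before aggregating.
from collections import defaultdict

students = [
    # (student, course, region, completed)
    ("Anna",  "Data 101",   "EU", 1),
    ("Ben",   "Data 101",   "US", 0),
    ("Clara", "SQL Basics", "EU", 1),
    ("Dan",   "SQL Basics", "US", 0),
    ("Eva",   "SQL Basics", "EU", 1),
]

def completion_rate(rows, key=None):
    """Apply the same aggregate formula within each group."""
    groups = defaultdict(list)
    for name, course, region, completed in rows:
        group = "Overall" if key is None else (course if key == "course" else region)
        groups[group].append(completed)
    # The formula is identical for every call; only the grouping changes.
    return {g: sum(v) / len(v) for g, v in groups.items()}

print(completion_rate(students))                # {'Overall': 0.6}
print(completion_rate(students, key="course"))  # Data 101: 0.5, SQL Basics: ~0.67
print(completion_rate(students, key="region"))  # EU: 1.0, US: 0.0
```

The formula never changes; the partition of the rows does. That is the whole trick behind pivot tables and GROUP BY.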
Additive vs. non-additive metrics
Not every metric can be summed across groups. Compare COUNT vs. COUNT DISTINCT on the same data:
Enrollments:

| Student | Course     |
|---------|------------|
| Anna    | Data 101   |
| Anna    | SQL Basics |
| Ben     | Data 101   |
| Clara   | SQL Basics |

COUNT (enrollments):

| Course     | Count |
|------------|-------|
| Data 101   | 2     |
| SQL Basics | 2     |
| Total      | 4 ✓   |

2 + 2 = 4. Correct.

COUNT DISTINCT (students):

| Course     | Unique |
|------------|--------|
| Data 101   | 2      |
| SQL Basics | 2      |
| Total      | 3 ≠ 4  |

2 + 2 ≠ 3. Anna is in both.
COUNT is additive — the parts sum to the whole. COUNT DISTINCT is not — Anna appears in both courses but is one unique student overall. When you slice a dashboard by a dimension, additive metrics stay correct. Non-additive metrics need to be recalculated, not summed.
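The same contrast, sketched in Python with the enrollment rows from above (sets play the role of COUNT DISTINCT):

```python
# COUNT is additive: per-course counts sum to the overall total.
# COUNT DISTINCT is not: Anna enrolls in both courses but is one person.
enrollments = [
    ("Anna", "Data 101"), ("Anna", "SQL Basics"),
    ("Ben", "Data 101"), ("Clara", "SQL Basics"),
]

count_per_course = {}
students_per_course = {}
for student, course in enrollments:
    count_per_course[course] = count_per_course.get(course, 0) + 1
    students_per_course.setdefault(course, set()).add(student)

# Additive: summing the parts reproduces the whole.
assert sum(count_per_course.values()) == len(enrollments)  # 2 + 2 == 4

# Non-additive: summing per-course distinct counts overcounts Anna.
summed_distinct = sum(len(s) for s in students_per_course.values())  # 4
true_distinct = len({student for student, _ in enrollments})         # 3
assert summed_distinct != true_distinct
```

The fix for non-additive metrics is the last line: recompute the distinct count over the combined rows instead of summing the per-group results.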
Calculated fields in Tableau
Tableau’s implementation of the same principle: derive new values from existing columns.
Four types of calculated fields
| Type        | Example                                       |
|-------------|-----------------------------------------------|
| Arithmetic  | SUM([Profit]) / SUM([Sales])                  |
| String      | LEFT([Product Name], 3)                       |
| Date        | DATEDIFF('day', [Order Date], [Ship Date])    |
| Conditional | IF [Profit] > 0 THEN "Profit" ELSE "Loss" END |
| Product | Order Date | Ship Date | Ship Days |
|---------|------------|-----------|-----------|
| Chair   | Mar 1      | Mar 4     | 3         |
| Binder  | Mar 1      | Mar 2     | 1         |
| Desk    | Mar 3      | Mar 10    | 7         |
Date calc: DATEDIFF('day', [Order Date], [Ship Date]). Each row gets its own result — this is column-level.
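The same date arithmetic in Python, using the orders above (the slide gives no year, so 2024 is an arbitrary choice; any year yields the same differences for these dates):

```python
# DATEDIFF('day', order, ship) sketched with the standard library.
from datetime import date

orders = [
    ("Chair",  date(2024, 3, 1), date(2024, 3, 4)),
    ("Binder", date(2024, 3, 1), date(2024, 3, 2)),
    ("Desk",   date(2024, 3, 3), date(2024, 3, 10)),
]

# Subtracting two dates yields a timedelta; .days is the day difference.
ship_days = {product: (ship - ordered).days for product, ordered, ship in orders}
print(ship_days)  # {'Chair': 3, 'Binder': 1, 'Desk': 7}
```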
Aggregation order matters
| Order | Sales   | Profit | Row Margin |
|-------|---------|--------|------------|
| A-001 | $100    | $10    | 10%        |
| A-002 | $10,000 | $500   | 5%         |
| Total | $10,100 | $510   | ?          |
AVG([Profit] / [Sales]) → (10% + 5%) / 2 = 7.5%
Averages the row-level ratios. Both orders count equally.

SUM([Profit]) / SUM([Sales]) → $510 / $10,100 ≈ 5.05%
Aggregates first, then divides. The $10K order dominates — as it should.
For business metrics, you almost always want the second version.
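A two-line demonstration of the gap, using the two orders above:

```python
# (sales, profit) pairs from the slide's table.
orders = [(100, 10), (10_000, 500)]

# Average of row-level ratios: every order gets equal weight.
avg_of_ratios = sum(p / s for s, p in orders) / len(orders)          # 0.075 -> 7.5%

# Aggregate first, then divide: orders are weighted by their size.
ratio_of_sums = sum(p for _, p in orders) / sum(s for s, _ in orders)  # 510/10100 -> ~5.05%

print(avg_of_ratios, ratio_of_sums)
```

The discrepancy grows with the spread in order sizes, which is exactly why the weighted version is the honest business number.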
NULLs in calculations
After a left join, unmatched rows get NULL — and NULLs behave differently than you might expect:
| Order | Sales | Returned | [Returned] = "Yes" |
|-------|-------|----------|--------------------|
| A-001 | $500  | Yes      | TRUE               |
| A-002 | $200  | NULL     | NULL — not FALSE   |
| A-003 | $80   | NULL     | NULL — not FALSE   |
A formula using a NULL input produces NULL. [Returned] = "Yes" does not evaluate to FALSE when [Returned] is missing — it evaluates to NULL. Use IFNULL or ISNULL to handle this explicitly.
Aggregations skip NULLs silently. AVG([Returned Value]) ignores NULL rows — it doesn't treat them as zero. Your average is computed over fewer rows than you think.
Filters exclude NULLs by default. If you filter on a field, NULL values disappear without warning.
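Python's `None` is not a perfect stand-in for SQL/Tableau NULL (in Python, `None == "Yes"` is simply `False`, not NULL), but the aggregation pitfall carries over directly. A sketch with hypothetical returned-value amounts:

```python
# Skipping NULLs changes the denominator of an average.
returned_values = [500, None, None]  # hypothetical per-order amounts

non_null = [v for v in returned_values if v is not None]

# NULL-skipping average: computed over 1 row, not 3.
avg_skipping_nulls = sum(non_null) / len(non_null)               # 500.0

# Treating NULL as zero gives a very different answer.
avg_null_as_zero = sum(non_null) / len(returned_values)          # ~166.7
```

Neither answer is automatically "right" — the point is to decide explicitly, because your BI tool will silently pick the first one.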
Joins
Real business data lives in multiple tables. Joins bring them together.
Why multiple tables?
“What’s our return rate by product category?” — the answer requires data from two tables:
Orders:

| Order ID | Category   | Sales |
|----------|------------|-------|
| A-001    | Furniture  | $500  |
| A-002    | Technology | $200  |
| A-003    | Furniture  | $80   |

+

Returns:

| Order ID | Returned |
|----------|----------|
| A-001    | Yes      |

Only returned orders have rows here.
Category lives in Orders, return status lives in Returns. A join on Order ID combines them — like XLOOKUP from week 1, but for the entire table at once.
Four join types
| Join       | What you get                                                         |
|------------|----------------------------------------------------------------------|
| Inner      | Only rows that match in both tables                                  |
| Left       | All rows from the left, matches from the right (NULL where no match) |
| Right      | Mirror of left: all rows from the right, matches from the left       |
| Full outer | All rows from both tables                                            |
In analytics, you’ll use left join most of the time.
Left join: Orders + Returns
| Order ID | Sales | Returned |
|----------|-------|----------|
| A-001    | $500  | Yes      |
| A-002    | $200  | NULL     |
| A-003    | $80   | NULL     |
All orders kept. Return info attached where it exists. Non-returned orders get NULL — not dropped.
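A minimal left-join sketch in Python, with the tables above (a dict lookup plays the role of the join key; `None` stands in for NULL):

```python
# Left join on Order ID: keep every order, attach return status where it exists.
orders = [("A-001", 500), ("A-002", 200), ("A-003", 80)]
returns = {"A-001": "Yes"}  # only returned orders appear here

# dict.get returns None when the key is missing -- the NULL of this sketch.
joined = [(oid, sales, returns.get(oid)) for oid, sales in orders]

print(joined)
# [('A-001', 500, 'Yes'), ('A-002', 200, None), ('A-003', 80, None)]
```

Non-returned orders are kept with `None`, not dropped — exactly what a left join promises and an inner join would not.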
The most common join mistake: fan-out
If the right table has multiple rows per key, your join duplicates rows on the left:
Orders:

| Order ID | Sales |
|----------|-------|
| A-001    | $500  |
| A-002    | $200  |

2 rows, $700 total.

Returns:

| Order ID | Reason     |
|----------|------------|
| A-001    | Damaged    |
| A-001    | Wrong item |

Two rows for one order!

After left join:

| Order ID | Sales | Reason     |
|----------|-------|------------|
| A-001    | $500  | Damaged    |
| A-001    | $500  | Wrong item |
| A-002    | $200  | NULL       |

3 rows, SUM(Sales) = $1,200 — wrong!
Rule: always check row count and a key total (like SUM of Sales) before and after a join. If either changed, your join created duplicates.
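The fan-out and the before/after check, sketched in Python with the tables above:

```python
# Two return reasons for A-001 duplicate its $500 sales row after the join.
orders = [("A-001", 500), ("A-002", 200)]
returns = [("A-001", "Damaged"), ("A-001", "Wrong item")]

joined = []
for oid, sales in orders:
    # A left join emits one row per match -- or one NULL row if no match.
    reasons = [r for rid, r in returns if rid == oid] or [None]
    for reason in reasons:
        joined.append((oid, sales, reason))

# Before the join: 2 rows, $700 in sales.
assert len(orders) == 2 and sum(s for _, s in orders) == 700

# After the join: 3 rows and an inflated total -- the check catches the fan-out.
assert len(joined) == 3
assert sum(s for _, s, _ in joined) == 1200  # wrong as a sales total!
```

If the row count or the key total changed, stop and fix the join before building anything on top of it.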
Joins vs. unions
| Operation | What it does                                              | Analogy                          |
|-----------|-----------------------------------------------------------|----------------------------------|
| Join      | Adds columns — attaches new fields from a second table    | Stapling two pages side by side  |
| Union     | Adds rows — stacks data from tables with the same columns | Stapling two pages top to bottom |
You saw unions conceptually in week 3 (combining January + February + March files). Joins are new this week.
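In the Python sketches above, the contrast is one line each (monthly rows are hypothetical):

```python
# Union = stacking rows with identical columns; no new fields appear.
january  = [("J-001", 120), ("J-002", 80)]  # (order_id, sales)
february = [("F-001", 95)]

combined = january + february  # 3 rows, same two columns

# A join would instead widen each row with fields from a second table.
print(len(combined))  # 3
```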
Next week: the power tools
Week 5: Advanced Calculations
Table calculations — running totals, rankings, percent of total
LOD expressions — calculating at a different level than the view shows
Cohort analysis — grouping customers by when they first appeared