What Should We Measure?

BUS220 - Business Intelligence and Analytics | Week 4

Oleh Omelchenko

2026-04-08

From clean data to decisions

Your VP sends you a Slack message


VP of Sales 10:23 AM

How are we doing?


What do you show them?

Is this good?


$2,300,000

$2,300,000 (+15% YoY, 102% of target)

Now you can act on it. A metric needs a comparison — a trend, a benchmark, or a target.

What does this dashboard tell you?


ARPDAU: $0.42
D7 Retention: 18%
LTV/CAC: 1.3
MAU: 824K
K-factor: 0.6
DAU/MAU: 0.31


If your audience needs a glossary to read the dashboard, the dashboard has failed. A metric that people can’t discuss in a meeting won’t change anyone’s behavior.

What makes a good metric?

The Lean Analytics framework gives four properties:


Comparative

Can be measured against time periods, segments, or benchmarks. “Conversion increased from 2% to 3%” is more useful than “conversion is 3%.”


Understandable

If people can’t remember it or discuss it, they won’t act on it. The best metrics fit in one sentence.

A ratio or rate

Ratios normalize for scale. Profit margin lets you compare a 10-person store with a 200-person store. Absolute profit doesn’t.


Behavior-changing

If the number moves, someone should do something different. If nobody’s behavior changes, why are you tracking it?

Vanity metrics feel good but don’t help


A vanity metric makes you feel good but doesn’t change what you do.

Classic examples:

  • Total registered users (includes everyone who signed up and left)
  • Page views (one person hitting refresh looks like engagement)
  • Number of followers (how many actually do something when you post?)

The actionable alternative is usually a ratio: not “total users” but “percent of users who are active this week.”

The car salesman test

A car dealer tells his customer: “You’ll get a satisfaction survey call. Can you give us a 5?”

The dealer spends more time asking for a good rating than providing a good experience. The satisfaction score becomes a vanity metric — it measures gaming, not satisfaction.

— Lean Analytics Ch 2, via Seth Godin

KPIs: the metrics that run the business


A Key Performance Indicator is a metric an organization monitors regularly to evaluate performance against a goal.


Not every metric is a KPI. A dashboard with 30 KPIs has zero KPIs — if everything is key, nothing is.


What each role probably watches:

CEO: Revenue, margin, customer count, churn
VP Sales: Pipeline value, conversion rate, deal size
VP Operations: Delivery time, error rate, cost per unit
Product Manager: Active users, feature adoption, NPS


The VP of Sales and the VP of Operations need different dashboards because they make different decisions.

Metrics come in pairs


Leading vs. lagging

A leading metric predicts the future. Number of qualified leads in the sales pipeline predicts next quarter’s revenue.

A lagging metric reports the past. Customer churn tells you people left — but by the time you measure it, they’re already gone.

Both are useful. Leading metrics give you time to act.

Correlated vs. causal

Ice cream sales and drowning deaths are correlated. Both go up in summer. Banning ice cream won’t save anyone.

Finding a correlation is useful — it helps you predict. Finding a cause is better — it lets you intervene. Causation requires controlled experiments, not just dashboards.

BI Detective

Can you figure out what the numbers actually mean?

BI Detective: Dashboard Investigation

Olena Koval, VP Operations, Divvy Bikeshare, Chicago

Hey! November numbers look bad — rides down 32%, average duration down 12%. Can you figure out what’s going on before our meeting today?


Open the game: Course website → Activities → Week 4 BI Detective. You have 15 minutes.


Work through the tabs left to right. Answer the questions to build your case.

Debrief: what did the investigation reveal?

The ride drop is seasonal, not a crisis

Nov 2023 had 7.6% more rides than Nov 2022. The Oct→Nov drop happens every year when it gets cold. The VP compared adjacent months in a seasonal business, which is the wrong baseline.

Casual riders drove the entire drop

Casual ridership fell 44%; member ridership fell only 27%. The overall average duration also dropped, not because individual rides got shorter, but because the long-ride group (casual riders take roughly 2x longer trips) shrank as a share of the mix.

Dead ends taught us where NOT to look

Hourly patterns, bike types, stations — all dropped uniformly. When every value in a dimension moves the same way, that dimension doesn’t explain the change.

The Thursday trap

Thursday barely dropped (-1%). Resilient day?

No — October had 4 Thursdays, November had 5. More days = more rides. Per-Thursday, the drop was ~30%, same as every other day.

Always check whether your comparison periods have the same calendar structure.
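The calendar check can be automated. A small plain-Python sketch using the standard library's calendar module, counting how many Thursdays each month contains:

```python
from calendar import Calendar

def count_weekday(year, month, weekday):
    """Count how many times a weekday (Mon=0 .. Sun=6) occurs in a month."""
    return sum(1 for d in Calendar().itermonthdates(year, month)
               if d.month == month and d.weekday() == weekday)

# Thursday is weekday 3
print(count_weekday(2023, 10, 3), count_weekday(2023, 11, 3))  # 4 5
```

One line like this would have caught the Thursday trap before the meeting.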

From what to how

You now know what makes a good metric. The rest of the lecture is about how to compute them.

Column-level calculations

A new value for every row — like adding a formula column in a spreadsheet.


Student Course Region Modules Done Total Modules Completed?
Anna Data 101 EU 8 8 1
Ben Data 101 US 5 8 0
Clara SQL Basics EU 6 6 1
Dan SQL Basics US 3 6 0
Eva SQL Basics EU 6 6 1


IF [Modules Done] = [Total Modules] THEN 1 ELSE 0 END — each row gets its own result. The new column can then be summed, averaged, or counted in any grouping.
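The same row-level rule, sketched in plain Python with the table above as data:

```python
students = [
    {"student": "Anna",  "course": "Data 101",   "region": "EU", "done": 8, "total": 8},
    {"student": "Ben",   "course": "Data 101",   "region": "US", "done": 5, "total": 8},
    {"student": "Clara", "course": "SQL Basics", "region": "EU", "done": 6, "total": 6},
    {"student": "Dan",   "course": "SQL Basics", "region": "US", "done": 3, "total": 6},
    {"student": "Eva",   "course": "SQL Basics", "region": "EU", "done": 6, "total": 6},
]

for row in students:
    # Row-level: IF [Modules Done] = [Total Modules] THEN 1 ELSE 0 END
    row["completed"] = 1 if row["done"] == row["total"] else 0

print([r["completed"] for r in students])  # [1, 0, 1, 0, 1]
```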

Dynamic calculations

The same formula, different results — depending on how the view is grouped.


Overall
Students Completed Rate
5 3 60%
Grouped by Course
Course Students Completed Rate
Data 101 2 1 50%
SQL Basics 3 2 67%
Grouped by Region
Region Students Completed Rate
EU 3 3 100%
US 2 0 0%


SUM([Completed]) / COUNT([Student]) — one formula, three different answers. The grouping determines what gets aggregated. This is how pivot tables, Tableau, and SQL all work.
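A plain-Python sketch of the same idea: one formula, with the grouping supplied from outside. The data reuses the Completed flags from the table above:

```python
from collections import defaultdict

# (student, course, region, completed)
students = [
    ("Anna",  "Data 101",   "EU", 1),
    ("Ben",   "Data 101",   "US", 0),
    ("Clara", "SQL Basics", "EU", 1),
    ("Dan",   "SQL Basics", "US", 0),
    ("Eva",   "SQL Basics", "EU", 1),
]

def completion_rate(rows, key=None):
    """SUM(Completed) / COUNT(Student), within each group defined by key."""
    if key is None:                      # no grouping: one overall number
        return sum(c for *_, c in rows) / len(rows)
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row[3])  # collect Completed flags per group
    return {g: sum(flags) / len(flags) for g, flags in groups.items()}

print(completion_rate(students))                      # 0.6 (60% overall)
print(completion_rate(students, key=lambda r: r[1]))  # by course
print(completion_rate(students, key=lambda r: r[2]))  # by region
```

The formula never changes; only the grouping does.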

Additive vs. non-additive metrics

Not every metric can be summed across groups. Compare COUNT vs. COUNT DISTINCT on the same data:


Enrollments
Student Course
Anna Data 101
Anna SQL Basics
Ben Data 101
Clara SQL Basics
COUNT (enrollments)
Course Count
Data 101 2
SQL Basics 2
Total 4 ✓
2 + 2 = 4. Correct.
COUNT DISTINCT (students)
Course Unique
Data 101 2
SQL Basics 2
Total 3 ≠ 4
2 + 2 ≠ 3. Anna is in both.


COUNT is additive — the parts sum to the whole. COUNT DISTINCT is not — Anna appears in both courses but is one unique student overall. When you slice a dashboard by a dimension, additive metrics stay correct. Non-additive metrics need to be recalculated, not summed.
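The same comparison in plain Python, using the enrollment table above:

```python
enrollments = [("Anna", "Data 101"), ("Anna", "SQL Basics"),
               ("Ben", "Data 101"), ("Clara", "SQL Basics")]

by_course = {}
for student, course in enrollments:
    by_course.setdefault(course, []).append(student)

# COUNT is additive: per-course counts sum to the overall row count.
counts = {c: len(s) for c, s in by_course.items()}
assert sum(counts.values()) == len(enrollments)   # 2 + 2 == 4

# COUNT DISTINCT is not: Anna is counted once per course, once overall.
distinct = {c: len(set(s)) for c, s in by_course.items()}
total_distinct = len({student for student, _ in enrollments})
print(sum(distinct.values()), total_distinct)     # 4 3
```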

Calculated fields in Tableau

Tableau’s implementation of the same principle: derive new values from existing columns.

Four types of calculated fields

Type Example
Arithmetic SUM([Profit]) / SUM([Sales])
String LEFT([Product Name], 3)
Date DATEDIFF('day', [Order Date], [Ship Date])
Conditional IF [Profit] > 0 THEN "Profit" ELSE "Loss" END
Product Order Date Ship Date Ship Days
Chair Mar 1 Mar 4 3
Binder Mar 1 Mar 2 1
Desk Mar 3 Mar 10 7

Date calc: DATEDIFF('day', [Order Date], [Ship Date])
Each row gets its own result — this is column-level.
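The same date calculation in plain Python (the year is an assumption; the slide gives only month and day):

```python
from datetime import date

# (product, order date, ship date) -- year 2024 chosen for illustration
orders = [("Chair",  date(2024, 3, 1), date(2024, 3, 4)),
          ("Binder", date(2024, 3, 1), date(2024, 3, 2)),
          ("Desk",   date(2024, 3, 3), date(2024, 3, 10))]

# Column-level: DATEDIFF('day', [Order Date], [Ship Date]) for each row
ship_days = [(product, (ship - ordered).days) for product, ordered, ship in orders]
print(ship_days)  # [('Chair', 3), ('Binder', 1), ('Desk', 7)]
```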

Aggregation order matters

Order Sales Profit Row Margin
A-001 $100 $10 10%
A-002 $10,000 $500 5%
Total $10,100 $510 ?

AVG([Profit] / [Sales]) → (10% + 5%) / 2 = 7.5%

Averages the row-level ratios. Both orders count equally.


SUM([Profit]) / SUM([Sales]) → $510 / $10,100 = 5.05%

Aggregates first, then divides. The $10K order dominates — as it should.


For business metrics, you almost always want the second version.
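Both versions side by side in plain Python, using the two orders above:

```python
# (order id, sales, profit)
orders = [("A-001", 100, 10), ("A-002", 10_000, 500)]

# Version 1: average the row-level margins -- both orders weigh equally.
avg_of_ratios = sum(profit / sales for _, sales, profit in orders) / len(orders)

# Version 2: aggregate first, then divide -- dollar-weighted margin.
ratio_of_sums = (sum(profit for *_, profit in orders)
                 / sum(sales for _, sales, _ in orders))

print(f"{avg_of_ratios:.2%}")   # 7.50%
print(f"{ratio_of_sums:.2%}")   # 5.05%
```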

NULLs in calculations

After a left join, unmatched rows get NULL — and NULLs behave differently than you might expect:


Order Sales Returned [Returned] = “Yes”
A-001 $500 Yes TRUE
A-002 $200 NULL NULL — not FALSE
A-003 $80 NULL NULL — not FALSE

A formula using a NULL input produces NULL. [Returned] = "Yes" does not evaluate to FALSE when [Returned] is missing — it evaluates to NULL. Use IFNULL or ISNULL to handle this explicitly.

Aggregations skip NULLs silently. AVG([Returned Value]) ignores NULL rows — it doesn’t treat them as zero. Your average is computed over fewer rows than you think.

Filters exclude NULLs by default. If you filter on a field, NULL values disappear without warning.
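A plain-Python sketch of the first two behaviors, with None standing in for NULL (Python has no SQL-style three-valued logic, so the "unknown" outcome is made explicit here):

```python
# Return status after a left join: only A-001 matched, the rest are NULL.
returned = ["Yes", None, None]

# A comparison against a missing value is unknown, not False:
flags = [None if r is None else r == "Yes" for r in returned]
print(flags)                        # [True, None, None]

# Aggregations skip missing values silently:
refund = [120, None, None]          # refund amounts; NULL where no return
present = [v for v in refund if v is not None]
print(sum(present) / len(present))  # 120.0 -- averaged over 1 row, not 3
```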

Joins

Real business data lives in multiple tables. Joins bring them together.

Why multiple tables?

“What’s our return rate by product category?” — the answer requires data from two tables:


Orders
Order ID Category Sales
A-001 Furniture $500
A-002 Technology $200
A-003 Furniture $80
+
Returns
Order ID Returned
A-001 Yes
Only returned orders have rows here.


Category lives in Orders, return status lives in Returns. A join on Order ID combines them — like XLOOKUP from week 1, but for the entire table at once.

Four join types

Join What you get
Inner Only rows that match in both tables
Left All rows from the left, matches from the right (NULL where no match)
Right Mirror of left
Full outer All rows from both tables


In analytics, you’ll use left join most of the time.

Left join: Orders + Returns
Order ID Sales Returned
A-001 $500 Yes
A-002 $200 NULL
A-003 $80 NULL
All orders kept. Return info attached where it exists. Non-returned orders get NULL — not dropped.
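A minimal left-join sketch in plain Python, using the Orders and Returns tables above:

```python
orders = [("A-001", 500), ("A-002", 200), ("A-003", 80)]
returns = {"A-001": "Yes"}  # only returned orders appear here

# Left join on Order ID: keep every order, attach return status where it exists.
joined = [(oid, sales, returns.get(oid)) for oid, sales in orders]
print(joined)  # [('A-001', 500, 'Yes'), ('A-002', 200, None), ('A-003', 80, None)]
```

dict.get returns None for missing keys, which mirrors the NULLs a left join produces.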

The most common join mistake: fan-out

If the right table has multiple rows per key, your join duplicates rows on the left:

Orders

Order ID Sales
A-001 $500
A-002 $200
2 rows, $700 total

Returns

Order ID Reason
A-001 Damaged
A-001 Wrong item
Two rows for one order!

After left join

Order ID Sales Reason
A-001 $500 Damaged
A-001 $500 Wrong item
A-002 $200 NULL
3 rows, SUM(Sales) = $1,200 — wrong!


Rule: always check row count and a key total (like SUM of Sales) before and after a join. If either changed, your join created duplicates.
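The fan-out and the before/after check, sketched in plain Python:

```python
orders = [("A-001", 500), ("A-002", 200)]
returns = [("A-001", "Damaged"), ("A-001", "Wrong item")]  # two rows, one key!

# Left join: every matching right-hand row duplicates the left-hand row.
joined = []
for oid, sales in orders:
    matches = [reason for rid, reason in returns if rid == oid] or [None]
    for reason in matches:
        joined.append((oid, sales, reason))

# The before/after check from the rule above:
print(len(orders), sum(s for _, s in orders))     # 2 700
print(len(joined), sum(s for _, s, _ in joined))  # 3 1200  <- fan-out!
```

Row count went from 2 to 3 and SUM(Sales) from $700 to $1,200: the join created duplicates.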

Joins vs. unions


Operation What it does Analogy
Join Adds columns — attaches new fields from a second table Stapling two pages side by side
Union Adds rows — stacks data from tables with the same columns Stapling two pages top to bottom


You saw unions conceptually in week 3 (combining January + February + March files). Joins are new this week.
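The two operations side by side in plain Python (the region lookup table is a hypothetical example):

```python
# Union: stack rows from tables with the same columns (week 3's monthly files).
january = [("A-001", 500)]
february = [("B-001", 300)]
combined = january + february  # adds rows: 2 rows, same columns

# Join: attach a new column from a lookup table, matched by key.
regions = {"A-001": "EU", "B-001": "US"}
joined = [(oid, sales, regions[oid]) for oid, sales in combined]
print(joined)  # [('A-001', 500, 'EU'), ('B-001', 300, 'US')]
```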

Next week: the power tools


Week 5: Advanced Calculations

  • Table calculations — running totals, rankings, percent of total
  • LOD expressions — calculating at a different level than the view shows
  • Cohort analysis — grouping customers by when they first appeared


Reading for next week:

  • PT Ch 13 — Introduction to Table Calculations
  • PT Ch 16 — Introduction to LOD Expressions