Week 6 Practice: Dashboard Design with SF Bikeshare

BUS220 - Business Intelligence and Analytics

Submission Checklist

2 points (satisfactory completion) | Deadline: 3 days after the practice session

Submit one Tableau workbook (.twbx) with:


Data

Download two files from Moodle:

  • sf-bikeshare-trips.csv — ~260K trips from San Francisco’s bikeshare system
  • sf-bikeshare-stations.csv — station locations and names

The trips file has a start_station_id and an end_station_id. The stations file has metadata for each station. You will need to connect the stations to trips twice — think about why.

Drag the stations file onto the data model canvas twice. Rename the two copies to something meaningful (e.g., “start-stations” and “end-stations”). Relate each to the trips table on the appropriate station ID field. This gives you station names and coordinates for both ends of every trip.


Session 1: What Happened to SF Bikeshare?

You have trip data. You don’t know the story yet. Find it.

The timeline

Start with the most basic question: when did people ride?

Build a chart that shows trip volume over time at a monthly granularity. What do you notice?

There is a gap — trips stop entirely for several months and then resume. This is not a data error. The original bikeshare system (Bay Area Bike Share) shut down, and a new system (Ford GoBike) launched later. The volume before and after the gap is very different.

Once you see the gap, start asking follow-up questions:

  • Is the post-gap growth consistent, or does it have its own patterns?
  • Are there seasonal effects?
  • What does the subscriber_type field look like before vs after the gap? (Watch for data quality surprises.)

The geography

Now look at where the stations are. Build a map using coordinates from the stations table.

Explore:

  • Where are stations concentrated? Are there isolated outliers?
  • Which stations are busiest? (Size marks by trip count.)
  • Do busy stations cluster together, or are they spread out?

Station cohorts

Not all stations existed from the beginning. Some were part of the original system, others were added when the new system launched, and still more appeared in later expansion waves.

Your task: figure out when each station first appeared, and use that to categorize stations into cohorts (e.g., original system, initial launch, later expansion). Then color your map by cohort.

Think about:

  • What LOD expression gives you each station’s launch date?
  • What cohort boundaries make sense given the patterns you see?
  • Is there a geographic pattern to the cohorts?

{FIXED [Station Id] : MIN([Start Date])} gives the first trip date for a station. Then use an IF/THEN to bucket those dates into cohorts. Test in a text table before applying to the map.

Dynamic breakdown

Your timeline chart shows the overall trend. But what is driving that trend? Different breakdowns reveal different parts of the story.

Build a parameter that lets the user choose how to break down the time series — for example, by station cohort, by subscriber type, or by gender. A single chart that changes its color dimension based on a dropdown.

Try each breakdown. What does each one reveal?

  • One breakdown should show you how the expansion contributed to growth
  • Another should show different usage patterns (commute vs leisure?)
  • Another might reveal data quality differences between the old and new systems

Create a string parameter with breakdown names as values. Then a calculated field that returns the appropriate dimension:

CASE [Breakdown]
  WHEN "Cohort" THEN [Station Cohort]
  WHEN "User Type" THEN [Subscriber Type]
  WHEN "Gender" THEN [Member Gender]
END

Put this calculated field on Color. When the user changes the parameter, the chart recolors by a different dimension.

Optional: geographic sets

Use the lasso tool on your map to select a cluster of stations (e.g., the waterfront, or a residential area). Right-click the selection and create a set.

Build a chart comparing In vs Out of the set on any metric that interests you. Try different geographic groupings. What contrasts are most revealing?

Goal by end of Session 1: You can describe the bikeshare timeline, show where and when stations appeared, and dynamically break down trends using a parameter.


Session 2: Did Expansion Actually Grow the Business?

Session 1 was exploratory — you discovered patterns. Now shift to explanatory mode: build a dashboard that answers a specific question.

The question: Did adding new stations create genuine growth, or did the system just get bigger?

The audience: A city transportation planner evaluating whether the expansion was successful.

The core analysis

Your Session 1 timeline chart shows dramatic growth after the 2017 relaunch. But that growth has two possible explanations:

  1. Capacity effect: More stations mechanically means more trips. The system grew because there are more places to pick up and drop off a bike. The existing stations didn’t change.
  2. Network effect: More stations make the whole system more useful. People at original stations take more trips because there are now more destinations nearby. The existing stations grew too.

A planner deciding on further expansion needs to know which one it is.

Your task: isolate the original stations — the ones that existed before the gap — and compare their performance before and after the expansion. Build views that help answer:

  • Did trip volume at original stations increase after the new system launched, or stay flat?
  • Did trip duration or usage patterns change at original stations?
  • Is there a difference between original stations that are near expansion stations vs those that are far from them?

Use the station cohort field from Session 1 to filter your views to only original stations. Then compare their monthly trips in 2016 (before the gap) to 2017–2018 (after expansion). Keep in mind that the dataset uses a bike sample, so absolute numbers matter less than the trend — are original stations growing, flat, or declining relative to their pre-gap baseline?

Building the dashboard

Combine your analysis into a dashboard that a transportation planner could use to decide whether expansion created real network effects. Think about what views belong:

  • A view showing the overall system growth (all cohorts) — context
  • A view showing original stations only — the core question
  • The station map with cohort coloring — geographic context
  • Any relevant breakdowns (subscriber type, time patterns) that strengthen the argument

Apply the design principles from lecture:

  • Overview first, then detail. The planner sees the headline, then investigates.
  • Coordinated views. Click-to-filter and highlight actions connect views so the planner can drill into specific cohorts or time periods.
  • Interactivity needs a reason. Every filter and parameter should answer a “So what?” — not just exist because Tableau makes it easy.

Optional extensions

If you finish the core dashboard, pick one:

Metric switcher. Add a parameter that toggles a chart between different measures (total trips, median duration, subscriber share). One clean view instead of three — the parameter-as-decluttering-tool idea from lecture.

Hour-of-day heatmap. Day of week on one axis, hour on the other, colored by trip count. Filter by station cohort — do original and expansion stations show the same commute peaks, or do expansion stations serve different usage patterns?

Proximity analysis. Are original stations that are close to new expansion stations growing faster than original stations that are far from them? This is harder to build but directly tests the network effect hypothesis. Consider using sets or groups to classify original stations by their proximity to expansion stations.

Top routes. Which station-to-station pairs are most popular? Are the top routes within the same cohort or between original and expansion stations? Cross-cohort routes would be evidence of network effects.


Takeaways

  • Exploratory before explanatory. Session 1 discovered the story. Session 2 curated it for an audience. Both are valid dashboard modes — know which one you are building.
  • LOD expressions create dimensions that don’t exist in the raw data. Station launch date was derived from trip timestamps — computing at a different grain, then using the result as a category.
  • Parameters simplify dashboards. A dynamic breakdown replaces multiple charts with one. Each parameter should make the dashboard less complex, not more.
  • Isolating variables tells a clearer story. Total system growth is impressive but ambiguous. Filtering to original stations separates the capacity effect from the network effect — a sharper answer for the planner.
  • Dashboard actions connect views into a coordinated system. But every action needs to be discoverable — if users can’t find it, it does not exist.