Tinkering With Custom Percentile Bins

Patricia Santillan posted a fun little brain-teaser on the Tableau Forums today. She had a sorted bar chart showing bus ridership by routes. It looked like this:

Bus-1

What she was trying to figure out was how to color the bars by percent of total. Easy, just a quick table calculation, right? Well she didn’t want a continuous measure that would color each bar slightly lighter. (That sort of coloring is redundant, and besides the bar length does a much better job of encoding this information.) Instead Patricia wanted three discrete colors, determined by percentile bins. And just for fun, she wanted non-uniform bins — specifically, Top 50%, Middle 25%-50% and Bottom 25%.

To work this out I used a simplified version of Richard Leeke’s quantile formula, which compares two table calculations: TOTAL() and RUNNING_SUM to set up thresholds Here’s the formula I used:

IF .5 * TOTAL(SUM([Cy Ride]))>=RUNNING_SUM(SUM([Cy Ride])) THEN 'Top'
ELSEIF .75 * TOTAL(SUM([Cy Ride]))>=RUNNING_SUM(SUM([Cy Ride])) THEN 'Middle'
ELSE 'Bottom' END

Essentially we are taking a percent of the total number of riders (in the first line 50% of the total) and comparing that to the running sum, so the color changed when that threshold is met. Here’s what the viz looks like:

Bus-2

You might notice I added a couple of other columns in there. That’s Jonathan Drummey’s fault. I have to admit I took far longer to work this out than I should have, because I didn’t do the step-by-step Jonathan Drummey process. He builds up calculations one small bit at a time so he is certain he knows what Tableau is returning for each part of the calculation. I was short-cutting it, (doing it in my head) and I kept trying to use .25 instead of the .75 in the above formula. It wasn’t until I broke the calc down into components and put them out in the viz that I realized my mistake. You can see that the running sum is, well increasing as it goes down the page (duh!). So to get the bottom 25% we have to trigger it when the running sum hits 75% of the total. [Click the image for clarity.]

Anyway, I hope you take away two things from this post: 1. There’s a lot of good work already out there, much of it in the Calculation Reference Library, which can be adapted to your particular need and 2. Jonathan’s process of building up calculations (especially table calculations) one step at a time, shouldn’t be skipped.

Here’s the original thread, which has the workbook these screenshots were taken from.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s