Tinkering with Bars in Trees

This week I was tinkering with creating a Table Of Content for a client. But I wanted something that would be more informative than just your standard TOC, which have always seemed to be a bit of a waste to me. The client needed an overview of departments for the high-level folks, and then a drill-down to the individual departments that would also be visible to department managers. I thought if I could combine the TOC and the overview viz, I would have the proverbial two birds with one stone scenario.

While I’ve never had much use for Tree Maps, I thought this viz type might serve as a good scaffold for a TOC — at the very least I could color-encode some relevant measure. But it turns out that by using an old trick of Andy Cotgreave I was able to get all the required overview data into the TOC by putting the bar charts inside the tree map. Here’s a remake of the viz using Super Store data:
TT-1

Spoiler Alert: You’ll notice the Brain Teaser title. I post this as an Educational Brain Teaser on  the forums earlier this week. If you want to take the challenge, go there before reading on. The rest of this post provides step-by-step instructions on how to create this viz…

First, there are three challenges involved 1) Getting the bar charts inside the tree map 2) Getting the bars & text alignment worked out, and 3) Sizing the bars and the tree map boxes. So let’s start.

1. Open a workbook and connect to Super Store data. (Your data will probably have different dates than my example because I shifted the dates so I can answer forum questions that need the TODAY() function to be relevant.)

2. Create a calculation titled MIN 1 and put this in the calc: MIN(1).

3. Now using MIN 1 and Sub-Category create a Tree Map. While the boxes will be different shapes, they all have the same area. If they didn’t then the smaller ones would end up hiding the labels, which we don’t want.

EDIT: One thing I forgot to mention is you need to filter the Sub-Category down to 5-10 members. This technique falls apart if you have too many members. My client had ten departments on a 1500×800 dashboard and it just fit. Any more and things get wonkie.

4. Now drag Profit Ratio onto the color shelf (replacing MIN 1). Also drag it onto the Label shelf. You should have something similar to this:

TT-2

5. Now we need to create several calculations to use for our bars:

A. 2013: IF YEAR([Order Date])=2013 THEN [Sales] END

Repeat this for 2014 & 2015 or 2010-12 for your data.

B. Three Years: IF YEAR([Order Date])>=2013 AND YEAR([Order Date])<=2015 THEN [Sales] END

6. Now comes the Cotgreave magic. Create this calc named TT 2013 (or similar):

LEFT(“█████████████████████████████████████████████████████████████████”
,ROUND((SUM( [2013])/SUM([Three Years]))*25,0))

Note that in the original formula, which is written up in this KB article, Andy used 100 instead of 25. I had to change it to get the bars to fit in the boxes, but the ratio is the same so the bar lengths will still be relevant to each other. FYI: The black bar is made up of 100 ASCII character 166, in the Arial Unicode MS font.

Repeat this formula for the other two years.

7. Drag all of three TT calcs and all three year calcs onto the label shelf. You should now have something that looks like this:

TT-3

So now we’ve got the bars and all the rest of the info in the Tree Map labels. All that’s left is to format and align the labels. This next bit was a revelation for me. I hadn’t really played much with the difference between the alignment setting for the entire label and the alignment settings within the Label editor. I hadn’t realized how they interact, and allow us to get a label just right.

8. Click the Label shelf and change the Alignment setting from Automatic to Middle Center. It should look like this:

TT-4

9. Next open the Label Editor, format, reorganize & color the bars so you get something like this:

TT-5

Note that the Sub-Category and Profit Ratio are centered, while the bars and their labels are left justified. If we didn’t left justify the bars they wouldn’t line up at an imaginary zero line and wouldn’t really work. The other wrinkle isn’t that obvious in the screenshot, but to stop the bars from forming a block like the screenshots above, we need to set the Bar font size to something smaller than the text before and after the bars. Mine are set to 12 for the text and 10 for the bars.

That’s it, except I wanted to mention one of the advantages of using a Tree Map as a TOC is the mark is the entire box, which allows the user to click anywhere in the box to trigger a dashboard action, instead of having to click a small arrow.

Anyway, let me know if you use this technique, or have improvements on it. Happy tinkering.

Tinkering with Shapes & Tree Maps

Written by guest contributor Simon Runc….

Following a bit of time spent on the community, and a bit spent Tinkering in Tableau I came across a rather nifty trick, and then an even niftier use!…So, I thought I’d put a quick Viz together to share with the wider ‘Tableau’ world.

The basic trick is put shapes directly into a calculated field, taking advantage of the fact you can use any ASCII character as the string return of a calculation…and many ASCII characters are things like Arrows/Stars…etc.

The basic trick is shown below
CASE [Status]

WHEN ‘Green’ THEN ‘▲’

WHEN ‘Red’ THEN ‘▼’

WHEN ‘Yellow’ THEN ‘►’

END

…It’s that simple.

This also has the knock on advantage of not messing up your Shape assignment when switching data sources (which seems to occasionally happen)…the only disadvantage is that you only have access to ASCII characters and not any image (as you can by importing custom shapes). Speaking of which…check this place out http://www.flaticon.com/

I was then asked a question about putting shapes in Tree-maps…and the 2 events were close enough together for the grey matter to put 2 and 2 together and get 4!

Simon 4

In this example, I’ve taken this one step further by employing a second trick to Colour the shapes depending if they are Up or Down (the measure is the Year on Year performance).

For those who haven’t come across this (and can be used in ToolTips too)….

  • You need to create 2 fields (assuming you just have 2 colours to display, but in theory you can have as many as you want). One for when the result is positive, and one for when it is negative
  • In my example here I’ve created ‘KPI Up’ and ‘KPI Down’

‘KPI Up’: IF [YoY Sales 2013 vs 2012] >=0 THEN ‘▲’ END

‘KPI Down’: IF [YoY Sales 2013 vs 2012]

Notice that there is no ELSE statement, which means if the condition isn’t met it equates to NULL, and Tableau doesn’t (…well can’t) Plot NULLs

Handy Tip: Tableau not plotting [or counting] NULLs is a very useful thing to know, and one I exploit very frequently

  • We can then drag these 2 fields on to the Label Tile, and arrange as follows

Simon 2

As you can see I’ve taken this one step further and created 2 fields (in the same way) for the actual results too.

You can access the workbook (8.3) on Tableau Public

Or from the Community

http://community.tableau.com/thread/172976

I can see this technique could have many other applications, and from a Visualisation point of view it gives the viewer one extra piece of information, without compromising the goal of the Viz. For example in my example, I now also can see (as well as the scale and profitability of the various categories) the Year on Year movement.

Happy adding shapes to everything!!

Tinkering With LOD Expressions & Percent of Total

If you are anything like me, you skim a lot of blogs about Tableau, note the gist of the post and leave reading the details for later. This morning I finally got to ‘later’ for Bora Beran’s excellent introduction to LOD Expressions, and I’m glad I did.

This image he posted caught my eye, stuck in my head, and really helped me get a visual understanding of what LODs are doing especially when it comes to Percent of Total calculations.

Filter order

This is the order of execution of different types of filters, and where different types of LOD expressions fall into this order. Before we get to the percent of total stuff I want to note the caveat Bora mentioned that brought all this home for me:

Fixed calculations happen before dimension filters, as a result unless you promote the pills in your filter shelf to “Context Filters” they will be ignored.

Interesting. So I set up a workbook to test this. It used a FIXED level of detail calculation, something like this:

{FIXED : SUM([Sales])}

Then I created a map and filtered the viz to a map of three States. Since State is a dimension, and is filtering after the FIXED LOD, the returned value didn’t change when I added this filter. But then when I added context to the State filter, the filter jumped in front of the FIXED LOD and therefore changed the value being returned. Remove the context, and the filter becomes a Dimension filter again, and gets executed after the FIXED LOD expression.

Context

Cool, right? I got to thinking that this is really what we need to do when calculating percent of total: have the calc consider a dimension sometimes, but then ignore it at other times. However we don’t currently have a way of allowing the end-user to toggle Context on and off. But do we really need that? Or can we just move down a level and setup calculations with some parts happening before the Dimension Filters and some happening after.

Of course reading further in Bora’s blog we get to exactly that same point. He writes:

Imagine writing the following calculation

SUM([Sales]) / ATTR({SUM([[Sales])})

and having the dimension [State] in your Sheet. With this setup, this calculation will give you ratio of a State’s sales to total Sales. You can add a filter to your Sheet to hide some of the States but filter will only affect the numerator. Since denominator is fixed, it will still give the sum for all States.

Rather than imagining I went ahead and set up that exact viz, and it worked as advertised. A percent of total, without a table calculation in sight! Now that is cool. But then I started thinking, wouldn’t it be even better if we could toggle between showing a percent of total for all states, and a percent of total for only visible state? After a bit of tinkering I worked out a solution, which tickled my fancy enough that I made it into an Educational Brain Teaser I posted on the forums. Go over there, and see if you can work it out, before reading further….

Spoiler Alert! Answer to the brain teaser is posted below….

Here’s the Mystery Calculation I used to create the toggle between percent of all and percent of showing (Select Percent Total is a parameter):

IF [Select Percent Total]=’Percent of All’ THEN
SUM([Sales])/ATTR({SUM([Sales])})
ELSE SUM([Sales])/ SUM({EXCLUDE [State] : SUM([Sales])} )
END

[NOTE: Learn this now, save some hair! Writing: {SUM([Sales])} is the same as writing {FIXED : SUM([Sales])} They return the same result. The first is a shortcut for the other.]

In the above calculation we are using two different ways of getting a percent of total. In the first: SUM([Sales])/ATTR({SUM([Sales])})  we use a FIXED LOD to get the denominator before the State filter takes effect. So we are dividing the total of all states into the sum of each filtered state that is returned. In the second half of the IF statement we are doing all our calculations after the dimension filter has been applied. But we still need to get a TOTAL() denominator to divide into the SUM([Sales]) numerator. To do this we use one of the other type of LODs: EXCLUDE. This tells Tableau to ignore the State field when calculating the SUM([Sales]), so we get a total of all states in the view (after the Dimension filter is applied).

Just study Bora’s order of filtering chart, work out in your head what gets totalled when, and you’ll master the new LOD percent of totals in no time.

Tinkering With Binning LOD Expressions

Like I said in one of my first tinkering blogs, a lot of what I learn is from tinkering around with some question I find on the Tableau Forums. I found this question today: I want to get count based on number of counts. To start with Ashish wanted to figure out how many medical tests each patient had been given. But as with many forum questions, it turned out to be a bit more complicated than that. What she really wanted to figure out was what percent of patients are given 1,2,3,4… tests as compared to the whole. So the assignment is:

  1. Count number of tests for each patient
  2. Bin the patients by total number of tests they’ve been given
  3. Calculate the percent of the total each bin contributes to the whole

In 8.2 something like this is more work than it’s worth (and there’s always that extra dimension that gets mentioned at the very end that mucks everything up). But with the new LOD Expression this becomes quite simple, and even a bit enjoyable. So let’s work through it.

First I want a count of the number of tests per patient. Since I only want the Patient ID to be considered when making the count, I know I’m going to use a FIXED LOD expression. So I created this calc:

{FIXED [Patient ID] : COUNTD([Test LIS Code])}

Now here’s the new bit I learned today: FIXED (and only FIXED) LOD Expressions can be binned! That’s right, we can now bin an aggregated dimension. So we can first aggregate (count) the LIS Codes, and then we can bin them into a bin size of our choice. How cool is that? How did I learn this? As with so many things, I right-clicked a field and looked to see what Tableau was going to let me do. And there was the Create Bins… option (well the Bins…  option in the new Create flyout menu).

Just to finish up, after putting the bins out onto the Column shelf (and adding a parameter to control the size) I just added a quick table calc for percent of total to the LOD calc. Anyway the workbook is posted here.

I hope you’ll enjoy your own tinkering with binned LOD expressions.

Tinkering With LOD Expressions & Distance

Calculating distances on a map just got a whole lot easier, and more flexible, with the introduction of Level Of Detail Expressions in 9.0. Here’s the distance formula Tableau recommends using to calculate the distance between two point on a map when you know the Latitude and Longitude for both points:

3959 * ACOS
(
SIN(RADIANS([Lat])) * SIN(RADIANS([Lat2])) +
COS(RADIANS([Lat])) * COS(RADIANS([Lat2])) * COS(RADIANS([Long2]) – RADIANS([Long]))
)

[Note: this is for miles. If you want kilometers use 6371 instead.]

This formula assumes there is a single record for each pair you want to calculate the distance for. What if you have lots of records for a single pair? Or even many records for many pairs? Then we need to aggregate the calc like this:

3959 * ACOS
(
SIN(RADIANS(MIN([Lat]))) * SIN(RADIANS(MIN([Lat2]))) +
COS(RADIANS(MIN([Lat]))) * COS(RADIANS(MIN([Lat2]))) * COS(RADIANS(MIN([Long2])) – RADIANS(MIN([Long])))
)

Using a MIN() aggregation gives us the single value we need to properly calculate the distance. That works fine, but now we’ve used up our aggregation. Before 9.0 we couldn’t aggregate an aggregate (without getting into the complexities of table calculation), so for instance if we want to wrap our distance calc in a MEDIAN() function to get the median distance to a single point from many points, we would get the ‘aggregate an aggregate’ error. One aspect of the new LOD Expressions I find interesting is that they allow us to aggregate an aggregate without error, and do it at the specific level we want.

Let’s move onto a more concrete example. I had three hospitals, each with hundreds of patients coming from several dozen ZIP codes. Here’s the final formula I used to do the distance calculations:

3959 * ACOS
(
SIN(RADIANS([LOD Lat Patient])) * SIN(RADIANS([LOD Lat Hospital])) +
COS(RADIANS([LOD Lat Patient])) * COS(RADIANS([LOD Lat Hospital])) *
COS(RADIANS([LOD Lon Hospital]) – RADIANS([LOD Lon Patient]))
)

So that’s where we are headed. But before we can write this calc we need to write the four other calcs, like this:

LOD Lat Patient:     {FIXED   [Patient Zip] : MIN( [Latitude Patient]) }

LOD Lon Patient:    {FIXED   [Patient Zip] : MIN( [Longitude Patient]) }

LOD Lat Hospital:    {FIXED [Hospital City] : MIN( [Latitude Hospital]) }

LOD Lon Hospital:    {FIXED [Hospital City] : MIN( [Longitude Hospital]) }

Fairly simple, right? So what’s going on? First note that I’m using FIXED LODs. This means the expression will be calculated using only the dimension in the list, in this case either Patient Zip or Hospital City. Since the expressions are all MIN() we know that each expression will return a single value for each Patient Zip and Hospital City respectively.

To see this in action go to Healthcorp on Public. If you download and open the workbook, you’ll see how I was able to show the distance from each ZIP to each area hospital, as well as then calculate the average and median distance travelled to each hospital by using an aggregation of an aggregation.

I’m really lovin’ these LOD Expressions; they seem to work much closer to the way my mind works, certainly more so than table calculations.

Tinkering with Data Refresh & Auto-Refresh

This week I was tinkering with ways of auto-refreshing data on an unattended computer and discovered how different Desktop and Server are in regard to refreshing data. Let’s start with user invoked refreshes, before getting into auto-refreshes.

Desktop Data Refresh

There are several ways to refresh your data in Desktop. You can right-click the data connection and click Refresh, or use the menus: Data\(name of connection)\Refresh. Or the easiest way is a F5 keystroke, which is what most of us probably use.

If you are using a data Extract then the refresh is a bit different, as F5 only refreshes the view, not the Extract. To refresh the Extract right-click the data connection, but instead of clicking refresh, go down and click Extract and then click the Refresh option in the flyout menu. Use the same process if you like using the menus. There is no key-combination (that I know) to invoke an Extract refresh.

Server Data Refresh

In Server things are different. F5 (or ctrl-F5 or shift-F5) does NOT refresh your data. It only refreshes your browser window. This does not issue a new query to your database, or refresh the data after a data extract has been refreshed; at least not by itself.

[Note: if after you first load the view into your browser, you edit the URL by adding  ?:refresh=yes then a F5 keystroke will refresh both your browser view and the data connection at the same time. (Make sure to delete the #1 at the end of your URL before adding the refresh parameter or it won’t work.)]

The most common way to refresh your data in Server is to click the refresh icon at the top of the view:

Refresh Data

Again, remember clicking your browser reload icon will NOT refresh your data (unless you edited your URL). Only clicking the pictured icon actually refreshes your data.

So these are the ways a user can manually refresh data in Desktop and Server. Now let’s move onto auto-refreshing data.

Auto-Refresh Data

Just to make sure there isn’t any confusion, when I say ‘Auto Refresh’ I mean refreshing the data and/or view without a user being involved. Even with a live connection, the data and/or view has to be refreshed, or the view on screen will remain unchanged. This is true for both desktop and in Server. Tableau is not natively a live streaming software.

So since Tableau isn’t setup to auto-refresh we will need to write some code or script to do it ourselves. The good news is below you will find examples of code you can modify, so you don’t have to start from scratch.

First, I want to give a shout out to Hugh Nguyen and Bill Lyons who each wrote some of the code below. Here’s our forum discussion of Auto Refreshing.

Desktop Auto-Refresh

If you are not publishing to Server, and instead you are opening a viz in Desktop presentation mode and simply want to refresh the view/data every five minutes then here’s the easiest/simplest script I found.

#NoEnv ; Recommended for  future AutoHotkey releases.
; #Warn ; Enable warnings to assist with detecting common errors.
SendMode Input ; Recommended for new scripts 
SetWorkingDir %A_ScriptDir% ; Ensures a consistent starting directory.

#Persistent
SetTimer, PressTheKey, 300000
Return

PressTheKey:
Send, {F5}
Return

It does require you to download and install a free open source utility called AutoHotKeys.

[An added advantage of installing this utility is that Nelson Davis wrote an AutoHotKeys script that will autosave your Tableau work every x minutes! Something Tableau still does not do. Here’s his blog on Autosave. Good explain of using AHK toward the end.]

You also could do this with a batch file script, but that’s a bit more complicated so I’ll leave that for another day.

Server Auto-Refresh

In Server we can use the Tableau JavaScript API to do the refreshing for us. (Note: JavaScript API does not work for Desktop.) Here’s the script Hugh put together to refresh data every 3 seconds (3,000 milliseconds).

<!DOCTYPE html>
<html lang=”en”>
<head>
<title>Tableau JavaScript API</title>
<script type=”text/javascript” src=”https://online.tableausoftware.com/javascripts/api/tableau_v8.js“></script>
</head>
<body>
<div id=”tableauViz”></div>

<script type=’text/javascript’>
var placeholderDiv = document.getElementById(“tableauViz”);
var url = “https://online.tableausoftware.com/t/shawnwallworktableauonline/views/AutoRefreshTest/Dashboard1“;
var options = {
hideTabs: true,
width: “100%”,
height: “1000px”
};
var viz = new tableauSoftware.Viz(placeholderDiv, url, options);

setInterval(function () {viz.refreshDataAsync() }, 3000);

</script>
</body>

Replace the parts in bold with your own specifics and you’re good to go. Also if you change that third line from the bottom to 300,000 it will refresh every 5 minutes.

As long as you have a fairly quick data connection this is the best solution. However, if you have a slow data connection like I was dealing with, then Bill came up with a clever idea for dealing with this situation. He reasoned that if we pre-load several tabs with our views and then use an AutoHotKey script to cycle through the tabs, then the slow loading pages would load before they reached the view position. And he was right. I ended up using a modified version of his script to cycle through 16 tabs, and most of the time the pages do load prior to being cycled into view (yes this particular database is a real dog).

The added benefit of this approach is that you can rotate through several different dashboards/views. Here’s Bill’s latest version of his script. Just make sure to add the ?:refresh=yes parameter to the end of your URLs in the Excel file or your data won’t get refreshed, just your views will be refreshed.

EDIT: And for an extra bonus, I found this post by Tableau’s Jordan Bunker Rotating Dashboard Display He uses Javascript to rotate through dashboards listed in a csv file, similar to what Bill did in his AHK script. It doesn’t look like Jordan is refreshing the data, just rotating the views.

Tinkering with Johan’s Unified Dates

Johan’s Unified Dates

Before the holidays Johan Sivertsen (Kettan) posted an interesting workbook on our Tableau forums as an answer to the often ask year-over-year comparison question. His solution was both elegantly simple, and complex — at the same time!

Unified Dates Orig

Here’s his original workbook. This is the formula he’s using:

DATE(DATEADD(‘year’, 2000 – YEAR([Order Date]), [Order Date]))

So let’s unpack this. First he’s using the DATE() function to convert the results from a Date-Time to just a Date. That’s pretty straight forward. (Also something I’d never thought about doing before, so this old dog thanks you for that new trick Johan.) But what’s going on with that DATEADD() function? Here’s how this function works:

DATEADD(date_part,increment,date)

In the Superstore sample dataset we all use, there are 4 years of data. By subtracting the YEAR() of each of these years from a constant, Johan was able to ‘unify’ all the years; essentially converting them all to a single constant year, in this case the year 2000. Since we are going to mask the year of the axis anyway, it really doesn’t matter what constant you use; except that it has to be between 401 – 9,999. Why? I have no clue; just another quirky little Tableau thing that’s fun to know. 🙂

Once we get that ‘unified’ year axis setup (and format it as MMM D or mmm), then it’s just a matter of dragging the original Order Date out onto the color/label shelves as a year part. The nice thing about this solution is it avoids a lot of date math and conditional calculations, and can be designed to be very flexible.

Very nice stuff Johan, thanks again for this!

Adding Fiscal Years

The original questioner wanted to have the year-over-year comparison start at any month of the year, essentially creating a dynamic user-defined fiscal year comparison. This is where I focused my tinkering — starting with Johan’s original workbook of course. Changing from Johan’s running sum viz to a straight SUM(Sales), here’s the viz I came up with:

Unified Dates Orig FY

Here’s where you can find the workbook.

You’ll notice that I’m using a parameter to let users select the fiscal year start month. If you Edit the Fiscal Year Start parameter you’ll see this:

FY Parameter

A couple of things to note. I’m using an Integer data type and then aliasing it with string values. Computers love numbers and hate strings, so integers execute much faster. Also by using numbers I can do the DATEADD() calcs I need much more directly. It took a bit of trial and error to figure out that -11 would give me a fiscal year starting in December, but that’s the true fun of tinkering around with Tableau, and of the forums in general for that matter. (FYI: Tableau cured my Sudoku addiction!)

After getting the parameter values worked out it’s just a simple matter of doing some date math to get it all working together:

Code

Also check out how the Year Labels calculation works. A common answer to questions on our forums is to separate the calculation that produces the marks, from the calc that labels the marks. This is a technique I use all the time. And points out the wonderful flexibility of our favorite software!

Also make sure you follow the additional links Kettan posted; they lead to some great solutions to other X-over-X issues.