• Subscribe in Bloglines
  • Add to netvibes
  • Subscribe in NewsGator Online
  • Add to Google
  • Instant Cognition Feed for Yahoo!
  • Add to Microsoft Live
  • Get updates in your Inbox:



Instant Cognition Feed

Archive for the 'visualization' Category

creating bullet charts in excel

Friday, August 10th, 2007

Stephen Few's Bullet Chart

Stephen Few has long proposed his bullet chart as a replacement for gauge-type reports. Steve argues that bullet charts are more efficient (take up less space) and encode more and more accurate information than a gauge or speedometer type chart. I’m still unconvinced, especially since I’m not a fan of a visualization that requires retraining of the end consumer. I’d much rather use something that the end consumer ‘gets’ right away.

That being said, I haven’t written a visualization post in a while and just because I’m not convinced on bullet charts doesn’t mean that you aren’t and that you might not be looking for a way to create them in Excel. My version is a combo-chart using both a scatter plot and stacked horizontal bars so the order in which we add data and chart elements is important.

BTW, I am going to assume that you have a passing familiarity with Excel and its charting utilities so don’t be surprised when I outline the steps to take (insert a scatter plot) rather than give detailed instructions (select data in cells A1:B2, go to ‘Insert’, click on the Scatter Plot drop down and select the scatter line with no point markers).

Step 1: In this first step, we will add the data for the scatter plot and create the default scatter plot graph. In this example, we are using the scatter plot to set a goal line for the overall bullet chart.bulletchart_step1

So, when we create the initial scatter plot, you’ll see that the x-coordinates are ‘80′ - this is my target. In this case, I happen to be charting the performance of the Homepage Retention Rate (1-Abandonment Rate) and I am shooting for 80%. I happen to be using whole numbers for simplicity’s sake but it could just as easily be percentages. Don’t worry about the vertical alignment of the line just yet, we can adjust it - if needed - later when we are closer to the completed graph. Notice that I haven’t removed any of the chart junk yet either, that will happen in a later step.

 

Step 2: Now we need to add the data for the horizontal stacks and the stacks themselves. We will have 3 bars (Top, Middle, Bottom) with 4 sections each.bulletchart_step2

When working with combination charts, adding the new data points and chart types can be tricky. In the screen shot to the left, we’re looking at the completed process, I’ve excluded the intermediary step of what the data looks like when we first add it.

In order to add the stack data to chart we first select the data from all four of the stack columns and copy it. Then we select the chart and use ‘Paste Special’ to paste the data as ‘New Series’ with ‘Values ‘Y’ in columns’. This will paste our stack data in as new scatters. We then select each of the stack series and change it’s chart type to ’stacked bar’.

Ok, now that we’ve talked about how to add the stacked data to the chart, let’s talk about the data itself. In the top row, the first data point is ‘50′ - this is our floor (minimum acceptable value). The second data point for the top row is always zero. We need this second stack element for visual handling on the center stack (see the little green block in the center bar) when it is less than our minimum. Oh, I’m also not going into excruciating detail on the formulae used in the data set, they are fairly straight forward and you can take a look at them when you download the spreadsheet at the end of this post.

Step 3: This is easy, set the gap width on the bars to ‘0′ so that they are slammed up against each other.bulletchart_step3

 

 

 

 

 

Step 4: Time to take out the chart junk.bulletchart_step4

Remove the vertical axes, and the secondary horizontal axis.

Remove the horizontal lines.

Remove the legend.

Remove the plot edge lines

Set the scaling on the primary horizontal axis, set the axis-line to none as well as the major tick marks to none.

Step 5: Apply consistent and valuable color-coding to chart elements.bulletchart_step5

The blue stack becomes our representation of ‘Actual’ (e.g. the bullet). The background elements are encoded in shades of gray.

Up to 50% is nearly black - remember, this is our floor

From 50% to 75% is about 50% gray - this is our marginal performance section

And finally, from 75% to 100% is light gray (20%) for acceptable performance. I’ve also changed the target line to full black, it stands out nicely against both the light gray and the blue.

Step 6: Resize the chart.bulletchart_step6

So far, our chart hasn’t look much like a bullet chart has it? Well, that’s pretty simple to take care of. In this example, I’ve held the horizontal length the same but compressed the vertical height. Now we have something that very definitely is looking like a bullet chart.

Step 7: Add a text label and center align the chart to the text label.bulletchart_step7

Ok, so what I’m really doing is adding a text description, and the actual value of my metric in the two cells to the left of my bullet chart. Then, I’m massaging the bullet chart into alignment (hopefully) with the text. Notice that the value I’ve put in the label (86) doesn’t match the value in the chart, we’ll fix that in the next step. You might have also noticed that in this step, I removed the final piece of chart junk from my chart - the chart area border. Up until now, it was helpful to have the border for moving the chart around, but now that I have it correctly situated, I no longer need it.

Step 8: Synchronize label value to chart value.bulletchart_step8

Yep, you guessed, that’s a fancy way of saying set the first value of the center stack equal to the text label value (=D89). Now, I can enter a new value on the label, and the chart will update automatically.

Step 9: Obscure the chart data.bulletchart_step9

We had to jump through a lot of hoops to get to this graphic, but we probably don’t want the report consumer to be thinking about those hoops, so set the text color of your chart configuration data to ‘white’, effectively hiding it from the consumer.

Step 10: Rinse & Repeat

 Ok, so there’s my step-by-step on how to create Bullet charts.

What? You want the spreadsheet? Well, since you asked so nicely…

Download the Excel 2007 version or Download the Excel 2003 version

Suit Up

Wednesday, July 25th, 2007

Photo Credit: CBS Broadcasting, Inc.

That’s Barney’s (Neil Patrick Harris) infamous one-liner from CBS’s "How I Met Your Mother". It’s a favorite show of ours (the wife and I).

Now, Neil was recently nominated for an Emmy for his work as Barney on the show - he is the gooey, sticky center that holds it all together. So, after seeing the nom nod, my wife points me to "Barney’s Blog" thinking that I should have some fun with Barney’s statisticovisual (I love making up words) approach to dating.

WELL…

 

Not to be a downer or anything but my amusement was ruined by several factors.

Photo Credit: CBS Broadcasting, Inc.Take for instance, the first bar chart which is labeled "BARNEY STINSON’S HOOK UP PERCENTAGES"

So, the chart purports that Barney’s hookup percentage ranges from 200 to 500 percent, depending on the condition of not having or having Ted with him. Hmm, so when the big B is flying solo he hooks up 200% of the time? And when he has Ted with him the percentage goes up to 500%? Um, part of the Barney character is a near-pathological need to be right and this ain’t right. Also on this chart are Ted’s hook up percentages which range from 0% (without Barney) to what looks to be somewhere between 10 and 20 percent (with Barney). So, half of the chart is wrong and half is correct?

This chart is SO not awesome (as Barney might say).

Also, what’s with the lazy reliance on Excel 2003’s default chart settings? Barney is an over-the-top kind of guy and if he’s going toPhoto Credit: CBS Broadcasting, Inc. go the trouble of creating these charts and putting them up on foam cores then he is certainly going to send them to a graphic designer (at the least, hopefully he’d send them to a data visualist to really make them sing) to dress them up - I imagine Barney sitting there, screaming "Suit UP!" at his laptop.

Finally, in the last line chart, the y-axis is labeled "Level of Hotness", I’m pretty sure that Barney, if he existed, would have labeled that axis "Level of Awesomeness"

 Photo Credit: CBS Broadcasting, Inc.

Sheesh, CBS marketers, make sure to get into character before writing these posts.

google analytics and sparklines

Monday, May 14th, 2007

Update: Google Analytics Team, please don’t do this anymore. I just turned on the ‘compare to past’ feature and noticed that you are doubling up series on the sparklines. This is a big no-no as far as I am concerned because the sparkline with mutliple series on it just ends up looking like a squiggle. The data from the second series actually becomes chart junk!

If your a visual analysis wonk like me you probably couldn’t help but be excited with the new version of Google Analytics and the introduction of Sparklines. To me, it comes across as a major endorsement of sparklines as a business tool.

But, for this implementation to be a useful endorsement of sparklines, they have to show value as a tool and I don’t think the GA implementation is quite there - yet. Remember the new version of GA is in Beta so the team has lots of opportunity to improve.

From the Visitor Summary Report…

In this case, the shaded area is not useful, it’s distracting. The reason for this is relatively simple - it’s shading the area under the curve (plotted line) in the case of a sparkline, the shaded area should code more or different information. I’d really like so see this shaded area represent some UCL and LCL data, for instance the standard deviation (plus or minus) from the mean of all values.

The value currently associated with the sparklines are the total or total average for the period represented when they should be the current (or most recent) value.

I’d like to see some data markers on the chart - current value, high water mark and low water mark. Of course, the markers should be color coded to their value on the right. So, for instance the most current data point marker could be red and the current value would also be in red. The high and low water marks would be different colors and their associated text would be those colors to tie them together.

If that’s as clear as mud, here’s a quick example:

Notice that my example is a bit longer, this is because Tufte recommends shooting for a 45 degree angle on the line slopes (it eases understanding)

  1. The gray area represents the ‘normal’ distribution of visitors - the "biorhythm" as Avinash would say and it is the mean +/- 1 Standard Deviation
  2. Both the oldest point (starting point) and newest point (end point) are marked in red
  3. The watermarks (high and low) are marked in green
  4. For each of the data points that are marked (except for the starting point) an associated value is presented to the right and is color coded so that associations between the chart and the values are easily made

Finally, I’d like to see the sparklines be more contextual to the report or dashboard that they are in. For instance, I’m not quite sure why total page views is reported in the Visitor report. Also, is that average page views per visit or page views per visitor? I’d rather see average visits per visitor and average usage days per visitor (daily visitors divided by absolute unique visitors) than some of these other metrics.

What do you think of the Google Analytics sparklines?

new google analytics visualizations

Friday, May 11th, 2007

First of all, thank you to Avinash and the rest of the Google Analytics team for getting me into the new version of G.A. while I was at Emetrics. Avinash, in particular seemed interested in my comments on the new visualizations, so without further ado, here is some feedback for the team.

I’d really love to start off by talking about the addition of sparklines, but I just can’t because I am too distracted by the big trend chart.

Here’s what that looks like:

Let’s take a quick moment to compare that to a Google Finance chart:

Notice that they are both ‘web 2.0-ey’ and ‘ajaxey’. However, the GA chart seems to have gone a little over the top with the very thick trend line and large diameter data point markers.

It’s worth noting that the Google Finance chart offers multiple ways to traverse (interact) with this data. I can both traverse and zoom (or unzoom) the chart at the bottom as well as zoom the chart from the top-left text links. In the GA chart my only option for interacting with the data is through the pull down menus at the top right (one for date(s) and one for dimension which are kind of kludgey if you ask me - they don’t respond -hhmmm- smoothly and they rearrange the page whereas the controls in the GF charts don’t require the page to be reordered and react both quickly and smoothly.

None of these are my biggest criticism, the simple fact is that the GA chart with the heavy line and large radius markers makes me feel like it’s a kid’s chart, that is a chart designed for a kid.

I find it very odd that the GA Dashboard chart is liquid - that is it stretches to fit the browser width. This changes the horizontal scale of the chart and can either obscure important trend details or create artifacts in the visualization - a fixed width chart that is appropriately scaled would be better.

Maybe it’s just me but I actually feel patronized by the Google Analytics chart like I’m so clueless that I wouldn’t notice the trends or the data points if they weren’t pessimistically obvious. I *think* that I actually like the roll-over data labels because it puts the detailed data into the chart as opposed to how the GF chart updates the information way up in the right hand corner is you hover over the map. But, again the heavy black border just screams ‘MAKE SURE YOU LOOK AT ME AND IF YOU DON’T THEN YOU DON’T KNOW WHAT’S GOING ON HERE!’.

It’s not that I’m totally in love with the GF charts, as noted above I’d really like to see the detail data down next to the chart in a data label and while I theoretically like the idea of marking key news events on the chart, the implementation can get quite messy and distracting. Unfortunately I don’t have a good solution in mind yet but I’d really like to see it cleaned up.

For the GA chart, I’d recommend:

  1. Don’t talk down to the user, move towards a more elegant presentation like that of Google Finance
  2. Build in interactive controls like the ones used in GF
  3. Make the chart fixed-width, in this case liquidity is bad
  4. Potentially allow for multiple dimensions to be charted (within reason)

Ok, that’s enough about the dashboard chart, sparklines will have to wait for another day.

By the way, overall I like the redesign of GA - for me it is generally more intuitive to navigate and use.

I’d love to hear what you think…

Stephen Few’s Visual Business Intelligence Workshop

Tuesday, April 10th, 2007

Stephen Few, Principal at Perceptual Edge is taking his data visualization courses public! In the past, Stephen has offered his courses to companies and institutions that could scare up enough people to fill the class but now he’s bringing his expertise to the masses.

The workshop will be held in San Francisco from June 5, 2007 - June 7, 2007. Each day dedicated to a different aspect of data visualization.

Day 1: Table and Graph Design for Effective Communication

Day 2: Dashboard Design for at-a-Glance Monitoring

Day 3: Visual Data Analysis for Discovery and Understanding

Download this PDF for a full workshop description and pricing information.

I’ve personally been excited about this workshop since I saw the announcement on Stephen’s blog a few days back and am figuring out how to get there.

If you sign up for all three courses, you’ll receive a 10% discount.

Finally, if you’re a Web Analytics Association member, you can get an additional 10% - look for information in the next newsletter or keep tabs on the Discounts page at the WAA website (members only).

Stephen Few’s 2007 West Coast Business Intelligence Workshop