• Subscribe in Bloglines
  • Add to netvibes
  • Subscribe in NewsGator Online
  • Add to Google
  • Instant Cognition Feed for Yahoo!
  • Add to Microsoft Live
  • Get updates in your Inbox:



Instant Cognition Feed

Archive for the 'graphs' Category

creating bullet charts in excel

Friday, August 10th, 2007

Stephen Few's Bullet Chart

Stephen Few has long proposed his bullet chart as a replacement for gauge-type reports. Steve argues that bullet charts are more efficient (take up less space) and encode more and more accurate information than a gauge or speedometer type chart. I’m still unconvinced, especially since I’m not a fan of a visualization that requires retraining of the end consumer. I’d much rather use something that the end consumer ‘gets’ right away.

That being said, I haven’t written a visualization post in a while and just because I’m not convinced on bullet charts doesn’t mean that you aren’t and that you might not be looking for a way to create them in Excel. My version is a combo-chart using both a scatter plot and stacked horizontal bars so the order in which we add data and chart elements is important.

BTW, I am going to assume that you have a passing familiarity with Excel and its charting utilities so don’t be surprised when I outline the steps to take (insert a scatter plot) rather than give detailed instructions (select data in cells A1:B2, go to ‘Insert’, click on the Scatter Plot drop down and select the scatter line with no point markers).

Step 1: In this first step, we will add the data for the scatter plot and create the default scatter plot graph. In this example, we are using the scatter plot to set a goal line for the overall bullet chart.bulletchart_step1

So, when we create the initial scatter plot, you’ll see that the x-coordinates are ‘80′ - this is my target. In this case, I happen to be charting the performance of the Homepage Retention Rate (1-Abandonment Rate) and I am shooting for 80%. I happen to be using whole numbers for simplicity’s sake but it could just as easily be percentages. Don’t worry about the vertical alignment of the line just yet, we can adjust it - if needed - later when we are closer to the completed graph. Notice that I haven’t removed any of the chart junk yet either, that will happen in a later step.

 

Step 2: Now we need to add the data for the horizontal stacks and the stacks themselves. We will have 3 bars (Top, Middle, Bottom) with 4 sections each.bulletchart_step2

When working with combination charts, adding the new data points and chart types can be tricky. In the screen shot to the left, we’re looking at the completed process, I’ve excluded the intermediary step of what the data looks like when we first add it.

In order to add the stack data to chart we first select the data from all four of the stack columns and copy it. Then we select the chart and use ‘Paste Special’ to paste the data as ‘New Series’ with ‘Values ‘Y’ in columns’. This will paste our stack data in as new scatters. We then select each of the stack series and change it’s chart type to ’stacked bar’.

Ok, now that we’ve talked about how to add the stacked data to the chart, let’s talk about the data itself. In the top row, the first data point is ‘50′ - this is our floor (minimum acceptable value). The second data point for the top row is always zero. We need this second stack element for visual handling on the center stack (see the little green block in the center bar) when it is less than our minimum. Oh, I’m also not going into excruciating detail on the formulae used in the data set, they are fairly straight forward and you can take a look at them when you download the spreadsheet at the end of this post.

Step 3: This is easy, set the gap width on the bars to ‘0′ so that they are slammed up against each other.bulletchart_step3

 

 

 

 

 

Step 4: Time to take out the chart junk.bulletchart_step4

Remove the vertical axes, and the secondary horizontal axis.

Remove the horizontal lines.

Remove the legend.

Remove the plot edge lines

Set the scaling on the primary horizontal axis, set the axis-line to none as well as the major tick marks to none.

Step 5: Apply consistent and valuable color-coding to chart elements.bulletchart_step5

The blue stack becomes our representation of ‘Actual’ (e.g. the bullet). The background elements are encoded in shades of gray.

Up to 50% is nearly black - remember, this is our floor

From 50% to 75% is about 50% gray - this is our marginal performance section

And finally, from 75% to 100% is light gray (20%) for acceptable performance. I’ve also changed the target line to full black, it stands out nicely against both the light gray and the blue.

Step 6: Resize the chart.bulletchart_step6

So far, our chart hasn’t look much like a bullet chart has it? Well, that’s pretty simple to take care of. In this example, I’ve held the horizontal length the same but compressed the vertical height. Now we have something that very definitely is looking like a bullet chart.

Step 7: Add a text label and center align the chart to the text label.bulletchart_step7

Ok, so what I’m really doing is adding a text description, and the actual value of my metric in the two cells to the left of my bullet chart. Then, I’m massaging the bullet chart into alignment (hopefully) with the text. Notice that the value I’ve put in the label (86) doesn’t match the value in the chart, we’ll fix that in the next step. You might have also noticed that in this step, I removed the final piece of chart junk from my chart - the chart area border. Up until now, it was helpful to have the border for moving the chart around, but now that I have it correctly situated, I no longer need it.

Step 8: Synchronize label value to chart value.bulletchart_step8

Yep, you guessed, that’s a fancy way of saying set the first value of the center stack equal to the text label value (=D89). Now, I can enter a new value on the label, and the chart will update automatically.

Step 9: Obscure the chart data.bulletchart_step9

We had to jump through a lot of hoops to get to this graphic, but we probably don’t want the report consumer to be thinking about those hoops, so set the text color of your chart configuration data to ‘white’, effectively hiding it from the consumer.

Step 10: Rinse & Repeat

 Ok, so there’s my step-by-step on how to create Bullet charts.

What? You want the spreadsheet? Well, since you asked so nicely…

Download the Excel 2007 version or Download the Excel 2003 version

Suit Up

Wednesday, July 25th, 2007

Photo Credit: CBS Broadcasting, Inc.

That’s Barney’s (Neil Patrick Harris) infamous one-liner from CBS’s "How I Met Your Mother". It’s a favorite show of ours (the wife and I).

Now, Neil was recently nominated for an Emmy for his work as Barney on the show - he is the gooey, sticky center that holds it all together. So, after seeing the nom nod, my wife points me to "Barney’s Blog" thinking that I should have some fun with Barney’s statisticovisual (I love making up words) approach to dating.

WELL…

 

Not to be a downer or anything but my amusement was ruined by several factors.

Photo Credit: CBS Broadcasting, Inc.Take for instance, the first bar chart which is labeled "BARNEY STINSON’S HOOK UP PERCENTAGES"

So, the chart purports that Barney’s hookup percentage ranges from 200 to 500 percent, depending on the condition of not having or having Ted with him. Hmm, so when the big B is flying solo he hooks up 200% of the time? And when he has Ted with him the percentage goes up to 500%? Um, part of the Barney character is a near-pathological need to be right and this ain’t right. Also on this chart are Ted’s hook up percentages which range from 0% (without Barney) to what looks to be somewhere between 10 and 20 percent (with Barney). So, half of the chart is wrong and half is correct?

This chart is SO not awesome (as Barney might say).

Also, what’s with the lazy reliance on Excel 2003’s default chart settings? Barney is an over-the-top kind of guy and if he’s going toPhoto Credit: CBS Broadcasting, Inc. go the trouble of creating these charts and putting them up on foam cores then he is certainly going to send them to a graphic designer (at the least, hopefully he’d send them to a data visualist to really make them sing) to dress them up - I imagine Barney sitting there, screaming "Suit UP!" at his laptop.

Finally, in the last line chart, the y-axis is labeled "Level of Hotness", I’m pretty sure that Barney, if he existed, would have labeled that axis "Level of Awesomeness"

 Photo Credit: CBS Broadcasting, Inc.

Sheesh, CBS marketers, make sure to get into character before writing these posts.

google analytics and sparklines

Monday, May 14th, 2007

Update: Google Analytics Team, please don’t do this anymore. I just turned on the ‘compare to past’ feature and noticed that you are doubling up series on the sparklines. This is a big no-no as far as I am concerned because the sparkline with mutliple series on it just ends up looking like a squiggle. The data from the second series actually becomes chart junk!

If your a visual analysis wonk like me you probably couldn’t help but be excited with the new version of Google Analytics and the introduction of Sparklines. To me, it comes across as a major endorsement of sparklines as a business tool.

But, for this implementation to be a useful endorsement of sparklines, they have to show value as a tool and I don’t think the GA implementation is quite there - yet. Remember the new version of GA is in Beta so the team has lots of opportunity to improve.

From the Visitor Summary Report…

In this case, the shaded area is not useful, it’s distracting. The reason for this is relatively simple - it’s shading the area under the curve (plotted line) in the case of a sparkline, the shaded area should code more or different information. I’d really like so see this shaded area represent some UCL and LCL data, for instance the standard deviation (plus or minus) from the mean of all values.

The value currently associated with the sparklines are the total or total average for the period represented when they should be the current (or most recent) value.

I’d like to see some data markers on the chart - current value, high water mark and low water mark. Of course, the markers should be color coded to their value on the right. So, for instance the most current data point marker could be red and the current value would also be in red. The high and low water marks would be different colors and their associated text would be those colors to tie them together.

If that’s as clear as mud, here’s a quick example:

Notice that my example is a bit longer, this is because Tufte recommends shooting for a 45 degree angle on the line slopes (it eases understanding)

  1. The gray area represents the ‘normal’ distribution of visitors - the "biorhythm" as Avinash would say and it is the mean +/- 1 Standard Deviation
  2. Both the oldest point (starting point) and newest point (end point) are marked in red
  3. The watermarks (high and low) are marked in green
  4. For each of the data points that are marked (except for the starting point) an associated value is presented to the right and is color coded so that associations between the chart and the values are easily made

Finally, I’d like to see the sparklines be more contextual to the report or dashboard that they are in. For instance, I’m not quite sure why total page views is reported in the Visitor report. Also, is that average page views per visit or page views per visitor? I’d rather see average visits per visitor and average usage days per visitor (daily visitors divided by absolute unique visitors) than some of these other metrics.

What do you think of the Google Analytics sparklines?

using constants in excel charts

Thursday, January 18th, 2007

Avinash posted today on “Leveraging Statistical Control Limits” in web analytics as a way to cut out the noise in your KPIs and therefore being able to focus on the true outliers - data points that are truly worthy and in need of your valuable brain power.

In his post, Avinash challenged his readers to provide an Excel solution since there are no OOTB (Out Of The Box) solutions for introducing constants, statistical or otherwise, into charts - Excel tries to box us into using just vertical and horizontal axes scale controls.

Well, I can’t pass up a challenge like that right?

What do you think of this?

So what’s going on here?

  1. Named Ranges - Named ranges are your friend. In this case, I’ve got named ranges for each important series of data, especially my key metric - page views per visit. That way, I can insert a new row of data and my constants (Standard Deviation, Mean, Upper Control Limit, and Lower Control Limit) will update automatically. BTW, the chart uses the named ranges as well so that it will update automatically every time a new row of data is added.
  2. The chart is a combination of lines and stacked bars. The stacked bars create the gray biorhythm (normal variation) area. Line charts are used for everything else.
  3. For the top 5 outliers (either upper or lower) I’m dynamically adding a data label so that I know, at a glance, the specific values for them.
  4. I’ve eliminated all horizontal lines other than my own constants because they are just distracting

You can download the spreadsheet and play with it if you want.

A couple of caveats:

  • Named ranges won’t do the dishes for you. In other words, when you insert a new row, you’ll need to copy down the formulae from the previous row
  • Excel doesn’t always do a good job of placing data labels so the outlier labels may need to be massaged after an update
  • My spreadsheet only uses 1 standard deviation. While 3 might be a six-sigma best practice, three standard deviations would have put the lower control at 0.1 and common sense tells me that it shouldn’t be possible to have an average page views per visit value lower than 1.

I’ve deliberately not gone into great detail on the inner-mechanics of the spreadsheet (not that their profound or anything) because that would make for an impossibly long post - if you have a question please post it in the comments or email me directly (civy at instantcognition.com).

What I Read This Week

Friday, September 8th, 2006

Ok, so it’s more like what I’ve been keeping my eye on and this is a desperate attempt to make sure that I actually write something this week, but still here are the things that have held my interest.