Home | About | Log in | Get The Feed
Abstracting Data from the Report Presentation

August

25

2006

It’s not that mysterious…

I was talking with some folks the other day and started discussing the importance of separating (or abstracting) the data in a report from the presentation of data in a report and someone asked for an example.

First, let me describe what I mean by abstracting the data from the presentation. Since I’m a web guy, I’ll give you an example via web publishing. Take this blog for instance, it is run on wordpress and at a grossly understated level, wordpress does for blogs what I meant in terms of report design.

All of the data for this site, this text, images and file locations, etc. are stored in a database (mysql) and then there are templates that query the database for the information that they need to present the data (information) in the way that I have designed it to be presented. In this situation the website data is abstracted from the presentation of said data. Contrast this with a more traditional (ok, antiquated) site that is hand-crafted, page by page, and all of the data is hard-coded into the presentation – which one do you think is more efficient?

Similarly, you can design a report where all of the data is hard-coded into the part/piece/sheet that users are supposed to look at or the data can be stored in a different location and called (or queried) into the presentation template.

There are a couple of big benefits to approaching report design this way.

  1. Structuring data for manipulation and collation is often counter-productive to understanding. In other words, setting up the data so that I can easily run calculations and transforms on it is pretty much setting it up as that classic table of raw data that nobody but it’s mother can love and use. And since report design is all about providing understanding, it’s often very useful to keep the data in a seperate location from its presentation
  2. Hard-coded data is hard to update. Since reports aren’t truly effective unless you can produce them easily and efficiently on schedule, hard-coding the data into the report presentation always makes that harder. So, if the data is called or queried into the presentation layer and we get our data from a refreshable query we have solved the deliverability issue – it’s now easy to produce the report on time.

Putting this strategy into an Excel context

This is very simple. In most cases, I’ll have a worksheet called (imaginatively enough) ‘Data’ where I store all of the data that I need for the report and then I’ll have one or more worksheets that provide the presentation of the data (e.g. the information and insight that the data provides).

The simplest way to call or reference the data in the data sheet is a simple reference to the datum’s location in the data sheet. There are at least two easy ways to do this.

  1. Copy & Paste: Copy the cell in the data worksheet that has the datum I want and then using Paste Special –> Paste Link paste it into my presentation worksheet
  2. Write the formula. The above task essentially writes a formula that says ‘this cell equals that cell’. So, I can write the formula in myself. So, let’s say that the datum I want is in cell A1 in the data worksheet. Back on the presentation worksheet in cell B2 I can write the following formula:

=Data!A1

Now, whatever datum is in Data A1 will always show up in Presentation B2

That’s the easy way, but also the most inflexible. I’m going to save detailed explanations and examples for other posts, but if you want to read ahead, using functions like:

  • vlookup
  • lookup
  • hlookup
  • offset
  • indirect
  • named ranges (not a function per se, but used in combination with functions and charts – we can start to get super fancy)

and maybe a couple of others provide you with a great deal of flexibility and you can create quite dynamic and robust report interfaces using them.

5 Responses to “Abstracting Data from the Report Presentation”


This is fundamental, important stuff. Excel is so flexible and you really need to be disciplined when creating spreadsheets. Separate, well structured data is a must.

We use three layers in our report design: Data => Transform => Present. The motivation is the same as you describe, but we use pivottables heavily, so the middle layer generally contains a bunch of synced pivottables that feed the data display.

- Chris


Chris, interesting that you all use a lot of pivot tables – I tend to only use them as a last resort – nothing else will transform the data in the way I need. Is there something about the data that you are working with that requires the heavy use of pivot tables or is it just a preference that you have?


Typically we’re dealing with large data volume (5,000-50,000) rows of data from some operational source. Secondly, we also structure the PTs for consistency (save data with PT off, refresh on open on, rows and columns tuned to show items with no data). Thirdly, we often need to create calculations by combining one view of the data in a PT with another view. Fourthly, we then use phantom PTs on the presentation page to provide user interaction. These PTs echo changes down to all the “slave” PTs on the transform layer.

To sum up, I think our use case is somewhat different from yours and I really need to write these techniques up at Juice Analytics blog. ;-)


I knew I wasn’t going crazy. Your comments DO EAT capitalized letters. Weird.


It’s true – theme o’ the blog – e.e. cummings ;)
in other words, text-transform:lowercase