It’s not that mysterious…
I was talking with some folks the other day and started discussing the importance of separating (or abstracting) the data in a report from the presentation of data in a report and someone asked for an example.
First, let me describe what I mean by abstracting the data from the presentation. Since I’m a web guy, I’ll give you an example via web publishing. Take this blog for instance, it is run on wordpress and at a grossly understated level, wordpress does for blogs what I meant in terms of report design.
All of the data for this site, this text, images and file locations, etc. are stored in a database (mysql) and then there are templates that query the database for the information that they need to present the data (information) in the way that I have designed it to be presented. In this situation the website data is abstracted from the presentation of said data. Contrast this with a more traditional (ok, antiquated) site that is hand-crafted, page by page, and all of the data is hard-coded into the presentation - which one do you think is more efficient?
Similarly, you can design a report where all of the data is hard-coded into the part/piece/sheet that users are supposed to look at or the data can be stored in a different location and called (or queried) into the presentation template.
There are a couple of big benefits to approaching report design this way.
- Structuring data for manipulation and collation is often counter-productive to understanding. In other words, setting up the data so that I can easily run calculations and transforms on it is pretty much setting it up as that classic table of raw data that nobody but it’s mother can love and use. And since report design is all about providing understanding, it’s often very useful to keep the data in a seperate location from its presentation
- Hard-coded data is hard to update. Since reports aren’t truly effective unless you can produce them easily and efficiently on schedule, hard-coding the data into the report presentation always makes that harder. So, if the data is called or queried into the presentation layer and we get our data from a refreshable query we have solved the deliverability issue - it’s now easy to produce the report on time.
Putting this strategy into an Excel context
This is very simple. In most cases, I’ll have a worksheet called (imaginatively enough) ‘Data’ where I store all of the data that I need for the report and then I’ll have one or more worksheets that provide the presentation of the data (e.g. the information and insight that the data provides).
The simplest way to call or reference the data in the data sheet is a simple reference to the datum’s location in the data sheet. There are at least two easy ways to do this.
- Copy & Paste: Copy the cell in the data worksheet that has the datum I want and then using Paste Special –> Paste Link paste it into my presentation worksheet
- Write the formula. The above task essentially writes a formula that says ‘this cell equals that cell’. So, I can write the formula in myself. So, let’s say that the datum I want is in cell A1 in the data worksheet. Back on the presentation worksheet in cell B2 I can write the following formula:
=Data!A1
Now, whatever datum is in Data A1 will always show up in Presentation B2
That’s the easy way, but also the most inflexible. I’m going to save detailed explanations and examples for other posts, but if you want to read ahead, using functions like:
- vlookup
- lookup
- hlookup
- offset
- indirect
- named ranges (not a function per se, but used in combination with functions and charts - we can start to get super fancy)
and maybe a couple of others provide you with a great deal of flexibility and you can create quite dynamic and robust report interfaces using them.