Where is the source file of that data?

On my job I usually have to present data-rich slides that are the result of analysis on sets of data. One of my pet peeves on this types of slides is the omission of the data source. As I always point out to analysts in training, *please* do yourself a favor: put the footnote with the source. You’ll be happy when 5 months later you are asked to support some results of your analysis.

That answers the question “What is the source of your data?”. More often than not, when you have to support your analysis, its also handy to know “Where is the source of my data?”. My recommendation decks are usually in Powerpoint, and I only sparringly use Office’s embedded file feature, so what is on the slide is just a picture of the spreadsheet, chart or table.

My source notes usually look as shown below:

Example footnote with file name in Excel

The way I recommend analysts in training to do it is through the CELL function in Excel.

=CELL("filename")

The formula above will print the whole path and filename of the file. If you want only the filename, you can use

MID(LEFT(CELL("filename",A1), FIND("]", CELL("filename", A1))- 1), FIND("[",CELL("filename", A1))+1, 255)

Easy creation of tornado charts in Excel – 5 steps, no add-ins

UPDATE: A newer post is available, with Excel 2007 instructions and a screencast.

Normalize Excel tables

Business data is quite often expressed across many dimensions. The profitability equation in a company is very simple in concept, but in practice those revenues come across regions, product lines, products etc., making them in fact multidimensional data.

Users of OLAP systems are very aware of multidimensional data. However, many spreadsheet users are not, so they manage to flatten the data the best they can, using pages or subtotals for dimensions beyond the second one.

Modern spreadsheets have “PivotTable” capabilities, which makes easier to deal with multidimensional data. To feed these pivot tables, the information has to be normalized. This post explains how to normalize data from non-normal representation, using a Visual Basic macro

If you are familiar with OLAP and normalized data, skip to the code. I’ll show samples of non-normalized data, how the same data would look normalized, and why this recipe is useful.

Continue reading

Column – Stack combination chart in Excel

Not very common, but sometimes useful, this chart is made up of a series of column pairs, where the one of them is a simple column and the other one is broken down in segments.

Column stack combination chart in Excel

Last time I used it to show budgeted project expenditures versus actuals, month by month. The right column, actuals, was broken down to show capital investment. The chart can also be created so the “entire” column is to the right and the “broken” one is to the left.

Here is the cookbok to make the chart:

  • Create 4 series, horizontally as shown below: broken (bottom part) and broken (top part), entire and dummy. I’ll refer to these series as BB,BT,E and D respectively.
    Data for the chart
  • Fill your data as needed. On the D series, put some value on the first cell – this is just to allow you to select the series and later on you will delete this value. Make this value close to the average of the rest of the data, so you can handle the segment comfortably.
  • Select all five lines of the data: titles, the three series for the columns and your D series. Create a chart, choose column type, and stacked column on the subtype. Make sure series in columns is the selected option. Finish the chart wizard.
  • Right click on the segment for the D series on the first column and move it to the secondary axis: Choose “Format data series” and on the “Axis” tab choose “Secondary Axis”.
  • Do the same with the data corresponding to the E series. Before closing the “Format data series” dialog, go to “Options” and choose overlap -100, gap width 80. This setting will apply to both the D series and the E series.
  • In the “Series order” tab you can choose if your E series will go to the left or to the right by moving it up or down
  • Go to “Format series”, now for the BT series. overlap should be 100 and gap 320.
  • Double-click the right axis to bring the “Format Axis” dialog. Choose the maximum to be equal to that of the left axis. Even if Excel chose by default the same value for both, make sure the checkbox close to maximum is unchecked. You want to make sure manually your axis are synchronized to avoid showing misleading information. By default, Excel charts start at 0, but if you change this, you must do it in both axis. While you are at this dialog, go to the “Patterns” tab and choose “None” in “Thick mark labels”.
  • Delete the dummy value, the dummy label on the chart and format the chart to your liking

Enjoy!