This Tech Tip was submitted by Jim Fergerson, Director of Institutional Research and Assessment, Carleton College, Northfield, Minnesota.
Do you ever need to do a series of summary statistics calculations on specific subsets of data in an Excel spreadsheet? For example, you may need to repeatedly calculate the same set of statistics for faculty by rank, for students by class year, or for various peer groups. If so, you should look at the little-known (and not well-documented) AGGREGATE function that was introduced in Microsoft Excel 2010.
Remember that Excel’s stand-alone functions (SUM, AVERAGE, etc.) return results for all rows in your data range, including the hidden ones. If you’re willing to use a few codes and learn a bit of syntax, AGGREGATE provides a shortcut path to using one function to replace 19 commonly used statistics. Furthermore, it returns these statistics only for the set of rows that are currently filtered (displayed) in your data rectangle. So, if you have a database of faculty information and only want to return values for “full professors,” just filter on that subset and use the AGGREGATE function to replace the standard statistical functions. You’ll get results only for your filtered set of rows. If you then need to do the same calculations for the other ranks, just “copy paste special values” for the “full professor” results to “freeze” the values in a neutral area of the spreadsheet, then change the filter and repeat for the other ranks. By adding another set of filters, you could quickly produce additional sets of summary statistics by gender, department, tenure status, etc.
I’ve also started to download the IPEDS data for all institutions in my Carnegie category—then I’ll set up flags in the initial columns that identify my several peer groups. With a few quick filters, I can compare my institution to all baccalaureate colleges, to several standard peer groups, or even to a custom set of peers, as requested.
In AGGREGATE, each of the most commonly used Excel statistical functions is represented by a number, as shown in the syntax table below. (AGGREGATE was introduced to improve on Excel’s old SUBTOTAL function, and adds eight vital statistics such as MEDIAN, MODE, PERCENTILE, and QUARTILE.) The function number is followed by an “option” that tells Excel what to “ignore” when calculating statistics for a filtered set. For example, you can omit just hidden rows, error values, nested subtotals, or combinations of these. Excel’s drop-down function tips will prompt you about the syntax as you enter the formula, but I find it helpful to keep a printout of the syntax chart on my wall for reference. Having learned the power of AGGREGATE, I’m now using it as my “one function to replace them all.”
EXAMPLE: Using AGGREGATE function to replace standard Excel functions in a summary statistics table:
Note that I could also use AGGREGATE function #16 (PERCENTILE.INC) as an alternative to the MEDIAN (#12) (median= 50th percentile), or as a substitute for QUARTILE.INC (#17) (quartile options 1 and 3 return the 25th and 75th percentiles). I could also use the QUARTILE.INC (#17) to replace the MIN (#5) function or the MAX (#4) function (quartile 0 =minimum; quartile 4 = maximum). The “7” in the second position of the formulas tells Excel to ignore error values or any manually hidden rows (in addition to those that are “filtered.”)
Additional notes:
If "function_num" = 14, 15, 16, 17, 18, or 19, then a second "ref2" argument must be supplied, as with the Excel functions they emulate.
If a second "ref2" argument is required but not provided, then #VALUE! is returned.
If any of the arguments are 3D references, then #VALUE! is returned.
As an extra precaution, I often wrap an IFERROR function around my AGGREGATE formula to replace any untrapped error messages with a “Null.” Example: =IFERROR(AGGREGATE(1,7,L$6:L$274),"")
To help you remember the appropriate function and “option” codes, Excel provides a drop-down “hint” menu in the formula bar as you enter the AGGREGATE formula: