​Using REXCEL to Reshape or Aggregate Data

By Ihsuan Li, Associate Professor of Economics, Minnesota State University-Mankato

This tip builds on a past Tech Tip by Barry Nagle (Installing and Using REXCEL), published in eAIR in April 2014. As previously introduced, R is a powerful data management and analytics software. Its data-management capabilities can be integrated into Microsoft Excel through an Add-Inn: REXCEL.

This tip covers two features that combine the efficiency of R and the familiarity and ease of Excel. The features are reshaping a dataset (long to wide) or aggregating data (by factor).

Reshaping Data

In data structure lingo, data have two general formats: long (each row represents one observation belonging to a cross section unit, i.e., student name or ID number), and wide (each row represents a different variable for which a value is assigned to an individual or factor, i.e., age, sex, high school GPA, etc.).

Let’s suppose your data (data1) is saved in text (.txt) format, such as the one shown below. The dataset contains information on student ID (stdid), the student’s high school GPA (hsgpa), the term under which the student entered (entryterm), the credited hours on admission (hours), and the intended major (intmajor).

R1.JPG

Data1 is shaped in long format. The IDs are cross section units. The variables are: hsgpa, entryterm, hours, intmajor. Suppose we wish to summarize the data so that each student’s information is shown under its own column. Before you can reshape the data, you need to read it or make it an active data frame on R.

Steps:

1. Read in the data (data1.txt): click on Data, then select Import Data, and choose From Text File. Notice you can import data saved in other formats as well (SPSS, SAS, STATA, etc.).Read in the data (data1.txt): click on Data, then select Import Data, and choose From Text File. Notice you can import data saved in other formats as well (SPSS, SAS, STATA, etc.).

R2.JPG

2. Name the data that will be saved in the active data frame in R (I named it again data1). If your data file has variable names on the first row, check “Variable names in file.” Then click OK.

R3.JPG

3. On the Excel sheet, choose the Add-In tab on the ribbon, and REXCEL should be on the shortcut command bar as shown in #4 below.

4. Click (activate) cell A1, then click RExcel, choose Get R Value, and then Active Dataframe. Click (activate) cell A1, then click RExcel, choose Get R Value, and then Active Dataframe.

R4.JPG
The data from the text file should be automatically pasted as shown below.

R5.JPG
There are two approaches to reshaping our data: using R package reshape2 that you must download and install, or using the Aggregate Data function in RExcel (point and click).

Frst Approach: Type the syntax on the script window on R Commander. Frst Approach: Type the syntax on the script window on R Commander.

The user must first install the reshape package to use the transpose function. You can install the package by typing on the Script Window of the RCommander using the following command: install.packages (“reshape2”), then click Submit.

Type library (“reshape2”) to save it in the R library. To view your data, type data1 (which is what we named the file).

R6.JPG

The data should look like this on the output window:

R7.JPG
To reshape into wide format, type t (data1). The reshaped data on the output window is shown below: each student’s information is listed under the student’s ID (column).

R8.JPG

Second Approach: Use the RExcel in the Excel environment with drop-down menu. First, click on cell A1 to active it. Click Data, choose Active Data Set, and then choose Aggregate Variables in Active Data Set.

R9.JPG

The window will appear and you will be asked for input; pick what you want to aggregate (hours and hsgpa), and then pick what you want to aggregate it by (intmajor). Choose the statistic you wish to report (the mean), then click OK.

R10.JPG

The aggregate/reshaped data is shown below:

R11.JPG

REXCEL is a tool that combines both the ease and familiarity of Excel with the efficiency of R. Using either the script (R Commander) or the Aggregate function (Excel), you will be able to extract information from a long formatted dataset to one in wide format.

 

 

 Comments

 
To add a comment, Sign In
Total Comments: 8
 
Lee posted on 1/15/2015 12:10 PM
Thanks Ihsuan! This is very interesting and looks like it could significantly expand Excel's capabilities. I haven't really used R but this example has piqued my interest.
David posted on 1/15/2015 12:42 PM
This is an interesting exercise. While R may be overkill for this type of task (this can be done with Power Pivot withing Excel), for more sophisticated analyses this technique is indispensable!
Keith posted on 1/15/2015 1:18 PM
This looks like a fantastic tip, but I am using a 64-bit version of Excel and cannot get REXCEL to install properly. Any help on this would be appreciated.
Lisa posted on 1/15/2015 1:26 PM
Thank you for sharing. I have never worked with R but I've heard much about it. I'm going to download and experiment. It never hurts to have too many tools in the arsenal!
Shawn posted on 1/15/2015 4:11 PM
Thank you for the simple and well explained example. Sparing users the time of restructuring or aggregating via pivot tables, formulas, or going through a separate software package is a definite plus.
Ishuan posted on 1/20/2015 12:17 PM
Hi Keith: it looks like your version of R is not compatible with the RExcel version. See http://rcom.univie.ac.at/download.html
David posted on 2/13/2015 3:41 PM
If the data are, say, in .txt as the example, why not just copy/import them into Excel, select the full table, and then do a Copy-Paste-Transpose. That interchanges rows for columns and vice versa. Snap. Done. It would be less work that getting involved in R, which is fabulously powerful as a stats language, but I think the Excel solution is probably easier.
Ishuan posted on 2/16/2015 12:19 AM
Hi David: Sure, nothing beats Excel's copy/paste/transpose. For those who are looking to expand their RExcel skills, the two ways to import/handle data to RExcel are useful skills in themselves. The transpose and aggregation tips hopefully add something to our repertoire. For those highly skilled in R, the arsenal is deep and wide!