Excel CONCATENATE Function: Creating Long STATA Commands

By Ainur Satekova, Manager for Institutional Research, Nazarbayev University

There are some STATA commands that you need to type multiple times in a STATA do-file. For instance, you need to type the same command for each distinct observation in the data set. The most common commands that I use in the data cleaning processes are “replace,” “rename,” “label variable,” etc. I had been in the habit of typing each command individually in the STATA do-file until my supervisor showed me the easiest way of doing it using the CONCATENATE function in Excel. In this Tech Tip, I explain the technique of using the CONCATENATE function in Excel to create a series of STATA commands without having to type them again and again. While STATA is used in the example, the CONCATENATE function can be used for a number of purposes.  

Recently, I was working on a data set for which I needed to translate country names (i.e. citizenship) from Russian to English. To do that, I needed to type the “replace” command in my STATA do-file for each individual country name. However, I was able to avoid doing this and saved time. I’ve used this experience as an example in the following Tech Tip.

About the CONCATENATE function in Excel

The word concatenate is just another way of saying "to combine," "to join," or “to merge.” The CONCATENATE function allows users to combine text from two or more different cells into one cell. In my example, I use it to combine the text from five different columns to create a combined name in a new column.

This is desired result in my do-file editor:

TTJan2017-1.JPG

And this is how I do it using Excel:

TTJan2017-2.JPG

In the first column (column A), I have the raw data, which is the list of country names in Russian.

In the next column (column B), I have country names translated into English, typed manually in Excel.

Columns C, D, and E contain STATA command pieces. I typed these items only on the first row (C2:E2), then I copied and pasted across multiple columns by dragging the fill handle in Excel. To do that, select the cells that you want to copy, then rest your cursor in the lower right-hand corner of the last right cell so that it turns from a clear plus sign to a solid one (+) (see picture below).

 
TTJan2017-3.JPG

Please note that at the beginning of the text in column D, a space is included so the command comes out correctly.

Once it changes to the solid plus sign, you can hold and drag the fill handle down the column over the cells where you want to copy the text. The text will automatically fill in and change (see picture below).

 
TTJan2017-4.JPG

Writing the function

Now you can enter the function into cell F2. As usual, we start with the equal sign (=) followed by the function name and an open paren:

=CONCATENATE(

Within the parens, we need to enter our arguments. The arguments tell the CONCATENATE function what cells to combine. In my example, I want to combine the text in cells C2, B2, E2, D2, A2 and E2 again (keeping the order), to make each of those an argument.

=CONCATENATE(C2,B2,E2,D2,A2,E2)

When you run the function, this is the result.

TTjan2017-5.JPG

To apply the function to the entire column, you can select the cell and double-click the plus sign instead of dragging it down. Double-clicking the plus sign automatically copies the function down (see the screenshot below).

TTJan2017-6.JPG

The final step

Finally, select the entire column F, and copy and paste it into the STATA do-file editor. You will have the result seen in the first picture at the beginning of this Tech Tip. The more you use the CONCATENATE function, the more you will see how it can make your life easier.

 

 

 

 Comments

 
To add a comment, Sign In
Total Comments: 1
 
Marlene posted on 1/30/2017 11:22 AM
This is wonderful, thank you for sharing! As you point out, an especially helpful tip for small IR offices and/or ones that can not afford something like Tableau.