Extract delimited data with Microsoft Excel Power Query

Image: Renan/Adobe Stock

Very often we receive data in the form of characters strung together. For our purposes, we may not need the full set, but only a portion. For example, you may receive a list of transaction numbers, which partially contain the customer identification number. Furthermore, you only need the customer section to create a relationship between that customer and a table containing the customer name.

SEE: Google Workspace vs. Microsoft 365: a side-by-side analysis with checklist (TechRepublic Premium)

In this Excel tutorial, I’ll show you how to use Power Query’s Extract and Split Column functions to extract bounded strings in their components. I use Microsoft 365 Desktop and Power Query in Microsoft Excel. Power Query is available in older versions through Excel 10. You can download the Microsoft Excel demo file for this tutorial.

Why you should use Excel’s Power Query

You can use Excel String Functions, Text to Columns, or Flash Fill, but here are reasons why you might not:

  • Text to Columns writes over the original data.
  • Your data might not be in Excel because although Power Query is available in Excel, Power Query can import data from many sources, not just Excel.
  • The source data contains more rows that you can import into Excel.
  • You may need to use Power Query for something much more complex, and extracting a subset of the original item is just the first step.

If the data is in Excel, you can use functions or formulas, but unless you are an expert it will take some time. Most of us can’t just spell out the necessary syntax and get it right the first time. Power Query is fast and requires no specialized knowledge of Excel functions.

How to get the data in Power Query

We are working with a simple Excel sheet with a few separated strings in a table called TableCustomerID. You don’t need to replace the default table name, but meaningful names are easier to work with if you have multiple tables.

Let’s say you have a list of customer identification numbers with three sections each. Furthermore, a hyphen serves as a separator between the three sections (Image A). You want to use the middle component of each string, because that’s the section that identifies each customer. The other two components identify the region where the customer lives and a transaction number.

Image A

Load the Excel data into Power Query.
Load the Excel data into Power Query.

The first step is to load the data into Power Query as follows:

1. Click anywhere in the table.

2. Click the Data tab.

3. In the Get Data & Transform group, click From Table/Range.

That is it. The simple table shown in Image A is now in Power Query.

You can extract sections using the data in Power Query.

Extract delimited strings with Extract options in Power Query

There are actually two ways to extract data in Power Query. We’ll start by using Extract options, which return a subset of the original value. To do this, click the Transform tab and then click the Extract drop-down list in the Text Column group.

As you can see in Figure B, there are several options and for the most part they all speak for themselves. Let’s look at the separator options so you can see what they all do.

Figure B

There are several Extract options.
There are several Extract options.

After clicking the header of the Customer ID field to select the column, click the Transform tab, if necessary, and then click Extract in the Text Column group. Choose the Text before separator option. In the resulting dialog box, enter the hyphen (Figure C) and click OK.

Figure C

Enter the hyphen as a separator.
Enter the hyphen as a separator.

As you can see in Figure Dthis option returns only the first letter(s) before the separator.

Figure D

This option returns only one character for each string.
This option returns only one character for each string.

To get the original data back, delete the Extracted Text Before Delimiter step in the Applied Steps pane shown in Figure E.

Figure E

Just select it and press Delete.
Just select it and press Delete.

Now let’s do the same with the next option, Text after separator. If prompted, enter the hyphen and click OK to see the results in Figure F. This time, Power Query removes the first two characters, the first number and the first hyphen.

Figure F

This option returns all characters after the first hyphen.
This option returns all characters after the first hyphen.

Claim the original data again by removing the extract step and then choose the Delimited Text option. This time, Power Query asks to provide two separators. In this case they are both the hyphen (Figure G).

Figure G

Enter both the start and end separators.
Enter both the start and end separators.

Click OK to see the results displayed in figure H.

figure H

The last option returns the characters between the two hyphens.
The last option returns the characters between the two hyphens.

Now we’ll look at another way of dividing the three sections of each string, but we won’t be taking pieces out of the strings, we’ll be splitting the strings. Claim the original data before proceeding.

Extract delimited strings with Split Column in Power Query

Power Query’s split column allows you to return more than one piece of the string. For example, let’s say you want three columns of data, one for each section. To achieve this, use Split Column as follows:

1. After selecting the column, click the Home tab.

2. In the Transform group, click Split Column.

3. Click on the first option, On Separator.

4. In the resulting dialog box, you don’t need to do much because Power Query can distinguish your needs well. Make sure that Power Query selects Every event of the separator in the Split by section (Figure I).

Figure I

Choose the option that uses all separators.
Choose the option that uses all separators.

5. Click OK to see the results in figure J.

figure J

This option divides the three sections into three columns.
This option divides the three sections into three columns.

This option separates each string into three columns using the separator to determine where each section begins and ends.

As you can see, both Extract Column and Split help you quickly separate data. You will likely come across it for both uses.