How To Find, Highlight & Remove Duplicates In Excel

When you use Excel to organize a lot of information, you’re bound to end up with duplicate data at one point or another. And if you miss duplicates or choose not to remove them, they could greatly affect the accuracy of your data. Depending on what you’re trying to do in Excel, this could make it difficult or even impossible to get the results you want.

The biggest hurdle is actually finding any duplicated data in Excel. Fortunately, once you know the correct functions to use, finding duplicates is a breeze. But once you find them, what should you do with them? This largely depends on the reasons the duplicates have occurred, but you can either choose to highlight duplicates for future reference or remove them from your spreadsheets entirely.

In today’s guide, we will help you learn how to find, highlight, and remove duplicates in Excel. But first, let’s examine how and why duplicates pop up in Excel at all:

What Causes Duplicates In Excel?

The underlying cause of duplicated data is important to understand, as it may affect how you treat duplicates and what you ultimately decide to do with them. You may have duplicate data incorporated into your spreadsheet intentionally or naturally. For example, let’s say that you’ve created a spreadsheet to study your monthly income from side gigs for the first 6 months of the year. It might look something like this in Excel:

Column A Column B
January $412
February $695
March $506
April $695
May $438
June $720

As you can see, Column A shows the months and Column B shows the corresponding income from side gigs. However, the income from February and April are both the same. This is what we might consider naturally occurring duplicates, or data that appears to be duplicated but is actually just matching by coincidence.

On the other hand, you might intentionally include duplicate data in your spreadsheet with a specific reason in mind. For example, let’s say that you want to use your spreadsheet as part of a presentation on your coworkers’ involvement in recent projects. You might have a spreadsheet that looks something like this:

Column A Column B Column C Column D
Project 1 Project 1 Project 2 Project 2
Mark 4 hours Mark 3 hours
Stephanie 6 hours Stephanie 3 hours

While this data could be better displayed in a simplified format or as part of a graph, it does accurately display relevant information. Column A shows the names of the coworkers for Project 1, while Column B shows the corresponding number of hours each worker dedicated to Project 1. Columns C and D show the same data, but as it pertains to Project 2. In this situation, several duplicate entries are intentionally included to properly label the data. More specifically, Project 1, Project 2, Mark, and Stephanie are all duplicated in the table above.

Erroneous and Unintentional Duplicates

It’s important to note that if you’re looking for duplicates in your data, you’re probably not looking for data points that match by coincidence or duplicates that you’ve intentionally included in your spreadsheet. Instead, you’re looking for erroneous or unintentional duplicates. There are various causes for these kinds of duplicates, including:

  • Manual Data Entry – This is one of the most common causes of duplicates in Excel. While entering data manually into your spreadsheet, you simply made a mistake and inserted the same data more than once.
  • Importing – When you import or copy and paste data into your spreadsheet from another source, it is possible to accidentally introduce more than one copy of the incoming data. This could even result in large batches of duplicated data.
  • Merging Spreadsheets – When you merge two spreadsheets in Excel, there’s always the chance that some of your data will overlap. For example, you may have spreadsheets that cover data from set time periods. Spreadsheet 1 includes 2017 through 2020, while Spreadsheet 2 includes 2019 through 2022. The overlap in years could result in duplicated data.
  • Inconsistent Formatting – Duplicates can occur even when the data doesn’t look exactly the same. For instance, the same person might be represented in different ways (Mark Smith, Mr. Smith, Mark, Mark Smith Jr., etc). Excel would treat each one as a unique entry, even though they all refer to one person.

How To Find Duplicates In Excel

Now that you know a little more about the potential causes of duplicates, you can start to identify duplicates in your own spreadsheets. Thankfully, the process for finding duplicates is pretty straightforward; you will just need to learn how to use the COUNTIF function. Here is the step-by-step process for using COUNTIF to show duplicates in Excel:

  1. Choose the range of cells that you want to check for duplicates (let’s say B1 through B20)
  2. Select an empty cell and type this formula: =COUNTIF($B$1:$B$20,B1)>1
  3. The results will show TRUE for duplicates and FALSE for non-duplicates
  4. (Optional) If you want to run the COUNTIF function and exclude the first occurrence of a potentially duplicated value (let’s say the value of cell B1), use this formula instead: =IF(COUNTIF($B$1:$B1,$B1)>1,”Duplicate”,””)

The above example applies to a specific range, but you can also use COUNTIF to search for duplicates in columns (ex: =COUNTIF(B:B,B1)>1) or in rows (ex: =COUNTIF(B1:B10,B1)>1).

How To Highlight Duplicates In Excel

Once you’ve found duplicated data in your spreadsheet, you’ll need to make sure that you can keep track of any duplicates going forward. This is especially valuable if you don’t want to delete the duplicates. The easiest way to keep track of your duplicates is by highlighting them with conditional formatting:

  1. Choose the range of cells you want to check for duplicates
  2. Click the “Home” tab
  3. In the “Styles” group, click “Conditional Formatting”
  4. From the dropdown menu, select “Highlight Cells Rules” and then “Duplicate Values”
  5. In the dialog box, choose the formatting style you want to use (make sure “Duplicate” is selected in the dropdown menu)
  6. Pick the highlight color
  7. Click “OK” to finalize your selections

How To Remove Duplicates In Excel

If you’re looking for duplicates in Excel, there’s a good chance that you want to get rid of them. As previously mentioned, finding duplicates is usually the most complicated step. Once you’ve found them, you can either take them out manually or delete your duplicates using Excel’s built-in tools. If you use the automated process below, keep in mind that it will permanently delete duplicates from your worksheet, so you’ll want to back up your data beforehand. Here are the steps to eliminate duplicates from your Excel spreadsheet:

  1. Select the range of cells that include the duplicates
  2. Click the “Data” tab
  3. In the “Data Tools” group, select “Remove Duplicates”
  4. In the dialog box, you can choose to remove duplicates in select columns; alternatively, you can choose to leave all columns checked
  5. Click “OK” to finalize your selections

This process will remove all of the duplicates in the columns and rows of the selected range. Excel will tell you exactly how many duplicate values it removed and how many remain in the spreadsheet (if applicable).

Need help from dependable software experts? Reach out to Geeker today for on-demand IT and software solutions!