How to Show Duplicate Values in Tableau
Trying to find duplicate records in your Tableau data set is a critical step for ensuring your analysis is accurate. Duplicates can quietly inflate sales numbers, misrepresent customer counts, and lead to flawed conclusions. This article will walk you through a few practical, step-by-step methods to identify and highlight these duplicate values directly within Tableau.
Why Finding Duplicates in Tableau Matters
Before jumping into the "how," it’s important to understand the "why." Duplicate data isn't just a minor annoyance, it can seriously undermine the integrity of your reports and dashboards. Here are a few common problems caused by duplicates:
- Inflated Metrics: Imagine a master sales log where the same high-value order appears twice due to a system glitch. Suddenly, your total revenue, average order value, and sales-per-rep metrics are all artificially inflated.
- Skewed Customer Analysis: If a single customer exists under two slightly different email addresses or IDs, you might miss their true lifetime value. You might unknowingly target them with conflicting marketing campaigns or miscalculate customer retention rates.
- Wasted Resources: In marketing or sales operations, duplicate lead or contact records can lead to redundant communications, annoying prospects, and wasting your team's valuable time.
Catching these issues inside Tableau allows you to clean your data at the source or at least filter out the noise, ensuring your stakeholders are making decisions based on accurate information.
The Key Concept: Using Level of Detail (LOD) Calculations
The most robust way to find duplicate values in Tableau is with a Level of Detail (LOD) calculation. While they may sound intimidating, the concept is quite straightforward. An LOD expression lets you compute an aggregation at a different level of detail than what's currently in your view.
For finding duplicates, we primarily use the FIXED LOD. A FIXED calculation answers a question like, "For each customer email, count how many total records we have, regardless of any other filters or dimensions in my chart." This is powerful because it lets you "tag" every single row with a count of its peers.
Think of it like this: You have a big spreadsheet of orders. A FIXED LOD calculation goes through, looks at every Order ID, and puts a sticky note on each row saying, "This Order ID appears 3 times in total." That number on the sticky note doesn't change, even if you later filter the view to only show orders from California.
Method 1: Create a Duplicate Counter with a FIXED LOD
This is the foundational method that you'll use to both highlight duplicates and filter for them. We'll start by creating a calculated field that counts how many times each record appears based on a unique identifier.
Step 1: Identify Your Unique Key
First, decide what field (or combination of fields) should be unique. This is your key. In some datasets, it's a single field:
Order IDCustomer EmailTransaction ID
In other cases, a combination of fields defines a unique record. For example, a student should only be enrolled in a specific class once, so the unique key would be a combination of Student ID and Class ID.
For this example, let's say we're analyzing a customer list and want to find duplicate entries based on Customer Email.
Step 2: Create the Calculated Field
Now, let's build the LOD calculation to count the records for each email.
- Right-click anywhere in the Data pane on the left and select Create Calculated Field.
- Name the field something clear and descriptive, like
Count of Records per Email. - Enter the following formula:
{ FIXED [Customer Email] : COUNT([Customer Email]) }Let's break that down:
{ FIXED [Customer Email] : ... }: This tells Tableau: "For every distinct value in the[Customer Email]field, perform the calculation that follows."COUNT([Customer Email]): This is the calculation to perform. For each group of identical emails, it counts the total number of records (rows) in that group.
Click OK. You now have a new field in your Data pane. If you drag this onto a view next to the Customer Email, you’ll see a number for each row. Any row where this number is greater than 1 is part of a duplicate group.
Method 2: Visually Highlighting Duplicates in a Table
Now that you have your counter, you can easily use it to make duplicates pop out in a table. This is perfect for quick visual sanity checks and exploratory analysis.
Step 1: Build a Basic Table
Create a new worksheet in Tableau. Drag the dimensions that help you identify the record onto the Rows shelf. For our example, we'd drag Customer Email and maybe Customer Name.
Step 2: Add the Counter to Your Visualization
Drag your new calculated field, Count of Records per Email, from the Data pane onto the table next to the others. You could place it on the Text mark or just on the Rows shelf. At this point, you'll see the count for each row.
Step 3: Use Color to Highlight Duplicates
The fastest way to make the duplicates stand out is with color.
- Drag the
Count of Records per Emailfield again, but this time drop it on the Color mark in the Marks card. - By default, Tableau will likely apply a continuous color gradient. We want something more distinct. Click on the Color mark, then click Edit Colors....
- Change the palette to a Stepped Color palette and set the number of steps to 2.
- This creates a clear break. You can assign one color (like gray or light blue) to values of 1 (the unique records) and a sharp, attention-grabbing color (like red or orange) to values of 2 and above (the duplicates).
After clicking Apply, your table will be instantly transformed. All rows associated with a duplicate email will be vividly highlighted, making them impossible to miss.
Method 3: Creating a Dedicated 'Duplicates Only' Report
Sometimes you don't just want to highlight duplicates - you want to isolate them completely for further investigation or to create a list for your data team to fix.
Step 1: Create a Boolean (True/False) Filter
While you could filter directly on the Count of Records per Email field, creating a separate True/False calculation is often cleaner and easier for others to understand.
- Create a new calculated field and name it
Is Duplicate?. - Enter the formula:
[Count of Records per Email] > 1This simple formula will return 'True' for any row where the count is greater than one and 'False' otherwise.
Step 2: Filter the View
- Create a new worksheet.
- Drag your new
Is Duplicate?field from the Data pane directly onto the Filters shelf. - A dialog box will appear. Select True and click OK.
That's it! The entire worksheet is now filtered to show only rows that are part of a duplicate group.
Now you can build out a full report for analysis. Drag Customer Email, Customer Name, Sign Up Date, and any other relevant fields onto the Rows shelf. The resulting table will only list the records you need to investigate, which you can easily export to share with your team.
An Alternative: Using Table Calculations (SIZE() Function)
For quick, ad-hoc analysis within a specific view, table calculations can also identify duplicates. This method is dependent on the structure of your visualization, unlike an LOD which is tied to the data model.
The main function we'll use is SIZE(), which returns the number of rows in a partition.
Step 1: Build Your View
Place the dimension you want to check for duplicates on the view. For instance, put Order ID on the Rows shelf.
Step 2: Create a Table Calculation Filter
Create a calculated field named Is Duplicate (Table Calc) with the formula:
SIZE() > 1Step 3: Apply and Configure the Filter
- Drag this new calculated field onto the Filters shelf, and select True.
- Right-click the field on the Filters shelf and select Compute Using > Order ID. This is the critical step. You're telling Tableau to count the number of rows for each unique
Order IDin your view.
Your view will now be filtered to show only the Order ID values that appear more than once. The main drawback here is that if you change the structure of your visualization (e.g., remove Order ID from the view), the filter will break. LODs are generally more reliable for building permanent dashboards and reports.
Common Pitfalls & Best Practices
- Correctly Define "Duplicate": Before you start, be certain what constitutes a duplicate in your context. Is it just a repeated order ID? Or is it a customer name combined with a transaction date? If your key requires multiple fields, simply add them to your FIXED calculation:
{ FIXED [Customer Name], [Transaction Date] : COUNT([Record ID]) }- Handling NULLs: Data sets often contain rows with NULL or blank values for your chosen identifier. Be aware that Tableau may group all these NULLs, potentially flagging them as a large group of duplicates. You'll often want to filter out NULL values from your key field first.
- Watch Performance on Large Datasets: LOD calculations can be resource-intensive on huge datasets with hundreds of millions of rows. If your workbook is slow, consider adding a context filter (right-click a filter and select "Add to Context") to reduce the amount of data the LOD calculation has to process. For example, filtering for just the last year of data and adding it to the context will speed things up significantly.
Final Thoughts
Identifying and visualizing duplicates in Tableau is a manageable task that moves an essential data validation step out of spreadsheets and into your BI tool. Using FIXED LOD calculations gives you a powerful and reusable way to find, highlight, and filter out duplicate records, paving the way for more accurate and trustworthy analysis.
Before you can even begin thinking about tasks like finding duplicates, the biggest hurdle is often just getting your data into one place. Manually exporting CSVs and stitching together reports is a frustrating time-sink that gets you no closer to actual insights. At Graphed, we simplify this first, biggest step. By connecting directly to your marketing and sales sources - like Google Analytics, Shopify, and social ad platforms - we automate the entire data pipeline. This lets you build real-time dashboards and ask questions in plain English, putting high-value analysis right at your fingertips. Discover a faster way to get answers with Graphed.
Related Articles
What SEO Tools Work with Google Analytics?
Discover which SEO tools integrate seamlessly with Google Analytics to provide a comprehensive view of your site's performance. Optimize your SEO strategy now!
Looker Studio vs Metabase: Which BI Tool Actually Fits Your Team?
Looker Studio and Metabase both help you turn raw data into dashboards, but they take completely different approaches. This guide breaks down where each tool fits, what they are good at, and which one matches your actual workflow.
How to Create a Photo Album in Meta Business Suite
How to create a photo album in Meta Business Suite — step-by-step guide to organizing Facebook and Instagram photos into albums for your business page.