When to Use Union in Tableau?
Knowing whether to combine datasets in Tableau with a join or a union is a fundamental skill for clean and accurate analysis. While joins add more columns to your data, unions are all about adding more rows. This guide provides a clear look at what a union is, how it works, and the practical scenarios where it is the perfect tool for the job.
What Exactly is a Union in Tableau?
In the simplest terms, a union in Tableau is the process of appending rows of data from one table to another. Think of it like stacking Lego bricks on top of each other. You start with one block (your first table) and stack another identical block right on top (your second table). The result is a taller structure (a single table with more rows) that maintains the original width and shape.
The golden rule for a successful union is that the tables you're combining should have the same structure. This means they generally need to have:
- The same number of columns.
- Columns with the same names.
- Columns that contain the same type of data (e.g., a "Sale Date" column should be a date in all tables, not a text string in one and a date in another).
When you perform a union, Tableau also adds a helpful new column to your data source, often named Table Name or Sheet. This column acts as a label, telling you which original file or spreadsheet tab each row of data came from. This is incredibly useful for filtering or comparing the original datasets after you’ve combined them.
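If it helps to see the stacking idea outside of Tableau, here is a minimal Python sketch of what a union does conceptually. This is not how Tableau implements it; the `union` helper, the sample rows, and the table names are all made up for illustration.

```python
def union(tables):
    """Stack rows from several tables and tag each row with its source.

    `tables` maps a table name to a list of row dicts (a hypothetical layout).
    """
    combined = []
    for table_name, rows in tables.items():
        for row in rows:
            # Copy the row and add a label column, similar in spirit to
            # the "Table Name" field Tableau adds to a union.
            combined.append({**row, "Table Name": table_name})
    return combined


# Illustrative sample data; both tables share the same columns.
q1 = [{"OrderID": 1, "Amount": 120.0}, {"OrderID": 2, "Amount": 80.0}]
q2 = [{"OrderID": 3, "Amount": 200.0}]

stacked = union({"Q1_Sales": q1, "Q2_Sales": q2})
for row in stacked:
    print(row)
```

The result has as many rows as both inputs together, the same original columns, and one extra label column you can filter on.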
Union vs. Join: The Critical Difference
One of the most common points of confusion for those new to data prep is the difference between a union and a join. Mixing them up can lead to messy data and incorrect analysis. The distinction is straightforward once you grasp the core purpose of each.
Here’s the difference at a glance:
- Unions add rows. Use a union when you have tables with the same columns that you want to stack vertically. The goal is to make your table longer.
- Joins add columns. Use a join when you have tables with different columns that share a common field (a key) and you want to combine them horizontally. The goal is to make your table wider.
A Practical Example of a Union
Imagine you have separate CSV files for sales data from each quarter of the year: Q1_Sales.csv, Q2_Sales.csv, Q3_Sales.csv, and Q4_Sales.csv. Each file has the exact same columns: OrderID, SaleDate, Product, and Amount.
To analyze the entire year's sales, you would union these files. The result would be a single, long table containing all the rows from all four quarterly files, allowing you to see sales trends throughout the year.
A Practical Example of a Join
Now, let’s say you have two different tables. One is a Customers table with the columns CustomerID and CustomerName. The other is an Orders table with the columns OrderID, CustomerID, and OrderTotal.
You want to see which customer placed each order. To do this, you would join these two tables using the common CustomerID field. The result is a new, wider table that includes columns from both tables: CustomerID, CustomerName, OrderID, and OrderTotal.
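For contrast with a union, the join described above can be sketched in a few lines of Python. The column names come from the example; the customer names, IDs, and totals are invented sample values, and the sketch assumes an inner join on CustomerID.

```python
# Illustrative sample data using the columns from the example above.
customers = [
    {"CustomerID": 1, "CustomerName": "Avery"},
    {"CustomerID": 2, "CustomerName": "Blake"},
]
orders = [
    {"OrderID": 101, "CustomerID": 1, "OrderTotal": 50.0},
    {"OrderID": 102, "CustomerID": 2, "OrderTotal": 75.0},
    {"OrderID": 103, "CustomerID": 1, "OrderTotal": 20.0},
]

# Index customers by the shared key so each order can look up its customer.
customers_by_id = {c["CustomerID"]: c for c in customers}

# Inner join: keep orders with a matching customer and merge the columns.
joined = [
    {**customers_by_id[o["CustomerID"]], **o}
    for o in orders
    if o["CustomerID"] in customers_by_id
]

for row in joined:
    print(row)
```

Each joined row is wider than a row from either input, because it carries columns from both tables. A union of the same data would add rows and leave the columns unchanged.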
When to Use a Union: 4 Common Scenarios
The "when" is more important than the "how." Knowing the right situations to use a union will make your data preparation workflow much faster and more effective. Here are four scenarios where a union is the ideal solution.
1. Combining Time-Based Data Files
This is arguably the most common use case. Many reporting systems and manual processes export data in time-based chunks. You might have daily, weekly, monthly, or yearly files that you need to piece together for long-term analysis.
- Example: A marketing team downloads a Google Ads performance report on the first of every month. After a year, they have 12 separate files (Jan-Ads.csv, Feb-Ads.csv, etc.). To build a dashboard showing month-over-month ad performance across the year, they use a union to stack all 12 files on top of one another.
2. Consolidating Data from Different Regions or Departments
Larger organizations often segment their data by department, business unit, or geographic region. Each division might maintain its own sales, inventory, or operations data in separate files, but they all follow a standardized company format.
- Example: A retail company has sales data for its North American, European, and Asian markets in three separate spreadsheet tabs: NA_Sales, EU_Sales, and APAC_Sales. To create a global sales dashboard for the executive team, the analyst uses a union to combine the data from these three tabs into a single dataset.
3. Data Split Across Multiple Spreadsheet Tabs
Sometimes, data isn't in separate files but simply in different tabs within the same Excel workbook. This is common with survey results, event registration lists, or manually tracked logs, where each tab might represent a different month, event, or category.
- Example: An event manager tracks registrant information in a single Excel file, with a separate sheet for each event: "Spring Conference," "Summer Webinar," and "Fall Workshop." All sheets have the same columns (Name, Email, Company, Registration Date). Using a union in Tableau lets them easily combine all registrants into one master list for comprehensive analysis.
4. Using "Wildcard" Unions for Efficiency
What if you have hundreds of daily log files to analyze? Dragging and dropping each one would be a nightmare. This is where Tableau’s wildcard union shines. A wildcard union lets you automatically union all files in a folder that match a specific naming pattern.
- Example: A web server generates daily traffic logs named log_2023-10-01.csv, log_2023-10-02.csv, and so on. Instead of adding each file manually, you can set up a wildcard union to find all files that match log_*.csv. Tableau will automatically grab every file that fits this pattern and stack them together. If a new file is added to the folder tomorrow that matches the pattern, Tableau will include it the next time the data is refreshed.
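To get a feel for which files a pattern like log_*.csv would pick up, here is a small Python sketch using the standard library's `fnmatch` module. It only approximates the idea: Python's matching rules are not guaranteed to be identical to Tableau's, and the folder listing is hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical folder listing; only names matching the pattern would be stacked.
filenames = [
    "log_2023-10-01.csv",
    "log_2023-10-02.csv",
    "notes.txt",
    "summary_2023.csv",
]

pattern = "log_*.csv"

# "*" stands for any run of characters, so every daily log matches.
matched = [name for name in filenames if fnmatchcase(name, pattern)]
print(matched)
```

Running a quick check like this on your file names before building the union can help you spot files that would be included or excluded by accident.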
A Step-by-Step Guide to Creating a Union in Tableau
Bringing tables together with a union in Tableau Desktop is a straightforward, visual process. Here’s how you do it.
Step 1: Connect to Your Data
First, open Tableau and connect to your data source. This could be a Microsoft Excel file, a collection of text (.csv) files in a folder, a Google Sheet, or another supported connector.
Step 2: Drag Your First Table to the Canvas
From the Data Source page, you'll see your available sheets or tables in the left-hand pane. Drag the first file (e.g., Sales_January) onto the logical canvas where it says "Drag sheets here."
Step 3: Add More Tables to the Union
Now, drag your second table (e.g., Sales_February) from the left pane and hover it directly below the first table. You will see an orange drag-and-drop target that says "Drag table to union." Release the table there. Tableau will "stack" the data, and an orange union icon will appear on your combined table to show it's a union.
You can repeat this process for any additional tables or sheets you need to include.
Step 4: Review the Combined Data
Once you’ve unioned your tables, look at the data grid at the bottom of the screen. You should notice a new column called Table Name (or Sheet) has been added. This column identifies the original source of each row, which is essential for verifying your work and for later analysis.
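One simple way to verify a union is to count rows per source and compare the numbers with your original files. The Python sketch below shows the idea on made-up rows; the "Table Name" key mimics the label column described above, and the values are illustrative only.

```python
from collections import Counter

# Illustrative unioned rows; "Table Name" mimics the label column Tableau adds.
rows = [
    {"Table Name": "Sales_January", "Amount": 10.0},
    {"Table Name": "Sales_January", "Amount": 25.0},
    {"Table Name": "Sales_February", "Amount": 40.0},
]

# Count how many rows each source contributed to the combined table.
rows_per_table = Counter(row["Table Name"] for row in rows)

for table_name, count in rows_per_table.items():
    print(table_name, count)
```

If a source shows fewer rows than the file actually contains, or is missing entirely, that is a sign the union did not pick it up as expected.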
An Easier Way: Using the Wildcard Union
For folders with many consistently named files:
- On the Data Source tab, find the "New Union" button in the left-hand pane and drag it onto the canvas.
- A dialog box will appear. Select the "Wildcard" option.
- Define your search criteria. You might include all sheets in a workbook or search for files in a directory that match a pattern, like Sales_*-2023.csv to get all of the sales files from 2023.
- Click "OK," and Tableau will automatically find and union all matching files.
Common Union Problems and How to Fix Them
While unions are powerful, you may encounter a few common snags. Here’s what to look out for.
- Mismatched Column Names: If Sales_January has a column named "SaleAmount" and Sales_February has "Amount_Sold," Tableau will treat these as two separate columns, leading to null values. The fix: Rename the columns to be identical in the source files, or, after creating the union, select both columns in the data grid and choose Merge Mismatched Fields.
- Different Data Types: If one file's "Date" column is stored as text and another's is a proper date format, the combined field may not behave as a date in your analysis. The fix: The best solution is to standardize the data type in the source files. If you can't, you can try changing the data type directly in Tableau after the union is made.
- Unintentional Duplicates: A union stacks everything. If a few rows of data from January accidentally appear in your February file, the union will create duplicate records. The fix: Be mindful of your source data. You may need to clean it before bringing it into Tableau or use calculated fields and filters later to handle potential duplicates.
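The first and last of these problems are easy to reproduce in a small Python sketch. This is a conceptual model, not Tableau's behavior in detail: the `union` helper is hypothetical, the rows are invented, and the duplicate check assumes OrderID is meant to be unique.

```python
from collections import Counter


def union(tables):
    """Stack rows, filling any column a table lacks with None (a null)."""
    columns = []
    for rows in tables:
        for row in rows:
            for col in row:
                if col not in columns:
                    columns.append(col)
    return [{col: row.get(col) for col in columns} for rows in tables for row in rows]


january = [{"OrderID": 1, "SaleAmount": 100.0}]
february = [{"OrderID": 2, "Amount_Sold": 150.0}]

# Mismatched names produce two half-empty columns.
mismatched = union([january, february])
print(mismatched)

# Fix: rename the February column so both tables agree before stacking.
february_fixed = [
    {("SaleAmount" if col == "Amount_Sold" else col): value for col, value in row.items()}
    for row in february
]
fixed = union([january, february_fixed])
print(fixed)

# Duplicates: a February order that also slipped into the January file.
january_with_stray = january + [{"OrderID": 2, "SaleAmount": 150.0}]
stacked = union([january_with_stray, february_fixed])
id_counts = Counter(row["OrderID"] for row in stacked)
duplicate_ids = [order_id for order_id, n in id_counts.items() if n > 1]
print(duplicate_ids)
```

The same two checks, looking for columns that are mostly null and for repeated IDs, are worth running on any unioned data source before you build on it.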
Final Thoughts
Unions are essential for vertical data integration in Tableau, allowing you to stack datasets with the same structure into a single, comprehensive table for analysis. Use them whenever you're faced with data split across different files or sheets by time, region, or category, and remember they make your tables longer - unlike joins, which make them wider.
Of course, prepping data and making sure all your sources are correctly combined is often the most time-consuming part of analytics. We built Graphed to remove this manual work. You can connect all your sales and marketing platforms in seconds - whether it's Shopify, Google Analytics, or Salesforce - and then just describe what you want to see. For instance, ask "Combine my Facebook and Google Ads data and show me total spend vs. conversions by campaign," and Graphed builds a live dashboard for you automatically, saving you from the hassle of both joins and unions.