How to Make a Sankey Diagram in Tableau with AI

Cody Schneider

Creating a Sankey diagram in Tableau is a powerful way to visualize flow, but it's also famous for being one of the trickiest charts to build. The process often involves complicated data preparation and a maze of table calculations. This article will walk you through a much simpler method, showing you how to leverage AI to handle the difficult data prep so you can focus on building the actual visualization in Tableau.

What is a Sankey Diagram and Why Should You Use One?

A Sankey diagram is a type of flow chart where the width of each stream is proportional to the quantity of the flow. Think of it like visualizing a river system: wider rivers carry more water. In business analytics, instead of water, you're tracking customers, revenue, website traffic, or budget allocations.

They are exceptionally good at answering questions about complex processes, showing how a single value is distributed through different stages. Here are a few practical examples:

  • Website Analytics: Visualize how visitors arrive on your site (e.g., Organic Search, Social Media, Paid Ads) and which pages they navigate to from there. This helps you see which channels are driving traffic to your most important content.

  • Customer Journey Mapping: Track how customers move through your sales funnel, from their first touchpoint (like an ad) through to conversion (like a purchase). You can instantly spot drop-off points or the most valuable paths.

  • Budget & Expense Tracking: Show how a company-wide budget is allocated across different departments and then further broken down into specific projects or expense categories.

Unlike a pie chart or bar chart that shows simple distributions, a Sankey tells a story about movement and transformation, making it a favorite for process analysis.

The Old-School Challenge: Building a Sankey Manually

If you've ever tried to build a Sankey in Tableau without a guide, you likely know the frustration. Tableau doesn't offer a "Sankey" chart in its "Show Me" panel. To create one manually, you have to reshape your data in a very specific way. This process, often called "data densification," involves:

  • Duplicating your dataset to create points for an origin and a destination.

  • Creating dozens of extra rows (called "padding") between each actual data point to give Tableau enough points to draw a smooth curve.

  • Writing complex calculations in Tableau like the Sigmoid function (to plot the 'S' curve), T-values (to position points along the curve), and various nested table calculations to rank and size the flows correctly.

This is a time-consuming and error-prone process. It requires a deep understanding of advanced Tableau features and a lot of patience. One small error in your Excel prep or a slight mistake in a Tableau calculation can break the entire visualization.

Simplifying the Process: Using AI for Data Preparation

The single biggest hurdle in building a Sankey diagram is preparing the data. This is where AI completely changes the game. Instead of spending hours in spreadsheets or with Tableau Prep, you can use an AI data analysis tool to automate the entire data restructuring process with simple, natural language prompts.

The goal is to turn a simple source-destination table into a densified table that Tableau can use to plot the curves. Here's what that process looks like:

Step 1: Start With Simple Raw Data

Your initial dataset only needs to contain three things: a starting point (your source or dimension 1), an ending point (your destination or dimension 2), and a value that determines the size of the flow. For a website traffic analysis, it would look something like this:

Source_Channel  | Landing_Page  | Sessions
Organic Search  | /blog         | 1200
Paid Social     | /pricing      | 850
Email Marketing | /features/new | 600
Direct          | /homepage     | 1500
Organic Search  | /pricing      | 900

Step 2: Ask an AI Tool to Reshape the Data

With a modern AI analytics tool, you can simply connect your data source and describe your goal in plain English. For example, you might ask:

“Reshape this data for a Sankey diagram showing the flow of sessions from Source_Channel to Landing_Page.”

The AI understands the logic behind data densification. Behind the scenes, it performs the complex steps that you used to have to do manually:

  • It duplicates each record to establish start and end points for the flow.

  • It generates all the "padding" rows needed. For each original row, it might create 49 or 99 new rows to form a smooth curve.

  • It automatically creates a modeling field, often called T or Padded, which contains the values (-6 to 6, or 1 to 49, for example) needed for the sigmoid function calculation within Tableau.
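To make the reshaping concrete, here is a minimal Python sketch of the densification described above. It is not the output of any particular AI tool: the function name `densify`, the `Path` numbering, and the choice of 49 points per flow (T from -6 to 6 in steps of 0.25) are assumptions made for illustration.

```python
# Hypothetical raw rows matching the example table:
# (Source_Channel, Landing_Page, Sessions)
raw_rows = [
    ("Organic Search", "/blog", 1200),
    ("Paid Social", "/pricing", 850),
    ("Email Marketing", "/features/new", 600),
    ("Direct", "/homepage", 1500),
    ("Organic Search", "/pricing", 900),
]


def densify(rows, t_min=-6.0, t_max=6.0, step=0.25):
    """Expand each flow into one row per T value between t_min and t_max."""
    # Number of points per flow, endpoints included (49 for the defaults).
    n_points = int(round((t_max - t_min) / step)) + 1
    dense = []
    for path_id, (source, dest, value) in enumerate(rows, start=1):
        for i in range(n_points):
            dense.append({
                "Source_Channel": source,
                "Landing_Page": dest,
                "Sessions": value,
                "Path": path_id,          # assumed: one id per original flow
                "T": t_min + i * step,    # position along the curve
            })
    return dense


dense_rows = densify(raw_rows)
print(len(dense_rows))  # 5 flows x 49 points = 245 rows
```

Each original row keeps its dimensions and measure, and only `Path` and `T` are new. A smaller step gives a smoother curve at the cost of more rows.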

Step 3: Export Your Tableau-Ready Data

The AI tool's output is a new, perfectly structured table that is ready to be loaded directly into Tableau. It will look something like this, expanding our simple 5-row example into hundreds of rows:

Source_Channel | Landing_Page | Sessions | Path | T
Organic Search | /blog        | 1200     | 1    | -6.0
Organic Search | /blog        | 1200     | 1    | -5.75
...            | ...          | ...      | ...  | ...
Organic Search | /blog        | 1200     | 1    | 6.0
Paid Social    | /pricing     | 850      | 2    | -6.0
...            | ...          | ...      | ...  | ...

The data reshaping you'd normally labor over is already done. All the complex data preparation has been handled in seconds, letting you jump straight into the fun part: building the chart in Tableau.

Step-by-Step Guide to Building the Chart in Tableau

Once you have your AI-prepared data, creating the Sankey diagram in Tableau becomes remarkably straightforward. Follow these steps.

1. Connect to Your Data

Open Tableau and connect to the new data file generated by your AI tool. You should see all the fields, including your dimensions (e.g., Source_Channel, Landing_Page), your measure (Sessions), and the special fields created by the AI (like T).

2. Create the Necessary Calculated Fields

You still need a few calculated fields within Tableau to draw the curves and define their size, but these are simple formulas now that the hard work is done.

Create the following four calculated fields:

i. Index

This will help us position elements. It's a simple calculation.

ii. Sigmoid_Curve

This formula uses the T field from your prepped data to create the 'S' shape of the flow lines.
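For intuition about the shape this field produces, the sketch below evaluates the standard logistic (sigmoid) function in Python. It illustrates the math only and is not the Tableau formula itself. It assumes T runs from -6 to 6, as in the prepared data.

```python
import math


def sigmoid(t):
    """Logistic function: maps any real T to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-t))


# At the ends of the T range the curve is nearly flat, and it rises
# steeply around T = 0, which produces the 'S' shape.
for t in (-6.0, -3.0, 0.0, 3.0, 6.0):
    print(t, round(sigmoid(t), 4))
```

Because the function is already very close to 0 at T = -6 and very close to 1 at T = 6, that range is wide enough for the curve to look flat where it meets the source and destination.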

iii. Flow_Size

This calculation sets the thickness of each flow based on your measure (in this case, Sessions).

Note: This Level of Detail (LOD) expression calculates the percentage of the total for each flow, which helps in scaling the lines appropriately.
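The percent-of-total idea can be checked outside Tableau with a few lines of Python. This is a sketch of the arithmetic only, using the five example flows. The variable names are hypothetical, and the code does not reproduce Tableau's LOD syntax.

```python
# Example flows from the raw data: (Source_Channel, Landing_Page) -> Sessions
flows = {
    ("Organic Search", "/blog"): 1200,
    ("Paid Social", "/pricing"): 850,
    ("Email Marketing", "/features/new"): 600,
    ("Direct", "/homepage"): 1500,
    ("Organic Search", "/pricing"): 900,
}

# Grand total across every flow (the denominator for each share).
total_sessions = sum(flows.values())

# Each flow's share of the total, used to scale line thickness.
flow_share = {pair: sessions / total_sessions for pair, sessions in flows.items()}

for pair, share in flow_share.items():
    print(pair, f"{share:.1%}")
```

Note that the total must be computed over the original five flows, not over the densified rows, or each flow would be counted once per padding row.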

iv. Ranked_Position

This is a more advanced calculation that positions each curve correctly on the Y-axis so they don't overlap. It ensures thinner streams are stacked properly on top of thicker streams.
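One common way to think about this positioning is to stack the flows once by source and once by destination, then use the sigmoid to slide each flow from its source-side position to its destination-side position. The Python sketch below follows that idea under simplifying assumptions: alphabetical stacking order, positions measured at each flow's lower edge, and hypothetical helper names. It is not the Tableau table calculation itself.

```python
import math

# (Source_Channel, Landing_Page, Sessions) from the example data
flows = [
    ("Organic Search", "/blog", 1200),
    ("Paid Social", "/pricing", 850),
    ("Email Marketing", "/features/new", 600),
    ("Direct", "/homepage", 1500),
    ("Organic Search", "/pricing", 900),
]


def sigmoid(t):
    """Logistic function, between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-t))


def stacked_offsets(rows, key_index):
    """Lower edge of each flow when flows are stacked by one dimension."""
    total = sum(value for _, _, value in rows)
    # Stack in alphabetical order of the chosen dimension (an assumption).
    order = sorted(range(len(rows)), key=lambda i: (rows[i][key_index], i))
    offsets = [0.0] * len(rows)
    running = 0.0
    for i in order:
        offsets[i] = running
        running += rows[i][2] / total  # advance by this flow's share
    return offsets


left = stacked_offsets(flows, 0)   # positions on the source side
right = stacked_offsets(flows, 1)  # positions on the destination side


def curve_position(flow_index, t):
    """Vertical position of a flow at a given T along the curve."""
    start, end = left[flow_index], right[flow_index]
    return start + (end - start) * sigmoid(t)


print(round(curve_position(0, -6.0), 3), round(curve_position(0, 6.0), 3))
```

Because every flow gets its own slot on each side, the curves start and end without overlapping, which is the behavior the ranking calculation is meant to guarantee.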

3. Build the Visualization

Now, let's assemble the calculated fields on the worksheet.

  1. Drag your T field from the Data Pane to Columns. Right-click it and ensure it's set to Dimension.

  2. Drag your Sigmoid_Curve calculated field to Rows.

  3. Change the chart type in the Marks card from 'Automatic' to Line.

  4. Drag your source and destination dimensions (Source_Channel and Landing_Page) to the Detail button on the Marks card. This will create a separate line for each flow. You should see a set of straight lines.

  5. Drag your Ranked_Position calculated field onto the Rows shelf, right next to Sigmoid_Curve.

  6. Right-click on the Ranked_Position pill in Rows, go to Compute Using, and select your destination field (Landing_Page). Do the same for the Sigmoid_Curve pill.

  7. Drag your Flow_Size field to the Size button on the Marks card. Right-click it, go to Compute Using, and select Landing_Page. The lines should now have varying thicknesses. Voilà! You have the core of a Sankey diagram.

4. Add Source and Destination Bars (Nodes)

A true Sankey needs bars on the left and right to represent the total flow in and out of each category.

  1. Drag Ranked_Position to the Rows shelf again to create a second chart.

  2. In the Marks card for this new chart (you’ll see Ranked_Position (2)), change the chart type to Gantt Bar.

  3. Remove T from the Columns shelf for this Gantt chart view only.

  4. Create one final calculated field named Neg One with the value -1. Drag this new field to Size on the Marks card for the Gantt chart.

  5. Right-click the second Ranked_Position pill on the Rows Shelf and select Dual Axis. Synchronize the axes by right-clicking the right-side axis and selecting Synchronize Axis.

  6. Now, in the Marks card for the Gantt chart, drag Source_Channel and then Flow_Size to Label. This shows the category and value of each flow. Then, to create an end to your Sankey, drag Sigmoid_Curve to Columns. Right-click the filter, open the Special tab, and select "Non-null values".

  7. Finally, polish your visualization by hiding headers, removing grid lines, and adding colors to your line chart based on your Source_Channel field to make it easy to follow the flows. Adjust your tooltips to show useful information on hover.

Final Thoughts

Sankey diagrams offer unparalleled insight into flow-based data, but their technical complexity in Tableau has historically made them inaccessible for many. By offloading the cumbersome data densification and preparation steps to an AI tool, the entire process becomes faster, more accurate, and far less intimidating. You get to skip straight to the visualization and analysis.

Automating away this kind of manual reporting work is exactly why we created Graphed. We connect directly to your marketing and sales data sources (like Google Analytics, Salesforce, Shopify, and others) and allow you to prepare data, build reports, and create entire dashboards using simple, conversational language. Instead of spending hours wrangling data for one visualization, you can get a complete, real-time-updated dashboard answering your business questions in just a few seconds.