How to Winsorize Data in Excel

Cody Schneider · 6 min read

Outliers in your dataset can skew your analysis and lead to misleading conclusions. Instead of just deleting these extreme values, you can use a technique called Winsorizing to tame their impact without throwing away the data. In this tutorial, you'll learn a simple, step-by-step method to Winsorize your data directly in Microsoft Excel.

What is Winsorizing and When Should You Use It?

Winsorizing is a statistical method for handling outliers. Instead of removing extreme values from your dataset (a process known as trimming), you replace them with a less extreme value. Specifically, you cap the outliers at a certain percentile of the data.

For example, if you perform a 90% Winsorization on your data, you're doing two things:

  • Any data point below the 5th percentile is replaced by the value at the 5th percentile.
  • Any data point above the 95th percentile is replaced by the value at the 95th percentile.

The middle 90% of your data remains unchanged. This approach keeps the dataset the same size but pulls the most extreme observations in, reducing their ability to distort statistical measures like the mean and standard deviation.
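
To make the contrast with trimming concrete outside Excel, here is a minimal Python sketch. The ten values and the two caps are hand-picked for illustration only; they are assumptions, not percentiles computed from real data.

```python
# Hypothetical sample with one very low and one very high value.
data = [2, 48, 50, 51, 52, 53, 54, 55, 57, 500]

# Assumed caps for illustration; in practice these come from percentiles.
floor_value, ceiling_value = 48, 57

# Trimming drops anything outside the caps, so the dataset shrinks.
trimmed = [v for v in data if floor_value <= v <= ceiling_value]

# Winsorizing replaces out-of-range values with the nearest cap,
# so the dataset keeps its original size.
winsorized = [min(max(v, floor_value), ceiling_value) for v in data]

print(len(data), len(trimmed), len(winsorized))
print(winsorized)
```

The trimmed list has lost two observations, while the Winsorized list still has ten, with the two extremes pulled in to the caps.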

Why Not Just Delete Outliers?

Sometimes an outlier is a genuine, albeit unusual, data point. Removing it might mean losing valuable information. Other times, an outlier is the result of a measurement or input error. Winsorizing offers a conservative middle ground. It acknowledges the presence of the extreme value but smoothly lessens its influence, leading to more robust statistical analysis, especially in smaller datasets where every data point counts.

You should consider using this technique when:

  • Working with financial data, where a single extreme stock return or sales event can skew monthly averages.
  • Analyzing survey results where some respondents may have given unusually high or low answers.
  • Cleaning experimental data that might contain measurement errors.

Getting Your Data Ready in Excel

Before you begin, make sure your data is organized in a single column. It’s also a great practice to work on a copy of your dataset. Create a duplicate of your worksheet or copy your data column to a new sheet to ensure your original data remains untouched. This gives you a safety net in case you make a mistake or want to refer back to the raw numbers.

For this tutorial, let’s imagine we have a dataset of 150 customer order values in column A, from cell A2 to A151.

How to Winsorize Data: The Step-by-Step Method

We'll use a combination of the PERCENTILE.INC and IF functions to accomplish this. PERCENTILE.INC is available in Excel 2010 and later; in older versions, the legacy PERCENTILE function takes the same arguments and returns the same result.

Step 1: Decide on Your Winsorization Level

First, determine what percentage of data you want to cap at the top and bottom. A common choice is 5% on each end (for a 90% total Winsorization), but you might choose 1% or 10% depending on how extreme your outliers are.

For our example, we will stick with capping the bottom 5% and top 5% of our data.
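
The arithmetic linking the total Winsorization level to the two percentiles can be sketched in a few lines of Python. The function name `tail_percentiles` is hypothetical, and the sketch assumes the capped share is split evenly between the two tails, as in this tutorial.

```python
def tail_percentiles(kept_percent):
    """Return (lower, upper) percentiles for a symmetric Winsorization.

    kept_percent is the share of data left unchanged, e.g. 90 for a
    90% Winsorization. Assumes equal capping on both tails.
    """
    tail = (100 - kept_percent) / 2
    return tail, 100 - tail

print(tail_percentiles(90))  # lower and upper percentiles for this tutorial
print(tail_percentiles(98))  # a lighter 1% cap on each end
```

Divide each result by 100 to get the second argument that PERCENTILE.INC expects, for example 0.05 and 0.95.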

Step 2: Calculate the Percentile Thresholds

Next, we need to find the specific values in our dataset that correspond to our chosen percentiles. Excel's PERCENTILE.INC function is well suited to this. It accepts any percentile k from 0 to 1 inclusive and, when the requested percentile falls between two data points, interpolates linearly between them, so the thresholds it returns may not be values that actually appear in your data.

Click on an empty cell (let's say D2) to calculate the lower bound (the 5th percentile). Use this formula:

=PERCENTILE.INC(A2:A151, 0.05)

  • A2:A151 is the range containing our data. Be sure to adjust this to match your own data range.
  • 0.05 represents the 5th percentile.

Now, in another empty cell (like D3), calculate the upper bound (the 95th percentile) with this formula:

=PERCENTILE.INC(A2:A151, 0.95)

You now have the exact values that will serve as the "floor" and "ceiling" for your dataset.
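
If you want to cross-check the thresholds outside Excel, the following Python sketch is intended to mirror the linear interpolation that PERCENTILE.INC performs. The function name `percentile_inc` and the sample data (the integers 1 to 150) are assumptions made for illustration, not part of the tutorial's dataset.

```python
import math

def percentile_inc(values, k):
    """Percentile with linear interpolation, k between 0 and 1 inclusive."""
    if not values or not 0 <= k <= 1:
        raise ValueError("need a non-empty list and 0 <= k <= 1")
    ordered = sorted(values)
    # Position of the requested percentile on a scale of n - 1 steps.
    position = k * (len(ordered) - 1)
    lower_index = math.floor(position)
    upper_index = min(lower_index + 1, len(ordered) - 1)
    fraction = position - lower_index
    # Interpolate between the two neighboring sorted values.
    return ordered[lower_index] + fraction * (ordered[upper_index] - ordered[lower_index])

# Hypothetical stand-in for the 150 order values in A2:A151.
orders = list(range(1, 151))
lower_bound = percentile_inc(orders, 0.05)  # plays the role of cell D2
upper_bound = percentile_inc(orders, 0.95)  # plays the role of cell D3
print(lower_bound, upper_bound)
```

For this made-up data the bounds land between two neighboring values (roughly 8.45 and 142.55), which shows that a threshold does not have to be a value present in the column.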

Step 3: Apply the Winsorizing Logic with an IF Formula

Now we will create a new column with the Winsorized data. The logic we need to apply for each cell is as follows:

  1. Check if the order value is less than our lower bound (the 5th percentile value).
  2. If it is, replace it with the lower bound value.
  3. If it's not, then check if the value is greater than our upper bound (the 95th percentile value).
  4. If it is, replace it with the upper bound value.
  5. If it is neither, then just keep the original value.

We can translate this logic perfectly using a nested IF statement. In cell B2 (right next to your first data point), type the following formula:

=IF(A2<$D$2, $D$2, IF(A2>$D$3, $D$3, A2))

Important Note: The dollar signs ($) in $D$2 and $D$3 create an absolute reference. This locks those cell references, so when you drag the formula down, the references to the threshold values won't change.
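
The same decision logic can be sketched in Python to show what the nested IF does to each value. The function name `winsorize_value`, the two bounds, and the sample values are all hypothetical.

```python
def winsorize_value(value, lower_bound, upper_bound):
    """Mirror =IF(A2<$D$2, $D$2, IF(A2>$D$3, $D$3, A2)) for one value."""
    if value < lower_bound:
        return lower_bound   # below the floor: replace with the floor
    if value > upper_bound:
        return upper_bound   # above the ceiling: replace with the ceiling
    return value             # otherwise keep the original value

# Hypothetical thresholds standing in for cells D2 and D3.
lower_bound, upper_bound = 8.45, 142.55

samples = [1, 8.45, 75, 142.55, 900]
capped = [winsorize_value(v, lower_bound, upper_bound) for v in samples]
print(capped)

# A min/max clamp gives the same result as the nested condition.
clamped = [max(lower_bound, min(upper_bound, v)) for v in samples]
print(capped == clamped)
```

The second half reflects a design choice you can also make in Excel: as long as the lower bound does not exceed the upper bound, =MAX($D$2, MIN($D$3, A2)) produces the same result as the nested IF.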

Step 4: Create the Winsorized Data Column

With the formula entered in cell B2, simply click the small green square (the "fill handle") at the bottom-right corner of the cell and drag it all the way down to the end of your dataset (cell B151 in our example). You now have a complete, Winsorized dataset in Column B, ready for analysis.

Verifying Your Results

A quick sanity check is always a good idea. You can compare some basic descriptive statistics between your original and new datasets to see the effect of Winsorizing. Let’s calculate the Minimum, Maximum, Average, and Standard Deviation for both columns.

In some empty cells, you can use these formulas:

  • =MIN(A2:A151)
  • =MIN(B2:B151)
  • =MAX(A2:A151)
  • =MAX(B2:B151)
  • =AVERAGE(A2:A151)
  • =AVERAGE(B2:B151)
  • =STDEV.S(A2:A151)
  • =STDEV.S(B2:B151)

You should notice a few things:

  • The minimum value in your Winsorized data (Column B) is now equal to your 5th percentile threshold (the value in D2).
  • The maximum value in your Winsorized data is now equal to your 95th percentile threshold (the value in D3).
  • The average in Column B will likely have shifted slightly toward the center of your data.
  • The standard deviation (a measure of spread) in Column B will be smaller than in Column A whenever at least one value was capped, because you've reduced the impact of the most extreme values.

Seeing these changes confirms that the Winsorization process worked correctly.
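
The same before-and-after comparison can be sketched in Python with the standard library, where `statistics.stdev` is the sample standard deviation, the counterpart of STDEV.S. The ten values and the two caps below are hypothetical and chosen only to make the effect easy to see.

```python
import statistics

# Hypothetical original column with one low and one high outlier.
original = [3, 40, 42, 45, 47, 50, 52, 55, 58, 400]

# Assumed caps standing in for the thresholds in D2 and D3.
floor_value, ceiling_value = 40, 58
winsorized = [max(floor_value, min(ceiling_value, v)) for v in original]

# Print min, max, mean, and sample standard deviation for each column.
for label, column in (("original", original), ("winsorized", winsorized)):
    print(
        label,
        min(column),
        max(column),
        statistics.mean(column),
        round(statistics.stdev(column), 2),
    )
```

In this sketch the Winsorized minimum and maximum equal the two caps, and both the mean and the standard deviation drop once the two extremes are pulled in.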

Final Thoughts

Winsorizing your data is an effective way to manage outliers in Excel without resorting to deleting potentially useful observations. By using a straightforward combination of PERCENTILE.INC and nested IF functions, you can quickly create a new dataset that is more resilient to extreme values, paving the way for more accurate and reliable analysis.

Cleaning and preparing data in a spreadsheet is a crucial first step, but it’s often just the beginning. The real goal is to turn that information into insights. To help automate that process, we built Graphed. After tidying up your data, you can connect your spreadsheets — or live data sources like Google Analytics and Shopify — and start building dashboards just by describing what you want to see. Instead of manual chart-building, you get real-time reports in seconds, letting you move straight from clean data to clear insights.
