How to Run Statistical Analysis in Excel
Excel is more than just a tool for organizing data in rows and columns, it’s a surprisingly powerful platform for running real statistical analysis. You can use it to uncover trends, test hypotheses, and make data-driven decisions without needing specialized software. This article will show you exactly how to unlock and use Excel's statistics features, covering everything from basic descriptive summaries to more advanced regression analysis.
Getting Started: Activating the Analysis ToolPak
Before you can perform most statistical analyses, you need to enable a free, built-in Excel add-in called the Analysis ToolPak. It contains the tools you'll need for everything we're about to cover. Most users don't have it turned on by default, but activating it takes less than a minute.
For Windows Users:
Open Excel and click on File in the top-left corner, then select Options at the bottom of the left-hand menu.
In the Excel Options window, click Add-ins on the left.
At the bottom of the window, you'll see a dropdown menu next to "Manage." Make sure it says "Excel Add-ins," and then click Go...
Check the box next to Analysis ToolPak and click OK.
For Mac Users:
Open Excel and go to the Tools menu at the top of your screen.
Select Excel Add-ins... from the dropdown menu.
In the window that appears, check the box next to Analysis ToolPak and click OK.
Once activated, a new "Data Analysis" button will appear in the Data tab on your Excel ribbon. This is where you'll find all the statistical tests.
Descriptive Statistics: Your Data’s Fingerprint
The first step in any analysis is understanding the basics of your dataset. Descriptive statistics summarize the main features of your data, giving you a high-level overview in just a few clicks. It gives you values like the mean (average), median, mode, and standard deviation.
Let's use an example. Imagine you have monthly data for an e-commerce store:
Month: Jan, Feb, Mar, etc.
Ad Spend ($): The amount spent on ads.
Sales ($): Total revenue generated.
You want to quickly understand the general performance of your monthly sales. Descriptive statistics are perfect for this.
How to Run Descriptive Statistics:
Click the Data tab, then click the new Data Analysis button on the far right.
A menu will pop up. Scroll down and select Descriptive Statistics, then click OK.
Input Range: Click into this box, then click and drag to select the data you want to analyze. In our example, you'd select the "Sales ($)" column, including the header.
Grouped By: Since our data is in a single column, you'll leave this as "Columns."
Labels in First Row: Check this box since you included the "Sales ($)" header in your selection. This tells Excel not to treat your header as a number.
Output Options: Choose where you want Excel to place the results. A "New Worksheet Ply" is usually the cleanest option. Give the new sheet a name if you like.
Check "Summary statistics": This is the most important part!
Click OK.
Interpreting the Results
Excel will instantly generate a table of results. Here’s what the most important metrics mean:
Mean: The average sales per month. This is your typical monthly performance.
Median: The middle value. If you have some outlier months (like a huge Black Friday sale), the median can give you a better sense of a "normal" month than the mean.
Standard Deviation: This measures how spread out your data is. A low standard deviation means your sales are pretty consistent month-to-month. A high standard deviation means your sales fluctuate significantly.
Minimum & Maximum: Your best and worst sales months on record.
Count: The number of data points (in this case, months) analyzed.
With just this one action, you’ve moved beyond looking at a plain list of numbers and can now formally describe your sales performance. You can quickly answer questions like, "What was our average monthly revenue?" and "How consistent have our sales been?"
Analyzing Relationships with Correlation and Regression
Descriptive statistics are great for looking at one variable. But what if you want to know how two variables relate to each other? For example, "Does our ad spend actually impact our sales?" This is where correlation and regression come in.
Correlation: Finding the Connection
Correlation tells you if two variables move together and in which direction. The result is a number between -1 and 1 (the correlation coefficient).
A value near +1: Indicates a strong positive correlation (as one variable goes up, the other tends to go up). Example: As ad spend rises, sales rise.
A value near -1: Indicates a strong negative correlation (as one variable goes up, the other tends to go down). Example: As customer complaints increase, customer retention decreases.
A value near 0: Indicates little to no correlation.
To run a correlation analysis:
Go to Data > Data Analysis.
Select Correlation and click OK.
For the Input Range, select all your numeric data columns at once, including their labels. For our example, you'd select both the "Ad Spend" and "Sales" columns.
Ensure Labels in First Row is checked.
Choose your output location and click OK.
Excel will generate a small table showing the correlation coefficient between your variables. If the value for Ad Spend and Sales is 0.85, you can say there is a strong positive relationship between them.
Linear Regression: Predicting Outcomes
Regression analysis takes it a step further. It doesn’t just tell you if two variables are related, it helps you model that relationship and make predictions. It tries to find the best-fitting straight line through your data points to describe how a dependent variable (the outcome you want to predict) changes as one or more independent variables (the inputs) change.
In our example, "Sales" is the dependent variable (because it depends on ad spend), and "Ad Spend" is the independent variable.
How To Run A Linear Regression
Navigate to Data > Data Analysis.
Select Regression and click OK.
For the Input Y Range, select your dependent variable's data (the "Sales ($)" column). The 'Y' is what you're trying to predict.
For the Input X Range, select your independent variable's data (the "Ad Spend ($)" column). The 'X' is what you're using to make the prediction.
Make sure Labels is checked.
Choose your output location and click OK.
Interpreting the Regression Output
The regression output can look intimidating, but you only need to focus on a few key numbers to start:
R Square: This tells you how much of the variation in your dependent variable (Sales) is explained by your independent variable (Ad Spend). It’s shown as a value between 0 and 1. An R Square of 0.75 means that 75% of the fluctuation in your monthly sales can be explained by changes in your ad spend, which is a very strong fit.
Coefficients: In the bottom table, look for the coefficient of your independent variable ("Ad Spend"). This number tells you the relationship's size and direction. If the coefficient is 5.5, it means that, on average, for every additional $1 you spend on ads, you can expect sales to increase by $5.50.
P-value: This value tells you if your results are statistically significant. A common rule of thumb is that a P-value below 0.05 is significant. If the P-value for Ad Spend is less than 0.05, you can be reasonably confident that the relationship you've found is not just due to random chance.
Comparing Averages with T-Tests
A t-test is used to determine if there is a significant difference between the means (averages) of two groups. Marketers often use it to analyze the results of A/B tests.
For example, did a new email subject line lead to a higher open rate than the old one? Or did a new landing page design have a better conversion rate than the original? A t-test can answer that definitively.
Types of T-Test:
There are a few types of t-tests, but two of the most common used in marketing are tests that look for differences among paired and unpaired groups.
Paired test: A "paired" two-sample test compares data points from the same sample under different conditions. For example, are students' scores higher before or after a tutoring session?
Unpaired t-test (Two-Sample Test): An "unpaired t-test" compares data points from two different groups. For example, do users click more on link one or link two?
An Example Unpaired T-Test Scenario: A/B Testing a Hypothesis:
A landing page had its CTA color updated - is it actually working?
Test: Comparing two versions of a landing page (Version A vs. Version B)
Null hypothesis: A statement or default position that there is no significant difference between the two versions
Data Collected: Daily conversion rates gathered for each version over 15 days
Our Goal: Determine if the average conversion rate for Version B is greater than Version A
Running the T-Test
Navigate to Data > Data Analysis.
Choose "t-Test: Two-Sample Assuming Unequal Variances."
Input Ranges: Assign ranges for both versions as Variable 1 and Variable 2.
Choose your output location and click OK.
Interpreting the T-Test Results
Excel creates an analysis summary with lots of details. Here’s what to focus on:
Mean: Look at each sample's mean value to assess the hypothesis.
P-value: This is the ultimate answer to your test question. A p-value below 0.05 is typically considered significant, allowing you to reject the null hypothesis and conclude that the new CTA is indeed driving different results.
Beyond the ToolPak: Useful Statistical Functions
Excel also has many statistical functions built-in that you can use directly in any cell:
AVERAGE(range): Calculates the mean of a range.
STDEV.S(range): Calculates the standard deviation based on a sample.
CORREL(range1, range2): Calculates the correlation coefficient between two data sets.
COUNTIFS(range, criteria): Counts the number of cells within a range that meet a specified condition.
Final Thoughts
From summarizing performance to running predictive models, Excel’s Data Analysis ToolPak offers easy access to tools usually found in specialized statistical programs. This allows anyone to turn a plain data sheet into actionable insights, helping to back up business decisions with real numbers.
Manual reporting and analysis are great for one-off projects. However, they often lead to more time spent wrangling spreadsheets than focusing on what the metrics mean. Graphed solves this by connecting all your marketing and sales data sources and letting you ask questions using plain English. We turn data wrangling into three-second conversations, updating dashboards automatically in real-time - no more CSV exports needed.
Hope this helps!