How to Backfill Google Analytics Data in BigQuery
So Google Universal Analytics finally shut down, and now all your historical website data is locked away, separate from your new GA4 reports. This split makes year-over-year analysis and long-term trend spotting a serious headache. The best solution is to get your old Universal Analytics (UA) data into Google BigQuery, Google's data warehouse, so it can live alongside your ongoing GA4 data. This article will show you exactly how to backfill that historical UA data, ensuring you never lose sight of your long-term performance.
Why Backfill Universal Analytics Data to BigQuery?
On July 1, 2023, Universal Analytics properties stopped processing new data. The interface remained accessible for about another year, but Google began deleting UA data on July 1, 2024, so a permanent solution is needed to preserve that data for future analysis. Because GA4 uses an entirely different event-based data model, you can't just "import" your old session-based UA history into the GA4 interface. This is where BigQuery comes in.
By exporting your historical UA data and your current GA4 data into the same BigQuery project, you create a single source of truth for your web analytics. Here's why that's so valuable:
Uninterrupted Long-Term Analysis: The biggest benefit is the ability to analyze performance trends over many years. You can query data from 2020 through today without having to switch between different platforms or stitch together disconnected spreadsheets.
Escape Data Sampling: If you had a high-traffic website, you likely ran into data sampling in the Universal Analytics interface or via its API, where reports were generated based on a subset of your data. The export process into BigQuery allows you to retrieve raw, unsampled data, giving you a more accurate picture of your historical performance.
Own Your Data: Once the data is in your BigQuery project, it's yours forever. You no longer have to worry about Google eventually closing off access to the Universal Analytics platform. It's stored securely in your own Google Cloud environment.
Unlock Advanced Insights: BigQuery allows you to join your historical analytics data with other business data. You can blend website behavior data from UA with sales data from your CRM, ad spend from Facebook Ads, and subscription data from Stripe to get a full 360-degree view of your customer journey.
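For example, once everything lives in BigQuery, a blend like this is a single query away. This is a hypothetical sketch: the project, dataset, table, and column names are all placeholders for whatever your own exports look like:

```sql
-- Blend daily website sessions (from the UA backfill) with daily
-- revenue from a CRM export to compare traffic against sales.
SELECT
  ua.day,
  ua.sessions,
  crm.revenue
FROM `your-project.analytics.ua_sessions_daily` AS ua
LEFT JOIN `your-project.crm.daily_revenue` AS crm
  ON ua.day = crm.day
ORDER BY ua.day;
```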
Getting Started: Prerequisites
Before you begin the backfill process, you'll need a few things in place. Don't worry, this part is pretty straightforward.
A Google Cloud Platform (GCP) Account: BigQuery is a part of GCP. You'll need an account with billing enabled. While BigQuery has a generous free tier (currently 10 GB of storage and 1 TB of query processing per month), large data exports to and from the platform can incur costs, so having billing active is essential.
Editor Access to Your BigQuery Project: You'll need sufficient permissions within your GCP account to create new datasets and tables in BigQuery.
Access to your Universal Analytics Property: You'll need viewer or editor permissions for the historical UA property you want to export data from.
An ETL/ELT Tool or a Google Sheet with an Add-On: This is the magic ingredient that moves the data. We’ll cover both approaches below, one automated and one more manual.
Method 1: The Automated Way with an ETL Tool
If you're not a coder and want the most reliable, hands-off method, using a third-party ETL (Extract, Transform, Load) or ELT (Extract, Load, Transform) tool is your best bet. These services specialize in connecting to APIs (like a Universal Analytics property) and piping data into a destination (like BigQuery) automatically.
Tools like Fivetran, Stitch, or Supermetrics are popular choices. They handle the complexity of API requests, data pagination, rate limits, and schema creation for you.
Step-by-Step Guide with an ETL Tool
Choose and Sign Up for a Tool: Select an ETL provider that lists Universal Analytics as a source and BigQuery as a destination. Most offer free trials, which may be more than enough to perform a one-time historical backfill.
Create a New Data Pipeline: Inside your chosen tool's dashboard, you’ll start by setting up a new "pipeline" or "connector." This is just the connection between your source (UA) and destination (BigQuery).
Connect Your Universal Analytics Source: You'll be prompted to authenticate your Google account that has access to Universal Analytics. Select the specific UA Account, Property, and View that you want to export.
Connect Your BigQuery Destination: Next, you'll authenticate your Google Cloud Platform account and authorize the tool to write data to your BigQuery project. You'll be asked to specify the BigQuery project ID and a new or existing dataset location for the data.
Configure Your Report and Date Range: This is the key step. Since you are doing a backfill, you can select the dimensions and metrics you care about most (e.g., Sessions, Users, Transactions, Bounces, Source / Medium, etc.). Crucially, you will set the date range to cover your entire history, for instance from 2015-01-01 to 2023-06-30.
Start the Historical Sync: Once everything is configured, you'll start the initial sync. The ETL tool will then begin fetching all the data from the UA API and loading it, batch by batch, into a new table in your specified BigQuery dataset. Depending on how many years of data you have, this could take anywhere from a few minutes to several hours. The tool handles the entire process in the background.
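Once the initial sync finishes, it's worth sanity-checking the result in BigQuery before relying on it. Here's a minimal sketch that counts rows per year, assuming the tool wrote a table with a YYYYMMDD-style date column; the project, dataset, table, and column names are placeholders, so adjust them to whatever schema your ETL tool created:

```sql
-- Count exported rows per year to confirm the whole date range arrived.
SELECT
  EXTRACT(YEAR FROM PARSE_DATE('%Y%m%d', CAST(date AS STRING))) AS year,
  COUNT(*) AS row_count
FROM `your-project.your_dataset.ua_backfill`
GROUP BY year
ORDER BY year;
```

If a year is missing or looks suspiciously small, re-run the sync for just that date range before moving on.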
The main advantage here is simplicity and reliability. The main disadvantage is that these tools are subscription-based, so there's a cost involved, although it's often small compared to the engineering hours it would take to build a custom solution.
Method 2: The Semi-Manual Way with Google Sheets
If you're on a tight budget and are comfortable working in a spreadsheet, you can use the official Google Analytics add-on for Google Sheets as an intermediary. It's a great free option but requires more manual effort, especially for websites with years of data.
The main limitation is that you'll likely hit API quotas or sampling limits with a single report. You often have to break your backfill into chunks (e.g., pull one year or one quarter at a time).
Step-by-Step Guide with Google Sheets
Install the Add-on: Open a new Google Sheet. Go to Extensions > Add-ons > Get add-ons and search for "Google Analytics." Install the official add-on by Google.
Create a New Report: Once installed, go to Extensions > Google Analytics > Create new report. A sidebar will appear. Give your report a name (e.g., "UA_Backfill_2022") and select the correct UA Account, Property, and View.
Configure Metrics and Dimensions: In the configuration fields, choose the data points you want to export. For example:
Metrics: ga:sessions, ga:users, ga:transactions, ga:pageviews
Dimensions: ga:date, ga:sourceMedium, ga:campaign, ga:landingPagePath
Set Your Date Range: As mentioned, you may need to do this in batches. Set the "Start Date" and "End Date" for a specific period, such as 2022-01-01 to 2022-12-31.
Run the Report: Click "Create Report" and a new tab named Report Configuration will appear. From the main Extensions dropdown again, select Google Analytics > Run reports. The add-on will now query the API and populate a new sheet with the requested data. Check the "Contains Sampled Data" field in the results; if it says "Yes", the numbers are estimates based on a subset of sessions, and you should try a shorter date range.
Save as CSV and Upload to BigQuery: Once the data is in your sheet, go to File > Download > Comma-separated values (.csv). Then, in the BigQuery UI, navigate to your dataset, click "Create Table," and choose "Upload" from your computer as the source. Select your CSV file, give the table a name (e.g., ua_data_2022), and use BigQuery's "Auto-detect schema" feature.
Repeat: You will have to repeat steps 3-6 for each time period you need to backfill. Afterwards, in BigQuery, you can combine these tables into one master "universal_analytics" table.
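For that last step, a wildcard query can stack the yearly tables in one statement. A minimal sketch, assuming your yearly tables share the ua_data_ prefix and an identical schema (project and dataset names are placeholders):

```sql
-- Combine all yearly backfill tables (ua_data_2015 ... ua_data_2023)
-- into one master table using BigQuery's table wildcard.
CREATE OR REPLACE TABLE `your-project.your_dataset.universal_analytics` AS
SELECT *
FROM `your-project.your_dataset.ua_data_*`;
```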
Putting It All Together: Querying UA and GA4 Data
After your backfill is complete, your BigQuery dataset will contain your historical UA tables and the daily export tables from your native GA4-BigQuery link.
Because the schemas are different, you can't just query them together directly. You need to use SQL to normalize the column names and formats first, then stack the results with a UNION ALL statement. The goal isn't to perfectly match metrics (a "session" is defined differently in each model), but to create directionally consistent trendlines.
Here is a simplified example of how you might combine daily sessions from UA history and GA4 data. Treat it as a sketch rather than a drop-in query: the project, dataset, and table names are placeholders, and your UA column names will depend on how the backfill table was created:
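```sql
WITH ua_daily AS (
  -- Historical UA sessions from the backfill. The table and column
  -- names are placeholders; if you uploaded via Google Sheets, the
  -- ga:date header may have become ga_date on import.
  SELECT
    PARSE_DATE('%Y%m%d', CAST(date AS STRING)) AS day,
    SUM(sessions) AS sessions
  FROM `your-project.your_dataset.universal_analytics`
  GROUP BY day
),
ga4_daily AS (
  -- The native GA4 export writes one events_YYYYMMDD table per day.
  -- A GA4 session is identified by user_pseudo_id + ga_session_id.
  SELECT
    PARSE_DATE('%Y%m%d', event_date) AS day,
    COUNT(DISTINCT CONCAT(
      user_pseudo_id,
      CAST((SELECT value.int_value
            FROM UNNEST(event_params)
            WHERE key = 'ga_session_id') AS STRING)
    )) AS sessions
  FROM `your-project.analytics_123456789.events_*`
  GROUP BY day
)
-- If UA and GA4 ran in parallel for a while, filter one side by date
-- to avoid double counting the overlap.
SELECT day, sessions FROM ua_daily
UNION ALL
SELECT day, sessions FROM ga4_daily
ORDER BY day;
```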
This query defines two common table expressions (CTEs), one for UA and one for GA4, normalizes the date and session counts for each, and then stacks them on top of each other. Running it gives you a single, continuous, day-by-day count of sessions from the beginning of your UA history through the most recent GA4 export.
Final Thoughts
Preserving your historical data by backfilling Universal Analytics into BigQuery is one of the most important data hygiene tasks you'll tackle this year. It's the only reliable way to enable long-term, year-over-year reporting in a post-UA world. Whether you choose an automated ETL tool for simplicity or the manual Google Sheets method for cost-effectiveness, the result is having all your critical website data securely under your control.
Of course, once your data is consolidated, the challenge shifts from collection to analysis. Manually writing SQL queries and building visualizations can be time-consuming. We built Graphed to simplify this last step. After connecting to your data sources like BigQuery, you can ask questions in plain English - like "Show me sessions by month for the last three years in a line chart" - and get sharable, real-time dashboards instantly. It automates the reporting drudgery, letting you focus on the insights, not just the setup.