How to Import XML Data in Google Sheets

Cody Schneider · 8 min read

Bringing live data from websites directly into Google Sheets can save you hours of manual copy-pasting, and the IMPORTXML formula is well suited to the job. It lets you pull specific, structured information from most publicly accessible XML or HTML pages - like product prices, blog post titles, or stock market data - straight into your spreadsheet. This article will walk you through exactly how the IMPORTXML function works, from the basic syntax to practical examples you can start using today.

What Exactly is XML?

Before diving into the formula, it helps to understand what you're working with. XML stands for eXtensible Markup Language. Its primary purpose isn’t to display data, but to store and transport it in a structured way that both humans and machines can read.

If you're familiar with HTML, you'll find XML looks similar - it uses tags enclosed in angle brackets (< and >). The key difference is that HTML has predefined tags that tell a browser how to display content (like <h1> for a heading or <p> for a paragraph). In contrast, XML tags are defined by the author and are used to describe the data. They don't have a predetermined function; they simply act as labels.

For example, an XML file for a simple bookstore might look like this:

<pre><code><bookstore>
  <book category="FICTION">
    <title>The Great Gatsby</title>
    <author>F. Scott Fitzgerald</author>
    <year>1925</year>
  </book>
  <book category="SCIENCE">
    <title>Cosmos</title>
    <author>Carl Sagan</author>
    <year>1980</year>
  </book>
</bookstore></code></pre>

Here, tags like <book>, <title>, and <author> give structure and meaning to the information. To extract this information with IMPORTXML, you need a way to tell Google Sheets which specific piece of data to grab. That’s where XPath comes in.

Understanding XPath: The GPS for Your Data

XPath (XML Path Language) is a query language used to navigate through the elements and attributes within an XML document. Think of it as an address or a file path for your data. You use it to pinpoint exactly which element (or set of elements) you want to extract.

Writing an XPath query can seem intimidating, but the basics are straightforward. Here are a few common expressions you'll use:

  • //tagname - This is the most common and useful one. The double slash // means "find all," regardless of where it is in the document. So, //title would select every single <title> element from our bookstore example.
  • //tagname[@attribute='value'] - This selects tags that have a specific attribute. For instance, //book[@category='FICTION'] would select the book element for "The Great Gatsby" but ignore "Cosmos."
  • //tagname/anothertagname - The single slash / is used to navigate to a direct child. For example, if you only wanted to get the titles of fiction books, you could write //book[@category='FICTION']/title.
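If you want to sanity-check these three expressions outside of Sheets, a minimal Python sketch using the standard library's `xml.etree.ElementTree` can run them against the bookstore sample. This is only an illustration, not what IMPORTXML does internally. One assumption to note: ElementTree supports a subset of XPath and expects a leading `.` (so `.//title` rather than `//title`).

```python
import xml.etree.ElementTree as ET

# The bookstore sample from earlier in the article.
BOOKSTORE_XML = """
<bookstore>
  <book category="FICTION">
    <title>The Great Gatsby</title>
    <author>F. Scott Fitzgerald</author>
    <year>1925</year>
  </book>
  <book category="SCIENCE">
    <title>Cosmos</title>
    <author>Carl Sagan</author>
    <year>1980</year>
  </book>
</bookstore>
"""

root = ET.fromstring(BOOKSTORE_XML)


def select_text(path):
    """Return the text of every element matched by an ElementTree path."""
    return [element.text for element in root.findall(path)]


# //tagname: every <title>, wherever it sits in the document.
all_titles = select_text(".//title")

# //tagname[@attribute='value']: only <book> elements with category FICTION.
fiction_books = root.findall(".//book[@category='FICTION']")

# //tagname/anothertagname: <title> elements that are direct children of those books.
fiction_titles = select_text(".//book[@category='FICTION']/title")

print(all_titles)
print(len(fiction_books))
print(fiction_titles)
```

The first query returns both titles, while the attribute filter narrows the result to the single fiction book.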

Mastering these basics will allow you to pull almost any information you need from a webpage.

Anatomy of the IMPORTXML Formula

Now that you understand the pieces, let's put them together. The IMPORTXML formula in Google Sheets is surprisingly simple and has only two arguments:

<pre><code>=IMPORTXML(url, xpath_query)</code></pre>

  • url: This is the link to the webpage or XML file containing the data you want to import. The URL must be enclosed in quotation marks, such as "https://www.example.com", or it can be a reference to a cell that contains the URL, like A1.
  • xpath_query: This is your XPath expression (also in quotation marks or a cell reference) that tells the formula exactly which data to find and extract from the URL.

Step-by-Step Guide: How to Use IMPORTXML

Let's walk through a real-world example to see it in action. A common marketing task is to retrieve all of the URLs from a website's sitemap. Sitemaps are perfect for this because they are cleanly formatted XML files.

Step 1: Find a URL with XML or HTML Data

First, find the sitemap of a website. Most are located at domain.com/sitemap.xml. For our example, we'll use the sitemap for Google's own blog: https://blog.google/sitemap-posts.xml.

Step 2: Inspect the Data's Structure

Open the URL in your browser. You'll see the XML structure. In this sitemap, each URL is contained within a <loc> tag, which itself is inside a <url> tag.

Step 3: Write Your XPath Query

To grab every webpage link, we need to select all the <loc> tags. Based on what we learned, the XPath query to do this is simply: "//loc"

Step 4: Put the Formula in Google Sheets

Now, open a blank Google Sheet. In cell A1, or any other cell, type the complete formula:

<pre><code>=IMPORTXML("https://blog.google/sitemap-posts.xml", "//loc")</code></pre>

When you press Enter, Google Sheets will visit the URL, find every <loc> tag, and populate the column with all the URLs it finds. It’s that easy!
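One caveat worth checking: sitemaps that follow the sitemaps.org protocol declare a default XML namespace on the root element, and some XPath engines will not match a bare tag name against namespaced elements. If the formula comes back empty for a particular sitemap, a commonly used namespace-agnostic alternative is "//*[local-name()='loc']". The Python sketch below illustrates the effect with `xml.etree.ElementTree`. The two-entry sitemap and its URLs are made up for the example, and the `{*}` wildcard assumes Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-entry sitemap; the URLs are placeholders.
SITEMAP_XML = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://www.example.com/about</loc></url>
</urlset>
"""

root = ET.fromstring(SITEMAP_XML)

# A bare tag name does not match here, because every element
# belongs to the sitemap namespace declared on <urlset>.
plain_matches = [el.text for el in root.findall(".//loc")]

# The {*} wildcard matches <loc> in any namespace (Python 3.8+).
namespace_agnostic = [el.text for el in root.findall(".//{*}loc")]

print(plain_matches)
print(namespace_agnostic)
```

In this sketch the bare query finds nothing, while the namespace-agnostic query returns both URLs.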

Practical Use Cases for IMPORTXML

Once you're comfortable with the basics, you can use IMPORTXML for all kinds of data collection tasks across different fields.

Use Case 1: Scraping a Website's Social Links

Let’s say you want to find all the social media links in the footer of a competitor’s website. You can achieve this with a clever XPath query.

  1. Use your browser's "Inspect" tool (right-click the element on a page and choose "Inspect") to find the HTML structure around the social links.
  2. You might notice all the relevant links contain facebook.com or twitter.com.
  3. You can write an XPath query to look for all link (<a>) tags whose href attribute contains a specific string.

<pre><code>=IMPORTXML("https://www.competitor.com", "//a[contains(@href,'twitter')]/@href")</code></pre>

This formula finds all anchor tags where the href attribute contains the word "twitter" and returns the URL itself.
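To see what that query selects, here is a rough Python equivalent run against a made-up footer snippet. The markup and URLs are assumptions for illustration. Python's built-in `xml.etree.ElementTree` does not support the XPath `contains()` function, so the sketch selects every <a> element and does the substring check in Python instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical footer markup; real pages will be larger and messier.
FOOTER_HTML = """
<footer>
  <a href="https://twitter.com/example">Twitter</a>
  <a href="https://www.facebook.com/example">Facebook</a>
  <a href="/contact">Contact</a>
</footer>
"""

root = ET.fromstring(FOOTER_HTML)

# Mirrors //a[contains(@href,'twitter')]/@href:
# keep the href of each link whose href contains "twitter".
twitter_links = [
    link.get("href")
    for link in root.findall(".//a")
    if "twitter" in link.get("href", "")
]

print(twitter_links)
```

Only the first link survives the filter; the Facebook and contact links are ignored.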

Use Case 2: Tracking Product Prices

If you're running an e-commerce store, you can monitor competitor pricing or supplier pages. Find the product page and inspect the element a price is in. It's often in a <span> or <div> with a specific class, like "product-price".

<pre><code>=IMPORTXML("http://www.productpage.com", "//span[@class='product-price']")</code></pre>

Quick tip: Be sure to check the website's terms of service first, as some sites prohibit automated scraping.
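One detail to watch with the price query: [@class='product-price'] compares the whole class attribute, so an element with class="product-price sale" would not match. In full XPath, contains(@class,'product-price') is the looser alternative. The Python sketch below shows the difference on a made-up snippet. The markup and prices are assumptions, and the loose match is done in Python because `xml.etree.ElementTree` lacks `contains()`.

```python
import xml.etree.ElementTree as ET

# Hypothetical product markup with one plain and one multi-class price.
PRODUCT_HTML = """
<div>
  <span class="product-price">$19.99</span>
  <span class="product-price sale">$14.99</span>
</div>
"""

root = ET.fromstring(PRODUCT_HTML)

# Exact match: the class attribute must equal "product-price".
exact_prices = [
    span.text for span in root.findall(".//span[@class='product-price']")
]

# Loose match: "product-price" only needs to be one of the classes.
loose_prices = [
    span.text
    for span in root.findall(".//span")
    if "product-price" in span.get("class", "").split()
]

print(exact_prices)
print(loose_prices)
```

The exact query misses the sale price, which is a common reason a price formula returns fewer rows than expected.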

Use Case 3: Pulling All Headings from an Article

Want a quick outline of a webpage or a long article? You can pull all the H2 subheadings in a snap.

<pre><code>=IMPORTXML("https://www.example.com/article", "//h2")</code></pre>

This is a great trick for analyzing content structure for market research or SEO purposes.

Troubleshooting Common IMPORTXML Errors

Sometimes, your IMPORTXML formula won't work perfectly on the first try. Here are the most common issues and how to fix them:

  • #N/A Error ("Imported content is empty"): This is the most frequent error. It means Google Sheets successfully accessed the URL but your XPath query didn't find any matching elements. Double-check your XPath for typos and ensure it correctly matches the website's HTML/XML structure. JavaScript-heavy sites can also cause this issue because IMPORTXML may not be able to "see" content that loads after the initial HTML.
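  • #REF! Error: In Google Sheets this most often means the results had no room to expand because cells below or beside the formula already contain data. Clear those cells or move the formula. If the XPath itself is malformed (e.g., incorrect syntax), the formula will also fail, so carefully review your query.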
  • Data Looks Messy: Sometimes the data comes in with extra spaces or formatting. You can wrap your IMPORTXML formula in other functions, such as TRIM() (inside ARRAYFORMULA() when the import returns more than one cell) to strip extra spaces, or TRANSPOSE() to switch the results between a column and a row.

A Quick Way to Find XPath

If you're struggling to write an XPath from scratch, your browser can give you a starting point. Here's how to do it in Google Chrome:

  1. Right-click on the element you want to scrape and select "Inspect."
  2. The developer console will open with the exact HTML element highlighted.
  3. Right-click on the highlighted code.
  4. Go to Copy > Copy XPath. This will give you a very specific XPath that points directly to that single element. It may be overly complex (e.g., //div[1]/section/div/p), but you can often simplify it to something more robust like //p[@class='intro-text'].
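The Python sketch below illustrates why the simplified form tends to be more robust. Both page snippets are made up. The second one adds a hypothetical cookie-notice <div> at the top, which is enough to break the positional path while the class-based path keeps working.

```python
import xml.etree.ElementTree as ET

# Hypothetical page before a layout change.
PAGE_V1 = """
<body>
  <div><section><div><p class="intro-text">Welcome</p></div></section></div>
</body>
"""

# Same page after a new <div> is added above the content.
PAGE_V2 = """
<body>
  <div><p>Cookie notice</p></div>
  <div><section><div><p class="intro-text">Welcome</p></div></section></div>
</body>
"""

POSITIONAL = ".//div[1]/section/div/p"  # depends on element order
BY_CLASS = ".//p[@class='intro-text']"  # depends only on the class


def select_text(page, path):
    """Return the text of every element in `page` matched by `path`."""
    return [el.text for el in ET.fromstring(page).findall(path)]


print(select_text(PAGE_V1, POSITIONAL))
print(select_text(PAGE_V2, POSITIONAL))
print(select_text(PAGE_V1, BY_CLASS))
print(select_text(PAGE_V2, BY_CLASS))
```

The positional path finds the paragraph only in the first version, while the class-based path finds it in both.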

Final Thoughts

The IMPORTXML formula is a powerful addition to your Google Sheets toolkit, turning your spreadsheet from a static document into a dynamic dashboard that pulls live information directly from the web. By mastering basic XPath and the formula's simple syntax, you can automate many data-gathering tasks and keep your reports updated automatically.

While IMPORTXML is fantastic for grabbing data from individual web pages, connecting and analyzing data from dozens of your own sources - like Google Analytics, Shopify, Facebook Ads, and Salesforce - can quickly turn your spreadsheet into a tangled mess of formulas. To fix this, we've made it our mission to make connecting your data painless. With Graphed, you can link all your marketing and sales accounts in just a few clicks and use simple, natural language to build dashboards in real-time. Instead of building pivot tables, you can just ask, "Show me which Facebook campaigns are driving the most Shopify sales this month," and get an instant visualization.
