Artilces

How to scrape a Shopify store into Google Sheets

In this article, we’ll review the possibilities and limitations of scraping Shopify and other eCommerce stores with the Google Sheets functionality. Also, we’ll look at an alternative solution allowing you to get complete product data for your specific business task.

Using Google Sheets IMPORTXML formula

A little bit of theory about the IMPORTXML function from the Google Help Center: “IMPORTXML imports data from any of various structured data types including XML, HTML, CSV, TSV, and RSS and ATOM XML feeds.” Sample usage:

IMPORTXML("https://en.wikipedia.org/wiki/Moon_landing", "//a/@href")

IMPORTXML(A2,B2)

The IMPORTXML formula is formatted with two arguments:

  • URL of the webpage you intend to extract data from.
  • XPath of the element in which the data is contained.

XPath stands for XML Path Language and can be used to navigate through elements and attributes in an XML document. Further in the article we’ll show you how to identify the XPath of the page element (product name, price, description, etc.)

Now we’ll show you how to apply this formula in practice to extract product details from a Shopify store.

Steps to scrape a Shopify store to Google Sheets

Here is a product page we’ll try to scrape:

Shopify Store to be Scraped to Google Sheets

Let’s begin with retrieving product title:

  1. Create a Google Sheet.
  2. Add a column name and paste the URL.
  3. Copy the Shopify Store URL to a Google Sheet

  4. Get XPath for the product title. For that, select the title, right click to bring up the menu and click “Inspect”:
  5. Select the Shopify product title extraction and in the context menu click “Inspect”

    In the Chrome dev tool, you’ll see the highlighted text with tags. Right click here and in the menu press Copy > Copy XPath:

    Copy XPath to clipboard

    Paste it to the corresponding cell:

    Copy title XPath to Google Sheets

  6. Apply the formula =IMPORTXML (URL, XPath_query) and press Enter.
  7. *Select corresponding cells for URL and XPath_query. In our example its cell A2 for the URL, and B2 for the XPath:

    Apply the formula and press Enter to get the product title scraped to Google Sheets

Now when the product’s name is in the sheet, we can extract the product’s price and review in the same way. For that, repeat steps 2-4 for each entity. Here what we’ve got in our document:

Repeat the steps to scrape all product details

Why does the “Imported content is empty” issue occur

When we try to extract image URL in the same way, though, we get the “Imported content is empty” issue (it can also be “The imported XML content can not be parsed”):

The “Imported content is empty” error

These issues may be caused by the incorrect XPath. Try to select the element more precisely and copy its Xpath again. If the error persists, the reason may be that you are trying to extract dynamic content. And Google Sheets formula doesn’t work for dynamic websites.

Dynamic content includes such kinds of UI interaction on the page as drop-downs with product variants, “See more” sections, multiple images, pagination and other. To detect dynamic elements, try disabling Java for the page. We’ve disabled JavaScript for our source sample page and the image isn’t displayed now:

Image isn’t displayed when Javascript is disabled

That was the reason why we couldn’t fetch its URL. Thus we’ve stumbled upon one of the limitations of scraping an eCommerce website with the IMPORTXML function.

Limitations of the IMPORTXML function

The biggest limitation of the import formula is that you can’t scrape dynamic pages that load the content through Java/API.

Google Sheets and other static web scrapers can only scrape the page if its content is displayed on the initial page request, but not the content is generated after.

Other limitations you may face:

  • It’s not always easy to identify the correct XPath query.
  • IMPORTXML & Sheets cannot be used to scrape large quantities of data at scale (it will stop functioning).
  • Import formulas cannot work with data behind a password wall.

If you plan on importing the extracted data into your online store, it would also take time to clean up the data and prepare the correct file structure for import.

An alternative way of scraping Shopify and other eCommerce websites to Google Sheets

In cases when it’s needed to scrape multiple dynamic websites with different structures and big data sets, a more robust solution has to be used. For cases like these, you can rely on the eScraper data extraction service.

eScraper combines powerful data extraction possibilities and manual tweaking of the data according to the client’s specific needs.

Here is what you get:

  • Extraction of required data (specify it in the order form what fields you need in the result Google Sheet or attach a file template.)
  • Scraping of all data including dynamic elements: product variations, multiple products images, pages with pagination, etc.
  • Scraping of password-protected websites (given you’ve provided the login details).
  • Cleaned up data in the format ready for import. You can order a file for import to Shopify, WooCommerce, PrestaShop, Magento or other shopping carts, ERP systems, etc. Just specify the format and we’ll tweak scraped data accordingly.

Here is an example of data extracted from our sample website with eScraper. As you can see, eScraper could fetch all product data including multiple image URLs and image Alt-tags:

All product details can be scraped to Google Sheets by eScraper

Here is the same data formatted for Shopify import. In this file, we’ve added the columns from the Shopify import template and filled them with the scraped data. Product description comes with HTML tags as required:

Scraped data formatted for import to Shopify

Try eScraper today for free! Submit your request via the online order form and get all data extracted to Google Sheets hassle-free!

Was this article helpful?

5/5 - (1 vote)

Scrape Shopify to Google Sheets with eScraper

Try eScraper

COMMENTS
Rick
Rick • 12.29.2021 at 16:18

Your tutorial helped me to scrape some more content for my eCommerce store. Thank you so much. But frankly say for more complicated task Google Sheet and IMPORTXML is more like 5th wheel to my car)))

eScraper • 01.03.2022 at 11:45

Glad to know the article was helpful to you. More complicated tasks, though, can be solved with more advanced tools. If you have such a task, we’ll be glad to help. Fill in the request form and we’ll prepare a free test file for you.

eScraper

Recent Posts

How to extract product data from any marketplace without any technical skills with eScraper

Do you need to extract product data from any marketplace like Alibaba, Amazon, FaceBook or…

3 months ago

eScraper Smart Scan feature – extract product data with automatic fields detection

Do you need to import products from any site, and you do not have a…

3 months ago

Case study: tracking crowd-marketing activities using web scraping

Store owners most often use eScraper to extract data for their eCommerce needs like building…

1 year ago

eScraper software v.2022.11.0.11 – Press Release

November 4, 2022 - eScraper announces the release of the eScraper data extraction software tailored…

1 year ago

How to scrape and bulk download Shopify images

Are you looking for a method to not only scrape Shopify image URLs but also…

2 years ago

Web Scraping of Australian Car Marketplaces

And if you are interested in scraping Australian car marketplaces, chances are you’re looking for…

2 years ago