• eScraper price increase in five days!
    *Hurry up to order web scraping product or service at the current discounted price!
  • Home » Useful Articles » How to scrape a Shopify store into Google Sheets

    How to scrape a Shopify store into Google Sheets

    (2)
    By eMagicOne
    article-img-sect-1

    If you need to scrape a Shopify store to get product price, reviews or other data, you may do it yourself using the Google Sheets IMPORTXML formula. It is pretty easy and doesn't require programming knowledge. Though, if you need to extract all product details at large scale, you may face the limitations of the tool making it impossible to solve your task.

    article-sect-1

    In this article, we’ll review the possibilities and limitations of scraping Shopify and other eCommerce stores with the Google Sheets functionality. Also, we’ll look at an alternative solution allowing you to get complete product data for your specific business task.

    Using Google Sheets IMPORTXML formula

    A little bit of theory about the IMPORTXML function from the Google Help Center: “IMPORTXML imports data from any of various structured data types including XML, HTML, CSV, TSV, and RSS and ATOM XML feeds.” Sample usage:

    IMPORTXML("https://en.wikipedia.org/wiki/Moon_landing", "//a/@href")

    IMPORTXML(A2,B2)

    The IMPORTXML formula is formatted with two arguments:

    • URL of the webpage you intend to extract data from.
    • XPath of the element in which the data is contained.

    XPath stands for XML Path Language and can be used to navigate through elements and attributes in an XML document. Further in the article we’ll show you how to identify the XPath of the page element (product name, price, description, etc.)

    Now we’ll show you how to apply this formula in practice to extract product details from a Shopify store.

    Steps to scrape a Shopify store to Google Sheets

    Here is a product page we’ll try to scrape:

    Shopify Store to be Scraped to Google Sheets

    Shopify Store to be Scraped to Google Sheets

    Let’s begin with retrieving product title:

    1. Create a Google Sheet.
    2. Add a column name and paste the URL.
    3. Copying Shopify Store URL to a Google Sheet

      Copy the Shopify Store URL to a Google Sheet

    4. Get XPath for the product title. For that, select the title, right click to bring up the menu and click “Inspect”:
    5. Shopify Product Title Extraction Inspect

      Select the Shopify product title extraction and in the context menu click “Inspect”

      In the Chrome dev tool, you’ll see the highlighted text with tags. Right click here and in the menu press Copy > Copy XPath:

      Shopify Title Scraping Copy XPath

      Copy XPath to clipboard

      Paste it to the corresponding cell:

      Copying Shopify Product Title XPath to Google Sheets

      Copy title XPath to Google Sheets

    6. Apply the formula =IMPORTXML (URL, XPath_query) and press Enter.
    7. *Select corresponding cells for URL and XPath_query. In our example its cell A2 for the URL, and B2 for the XPath:

      Shopify Product Title Scraped to Google Sheets

      Apply the formula and press Enter to get the product title scraped to Google Sheets

    Now when the product’s name is in the sheet, we can extract the product’s price and review in the same way. For that, repeat steps 2-4 for each entity. Here what we’ve got in our document:

    Shopify Product Details Scraped to Google Sheets

    Repeat the steps to scrape all product details

    Why does the “Imported content is empty” issue occur

    When we try to extract image URL in the same way, though, we get the “Imported content is empty” issue (it can also be “The imported XML content can not be parsed”):

    Shopify Scraping to Google Sheets Error

    The “Imported content is empty” error

    These issues may be caused by the incorrect XPath. Try to select the element more precisely and copy its Xpath again. If the error persists, the reason may be that you are trying to extract dynamic content. And Google Sheets formula doesn’t work for dynamic websites.

    Dynamic content includes such kinds of UI interaction on the page as drop-downs with product variants, “See more” sections, multiple images, pagination and other. To detect dynamic elements, try disabling Java for the page. We’ve disabled JavaScript for our source sample page and the image isn’t displayed now:

    Shopify Site Disabled Java

    Image isn’t displayed when Javascript is disabled

    That was the reason why we couldn’t fetch its URL. Thus we’ve stumbled upon one of the limitations of scraping an eCommerce website with the IMPORTXML function.

    Limitations of the IMPORTXML function

    The biggest limitation of the import formula is that you can’t scrape dynamic pages that load the content through Java/API.

    Google Sheets and other static web scrapers can only scrape the page if its content is displayed on the initial page request, but not the content is generated after.

    Other limitations you may face:

    • It’s not always easy to identify the correct XPath query.
    • IMPORTXML & Sheets cannot be used to scrape large quantities of data at scale (it will stop functioning).
    • Import formulas cannot work with data behind a password wall.

    If you plan on importing the extracted data into your online store, it would also take time to clean up the data and prepare the correct file structure for import.

    An alternative way of scraping Shopify and other eCommerce websites to Google Sheets

    In cases when it’s needed to scrape multiple dynamic websites with different structures and big data sets, a more robust solution has to be used. For cases like these, you can rely on the eScraper data extraction service.

    eScraper combines powerful data extraction possibilities and manual tweaking of the data according to the client’s specific needs.

    Here is what you get:

    • Extraction of required data (specify it in the order form what fields you need in the result Google Sheet or attach a file template.)
    • Scraping of all data including dynamic elements: product variations, multiple products images, pages with pagination, etc.
    • Scraping of password-protected websites (given you’ve provided the login details).
    • Cleaned up data in the format ready for import. You can order a file for import to Shopify, WooCommerce, PrestaShop, Magento or other shopping carts, ERP systems, etc. Just specify the format and we’ll tweak scraped data accordingly.

    Here is an example of data extracted from our sample website with eScraper. As you can see, eScraper could fetch all product data including multiple image URLs and image Alt-tags:

    Shopify Product Details All Scraped to Google Sheets by eScraper

    All product details can be scraped to Google Sheets by eScraper

    Here is the same data formatted for Shopify import. In this file, we’ve added the columns from the Shopify import template and filled them with the scraped data. Product description comes with HTML tags as required:

    Shopify Scraped Data Import

    Scraped data formatted for import to Shopify

    Try eScraper today for free! Submit your request via the online order form and get all data extracted to Google Sheets hassle-free!

    Was this article helpful?

    Scrape Shopify to Google Sheets with eScraper

    Try eScraper
    Comments Join the discussion...
    Rick

    Rick — 12.29.2021 at 16:18 12.29.2021 at 16:18

    Your tutorial helped me to scrape some more content for my eCommerce store. Thank you so much. But frankly say for more complicated task Google Sheet and IMPORTXML is more like 5th wheel to my car)))

    Reply
    eScraper

    eScraper — 01.03.2022 at 11:45 01.03.2022 at 11:45

    Glad to know the article was helpful to you. More complicated tasks, though, can be solved with more advanced tools. If you have such a task, we’ll be glad to help. Fill in the request form and we’ll prepare a free test file for you.

    Reply
    img-form