How to use the Google Sheets ImportXML function to collect data everywhere


You can import data from almost any web page with the ImportXML function in Google Sheets. Here's how to use it.

Once you master the ImportXML function in Google Sheets, you'll feel like a certified Sheets wizard. ImportXML retrieves information from any XML field, so you can pull data and metadata from pages across the web into your spreadsheet.

Basics of XML and HTML

XML and HTML are markup languages that structure the data in a web page. In essence, data sits between pairs of tags such as <something> and </something>, which are the building blocks of a page's source code. A page's source will typically have some text inside a <p> tag (a paragraph), sometimes containing <b> (bold text) and possibly <a> (a link), followed by the closing tags </a>, </b>, </p>, and finally </body> to close the body of the page.

The Google Sheets ImportXML function can find a particular set of tags and copy out the data inside them. In the example above, to get all the links on the page, we tell ImportXML to import everything inside the <a> </a> tags. If you want the whole text of a page, you can start by taking everything in <body> </body> or every instance of <p> </p>, then remove the data you don't need at a later stage.
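To make the idea of tags concrete, here is a small Python sketch. It is not part of the Sheets workflow; it only mirrors the same idea with the standard library's xml.etree.ElementTree. The sample markup and the variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page (not taken from any real site).
SAMPLE = """
<body>
  <p>Some text with <b>bold words</b> and <a href="https://example.com">a link</a>.</p>
  <p>Another paragraph with <a href="https://example.org">a second link</a>.</p>
</body>
"""

root = ET.fromstring(SAMPLE)

# ".//a" finds every <a> element at any depth, similar in spirit to "//a".
link_texts = [a.text for a in root.findall(".//a")]

# itertext() yields the text of an element and of all its descendants.
paragraphs = ["".join(p.itertext()) for p in root.findall(".//p")]

print(link_texts)
print(paragraphs)
```

Taking every <p> gives the readable text of the page, while taking every <a> gives only the link text, which is the same choice the ImportXML query makes.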

How to extract a list of postal codes and neighborhoods in a city

Wikipedia tables make great ImportXML exercises. This article uses the example of downloading all the postal codes in Edmonton, Alberta. Find Wikipedia's list of Canadian postal codes that start with the letter T, and open that page in a new browser window to get started.


Select a postal code, right-click it, and choose Inspect to open the browser's tool for viewing the page source. You will see that each postal code sits inside a <td> tag (which defines a cell in a table). The article will then import all the <td> tags that contain Edmonton.

Create a new blank Google Sheet. The article first takes the content of every <td> tag, including its <span> and link, by specifying the data you want in XPath syntax. ImportXML takes two arguments: the URL of the page and the XPath query for the tags you are looking for.

=IMPORTXML("https://en.wikipedia.org/wiki/List_of_T_postal_codes_of_Canada", "//td")

You will get the following result:

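As a rough offline analogy for the //td query, the Python sketch below runs the equivalent ElementTree path over a tiny made-up table. The postal codes, city cells, and area names are placeholders, not real Wikipedia data, and ElementTree only supports a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical table that imitates the cell layout described in the article.
TABLE = """
<table>
  <tr>
    <td><b>T1A</b><span><a>Edmonton</a>(North Area / South Area)</span></td>
    <td><b>T2B</b><span><a>Calgary</a>(East Area)</span></td>
  </tr>
</table>
"""

root = ET.fromstring(TABLE)

# One entry per <td>; each entry lists the separate text pieces in that cell.
pieces = [list(td.itertext()) for td in root.findall(".//td")]

for cell in pieces:
    print(cell)
```

Each cell comes back as several pieces of text (code, city, areas), which is why the later steps narrow the query to specific child tags.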

Going back to the page source, we see that the postal code is in bold inside <b> </b> tags, and the city name, which links to its Wikipedia article, is inside <a> </a> tags. Now try to get only the first link in each cell (the city) and leave out the other links (the neighborhoods). Enter the following two formulas in columns A and B:

=IMPORTXML("https://en.wikipedia.org/wiki/List_of_T_postal_codes_of_Canada", "//td/span/a[1]")

=IMPORTXML("https://en.wikipedia.org/wiki/List_of_T_postal_codes_of_Canada", "//td/b[1]")

You need to refine the results a bit:


This step shows how the XPath query syntax works: tag[1] returns only the first instance of <tag> inside each parent tag. Therefore, td/span/a[1] gives you the first link in the <span> of each <td>. Similarly, td/b[1] gives you the first bold text in each <td>, which in this case is just the postal code.

The great thing is that you can run two queries in one function. Here the article combines the two queries with a | symbol between them:

=IMPORTXML("https://en.wikipedia.org/wiki/List_of_T_postal_codes_of_Canada", "//td/span/a[1] | //td/b[1]")
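The effect of the [1] position filter can be checked offline with a short Python sketch. It uses ElementTree's limited XPath support on a made-up table in which the neighborhoods are also links; all names and codes are placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical cells: the first link is the city, later links are neighborhoods.
TABLE = """
<table><tr>
  <td><b>T1A</b><span><a>Edmonton</a>(<a>North Area</a> / <a>South Area</a>)</span></td>
  <td><b>T2B</b><span><a>Calgary</a>(<a>East Area</a>)</span></td>
</tr></table>
"""

root = ET.fromstring(TABLE)

# Without a position filter, every link in every cell is returned.
all_links = [a.text for a in root.findall(".//td/span/a")]

# With [1], only the first <a> inside each <span> is kept.
first_links = [a.text for a in root.findall(".//td/span/a[1]")]

# The same filter on <b> keeps the first bold text per cell (the code).
codes = [b.text for b in root.findall(".//td/b[1]")]

print(all_links)
print(first_links)
print(codes)
```

Note that the position is counted within each parent tag, not across the whole page, so one result is returned per cell.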

However, you will not get the same layout as before. The results of both queries are interleaved into one long list instead of two columns. That can be useful elsewhere, but it is not what this article needs.
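The interleaving happens because a combined query returns its matches in the order they appear in the page. The Python sketch below imitates that behavior; ElementTree has no | operator, so the union is emulated by hand, and the table content is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical table with one code and one city link per cell.
TABLE = """
<table><tr>
  <td><b>T1A</b><span><a>Edmonton</a>(North Area / South Area)</span></td>
  <td><b>T2B</b><span><a>Calgary</a>(East Area)</span></td>
</tr></table>
"""

root = ET.fromstring(TABLE)

# Run both queries and remember which elements matched either one.
cities = root.findall(".//td/span/a[1]")
codes = root.findall(".//td/b[1]")
wanted = {id(el) for el in cities + codes}

# Walk the tree in document order and keep the matches: codes and cities
# alternate because that is how they are laid out in each cell.
combined = [el.text for el in root.iter() if id(el) in wanted]

print(combined)
```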


To select only the postal codes in the cells that contain the 'Edmonton' link, use this formula:

=IMPORTXML("https://en.wikipedia.org/wiki/List_of_T_postal_codes_of_Canada", "//td[span/a='Edmonton']/b[1]")

Put the search condition (the text used to narrow the results) in square brackets. It filters which cells are matched without changing what is returned from them.
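The square-bracket condition acts as a filter on the parent cell. ElementTree cannot evaluate a nested path such as span/a inside a predicate, so the Python sketch below spells the same filter out as an ordinary condition. The table data is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical table: two Edmonton cells and one Calgary cell.
TABLE = """
<table><tr>
  <td><b>T1A</b><span><a>Edmonton</a>(North Area)</span></td>
  <td><b>T2B</b><span><a>Calgary</a>(East Area)</span></td>
  <td><b>T3C</b><span><a>Edmonton</a>(South Area)</span></td>
</tr></table>
"""

root = ET.fromstring(TABLE)

# Keep a cell only if its span/a text is exactly "Edmonton",
# then return the first <b> of that cell (the postal code).
edmonton_codes = [
    td.find("b").text
    for td in root.findall(".//td")
    if td.findtext("span/a") == "Edmonton"
]

print(edmonton_codes)
```

The condition decides which cells qualify, while the part after the brackets decides what is taken from each qualifying cell.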


Now for the names of the neighborhoods. Write a matching ImportXML function in the next column to get the text that follows "Edmonton".

The article takes the entire contents of span[1] and uses the parentheses and slashes to divide the content, putting "Edmonton" in the first column and the neighborhood names in the following columns. We can then match each postal code with its neighborhood names:

=IMPORTXML("https://en.wikipedia.org/wiki/List_of_T_postal_codes_of_Canada", "//td[span/a='Edmonton']/span[1]")

Next, use the SPLIT function together with CONCATENATE on the following columns to split and regroup the imported data:

=SPLIT(CONCATENATE(B2:J2), "(/)")

Finally, here is the results table with the necessary information:

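SPLIT with the delimiter "(/)" cuts the text at every "(", "/", and ")" character, because by default each character of the delimiter is treated as a separate split point. A Python sketch of the same idea follows; the input string is made up, and the whitespace trimming is an addition for tidiness rather than something SPLIT is claimed to do.

```python
import re

# Hypothetical joined cell text, similar to what CONCATENATE would produce.
joined = "Edmonton(North Area / South Area)"

# Split at any one of the characters "(", "/" or ")".
raw_parts = re.split(r"[(/)]", joined)

# Drop empty pieces and trim the surrounding spaces.
parts = [p.strip() for p in raw_parts if p.strip()]

print(parts)
```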

How to automatically copy email addresses from the web


This section shows how to get every employee's email address from Zapier's About page. Looking at the source code, you will see that each member's email address is in a <span> with class="email". When you want to filter by a tag attribute, write the Google Sheets ImportXML function as follows:

=IMPORTXML("https://zapier.com/about//", "//span[@class='email']")
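The @class filter can be tried offline with the Python sketch below, which uses ElementTree on a made-up snippet. The names and addresses are placeholders and do not come from the Zapier page.

```python
import xml.etree.ElementTree as ET

# Hypothetical team listing with name and email spans.
PAGE = """
<div>
  <span class="name">Alex Example</span>
  <span class="email">alex@example.com</span>
  <span class="name">Sam Sample</span>
  <span class="email">sam@example.com</span>
</div>
"""

root = ET.fromstring(PAGE)

# [@class='email'] keeps only the spans whose class attribute equals "email".
emails = [s.text for s in root.findall(".//span[@class='email']")]

print(emails)
```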

How to use Regex to import email addresses from the web in Google Sheets

To get the Zapier addresses using the power of regex, we'll import every <span> element instead of filtering by class. This takes two steps: first import the content of the Zapier page into the first column, then extract the email addresses into the second column:

=IMPORTXML("https://zapier.com/about//", "//span")

=REGEXEXTRACT(A1, "[a-zA-Z0-9_\.\+-]+@[a-zA-Z0-9-\.]+\.[a-zA-Z0-9-]{2,15}")

Finally, we will have this table:


Remember, ImportXML fills in as many columns and rows as it needs for the data it finds, whereas the regex formula must be entered in every cell where you want a result. To put it all together in one formula, wrap REGEXEXTRACT in ArrayFormula so that it runs over every imported value:

=ArrayFormula(IFERROR(REGEXEXTRACT(IMPORTXML("https://zapier.com/about//", "//span"), "[a-zA-Z0-9_\.\+-]+@[a-zA-Z0-9-\.]+\.[a-zA-Z0-9-]{2,15}")))

And this is the result:

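The combined formula applies the regex to every imported value and leaves a blank where nothing matches. A Python sketch of that logic follows. The input values are made up, and the pattern is a tidied form of the one in the formula with the same character sets.

```python
import re

# Same character sets as the formula's pattern, written without the
# unnecessary escapes inside the brackets.
EMAIL = re.compile(r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9-]{2,15}")

# Hypothetical span contents, standing in for the imported column.
values = ["Alex Example", "Contact: alex@example.com", "sam@example.com", ""]

def extract_email(text):
    """Return the first email-like match, or "" when there is none."""
    match = EMAIL.search(text)
    return match.group(0) if match else ""

# One result per input value, blank for values without an address.
results = [extract_email(v) for v in values]

print(results)
```

Returning an empty string for non-matching values plays the role of IFERROR, and the list comprehension plays the role of ArrayFormula.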

Hope the article is helpful to you!
