Whether you are a digital native or a digital immigrant, you are most likely familiar with the basic features of Excel. Straightforward actions such as sorting, filtering, and highlighting data, as well as creating charts from that data, are simple to do in Excel. With well-organized data, we can even do complex analysis in Excel using pivot tables and data modeling, both of which are very powerful tools.
The challenge, however, is getting data from a website into Excel efficiently. Done manually, it takes a long time because of all the repetitive typing, searching, copying, and pasting. So, how do we automate extracting data from websites and transferring it to Excel?
Worry not. In this article, we are going to show you some of the best ways to extract data from a website to Excel. Keep reading until the end to learn the details.
All right, let's get started.
How to Extract Data from a Website to Excel
There are several methods of extracting data from websites to Excel that make use of programming languages such as PHP, Python, Perl, Ruby, and others. However, we’ll limit ourselves to discussing how to pull or extract data from websites and import it into Excel for those who aren’t programmers.
Here we will show you the best way to do that, which is the Excel Web Query. Below, we briefly describe what this feature is.
What is Excel Web Query?
Web Query is a built-in Excel feature that lets you pull information from a web page into a worksheet. It automatically locates the tables on the page and lets you choose the specific table whose data you need, which will probably surprise and delight you.
The feature is handy because it requests the page, passing along any parameters the page needs in order to show its data, and then presents the returned HTML content in an Excel worksheet.
Using web queries may also be beneficial in situations where a typical ODBC (Open Database Connectivity) connection would be difficult to set up and maintain for the sole purpose of retrieving data from online sites.
Extract Web Data Using Excel Web Queries
To gather data from the internet, we will use the From Web command on the Data ribbon of Excel. Instead of manually copying and pasting data from a web page into a spreadsheet, you can use an Excel Web Query to extract data from a typical web page and insert it into an Excel worksheet.
It can automatically detect tabular data inside the HTML of a web page.
As noted above, web queries help in cases where a normal ODBC connection would be difficult to set up and manage. With a web query, you can pull a table from a web page and present its data directly in Excel.
Enough chitchat. Here are the steps to extract data from a website to Excel using a web query:
- Open Excel and create a new blank workbook.
- Click on the Data ribbon and select the From Web option in the Get External Data section.
- A browser window called New Web Query will now open.
- Type the site URL into the address bar of that browser window.
- The site will then load, and a yellow arrow icon will appear next to each table it detects.
- Choose the table you want by clicking its yellow arrow icon.
- Click the Import button in the bottom-right corner.
Once you are done, you will see that the table you selected has been imported from the web page into the Excel sheet. Better still, the table arrives neatly arranged in rows and columns.
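If you are curious what this kind of table import looks like in code, here is a minimal Python sketch. It is not how Excel implements Web Query; it only illustrates the idea of finding a table in a page's HTML and turning it into rows and columns. The `TableParser` class, the `table_to_csv` function, and the sample page are all made up for this illustration, and the sketch assumes simple tables that are not nested inside each other.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect every <table> in a page as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []    # finished tables
        self._table = None  # table currently being read
        self._row = None    # row currently being read
        self._cell = None   # text fragments of the current cell

    def _close_cell(self):
        if self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None

    def _close_row(self):
        self._close_cell()
        if self._row is not None:
            self._table.append(self._row)
            self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._table = []
        elif tag == "tr" and self._table is not None:
            self._close_row()  # tolerate a missing </tr>
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._close_cell()  # tolerate a missing </td>
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag == "tr":
            self._close_row()
        elif tag == "table" and self._table is not None:
            self._close_row()
            self.tables.append(self._table)
            self._table = None


def table_to_csv(html, index=0):
    """Return the table at position `index` as CSV text that Excel can open."""
    parser = TableParser()
    parser.feed(html)
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.tables[index])
    return buffer.getvalue()


# Made-up page content, used only to demonstrate the function.
SAMPLE_PAGE = """
<table>
  <tr><th>Item</th><th>Price</th></tr>
  <tr><td>Widget</td><td>4.50</td></tr>
</table>
"""

print(table_to_csv(SAMPLE_PAGE))
```

For the sample page, the result is two CSV lines, `Item,Price` and `Widget,4.50`. The `index` argument plays the role of clicking one particular yellow arrow: it picks which table on the page to import.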
[Screenshot: the table imported from the web page into the Excel worksheet]
Note: In the New Web Query window, it may take a while for the site to load and for the yellow arrows to appear. In my case, I had to wait about 1 to 2 minutes. So, if the arrow is not appearing next to the table, don't panic; wait a few minutes and it should show up. If waiting doesn't do the trick, try refreshing the page.
Also, after you click the Import button, Excel will ask which cell the table should be imported into. I chose cell A1, but you can choose a different one if you wish.
Allow me to point out where the imported table differed from the web page. As I scrolled, I saw that one column was empty (as you can see in the screenshot above). When I went back to the website, I saw that the column held a graph, which Excel was unable to import. So, check the imported data against the original page whenever you pull information from the web.
The most useful and fascinating point, however, is that you do not have to fetch the latest data by hand on a regular basis. Excel can refresh it for you, as described below:
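An empty column like this is easy to miss in a large table. As a rough illustration, the Python sketch below flags columns whose data cells are all blank. The `empty_columns` function and the sample rows are hypothetical, and the sketch assumes the first row is a header.

```python
def empty_columns(rows):
    """Return the indexes of columns whose data cells are all blank.

    `rows` is a list of rows; the first row is treated as the header.
    """
    header, *data = rows
    return [
        i
        for i in range(len(header))
        if all(i >= len(row) or not str(row[i]).strip() for row in data)
    ]


# Made-up import result: the "Trend" column held a chart on the web page,
# so nothing was imported for it.
imported = [
    ["Item", "Trend", "Price"],
    ["Widget", "", "4.50"],
    ["Gadget", "", "7.25"],
]

print(empty_columns(imported))  # [1]
```

Any index it reports is a column worth comparing against the original page.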
How to Refresh the Table for Any Update
You can reload the data both manually and automatically. How? Say it with me, Abracadabra! Nah, I'm just joking, of course. But this is nearly magical in its effect.
To refresh every imported table at once, go to the Data ribbon and click the Refresh All button. To refresh only one table, click a cell in that table, open the Refresh All drop-down menu, and choose Refresh.
You may also specify a time frame for when data should be refreshed automatically. To do that, follow the below steps carefully:
- Go to the Data ribbon and click the Properties option in the Connections group.
- A dialog box titled External Data Range Properties will appear.
- In this box, you can also name the connection and give it a description.
- Under Refresh control, select the Refresh every option and set the interval (it defaults to 60 minutes).
- Alternatively, you may select the option Refresh data when opening the file.
[Screenshot: choosing an option in the External Data Range Properties dialog box]
After selecting the option, I pressed the OK button. As a result, when I open the spreadsheet, the information contained inside it will be updated automatically.
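The two refresh options boil down to a simple rule, sketched below in Python. This is only an illustration of the logic, not Excel's own code; the `needs_refresh` function and its parameter names are hypothetical.

```python
from datetime import datetime, timedelta


def needs_refresh(last_refreshed, now, every_minutes=60, just_opened=False):
    """Decide whether imported data should be fetched again.

    `just_opened` mirrors the "refresh when opening the file" option;
    `every_minutes` mirrors the timed refresh interval.
    """
    if just_opened:
        return True
    return now - last_refreshed >= timedelta(minutes=every_minutes)


last = datetime(2024, 1, 1, 9, 0)
print(needs_refresh(last, datetime(2024, 1, 1, 9, 30)))   # False
print(needs_refresh(last, datetime(2024, 1, 1, 10, 0)))   # True
print(needs_refresh(last, datetime(2024, 1, 1, 9, 5), just_opened=True))  # True
```

A shorter interval keeps the data fresher but contacts the website more often, so pick one that matches how frequently the page actually changes.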
How to Customize the Web Query
Once you have created a web query, you can modify it to meet your specific requirements, for example when an imported table needs adjusting. Excel includes a feature for exactly that.
To do that, follow the below steps carefully:
- Open the spreadsheet where you have imported the table from the web and right-click one of the cells.
- Now select the option Edit Query from the context menu to access the Web query properties.
- The web page from which you imported the data will now reopen. Click the Options button, located in the upper-right corner of the window, to open the dialog box shown in the screenshot below.
- You can now change how the query handles the web page by using the options available here.
In this web query dialog box, you will find many options through which you can customize the table.
Data Analysis of the Table
Data analysis is the process of examining, cleansing, transforming, and modeling data with the objective of identifying useful information, drawing conclusions, and assisting in decision-making.
Excel is helpful because it includes commands, functions, and features that make data analysis tasks simpler and more straightforward. It can save you a lot of time and effort by taking over many time-consuming or difficult computations.
Limitations of Excel for Extracting Data from a Website
Although using Excel to extract data from a website seems simple and straightforward, it has some limitations. Here are the main ones:
- Not Scalable: You would be wise to note that although Excel is excellent at collecting HTML data from the web, it is not a real web data extraction tool or solution. Excel is not very useful if you want to collect data for business analytics since it is not capable of extracting vast amounts of data from a website. In a nutshell, you will not be able to scale it as effectively as you would with a good web scraping tool.
- Limited to tables: Excel can only extract data that is in a table format. If you need to extract information from a web page that is not in a table, here is some bad news for you: you will need to find another method.
- Other limitations: Furthermore, when you use Excel for web scraping, you miss out on crucial capabilities such as handling Ajax pagination and endless scrolling, deduplication, and other similar ones.
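To make deduplication concrete, here is a minimal Python sketch of what a scraping tool does when several pages of results overlap. The `merge_pages` function and the sample rows are made up for illustration, and the sketch assumes two rows are duplicates only when every cell matches.

```python
def merge_pages(pages):
    """Combine rows from several pages, keeping the first copy of each row."""
    seen = set()
    merged = []
    for page in pages:
        for row in page:
            key = tuple(row)  # lists are not hashable, tuples are
            if key not in seen:
                seen.add(key)
                merged.append(row)
    return merged


# Made-up rows from two result pages; "Gadget" appears on both.
page_one = [["Widget", "4.50"], ["Gadget", "7.25"]]
page_two = [["Gadget", "7.25"], ["Gizmo", "3.00"]]

print(merge_pages([page_one, page_two]))
# [['Widget', '4.50'], ['Gadget', '7.25'], ['Gizmo', '3.00']]
```

In Excel you would have to import each page separately and remove the repeated rows yourself.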
So, what should you do to overcome all these limitations? Simply put, you need a web scraping application or service that lets you scrape efficiently and at scale while also providing features such as Ajax pagination, endless scrolling, deduplication, and so on.
In short, make certain that you choose the right kind of web scraping tool or service provider, one that offers excellent customer support and a cloud-based architecture.
We suggest two free web scraping tools:
- Scraper (Chrome Extension)
- Scrapy (web scraping framework)
Both are free. Scraper is quite simple to use, while Scrapy is a Python framework, so it requires some coding.
Conclusion
So, this is the end of the article on how you can extract data from a website to Excel, customize it, and schedule when the information is automatically refreshed, all in your own custom way. We hope this article has helped you pull data from a website and import it into Excel.