
Scrape a web page in node.js using Cheerio

Learn how to scrape a simple web page in node.js using the Cheerio and Axios node modules.

I was developing an HTML widget to show weather information for a city. I tried WeatherWidget.io, but it is customizable only to a limited extent. Please check this post if you want to know more. I needed to choose a city from the UI and, based on that city, display the current weather information and probably the forecast too.

Let’s say we need to fetch the current weather information for Mumbai and show it in an HTML page.

Forecast7 displays the weather of a city in a simple format. We are going to scrape a specific div from its Mumbai weather page and use it in our own HTML.


Part of the source HTML of Forecast7’s Mumbai weather page looks like this.

<div class="current-weather">

	<h1 title="Mumbai, Maharashtra, India"><b>Mumbai</b> Weather</h1>

	<div class="current-icon">
		<div class="icon"><div class="w-icon iconvault clear-day"><div class="sun"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 300 300" enable-background="new 0 0 300 300"><g class="sunRays"><path d="m65.3 157.2h-20.9c-1.8 0-3.5-.7-5-2.2-1.5-1.5-2.2-3.1-2.2-5 0-1.8.7-3.5 2.2-5 1.5-1.5 3.1-2.2 5-2.2h20.9c2.2 0 3.9.7 5.2 2.2 1.3 1.5 1.9 3.1 1.9 5 0 1.8-.6 3.5-1.9 5-1.3 1.4-3 2.2-5.2 2.2"/><path d="m70.2 80.1c-3.3-3.3-3.3-6.6 0-9.9 3.3-3.3 6.6-3.3 9.9 0l14.9 14.9c3.7 3.3 3.8 6.6.3 9.9-3.5 3.3-6.9 3.3-10.2 0l-14.9-14.9"/><path d="m95.3 205c3.5 3.3 3.4 6.6-.3 9.9l-14.9 14.9c-3.3 3.3-6.6 3.3-9.9 0-3.3-3.3-3.3-6.6 0-9.9l14.9-14.9c3.3-3.3 6.7-3.3 10.2 0"/><path d="m142.8 65.3v-20.9c0-1.8.7-3.5 2.2-5 1.5-1.5 3.1-2.2 5-2.2 1.8 0 3.5.7 5 2.2 1.5 1.5 2.2 3.1 2.2 5v20.9c0 2.2-.7 3.9-2.2 5.2-1.5 1.3-3.1 1.9-5 1.9-1.8 0-3.5-.6-5-1.9-1.4-1.3-2.2-3-2.2-5.2"/><path d="m157.2 234.7v20.9c0 1.8-.7 3.5-2.2 5-1.5 1.5-3.1 2.2-5 2.2-1.8 0-3.5-.7-5-2.2-1.5-1.5-2.2-3.1-2.2-5v-20.9c0-2.2.7-3.9 2.2-5.2 1.5-1.3 3.1-1.9 5-1.9 1.8 0 3.5.6 5 1.9 1.4 1.3 2.2 3 2.2 5.2"/><path d="m229.8 219.9c3.3 3.3 3.3 6.6 0 9.9-3.3 3.3-6.6 3.3-9.9 0l-14.9-14.9c-3.7-3.3-3.8-6.7-.3-10.2 3.5-3.5 6.9-3.4 10.2.3l14.9 14.9"/><path d="m204.7 95c-3.5-3.3-3.4-6.6.3-9.9l14.9-14.9c3.3-3.3 6.6-3.3 9.9 0 3.3 3.3 3.3 6.6 0 9.9l-14.9 14.9c-3.3 3.3-6.7 3.3-10.2 0"/><path d="m260.6 145c1.5 1.5 2.2 3.1 2.2 5 0 1.8-.7 3.5-2.2 5-1.5 1.5-3.1 2.2-5 2.2h-20.9c-2.2 0-3.9-.7-5.2-2.2-1.3-1.5-1.9-3.1-1.9-5 0-1.8.6-3.5 1.9-5 1.3-1.5 3-2.2 5.2-2.2h20.9c1.8 0 3.5.8 5 2.2"/></g><path d="m195.1 104.9c-12.5-12.5-27.5-18.7-45.1-18.7-17.6 0-32.6 6.2-45.1 18.7-12.5 12.5-18.7 27.5-18.7 45.1 0 17.6 6.2 32.6 18.7 45.1 12.5 12.5 27.5 18.7 45.1 18.7 17.6 0 32.6-6.2 45.1-18.7 12.5-12.5 18.7-27.5 18.7-45.1 0-17.6-6.2-32.6-18.7-45.1m-10.4 79.8c-9.5 9.5-21.1 14.3-34.7 14.3-13.6 0-25.1-4.8-34.7-14.3-9.5-9.5-14.3-21.1-14.3-34.7 0-13.6 4.8-25.1 14.3-34.7 9.5-9.5 21.1-14.3 34.7-14.3 13.6 0 25.1 4.8 34.7 14.3 9.5 9.5 14.3 21.1 14.3 34.7 0 13.6-4.8 25.1-14.3 
34.7"/></svg></div></div></div>
	</div>

	<div class="current-conditions">
		<div class="temp">31&deg;C</div>
		<div class="summary">Clear</div>
	</div>

</div>

Using Axios and Cheerio, we are going to fetch the div with the class current-weather.


Install Axios and Cheerio

npm install axios
npm install cheerio

Axios is a popular HTTP client for node.js that is used to perform HTTP requests. Cheerio implements a subset of the jQuery API for the server, which makes it convenient for parsing and querying the HTML of a scraped page.

Make HTTP request with Axios and use Cheerio to parse it

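const axios = require('axios');
const cheerio = require('cheerio');

axios.get("https://forecast7.com/en/19d0872d88/mumbai/")
    .then(response => {
      // response.data holds the page HTML as a string
      let $ = cheerio.load(response.data);
    })
    .catch(error => console.log(error));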

Read the div using Cheerio

let currentWeather = $('.current-weather').html();

console.log(currentWeather);

//<h1 title="Mumbai, Maharashtra, India"><b>Mumbai</b> Weather</h1>
//<div class="current-icon">
//	<div class="icon"><div class="w-icon iconvault clear-day"><div class="sun"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 300 300" enable-background="new 0 0 300 300"><g class="sunRays"><path d="m65.3 157.2h-20.9c-1.8 0-3.5-.7-5-2.2-1.5-1.5-2.2-3.1-2.2-5 0-1.8.7-3.5 2.2-5 1.5-1.5 3.1-2.2 5-2.2h20.9c2.2 0 3.9.7 5.2 2.2 1.3 1.5 1.9 3.1 1.9 5 0 1.8-.6 3.5-1.9 5-1.3 1.4-3 2.2-5.2 2.2"/><path d="m70.2 80.1c-3.3-3.3-3.3-6.6 0-9.9 3.3-3.3 6.6-3.3 9.9 0l14.9 14.9c3.7 3.3 3.8 6.6.3 9.9-3.5 3.3-6.9 3.3-10.2 0l-14.9-14.9"/><path d="m95.3 205c3.5 3.3 3.4 6.6-.3 9.9l-14.9 14.9c-3.3 3.3-6.6 3.3-9.9 0-3.3-3.3-3.3-6.6 0-9.9l14.9-14.9c3.3-3.3 6.7-3.3 10.2 0"/><path d="m142.8 65.3v-20.9c0-1.8.7-3.5 2.2-5 1.5-1.5 3.1-2.2 5-2.2 1.8 0 3.5.7 5 2.2 1.5 1.5 2.2 3.1 2.2 5v20.9c0 2.2-.7 3.9-2.2 5.2-1.5 1.3-3.1 1.9-5 1.9-1.8 0-3.5-.6-5-1.9-1.4-1.3-2.2-3-2.2-5.2"/><path d="m157.2 234.7v20.9c0 1.8-.7 3.5-2.2 5-1.5 1.5-3.1 2.2-5 2.2-1.8 0-3.5-.7-5-2.2-1.5-1.5-2.2-3.1-2.2-5v-20.9c0-2.2.7-3.9 2.2-5.2 1.5-1.3 3.1-1.9 5-1.9 1.8 0 3.5.6 5 1.9 1.4 1.3 2.2 3 2.2 5.2"/><path d="m229.8 219.9c3.3 3.3 3.3 6.6 0 9.9-3.3 3.3-6.6 3.3-9.9 0l-14.9-14.9c-3.7-3.3-3.8-6.7-.3-10.2 3.5-3.5 6.9-3.4 10.2.3l14.9 14.9"/><path d="m204.7 95c-3.5-3.3-3.4-6.6.3-9.9l14.9-14.9c3.3-3.3 6.6-3.3 9.9 0 3.3 3.3 3.3 6.6 0 9.9l-14.9 14.9c-3.3 3.3-6.7 3.3-10.2 0"/><path d="m260.6 145c1.5 1.5 2.2 3.1 2.2 5 0 1.8-.7 3.5-2.2 5-1.5 1.5-3.1 2.2-5 2.2h-20.9c-2.2 0-3.9-.7-5.2-2.2-1.3-1.5-1.9-3.1-1.9-5 0-1.8.6-3.5 1.9-5 1.3-1.5 3-2.2 5.2-2.2h20.9c1.8 0 3.5.8 5 2.2"/></g><path d="m195.1 104.9c-12.5-12.5-27.5-18.7-45.1-18.7-17.6 0-32.6 6.2-45.1 18.7-12.5 12.5-18.7 27.5-18.7 45.1 0 17.6 6.2 32.6 18.7 45.1 12.5 12.5 27.5 18.7 45.1 18.7 17.6 0 32.6-6.2 45.1-18.7 12.5-12.5 18.7-27.5 18.7-45.1 0-17.6-6.2-32.6-18.7-45.1m-10.4 79.8c-9.5 9.5-21.1 14.3-34.7 14.3-13.6 0-25.1-4.8-34.7-14.3-9.5-9.5-14.3-21.1-14.3-34.7 0-13.6 4.8-25.1 14.3-34.7 9.5-9.5 21.1-14.3 34.7-14.3 13.6 0 25.1 4.8 34.7 14.3 9.5 9.5 14.3 21.1 14.3 34.7 0 13.6-4.8 25.1-14.3 
//34.7"/></svg></div></div></div>
//	</div>

//	<div class="current-conditions">
//		<div class="temp">31&deg;C</div>
//		<div class="summary">Clear</div>
//	</div>

This returns the inner HTML of the current-weather div.


To get the weather as plain text instead, use the text() method.

let currentWeather = $('.current-weather').text();
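The text of a nested div usually carries the tabs and newlines that sat between its child elements, so it may need tidying before display. Here is a small sketch in plain JavaScript; `collapseWhitespace` and `parseTemperature` are hypothetical helpers, and `rawText` is an illustrative string rather than the exact output for the live page.

```javascript
// Hypothetical helper: squeezes runs of whitespace into single spaces.
function collapseWhitespace(text) {
  return text.replace(/\s+/g, ' ').trim();
}

// Hypothetical helper: extracts the first number followed by a degree sign,
// e.g. "31°C" -> 31. Returns null when no temperature is found.
function parseTemperature(text) {
  const match = text.match(/(-?\d+(?:\.\d+)?)\s*°/);
  return match ? Number(match[1]) : null;
}

// Illustrative input only: roughly the shape .text() could produce here.
const rawText = '\n\n\tMumbai Weather\n\n\t\n\t\t31°C\n\t\tClear\n\t\n\n';

console.log(collapseWhitespace(rawText)); // Mumbai Weather 31°C Clear
console.log(parseTemperature(rawText));   // 31
```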

Of course, the scraped HTML does not include the site's CSS, but it can still be displayed in an HTML page. You can also modify the parsed HTML before sending it on, so that the output comes out in the format you need.

My function in node.js would look like:

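const axios = require('axios');
const cheerio = require('cheerio');

module.exports.getCurrentWeather = async function(req, res){
  try {
    // Fetch the page and load its HTML into Cheerio
    const response = await axios.get("https://forecast7.com/en/19d0872d88/mumbai/");
    let $ = cheerio.load(response.data);

    let title = $('.current-weather').html();
    // Shrink the icon and demote the heading so the block fits the widget
    title = title.replace("svg xmlns","svg height=\"50px\" width=\"50px\" xmlns");
    title = title.replace("<h1","<h4");
    title = title.replace("</h1","</h4");
    res.send(title);
  } catch (error) {
    // Log the failure and still answer the request (assumes an Express-style res)
    console.log(error);
    res.status(500).send("Unable to fetch the current weather");
  }
}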

The output from this function would look like:

Displaying current weather in HTML / node.js
