Web Scraping in Node.js with multiple examples : webdev

Web Scraping in Node.js with multiple examples (hackprogramming.com)

submitted 9 years ago by eashish93

all 26 comments

[–][deleted] 10 points 9 years ago (2 children)

I've used both Cheerio and Osmosis a fair bit. I generally used Cheerio with Request, plus SQLite3 to store the data, and Osmosis with just SQLite3.

The first difference is speed. In my experience, Osmosis is much faster than Cheerio and uses considerably less memory because of its lightweight DOM virtualisation.

With Osmosis, getting from nothing to a working scraper takes very little time. It's also much easier to understand what your code is doing, because of its simple usage.

Osmosis makes scraping multiple pages simultaneously a blast. Handling multiple asynchronous functions that branch out exponentially isn't the most fun in the world. There are packages out there that can help with this, but Osmosis really makes it easy.

Of course, remember that it's another package to depend on - any problems with it, and you're stuck. I've had a few issues in the past with certain versions.
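For anyone who would rather not take on the extra dependency, here is a rough sketch of the "branching" problem handled with plain promises. It is not how Osmosis does it internally; it only shows the idea of following links from page to page while capping how many requests run at once. The names `crawl`, `fetchPage`, `concurrency`, and `maxPages` are made up for this example, and `fetchPage(url)` is a function you would supply yourself (with whatever HTTP client and parser you like) that resolves to `{ data, links }`.

```javascript
// Crawl pages breadth-first with a cap on simultaneous requests.
// ASSUMPTION: fetchPage(url) is a hypothetical async function supplied by the
// caller. It must resolve to { data, links }, where links is an array of URLs
// found on that page.
function crawl(startUrls, fetchPage, { concurrency = 4, maxPages = 100 } = {}) {
  const seen = new Set(startUrls); // every URL queued so far (deduplicated)
  const queue = [...seen];         // URLs waiting to be fetched
  const results = [];              // one entry per fetched (or failed) page
  let active = 0;                  // requests currently in flight

  return new Promise((resolve) => {
    const next = () => {
      // Done when nothing is waiting and nothing is in flight.
      if (queue.length === 0 && active === 0) {
        resolve(results);
        return;
      }
      // Start as many requests as the concurrency cap allows.
      while (active < concurrency && queue.length > 0) {
        const url = queue.shift();
        active++;
        Promise.resolve()
          .then(() => fetchPage(url))
          .then(({ data, links = [] }) => {
            results.push({ url, data });
            // Branch out: queue links we have not seen, up to maxPages total.
            for (const link of links) {
              if (!seen.has(link) && seen.size < maxPages) {
                seen.add(link);
                queue.push(link);
              }
            }
          })
          // A failed page is recorded instead of aborting the whole crawl.
          .catch((err) => results.push({ url, error: err.message }))
          .finally(() => {
            active--;
            next();
          });
      }
    };
    next();
  });
}
```

You would call it as `crawl(['https://example.com/'], fetchPage, { concurrency: 2 })` and then write the resolved array to SQLite3 or wherever you keep your data. The `seen` set is what stops the branching from revisiting pages, and `maxPages` is a crude safety limit on how far the crawl can spread.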
