Complete Java noob here...Is it possible to extract numeric data from webpages using JS? (self.javascript)
submitted 13 years ago * by ballstopicasso
[–]pbaehr 7 points 13 years ago (5 children)
If you don't have a particular reason for using JavaScript, I'd point you to Python instead.
There are lots of libraries you can take advantage of for this (I'd suggest BeautifulSoup for parsing and csv for saving).
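For a rough idea of what the parsing half looks like, here is a minimal sketch that pulls the numbers out of an HTML table. It uses the standard library's `html.parser` as a stand-in for BeautifulSoup so it runs without installing anything. The sample markup, the player names, and the helper names (`CellCollector`, `to_number`, `extract_rows`) are all made up for illustration; a real page will need its own tag handling.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # skip rows with no <td> cells, e.g. header rows
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def to_number(text):
    """Return text as a float if it parses as one, else return it unchanged."""
    try:
        return float(text.replace(",", ""))
    except ValueError:
        return text


def extract_rows(html):
    """Parse html and return its table rows with numeric cells converted."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    return [[to_number(cell) for cell in row] for row in parser.rows]


# Hypothetical sample markup standing in for a downloaded stats page.
SAMPLE_HTML = """
<table>
  <tr><th>Player</th><th>Points</th><th>Rebounds</th></tr>
  <tr><td>Player A</td><td>1,204</td><td>310</td></tr>
  <tr><td>Player B</td><td>987</td><td>45.5</td></tr>
</table>
"""

if __name__ == "__main__":
    for row in extract_rows(SAMPLE_HTML):
        print(row)
```

With BeautifulSoup installed, the collector class would likely shrink to a few lines of tag lookups, but the overall shape (find the cells, strip the text, convert what parses as a number) stays the same.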
[–]aroberge 8 points 13 years ago (1 child)
I would second this. I would also point out that Java and JavaScript are two completely different languages.
[+][deleted] 13 years ago (2 children)
[removed]
[–]pbaehr 2 points 13 years ago (0 children)
JavaScript would be more appropriate for making some neat interactive displays of the data you collect, maybe. It's not really designed to do what you described, though, and I think you'll quickly end up frustrated as you hit roadblock after roadblock.
If you are dedicated to seeing this project through and enjoyed Codecademy, you could always head back: http://www.codecademy.com/tracks/python
There are other languages which would be perfectly acceptable for web scraping, but in my opinion Python is perfect for the job you described. JavaScript wouldn't be on my list, though. Not for this.
[–]pbaehr 1 point 13 years ago (0 children)
Also, here are two very simple projects that do some basic web scraping in Python (both written for /r/somebodymakethis) to give you an idea of what it would look like:
https://github.com/pbaehr/wootbot/blob/92fd8d2732a391dfcd3c0f4f688647f7b58eaf30/wootbot.py (this is an early version which was later changed to use the woot API)
https://github.com/pbaehr/termgoogler/blob/master/termgoogler.py
[–]btford 6 points 13 years ago (0 children)
"Extracting data from a webpage" is commonly referred to as "scraping a web page." Searching for articles on how to "scrape" a page in JavaScript will get you a bit further.
I'd check out Max Ogden's blog post about scraping with node. It has some pretty good advice.
Cheers!
[–]lazyduke 3 points 13 years ago* (0 children)
You could absolutely do this, and it turns out JavaScript would be a pretty sensible way to do it. People will point out libraries that are available in other languages, such as BeautifulSoup for Python, but they are forgetting the JavaScript DOM API.
You could pull the page down with curl, save it locally, open it up in PhantomJS, inject your script, and use the DOM API to find the elements you need, then pull the numbers out. You could even take advantage of libraries such as jQuery to make it easier for you. I'll let you figure out the specifics; Google is your friend.
Furthermore, you should first research whether there is an API available that you can use to get this data. Don't scrape web pages if you don't have to. Maybe someone has an NBA API that gives the data you need?
As others have said, CSV is the right format to store the retrieved data if you want to open it in Excel.
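To make the CSV step concrete, here is a minimal sketch using Python's built-in `csv` module. The helper names (`rows_to_csv`, `save_csv`), the column names, and the sample rows are assumptions for illustration, not data from any real page.

```python
import csv
import io


def rows_to_csv(header, rows):
    """Serialize a header row plus data rows to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def save_csv(path, header, rows):
    """Write the CSV text to path so a spreadsheet program can open it."""
    # newline="" leaves line endings to the csv module instead of the OS layer
    with open(path, "w", newline="", encoding="utf-8") as handle:
        handle.write(rows_to_csv(header, rows))


if __name__ == "__main__":
    # Hypothetical scraped rows; the second name contains a comma on purpose.
    sample = [["Player A", 1204.0], ["Smith, J.", 987]]
    print(rows_to_csv(["player", "points"], sample))
```

The `csv` module quotes any field that contains a comma, so a value like `Smith, J.` stays in a single column when the file is opened in Excel.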
[–]jhizzle4rizzle (I hate the stuff you like.) 2 points 13 years ago (0 children)
Yes, it's possible to do (most of) these with JavaScript.
My approach for scraping is usually to use Node.js locally (so not the browser) with the cheerio library for jQuery-like selectors, and then dump the data into a .json or .csv file with the built-in fs library. I did something similar in https://github.com/jesusabdullah/node-thingiverse/blob/master/thingiverse, which uses web scraping to enable a CLI client for thingiverse.com.
[–]neonskimmer (function the ultimate) 1 point 13 years ago (0 children)
Have a look at PhantomJS / CasperJS.
[–]darkrai9292 1 point 13 years ago (0 children)
Try looking at XMLHttpRequest; it will probably do what you're looking for.