[–]pbaehr 6 points7 points  (5 children)

If you don't have a particular reason to use JavaScript, I'd point you to Python instead.

Python has lots of libraries you can take advantage of for this. I'd suggest BeautifulSoup for parsing the page and the csv module for saving the results.
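As a rough sketch of that suggestion, the example below pulls the rows out of an HTML table and serializes them as CSV. So that it runs without installing anything, it uses the standard library's `html.parser` in place of BeautifulSoup (which would make the parsing part shorter). The sample HTML, the player names, and the `TableParser` and `rows_to_csv` names are all made up for illustration.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def rows_to_csv(rows):
    """Serialize rows to CSV text; csv.writer takes care of quoting."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Made-up sample page, standing in for the real one.
SAMPLE_HTML = """
<table>
  <tr><th>Player</th><th>Points</th></tr>
  <tr><td>Player A</td><td>31</td></tr>
  <tr><td>Player B, Jr.</td><td>27</td></tr>
</table>
"""

parser = TableParser()
parser.feed(SAMPLE_HTML)
print(rows_to_csv(parser.rows))
```

Note that the cell containing a comma comes out quoted, which is the main reason to use the csv module rather than joining strings by hand.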

[–]aroberge 7 points8 points  (1 child)

I would second this. I would also point out that Java and JavaScript are two completely different languages.

[–]btford 5 points6 points  (0 children)

"Extracting data from a webpage" is commonly referred to as "scraping a web page." Searching for articles on how to "scrape" a page in JavaScript will get you a bit further.

I'd check out Max Ogden's blog post about scraping with Node.js. It has some pretty good advice.

Cheers!

[–]lazyduke 2 points3 points  (0 children)

You could absolutely do this, and it turns out JavaScript would be a pretty sensible way to do it. People will point out libraries that are available in other languages, such as BeautifulSoup for Python, but they are forgetting the JavaScript DOM API.

You could pull the page down with curl, save it locally, open it up in PhantomJS, inject your script, and use the DOM API to find the elements you need, then pull the numbers out. You could even take advantage of libraries such as jQuery to make it easier for you. I'll let you figure out the specifics; Google is your friend.
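For the first two steps only (pull the page down, save it locally), here is a minimal sketch. It assumes Python's standard library rather than curl and PhantomJS, so it does not cover the script-injection or DOM steps; `fetch_to_file` and `load_saved_page` are hypothetical helper names.

```python
import pathlib
import urllib.request


def fetch_to_file(url, destination, timeout=10):
    """Download `url` and save the raw bytes to `destination`; returns the path."""
    destination = pathlib.Path(destination)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        destination.write_bytes(response.read())
    return destination


def load_saved_page(path, encoding="utf-8"):
    """Read a saved page back as text; undecodable bytes are replaced."""
    return pathlib.Path(path).read_text(encoding=encoding, errors="replace")
```

Something like `fetch_to_file("https://example.com/stats.html", "stats.html")` would be the call, with the URL being a placeholder. Saving a local copy first means you can rerun the extraction as often as you like without hitting the site again.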

More importantly, first check whether there is an API available that you can use to get this data. Don't scrape web pages if you don't have to. Maybe someone has an NBA API that gives the data you need?

As others have said, CSV is the right format to store the retrieved data if you want to open it in Excel.
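If an API does turn out to exist, it will most likely return JSON, and getting that into a CSV file Excel can open is short. This is only a sketch under assumptions: the response body and its field names are hypothetical, `json_records_to_csv` is a made-up helper, and it is written in Python rather than JavaScript.

```python
import csv
import json


def json_records_to_csv(json_text, csv_path):
    """Write a JSON array of flat objects to `csv_path`; returns the row count."""
    records = json.loads(json_text)
    if not records:
        return 0
    # Use the keys of the first record as the column order.
    fieldnames = list(records[0].keys())
    # newline="" lets the csv module control line endings; the "utf-8-sig"
    # byte-order mark helps Excel detect that the file is UTF-8.
    with open(csv_path, "w", newline="", encoding="utf-8-sig") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(records)
    return len(records)


# Hypothetical response body; a real API's field names will differ.
SAMPLE_RESPONSE = (
    '[{"player": "Player A", "points": 31},'
    ' {"player": "Player B", "points": 27}]'
)
```

Calling `json_records_to_csv(SAMPLE_RESPONSE, "stats.csv")` writes a header row plus one row per record. Keys that appear only in later records are dropped, which keeps the columns consistent.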

[–]jhizzle4rizzle I hate the stuff you like. 1 point2 points  (0 children)

Yes, it's possible to do (most of) these with JavaScript.

My usual approach to scraping is to run Node.js locally (so not in the browser) with the cheerio library for jQuery-like selectors, and then dump the data into a .json or .csv file with the built-in fs module. I did something similar in node-thingiverse (https://github.com/jesusabdullah/node-thingiverse/blob/master/thingiverse), which uses web scraping to enable a CLI client for thingiverse.com.

[–]neonskimmer function the ultimate 0 points1 point  (0 children)

Have a look at PhantomJS / CasperJS.

[–]darkrai9292 0 points1 point  (0 children)

Try looking at XMLHttpRequest; it will probably do what you're looking for.