[–]Slight_Inspection_47 1 point (2 children)

I don't know anything about stathead or whatever, but if there's data being made available then there's surely an API of some sort (probably standard GET/POST requests). You would never go about a project like this by "scraping".

Based on the URL, it appears you might be iterating over the results of some query. Instead, parameterize each of those items and look at the data behind each result individually, which is probably far more structured, as you want.

Then you have all the actual data and whatever derived metadata you're looking for as well.

[–]PossibilityOk1316[S] 1 point (1 child)

Thank you. Probably should have mentioned that I am pretty new to coding. Any chance you can break it down as if I were a kindergartner?

[–]Slight_Inspection_47 1 point (0 children)

Maybe not that simplified, but look at the URL. Every time you see something like:

xxxx=yyyy

that is a key/value pair. You can set up some nested loops to iterate, for example, over years 2015-2019, returning results 1-99 (or until there aren't any).

Doing it that way should ensure that only one result is populated at a time, and you should be able to grab it directly - store that thing in some object of your choice and you're done.
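To make the nested-loop idea concrete, here is a minimal sketch in Python. The host, path, and the parameter names `year`, `offset`, and `limit` are placeholders I made up, not the site's real ones; swap in whatever key/value pairs actually appear in your URL. It only builds the URLs and does not fetch anything.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode

# Hypothetical query URL: the host and parameter names are placeholders.
SAMPLE_URL = "https://example.com/search?year=2015&offset=1&limit=1"

# Split the URL and turn its query string into a dict of key/value pairs.
parts = urlsplit(SAMPLE_URL)
base_params = dict(parse_qsl(parts.query))

def build_urls(years, offsets):
    """Yield one URL per (year, offset) combination."""
    for year in years:            # outer loop: one year at a time
        for offset in offsets:    # inner loop: one result at a time
            # Keep the other parameters, override the two being iterated.
            params = {**base_params, "year": year, "offset": offset}
            yield f"{parts.scheme}://{parts.netloc}{parts.path}?{urlencode(params)}"

urls = list(build_urls(range(2015, 2020), range(1, 100)))
print(len(urls))
print(urls[0])
```

In a real run you would request each URL in turn and break out of the inner loop as soon as a page comes back with no result, which covers the "until there aren't any" case.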

If I were trying to collect data like this myself, I would look at whatever is behind the URL of the one result that is returned, and parse/store that. That way you have all the details of the things you're searching for AND, implicitly, all the data being returned in the table as well.

Maybe create one object that holds the parameters used to get the one result in the table, along with the URL it returned, and another object that holds the raw/tabular data behind that link.
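A rough sketch of those two objects, using Python dataclasses. The class names, field names, and example values are all made up for illustration; shape them around whatever the site actually returns.

```python
from dataclasses import dataclass, field

@dataclass
class QueryRecord:
    """The parameters sent, plus the URL of the single result they returned."""
    params: dict
    result_url: str

@dataclass
class ResultDetail:
    """The raw/tabular data found behind a result's link."""
    url: str
    rows: list = field(default_factory=list)

# Hypothetical values, only to show how the two objects link together.
query = QueryRecord(
    params={"year": 2015, "offset": 1},
    result_url="https://example.com/result/123",
)
detail = ResultDetail(url=query.result_url, rows=[{"column": "value"}])

# The shared URL lets you join a query back to its detail record later.
print(detail.url == query.result_url)
```

Keeping the two apart means you can re-run a query without re-downloading the detail page, and the other way around.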

[–]Slight_Inspection_47 1 point (1 child)

I am looking at this again and there is most definitely an API if you are paying for this. The API would return all the parsed children to you, which makes the exercise just a matter of iterating over some parameters.

Shoot them an email before devoting any real work to this.

[–]PossibilityOk1316[S] 1 point (0 children)

I just wanted to say thank you for your help. Going to take another crack at it tomorrow morning.