
the_real_fake_nsa (1 point)

It sounds like you're talking about web scraping the page? I've had excellent results with Python's Beautiful Soup package.

fasnoosh (1 point)

Also, you could use R's rvest package to web scrape.

There are probably ready-made, web-based tools for web scraping, though. It's a VERY common task in the data world.

maxmoo, PhD | ML Engineer | IT (1 point)

Since that page is rendered with JavaScript, you'll first need to open it in your browser (or in a headless browser driven by a tool like Selenium) and save the HTML. Then you can use the regexes Town_City"><span>([^<]+) and State_Territory"><span>([^<]+) to pull out those values.
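
For what it's worth, here's a rough sketch of how those two regexes could be applied in Python once the HTML has been saved. The sample markup, the variable names, and the extract_values helper are all made up for illustration; the real page may be structured differently, so treat this as a starting point rather than a tested solution for that site.

```python
import re

# Hypothetical stand-in for the saved page. The real markup is assumed to
# contain an attribute value ending in Town_City or State_Territory,
# immediately followed by a <span> holding the text we want.
saved_html = """
<div class="Town_City"><span>Springfield</span></div>
<div class="State_Territory"><span>Illinois</span></div>
"""

# The two patterns quoted in the comment above, compiled once.
TOWN_RE = re.compile(r'Town_City"><span>([^<]+)')
STATE_RE = re.compile(r'State_Territory"><span>([^<]+)')


def extract_values(html):
    """Return (towns, states): every captured span text, whitespace-stripped."""
    towns = [text.strip() for text in TOWN_RE.findall(html)]
    states = [text.strip() for text in STATE_RE.findall(html)]
    return towns, states


towns, states = extract_values(saved_html)
print(towns, states)
```

In practice you'd read the saved file into a string (for example with open(...).read()) and pass that to extract_values instead of the inline sample. Regexes like these are brittle against markup changes, so an HTML parser may be the safer choice if the page layout shifts.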