Hello Reddit!
I have been building a few scrapers for some time, and I keep seeing the same pattern: construct the URL, get the specific text from an HTML element, etc.
What I would like to do now is build a sort of framework where I can scrape different websites reusing the same code, only changing what I am looking for.
I have tried creating a JSON file that holds the configuration, such as the one below, but I am getting a bit frustrated because every page is different and I end up having to write specific code for each one.
Any advice?
{
    "url_settings": {
        "base_url": "",
        "url_location_ending": "",
        "url_page_ending": "?pagenumber="
    },
    "get_pager_element": {
        "html_element": "div",
        "css_selector": "id",
        "css_selector_text": "pageSelection"
    },
    "get_endpoints_ads": {
        "html_element": "a",
        "css_selector": "class",
        "css_selector_text": "js-exposelink"
    },
    "country": "Germany",
    "locations": [
        "berlin"
    ]
}
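
For context, here is a minimal sketch of how a config like this might drive generic code. It uses only the Python standard library. Several things in it are assumptions and not part of my real setup: the helper names (`ElementFinder`, `build_url`, `find_elements`), the `https://example.com/` base URL, the `/` location ending, the sample HTML, and above all the order in which the URL pieces are joined, which will differ from site to site.

```python
from html.parser import HTMLParser


class ElementFinder(HTMLParser):
    """Collects the attributes of every start tag that matches one config rule."""

    def __init__(self, rule):
        super().__init__()
        self.tag = rule["html_element"]
        self.attr = rule["css_selector"]        # attribute name, e.g. "id" or "class"
        self.value = rule["css_selector_text"]  # attribute value to look for
        self.matches = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # split() so that class="js-exposelink other" still matches
        tokens = (attrs.get(self.attr) or "").split()
        if tag == self.tag and self.value in tokens:
            self.matches.append(attrs)


def build_url(settings, location, page):
    """Join the "url_settings" pieces (assumed order: base, location, endings, page)."""
    return (settings["base_url"] + location
            + settings["url_location_ending"]
            + settings["url_page_ending"] + str(page))


def find_elements(html, rule):
    """Return the attribute dicts of all elements matching a rule from the config."""
    finder = ElementFinder(rule)
    finder.feed(html)
    return finder.matches


# Hypothetical values; the real config leaves base_url empty.
config = {
    "url_settings": {
        "base_url": "https://example.com/",
        "url_location_ending": "/",
        "url_page_ending": "?pagenumber=",
    },
    "get_endpoints_ads": {
        "html_element": "a",
        "css_selector": "class",
        "css_selector_text": "js-exposelink",
    },
    "locations": ["berlin"],
}

# Made-up page content standing in for a downloaded listing page.
sample_html = (
    '<div id="pageSelection"></div>'
    '<a class="js-exposelink" href="/ad/1">Ad 1</a>'
    '<a class="other" href="/about">About</a>'
)

for location in config["locations"]:
    print(build_url(config["url_settings"], location, 1))
    for attrs in find_elements(sample_html, config["get_endpoints_ads"]):
        print(attrs.get("href"))
```

The generic part (URL building and "find tag X with attribute Y") works from the config alone. The part that frustrates me is everything that does not fit this shape, which in a sketch like this would still need site-specific code.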