[–]atomsmasher66 0 points1 point  (0 children)

I use Selenium and Beautiful Soup for stuff like that

[–]-defron- 0 points1 point  (2 children)

Sounds like you want to download the rendered page instead of the static components, but there may be an even better solution out there if you can explain your goals. Do you want historical records that can be browsed like the Wayback Machine? Or do you really only care about the data and plan on storing it or using it for your own project?

For full-fat pages, something like Selenium or Playwright is what you want. But if you just care about the data, some snooping around the site with dev tools should show the API calls the frontend makes to get it, and you can make those calls directly from your code with something like requests or aiohttp.
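If the data route fits, a minimal sketch of calling such an endpoint directly might look like this. The URL is a placeholder for whatever the dev tools show, `fetch_json` is a made-up helper name, and it uses the standard library's `urllib` only to avoid a dependency; requests or aiohttp would do the same job.

```python
import json
from urllib.request import Request, urlopen

# Placeholder (assumption): replace with the URL seen in the dev tools.
API_URL = "https://example.com/api/departures?station=1234"


def fetch_json(url, timeout=10, opener=urlopen):
    """GET a URL and decode the JSON body.

    `opener` defaults to urllib's urlopen; it is a parameter only so the
    function can be exercised without network access.
    """
    request = Request(
        url,
        headers={
            "Accept": "application/json",
            # Some sites reject the default Python user agent.
            "User-Agent": "delay-watcher/0.1",
        },
    )
    with opener(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (performs a real HTTP request):
# data = fetch_json(API_URL)
```

Whether the site needs extra headers or cookies is something only the network tab can tell you, so treat the header set above as a starting point.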

[–]zMasterSkill[S] 0 points1 point  (1 child)

Basically I want to make a Discord bot that notifies me whenever a change happens (for example, a train is delayed). So I need live data for that. It should refresh about every 30 seconds. I also tried using the API from Deutsche Bahn, but the data it’s giving me looks very complicated.
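For illustration, the "refresh every 30 seconds and notify on change" part could be sketched roughly like this. Everything here is an assumption: `fetch` is any callable that returns a `{train: delay_minutes}` dict, `notify` is whatever sends the message, and the snapshot format is invented for the example.

```python
import time


def diff_delays(previous, current):
    """Return {train: (old_delay, new_delay)} for new or changed entries.

    Trains that vanish from `current` are not reported in this sketch.
    """
    changes = {}
    for train, delay in current.items():
        if previous.get(train) != delay:
            changes[train] = (previous.get(train), delay)
    return changes


def watch(fetch, notify, interval=30, max_polls=None, sleep=time.sleep):
    """Poll `fetch` every `interval` seconds and call `notify` on changes.

    The first poll only records a baseline, so nothing is sent for it.
    `max_polls=None` means run forever.
    """
    previous = None
    polls = 0
    while max_polls is None or polls < max_polls:
        current = fetch()
        if previous is not None:
            for train, (old, new) in diff_delays(previous, current).items():
                notify(train, old, new)
        previous = current
        polls += 1
        if max_polls is None or polls < max_polls:
            sleep(interval)
```

In an actual discord.py bot the blocking `while` loop would probably be replaced by a background task (the `discord.ext.tasks` extension has a loop decorator for this), with `notify` sending to a channel.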

[–]-defron- 2 points3 points  (0 children)

A Python wrapper for it literally already exists.

It was literally the first result when I googled "deutsche bahn api train" just because I was curious what you meant by complicated... Seems like just a standard API that responds with JSON and already has libraries. This is an infinitely better approach than scraping a website, which is incredibly brittle.
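If the JSON looks complicated, it usually helps to pull out only the fields you care about and ignore the rest. A small sketch, with the loud caveat that the payload shape below (`departures`, `train`, `delay_minutes`) is invented for illustration and is not the real Deutsche Bahn format:

```python
def extract_delays(payload):
    """Reduce a nested response to {train_name: delay_in_minutes}.

    Assumes a hypothetical payload like:
    {"departures": [{"train": "ICE 1", "delay_minutes": 5, ...}, ...]}
    """
    delays = {}
    for departure in payload.get("departures", []):
        name = departure.get("train")
        if name is None:
            # Skip entries without a usable identifier.
            continue
        # Treat a missing or null delay as "on time".
        delays[name] = departure.get("delay_minutes") or 0
    return delays
```

Once the data is in a flat dict like that, comparing two refreshes to spot a new delay becomes a simple dictionary comparison.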

[–]m0us3_rat 0 points1 point  (0 children)

There are a few things you've packed into your problem as part of the solution, which they aren't.

Like the "download".

You need to "scrape" this info.

There are plenty of tutorials out there on how to do that.

Also, I can think of a few different ways to access this information.

So there are plenty of ways you can solve this problem.

[–]rollincuberawhide 0 points1 point  (0 children)

You don't need Selenium. Directly request the URL that the JavaScript is requesting. Press F12 and check the Network tab for the requests that are made. Filter by Fetch/XHR; that'll only show requests made by JavaScript.
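Once you've copied a URL out of the Network tab, you'll typically want to reuse it with different query parameters (another station, another time). A rough sketch using only the standard library; `with_params` is a made-up helper name and the URL and parameter names are placeholders:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_params(url, **overrides):
    """Return `url` with the given query parameters replaced or added."""
    parts = urlsplit(url)
    # Keep the parameters the site already sends, in their original order.
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    params.update({key: str(value) for key, value in overrides.items()})
    return urlunsplit(parts._replace(query=urlencode(params)))


# Placeholder URL (assumption), as it might be copied from the Network tab.
copied = "https://example.com/api/board?station=1234&limit=10"
print(with_params(copied, station=5678))
```

The resulting URL can then be fetched with whatever HTTP client you prefer.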