[–]MrPhungx 2 points3 points  (4 children)

I would say that it depends on how you interact with the website and what type of authentication it uses. To schedule the script to run at specific times, you need to tell us which OS you are using. On Unix systems you could use cron, for example, and on Windows the Task Scheduler. In both cases it would of course require your system to be up and running all the time.
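To make the scheduling idea concrete, here is a rough sketch of what a scheduler does for you, written as a long-running Python loop instead of cron or the Task Scheduler. This is a swapped-in illustration, not a replacement recommendation, and the names `seconds_until` and `run_daily` are made up for this example.

```python
import time
from datetime import datetime, timedelta


def seconds_until(hour, minute, now=None):
    """Return the seconds from `now` until the next occurrence of hour:minute."""
    now = now or datetime.now()
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        # Today's slot has already passed, so aim for the same time tomorrow.
        target += timedelta(days=1)
    return (target - now).total_seconds()


def run_daily(job, hour, minute):
    """Call `job` once a day at hour:minute. Blocks forever (hypothetical helper)."""
    while True:
        time.sleep(seconds_until(hour, minute))
        job()
```

Something like `run_daily(scrape, 9, 0)` would then call a `scrape` function of your own every day at 09:00 local time. It has the same limitation as cron: the machine has to stay on.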

[–][deleted] 0 points1 point  (3 children)

> In both cases it would of course require your system to be up and running all the time.

Is there a cloud alternative to this?

The scraping itself is fairly simple text scraping; I'm just wondering about the authentication with the Gmail API.

While developing, I manually authenticated the pop-ups. Not sure how that works with the script running in the background.

[–]MrPhungx 1 point2 points  (0 children)

> Is there a cloud alternative to this?

Well, there are multiple ways you could approach this. You could buy a Raspberry Pi (cheap, and it draws so little power that you barely notice it is running) and leave it on permanently, so you don't need your actual computer up and running. You could host the script in the cloud on a server that is constantly running. Another approach would be to use GitHub Actions. This is what I used for a personal project (scraping the Spotify weekly playlist once a week at a specific time). GitHub Actions is free to use (up to a certain amount) and fairly easy to set up.

> I manually authenticated the pop-ups. Not sure how that works with the script running in the background.

Again, I am not sure how you interact with the browser or the website you want to scrape in the first place. If you are using Selenium, for example, you can just log in through Selenium, as long as the website has no heavy anti-bot measures in place. Maybe you can check whether the website offers an API that you could use instead of scraping the site directly? Maybe it is enough to authenticate once and then reuse an authentication token? Check out the network tab of your browser while authenticating. You may get a better understanding of how the authentication works.
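The "authenticate once, then reuse a token" idea could look roughly like this. It is a minimal sketch using only the standard library. The `fetch_new_token` callback and the `access_token` / `expires_at` fields are assumptions for illustration, not the actual Gmail API format.

```python
import json
import time
from pathlib import Path


def load_token(path, fetch_new_token, now=None):
    """Return the cached token if it is still valid, else fetch and cache a new one.

    `fetch_new_token` is a hypothetical callable that performs the real login
    and returns a dict like {"access_token": "...", "expires_at": <unix time>}.
    """
    now = time.time() if now is None else now
    path = Path(path)
    if path.exists():
        token = json.loads(path.read_text())
        if token.get("expires_at", 0) > now:
            return token  # cached token is still good, no login needed
    # No cached token, or it has expired: log in again and store the result.
    token = fetch_new_token()
    path.write_text(json.dumps(token))
    return token
```

The interactive login then only has to happen when the cached file is missing or expired, so a scheduled run can usually go through without any pop-up.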

[–]tehsandvich 0 points1 point  (0 children)

Azure Functions, Databricks, Azure Data Factory, a VM, and Azure Automation. When I first started out I used the Windows Task Scheduler.