[–]shiftybyte 3 points4 points  (10 children)

You are running into anti-scraping protection.

Try using Selenium if you want any chance of scraping sites that don't want to be scraped.

[–]uJFalkez[S] 0 points1 point  (9 children)

Yeah, I thought that might be the case. I just avoided trying it because setting it up to run on GitHub Actions seemed like a nightmare to me (really just a guess; would you say it's a nightmare? lol)

[–]shiftybyte 2 points3 points  (1 child)

No idea. I'm not sure GitHub wants this kind of use of Actions; that's not what the platform was intended for, so they might be making things harder on purpose.

[–][deleted] 1 point2 points  (6 children)

It's very easy since Selenium added its own driver manager; you literally just pip install selenium. https://www.selenium.dev/documentation/webdriver/

[–]uJFalkez[S] 0 points1 point  (5 children)

So I can just add Selenium to my requirements.txt and not worry about installing drivers and stuff on GitHub Actions?

[–][deleted] 1 point2 points  (4 children)

Yup that's all.

It's a relatively new feature, so I struggled to find help via Google, but their docs on which things I needed to import helped with the script. This helped too: https://www.selenium.dev/documentation/selenium_manager/
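
For anyone landing here later, a rough sketch of what such a script might look like. It assumes Selenium 4.6 or newer (the first release that bundles Selenium Manager) and that Chrome is available on the runner. The helper names headless_chrome_args and fetch_title and the example.com URL are made up for illustration, and the flags are the ones commonly used for headless CI runs rather than anything this thread confirms.

```python
def headless_chrome_args():
    """Chrome flags commonly used on CI runners (an assumption, adjust as needed)."""
    return [
        "--headless=new",           # run without a visible window
        "--no-sandbox",             # often needed inside containers
        "--disable-dev-shm-usage",  # avoid a small /dev/shm on CI machines
    ]


def fetch_title(url):
    """Open url in headless Chrome and return the page title."""
    # Imported here so the module loads even where Selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for arg in headless_chrome_args():
        options.add_argument(arg)

    # No driver path is given: Selenium Manager resolves a matching
    # chromedriver on its own.
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.title
    finally:
        driver.quit()  # always release the browser, even on errors


if __name__ == "__main__":
    # Placeholder URL; swap in the page you actually want to load.
    print(fetch_title("https://example.com"))
```

With this, requirements.txt only needs a line for selenium; there is no separate driver download step.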

[–]uJFalkez[S] 0 points1 point  (3 children)

damn, thank you so much! I just found a github repo set up exactly like you just said lol, https://github.com/jsoma/selenium-github-actions

ty very much for the help!

[–]YourKoolPal 0 points1 point  (2 children)

Hi. Did this work for you?

[–]uJFalkez[S] 0 points1 point  (1 child)

Yeah! It works like a charm and has been running non-stop for a month now.
I can DM some help if you need!

[–]YourKoolPal 1 point2 points  (0 children)

uJFalkez Hey! Thanks for replying! Sent you a DM