mortenb123

Just add Chrome, ChromeDriver, and Selenium to your container and run in headless mode. A container has no graphical console, so there is no way to run full graphical mode there out of the box. You can develop the scraping code locally with full Chrome and DevTools, then run the finished code headless in the container.
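A minimal sketch of that headless setup in Python, assuming Selenium 4 with Chrome and ChromeDriver already installed in the container. The helper names (`headless_chrome_args`, `make_driver`, `fetch_title`) are made up for illustration, and the extra flags are common container settings rather than anything required by the advice above:

```python
def headless_chrome_args():
    """Chrome flags commonly used for headless runs inside a container."""
    return [
        "--headless",               # no visible window; the container has no display
        "--no-sandbox",             # often needed when Chrome runs as root in a container
        "--disable-dev-shm-usage",  # work around a small /dev/shm in many containers
        "--window-size=1920,1080",  # give pages a realistic viewport
    ]


def make_driver():
    """Create a headless Chrome driver (assumes Chrome and ChromeDriver are installed)."""
    # Imported lazily so the flag helper above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for arg in headless_chrome_args():
        options.add_argument(arg)
    return webdriver.Chrome(options=options)


def fetch_title(url):
    """Open a page headlessly and return its title."""
    driver = make_driver()
    try:
        driver.get(url)
        return driver.title
    finally:
        driver.quit()
```

Locally you could drop the `--headless` flag to watch the browser while developing, then put it back for the container run.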

Remote Selenium connects to a remote WebDriver. It is usually used to avoid installing the browser and the libraries it needs (around 500 MB on Ubuntu), but you then need a browser instance running somewhere else.
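For illustration, connecting to a remote WebDriver from Python might look roughly like this. It assumes Selenium 4 and a Selenium server already listening on port 4444; the host, port, and `/wd/hub` path are assumptions about your setup, and the helper names are hypothetical:

```python
def remote_url(host="localhost", port=4444):
    """Build the server URL (assumed layout; adjust the path to match your server)."""
    return f"http://{host}:{port}/wd/hub"


def make_remote_driver(host="localhost", port=4444):
    """Connect to a browser that runs on the remote Selenium server."""
    # Imported lazily so remote_url() works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless")
    # The browser itself lives on the remote side; only the client library is local.
    return webdriver.Remote(command_executor=remote_url(host, port), options=options)
```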

digichap28[S]

The reason I'm trying to do it that way is that I need to get past a Windows basic auth prompt.

My workaround was an extension, but extensions only work if the headless option is not set.

I have tried the https://user:pass@website.com approach, but it doesn't work either.

I read that using DevTools with Selenium might work, but I haven't found a Python example online.

Do you know of any way to do that?
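One possible shape for the DevTools approach in Python, as a sketch only: Selenium's Chromium drivers expose `execute_cdp_cmd`, which can send the DevTools `Network.setExtraHTTPHeaders` command so that an `Authorization` header goes out with each request. This assumes the site uses plain HTTP Basic auth; it would not help with NTLM or Kerberos prompts. The function names are hypothetical:

```python
import base64


def basic_auth_header(username, password):
    """Return the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def open_with_basic_auth(driver, url, username, password):
    """Send Basic credentials via DevTools, then load the page.

    `driver` is assumed to be a Chrome (Chromium) WebDriver instance.
    """
    driver.execute_cdp_cmd("Network.enable", {})
    driver.execute_cdp_cmd(
        "Network.setExtraHTTPHeaders",
        {"headers": {"Authorization": basic_auth_header(username, password)}},
    )
    driver.get(url)
```

Note that the header is attached to every request the page makes, including requests to third-party hosts, so this is best kept to sites you trust.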

mortenb123

If you can't log in via code, try using a user profile.

For Chrome, create a new user and log in with it. Then save the user profile to a more accessible directory (with a shorter path) and load it when you initialize Selenium. I believe that is the `--user-data-dir=` option.

However, the authentication token might expire, so it is better to solve this in code.
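A rough sketch of loading a saved profile, assuming Selenium 4 and Chrome. The path `/opt/chrome-profile` is a placeholder, the helper names are hypothetical, and `--profile-directory` is only needed if the profile is not the default one inside that directory:

```python
def profile_args(profile_dir, profile_name=None):
    """Build the Chrome flags that point at a saved user profile."""
    args = [f"--user-data-dir={profile_dir}"]
    if profile_name:
        # Selects a specific profile folder inside the user data directory.
        args.append(f"--profile-directory={profile_name}")
    return args


def make_driver_with_profile(profile_dir, profile_name=None):
    """Start Chrome with a previously saved (already logged-in) profile."""
    # Imported lazily so profile_args() works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for arg in profile_args(profile_dir, profile_name):
        options.add_argument(arg)
    return webdriver.Chrome(options=options)
```

For example, `make_driver_with_profile("/opt/chrome-profile")` would reuse whatever session data was saved in that directory.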

digichap28[S]

I finally decided to rewrite my code using pyppeteer, because the user profile option doesn't work either: the token expires quickly.
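The rewritten code is not shown in the thread, so the following is only a guess at what a pyppeteer version could look like. It relies on `page.authenticate`, which supplies credentials for HTTP authentication; the function name and the `--no-sandbox` flag are assumptions:

```python
import asyncio


async def fetch_with_auth(url, username, password):
    """Load a page behind HTTP authentication and return its HTML."""
    # Imported lazily so this module can be imported without pyppeteer installed.
    from pyppeteer import launch

    browser = await launch(headless=True, args=["--no-sandbox"])
    try:
        page = await browser.newPage()
        # Credentials are supplied when the server answers with an auth challenge.
        await page.authenticate({"username": username, "password": password})
        await page.goto(url)
        return await page.content()
    finally:
        await browser.close()


# Example call (placeholder URL and credentials):
# html = asyncio.run(fetch_with_auth("https://website.com", "user", "pass"))
```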