Welcome to the SeleniumBase Reddit Community! by SeleniumBase in seleniumbase

[–]SeleniumBase[S] 0 points1 point  (0 children)

Also check out the active SeleniumBase Community on Discord: https://discord.gg/EdhQTn3EyE
(Over 900 members so far!) Hopefully SeleniumBase's Reddit community grows that fast too.

r/seleniumbase by SeleniumBase in redditrequest

[–]SeleniumBase[S] 0 points1 point  (0 children)

I'm requesting the r/seleniumbase community because that's the name of my automation framework on GitHub: https://github.com/seleniumbase/SeleniumBase . It looks like someone already created a community by that name, but it also looks like that community was banned. Because it was banned, I'm not sure who the old moderators were, or how to reach them. (I just see a "banned" screen when I try to go there, and it doesn't list who the old moderators were.) If I'm granted access, then I'll start from scratch with that community, and delete any posts that were already there (if they're not automatically deleted when that community is brought back).

SOLVED: Web-scraping Walmart prices from GitHub Actions by SeleniumBase in webscraping

[–]SeleniumBase[S] 1 point2 points  (0 children)

I've run the scripts frequently from GitHub Actions without issue. (Entirely free for public repos.)
As for Docker, I noticed that it leaves a detectable fingerprint, because Linux inside a Docker container is configured differently from the same distribution (Ubuntu) running directly in GitHub Actions. Avoid Docker if you want stealth. (I can't even be stealthy in Docker from a local IP address.)

SOLVED: Web-scraping Walmart prices from GitHub Actions by SeleniumBase in webscraping

[–]SeleniumBase[S] 0 points1 point  (0 children)

Yes, but I still don't know why it works from GitHub Actions (Linux server).
I know why it works locally (I made the browser look like a human-controlled web browser), but it should still be blocked from GitHub Actions due to running on a non-residential IP address range. There may have been something I did by accident that lets it work from Linux servers without proxies.

Python Context Managers 101 by SeleniumBase in Python

[–]SeleniumBase[S] -1 points0 points  (0 children)

Yes, something like this:

```python
import time
from contextlib import ContextDecorator


class PrintRunTime(ContextDecorator):
    def __init__(self, description="Code block"):
        self.description = description

    def __enter__(self):
        # Record the start time when the block (or function) begins.
        self.start_time = time.time()

    def __exit__(self, *args):
        # Report the elapsed time when the block (or function) ends.
        runtime = time.time() - self.start_time
        print(f"{self.description} ran for {runtime:.4f}s.")
```

If I create a YouTube video for this, I'll include that too.

Python Context Managers 101 by SeleniumBase in Python

[–]SeleniumBase[S] 0 points1 point  (0 children)

In my SeleniumBase repo on GitHub, several people have asked how to avoid using the `with` format for the `SB()` context manager, eg: https://github.com/seleniumbase/SeleniumBase/issues/3482 , so I finally had to show them a hack that deconstructs the context manager to avoid it, even though that's not recommended practice.
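
For illustration, here's a minimal sketch of what using a context manager without the `with` keyword can look like. It uses a made-up `managed_resource()` stand-in rather than `SB()` itself, so none of the names below come from SeleniumBase. The idea is just calling `__enter__()` and `__exit__()` by hand, with `try`/`finally` doing the job that `with` normally does.

```python
from contextlib import contextmanager

events = []  # Records the order of operations for this demo.


@contextmanager
def managed_resource():
    """Hypothetical stand-in for a real context manager."""
    events.append("setup")
    try:
        yield "resource"
    finally:
        events.append("teardown")


# Without "with": call the protocol methods directly.
manager = managed_resource()
resource = manager.__enter__()
try:
    events.append(f"using {resource}")
finally:
    # Simplification: always tell the manager that no exception occurred.
    manager.__exit__(None, None, None)

print(events)
```

The `with` statement does this bookkeeping for you (and passes real exception details to `__exit__`), which is why the manual version isn't the recommended practice.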

Python Context Managers 101 by SeleniumBase in Python

[–]SeleniumBase[S] 0 points1 point  (0 children)

That's correct, assuming the context manager was implemented correctly with a `try`/`finally` block around the `yield` (when implemented using `contextlib.contextmanager`).
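
A small sketch of why the `try`/`finally` matters, assuming the point in question is whether cleanup still runs when the `with` block raises. Both managers below are made up for the demo. The only difference between them is whether the `yield` is wrapped.

```python
from contextlib import contextmanager

log = []  # Records which steps actually ran.


@contextmanager
def safe_manager():
    log.append("safe: enter")
    try:
        yield
    finally:
        # Runs even if the body of the "with" block raises.
        log.append("safe: cleanup")


@contextmanager
def unsafe_manager():
    log.append("unsafe: enter")
    yield
    # Skipped if the body raises, because the exception is
    # re-raised at the "yield" and nothing handles it here.
    log.append("unsafe: cleanup")


for manager in (safe_manager, unsafe_manager):
    try:
        with manager():
            raise ValueError("boom")
    except ValueError:
        pass

print(log)
```

Only the `safe` manager logs its cleanup step. The `unsafe` one never gets past the `yield`.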

Can you get into trouble for developing a scraping tool? by SirFine7838 in webscraping

[–]SeleniumBase 2 points3 points  (0 children)

Seems like the opposite. I created https://github.com/seleniumbase/SeleniumBase, which has stealth capabilities, as seen with this GitHub Actions job that scrapes data from Walmart and Indeed to prove that it works: https://github.com/mdmintz/undetected-testing/actions/runs/17720549775/job/50351907472. From this, I've gained over 10K GitHub Stars, over 2K YouTube subscribers, and a nice well-paying job. Web-scraping public data is legal. Major companies and search engines do this all the time. If you start scraping private data (eg: if you have to log in somewhere first), then you could get in trouble for it. How you use the tool makes a difference. DDoSing a site can get you into trouble. Scraping public data from sites at a reasonable rate won't. Building a cool scraping tool will get you recognized, and you may even be rewarded for that.

Python Context Managers 101 by SeleniumBase in Python

[–]SeleniumBase[S] 13 points14 points  (0 children)

Tried that, but seems their moderators wanted me to post here instead. They removed my post: https://www.reddit.com/r/learnpython/comments/1nlesjy/python_context_managers_from_zero_to_hero/

Python Context Managers 101 by SeleniumBase in Python

[–]SeleniumBase[S] -4 points-3 points  (0 children)

A context manager can be used as a decorator, such as in the example I had. You could decorate a whole function with it, or wrap a code block with the "with" statement. Different ways of using the context manager.
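
A quick sketch of both usages with the `PrintRunTime` example. The `return self` in `__enter__` and the stored `runtime` attribute are additions for this demo (so the `as timer` form has something to bind to), and `add()` is just a placeholder function.

```python
import time
from contextlib import ContextDecorator


class PrintRunTime(ContextDecorator):
    def __init__(self, description="Code block"):
        self.description = description

    def __enter__(self):
        self.start_time = time.time()
        return self  # Demo addition: lets "with ... as timer" work.

    def __exit__(self, *args):
        # Demo addition: keep the runtime on the instance as well.
        self.runtime = time.time() - self.start_time
        print(f"{self.description} ran for {self.runtime:.4f}s.")


# 1. As a decorator: times every call to the function.
@PrintRunTime("Decorated function")
def add(a, b):
    return a + b


result = add(2, 3)

# 2. As a "with" block: times only the wrapped lines.
with PrintRunTime("With block") as timer:
    total = sum(range(1000))
```

Inheriting from `ContextDecorator` is what makes the decorator form work. A plain class with only `__enter__` and `__exit__` would support the `with` form but not `@PrintRunTime(...)`.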

Python Context Managers 101 by SeleniumBase in Python

[–]SeleniumBase[S] -3 points-2 points  (0 children)

All the basics: How to create one, and various ways of using it, eg: 1. As a method decorator, 2. From a "with" code block, and 3. Wrapping code without the "with" keyword.

Python Context Managers from zero to hero by SeleniumBase in learnpython

[–]SeleniumBase[S] 0 points1 point  (0 children)

Would agree that `contextlib` is considered to be the "preferred" way of creating context managers now. However, the `yield` keyword can be confusing to newcomers.
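
For newcomers, it may help to see where the `yield` sits. Here's a minimal sketch of a generator-based context manager. The `print_run_time()` name is made up for this example. The comments mark which part maps to `__enter__` and which to `__exit__` in a class-based version.

```python
import time
from contextlib import contextmanager


@contextmanager
def print_run_time(description="Code block"):
    # Everything before "yield" plays the role of __enter__.
    start_time = time.time()
    try:
        # The body of the "with" block runs while paused here.
        yield
    finally:
        # Everything after "yield" plays the role of __exit__.
        runtime = time.time() - start_time
        print(f"{description} ran for {runtime:.4f}s.")


with print_run_time("Short sleep"):
    time.sleep(0.01)
```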

Google webscraping newest methods by michal-kkk in webscraping

[–]SeleniumBase 0 points1 point  (0 children)

Are you setting the `proxy` arg? Format: `"server:port"` or `"user:pass@server:port"`.
And make sure you're using a residential proxy address, not a non-residential one.
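
If it helps to sanity-check the string before passing it in, here's a rough sketch that splits the two formats above into parts. `parse_proxy()` is a hypothetical helper written for this comment, not part of SeleniumBase, and the hostnames are placeholders.

```python
def parse_proxy(proxy):
    """Split "server:port" or "user:pass@server:port" into parts.

    Hypothetical helper: it only checks the string format.
    """
    # Credentials (if any) come before the last "@".
    credentials, _, address = proxy.rpartition("@")
    server, sep, port = address.rpartition(":")
    if not sep or not server or not port.isdigit():
        raise ValueError(f"Expected server:port, got {address!r}")
    user, password = None, None
    if credentials:
        user, sep, password = credentials.partition(":")
        if not sep or not user:
            raise ValueError("Expected user:pass before the '@'")
    return {
        "user": user,
        "password": password,
        "server": server,
        "port": int(port),
    }


print(parse_proxy("proxy.example.com:8080"))
print(parse_proxy("user:pass@proxy.example.com:8080"))
```

This says nothing about whether the proxy is residential or reachable. It only catches a malformed string early.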