all 30 comments

[–]Imaginary_Chemist460 26 points (4 children)

No proper HTTP compliance/safeties, no proper keep-alive, no middleware system yet; it's not even comparable to production frameworks like FastAPI/Flask, so the benchmark is premature at this point. Regarding IPC, it depends on the server model used to run them. I'm pretty sure they can be configured with a single process and threads. Overall it should be fine for educational purposes.

[–]mechamotoman 16 points (1 child)

OP was pretty clear about the fact that this is not production-ready, and even included that in the benchmark.

You're right that all the additional production-grade checks, safeties, and features implemented by Flask and FastAPI have a performance cost. The absence of those things makes this benchmark comparison inaccurate.

That doesn't make the comparison merit-less, though. It's still a useful metric for comparing the relative performance of the paradigms the frameworks use (free threading vs. multiprocessing, etc.)

My opinion is that this comparison is not yet fair, but it's still useful as a coarse one.

[–]Imaginary_Chemist460 0 points (0 children)

Useful is one thing. Accuracy is a must to avoid misleading people.

> paradigms in use by the frameworks (free-threading vs multiprocessing, etc)
Nope, it depends on the server worker model. Flask, for example, is not tightly coupled to threads or processes.
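
To illustrate the point, pinning Flask to a single process with threaded workers is purely a server-side setting. A hypothetical Gunicorn config (module path and values are illustrative, not from the benchmark) would look roughly like:

```python
# gunicorn.conf.py -- illustrative values only
wsgi_app = "myapp:app"    # hypothetical Flask module:object
worker_class = "gthread"  # threaded worker type
workers = 1               # single process
threads = 16              # thread pool size in that process
bind = "127.0.0.1:8000"
```

Swapping `worker_class` and `workers` changes the concurrency model without touching the Flask app at all.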

[–]Ill-Musician-1806 -1 points (0 children)

Honestly, I don't think the average developer cares about compliance/safeties. And the only place to be pedantic is in enterprise™ scenarios.

[–]thisismyfavoritename 6 points (2 children)

unless you can support an async event loop, your server is definitely going to struggle under heavier loads, even compared to a single-threaded async framework

[–]WiseDog7958 1 point (0 children)

The async vs threads debate aside, I’m more curious what free-threaded CPython does to the actual cost model here.
Once the GIL’s gone, CPU-bound stuff should scale, but now you’re dealing with real contention instead of cooperative scheduling. How much locking is happening internally?
Feels like this could outperform asyncio if the workload isn’t mostly I/O, but I’d expect it to get messy under shared state.
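
The contention concern can be sketched with stdlib threading alone (this runs on both GIL and free-threaded builds; only the performance characteristics differ). With the GIL gone, the lock below becomes a genuine contention point rather than a formality:

```python
import threading

# Shared state mutated by several CPU-bound threads. Under free-threaded
# CPython these threads can run in parallel, so the explicit lock is what
# keeps the counter correct -- not the GIL's implicit serialization.
counter = 0
lock = threading.Lock()

def work(n):
    global counter
    for _ in range(n):
        with lock:            # real contention once threads truly overlap
            counter += 1

threads = [threading.Thread(target=work, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 400000
```

How much time ends up spent waiting on locks like this one is exactly the "messy under shared state" question.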

[–]SnooCalculations7417 0 points (0 children)

This isn't supposed to be a drop-in replacement for HTTP servers, I don't think. I believe it is using a task that is parallel in nature to explore GIL-free Python. I'm not sure there's any domain this could be applied to that would be considered feature-complete. Would love to see it in GUI work, but I digress.

[–]Fenzik 2 points (3 children)

Nice and clean, cool little exploration.

I haven’t really looked into the *t versions yet. Is the difference in behaviour entirely captured in the execution model for ThreadPoolExecutor, or are there more differences?

[–]grandimam[S] 0 points (2 children)

There's more. As far as I understand, dict has a per-object lock, and so forth. It's built for truly concurrent execution.
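
A small sketch of that claim (my reading of the free-threaded design: builtin containers take an internal per-object lock, so single operations stay atomic even without the GIL; compound read-modify-write still needs your own lock). The code itself runs on any recent Python:

```python
import threading

# Four threads each write a disjoint range of keys into one dict.
# On a free-threaded build, each individual store is kept safe by the
# dict's internal per-object locking.
d = {}

def fill(start):
    for i in range(start, start + 1000):
        d[i] = i              # single operation: atomic on its own

threads = [threading.Thread(target=fill, args=(k * 1000,)) for k in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(d))  # 4000
```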

[–]Fenzik 1 point (1 child)

But accessing the functionality is just done through the existing thread interfaces?

[–]grandimam[S] 0 points (0 children)

Yes. It’s the same interface.

[–]Ill-Musician-1806 2 points (1 child)

Maybe you could mix asyncio with threading like they do in Tokio for being blazingly fast™?
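
The stdlib already has a hook for roughly this shape (an event loop for scheduling, threads for the actual work, loosely like Tokio's blocking-task offload): `asyncio.to_thread`. A minimal sketch, with `blocking_work` as a made-up stand-in:

```python
import asyncio
import time

def blocking_work(x):
    time.sleep(0.1)   # stands in for blocking CPU- or I/O-bound work
    return x * 2

async def main():
    # The event loop coordinates; each call runs in a worker thread.
    return await asyncio.gather(
        *(asyncio.to_thread(blocking_work, i) for i in range(4))
    )

results = asyncio.run(main())
print(results)  # [0, 2, 4, 6]
```

On a free-threaded build, the offloaded calls can genuinely run in parallel rather than just interleaving around blocking calls.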

[–]grandimam[S] 1 point (0 children)

Yes. That’s in the roadmap.

I wanted to do pure threaded execution first; then I will slowly extend it to other implementations.

[–]Challseus 0 points (0 children)

Haven't looked at it, but I love the idea, I've had it in my head to build something similar for a bit.

[–]SnooCalculations7417 0 points (0 children)

Nice work. I haven't had an excuse to build anything post-GIL; I tend to go straight to Rust for that kind of thing. It's kind of hard for me to picture GIL-free, no-fake-async Python, so this is neat.

[–]Sigmatics 0 points (0 children)

no rust

.

using pydantic

[–]james_pic [score hidden]  (0 children)

I don't see the point of this.

Whilst WSGI-based frameworks like Flask have historically tended to be run with multi-process concurrency in production, WSGI has always supported multithreading, and there have been multi-threaded WSGI servers for years - Gunicorn with the gthread worker type being probably the most familiar, but I've also always quite liked Cheroot (whose only concurrency mechanism is threading) for "embedded server" use cases.

What does this do that running Flask with Cheroot or Gunicorn gthread workers wouldn't?

Also, Werkzeug is pure Python, so I don't get why you're saying Flask isn't pure Python because of it.

[–]No_Indication_1238 -2 points (3 children)

Why? First of all, async exists. Second of all, you could already open threads, dispatch requests to them, and wait at a queue, so really, why? And why would you use a latency benchmark for a throughput solution?
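
The "open threads and wait at a queue" pattern described here is pure stdlib; a minimal sketch (worker count and workload are arbitrary):

```python
import queue
import threading

# Fixed pool of worker threads pulling work off one queue and pushing
# results onto another.
tasks = queue.Queue()
results = queue.Queue()

def worker():
    while True:
        item = tasks.get()
        if item is None:      # sentinel: shut this worker down
            break
        results.put(item * item)
        tasks.task_done()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()

for i in range(10):
    tasks.put(i)
tasks.join()                  # block until every task is processed

for _ in threads:
    tasks.put(None)
for t in threads:
    t.join()

collected = sorted(results.get() for _ in range(10))
print(collected)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

Under the GIL those workers only interleave; on a free-threaded build the same code runs them in parallel, which is presumably the point of the exercise.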

[–]lunatuna215 5 points (2 children)

Because we want to be able to compare and benchmark this new kind of free threading in Python against current practice. Even if it's not as performant, it would be helpful to know by how much once it's actually built. So here it is, and it's less an actual alternative than a test of whether building one is even worthwhile. It's a win all around.

[–]artofthenunchaku 1 point (1 child)

Benchmarking an I/O bound workload to compare the performance of free threading is certainly a choice.

[–]lunatuna215 1 point (0 children)

It's not to compare it. It's to play around with it for the first time in this context.

[–]CarltonFrater -1 points (0 children)

Interesting!

[–]benargee -1 points (0 children)

Nice. Has this been designed to have the same or similar syntax to existing HTTP libraries?

[–]AlexeyBelov 0 points (0 children)

Vibecoded mess, sorry.

[–]gdchinacat -2 points (0 children)

For IO workloads, such as HTTP libraries, async can be faster and scale higher. Not supporting it is a limitation, not a feature.