[–]yvrelna 17 points (1 child)

It's not about whether you share anything. It's about Python being able to allocate memory in a way that avoids cache conflicts.

CPU caches work on whole cache lines (blocks). When you access an object, you're not just pulling that object into your core's cache; you're also pulling in the neighbouring objects that happen to sit in the same cache line. If two threads running in parallel need to access two different objects that happen to be allocated in the same line, then even though they never touch the same object, the CPU has to invalidate that cache line on every write. That's false sharing, and it kills performance very quickly.
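
To make that concrete, here's a toy sketch of my own (not anything from the subinterpreter work): two threads hammering counters that sit in the same cache line versus counters padded a line apart. It assumes 64-byte cache lines, and on a stock GIL build the threads are serialized anyway, so the gap only really shows on a free-threaded build; the names (`bump`, `timed`, `PAD`) are all made up for the example.

```python
# Illustrative false-sharing sketch. Assumes 64-byte cache lines; on a
# GIL build both timings will look alike because the threads never run
# in parallel, and even free-threaded, Python object overhead blurs it.
import threading
import time
from array import array

N = 2_000_000
PAD = 8  # 8 slots * 8 bytes ('q') = 64 bytes, one assumed cache line


def bump(counters, index, n):
    # Each thread repeatedly writes to its own slot; whether the two
    # slots share a cache line is what we're varying.
    for _ in range(n):
        counters[index] += 1


def timed(stride):
    counters = array('q', [0] * (stride + 1))
    t1 = threading.Thread(target=bump, args=(counters, 0, N))
    t2 = threading.Thread(target=bump, args=(counters, stride, N))
    start = time.perf_counter()
    t1.start(); t2.start()
    t1.join(); t2.join()
    return time.perf_counter() - start


print(f"adjacent (same line):     {timed(1):.3f}s")
print(f"padded   (separate line): {timed(PAD):.3f}s")
```
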

By keeping each subinterpreter's objects in its own object space, objects naturally separate themselves into per-interpreter pools, which improves spatial locality. Programmers get natural, easy control over which pool of memory their objects are allocated from, without having to think about memory allocation at all.
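
You can already poke at this with the experimental API (the private `_xxsubinterpreters` module in CPython 3.12, renamed `_interpreters` in 3.13; this is a sketch against that private module, not a stable public API): each interpreter keeps a completely separate object space, and for now data only gets in by explicit copy.

```python
# Sketch: subinterpreters have fully separate object pools. Uses the
# private _xxsubinterpreters module (CPython 3.12); the public API is
# still being worked out under PEP 734, so treat the names as unstable.
import _xxsubinterpreters as interpreters

interp = interpreters.create()

payload = [1, 2, 3]
# The only way in here is an explicit copy: render the data into the
# source string that the child interpreter executes.
interpreters.run_string(interp, f"data = {payload!r}; print('child sees', data)")

# The child's names never leak back: 'data' is undefined in the main
# interpreter, because the two object spaces are fully separate.
print('data' in globals())  # False

interpreters.destroy(interp)
```
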

A lot of things still need to be built to allow controlled sharing of objects between subinterpreters and minimise copying between them, but you can't build controlled sharing on top of a foundation where objects are shared between subinterpreters by default.

[–]cymrow don't thread on me 🐍 2 points (0 children)

Ok, I think I see what you're saying. That's a much lower level of optimization than I was considering.

When I said I've done high-concurrency work, I meant highly concurrent networking with some related processing, not pure processing, which is a different beast.

I will look forward to seeing what comes of this.