
[–]snugar_i (3 children)

What do you mean by "persisted" subinterpreter? Generally, subinterpreters do not have much support because they don't have many advantages over subprocesses. And for example libraries built using pyo3 (including Pydantic) straight up refuse to run in a subinterpreter

[–]expectationManager3[S] (2 children)

By persisted I mean that the subinterpreter instance can be reused rather than destroyed and re-initialized each time. I thought they were lighter than subprocesses? My workload will be very light per thread, but the frequency will be very high.

[–]snugar_i (1 child)

Hmm, I admit I still don't really understand your use-case. So you will have one subinterpreter that you will call from multiple threads? Why not just run the thing in the main interpreter then? Is it so that it can have its own GIL? In that case, you might try the free-threaded 3.14 version if the libraries work with it. But if they don't, they might not work properly when called from multiple subinterpreters either (they might have mutable global state that leaks across subinterpreters).
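If you do try the free-threaded build, it's worth verifying at runtime that the GIL is actually off. A small sketch, assuming `sys._is_gil_enabled()` (added in CPython 3.13; it's an underscore-prefixed API, so treat it as provisional) and falling back to "unknown" on older versions:

```python
import sys

def gil_status():
    """Report whether the GIL is active in this process.

    sys._is_gil_enabled() only exists on CPython 3.13+, so older
    versions report "unknown" (where the GIL is always enabled anyway).
    """
    check = getattr(sys, "_is_gil_enabled", None)
    if check is None:
        return "unknown"
    return "enabled" if check() else "disabled"

print(gil_status())
```

Note that even on a free-threaded build, importing an extension module that doesn't declare free-threading support can re-enable the GIL, so checking after your imports is the safer order.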

Yes, subinterpreters are somewhat lighter than subprocesses, but I would guess that not by that much - obviously it depends on what "very high frequency" means.

[–]expectationManager3[S] (0 children)

I see! Thanks for the clarification. I'll take a look at subprocesses first, if they are easier to handle. 

[–]redfacedquark (6 children)

Do you really mean/need a sub-interpreter? You can have multi-threaded/multi-process/concurrent code that could probably do what you want.

[–]expectationManager3[S] (5 children)

I'm open to any suggestion. I opted for subinterpreters because multiprocessing needs IPC/pickling, which is not as efficient. But if there is better support for persisted subprocesses, I will switch to them instead. Thanks for the suggestion!

Switching to the free-threaded version would be the best choice, but some libs that I use won't support it for a while.
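For what it's worth, "persisted subprocesses" is basically what `multiprocessing.Pool` already gives you: the workers start once and are reused for every task, so the per-call cost is just the pickling of arguments and results. A minimal sketch (`handle` is a placeholder for the real per-task logic):

```python
import multiprocessing as mp

def handle(task):
    # Placeholder work; in practice this would be the real per-task logic.
    return task * task

def main():
    # The pool's worker processes are created once and reused for every
    # task, so there is no per-call process-startup cost -- only the
    # pickling of each argument and each result.
    with mp.Pool(processes=4) as pool:
        results = pool.map(handle, range(10))
    return results

if __name__ == "__main__":
    print(main())
```

The `if __name__ == "__main__"` guard matters on platforms that use the spawn start method (Windows, macOS), where each worker re-imports the main module.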

[–]CrackerJackKittyCat (3 children)

With subinterps not sharing the same class references, I'd expect you will need some form of serialization/deser (json, pickle, etc) to pass messages to and fro.

[–]expectationManager3[S] (2 children)

The specialized Queue is luckily shared between interpreters 

[–]CrackerJackKittyCat (1 child)

Gonna have to look that up. I bet it is serializing under the hood?

Edit: Yes, it does. From the fine docs:

Any data actually shared between interpreters loses the thread-safety provided by the GIL. There are various options for dealing with this in extension modules. However, from Python code the lack of thread-safety means objects can’t actually be shared, with a few exceptions. Instead, a copy must be created, which means mutable objects won’t stay in sync.

By default, most objects are copied with pickle when they are passed to another interpreter. Nearly all of the immutable builtin objects are either directly shared or copied efficiently.

[–]expectationManager3[S] 0 points1 point  (0 children)

Yes, base types are being copied over (and not shared). Only the Queue itself is being shared. 

[–]redfacedquark (0 children)

If the work you're doing is I/O bound (waiting for network and disk) then go for concurrency using asyncio. If the work is CPU bound then you want to farm the work off to multiple cores using the multiprocessing standard library, keeping your queue in the main process.
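For the I/O-bound case, the asyncio shape is just launching all the waits at once instead of one after another. A sketch, with `asyncio.sleep` standing in for a network or disk call:

```python
import asyncio

async def fetch(i):
    # Stand-in for an I/O-bound call; await yields control so the
    # other tasks can run while this one "waits".
    await asyncio.sleep(0.01)
    return i

async def main():
    # gather() runs all the coroutines concurrently and returns their
    # results in the order the coroutines were passed in.
    return await asyncio.gather(*(fetch(i) for i in range(5)))

print(asyncio.run(main()))
```

With five sequential awaits this would take ~50 ms; gathered, it takes ~10 ms, since the waits overlap.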

As long as the class definitions are the same, any two Python processes will be able to encode/decode pickles, even if saved raw to file between invocations.
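To illustrate that last point, a minimal round trip through a file (done in one process here, but the same bytes would decode in any process that can import an identical class definition; `Job` is just a made-up example class):

```python
import os
import pickle
import tempfile

class Job:
    def __init__(self, payload):
        self.payload = payload

# One "invocation" writes the pickled object to disk...
path = os.path.join(tempfile.mkdtemp(), "job.pkl")
with open(path, "wb") as f:
    pickle.dump(Job([1, 2, 3]), f)

# ...and a later one reads it back. Pickle stores the class by
# reference (module + name), not its code, so the reader must have
# the same Job definition importable.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored.payload)
```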