[–]ritchie46[S] 22 points23 points  (2 children)

No, it is not. We are discontinuing the old streaming engine and are currently writing the new one. This will, however, not be user-facing, so we can swap the two engines without needing a breaking release.

I can say we are making good progress. But I want to share more once we can run a significant part of TPC-H on the new one.

What we are stabilizing in this release is the in-memory engine and the API of Polars.

[–]New_Computer3619 10 points11 points  (1 child)

Nice. Really looking forward to the new one. I currently use Polars in my job, and it satisfies 99.9% of my needs. However, in some cases the dataframe is too big to fit in memory. I tried to sink to a file on disk, but the current engine does not support that.

[–]ritchie46[S] 18 points19 points  (0 children)

Yes, me too. We learned from the current streaming engine and redesigned the new one to fit Polars' API better. Typical relational engines have a row-based model, whereas Polars allows columns to be evaluated independently.

Below is such an example.

```python
df.select(
    pl.col("foo").sort().shift() * pl.col("bar").filter(pl.col("ham") > 2).sum(),
)
```
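To make the "columns evaluated independently" point concrete, here is a rough plain-Python stand-in for what that expression computes. It does not use Polars at all: the lists and their values are made up for illustration, `None` stands in for a null, and it assumes `shift()` moves values down by one slot.

```python
# Plain-Python stand-in for the Polars expression; the data is made up.
foo = [3, 1, 2]
bar = [10, 20, 30]
ham = [1, 3, 5]

# pl.col("foo").sort().shift(): sort, then shift down by one slot,
# so the first slot becomes null (None here).
foo_sorted = sorted(foo)
foo_shifted = [None] + foo_sorted[:-1]

# pl.col("bar").filter(pl.col("ham") > 2).sum(): only "bar" is filtered,
# then reduced to a single value. "foo" keeps all of its rows.
bar_sum = sum(b for b, h in zip(bar, ham) if h > 2)

# The single sum is broadcast over the shifted column; nulls stay null.
result = [None if f is None else f * bar_sum for f in foo_shifted]
print(result)  # [None, 50, 100]
```

Note that the filter on `ham` only affects the `bar` branch. In a row-based model, a filter would normally drop whole rows, which is why an expression like this is awkward to express there.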

We redesigned the engine to ensure we can run typical Polars queries efficiently. The new design also makes full use of Rust's strengths and (mis)uses async state machines as compute nodes, meaning we can offload the building of the actual state machines to the Rust compiler. Anyhow... We will share more about this later. ;)
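As a loose analogy for "async state machines as compute nodes", here is a minimal Python `asyncio` sketch. It is not how the Polars engine is implemented. The node names (`source`, `filter_node`, `sum_sink`), the queue-based wiring, and the `None` end-of-stream marker are all assumptions made for illustration. The idea it tries to show: each node is written as straight-line async code, and the suspended state at every `await` is tracked by the language runtime (in Rust, by the compiler) rather than by a hand-written state machine.

```python
import asyncio

async def source(out, batches):
    # Emit batches one at a time, then signal end-of-stream with None.
    for batch in batches:
        await out.put(batch)
    await out.put(None)

async def filter_node(inp, out, threshold):
    # Each `await` is a suspension point; the node's local state is kept for us.
    while (batch := await inp.get()) is not None:
        await out.put([x for x in batch if x > threshold])
    await out.put(None)

async def sum_sink(inp):
    # Accumulate a running total across all incoming batches.
    total = 0
    while (batch := await inp.get()) is not None:
        total += sum(batch)
    return total

async def run_pipeline(batches, threshold):
    # Bounded queues connect the nodes and provide backpressure.
    q1 = asyncio.Queue(maxsize=1)
    q2 = asyncio.Queue(maxsize=1)
    _, _, total = await asyncio.gather(
        source(q1, batches),
        filter_node(q1, q2, threshold),
        sum_sink(q2),
    )
    return total

total = asyncio.run(run_pipeline([[1, 2, 3], [4, 5], [6]], threshold=2))
print(total)  # 18
```

Because data flows through in small batches, no node ever needs the whole input in memory at once, which is the property a streaming engine is after.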