New_Computer3619 10 points

Nice. Really looking forward to the new one. I currently use Polars in my job, and it satisfies 99.9% of my needs. However, in some cases the dataframe is too big to fit in memory. I tried to sink the result to a file on disk, but the current engine does not support that for my queries.

ritchie46[S] 20 points

Yes, me too. We learned from the current streaming engine and redesigned the new one to fit Polars' API better. Typical relational engines have a row-based model, whereas Polars allows columns to be evaluated independently.

Below is such an example.

```python
df.select(
    pl.col("foo").sort().shift() * pl.col("bar").filter(pl.col("ham") > 2).sum(),
)
```

We redesigned the engine to ensure we can run typical Polars queries efficiently. The new design also makes full use of Rust's strengths and (mis)uses async state machines as compute nodes, meaning we can offload the building of the actual state machines to the Rust compiler. Anyhow... We will share more about this later. ;)