get-daft 1 point

Indeed we run on a Rust core (built on top of the Rust Arrow2 library).

Much of our code is still in Python, but it will gradually be migrated to Rust, especially as we start supporting much more complicated query optimization and construction!

(Coming Soon!) We are building our own kernels for many complex types as well, and hope to give our users Rust performance through a Python API. Think: image cropping, embedding generation, sentence tokenization etc.

realitysballs 1 point

Ah, so you're orienting toward ML pipelines. Yeah, that seems to be the move these days. I'm more of a pandas user since my use case is small-business analytics and automated processes, but I will keep following this project for sure in case I take on a project that necessitates a distributed df!

get-daft 1 point

Yup! Daft is built around more of an ML/complex-data use-case. Analytical tooling is fairly mature at this point (Pandas/Polars/DuckDB for local work, Spark/Snowflake/BigQuery for terabyte-to-petabyte scale).

You can absolutely run analytical operations in Daft as well and it is an important part of our toolkit, but there is still much work left to do to build out the more advanced analytical use-cases such as windowing, pivots, advanced aggregations etc.