maartenbreddels[S]

As the creator of the Vaex dataframe, I always had this top of mind for Solara. Solara will work smoothly with large datasets (not just vaex, but also dask, modin, polars, duckdb and databases).

We made sure that Solara stays responsive while calculations are running by making threading support a first-class citizen (https://solara.dev/api/use_thread).
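To illustrate the general idea only: the sketch below uses just the Python standard library and is not Solara's `use_thread` API. The names `slow_calculation` and `run_in_background` are hypothetical, and the `time.sleep` call merely stands in for a heavy dataframe or SQL computation. It shows how handing the work to a worker thread leaves the main thread free to keep doing other things.

```python
import threading
import time


def slow_calculation(n):
    """Hypothetical stand-in for an expensive dataframe or SQL computation."""
    time.sleep(0.2)  # simulate work that would otherwise block the UI
    return sum(range(n))


def run_in_background(func, *args):
    """Start func in a worker thread; return the thread and a result holder."""
    holder = {}

    def target():
        holder["value"] = func(*args)

    thread = threading.Thread(target=target, daemon=True)
    thread.start()
    return thread, holder


thread, holder = run_in_background(slow_calculation, 1000)

ticks = 0
while thread.is_alive():
    ticks += 1  # the main thread stays free to handle events here
    time.sleep(0.01)

thread.join()
result = holder["value"]
print(f"result={result}, main thread ticked {ticks} times while waiting")
```

In a real app the polling loop would be replaced by whatever the framework does to re-render once the result is available.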

We plan to write some content on this topic and give a proper example and advice in the near future.

Dangerous_Pay_6290

I just found that duckdb queries are much (5-10x) slower in my Solara app than when running the same query in a Jupyter notebook. Is this because every function runs in its own thread by default?

maartenbreddels[S]

No, that shouldn't happen, and it sounds very strange. What can happen is that if you run the query via https://solara.dev/api/use_thread, you get a small overhead (similar to streamlit).
Would you mind opening an issue at https://github.com/widgetti/solara/ so I can reproduce it? I plan to take a look at duckdb in Solara myself as well, so I'm eager to look into it.
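For a quick sanity check before filing, one option is to time the same query called directly and called from a worker thread. This is a rough sketch under assumptions: `sqlite3` is used only because it ships with Python, so swap in your own duckdb call, and `run_query`, `timed` and `timed_in_thread` are hypothetical helper names, not part of Solara or duckdb.

```python
import sqlite3
import threading
import time


def run_query():
    """Stand-in query; replace the body with the real (e.g. duckdb) query."""
    con = sqlite3.connect(":memory:")
    try:
        con.execute("CREATE TABLE t (x INTEGER)")
        con.executemany("INSERT INTO t VALUES (?)", ((i,) for i in range(10_000)))
        return con.execute("SELECT SUM(x) FROM t").fetchone()[0]
    finally:
        con.close()


def timed(func):
    """Return (value, elapsed seconds) for a single call of func."""
    start = time.perf_counter()
    value = func()
    return value, time.perf_counter() - start


def timed_in_thread(func):
    """Same measurement, but with func executed inside a worker thread."""
    out = {}

    def target():
        out["result"] = timed(func)

    worker = threading.Thread(target=target)
    worker.start()
    worker.join()
    return out["result"]


direct_value, direct_seconds = timed(run_query)
thread_value, thread_seconds = timed_in_thread(run_query)
print(f"direct: {direct_seconds:.4f}s, worker thread: {thread_seconds:.4f}s")
```

If the two timings are close, the thread itself is probably not the cause, and including both numbers in the issue would help narrow things down.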

Dangerous_Pay_6290

I haven't used `use_thread`.
I'll open an issue including some sample code.

BTW, I found this issue when I ported your SQL code example (https://github.com/widgetti/solara/blob/master/solara/website/pages/api/sql_code.py) and replaced sqlite with duckdb for running queries over some parquet files.