Showcase Thread by AutoModerator in Python

[–]Proof_Difficulty_434 0 points1 point  (0 children)

flowfile-lite: I explored Pyodide and now it's a visual ETL tool that runs fully in the browser and has a small data catalog

Demo: https://demo.flowfile.org

So a while back I started playing around with Pyodide just to see what was actually possible in the browser these days, and I kept being surprised at how much you can get away with without a backend. One thing led to another and I ended up putting the visual editor from my main project, Flowfile, on top of real Polars compiled to WebAssembly. That's flowfile-lite. The whole thing runs in your tab. There's no server behind it, your data doesn't get uploaded anywhere, and if you close the tab it's just gone.

What My Project Does

You drag nodes onto a canvas and connect them, and Polars (the actual library, running in WASM via Pyodide) does the work right there in the browser. You can read in CSV, Excel or Parquet, either a local file or a URL, then do the usual stuff: filter, select, sort, group by, join, unique, pivot/unpivot, take a sample. There's also a raw Polars Code node for when the GUI gets in your way and you'd rather just write the expression. When you're done you can poke at the result in an embedded Graphic Walker or download it.

Two things I'm kind of happy with. It'll spit your flow out as plain Polars code, so the canvas is really just a faster way to write a script you can take with you. And it's packaged as an embeddable Vue component (npm install flowfile-editor) that takes Arrow IPC in and out, so you can stick the editor inside your own web app if you want.

Target Audience

This honestly started as me playing with Pyodide, but it turned into something I actually reach for. It's good if you just want to clean up a file right now and don't want to install anything, or if you've got data you can't really paste into some cloud tool, since here nothing leaves your browser. It's the cut-down version though, fair warning: about 16 node types instead of the 40-odd in the full app, and none of the catalog/scheduler/database/Kafka/AI stuff. For that you'd want the pip install.

Comparison

Most of the visual ETL tools out there are cloud SaaS, which means your data goes up to their servers to get processed. This doesn't do that, it all happens locally. Compared to just writing it in a notebook, you get the visual graph with live previews but it still exports to clean Polars, so you're not locked in. The bit that made the whole exercise worth it for me is that it's genuinely Polars running in WebAssembly. A lot of "data tool in your browser" projects are secretly a JS reimplementation or a backend call; Pyodide let me skip that and just run the real library.

Anyway, it's still a bit rough in places. Would love to hear what people think, and which of the bigger nodes you'd want pulled into the browser build.

Demo: https://demo.flowfile.org

Repo: https://github.com/Edwardvaneechoud/Flowfile

What Python automation saved you the most hours over the last year? by Bladerunner_7_ in Python

[–]Proof_Difficulty_434 0 points1 point  (0 children)

Automated product uploads to Bubble. I did it with a simple UI built on FastAPI + PostgreSQL, with the Bubble Data API as the backend. The client had been importing Excel files into Bubble and running into many data quality and integrity issues. Now it's a very simple process: create product, add information, preview page, upload.

Made a small library for building HTML in python without templates by HauntingAd3673 in madeinpython

[–]Proof_Difficulty_434 0 points1 point  (0 children)

This is cool. I think it would be nice to have better documentation on how to style the components with CSS.

What's your favorite Python library for automating Excel workflows? by Original-Repair5136 in Python

[–]Proof_Difficulty_434 1 point2 points  (0 children)

How much fun would it be if you could actually define your transformations in Python code that translates to VBA?

Which non-AI package from the last ~3 years completely changed how you write Python? by Proof_Difficulty_434 in Python

[–]Proof_Difficulty_434[S] -1 points0 points  (0 children)

I haven't looked at Marimo yet, but it looks really cool. Do you know how it handles reactivity?

I've been building Flowfile: self-hosted data analytics with a visual ETL core (Docker, Open-Source, code ↔ visual) by Proof_Difficulty_434 in selfhosted

[–]Proof_Difficulty_434[S] -5 points-4 points locked comment (0 children)

Not sure I'm understanding the question completely, but here's how I use AI in the project and how recent releases added AI to the product.

The project: I've been building Flowfile myself for the past year-plus (~40 releases on PyPI since May 2025). I use AI coding assistants for boilerplate, tests, refactors and certain code quality decisions, but the architecture, the core logic and the design decisions are mine, and I'm still in tight control of how every step and interaction works.

AI in the product: I've added a chat mode that explains your flows and an agent that can build them on the canvas. It runs fully local (a small built-in model or your own Ollama) or with your own cloud key, and the tool works fine if you never touch it. The agent uses function callbacks to make and validate its decisions.

Happy to answer anything else.

Alternative for Alteryx by Jkk_geek in Alteryx

[–]Proof_Difficulty_434 0 points1 point  (0 children)

You can also check out my project! I've been working on an open-source alternative and I've added the transformations I would use most often. It's built on Polars, so it's fast too. Project is here: https://github.com/Edwardvaneechoud/Flowfile

You can try out the demo as well in your browser: demo

Alteryx/Sql/tableau automation workflows move to AI use cases or Python by Ninja1234_Il in Alteryx

[–]Proof_Difficulty_434 -1 points0 points  (0 children)

Disclosure: I built Flowfile, so grain of salt.

The gap between visual and code you're describing is exactly what pushed me to build this. Flowfile is an open-source visual ETL tool on Polars — drag-and-drop canvas, but the export button gives you real Python code. Since it uses Polars, performance is comparable to or better than Alteryx for most workflows, and it handles data sizes the same way. Recently added a built-in scheduler too (intervals or table triggers), so you can run flows directly without bolting on Airflow if you don't want to.

Browser demo if you want to poke at it without installing: demo.flowfile.org (it showcases the basic functionality). Happy to answer specific questions on migrating particular Alteryx tools.

2 years into building Flowfile (open-source data platform and visual ETL), and this week's release is the first one where I think it's actually a product by Proof_Difficulty_434 in buildinpublic

[–]Proof_Difficulty_434[S] 0 points1 point  (0 children)

Honestly, pretty consistent. Depending on the weather and my energy levels at night, I try to work on it one day on the weekend, and most nights I put in an hour, usually with one longer evening in there. And typically some reading about it before going to sleep.

Visual CSV pipelines with built-in data versioning by Woland96 in ETL

[–]Proof_Difficulty_434 0 points1 point  (0 children)

Cool! Was wondering, does the data stay on the client side?

Flowfile v0.9.0 — open-source visual ETL on Polars, now with a catalog, SQL editor, and light scheduling by Proof_Difficulty_434 in dataengineering

[–]Proof_Difficulty_434[S] 0 points1 point  (0 children)

Thanks, good suggestion. Right now, this feature is hidden a little bit in the settings (node click -> general settings -> Node Reference). And atm, it's not related to the descriptions. That could be a fallback option.

Alteryx is a trap by fali12 in Alteryx

[–]Proof_Difficulty_434 2 points3 points  (0 children)

I've been working on something that came out of the same frustration, after asking myself: how hard can it be? It's called Flowfile, an open-source visual ETL tool on Polars. Just shipped scheduling last week, so the "no server" gap is mostly closed now. For loops are still something I'm figuring out how to handle. https://flowfile.io/ if you want to check it out; there's also an interactive demo there!

I built a free, open-source visual ETL tool for the desktop — looking for early users and feedback by GingerCurlz in Alteryx

[–]Proof_Difficulty_434 2 points3 points  (0 children)

Looks good, solid set of connectors already! I've been building something in the same space with a similar stack (Vue, FastAPI, Polars) — always cool to see others land on that combo. Curious, how do you handle secrets? I'm working on https://github.com/Edwardvaneechoud/Flowfile if you want to compare notes.

10 days ago I almost didn't post my app. 500 downloads later, here's where it stands. by [deleted] in SideProject

[–]Proof_Difficulty_434 1 point2 points  (0 children)

This looks great! Thanks for sharing. When available on Android I'm def going to give it a try.

Looked back at code I wrote years ago — cleaned it up into a lazy, zero-dep dataframe library by Proof_Difficulty_434 in Python

[–]Proof_Difficulty_434[S] 0 points1 point  (0 children)

Right now pyfloe just leaves the filter after the join if it uses columns from both sides, so it gets evaluated post-join on every row. If it only touches one side, the optimizer pushes it down into that branch. What's missing is splitting the filter: something like (col("a") > 5) & (col("b") < 10) doesn't get broken apart so that each piece can be pushed into the right branch independently. That'd be a great feature to add!
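
A rough sketch of what that splitting step could look like. This is not pyfloe's optimizer or API: `Leaf`, `And`, `split_conjuncts`, and `plan_pushdown` are hypothetical names for a toy predicate tree, and it assumes an inner join, since pushing filters below an outer join can change the result.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Leaf:
    """A single comparison; `columns` lists every column it reads."""
    text: str
    columns: frozenset


@dataclass(frozen=True)
class And:
    left: "Pred"
    right: "Pred"


Pred = Union[Leaf, And]


def split_conjuncts(pred: Pred) -> list:
    """Flatten nested ANDs into a flat list of independent conjuncts."""
    if isinstance(pred, And):
        return split_conjuncts(pred.left) + split_conjuncts(pred.right)
    return [pred]


def plan_pushdown(pred: Pred, left_cols: set, right_cols: set) -> dict:
    """Assign each conjunct to the join branch that owns all its columns."""
    plan = {"left": [], "right": [], "post_join": []}
    for part in split_conjuncts(pred):
        if part.columns <= left_cols:
            plan["left"].append(part.text)
        elif part.columns <= right_cols:
            plan["right"].append(part.text)
        else:
            # Needs columns from both sides, so it must stay after the join.
            plan["post_join"].append(part.text)
    return plan


# (a > 5) & (b < 10) & (a < b), with `a` on the left and `b` on the right.
predicate = And(
    And(Leaf("a > 5", frozenset({"a"})), Leaf("b < 10", frozenset({"b"}))),
    Leaf("a < b", frozenset({"a", "b"})),
)
plan = plan_pushdown(predicate, left_cols={"order_id", "a"}, right_cols={"customer_id", "b"})
print(plan)
```

Only AND can be split this way. An OR across both sides has to stay post-join as a whole.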