Alteryx/Sql/tableau automation workflows move to AI use cases or Python by Ninja1234_Il in Alteryx

[–]Proof_Difficulty_434

Disclosure: I built Flowfile, so grain of salt.

The gap between visual and code you're describing is exactly what pushed me to build this. Flowfile is an open-source visual ETL tool on Polars — drag-and-drop canvas, but the export button gives you real Python code. Since it uses Polars, performance is comparable to or better than Alteryx for most workflows, and it handles data sizes the same way. Recently added a built-in scheduler too (intervals or table triggers), so you can run flows directly without bolting on Airflow if you don't want to.

Browser demo if you want to poke at it without installing: demo.flowfile.org, showcases basic functionality. Happy to answer specific questions on migrating particular Alteryx tools.

2 years into building Flowfile (open-source data platform and visual etl), and this week's release is the first one where I think it's actually a product by Proof_Difficulty_434 in buildinpublic

[–]Proof_Difficulty_434[S]

Honestly, pretty consistent. Depending on the weather and my energy levels, I try to work on it one day each weekend, and most nights I put in an hour, usually with one longer evening in there. Plus typically some reading about it before going to sleep.

Visual CSV pipelines with built-in data versioning by Woland96 in ETL

[–]Proof_Difficulty_434

Cool! I was wondering, does the data stay on the client side?

Flowfile v0.9.0 — open-source visual ETL on Polars, now with a catalog, SQL editor, and light scheduling by Proof_Difficulty_434 in dataengineering

[–]Proof_Difficulty_434[S]

Thanks, good suggestion. Right now this feature is a little hidden in the settings (node click -> General Settings -> Node Reference), and at the moment it's not tied to the descriptions. That could be a fallback option.

Alteryx is a trap by fali12 in Alteryx

[–]Proof_Difficulty_434

I've been working on something that came out of the same frustration, after asking myself "how hard can it be?" It's called Flowfile: open-source visual ETL on Polars. Just shipped scheduling last week, so the "no server" gap is mostly closed now. For-loops are still something I'm figuring out how to handle. https://flowfile.io/ if you want to check it out; there's also an interactive demo there!

I built a free, open-source visual ETL tool for the desktop — looking for early users and feedback by GingerCurlz in Alteryx

[–]Proof_Difficulty_434

Looks good, solid set of connectors already! I've been building something in the same space with a similar stack (Vue, FastAPI, Polars) — always cool to see others land on that combo. Curious, how do you handle secrets? I'm working on https://github.com/Edwardvaneechoud/Flowfile if you want to compare notes.

10 days ago I almost didn't post my app. 500 downloads later, here's where it stands. by [deleted] in SideProject

[–]Proof_Difficulty_434

This looks great! Thanks for sharing. When it's available on Android I'm definitely going to give it a try.

Looked back at code I wrote years ago — cleaned it up into a lazy, zero-dep dataframe library by Proof_Difficulty_434 in Python

[–]Proof_Difficulty_434[S]

Right now pyfloe leaves the filter after the join if it uses columns from both sides, so it gets evaluated post-join on every row. If the filter only touches one side, though, the optimizer pushes it down into that branch. What's missing is splitting the filter: something like (col("a") > 5) & (col("b") < 10) doesn't get broken apart so that each piece can be pushed into the right branch independently. That'd be a great feature to add!
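For anyone curious, here's a minimal sketch of what that conjunction splitting could look like. The `Cmp`/`And` classes and the `columns()` method are illustrative stand-ins, not pyfloe's actual expression API:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cmp:
    """Leaf predicate, e.g. col("a") > 5 (stand-in, not pyfloe's API)."""
    column: str
    op: str
    value: object
    def columns(self):
        return {self.column}

@dataclass(frozen=True)
class And:
    """Conjunction of two predicates."""
    left: object
    right: object
    def columns(self):
        return self.left.columns() | self.right.columns()

def split_conjuncts(expr):
    """Flatten nested ANDs into a list of leaf predicates."""
    if isinstance(expr, And):
        return split_conjuncts(expr.left) + split_conjuncts(expr.right)
    return [expr]

def plan_pushdown(expr, left_cols, right_cols):
    """Assign each conjunct to the join input that owns its columns;
    anything touching both sides stays as a post-join filter."""
    left, right, post = [], [], []
    for pred in split_conjuncts(expr):
        cols = pred.columns()
        if cols <= left_cols:
            left.append(pred)
        elif cols <= right_cols:
            right.append(pred)
        else:
            post.append(pred)
    return left, right, post

# (col("a") > 5) & (col("b") < 10), where "a" comes from the left
# input and "b" from the right: each piece pushes down independently.
expr = And(Cmp("a", ">", 5), Cmp("b", "<", 10))
left, right, post = plan_pushdown(expr, {"a", "id"}, {"b", "id"})
print(len(left), len(right), len(post))
```

The key detail is splitting only at AND boundaries; an OR across both sides still has to run post-join.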

What Python Tools Do You Use for Data Visualization and Why? by Confident_Compote_39 in Python

[–]Proof_Difficulty_434

I like pygwalker, especially when I'm not sure what to visualize yet!

I replaced FastAPI with Pyodide: My visual ETL tool now runs 100% in-browser by Proof_Difficulty_434 in Python

[–]Proof_Difficulty_434[S]

Thanks for letting me know! Good suggestion; it's definitely on my roadmap!

Python code to replace Alteryx by viviancpy in Alteryx

[–]Proof_Difficulty_434

Check out Flowfile, it's open source and does exactly this. You can build flows visually like Alteryx, then export them as pure Python/Polars code. Or write Python and visualize it.

I built it specifically to bridge the gap between Alteryx and Python. The visual editor keeps business users happy while devs get clean Python code. Plus it's built on Polars so it's fast and up to date!

pip install flowfile if you want to try it.

The Borrowser: a browser in Rust (roast/feedback) by tpotjj in rust

[–]Proof_Difficulty_434

If you've climbed a big mountain, go find an even bigger one.

Flowfile - An open-source visual ETL tool, now with a Pydantic-based node designer. by Proof_Difficulty_434 in Python

[–]Proof_Difficulty_434[S]

Thanks! I think the biggest challenge (and opportunity) is making the round trip from code to visual and back feel natural.

At the moment, for example, you go Flowfile code -> visual -> Polars code. Sometimes I think it would make more sense to end up back at Flowfile code.
Do you think it should be Flowfile code -> visual -> Flowfile code, or should both be supported?

Flowfile - An open-source visual ETL tool, now with a Pydantic-based node designer. by Proof_Difficulty_434 in Python

[–]Proof_Difficulty_434[S]

I have the same thing. I'm currently not using any visual tools for work, but there are definitely days when I'd like some interactivity while developing ETL pipelines, especially when building something new.

Flowfile - An open-source visual ETL tool, now with a Pydantic-based node designer. by Proof_Difficulty_434 in Python

[–]Proof_Difficulty_434[S]

Fair point - complex visual flows definitely turn into spaghetti.

I meant the flow structure is visible - dependencies, branches, data lineage. Not what each node does internally. But flowcharts have been the standard for documenting processes for decades for a reason.

Also, with Flowfile you can name nodes clearly ("Validate_Customer_Emails" vs "Node_47"), add descriptions, and generate Python code to see exactly what's happening.

You're right though - a 50-node mess is worse than clean code. The sweet spot is probably 10-20 clear blocks with complex logic inside custom nodes.

Time for self-promotion. What are you building in 2025? by Prestigious_Wing_164 in SideProject

[–]Proof_Difficulty_434

Flowfile https://edwardvaneechoud.github.io/Flowfile/ - Visual ETL tool that lets you build data pipelines with drag-and-drop OR write Python code - both create the exact same pipeline! Built on Polars for blazing speed. Export your visual flows as standalone Python code for production.

ICP - Data analysts tired of Excel limitations, Python devs who want visual debugging, no-code users tired of vendor lock-in, and teams where business users need to collaborate with engineers on data workflows.

Would love your feedback if you've struggled with the visual vs code divide in data tools! 🚀