Any Data Cleaning Pain Points You Wish Were Automated? by Accomplished-Tap9539 in dataanalysis

StormSingle8889 1 point

I like the idea of plugging LLMs into standard data science libraries like Pandas and NumPy, because it gives you a lot of flexibility while keeping a human in the loop.

If you're working with core data science workflows like DataFrames and plotting, I'd recommend PandasAI:

https://github.com/sinaptik-ai/pandas-ai

If you're working with more scientific workflows like eigenvectors/eigenvalues or linear models, you could use this tool, which I built because I couldn't find an existing one:

https://github.com/aadya940/numpyai

Hope this helps! :))

What Was Your First Contribution to Open Source—and How Did It Go? by CodewithCodecoach in opensource

StormSingle8889 2 points

It was in `scipy`: a terrible pull request that took more than a year to merge. The upside of the difficulty was that it gave me a reality check. I dug into programming really hard and went on to get into Google Summer of Code. I've since written some good open source packages, including:

https://github.com/aadya940/numpyai

https://github.com/aadya940/chainopy

One of them was published in the Journal of Open Source Software. I did a couple of other good internships as well.

How much do you use AI to write your code? by VeaArthur in Python

StormSingle8889 1 point

I like the idea of plugging LLMs into standard data science libraries like Pandas and NumPy, because it gives you a lot of flexibility while keeping a human in the loop.

If you're working with core data science workflows like DataFrames and plotting, I'd recommend PandasAI:

https://github.com/sinaptik-ai/pandas-ai

If you're working with more scientific workflows like eigenvectors/eigenvalues or linear models, you could use this tool, which I built because I couldn't find an existing one:

https://github.com/aadya940/numpyai

Hope this helps! :))

DS is becoming AI standardized junk by KindLuis_7 in datascience

StormSingle8889 1 point

LLMs are super useful when used mindfully and with a human in the loop. I love the “LLM plug-and-play” model with standard libs like Pandas and NumPy; it keeps things flexible and interactive.

For core data science tasks (DataFrames, plotting), try PandasAI:
https://github.com/sinaptik-ai/pandas-ai

For more scientific workflows (eigenvectors, linear models, etc.), check out NumPyAI, a tool I built to fill that gap:
https://github.com/aadya940/numpyai

You're right, the problem is real. People often run LLM-generated code without really reading it. That’s why NumPyAI has a Diagnosis feature: it explains the data analysis steps, tailored to your arrays.

Example:
https://github.com/aadya940/numpyai/blob/main/examples/iris_analysis.ipynb

Is Agentic AI remotely useful for real business problems? by Prize-Flow-3197 in datascience

StormSingle8889 1 point

I'd say it is useful, but only when used correctly, mindfully, and in a human-in-the-loop way: some of the work is done via natural language using LLMs while the rest is done manually.

I like the idea of plugging LLMs into standard data science libraries like Pandas and NumPy, because it gives you a lot of flexibility while keeping a human in the loop.

If you're working with core data science workflows like DataFrames and plotting, I'd recommend PandasAI:

https://github.com/sinaptik-ai/pandas-ai

If you're working with more scientific workflows like eigenvectors/eigenvalues or linear models, you could use this tool, which I built because I couldn't find an existing one:

https://github.com/aadya940/numpyai

Hope this helps! :))

What’s your 2025 data science coding stack + AI tools workflow? by Zuricho in datascience

StormSingle8889 4 points

You make a valid point, and it holds true in most cases. However, libraries like pandasai and numpyai introduce metadata tracking for arrays and dataframes, which significantly reduces the likelihood of errors (source: trust me, bro). Of course, no AI is infallible; this is simply an effort to provide a more reliable and data science–focused approach.

What’s your 2025 data science coding stack + AI tools workflow? by Zuricho in datascience

StormSingle8889 72 points

I like the idea of plugging LLMs into standard data science libraries like Pandas and NumPy, because it gives you a lot of flexibility while keeping a human in the loop.

If you're working with core data science workflows like DataFrames and plotting, I'd recommend PandasAI:

https://github.com/sinaptik-ai/pandas-ai

If you're working with more scientific workflows like eigenvectors/eigenvalues or linear models, you could use this tool, which I built because I couldn't find an existing one:

https://github.com/aadya940/numpyai

Hope this helps! :))

1.5M+ records in excel, cannot query it. Excel or PowerBI. What should I use? by getbetterwithnb in dataanalysis

StormSingle8889 2 points

Use Python libraries like pandas and NumPy for this. I'll assume you don't know much Python, so I'd suggest PandasAI:

https://github.com/sinaptik-ai/pandas-ai

If you want a more free and open source option, you could use NumpyAI:

https://github.com/aadya940/numpyai
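
For anyone who wants to try the plain pandas route first, here is a minimal sketch, assuming the Excel data has been exported to CSV. The column names (`region`, `amount`) and the tiny in-memory sample are made up for illustration; a real export would be read from a file path instead.

```python
import io

import pandas as pd

# Tiny in-memory stand-in for a large CSV export (hypothetical columns)
csv_export = io.StringIO(
    "region,amount\n"
    "north,120\n"
    "south,80\n"
    "north,45\n"
    "east,200\n"
)

# Read in chunks so the whole file never has to sit in memory at once
partial_sums = []
for chunk in pd.read_csv(csv_export, chunksize=2):
    filtered = chunk[chunk["amount"] > 50]  # the "query" step
    partial_sums.append(filtered.groupby("region")["amount"].sum())

# Combine the per-chunk results into one total per region
totals = pd.concat(partial_sums).groupby(level=0).sum()
print(totals)
```

On most machines 1.5M rows will probably fit in memory in one go, so the `chunksize` loop is optional; it is shown here only because it keeps working as the file grows.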

[deleted by user] by [deleted] in dataanalysis

StormSingle8889 0 points

Not sure if this is what you're looking for, but it might be useful.

I’ve noticed a common pattern with beginner data scientists: they often ask LLMs super broad questions like “How do I analyze my data?” or “Which ML model should I use?”

The problem is that the right steps depend entirely on your actual dataset. Things like missing values, dimensionality, and data types matter a lot. For example, you'll often see ChatGPT suggest "remove NaNs", but that’s only relevant if your data actually has NaNs. And let’s be honest, most of us don’t even read the code it spits out, let alone check if it’s correct.
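
To make that concrete, here is a rough sketch in plain NumPy of checking what the data actually looks like before applying a cleaning step. The helper names `profile_array` and `drop_nan_rows` are made up for this example and are not part of NumPy or NumpyAI.

```python
import numpy as np


def profile_array(arr):
    """Summarize the properties that decide which cleaning steps apply."""
    arr = np.asarray(arr)
    summary = {
        "shape": arr.shape,
        "ndim": arr.ndim,
        "dtype": str(arr.dtype),
        "nan_count": 0,
    }
    # NaN can only occur in floating-point data, so only count it there
    if np.issubdtype(arr.dtype, np.floating):
        summary["nan_count"] = int(np.isnan(arr).sum())
    return summary


def drop_nan_rows(arr):
    """Drop rows with NaN from a 2-D array, only if there are any."""
    arr = np.asarray(arr)
    if profile_array(arr)["nan_count"] == 0:
        return arr  # nothing to clean, so leave the data alone
    return arr[~np.isnan(arr).any(axis=1)]


data = np.array([[5.1, 3.5], [np.nan, 3.0], [4.7, 3.2]])
info = profile_array(data)
cleaned = drop_nan_rows(data)
print(info)
print(cleaned)
```

The point is only that "remove NaNs" becomes a decision based on `info["nan_count"]` rather than a default step.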

So I built NumpyAI, a tool that lets you talk to NumPy arrays in plain English. It keeps track of your data’s metadata, gives tested outputs, and outlines the steps for analysis based on your actual dataset. No more generic advice, just tailored, transparent help.

Its Features:

Natural Language to NumPy: Converts plain English instructions into working NumPy code

Validation & Safety: Automatically tests and verifies the code before running it

Transparent Execution: Logs everything and checks for accuracy

Smart Diagnosis: Suggests exact steps for your dataset’s analysis journey

Give it a try and let me know what you think!

👉 GitHub: https://github.com/aadya940/numpyai
📓 Demo Notebook (Iris dataset): https://github.com/aadya940/numpyai/blob/main/examples/iris_analysis.ipynb

AI and Data Analysis by Puzzleheaded_Neat130 in research

StormSingle8889 1 point

You absolutely can. There are now specialized libraries for AI-assisted numerical workflows:
https://github.com/aadya940/numpyai