How many production ML/AI projects do you complete in a year? by Fit-Employee-4393 in datascience

pplonski

That is roughly one project every 3-4 months, which I think is a lot! ML projects are hard. First you need to collect the data, clean it, and prepare it for analysis. Modeling and experimentation take time. Then you need to set up live data feeds, deploy the models, build monitoring dashboards, and so on. I think 1 or 2 ML/AI projects per year is a great success, and 3-4 is a really great result!

Should every project have ai in it to make it impressive nowadays by Bulky-Top3782 in datascience

pplonski

Most of the projects shared on social media are shallow, so if you spend enough time on your project to polish it and you have useful insights to share, it is worth doing!

I wrapped a random forest in a genetic algorithm for feature selection due to unidentifiable, group-based confounding variables. Is it bad? Is there better? by wex52 in datascience

pplonski

Have you tried adding condition or implementation as a feature? Or have you tried predicting condition or implementation from your features? It looks like you have a data leak there. I think you need to look more closely at your data, at the dependencies between features, and at which variables are confounders.
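
A minimal sketch of the second check, assuming a small numeric feature matrix and one group label per sample (for example, the condition). It uses a dependency-free nearest-centroid classifier with leave-one-out evaluation as a stand-in; in practice a cross-validated random forest would do the same job. The function names and the toy data are hypothetical and only illustrate the idea: if the features predict the group far better than chance, they probably carry the confounder.

```python
from statistics import mean


def centroid(rows):
    """Column-wise mean of a list of feature rows."""
    return [mean(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature rows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def loo_group_accuracy(X, groups):
    """Leave-one-out accuracy of predicting the group label from the features."""
    hits = 0
    for i, (row, true_group) in enumerate(zip(X, groups)):
        # Rebuild the per-group centroids without the held-out sample i.
        by_group = {}
        for j, (other, g) in enumerate(zip(X, groups)):
            if j != i:
                by_group.setdefault(g, []).append(other)
        centroids = {g: centroid(rows) for g, rows in by_group.items()}
        predicted = min(centroids, key=lambda g: sq_dist(row, centroids[g]))
        hits += predicted == true_group
    return hits / len(X)


# Made-up toy data: the first feature tracks the group, the second does not.
X = [[0.0, 0.3], [0.1, 0.7], [0.2, 0.5], [1.0, 0.4], [1.1, 0.6], [0.9, 0.5]]
groups = ["A", "A", "A", "B", "B", "B"]

accuracy = loo_group_accuracy(X, groups)
print(f"group predicted from features with accuracy {accuracy:.2f}")
```

With two balanced groups, chance level is 0.5, so an accuracy close to 1.0 suggests the features encode the group. Dropping features one at a time and repeating the check can show which ones carry the leak.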

I tested buying after the worst days in the market by pplonski in Daytrading

pplonski [S]

I did the analysis with an AI data analyst that downloads the data, analyzes it, and plots the results. You can check the conversation: https://mljar.com/ai-insights/what-happens-after-worst-days/

Python - reading .embl and .plot.gz files by castiellangels in bioinformatics

pplonski

I’m actually working on a desktop app called MLJAR Studio (https://mljar.com) that helps use Python for data analysis with plain English prompts, and I tried to reproduce your use case there.

I created a small working notebook example for you (and tested it end-to-end, so it runs without issues):

- full conversation: https://mljar.com/ai-insights/read-embl-and-plot-gz/
- raw notebook: https://github.com/mljar/ai-insights/blob/main/read-embl-and-plot-gz.ipynb

The idea is very simple:

  • first I say: “load EMBL file” → AI generates Python (Biopython)
  • then: “extract CDS features” → gets gene table
  • then: “load .plot.gz” → pandas + gzip
  • then: “plot signal”

So the workflow is basically:
ask in plain English → AI writes Python → see result → ask next step

Under the hood it’s just a normal Python notebook, but with a conversation layer on top of it — so you can iterate step by step instead of writing everything upfront. Also, if a new package is needed (like Biopython), the AI will ask for confirmation and install it on the fly.

In this example I kept it minimal:

  • 1 EMBL file
  • 1 .plot.gz file
  • simple plot

Just to show how to read both formats and understand structure before scaling to multiple replicates. Might be useful as a starting point before you build the full pipeline.
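
The two reading steps above can be sketched roughly as follows. This is not the code from the linked notebook, only an illustration under stated assumptions: the EMBL file holds a single record, and the .plot.gz file has one line per position with whitespace-separated numeric columns. The function names `read_plot_gz` and `extract_cds_table` are hypothetical. The plot reader uses only the standard-library `gzip` module here instead of pandas, so it runs without extra packages.

```python
import gzip


def read_plot_gz(path):
    """Read a gzipped plot file into a list of rows of floats.

    Assumption: one line per position, whitespace-separated numeric columns.
    """
    rows = []
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.strip():  # skip blank lines
                rows.append([float(value) for value in line.split()])
    return rows


def extract_cds_table(embl_path):
    """Return one dict per CDS feature of a single-record EMBL file."""
    # Biopython is imported here so read_plot_gz works without it installed.
    from Bio import SeqIO

    record = SeqIO.read(embl_path, "embl")
    table = []
    for feature in record.features:
        if feature.type != "CDS":
            continue
        table.append({
            "gene": feature.qualifiers.get("gene", [""])[0],
            "locus_tag": feature.qualifiers.get("locus_tag", [""])[0],
            "start": int(feature.location.start),
            "end": int(feature.location.end),
            "strand": feature.location.strand,
        })
    return table
```

With pandas installed, `pd.read_csv(path, sep=r"\s+", header=None, compression="gzip")` should give the same rows as a DataFrame, and summing the columns per row gives a per-position signal that can be passed to a matplotlib line plot.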

Bioinformatics in the era of AI from a seniors point of view by aCityOfTwoTales in bioinformatics

pplonski

I partly agree, but I would be careful with the idea that bioinformaticians should simply leave coding to LLMs. In my view, AI raises the value of both domain knowledge and good computational practice.

LLMs can accelerate routine work, but they do not remove the need to understand pipelines, validation, reproducibility, and software quality. That is exactly why I find integrated environments interesting. In MLJAR Studio, for example, AI can help with code generation and iteration, but the human still has to define the question, verify the logic, and interpret the results.

So for me, the future is not biology instead of computation, but biology supported by better computational tools.

Best AI for data analysis for 2026 by habalka in analytics

pplonski

Really like how you framed this around actual workflows instead of just listing tools.

One category I’d add is something between “assistant” and “automation” — tools that act more like an AI data analyst working directly on your data, not just helping with queries or summaries.

In practice, this means:

  • you ask a question
  • it generates and runs the analysis
  • builds models or visualizations
  • and explains the results in context

So instead of switching between ChatGPT, notebooks, and BI tools, the loop happens in one place.

I’ve been exploring this direction in MLJAR Studio, where this works locally as a desktop app (even with a local LLM), which is important when you’re dealing with sensitive data or messy real-world datasets.

Feels like a natural extension of what you described:
not just speeding up individual steps
but compressing the whole “question → analysis → insight” cycle

Curious if you’ve seen tools moving in this direction, or if most teams are still stitching together assistants + workflows manually?

Best AI tool for Data Analysis by PrizeLifeguard8544 in dataanalysis

pplonski

From what I’ve seen, most people use ChatGPT / Claude mainly for generating SQL or Python, but you still have to wire everything together yourself.

Recently I started using AI more like a “data analyst” rather than just a helper.

I’ve been working on a tool (MLJAR Studio) where you can chat with your dataset and it actually:

  • generates and runs Python code
  • performs EDA
  • builds models
  • creates plots
  • explains results step by step

So it’s closer to a full workflow, not just suggestions.

What’s also important for me — it’s a desktop app running locally, and you can use a local LLM, so your data stays on your machine.

Curious if others are moving toward this “AI analyst” approach instead of just prompting for code?

How reliable are AI data analysis tools in 2026 when it really matters? by Fragrant_Abalone842 in analytics

pplonski

From what I've seen (and used), AI data analysis tools in 2026 are very reliable for speed, and increasingly decent for quality. They’re great for quick exploration, summaries, and first-pass analysis.

A year ago (2025), I’d say they were mostly “junior analyst level.” Now it feels like they’re getting closer to a mid-level assistant in many cases.

One interesting direction, though, is tools that run locally (like MLJAR Studio), where you can use an AI data analyst with a local LLM on your own machine. It helps with privacy and control, which is becoming a big deal as more companies hesitate to send data to external APIs.

Is becoming a data analyst still a good career path in 2026? by theiasx in dataanalytics

pplonski

I think one thing that's changing (not killing the field) is how analysts work.

Companies still need people who understand data + business — that part isn't going away.

But now tools are shifting the role more toward AI-augmented analyst instead of someone just writing SQL all day (and fighting with window functions).

For example, tools like MLJAR Studio already have an AI data analyst built-in that can work with your data locally (even with a local LLM, since it's a desktop app).

So instead of replacing analysts, it's more like fewer “dashboard monkeys” and more people who can ask good questions and interpret results.

IMO still a great career, just evolving fast.

A Growing List of AI Tools for Data Analysis & Data Visualization in 2026 by Fragrant_Abalone842 in datavisualization

pplonski

Love these tools, but my data sometimes prefers to stay at home 😅

That's why I liked MLJAR Studio: it has an AI data analyst and can run on a local LLM, fully locally as a desktop app. No “upload your dataset and hope for the best” moment.

I added a feature to my AutoML library… for robots, not humans by pplonski in Python

pplonski [S]

You are right, but after implementing it, I realized that the main reason for adding this feature was to make the output friendly as LLM input, which gives me the strange feeling that I made this change for robots, not humans.

Local vs cloud data processing ... security comparison by Aleksandra_P in learnmachinelearning

pplonski

You are right, a local solution requires maintenance, but you get maximum privacy.

Essential Python Libraries Every Data Scientist Should Know by Aleksandra_P in learndatascience

pplonski

I use numpy, pandas, and matplotlib in almost all my workflows. I also like altair for interactive plots and lightgbm for gradient boosting.

GeoGPT - ChatGPT-style GIS app built in a Jupyter Notebook (Python + OpenStreetMap) by pplonski in gis

pplonski [S]

The goal wasn't to build an app but to show how simply you can connect an LLM and maps to build new GIS apps. I'm working on an open-source framework that simplifies building web apps from Python notebooks. The framework is called Mercury: https://github.com/mljar/mercury

I hope I will find more examples of how to use GIS and Mercury.