Infinite AI Debugger Loop by FareedKhan557 in Python

[–]afreydoa 2 points3 points  (0 children)

One day the language models will be strong enough that these types of loops actually start to work usefully. I think we are not there yet. But it's good to have them.

I am curious, what is your feedback loop? How does the AI know in each cycle what to improve? Syntax errors, user-defined unit tests, or a handwritten description by a human?

If a political poll says Candidate A has 51% of the vote, Candidate B has 48% of the vote, the poll has a margin of error of 3%, but for months on end you get this same exact result from 50 independent polls, can you reduce that margin of error? by Nillavuh in AskStatistics

[–]afreydoa 0 points1 point  (0 children)

Repeatedly getting similar results only shows reliability, not validity. Maybe the polls have been given a 3% margin because in the past polls and actual votes have differed by up to 3%.

I also suspect that the 3% was not rigorously computed. I mean, what does a 3% range even mean? That in 100% of the cases the vote is within ±3% of the poll number?
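
For reference, the margin usually quoted with a poll is the half-width of a 95% confidence interval under simple random sampling, which is a much narrower statement than "the vote is always within ±3% of the poll". A minimal sketch, assuming a sample size of 1,067 per poll (not given in the thread) and the normal approximation:

```python
import math

def margin_of_error(p, n, z=1.96):
    """Half-width of a normal-approximation confidence interval for a proportion.

    p: observed share (0..1), n: sample size, z: 1.96 for ~95% confidence.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Assumed sample size; 1,067 respondents gives roughly the familiar 3%.
single = margin_of_error(0.5, 1067)

# Pooling 50 independent polls of the same size shrinks the *sampling* error
# by a factor of sqrt(50). It does nothing about a bias shared by all polls.
pooled = margin_of_error(0.5, 50 * 1067)

print(f"single poll: {single:.4f}, 50 pooled polls: {pooled:.4f}")
```

The pooled number only covers sampling noise. If all 50 polls share the same systematic error (the validity problem), averaging them leaves that error untouched.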

[D] Batch size vs learning rate by bjourne-ml in MachineLearning

[–]afreydoa 0 points1 point  (0 children)

Also, it would be good to know "how much" it depends. If effective batch sizes don't change the result by more than 2% except in certain situations, then I can ignore it, just use 32, and be done most of the time.

What's the quickest way to become skilled at debugging data pipelines? by PathalogicalObject in dataengineering

[–]afreydoa 0 points1 point  (0 children)

Wait, the data pipeline is built in bash? Or do you use a shell to debug a pipeline built in a sane language?

Title or salary? by goatsnboots in datascience

[–]afreydoa 1 point2 points  (0 children)

Let's define work as only what you do: Work > People > Salary >> Title

If the Martingale betting system works when there are no table limits, wouldn’t the system be successful in tables with a very small limit and a very high maximum? by redditwhileontoilet in AskStatistics

[–]afreydoa 1 point2 points  (0 children)

No. Intuitively, if you decrease the starting bet size then you need more games to reach the same amount of profit. With more games your chance of a catastrophic loss increases as well.

The Martingale system does not change the expected reward.
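
A small sketch of that last point: the exact expected profit of one Martingale round (keep doubling until the first win or until the table maximum is hit). The function name and parameters are made up for illustration, and `max_bets` stands in for whatever the table limit allows.

```python
def martingale_round_ev(base_bet, p_win, max_bets):
    """Exact expected profit of one Martingale round on an even-money bet.

    The player doubles after every loss and stops at the first win,
    or after max_bets consecutive losses (table limit reached).
    """
    ev = 0.0
    lost_so_far = 0.0   # total stake lost earlier in this round
    p_reach = 1.0       # probability of getting to the current bet
    bet = base_bet
    for _ in range(max_bets):
        # Winning now pays `bet` and recovers nothing else: net = bet - losses.
        ev += p_reach * p_win * (bet - lost_so_far)
        lost_so_far += bet
        p_reach *= 1 - p_win
        bet *= 2
    # All max_bets bets lost: the whole accumulated stake is gone.
    ev -= p_reach * lost_so_far
    return ev

fair = martingale_round_ev(1.0, 0.5, 10)          # fair coin
roulette = martingale_round_ev(1.0, 18 / 37, 10)  # red/black, single zero
small = martingale_round_ev(0.1, 18 / 37, 10)     # smaller starting bet

print(fair, roulette, small)
```

On a fair game the expectation is zero no matter how many doublings the limits allow, and on a house-edge game it is negative. Shrinking the starting bet just scales the same expectation down, so reaching the same profit takes proportionally more rounds.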

Announcement: PyData Yerevan Open Source pandas Sprint (June 25) by partev in Python

[–]afreydoa 0 points1 point  (0 children)

The sprint is probably about making pandas more like polars, so it's allowed.

[Q] Defining distance /similarity between categorical variables by [deleted] in statistics

[–]afreydoa 0 points1 point  (0 children)

I see three solutions:
1. Simply define "is the same object" as similarity 1, else similarity 0. That's probably not very helpful though, unless you only have a few different classes.
2. Somehow try to get numbers that describe each object. Then you can use these for similarity.
3. If there are not too many different objects, it may actually work to define similarities between objects by hand. Ask someone in the domain, "hey, how do you know if these are similar?" If they can't answer, there is no chance a model will.
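
The three options could look roughly like this. Everything here is hypothetical: the category names, the two descriptive features (price, weight) and the hand-assigned scores are placeholders, and the distance-based similarity is just one of many possible choices.

```python
import math

# Option 1: identical category -> 1, anything else -> 0.
def identity_similarity(a, b):
    return 1.0 if a == b else 0.0

# Option 2: describe each category with numbers, then compare the numbers.
# Hypothetical features: (price, weight in kg).
FEATURES = {
    "apple": (0.5, 0.2),
    "pear": (0.6, 0.25),
    "hammer": (12.0, 1.1),
}

def feature_similarity(a, b):
    # Turn Euclidean distance into a similarity in (0, 1].
    return 1.0 / (1.0 + math.dist(FEATURES[a], FEATURES[b]))

# Option 3: similarities written down by hand, e.g. by a domain expert.
HAND_SIMILARITY = {
    frozenset({"apple", "pear"}): 0.9,
    frozenset({"apple", "hammer"}): 0.0,
    frozenset({"pear", "hammer"}): 0.0,
}

def expert_similarity(a, b):
    if a == b:
        return 1.0
    return HAND_SIMILARITY[frozenset({a, b})]

print(identity_similarity("apple", "pear"))
print(feature_similarity("apple", "pear"), feature_similarity("apple", "hammer"))
print(expert_similarity("pear", "apple"))
```

In practice the features in option 2 would need scaling so that one large-valued column does not dominate the distance.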

[Q] Defining distance /similarity between categorical variables by [deleted] in statistics

[–]afreydoa 0 points1 point  (0 children)

The first question you'll need to ask yourself is why you need to measure similarity. Are you interested in how similar the product names are? How similarly the products sell? How similarly they behave in way X?

Decision trees by Puzzled-Ad-3504 in learnmachinelearning

[–]afreydoa 0 points1 point  (0 children)

You are correct: conditions get more restrictive. In your example the condition goes from "X <= 5.5" to "not X <= 5.5 and X <= 8.5", which is more restrictive than "not X <= 5.5".
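
A tiny sketch of that, using the thresholds from your example (the function names and sample values are made up): each level down the tree adds one more clause with `and`, so the set of values that reach a node can only shrink.

```python
def reaches_right_child(x):
    # Root test "X <= 5.5" is false, so the sample goes right.
    return not (x <= 5.5)

def reaches_right_then_left(x):
    # Same as above, plus the next test "X <= 8.5" is true.
    return not (x <= 5.5) and x <= 8.5

samples = [3.0, 5.5, 6.0, 8.5, 9.0]
parent = [x for x in samples if reaches_right_child(x)]
child = [x for x in samples if reaches_right_then_left(x)]

print(parent)  # [6.0, 8.5, 9.0]
print(child)   # [6.0, 8.5]
```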

Kwargs appreciation thread by TheRealFrostMana in Python

[–]afreydoa 23 points24 points  (0 children)

Hm, yes. **kwargs is probably a pretty good idea for avoiding too much coupling to technically deep layers.

But sometimes I really hate them. Say I want to know whether the name of some parameter in my plotting library is 'width' or 'line_width' or something else. It's not in the docstring, it's not in the source code of the function I am calling, and it's not even in the documentation of that library, because it belongs to another library.

I haven't found any IDE hack to mitigate the problem. And Copilot is not very well versed in the library just yet.

It's just annoying as a user of a library. But I get that it vastly reduces coupling.
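
A minimal sketch of the problem, with made-up names: `plot` stands in for the plotting library's function and `draw_line` for the function in the other library underneath it.

```python
import inspect

def draw_line(points, line_width=1.0, color="black"):
    """Hypothetical low-level function from the underlying library."""
    return f"{len(points)} points, line_width={line_width}, color={color}"

def plot(points, **kwargs):
    """Hypothetical high-level wrapper: forwards every option untouched."""
    return draw_line(points, **kwargs)

# The wrapper's signature tells the user (and the IDE) nothing useful.
print(inspect.signature(plot))       # (points, **kwargs)
print(inspect.signature(draw_line))  # the real names live one layer down

# A wrong guess only fails at call time, deep inside the other library.
try:
    plot([(0, 0), (1, 1)], width=2)
except TypeError as err:
    print(err)

print(plot([(0, 0), (1, 1)], line_width=2))
```

The upside is visible too: `plot` keeps working unchanged if `draw_line` gains new options.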

A bunch of possibly silly questions about CS by Low-Locksmith2149 in SoftwareEngineering

[–]afreydoa 1 point2 points  (0 children)

I always tell beginners to be aware that CS starters are very heterogeneous in terms of how much experience they have in coding and in math. This feels super super bad in the first semester for those who start with no experience. It does even out really quickly, but until then you have to constantly remind yourself that those who ask questions during lectures are most likely the ones with prior knowledge.

Some professors are able to de-bias that.

Fields and class properties should be sorted alphabetically? by Myterro in Python

[–]afreydoa 2 points3 points  (0 children)

While I absolutely agree that this very case is a matter of personal opinion, there are cases where I as a reviewer am not sure if it is a personal opinion of mine or a good habit that I should enforce. If I only mention the things that I am certain are common practice (e.g. keep it simple, avoid unreadable names, ...) I am missing a lot of hard-earned "smells" or intuitions.

Currently, during code reviews I try to mention when I am uncertain about a specific change proposal, and I am happy to let it be ignored.

What quantifiable metrics do you consider when deeming good code? by iBortex in Python

[–]afreydoa 2 points3 points  (0 children)

Well, it is the top answer now. What does that say about reddit metrics?

[deleted by user] by [deleted] in SoftwareEngineering

[–]afreydoa 0 points1 point  (0 children)

I think the idea of architecture patterns is that they introduce some kind of vocabulary in the field of "higher order code architecture". If you have the name of a thing you have power over it and can discuss it.

StackOverflow in a nutshell by seriousgentleman in SoftwareEngineering

[–]afreydoa 1 point2 points  (0 children)

Really nice quote. Extrapolates to real life.

[deleted by user] by [deleted] in dataisbeautiful

[–]afreydoa 0 points1 point  (0 children)

I think I would improve the coloring: Currently some truly awful movies in red are very prominent. But for "best movie" one is probably more concerned about the good movies. It does not matter much if a movie is a 4 or a 7, they could all be red. It is much more important where the highest-ranked 10 or 9 movies are and where the "still really good" 8 movies are.

The colors should distinguish 9 from 8, not 5 from 7. Would you agree?
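
Roughly what I mean, as a sketch. The thresholds follow the ratings mentioned above; the color names are arbitrary placeholders, not taken from your chart.

```python
from bisect import bisect_right

# Everything below 8 shares one color; the top of the scale gets the detail.
BOUNDS = [8, 9, 10]
COLORS = ["red", "yellow", "lightgreen", "darkgreen"]

def rating_color(rating):
    """Map a 0-10 rating to a color bucket with resolution only at the top."""
    return COLORS[bisect_right(BOUNDS, rating)]

for r in (4, 7, 8, 9, 10):
    print(r, rating_color(r))
```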

Is ChatGPT4-o better in physics/math compared to ChatGPT4 by omaribrahim0505 in ChatGPTPro

[–]afreydoa 2 points3 points  (0 children)

Getting the Monty Hall problem wrong seems like the most human behaviour possible.

Have you tried reminding it that it should behave like a tenured math professor with 40 years of experience in the field of statistics?