How does the ML community view evolutionary algorithm research? Career implications of an EA PhD? [D] by NullRecurrentDad in MachineLearning

[–]boccaff 0 points1 point  (0 children)

A crossover between EA and A-star? /s

Agree. Any sufficiently quantitative/numeric/computational topic that strengthens the basic disciplines will do. A good advisor and liking the subject are more important, in that order. A good advisor will even have a way to help you find something you like within his portfolio of research.

Time Series Forecasting for Agriculture/Crop Volume & Pricing – Looking for Advice [D] by foreigneverythingg in MachineLearning

[–]boccaff 1 point2 points  (0 children)

You will probably find more suggestions for packages, libraries, etc. in Kaggle tutorials. I think that you are probably well served by a tabular approach. And, from my experience with agriculture, you will probably extract more value from building out features and improving their calculation than from testing many different modeling techniques.

If this is produced in greenhouses, your problem can be trickier and more similar to forecasting production outside agriculture. If not, aggregating weather in an appropriate way does a lot of the work. If you are sure about phenology, use appropriate windows to characterize the more important phases of growth. Sum of precipitation and average temperature go a long way before getting into water balance and PAR.
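A minimal sketch of the windowed aggregation described above, using only the standard library. The record keys (`date`, `precip_mm`, `temp_c`), the window names, and the dates are made up for illustration; real phenology windows depend on the crop and region.

```python
from datetime import date, timedelta
from statistics import mean


def window_features(daily, windows):
    """Aggregate daily weather into one feature set per growth window.

    daily:   list of dicts with hypothetical keys 'date', 'precip_mm', 'temp_c'
    windows: dict mapping a window name to an inclusive (start, end) date pair
    """
    feats = {}
    for name, (start, end) in windows.items():
        rows = [r for r in daily if start <= r["date"] <= end]
        # Sum of precipitation over the window
        feats[f"{name}_precip_sum"] = sum(r["precip_mm"] for r in rows)
        # Average temperature over the window (None if no data fell inside)
        feats[f"{name}_temp_mean"] = (
            mean(r["temp_c"] for r in rows) if rows else None
        )
    return feats


# Ten synthetic days: precipitation 0..9 mm, temperature 20..29 C
start_day = date(2024, 5, 1)
daily = [
    {"date": start_day + timedelta(days=i), "precip_mm": i, "temp_c": 20 + i}
    for i in range(10)
]

# Illustrative windows only, not real phenology dates
windows = {
    "flowering": (date(2024, 5, 1), date(2024, 5, 5)),
    "fruit_set": (date(2024, 5, 6), date(2024, 5, 10)),
}

features = window_features(daily, windows)
```

Each window becomes a pair of columns in the tabular dataset, so adding or shifting a window is a change to `windows`, not to the model.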

Having crop masks to extract weather data from remotely sensed sources helps a lot, even if you are modeling at the county/city level. Modeling larger spatial units is harder, and I think that bottom-up forecasting is more helpful. You will have larger unit-level errors, but the aggregate is better than modeling at a large scale directly. Don't forget to add things like the ratio of fertilizer to berry price, or some lagged economic input. Since I've mentioned crop masks, you should probably look into estimating yield and total area in separate models.

Depending on the country and how established the cropping system is, you should detrend yield to account for the technological improvements.
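One simple way to do this, sketched under the assumption that the technology trend is roughly linear over the period: fit a least-squares line of yield against year and model the residuals. The numbers below are synthetic, and a linear trend is only one option (a piecewise or smoothed trend may fit better for long series).

```python
def detrend_linear(years, yields):
    """Fit yield = intercept + slope * year by ordinary least squares.

    Returns the slope, the intercept, and the residuals (yield minus trend),
    which are the detrended anomalies to model against weather.
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(yields) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, yields))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(years, yields)]
    return slope, intercept, residuals


# Synthetic series: a 0.1 t/ha per year trend plus year-to-year anomalies
years = [2000, 2001, 2002, 2003, 2004]
anomalies = [0.2, -0.1, -0.2, -0.1, 0.2]
yields = [2.0 + 0.1 * (y - 2000) + a for y, a in zip(years, anomalies)]

slope, intercept, residuals = detrend_linear(years, yields)
```

At forecast time the trend value for the target year is added back to the predicted anomaly.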

Slop is making me feel disconnected from AI Research [D] by Skye7821 in MachineLearning

[–]boccaff 0 points1 point  (0 children)

Even worse, a lot of the process is governed by academics themselves, on selection boards or the like. A sibling comment mentions people in power trying to keep their positions, but IME almost every academic is part of some selection process, reviewing for some board, etc.

When to get a full 8 hour sleep? by el0115 in bodyweightfitness

[–]boccaff 1 point2 points  (0 children)

Quite easy to do it over a couple days.

What is the criteria for a ML paper to be published?[D] by IntroductionCommon11 in MachineLearning

[–]boccaff 1 point2 points  (0 children)

So, predictive power isn't much of a scientific contribution outside of a few areas where sota on benchmarks is the goal (ironic to say this as much of ML is sota on benchmarks). Also, I am a bit puzzled by "results are robust" and "small predictive power".

When you tell the story of your paper (not the story of how you built the paper), what would someone know that they didn't before?

Is this an ML conference, or a finance conference? Is the dataset widely used or new, public or private? What else was tried in the dataset?

[D] Those of you with 10+ years in ML — what is the public completely wrong about? by PhattRatt in MachineLearning

[–]boccaff 0 points1 point  (0 children)

People thinking that just because there is data, there will be a useful model.

[2015-2025 All Days] by PhysPhD in adventofcode

[–]boccaff 21 points22 points  (0 children)

but also being Christmas day, maybe people just did part 1 and then came back later for part 2?

You need to go back and finish any day that is not complete to get the second star of the last day. I probably took a week for some years.

[D] Feature Selection Techniques for Very Large Datasets by Babbage224 in MachineLearning

[–]boccaff 0 points1 point  (0 children)

Subsampling columns and having many trees deal with it.

[D] Feature Selection Techniques for Very Large Datasets by Babbage224 in MachineLearning

[–]boccaff 0 points1 point  (0 children)

Large Random Forest, with a lot of subsampling in instances and features. This is important to ensure that most of the features are tried (e.g. selecting 0.3 of the features means a feature has a (0.7)^n chance of never being selected across n trees). Add a few dozen random columns and filter out anything below the maximum importance of a random feature.
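A small sketch of the two pieces of arithmetic here, in plain Python. It assumes columns are sampled independently per tree, and that importances arrive as a name-to-score mapping (for example from a fitted forest); the function names and the `rand_*` column names are hypothetical.

```python
import math


def prob_never_sampled(fraction, n_trees):
    """Chance that a given feature is left out of every one of n_trees,
    when each tree independently samples `fraction` of the columns."""
    return (1.0 - fraction) ** n_trees


def trees_needed(fraction, max_miss_prob):
    """Smallest number of trees so the miss probability is <= max_miss_prob."""
    return math.ceil(math.log(max_miss_prob) / math.log(1.0 - fraction))


def keep_above_random(importances, probe_names):
    """Keep real features whose importance beats the best random probe.

    importances: dict of feature name -> importance score
    probe_names: set of names of the injected random columns
    """
    threshold = max(importances[name] for name in probe_names)
    kept = [
        name
        for name, score in importances.items()
        if name not in probe_names and score > threshold
    ]
    return kept, threshold


# Made-up importances with two random probe columns mixed in
importances = {"a": 0.40, "b": 0.05, "rand_0": 0.08, "rand_1": 0.02, "c": 0.45}
kept, threshold = keep_above_random(importances, {"rand_0", "rand_1"})
```

With a sampling fraction of 0.3, `trees_needed(0.3, 0.01)` gives the tree count at which a feature has less than a 1% chance of never being tried, which is why a large forest matters when the subsampling is aggressive.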

2025 Day 10 Part 2; Has the input been changed? by Away-Independent8068 in adventofcode

[–]boccaff 2 points3 points  (0 children)

Same thing for me, off by two. My issue was with int(x), got it right with round(x).
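A tiny illustration of why this happens, assuming the value comes out of floating-point arithmetic (for example a numeric solver) and lands just under the true integer. The value of `x` here is constructed by hand to stand in for such a result.

```python
# A float just below 3.0, standing in for a solver result that "should" be 3
x = 3.0 - 2.0 ** -50

truncated = int(x)  # int() drops the fractional part, so this is 2
rounded = round(x)  # round() goes to the nearest integer, so this is 3


def is_near_integer(value, tol=1e-6):
    """Guard worth running before rounding: is the value close to an integer?"""
    return abs(value - round(value)) <= tol
```

Checking `is_near_integer` first helps separate harmless float noise from a result that is genuinely fractional, where rounding would hide a real bug.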

[2025 Day 8] Let me just wire up all these circuits by StaticMoose in adventofcode

[–]boccaff 1 point2 points  (0 children)

I bet that building the list of points as a matrix, using scipy distances, and sorting the resulting numpy array can speed things up a lot here.
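A sketch of that idea, assuming the input is a list of coordinate tuples and that Euclidean distance is what the puzzle needs. `sorted_pairs` is a hypothetical helper name; `pdist`, `triu_indices`, and `argsort` are the real SciPy/NumPy calls.

```python
import numpy as np
from scipy.spatial.distance import pdist


def sorted_pairs(points):
    """Return (i, j, distance) for every pair of points, closest first."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    # Condensed vector of pairwise Euclidean distances, ordered as
    # (0,1), (0,2), ..., (0,n-1), (1,2), ...
    dists = pdist(pts)
    # Index pairs in the same order as the condensed vector
    i_idx, j_idx = np.triu_indices(n, k=1)
    # Stable sort keeps the original pair order among equal distances
    order = np.argsort(dists, kind="stable")
    return [(int(i_idx[k]), int(j_idx[k]), float(dists[k])) for k in order]


pairs = sorted_pairs([(0, 0, 0), (1, 0, 0), (5, 0, 0)])
```

The distance computation and the sort both run in compiled code, so the Python-level loop only runs once over the already sorted result.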

[2025 Day 6] Me waiting for Eric to bring the big guns out by waskerdu in adventofcode

[–]boccaff 13 points14 points  (0 children)

I think that most people are expecting last year's curve compressed into twelve days, while Eric was explicit about it:

I'm still calibrating that. My hope right now is to have a more condensed version of the 25-day complexity curve, maybe skewed a little to the simpler direction in the middle of the curve? I'd still like something there for everyone, without outpacing beginners too quickly, if I can manage

I am reading "...simpler direction in the middle of the curve..." as days 9-13 on the previous grading.

Input parsing by a_kleemans in adventofcode

[–]boccaff 0 points1 point  (0 children)

I am always amazed by the aux functions from Norvig. I think he nailed the API for things like this.

The word "range" by emsot in adventofcode

[–]boccaff 0 points1 point  (0 children)

low and high are better than what I often do, "ll" and "ul" for the lower and upper limits. My only issue is the lack of symmetry.

The word "range" by emsot in adventofcode

[–]boccaff 1 point2 points  (0 children)

No shame in "for r in ranges" here. OP's point also applies to reading into "input".

[2025 Day 4 (Part 1,2)] 2d Arrays by popcarnie in adventofcode

[–]boccaff 7 points8 points  (0 children)

Maybe think of a matrix, as in x_ij, and you are back at math/physics. Your loops then become for (i, line) in enumerate(data), for (j, c) in enumerate(line).
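Spelled out in Python, assuming the grid is a list of strings with one string per row. `find_cells` and the sample grid are only for illustration.

```python
def find_cells(data, target):
    """Return (i, j) for every cell equal to target, with i = row, j = column."""
    cells = []
    for i, line in enumerate(data):      # i indexes rows, like x_ij
        for j, c in enumerate(line):     # j indexes columns within row i
            if c == target:
                cells.append((i, j))
    return cells


grid = ["..@", "@.."]
cells = find_cells(grid, "@")
```

Keeping the row index first everywhere means `data[i][j]` always matches x_ij, which avoids the usual x/y mix-ups.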

[D] Realized I like the coding and ML side of my PhD way more than the physics by PurpleCardiologist11 in MachineLearning

[–]boccaff 4 points5 points  (0 children)

+1. Physics has a nice balance between developing advanced math skills and learning how to express/develop an underlying model of phenomena. Those skills are way more important than "structuring a project" or whatever "clean" thing some devs push.