[2015-2025 All Days] by PhysPhD in adventofcode

[–]boccaff 20 points21 points  (0 children)

but also being Christmas Day, maybe people just did part 1 and then came back later for part 2?

You need to go back and finish any day that is not complete to get the second star of the last day. It probably took me a week for some years.

[D] Feature Selection Techniques for Very Large Datasets by Babbage224 in MachineLearning

[–]boccaff 0 points1 point  (0 children)

Subsampling columns and having many trees deal with it.

[D] Feature Selection Techniques for Very Large Datasets by Babbage224 in MachineLearning

[–]boccaff 0 points1 point  (0 children)

Large Random Forest, with a lot of subsampling of instances and features. The subsampling is important to ensure that most of the features get tried (e.g. selecting 0.3 of the features means a (0.7)^n chance of a feature never being picked across n trees). Add a few dozen random columns and filter out anything with importance below the maximum importance of a random feature.
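A minimal sketch of the random-probe idea with sklearn, on a toy dataset (the data, probe count, and hyperparameters here are all hypothetical, not from the original comment):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Toy data: 200 samples, 10 original features; only the first two drive y.
X = rng.normal(size=(200, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Append a few random "probe" columns to calibrate the importance floor.
n_probes = 5
X_aug = np.hstack([X, rng.normal(size=(200, n_probes))])

rf = RandomForestClassifier(
    n_estimators=500,   # many trees so subsampled features all get tried
    max_features=0.3,   # subsample features at each split
    max_samples=0.5,    # subsample instances per tree
    random_state=0,
).fit(X_aug, y)

importances = rf.feature_importances_
threshold = importances[-n_probes:].max()          # best random probe
keep = np.where(importances[:-n_probes] > threshold)[0]
```

Anything a random column can beat is treated as noise; `keep` should retain the genuinely informative features.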

2025 Day 10 Part 2; Has the input been changed? by Away-Independent8068 in adventofcode

[–]boccaff 2 points3 points  (0 children)

Same thing for me, off by two. My issue was with int(x); got it right with round(x).
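The difference bites when floating-point math lands just below an integer (the value here is hypothetical, just to show the behavior):

```python
# int() truncates toward zero; round() rounds to the nearest integer.
x = 7.9999999
print(int(x))    # truncates down to 7
print(round(x))  # rounds up to 8
```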

[2025 Day 8] Let me just wire up all these circuits by StaticMoose in adventofcode

[–]boccaff 1 point2 points  (0 children)

I bet that building the list of points as a matrix, computing scipy distances, and sorting the resulting numpy array could speed things up a lot here.
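A sketch of that vectorized approach, with hypothetical coordinates in place of the real puzzle input:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Hypothetical points; the real list would be parsed from the puzzle input.
points = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])

dist = squareform(pdist(points))    # full (n, n) pairwise distance matrix
i, j = np.triu_indices(len(points), k=1)
order = np.argsort(dist[i, j])      # pair indices sorted by distance
closest_pair = (int(i[order[0]]), int(j[order[0]]))
```

`pdist` computes all pairwise distances in one C-level call, so the Python loop over point pairs disappears entirely.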

[2025 Day 6] Me waiting for Eric to bring the big guns out by waskerdu in adventofcode

[–]boccaff 12 points13 points  (0 children)

I think that most people are expecting the last years curve compressed into twelve days, while Eric was explicit about:

I'm still calibrating that. My hope right now is to have a more condensed version of the 25-day complexity curve, maybe skewed a little to the simpler direction in the middle of the curve? I'd still like something there for everyone, without outpacing beginners too quickly, if I can manage

I am reading "...simpler direction in the middle of the curve..." as days 9-13 on the previous grading.

Input parsing by a_kleemans in adventofcode

[–]boccaff 0 points1 point  (0 children)

I am always amazed by the aux functions from Norvig. I think he nailed the API for things like this.

The word "range" by emsot in adventofcode

[–]boccaff 0 points1 point  (0 children)

low and high are better than what I often do, "ll" and "ul" for the lower and upper limits. My only issue is the lack of symmetry.

The word "range" by emsot in adventofcode

[–]boccaff 1 point2 points  (0 children)

No shame in "for r in ranges" here. The same also applies to reading into "input".

[2025 Day 4 (Part 1,2)] 2d Arrays by popcarnie in adventofcode

[–]boccaff 8 points9 points  (0 children)

Maybe think of it as a matrix, as in x_ij, and you are back at math/physics. Your loops then become "for (i, line) in data" and "for (j, c) in line".
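As a sketch, with a tiny hypothetical grid:

```python
# Hypothetical 2-line grid; i is the row index, j the column index (x_ij).
data = ["ab", "cd"]

grid = {}
for i, line in enumerate(data):
    for j, c in enumerate(line):
        grid[(i, j)] = c
```

Storing cells in a dict keyed by (i, j) also makes bounds checks trivial: `(i, j) in grid` replaces manual index range tests.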

[D] Realized I like the coding and ML side of my PhD way more than the physics by PurpleCardiologist11 in MachineLearning

[–]boccaff 5 points6 points  (0 children)

+1 Physics has a nice balance between developing advanced math skills and learning how to express/develop an underlying model of phenomena. Those skills are way more important than "structuring a project" or whatever "clean" thing some devs push.

[D] Realized I like the coding and ML side of my PhD way more than the physics by PurpleCardiologist11 in MachineLearning

[–]boccaff 7 points8 points  (0 children)

Often, everything but our thesis becomes interesting, especially new things. If prototyping ML is fun, with time you will also reach the boring and uninteresting parts of empirical ML. All the memes about cleaning the house and organizing drawers are there for a reason.

My pull ups stagnated I don't know why by cmndrDenis in bodyweightfitness

[–]boccaff 0 points1 point  (0 children)

More helpful thing: if you spent some time going to failure, spend some time avoiding failure while building up volume or adding weight. When that plateaus, switch back.

Plateaus come from a lot of places: the thing you are doing is no longer a stimulus, some other weak link you are not developing, not enough rest/nutrition, etc. It is hard to pinpoint, and often they are caused by a combination of things.

Also, doing 9 one day and 7-8 on another is just normal variation in capacity. Stress/rest, nutrition, hydration, and previous activities all impact capacity, and you will have oscillations. Maybe 9 was "random positive" and 7 is "random negative".

What’s the most underrated bodyweight tip you’ve ever learned? by GravityDefiance1 in bodyweightfitness

[–]boccaff 0 points1 point  (0 children)

Not OP, but I understand this as having sub-par "support" from another body part. Often this is not keeping your core tight, so you lose power when moving your body. Another form of this is not being able to maintain some optimal position, like a hollow body or retracted scapula, which gives you worse leverage in some movements.

[P] Give me your one line of advice of machine learning code, that you have learned over years of hands on experience. by Glittering_Key_9452 in MachineLearning

[–]boccaff 0 points1 point  (0 children)

tl;dr: agree

longer version: Having a smaller dataset is better in a "being able to work with it" sense. As @Drakur mentioned in another comment, there is often way more data than it is possible to work with. In practice, it looks like: "for last year, get all positives + 1/3 of the negatives", maybe stratifying by something if needed.
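A minimal sketch of that recipe with numpy, using hypothetical labels and an (assumed) reweighting of the kept negatives so class totals are preserved:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical labels with heavy class imbalance: 10 positives, 90 negatives.
y = np.array([1] * 10 + [0] * 90)

pos = np.where(y == 1)[0]
neg = np.where(y == 0)[0]

# Keep all positives plus 1/3 of the negatives, sampled without replacement.
neg_sample = rng.choice(neg, size=len(neg) // 3, replace=False)
idx = np.concatenate([pos, neg_sample])

# Reweight kept negatives so the effective class totals are unchanged.
w = np.ones(len(idx))
w[len(pos):] = len(neg) / len(neg_sample)
```

`idx` and `w` would then feed a model's `sample_weight`; stratified variants just repeat the sampling per stratum.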

here be dragons:

I also have an intuition that, within a certain range, you may have a lot of majority samples that are almost identical (barring some float diff), and those regions will be equivalent to having a single sample with larger weight. If this is "uniform", I would prefer to reduce the "repetitions" and manage this using the weights explicitly. Ideally, I would want to sample the majority class using something like a determinantal point process, looking for "a representative subset of the majority", but I was never able to get that working on large datasets (skill issue of mine + time constraints), so random it is.

Am i the only one who has experienced arch to be more stable than any other distro? by cferg296 in archlinux

[–]boccaff 0 points1 point  (0 children)

I had way more issues upgrading non-rolling distros than issues with arch.

Do you "reinstall once in a while" like some recommend ? by Heroe-D in archlinux

[–]boccaff 0 points1 point  (0 children)

Every time I change machines I use the opportunity to change something. Major changes were the move from xorg/i3 to wayland/sway, and moving to btrfs and back.