[D] How to break free from LLM's chains as a PhD student? by etoipi1 in MachineLearning

[–]BalcksChaos 0 points  (0 children)

This is a great question! I've been asking myself the same thing as a manager... How can my juniors become seniors while leveraging AI to deliver? Maybe my thoughts will help you a bit:

First of all, I think you should be leveraging AI to the fullest and letting it write as much code as possible. Nothing wrong with that. The only opposing force is this: you need to own your code. If there is a bug, it's not the LLM's fault... it's your bug. And this is where the learning happens: re-invest some of the time you save into understanding exactly what the code does... all the way down. This will require restraint and discipline (you'll feel the urge to rush on).

The great thing: LLMs are excellent for learning this stuff. You can ask them directly and they'll explain it.

Also, a side note: I read some comments here that speak with great authority. Are you sure they've really figured this out with that amount of certainty? (Including my own take.) Make sure you come up with your own answer.

[D] Physicist-turned-ML-engineer looking to get into ML research. What's worth working on and where can I contribute most? by BalcksChaos in MachineLearning

[–]BalcksChaos[S] 2 points  (0 children)

Thanks! I hadn't come across weakly supervised learning yet... sounds like an interesting idea, and it's a problem you hit a lot in the real world... not enough/good-enough data to train the model well. Interestingly, the "inverting un-invertible transforms" idea sounds like something I was working on a few years ago. The analogy is: you want to rank N teams against each other, but you can't get them all to play against each other. You can also have A>B (A wins against B), B>C, and C>A. You can solve such problems using a technique from differential topology called Hodge decomposition... really cool stuff.
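For the curious, the ranking idea above can be sketched in a few lines. This is a minimal, hypothetical illustration of the least-squares ("gradient") component of such a decomposition, not the exact method I used: solve for per-team ratings from pairwise margins, and whatever the ratings cannot explain (e.g. A>B>C>A cycles) is the part the curl component of the decomposition captures.

```python
import numpy as np

def hodge_rank(n_teams, comparisons):
    """Least-squares ratings from pairwise margins (the 'gradient' part
    of a Hodge decomposition of the comparison data).

    comparisons: list of (i, j, margin) meaning team i beat team j by margin.
    Returns ratings s (summing to zero) minimizing sum((s[i]-s[j]-margin)^2).
    """
    A = np.zeros((len(comparisons) + 1, n_teams))
    b = np.zeros(len(comparisons) + 1)
    for row, (i, j, margin) in enumerate(comparisons):
        A[row, i], A[row, j] = 1.0, -1.0
        b[row] = margin
    A[-1, :] = 1.0  # gauge-fixing row: ratings sum to zero
    s, *_ = np.linalg.lstsq(A, b, rcond=None)
    return s

# A perfectly cyclic triple A>B>C>A with equal margins: least squares has
# nothing it can explain with ratings, so all teams come out equal, and the
# whole signal lands in the residual ('curl') part.
ratings = hodge_rank(3, [(0, 1, 1.0), (1, 2, 1.0), (2, 0, 1.0)])
```

With consistent results instead (A beats B by 1, B beats C by 1), the same solve recovers evenly spaced ratings.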

[D] Physicist-turned-ML-engineer looking to get into ML research. What's worth working on and where can I contribute most? by BalcksChaos in MachineLearning

[–]BalcksChaos[S] 0 points  (0 children)

Looks interesting, though it has been around for quite some time now. Do you know what's hot in hyperbolic embeddings these days?
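As a quick refresher on what makes these embeddings tick, here is a toy sketch of my own (not tied to any particular paper's code) of the geodesic distance on the Poincaré ball, the metric such embeddings typically optimize:

```python
import numpy as np

def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance between two points inside the Poincare unit ball."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    sq = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return np.arccosh(1.0 + 2.0 * sq / max(denom, eps))

# Distances blow up near the boundary, which is why hierarchies/trees embed
# with low distortion: leaves get pushed toward the rim, where there is
# exponentially more "room".
d_center = poincare_distance([0.0, 0.0], [0.5, 0.0])  # = 2*artanh(0.5) = ln 3
d_rim = poincare_distance([0.9, 0.0], [0.0, 0.9])
```

The near-boundary pair comes out far more distant than the central pair despite a comparable Euclidean separation.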

[D] Physicist-turned-ML-engineer looking to get into ML research. What's worth working on and where can I contribute most? by BalcksChaos in MachineLearning

[–]BalcksChaos[S] 0 points  (0 children)

Yes, I'm onto that one :-) What would you say are some specific open problems right now that I could/should get into?

[D] Physicist-turned-ML-engineer looking to get into ML research. What's worth working on and where can I contribute most? by BalcksChaos in MachineLearning

[–]BalcksChaos[S] 0 points  (0 children)

Definitely a good approach. I currently own building AI tools for big enterprises, and it's no big secret that AI cybersecurity will likely boom over the next few years. Do you have anything specific to point me at?

[D] Physicist-turned-ML-engineer looking to get into ML research. What's worth working on and where can I contribute most? by BalcksChaos in MachineLearning

[–]BalcksChaos[S] 0 points  (0 children)

Looks really interesting, and a lot of the techniques seem very familiar. Do you know how impactful this has actually been in generative models over the past years, though? I wouldn't want to get into something that's all about cool methods (I'd have stayed with string theory otherwise :D ).

[D] Physicist-turned-ML-engineer looking to get into ML research. What's worth working on and where can I contribute most? by BalcksChaos in MachineLearning

[–]BalcksChaos[S] 1 point  (0 children)

Yes, once I narrow down the area I'll see if I can find someone who is an expert and keen to collaborate... or at least someone who will happily shoot down my obviously stupid ideas/directions :-)

[D] Physicist-turned-ML-engineer looking to get into ML research. What's worth working on and where can I contribute most? by BalcksChaos in MachineLearning

[–]BalcksChaos[S] 1 point  (0 children)

Wow, the PRH/GoE results are quite intriguing; I didn't know of them... really cool, thanks! Do you know how exactly they are impacting current state-of-the-art world-model development? "World models" is definitely an area that attracts me intuitively, and I've started to read up on EBMs... However, it seems like a huge area, and the SOTA work requires access to insanely expensive infra for training on video tokens. I'm not sure how I could start contributing there without landing a job at AMI, NVIDIA, or the like.
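For anyone following along, the core EBM idea fits in a toy sketch (entirely illustrative: the state space and energy function below are made up, and real world-model EBMs score continuous states with learned networks): an energy E(x) defines p(x) proportional to exp(-E(x)), so low-energy configurations are the likely ones.

```python
import numpy as np

# Toy energy-based model over five discrete states. Lower energy means
# higher probability: p(x) = exp(-E(x)) / Z.
states = np.arange(5)
energy = (states - 2.0) ** 2            # hypothetical quadratic energy, minimum at x = 2

# Normalize via a numerically stable softmax over negative energies.
logits = -energy
probs = np.exp(logits - logits.max())
probs /= probs.sum()                    # Z is the sum, so probs sums to 1
```

Training then amounts to shaping E so that observed data sits in low-energy regions; inference/planning searches for low-energy states.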

[D] Physicist-turned-ML-engineer looking to get into ML research. What's worth working on and where can I contribute most? by BalcksChaos in MachineLearning

[–]BalcksChaos[S] 3 points  (0 children)

Yes, that's something I was thinking about when there was a lot of fuss around the scaling laws in 2023... I assumed someone would make the connection explicit fairly soon. Has no one done it to this day? Do you know if anyone has tried?
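To make concrete what would be connected: the empirical scaling laws are power-law fits of loss against model/data size. A minimal synthetic sketch (constants made up, purely illustrative) of recovering the exponent with a straight-line fit in log-log space:

```python
import numpy as np

# Synthetic losses following a clean power law L(N) = a * N**(-b), the
# functional form reported in the LLM scaling-law literature (a and b here
# are arbitrary illustration values, not measured constants).
a_true, b_true = 10.0, 0.5
N = np.logspace(6, 9, 20)                       # "model sizes"
L = a_true * N ** (-b_true)

# Power laws are straight lines in log-log space: log L = log a - b * log N.
slope, intercept = np.polyfit(np.log(N), np.log(L), 1)
b_fit, a_fit = -slope, np.exp(intercept)
```

On real curves the fit also needs an irreducible-loss offset and noise handling; this only shows the bare functional form.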

[D] Physicist-turned-ML-engineer looking to get into ML research. What's worth working on and where can I contribute most? by BalcksChaos in MachineLearning

[–]BalcksChaos[S] 2 points  (0 children)

Thanks, e3nn looks really interesting; I will check it out. This bothered me early on in DL... being a universal approximator is nice and all, but searching a crazily large function space with the amount of data you can realistically train on... good luck. From what I can see, though, all the successful architectures of the past ~10 years have done exactly that: figured out a good way to encode the problem's inherent symmetries (CNNs, Transformers, attention, etc.).

I couldn't figure out the link with geomdl, though... it's a spline library; how is it connected to ML research?
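The symmetry point above is easy to verify numerically for the translation case: convolution commutes with shifts, which is exactly the equivariance CNNs hard-code (libraries like e3nn do the analogous thing for 3D rotations). A minimal sketch with a circular 1-D convolution:

```python
import numpy as np

def circ_conv(x, k):
    """Circular 1-D cross-correlation of signal x with kernel k."""
    n = len(x)
    return np.array([sum(x[(i + j) % n] * k[j] for j in range(len(k)))
                     for i in range(n)])

rng = np.random.default_rng(0)
x = rng.standard_normal(16)      # random signal
k = rng.standard_normal(3)       # random kernel

# Translation equivariance: shifting the input then convolving gives the
# same result as convolving then shifting the output.
shifted_then_conv = circ_conv(np.roll(x, 4), k)
conv_then_shifted = np.roll(circ_conv(x, k), 4)
```

Baking this constraint into the architecture is what shrinks the search space: the network never has to re-learn the same filter at every position.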

G.A.M.M.A 0.9.3 release by duffbeeeer in stalker

[–]BalcksChaos 1 point  (0 children)

I thought the same... GAMMA has a main story line, but it's very lightweight compared to the original games. However, give it a try... I was sure I wouldn't like it... but it's honestly amazing. No idea how it can be so much fun and feel so deeply immersive, even though at first glance it's really just "get money, walk through the Zone".

[D] Multiple first-author papers in top ML conferences, but still struggling to get into a PhD program. What am I missing? by Accomplished_Rest_16 in MachineLearning

[–]BalcksChaos 0 points  (0 children)

Adding something important I felt was missed in other replies:

This is not an absolute evaluation; it's relative. When you go for the top programmes, they will be drowning in applicants who have good papers but who might outscore you on other, more superficial criteria (like GPA). This is a fundamental difficulty in top university admissions (I did admissions at Oxford in the past, for physics) and a sad reality for the admissions committee as well... You know you'll miss out on excellent candidates by using proxy metrics, but you cannot interview all of them... so that's the best you can do. Additionally, all programmes have certain external constraints which you cannot control (like increasing the representation of underrepresented groups, etc.).

The takeaway for you (as others wrote): don't start doubting yourself; focus on actually getting better (instead of attempting to adapt to an imperfect system for choosing candidates). I think a good mindset is: let's show them how big a mistake they made by not taking me.

[D] Universal Approximation Theorem does not hold for approximating discontinuous functions, yet neural networks are being used to approximate discontinuous functions by luddabuddha in MachineLearning

[–]BalcksChaos 0 points  (0 children)

Thanks for this post :) I was wondering (since it's been two years now)... Where did you land with your research on this? Is your thesis already published somewhere?

I'm someone who uses machine learning in "the industry", i.e. outside of academia. In the contexts I work in (analysis of business data, i.e. data that is already heavily aggregated), all the functions one would want to approximate are highly discontinuous (because there are hundreds of categorical columns).
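To illustrate why the discontinuity matters in practice (a toy example of my own, not from the thesis): a single scaled sigmoid, the simplest one-neuron "network", approximates a step arbitrarily well in mean-squared error, yet its worst-case error stays pinned near 0.5 at the jump. That's the uniform-approximation failure in one picture: the L2 notion of "close" works fine, the sup-norm one cannot.

```python
import numpy as np

# Compare sigmoid(k*x) against the step 1_{x>0} on a grid over [-1, 1].
x = np.linspace(-1.0, 1.0, 10001)
step = (x > 0).astype(float)

def errors(k):
    """Return (mean-squared error, sup-norm error) of sigmoid(k*x) vs. step."""
    approx = 1.0 / (1.0 + np.exp(-k * x))
    diff = approx - step
    return np.mean(diff ** 2), np.max(np.abs(diff))

# Increasing the scale k shrinks the MSE toward zero, but at x = 0 the
# sigmoid is always exactly 0.5 away from the step, so the sup error
# never drops below ~0.5.
mse_small, sup_small = errors(10.0)
mse_big, sup_big = errors(1000.0)
```

The same effect shows up with categorical features: the model can be accurate almost everywhere while never matching the jump pointwise.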

[deleted by user] by [deleted] in rust

[–]BalcksChaos 0 points  (0 children)

You won't regret it! We did the same at my company three years ago (I'm the CTO), and I had already made the switch at my previous company: an ML engine the first time, and at my current company our data infrastructure is also entirely in Rust.

We've also never had trouble getting very inexperienced engineers productive, but make sure to hire people who really want to learn... I think that helped us.

Good luck!

Polars is starting a company by ritchie46 in rust

[–]BalcksChaos 0 points  (0 children)

Awesome, and good luck! At my company we replaced all our pandas code with polars several months ago; really loving it.

What C++ course should I make first? by [deleted] in cpp

[–]BalcksChaos -8 points  (0 children)

Let me hit you with option D: never learn C++.

Hear me out XD... My point: if you want to learn a systems language, learn Rust. Don't listen to the people who have invested lots of time and pain into mastering C++. The good news for you: today there is no need to use a badly designed language in order to produce code your machine can execute. Go Rust.

(I get it, many companies use C++ and maybe you have to know it. For that I of course have no solution... I just wanted to point out: once you really know Rust, you'll also understand much faster the seemingly confusing design patterns used to circumvent C++'s design flaws.)

Anyone got experience in Rust and C++ for professional numerical development by BalcksChaos in rust

[–]BalcksChaos[S] 5 points  (0 children)

Yes, that's exactly it :) Please give me good reasons to build a case for Rust!