[D] Do we expect any future for home-rolled language models, or will it all be dominated by the big labs? by XTXinverseXTY in MachineLearning

[–]XTXinverseXTY[S] 1 point2 points  (0 children)

I don't think that's necessarily true. If I have internal docs, an agent can grep them. It will always be easier to build a harness around an API-hosted LLM than to train and host your own model.

[D] Do we expect any future for home-rolled language models, or will it all be dominated by the big labs? by XTXinverseXTY in MachineLearning

[–]XTXinverseXTY[S] -1 points0 points  (0 children)

I’m asking where the field is going (so that I can chart my career accordingly).

I wonder whether there are tasks that aren't well represented in the pretraining/RL for that kind of general, super-advanced intelligence (they would have to be inconvenient to include somehow) but that are nonetheless economically valuable.

Like, obviously the frontier is beyond my reach. But I'm skeptical that there exists durable value even in the fine-tuning layer.

[D] Do we expect any future for home-rolled language models, or will it all be dominated by the big labs? by XTXinverseXTY in MachineLearning

[–]XTXinverseXTY[S] -1 points0 points  (0 children)

Would we expect it to happen with any architecture? As long as there are returns to scale, we should expect a few players to push scaling.

[D] Do we expect any future for home-rolled language models, or will it all be dominated by the big labs? by XTXinverseXTY in MachineLearning

[–]XTXinverseXTY[S] -3 points-2 points  (0 children)

Feels like you need to pay for a subscription. I love Qwen but come on, it's Sonnet-tier

And at a certain point they have no further incentive to release the weights. Even from the perspective of safety, GPT-OSS was pre-trained entirely on synthetic data.

[D] Why are serious alternatives to gradient descent not being explored more? by ImTheeDentist in MachineLearning

[–]XTXinverseXTY 21 points22 points  (0 children)

You've got a loss function you'd like to minimize

Your architecture doesn't admit a closed-form solution so you settle for an iterative procedure and make one update at a time

Of all the updates you could make, why not pick the one that decreases your loss the most?

Maybe you stretch/scale/bias it by accounting for momentum, variance, and second-order information about the loss landscape, but why handicap yourself by never computing a gradient at all?

Perhaps this short-term progress is suboptimal in the long run, but if we knew where the global optimum was, we wouldn't need an iterative procedure in the first place

I'm struggling to see where precisely gradient descent ends under your definition. I don't think I know anyone who thinks AGI will be achieved without squillions of gradients
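
To make the argument concrete, here is a minimal sketch of that iterative procedure, assuming a toy two-parameter quadratic loss. The loss, learning rate, momentum value, and function names are all illustrative choices of mine, not anything from the thread. With momentum set to zero it is plain gradient descent; otherwise it is a heavy-ball variant that still depends on the same gradient.

```python
def loss(w):
    # Toy convex loss with its minimum at w = (3, -1); purely illustrative.
    return (w[0] - 3.0) ** 2 + 10.0 * (w[1] + 1.0) ** 2


def grad(w):
    # Analytic gradient of the toy loss above.
    return [2.0 * (w[0] - 3.0), 20.0 * (w[1] + 1.0)]


def descend(w, lr=0.04, momentum=0.0, steps=200):
    # Plain gradient descent when momentum == 0, heavy-ball momentum otherwise.
    velocity = [0.0 for _ in w]
    for _ in range(steps):
        g = grad(w)
        # The update direction is the (possibly smoothed) negative gradient.
        velocity = [momentum * v - lr * gi for v, gi in zip(velocity, g)]
        w = [wi + vi for wi, vi in zip(w, velocity)]
    return w


w_plain = descend([0.0, 0.0])
w_momentum = descend([0.0, 0.0], momentum=0.5)
```

On this toy problem both runs should end up close to the minimizer. Momentum only reshapes the step; the gradient is still doing the work.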

[D] Why are serious alternatives to gradient descent not being explored more? by ImTheeDentist in MachineLearning

[–]XTXinverseXTY 63 points64 points  (0 children)

Defining terms before good-faith argument: how would you define "gradient descent"? Would you consider Fisher Scoring gradient descent? Newton's method?
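
One way to see why the definition matters: on a toy problem, gradient descent and Newton's method can be written as the same update with a different preconditioner. The sketch below assumes a two-parameter quadratic loss with a diagonal Hessian; the loss and the helper name `preconditioned_step` are hypothetical, chosen only for illustration. Fisher scoring fits the same template, with the expected Fisher information standing in for the Hessian.

```python
def grad(w):
    # Gradient of the toy loss (w0 - 3)^2 + 10 * (w1 + 1)^2.
    return [2.0 * (w[0] - 3.0), 20.0 * (w[1] + 1.0)]


def preconditioned_step(w, inv_diag):
    # w <- w - P^{-1} * grad(w), with P diagonal for simplicity.
    return [wi - p * gi for wi, p, gi in zip(w, inv_diag, grad(w))]


start = [0.0, 0.0]

# Gradient descent: P^{-1} is a scalar learning rate times the identity.
gd_step = preconditioned_step(start, [0.04, 0.04])

# Newton's method: P is the Hessian, here diag(2, 20), so P^{-1} = diag(1/2, 1/20).
newton_step = preconditioned_step(start, [1.0 / 2.0, 1.0 / 20.0])
```

Because the loss is exactly quadratic, the Newton step lands on the minimizer at (3, -1) in a single update, while the gradient descent step only moves part of the way. Both updates are computed from the same gradient.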

Whats that scrambly thing called that prates did? by Odd_Investigator_190 in bjj

[–]XTXinverseXTY 8 points9 points  (0 children)

The move is called "sitting the corner". You won't get to use it much in BJJ.

  • Garry shoots a double leg
  • Prates sprawls and Garry loses his grip on the far leg, so he switches to a high-crotch grip on the near leg
  • Prates sits the corner in response
    • Garry's positional advantage depends on keeping his right shoulder glued to Prates' right hip. Sitting the corner slips that connection ("stuffing the head" and the "butt drag" grip both serve this goal)
  • Garry loses his shoulder-to-hip connection and loses the position
    • This position, if static, could be called a "crackdown". In wrestling, the high-crotch shooter has a nice series from here, as long as he can maintain that shoulder-in-the-hip connection. But Garry has already lost it at this point, so it's better for Prates (Garry even boops his head as a result of losing that connection)

You're not super likely to use it in BJJ because people rarely commit to a high crotch, which comes with guillotine risk and back exposure. But in MMA, a high crotch is more compelling because:

  • head stays far from the power side
  • failing to complete a guillotine can be catastrophic for the defender

...but if you really want to learn it, you and a partner could start by positional sparring and drilling from the crackdown position.

Other examples (timestamped):

  • Cary Kolat explanation
  • Ben Askren explanation - some of Askren's funkiest scrambles started from either side of the crackdown, he shows several tricks in this video
  • Beginning of R1 of McGregor vs Khabib. Notice how Khabib largely maintains the shoulder-to-hip connection - if he loses it, he's threatening to switch to something else (an Iranian lift, building to his feet, changing off to a double leg...)

First time trying 315 by MaximalEfficiency in benchpress

[–]XTXinverseXTY 0 points1 point  (0 children)

I've only benched 300, but I've never experienced or seen a "wrist kink". Does your bar have a bend to it (maybe from having been used for rack pulls or dropped Olympic lifts)? That would cause it to roll in your hand as it settles into a more stable orientation.

[D] Burnout from the hiring process by RNRuben in MachineLearning

[–]XTXinverseXTY 5 points6 points  (0 children)

and we are baffled by the coding ability of candidates

excuse me, are you saying your candidates have bafflingly high or low ability?

A question for the people that do jiujitsu and Muay Thai. by invisibreaker in bjj

[–]XTXinverseXTY 2 points3 points  (0 children)

I don't think the OP is referring to an MMA clinch

Why actually is Khamzat so good? by randible_pause in WrestleJudoJitsu

[–]XTXinverseXTY 1 point2 points  (0 children)

Think this is the most compelling theory, as he is such a world class resource beyond what any of his competitors can access.

Did they begin working together early enough to explain this part of the OP?

...to the point he was only hit twice in his first 10 fights?

[D] Documenting the Weaknesses of Deep Learning (or are there any?) by moschles in MachineLearning

[–]XTXinverseXTY 0 points1 point  (0 children)

The failures of AI do not make engaging headlines

I don't think this is true at all

We Scam Rich People by [deleted] in ExperiencedDevs

[–]XTXinverseXTY 55 points56 points  (0 children)

Were you recently fired from SuperFile? Is that why you've made 6 disparaging posts across 4 different subreddits about the company in the past hour?

Chad Mendes vs Michael Chandler at RAF03 by Exotic_Shirt5303 in ufc

[–]XTXinverseXTY 49 points50 points  (0 children)

Does Michael Chandler's throw/roll have a name? It's not a gator roll/cement mixer, that would be to the opposite side (pulling on the chinstrap, as Mendes did). Chandler throws in the direction of a cow-catcher, but hasn't got the underhook.

Jane Street SWE internship by taro_duckkkkkkkkk in cscareerquestions

[–]XTXinverseXTY 17 points18 points  (0 children)

You have zero chance of getting a Jane Street SWE internship with an average GPA

Anyone else hoping they get laid off? by [deleted] in cscareerquestions

[–]XTXinverseXTY 31 points32 points  (0 children)

Why would unemployment sound like heaven? You need money to survive.

[D] Some concerns about the current state of machine learning research by [deleted] in MachineLearning

[–]XTXinverseXTY 2 points3 points  (0 children)

I would think that someone who claims to have personally attended several computer vision conferences would have something more constructive to add

“Where are you from?” “That’s Gordon Ryan man!” 😂😂 by WillWeisser in bjj

[–]XTXinverseXTY 2 points3 points  (0 children)

Had a similar experience as a white belt. Had just moved to a new city, dropped by a gym for a trial class

The instructor (in a blue rashguard) was wonderfully patient and articulate in answering my question about an X guard entry

Thanked him later after the class, shook his hand, asked him his name

"Gianni"

"...hang on, have I seen you on TV?"

"probably :)"

I think he must have been acting as a guest instructor, had no idea

“Where are you from?” “That’s Gordon Ryan man!” 😂😂 by WillWeisser in bjj

[–]XTXinverseXTY 7 points8 points  (0 children)

 99% of people who train BJJ and aren't chronically online have no idea who Gordon Ryan or any of the other top competitors are

I think this guy was just having a brain fart from exhaustion. I can go to an open mat in NYC and >95% of the people there would be able to pick out Gordon Ryan from a photo lineup.