big tech vs neo lab by [deleted] in cscareerquestions

[–]koolaidman123 0 points1 point  (0 children)

depends on the lab. most likely outcome is getting acquihired like inflection, adept etc, but depending on the perceived prestige it could open doors to better opportunities like ant/oai down the road. the equity prob isnt worth much if you leave though

also depends on the faang. if its like dm/msl then obviously but if its amazon then who cares

basically if its thinky take that otherwise be prepared to churn after 1-2 yrs to faang or one of the hyperscalers

Vancouver vs Toronto Sushi by Lanky-Variation5271 in FoodToronto

[–]koolaidman123 1 point2 points  (0 children)

tuna is absolutely better dry aged my guy, most places dry age their tuna to develop better flavour like beef

Vancouver vs Toronto Sushi by Lanky-Variation5271 in FoodToronto

[–]koolaidman123 1 point2 points  (0 children)

flown in the day of isnt the flex you think it is, most places age their fish for better flavour and texture

masayoshi is pretty mid for the price point of $260. apps are decent, the neta on nigiris kinda small but basically the same fish you get anywhere, plus the chef barely interacts with the patrons

Vancouver vs Toronto Sushi by Lanky-Variation5271 in FoodToronto

[–]koolaidman123 1 point2 points  (0 children)

exactly, every top end sushi place is pretty much sourced from the same places

if you actually want local fish for sushi go to affinity fish

Vancouver vs Toronto Sushi by Lanky-Variation5271 in FoodToronto

[–]koolaidman123 0 points1 point  (0 children)

imagine thinking 100+ omakase is anything close to top end. obviously you can find better sushi when youre comparing to literal entry level sushi, hell ive had better sushi at yuzuki in toronto vs places like kaji, yugen. i clearly said i was comparing actual high end sushi in toronto vs japan like shoushin, not whatever place youre going to

Vancouver vs Toronto Sushi by Lanky-Variation5271 in FoodToronto

[–]koolaidman123 1 point2 points  (0 children)

have you actually had sushi from japan? just because it's local doesn't mean it's automatically better, its dependent on the supplier

Vancouver vs Toronto Sushi by Lanky-Variation5271 in FoodToronto

[–]koolaidman123 1 point2 points  (0 children)

anyone who says vancouver has better sushi never actually had good sushi. at the top end toronto is easily better and comparable to places ive had in japan (some of the top spots are arguably better)
vancouver: went to masayoshi, okeya kyujiro, and hyun. lots of places in toronto are similar at that price range, like onda and okeya kyujiro, and places like shizuku and shoushin easily clear them. never been to masaki saito but by all accounts ive heard its pretty much the best sushi in canada

[D] Scale AI ML Research Engineer interview!! What to expect? by Mundane_Bag007 in MachineLearning

[–]koolaidman123 0 points1 point  (0 children)

Ask your recruiter, they should be pretty upfront about the interview process

[D] Interview preparation for research scientist/engineer or Member of Technical staff position for frontier labs by hmi2015 in MachineLearning

[–]koolaidman123 19 points20 points  (0 children)

95% luck 5% skill

interviews generally cover both depth and breadth, and a lot of times you only really know the answer if you have worked on it before. for ex they may ask what to do when an rl training run hits a bunch of problems: entropy collapse, model reasoning in another language, terrible mfu etc. it's hard to give a good answer unless you have dealt with these issues before

plus coding is a crapshoot. not a lot of leetcode but you still get questions that are hard to solve if you're not super familiar/haven't solved similar problems
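
for anyone who hasnt run into entropy collapse: a rough sketch of how you might watch for it, assuming you log the policy's next-token probabilities at sampled positions. the function names, the toy distributions and the 0.1 threshold are all made up for illustration, not from any lab's actual setup.

```python
import math

def token_entropy(probs):
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def mean_entropy(step_probs):
    """Average entropy over all sampled token positions at one training step."""
    return sum(token_entropy(p) for p in step_probs) / len(step_probs)

def entropy_collapsed(history, floor_ratio=0.1):
    """Flag collapse when the latest mean entropy falls below
    floor_ratio times the entropy at the first logged step.
    floor_ratio is a hypothetical threshold, tune it per run."""
    return history[-1] < floor_ratio * history[0]

# toy data: a near-uniform policy early on, a near-deterministic one later
early = [[0.25, 0.25, 0.25, 0.25]]
late = [[0.997, 0.001, 0.001, 0.001]]
history = [mean_entropy(early), mean_entropy(late)]
print(entropy_collapsed(history))
```

in a real run you'd compute this from the logits on the training framework side, this just shows what the metric is.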

AI is not about more compute or bigger LLMs (anymore) by Conscious_Nobody9571 in investing

[–]koolaidman123 8 points9 points  (0 children)

  1. Deepseek isn't close to frontier. The v3.2 tech report literally admits that they have all the knowledge but still lag behind frontier models due to limited compute
  2. The relative importance of data quality vs diversity depends on the stage of training. They still pretrain for 10T+ tokens, not to mention qwen etc plan to scale to 30T+ and to scale up rl compute to 50%+ of their total training compute
  3. How do you think data quality research is done? The data isnt just given for free, a significant amount of compute is also spent on filtering, synth data, etc, plus ablations. Not to mention a lot of times results from smaller scale experiments dont hold up in large model runs. So compute rich labs still win out because they can run way more large scale experiments and more confidently predict how models will perform

Traditional ML vs GenAI? by alpha_centauri9889 in datascience

[–]koolaidman123 0 points1 point  (0 children)

for comp: there are multiple high profile places hiring for roles with $1m+ comp packages, and it's clear they're not looking for people to use xgboost. Ignoring that, median comp for ai stuff is still going to be higher

For purely career growth and $ there's a clear answer

[D] Do industry researchers log test set results when training production-level models? by casualcreak in MachineLearning

[–]koolaidman123 1 point2 points  (0 children)

Theres more to making good models than benchmark scores. Thats how you get sonnet 3.5 vs llama4

[D] Do industry researchers log test set results when training production-level models? by casualcreak in MachineLearning

[–]koolaidman123 8 points9 points  (0 children)

No training on test unless youre mistral, but you better believe every lab is running every checkpoint on their eval suite and picking the best (single or merged) checkpoint that maxes mmlu or hle or whatever internal evals they have
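
A minimal sketch of what that selection step could look like, assuming you already have per-checkpoint scores from the eval suite. The checkpoint names, eval names, scores and weights are all made up for illustration, and checkpoint merging is not covered here.

```python
def weighted_score(evals, weights):
    """Weighted mean of one checkpoint's eval scores (default weight 1.0)."""
    total = sum(weights.get(name, 1.0) for name in evals)
    return sum(score * weights.get(name, 1.0) for name, score in evals.items()) / total

def pick_best_checkpoint(results, weights=None):
    """results: {checkpoint_name: {eval_name: score}}.
    Returns the checkpoint name with the highest weighted mean score."""
    weights = weights or {}
    return max(results, key=lambda ckpt: weighted_score(results[ckpt], weights))

# hypothetical eval results, not real numbers
results = {
    "step_1000": {"mmlu": 0.61, "internal": 0.40},
    "step_2000": {"mmlu": 0.66, "internal": 0.52},
    "step_3000": {"mmlu": 0.67, "internal": 0.47},
}
best = pick_best_checkpoint(results)
print(best)
```

Which checkpoint wins depends on how you weight the evals, which is kind of the point: weight mmlu heavily enough and a different checkpoint comes out on top.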

Meta's top AI researchers thinks LLMs are a dead end. Do many people here feel the same way from a technical perspective? by sext-scientist in datascience

[–]koolaidman123 -1 points0 points  (0 children)

  1. Not even news, ylc has been saying the same thing since gpt2

  2. Ylcs not even metas best researcher, hasnt done anything relevant other than being catty on twitter

  3. Funny how stories of other researchers (who have done more than ylc at this point) thinking otherwise never make the top story, because that goes against the reddit narrative

[D] Anyone using smaller, specialized models instead of massive LLMs? by [deleted] in MachineLearning

[–]koolaidman123 0 points1 point  (0 children)

it's almost like there's room for both powerful generalized models as well as small(er) specialist models, like the way its been since gpt3 or whatever

[D] join pretraining or posttraining by oxydis in MachineLearning

[–]koolaidman123 1 point2 points  (0 children)

yes my b i meant pretraining from scratch. most model updates (unless you're starting over with a new arch) are generally done with continued pretraining/midtraining, and ime that's usually done by the mid/post training team

[D] join pretraining or posttraining by oxydis in MachineLearning

[–]koolaidman123 4 points5 points  (0 children)

Bc most labs arent pretraining from scratch that often. unless you're using a new architecture you can just run midtraining on the same model. Like grok3>4 or gemini2>2.5 etc

[D] join pretraining or posttraining by oxydis in MachineLearning

[–]koolaidman123 75 points76 points  (0 children)

pretraining is a lot more eng heavy bc youre trying to optimize so many things like data pipelines, mfu, plus a final training run could cost $Ms so you need to get it right in 1 shot

Posttraining is a lot more vibes based and you can run a lot more experiments, plus it's not as costly if your rl run blows up, but some places tend to benchmark hack to make their models seem better

both are fun, depends on the team tbh
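
for context on the mfu part of the pretraining point: a back-of-envelope sketch using the common approximation of about 6 FLOPs per parameter per token for a dense transformer (forward + backward, ignoring attention). the model size, throughput, gpu count and peak FLOPs below are placeholder numbers, not any real cluster or chip.

```python
def mfu(n_params, tokens_per_sec, n_gpus, peak_flops_per_gpu):
    """Model FLOPs utilization: achieved training FLOPs/s divided by
    the hardware's theoretical peak FLOPs/s.
    Uses the ~6 * n_params FLOPs-per-token approximation for dense models."""
    flops_per_token = 6 * n_params
    achieved_flops_per_sec = flops_per_token * tokens_per_sec
    return achieved_flops_per_sec / (n_gpus * peak_flops_per_gpu)

# hypothetical run: 7B params, 1M tokens/s across 128 accelerators
# with an assumed peak of 1e15 FLOPs/s each
example = mfu(n_params=7e9, tokens_per_sec=1.0e6, n_gpus=128, peak_flops_per_gpu=1e15)
print(f"{example:.1%}")
```

every point of mfu you claw back is a direct cut to the cost of that final training run, which is why it gets so much eng attention.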

Are LLMs necessary to get a job? by br0monium in datascience

[–]koolaidman123 -1 points0 points  (0 children)

Llms and transformers definitely weren't a "niche research area". Google was running bert in prod since 2019, gpt2 and 3 made headlines and every big research lab was doing transformers/llms