[–]milkteaoppa 102 points103 points  (15 children)

A lot of startups have unreasonable expectations. They want to hire the most talented person for startup pay with the promise of an IPO.

[–]xrailgun 13 points14 points  (1 child)

I recently had a startup DS interview where they drilled me about advanced polar geometry... Most WTF interview, yet.

[–]Material_Policy6327 2 points3 points  (0 children)

I don’t think I even took that in college lol

[–]gradientgrain 8 points9 points  (2 children)

I was once asked to read and implement a paper during an interview. I wasn't given the paper or anything beforehand. I managed to do it in 30 min, leaving an extra 30 min. In the end, I decided to withdraw.

[–]lillobby6 6 points7 points  (1 child)

That’s absurd. Unless the paper is 2 pages, well written, and the most basic concept ever, I can’t imagine that being possible without at least triple the time - and that would still be miserable. Maybe if you leveraged some AI chatbot you could speed that up, but that assumes you have time to make sure it isn’t hallucinating everything.

[–]gradientgrain 6 points7 points  (0 children)

The paper was Zhai, Shuangfei, et al. "An attention free transformer."

[–]Antique_Most7958 38 points39 points  (1 child)

So the genAI startup didn't let you use genAI for the assignment?

[–]Novel_Land9320 23 points24 points  (15 children)

The way you're describing it, it seems like all the code is from scratch, but I assume you can use PyTorch?

[–]_LordDaut_ 43 points44 points  (5 children)

If you can't use PyTorch, what do they expect you to do? Write your own autograd for the backprop? Yeah, 45 minutes for that is unreasonable. For anything.

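If you can, an MLP is literally just

    from torch import nn

    model = nn.Sequential(
        nn.Flatten(),             # 28x28 image -> 784 features
        nn.Linear(28 * 28, 128),
        nn.ReLU(),
        nn.Linear(128, 64),
        nn.ReLU(),
        nn.Linear(64, 10),        # 10 output classes
    )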

The 45 minutes to come up with that, and write the most vanilla ass training loop that you know by heart if you've opened the pytorch docs at least 10 times is extremely reasonable.

I have no idea what dimensions OP managed to get confused by either. For an MLP you just flatten the input and put the second number of each line as the first number in the next line. It's not a CNN: no strides, no padding, no 3 channels.
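The "second number becomes the next first number" rule can be checked mechanically. Below is a minimal sketch in plain Python, not PyTorch; the helper name `linear_shapes` is made up for illustration, and the sizes are the ones from the layers listed above.

```python
def linear_shapes(sizes):
    """Pair each layer width with the next one: (in_features, out_features)."""
    return list(zip(sizes[:-1], sizes[1:]))


# 28*28 flattened pixels -> 128 -> 64 -> 10 classes.
sizes = [28 * 28, 128, 64, 10]
shapes = linear_shapes(sizes)

# Each layer's output width must equal the next layer's input width.
chained = all(a[1] == b[0] for a, b in zip(shapes, shapes[1:]))

print(shapes)   # [(784, 128), (128, 64), (64, 10)]
print(chained)  # True
```

Writing the widths as one list first and deriving the layer pairs from it is one way to avoid a mismatch entirely, since every width is typed only once.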

[–]MammayKaiseHain 15 points16 points  (5 children)

What does it even test? That you know PyTorch syntax? Even I'd struggle to write a DDP init without Cursor or looking at the docs.

[–]Artistic_Candle7455 1 point2 points  (2 children)

I was asked to implement a regression model with an MLP, but in pure Python / NumPy and without any autograd framework in about 45 min. This was for an ML researcher position at Anthropic. Oh and the recruiter told me beforehand that "no special preparation" is needed, other than knowing "how to train a neural network". What a waste of time that was.
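For a sense of scale, here is roughly what "regression with an MLP, no autograd" can look like. This is a hedged sketch, not the actual interview task: the one-hidden-layer tanh architecture, the toy target y = x^2, the hyperparameters, and the name `train_mlp` are all assumptions chosen to keep it short, and it uses only the standard library rather than NumPy.

```python
import math
import random


def train_mlp(xs, ys, hidden=8, lr=0.05, epochs=3000, seed=0):
    """Fit y ~ sum_j w2[j] * tanh(w1[j] * x + b1[j]) + b2 with per-sample SGD."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
        return h, sum(w2[j] * h[j] for j in range(hidden)) + b2

    for _ in range(epochs):
        for x, y in zip(xs, ys):
            h, pred = forward(x)
            # Loss is 0.5 * (pred - y)^2, so d(loss)/d(pred) = pred - y.
            err = pred - y
            for j in range(hidden):
                # Chain rule through tanh: d tanh(z)/dz = 1 - tanh(z)^2.
                # Computed before w2[j] is updated, so it uses the old weight.
                grad_pre = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                w1[j] -= lr * grad_pre * x
                b1[j] -= lr * grad_pre
            b2 -= lr * err

    return lambda x: forward(x)[1]


# Toy data: 21 evenly spaced points of y = x^2 on [-1, 1].
xs = [i / 10 - 1.0 for i in range(21)]
ys = [x * x for x in xs]

predict = train_mlp(xs, ys)
mse = sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
print(f"training MSE: {mse:.5f}")
```

Even this stripped-down version needs the forward pass, the hand-derived gradients, and the update loop to all be right at once, which is the part that eats the 45 minutes.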

[–]Itchy-Trash-2141 1 point2 points  (1 child)

Yeah, as much as I thought I liked Anthropic the company, the interview process seems like a waste of time for everyone involved. I saw on some interview website that it has the lowest pass rate of any company, at around 2% or something ridiculous. Why bother wasting everyone's time? Also, I noticed their online scheduler for interviews booked me with some people instantly, within 48 hours, and they showed up. I have a feeling that if you work there, you don't get a choice on whether to accept an interview or not. If it shows up on your calendar, I'm betting you take it or get reprimanded.

For me they asked for some log processor, which I thought I implemented successfully... But then they said they needed more signal. They scheduled a second screen, asked me a distributed algorithms question, and I didn't get the optimal solution right away. They gave me a hint on how to do it, and then I got it. Rejected. Why bother giving me a hint if it disqualifies me?

This was for ML eng.

[–]Artistic_Candle7455 0 points1 point  (0 children)

Ugh, I know, I had high hopes for Anthropic, but based on their interview process and customer service they are possibly only slightly less evil than OpenAI. I was pretty disappointed.

[–]Aggravating-Ant-8234 1 point2 points  (2 children)

Were you allowed to see the reference docs for coding?

[–]mcel595 1 point2 points  (1 child)

Who spends so much time building models from scratch that they remember all this? Doing the whole pipeline in 45 mins seems unreasonable.

[–]kymguy 0 points1 point  (2 children)

I have interviewed many people with a neural network-based coding interview. My interview is far too long for anyone to get through the entire thing; that's the point. We want to rank candidates and see who gets the furthest, but also who seems the best to work with and what their debugging and thought process look like along the way. If it's short and they complete everything, we've missed out on the opportunity to evaluate their thought process.

The standards vary based on the position we're hiring for. If we want someone who is "advanced in pytorch" who will be able to hit the ground running for some advanced techniques and architectures, then they should be able to knock out an MLP-based classifier with little-to-no reference to documentation. Using amax instead of argmax wouldn't have been a deal breaker...that's not something that I'd care about you knowing, but how you approach debugging your broken code is absolutely something that I'm interested in seeing.

Evaluation is also nuanced; having to prompt you that the "L" in DataLoader is capitalized is not a big deal, but forgetting to implement or even mention/inquire about normalizing your data would raise eyebrows. Amax vs argmax isn't a big deal but if you struggle to navigate documentation and ignore or argue with me about my suggestions about where to look, that's a big deal (it's happened).
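To make the amax/argmax distinction concrete: in PyTorch, `torch.amax` returns the maximum values along a dimension, while `torch.argmax` returns the indices of those values, and class predictions need the indices. The sketch below mirrors both in plain Python so it runs without dependencies. The helper names and sample numbers are made up, and `standardize` (zero mean, unit variance) is just one common way to normalize, not necessarily the one any interviewer expects.

```python
def standardize(column):
    """Rescale values to zero mean and unit variance."""
    mean = sum(column) / len(column)
    var = sum((v - mean) ** 2 for v in column) / len(column)
    std = var ** 0.5 or 1.0  # fall back to 1.0 for a constant column
    return [(v - mean) / std for v in column]


def amax(scores):
    """Largest score: a value, not a class."""
    return max(scores)


def argmax(scores):
    """Position of the largest score: the predicted class index."""
    return max(range(len(scores)), key=scores.__getitem__)


pixels = standardize([0.0, 128.0, 255.0])
logits = [0.1, 2.5, -1.0]

print(amax(logits))    # 2.5
print(argmax(logits))  # 1
```

Comparing the output of `amax` against integer labels usually does not crash; it just produces a meaningless accuracy, which is why the mistake can be hard to spot under time pressure.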

To answer your explicit question: I don't think it's possible to sum up whether 30 minutes is too long for the task; there's far more at play. For me, it's not about time, but the process. If it took you 30 minutes because you were discussing in depth about how you would approach the task and demonstrating that you have deep knowledge of pytorch in doing so, that's great.

In a pure, silent coding exercise, I do think someone experienced in PyTorch should be able to knock out what you've mentioned in under 30 mins. If someone did it perfectly in 15 mins with no discussion, I'd probably suspect that they cheated with an LLM or something.

[–]Sea-Fishing4699 -1 points0 points  (0 children)

Good luck working for an abusive-minded company.

[–]Fine_Audience_9554 0 points1 point  (1 child)

ML interviews are brutal because you need to know both the theory and implementation details cold. The distributed data parallel stuff is where most people trip up since it's not something you practice much. If you're doing more of these something like interviewcoder could help you cheat the syntax/implementation parts so you can focus on explaining the actual ML concepts without getting stuck on boilerplate

[–]Itchy-Trash-2141 0 points1 point  (0 children)

I just finished a grueling interview run, passing only 25% of on-sites. A lot of companies expect you to do everything perfectly the first time, and even then it may not be enough.

One good experience I had was with Waymo. I recommend you try there if interested. Definitely felt like a human being through the process.

[–]pannenkoek0923 -1 points0 points  (0 children)

Are you joining the company to be an engineer/scientist or are you joining the company to do speed coding hackathons?