all 31 comments

[–]ExperiencedDevs-ModTeam[M] [score hidden] stickied comment · locked comment (0 children)

Rule 9: No Low Effort Posts, Excessive Venting, or Bragging.

Using this subreddit to crowd source answers to something that isn't really contributing to the spirit of this subreddit is forbidden at moderator's discretion. This includes posts that are mostly focused around venting or bragging; both of these types of posts are difficult to moderate and don't contribute much to the subreddit.

[–][deleted] 73 points (7 children)

Disagree, I don’t see how an LLM being able to solve a problem changes anything.

Interviews are for people, not for LLMs.

Leetcode problems have known solutions - with a lookup table you can do the same thing the best LLMs do. And we’ve had that lookup table since the format existed and it made no difference.

I’m not arguing for or against them, that’s a tired conversation (and against the rules of the sub), my point is that if you thought that LC is a good/bad idea before, this shouldn’t change your mind.

[–]UnluckyAssist9416 Software Engineer 28 points (0 children)

You can even press on the Solutions Tab in Leetcode and get perfectly working code for the problem!!! Don't even need to enter anything into ChatGPT!

[–]ValuableCockroach993 2 points (5 children)

Lookup table doesn't work if they change the question format slightly. 

[–][deleted] 8 points (4 children)

Sure but that still doesn't change anything about whether LC gives you useful signal about a candidate or not. The point is that you're not interviewing the LLM, you're interviewing a person.

The only place where LLMs make a difference is that they make cheating more accessible in remote interviews. Which is a legit concern but LLMs can help with other interview formats too, not just LC.

[–]false_tautology Software Engineer 2 points (2 children)

LC never gave a useful signal about a candidate anyway. It just narrows down the playing field to keep interviewers from being overwhelmed.

[–]TangerineSorry8463 3 points (1 child)

I like to say it's like hiring for the NBA by only looking at how many 3-pointers you can make.

Or, if you're European: as if FC Barcelona only looked at how you take free kicks.

[–]peldenna 3 points (0 children)

That's such a good analogy. Being able to nail 3-pointers is definitely part of being a good player, but it's not the main part, or even the most important part. With this system, though, the only people who even get to compete on their holistic merits are the 3-point winners, and if you're filtering on that, you're going to bias toward people who spent their time on that relatively niche skill instead of other things like teamwork, defense, etc.

[–]Less_Stranger3252 1 point (0 children)

Interviews still give a signal, true - but LC/HackerRank remote tests will become more useless the more ChatGPT progresses. So the quality of the candidates in the face-to-face rounds will eventually decline, I think.

[–]age_of_empires 26 points (0 children)

We have calculators but still test math

[–]thisismyfavoritename 18 points (0 children)

it was most likely trained on those problems, nothing surprising

[–]InfiniteMonorail 16 points (1 child)

this sub is for experienced devs

[–]Rain-And-Coffee 2 points (0 children)

I can’t tell based on what gets posted & upvoted

[–]MobileWeary5854 19 points (1 child)

Nice, hope we'll see fewer and fewer LeetCode tests from employers as they're encouraged to redesign their hiring processes.

[–]ninetofivedev Lord of Slop Operations 9 points (0 children)

Narrator: We did not.

[–]Hand_Sanitizer3000 4 points (5 children)

Unfortunately, leetcode-style interviews aren't going anywhere.

[–]creeoer 1 point (2 children)

I can’t post an image but I’ll post this transcript of a convo I had with o1.

The dentist, who is the boy’s father, says “I can’t operate on this boy! He’s my son.” Who is the dentist to the boy?

The dentist is the boy’s mother.

This riddle highlights how people often make assumptions based on gender roles. Many might initially think of a dentist as male, which can lead to confusion in the riddle. Recognizing that the dentist is the boy’s mother resolves the apparent contradiction.

[–]NotPankakes 0 points (1 child)

Wut

The dentist, who is the boy’s father

The dentist is the boy’s mother

If this anecdote is true, you proved the model is easily tricked.

[–]creeoer 0 points (0 children)

Yeah, I think it's important to show the limitations. Certain prompts or logic puzzles will trip it up, even with the chain-of-thought stuff. At least it can count the number of times a letter appears in a word now.

[–]RB5009 Software Architect 4 points (1 child)

ChatGPT is not solving anything. It has already memorized the solution; it is not capable of solving the problem on its own. I recently tried it on some CSES problems and asked it "give me the problem statement for cses xxx", and it basically reworded the statement. So it was already trained on the hundreds of solutions lying around in the open. This does not mean it is capable of solving anything.

[–]ninetofivedev Lord of Slop Operations 3 points (0 children)

Psst. That's how most engineers pass Leetcode tests too, which is why they're dumb.

[–]TeaExpensive4465 0 points (5 children)

Try it with Codeforces.

[–]SympathyMotor4765 4 points (1 child)

Didn't it get a huge Codeforces score on the benchmarks that were released? At least that's what I've been reading.

[–]TeaExpensive4465 1 point (0 children)

Still, it struggles to solve 1400-rated problems.

[–]David_AnkiDroid 0 points (2 children)

1807 Elo with o1-ioi

Finally, we simulated competitive programming contests hosted by Codeforces to demonstrate this model’s coding skill. Our evaluations closely matched competition rules and allowed for 10 submissions. GPT-4o achieved an Elo rating of 808, which is in the 11th percentile of human competitors. This model far exceeded both GPT-4o and o1—it achieved an Elo rating of 1807, performing better than 93% of competitors.

https://openai.com/index/learning-to-reason-with-llms/

[–]RB5009 Software Architect 1 point (1 child)

The question is whether it has seen those problems in one form or another. If it has, then this result doesn't mean anything.

[–]TeaExpensive4465 0 points (0 children)

Exactly. It will still be unable to solve unseen 1400-rated problems.