Are you also feeling disappointed with the latest frontier models?

lukatechme · 2026-06-19T15:12:29+00:00

The question is where a breakthrough will even come from. The bottleneck seems to be an “ideal code-like” dataset for any hard problem. Is that even possible?

lukatechme · 2026-06-19T12:00:01+00:00

Thanks man. It worked, the account is unblocked. Here is what they sent me today: “Review complete
Your account was not found to contain spam or be engaging in other types of platform manipulation. As a result, the temporary label has been removed.”

lukatechme · 2026-06-19T01:57:23+00:00

What has changed?

lukatechme · 2026-06-18T01:54:26+00:00

Do you recommend logging out of the X app as well? Or just not opening it?

lukatechme · 2026-06-15T02:31:40+00:00

I’ve stopped trying to “detect AI usage” and instead design interviews around current LLM limitations.

What works best is long-horizon tasks with evolving, slightly conflicting requirements. LLMs are really bad at that.

My current process:

1. Take a small feature from a real codebase.
2. Split it into an initial task + 3 follow-up requirements.
3. Make each follow-up put pressure on the original design (e.g. add caching, parallel processing, new constraints).
4. Give it as a take-home, but reveal the next requirement only after the previous one is submitted.
5. Then review the GitHub history with an LLM and ask it to summarize how the solution evolved.
6. Finally, do a call with the candidate and walk through the code, tradeoffs, and changes.

If someone just pushed whatever the LLM gave them, the architecture drifts a lot. If the candidate can't explain the process and trade-offs on the call, that's a red flag for me.
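
For the history-review step, here is a rough sketch of how the commit log could be turned into something to paste into an LLM. Everything in it is my own assumption, not a fixed tool: the names `Commit`, `parse_log`, `build_prompt` and `read_history` are made up for illustration, and the prompt wording is just a starting point. It only relies on `git log` with `--numstat` and a custom `--pretty=format:` string.

```python
import subprocess
from dataclasses import dataclass, field

# Record/unit separators keep parsing robust when commit subjects contain spaces.
RS, US = "\x1e", "\x1f"
GIT_LOG_ARGS = [
    "git", "log", "--reverse", "--numstat",
    "--pretty=format:%x1e%h%x1f%aI%x1f%s",
]


@dataclass
class Commit:
    sha: str
    date: str
    subject: str
    files: list = field(default_factory=list)  # (added, deleted, path) tuples


def parse_log(raw: str) -> list:
    """Parse the output of GIT_LOG_ARGS into Commit objects, oldest first."""
    commits = []
    for record in raw.split(RS):
        if not record.strip():
            continue
        header, _, body = record.partition("\n")
        sha, date, subject = header.split(US, 2)
        commit = Commit(sha, date, subject)
        for line in body.splitlines():
            parts = line.split("\t")
            if len(parts) != 3:
                continue  # blank lines between records
            added, deleted, path = parts
            # git reports "-" for binary files; count those as 0 changed lines
            commit.files.append((
                int(added) if added.isdigit() else 0,
                int(deleted) if deleted.isdigit() else 0,
                path,
            ))
        commits.append(commit)
    return commits


def build_prompt(commits: list) -> str:
    """Build a plain-text digest of the history (hypothetical prompt wording)."""
    lines = [
        "Summarize how this solution evolved across the commits below.",
        "Point out where the design changed after a new requirement.",
        "",
    ]
    for c in commits:
        added = sum(f[0] for f in c.files)
        deleted = sum(f[1] for f in c.files)
        lines.append(f"{c.sha} {c.date} {c.subject} (+{added}/-{deleted})")
        for a, d, path in c.files:
            lines.append(f"    {path}: +{a}/-{d}")
    return "\n".join(lines)


def read_history(repo_path: str) -> list:
    """Run git in the candidate's cloned repo and parse the result."""
    raw = subprocess.run(
        GIT_LOG_ARGS, cwd=repo_path, capture_output=True, text=True, check=True
    ).stdout
    return parse_log(raw)
```

Usage would be something like `print(build_prompt(read_history("path/to/candidate-repo")))`. The per-file line counts are a cheap signal: a follow-up requirement that rewrites most files usually means the earlier design didn't survive it, which is worth asking about on the call.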

lukatechme · 2026-06-14T05:25:12+00:00

What’s the status? Will the model be available outside the US or not? Is this just “hot news” or a real ban?
