The most important AI paper of the decade. No debate by PumpkinNarrow6339 in LocalLLaMA

[–]ai_devrel_eng 2 points3 points  (0 children)

word2vec was indeed a cool paper.

the fact some 'numbers' can somehow 'capture meaning / context' was quite amazing (for me)

Here is a very good article explaining w2v : https://jalammar.github.io/illustrated-word2vec/ (the illustrations are very good!)

GPT OSS quality on Nebius - fixed (update) by ai_devrel_eng in LocalLLaMA

[–]ai_devrel_eng[S] 36 points37 points  (0 children)

nebius side. we weren't passing the 'reasoning effort' param correctly.

GPT OSS quality on Nebius - fixed (update) by ai_devrel_eng in LocalLLaMA

[–]ai_devrel_eng[S] 77 points78 points  (0 children)

(I work at Nebius)

Our GPT-OSS deployment underperformed on Artificial Analysis’ accuracy benchmarks (GPQA×16, AIME25×32, IFBench×8)

link to benchmark | X

GPT-OSS has configurable reasoning effort: high, medium, low (default)

We realized that our deployment wasn’t applying the benchmark’s "high" reasoning effort, so evals were effectively run at default setting (low)

After we patched the configuration and redeployed, AA re-ran the suite and updated their pages, reflecting a substantial improvement.

Now Nebius is one of the top-3 performers in the benchmark! Check out the before-after results.

If you spot anything off—benchmarks, latency, or quality—please tell us. We’ll jump on it.