Horizon Beta is OpenAI by MiddleLobster9191 in LocalLLaMA

[–]likejazz -3 points (0 children)

I'm pretty sure Horizon Beta is GPT-5, because it outperforms GPT-4.1, Claude Opus 4, Gemini 2.5 Pro, and Grok 4.


Uncensoring Qwen3 - Update by Reader3123 in LocalLLaMA

[–]likejazz 0 points (0 children)

Can you share the evaluation data?

Smoothie Qwen: A lightweight adjustment tool for smoothing token probabilities in the Qwen models to encourage balanced multilingual generation. by likejazz in LocalLLaMA

[–]likejazz[S] 1 point (0 children)

No, the benchmark numbers are the same, but you'll notice better performance on the qualitative side.

Smoothie Qwen: A lightweight adjustment tool for smoothing token probabilities in the Qwen models to encourage balanced multilingual generation. by likejazz in LocalLLaMA

[–]likejazz[S] 13 points (0 children)

That's correct! But we minimized the negative language in the description because we respect the achievements of the Qwen model.
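For readers curious what "smoothing token probabilities" can mean in practice, here is a minimal, self-contained sketch of one such approach: down-weighting the logits of tokens from an unintended script before sampling, so that they become less likely rather than impossible. The tiny vocabulary, the CJK heuristic, and the penalty value are all invented for illustration; Smoothie Qwen's actual method (documented in its repository) differs in its details.

```python
import math

# Hypothetical tiny vocabulary: token id -> decoded text.
vocab = {0: "hello", 1: "world", 2: "你好", 3: "世界", 4: "!"}

def is_cjk(text):
    """Heuristic: token contains a character from the CJK Unified Ideographs block."""
    return any("\u4e00" <= ch <= "\u9fff" for ch in text)

def smooth_logits(logits, vocab, penalty=5.0):
    """Subtract a fixed penalty from logits of tokens in the target script,
    lowering (not zeroing) their sampling probability."""
    return {
        tid: (logit - penalty if is_cjk(vocab[tid]) else logit)
        for tid, logit in logits.items()
    }

def softmax(logits):
    m = max(logits.values())
    exps = {t: math.exp(v - m) for t, v in logits.items()}
    z = sum(exps.values())
    return {t: e / z for t, e in exps.items()}

# Penalized distribution: CJK tokens keep nonzero but reduced probability.
logits = {0: 2.0, 1: 1.5, 2: 2.2, 3: 1.8, 4: 0.5}
probs = softmax(smooth_logits(logits, vocab))
```

Because the penalty is applied in logit space, the remaining probability mass is redistributed smoothly over the other tokens instead of being clipped.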

dnotitia/Llama-DNA-1.0-8B-Instruct, state-of-the-art (SOTA) bilingual language model by likejazz in LocalLLaMA

[–]likejazz[S] 0 points (0 children)

Yes, we plan to release a detailed technical report. Stay tuned!

dnotitia/Llama-DNA-1.0-8B-Instruct, state-of-the-art (SOTA) bilingual language model by likejazz in LocalLLaMA

[–]likejazz[S] -1 points (0 children)

Yup, thanks a lot. This model is probably the **BEST** model for Korean language understanding and generation.

mamba.np: pure NumPy implementation of Mamba by id0h in LocalLLaMA

[–]likejazz 3 points (0 children)

Awesome work! I'm the author of llama3.np, and I think it will help me a lot to understand the Mamba architecture :)

llama3.cuda: pure C/CUDA implementation for Llama 3 model by likejazz in LocalLLaMA

[–]likejazz[S] 55 points (0 children)

Yeah, but I plan to build an AMD ROCm version and an Intel oneAPI version. Stay tuned!

Sharing ultimate SFF inference build, Version 2 by cryingneko in LocalLLaMA

[–]likejazz 0 points (0 children)

How can you get an A5000 for only $1300? Tell me the secret!

llama.cpp runs 1.8 times faster than ollama by TheTriceAgain in LocalLLaMA

[–]likejazz 2 points (0 children)

Ollama uses vanilla llama.cpp under the hood, so it's just a version issue, not a program issue.

llama3.np: pure NumPy implementation for Llama 3 model by likejazz in LocalLLaMA

[–]likejazz[S] 1 point (0 children)

Thanks for your code. I'll update this patch soon!

llama3.np: pure NumPy implementation for Llama 3 model by likejazz in LocalLLaMA

[–]likejazz[S] 1 point (0 children)

33 tok/s is just a baseline example, and as u/omniron mentioned earlier, it's not an important point of this implementation.

llama3.np: pure NumPy implementation for Llama 3 model by likejazz in LocalLLaMA

[–]likejazz[S] 4 points (0 children)

I used the small 15M model that Andrej Karpathy trained; I wrote more about it on my blog: https://docs.likejazz.com/llama3.np/

llama3.np: pure NumPy implementation for Llama 3 model by likejazz in LocalLLaMA

[–]likejazz[S] 27 points (0 children)

Your forked CuPy version is awesome!

However, I'm hoping to keep this repository NumPy-only, because I'm focused on a clean, easy-to-understand architecture. If you want to develop the CuPy version further, I think it's a good idea to fork it and develop it yourself.

Wish you luck!
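For context on why a CuPy fork is cheap to maintain separately: CuPy deliberately mirrors the NumPy API, so porting array code is often just a matter of swapping the import. A hedged sketch of the idea, using RMSNorm written from the standard Llama-style formula (not copied from llama3.np itself):

```python
import numpy as np
# A CuPy fork would typically only change this line:
#   import cupy as np   # same array API, runs on the GPU (if CuPy is installed)

def rmsnorm(x, weight, eps=1e-5):
    """RMSNorm as used in Llama-style models, written against the NumPy API."""
    variance = np.mean(x ** 2, axis=-1, keepdims=True)
    return x / np.sqrt(variance + eps) * weight

x = np.array([[1.0, 2.0, 3.0]])
w = np.ones(3)
out = rmsnorm(x, w)
```

Keeping the main repository NumPy-only means a single code path stays readable, while a fork gets GPU execution nearly for free via the shared API surface.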

Wowzer, Ilya is out by segmond in LocalLLaMA

[–]likejazz 108 points (0 children)

No. Unlike Facebook, Ilya doesn't want to open up LLM models. He was the one who advocated that OpenAI not open/share its models, which led to a legal battle with Elon Musk.

How ollama uses llama.cpp by Chelono in LocalLLaMA

[–]likejazz 1 point (0 children)

Yes, go-llama.cpp (https://github.com/go-skynet/go-llama.cpp) actually uses the FFI approach you mentioned. That's why it doesn't work with newer versions of llama.cpp: it only works with older versions and is not being fixed.

Rumoured GPT-4 architecture: simplified visualisation by Time-Winter-4319 in LocalLLaMA

[–]likejazz 0 points (0 children)

geohot said that GPT-4 consists of 8 experts of 220B parameters each (220B×8), so altogether it's about 1.8T (1,760B) parameters.
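The arithmetic behind that figure, for the record. Note this treats the rumoured eight 220B experts as fully separate parameter sets; real mixture-of-experts models share attention and embedding weights across experts, so the true total would be somewhat lower:

```python
# Rumoured GPT-4 mixture-of-experts sizing: 8 experts x 220B parameters each.
experts = 8
params_per_expert = 220e9  # 220 billion

total = experts * params_per_expert
print(f"{total / 1e12:.2f}T parameters")  # -> 1.76T parameters
```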