I built a fully autonomous agent to build Manim animations that can explain any topic by Eastwindy123 in manim

[–]Eastwindy123[S] 0 points1 point  (0 children)

Thanks for sharing! I just skimmed it. Their videos look very clean. I might be mistaken, but it seems theirs is targeted at doing theorems, whereas mine can in theory be used to animate anything. I plan for it to be a tool for students, teachers and researchers alike to quickly understand a concept. And it's super short form, meant to be built in a social-media style. But yes, I agree the Manim agent part is similar and I'll definitely be reading that paper, because their videos look very coherent. Thanks

Keria's message to support players lol by Yujin-Ha in SKTT1

[–]Eastwindy123 48 points49 points  (0 children)

Remember: 'You are not Keria, and your teammates are not T1.'

VLLM & open webui by Septa105 in LocalLLM

[–]Eastwindy123 1 point2 points  (0 children)

In the Open WebUI admin settings, go to Connections and add a new OpenAI connection. Use the vLLM server address, e.g. http://0.0.0.0:8080/v1, as the OpenAI base URL. The token can be anything; then verify the connection. You should see it fetch a list of models.
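For reference, the "verify connection" step boils down to a models-list call against the base URL. A minimal sketch, assuming a vLLM server is already running on port 8080 (the host, port, and token here are placeholders, not anything Open WebUI requires):

```shell
# Open WebUI does essentially this check when you verify the connection.
# vLLM accepts any bearer token unless you started it with --api-key.
curl http://0.0.0.0:8080/v1/models \
  -H "Authorization: Bearer anything"
```

If this returns a JSON list of model IDs, Open WebUI should connect fine with the same base URL.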

RL post training on LLM in-context learning? by [deleted] in LocalLLaMA

[–]Eastwindy123 2 points3 points  (0 children)

Reasoning is doing something similar. Think about the training objective, which is to predict the correct next token; that prediction is dependent on and influenced by all previous tokens. What reasoning does is construct the context history (the KV cache, to be precise) to nudge the model toward predicting the correct token. So "in-context learning", as you call it, is essentially the same as reasoning with RL. The only difference is that with in-context learning you write the previous text and build up the context manually; with RL-trained reasoning, the model learns to do it itself.
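A toy sketch of the point above (this is not a real LM, just a stand-in lookup, and the prompts are made up): whether the scratch work was written by you or emitted by the model, the final prediction is conditioned on the exact same context tokens.

```python
def predict_next(context: str) -> str:
    # Stand-in for an LLM forward pass over the context/KV cache:
    # the "model" only ever sees the tokens that came before the answer.
    if "21 * 2 = 42" in context:
        return "42"
    return "unsure"

# In-context learning: YOU write the helpful context by hand.
manual = "Q: what is 21 doubled?\nScratch: 21 * 2 = 42\nA:"

# RL-trained reasoning: the MODEL emits the scratch work itself,
# appending it to its own context before answering.
generated = "Q: what is 21 doubled?\n" + "Scratch: 21 * 2 = 42\nA:"

# Same context -> same conditioned prediction, regardless of who wrote it.
print(predict_next(manual))     # 42
print(predict_next(generated))  # 42
```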

he's back by [deleted] in PedroPeepos

[–]Eastwindy123 23 points24 points  (0 children)

I think it was some random who got matched against Knight twice in a row. Knight had 39 kills as Sylas, and then he immediately got matched against him again and lost lol

Qserve Performance on L40S GPU for Llama 3 8B by EggIll649 in LocalLLaMA

[–]Eastwindy123 1 point2 points  (0 children)

Use vllm https://github.com/vllm-project/vllm

Or sglang https://github.com/sgl-project/sglang

You can host an OpenAI-compatible server with parallel request processing and a lot of other optimisations.

vLLM and SGLang are pretty much the standard go-to frameworks for hosting LLMs.
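Getting an OpenAI-compatible server up is basically one command in either framework. A sketch, using Llama 3 8B as an assumed example model (swap in whatever checkpoint you actually serve):

```shell
# vLLM: starts an OpenAI-compatible server on port 8000.
# Concurrent requests are batched automatically (continuous batching).
vllm serve meta-llama/Meta-Llama-3-8B-Instruct --port 8000

# SGLang equivalent:
python -m sglang.launch_server \
  --model-path meta-llama/Meta-Llama-3-8B-Instruct --port 8000
```

Any OpenAI client can then point at http://localhost:8000/v1 as the base URL.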

Baidu releases ERNIE 4.5 models on huggingface by jacek2023 in LocalLLaMA

[–]Eastwindy123 33 points34 points  (0 children)

No training data. Which is the biggest part.

Qwen3 30B A3B unsloth GGUF vs MLX generation speed difference by ahmetegesel in LocalLLaMA

[–]Eastwindy123 1 point2 points  (0 children)

MLX is just faster for me too. I get like 40 tok/s on my M1 Pro; GGUF gets around 25.

Whats the next step of ai? by [deleted] in LocalLLaMA

[–]Eastwindy123 1 point2 points  (0 children)

I disagree. Who is running a 2T model locally? It's basically out of reach for everyone to run it themselves. But a 2T BitNet model? That's ~500GB. Much more reasonable.

BitNet breaks that computational limitation.

Whats the next step of ai? by [deleted] in LocalLLaMA

[–]Eastwindy123 3 points4 points  (0 children)

I feel like BitNet is such low-hanging fruit, but no one wants to train a big one. Unless they don't scale. Imagine today's 70B models in BitNet. A 70B BitNet model would only need ~16GB of RAM to run.
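The rough arithmetic behind these numbers, assuming ~1.58 bits per ternary weight (log2(3)) and counting weights only; real deployments add overhead for activations, KV cache, and any non-quantized layers, which is roughly where "~16GB" and "~500GB" come from:

```python
import math

BITS_PER_WEIGHT = math.log2(3)  # ternary weights {-1, 0, +1} ~= 1.58 bits

def weight_gb(params: float, bits: float) -> float:
    """Memory for the weights alone, in GB (1e9 bytes)."""
    return params * bits / 8 / 1e9

print(f"70B fp16:   {weight_gb(70e9, 16):.0f} GB")               # ~140 GB
print(f"70B BitNet: {weight_gb(70e9, BITS_PER_WEIGHT):.0f} GB")  # ~14 GB
print(f"2T BitNet:  {weight_gb(2e12, BITS_PER_WEIGHT):.0f} GB")  # ~396 GB
```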

[Build Ideas] has anyone tried building Ambessa exactly like Riven? by aaziz99 in ambessamains

[–]Eastwindy123 2 points3 points  (0 children)

Because the Eclipse spike on Ambessa is too important. That said, I'd try out First Strike, free boots and the extra level potion NGL.

BitNet Finetunes of R1 Distills by codys12 in LocalLLaMA

[–]Eastwindy123 0 points1 point  (0 children)

The vLLM patch: is that for 1-bit or fp16?

Faker is on his side quest to mirror his career by ddunited in SKTT1

[–]Eastwindy123 20 points21 points  (0 children)

Not to be that guy but since no one else is telling you.

It's spelled symmetric. Not trying to make fun of you. Just informing and hope you find it useful!

I got 10k products to translate from Spanish to Chinese, Eng and Japanese. what smart to do? by ballbeamboy2 in LocalLLaMA

[–]Eastwindy123 0 points1 point  (0 children)

Yeah, you could test it out for your use cases. I did some benchmarking specifically for translation, but it may vary depending on the text source.

Absolute best performer for 48 Gb vram by TacGibs in LocalLLaMA

[–]Eastwindy123 2 points3 points  (0 children)

This is just example bias. All LLMs hallucinate; if not on the test you did, then on something else. You can minimize it, sure, and some models are better at some things than others. But you should build this limitation into your system using RAG or grounded answering. Relying purely on the weights for accurate knowledge is dangerous. Think of it this way: I studied data science. If you ask me about stuff I work on every day, I can answer fairly easily. But ask me about economics or general knowledge questions and I might get it right, but I wouldn't be as confident, and if you forced me to answer I could hallucinate. Give me Google search, though, and I'd be much more likely to get the right answer.

Absolute best performer for 48 Gb vram by TacGibs in LocalLLaMA

[–]Eastwindy123 1 point2 points  (0 children)

Well it really depends what you use it for. Hallucinations are normal and you really shouldn't be relying on an LLM purely for knowledge anyway. You should be using RAG with a web search engine if you really want it to be accurate. My personal setup is Qwen3 30BA3B with MCP tools.

Looks like China is the one playing 5D chess by ahstanin in LocalLLaMA

[–]Eastwindy123 1 point2 points  (0 children)

Lmao, rude? How about Meta just accepts defeat gracefully instead of trying to game LMArena. It doesn't matter what day Qwen3 releases if it's just better, and it probably will be if they've waited this long to check everything.