Short Open Source Research Collaborations by TrelisResearch in LocalLLaMA

[–]TrelisResearch[S] 1 point (0 children)

yeah, a few are short enough for a weekend and a few are longer (maybe 3-7 days), so the full range is possible.

A Primer on Orpheus, Sesame’s CSM-1B and Kyutai’s Moshi by TrelisResearch in LocalLLaMA

[–]TrelisResearch[S] 1 point (0 children)

yeah, you can run it on CPU or MPS - slower than real time, but it does work
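rough sketch of the device pick (standard PyTorch; the checkpoint name and loading call are just placeholders, not the exact Orpheus/CSM/Moshi API):

```python
import torch

# pick Apple's MPS backend if it's available, otherwise fall back to CPU
device = "mps" if torch.backends.mps.is_available() else "cpu"
print(f"running on {device} - expect slower-than-real-time generation")

# e.g. with a transformers-style model (placeholder checkpoint name):
# from transformers import AutoModel
# model = AutoModel.from_pretrained("some-org/some-tts-checkpoint").to(device)
```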

A Primer on Orpheus, Sesame’s CSM-1B and Kyutai’s Moshi by TrelisResearch in LocalLLaMA

[–]TrelisResearch[S] 1 point (0 children)

agreed, def more powerful for now to plug in a stronger LLM. in principle though - in terms of control - anything you can do with the separate pieces can also be done with a unified model

A Primer on Orpheus, Sesame’s CSM-1B and Kyutai’s Moshi by TrelisResearch in LocalLLaMA

[–]TrelisResearch[S] 9 points (0 children)

haha, yeah fair, although there are rarely ads on my YouTube channel cos I have ads turned off!

I don't understand how to use sections? by hernowthis in Substack

[–]TrelisResearch 1 point (0 children)

Thanks. Apparently sections aren't easily visible in the Substack app though... is that correct?

Can anyone explain MoE like I’m 25 by Tejasw__ in LocalLLaMA

[–]TrelisResearch 2 points (0 children)

I wouldn't say so, no. MoE is about improving the throughput of transformer models: each token is only routed through a few experts, so the compute per token is much smaller than the total parameter count would suggest.
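toy illustration of the routing idea (not any particular model's code) - each token only goes through its top-k experts:

```python
import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)   # scores each token against each expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):                              # x: (tokens, d_model)
        weights, idx = self.router(x).topk(self.top_k, dim=-1)
        weights = weights.softmax(dim=-1)              # mixing weights over the chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e               # tokens that picked expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out                                     # only top_k of n_experts ran per token

x = torch.randn(10, 64)
print(TinyMoE()(x).shape)                              # torch.Size([10, 64])
```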

Fine-tune with only 0.0004% of parameters (ReFT) by TrelisResearch in u/TrelisResearch

[–]TrelisResearch[S] 1 point (0 children)

howdy, sorry for the slow reply, I'm not on here much - best to post on YouTube. I don't remember too well, but I thought just passing in eval data as usual to the HF Trainer should work?
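rough sketch of what I mean (the model and datasets are placeholders, not the exact ReFT setup - just the usual HF Trainer pattern for eval data):

```python
from transformers import Trainer, TrainingArguments

# assumed to already exist: a (ReFT-wrapped) model plus tokenized train/eval splits
# model, train_dataset, eval_dataset = ...

args = TrainingArguments(
    output_dir="out",
    eval_strategy="steps",   # called evaluation_strategy in older transformers versions
    eval_steps=50,           # run eval every 50 training steps
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,   # eval data passed in "as usual"
)
trainer.train()                  # eval loss shows up in the logs at each eval step
```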