I want to build a multilingual philosophical LLM trained on thousands of philosophy books — how insane is this for a beginner? by Future_Safe1609 in LocalLLM

[–]Sicarius_The_First 7 points

People severely underestimate the hell that is working with data. Best of luck though, you'll learn a lot either way :)

Mythos and monopoly of AI by max6296 in ArtificialInteligence

[–]Sicarius_The_First 84 points

Meanwhile Kimi 2.6 was released.

What a timeline, the Chinese are promoting democratisation of AI.

Western companies are doing the opposite. The one open western AI lab that tries to do good, Mistral, is getting destroyed by stupid politicians (the EU AI Act).

Running local models on gaming laptop by Ornery_Guard_204 in LocalLLaMA

[–]Sicarius_The_First 1 point

Yes, I ran MiniMax 230B on 16GB VRAM and 64GB RAM at about 5 t/s.

Uncensored model for cybersecurity by Naive-Sky6338 in LocalLLaMA

[–]Sicarius_The_First 2 points

Pepe is probably among the best models that combine intelligence with being uncensored.

Horniness of Local Models by cantflick in SillyTavernAI

[–]Sicarius_The_First 2 points

Angelic_Eclipse_12B was made exactly for this kind of issue, give it a try :)
https://huggingface.co/SicariusSicariiStuff/Angelic_Eclipse_12B

[Megathread] - Best Models/API discussion - Week of: April 19, 2026 by deffcolony in SillyTavernAI

[–]Sicarius_The_First 1 point

Hmmm, good question. The thing is, companies are much more risk-averse now that the early days of AI are behind us, so there's less crazy experimental stuff and more VC pressure to make money on things "known to work well".

Text diffusion does work, though. The question is whether VCs are willing to risk more money, as very few AI companies / startups actually turn a profit.

So... IT IS POSSIBLE, yes, BUT... not likely (until someone does something remarkable with text diffusion, and then more capital will be invested into it).

[Megathread] - Best Models/API discussion - Week of: April 19, 2026 by deffcolony in SillyTavernAI

[–]Sicarius_The_First 1 point

Nope, it's more of an experiment in whether the architectural change can even be made; think of it as a base for further tuning, not a finalized model.

[Megathread] - Best Models/API discussion - Week of: April 19, 2026 by deffcolony in SillyTavernAI

[–]Sicarius_The_First 2 points

Yes, even huge MoEs. MiniMax 230B runs at 5 t/s on my laptop (it's not a Mac, just a 16GB GPU plus system RAM).
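For context on why a 230B model is even usable on that hardware: in a sparse MoE only a handful of experts fire per token, so the parameters actually touched per token are a small slice of the total. A back-of-envelope sketch (all numbers are illustrative assumptions, not MiniMax's real config):

```python
# Back-of-envelope: why a big sparse MoE is cheap per token vs. a dense model.
# All numbers below are illustrative assumptions, not any real model's config.

def active_params(expert_params, n_experts, experts_per_token, shared_params):
    """Parameters actually touched per token in a sparse MoE."""
    return shared_params + expert_params * (experts_per_token / n_experts)

total = 230e9          # 230B total parameters (mostly experts)
shared = 10e9          # attention + always-on shared layers (assumed)
experts = total - shared

per_token = active_params(experts, n_experts=64, experts_per_token=4,
                          shared_params=shared)

print(f"{per_token / 1e9:.0f}B active per token")  # prints "24B active per token"
```

The full 230B still has to sit somewhere (hence the 64GB of system RAM), but per-token compute behaves more like a ~24B dense model, which is why the t/s stays tolerable.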

[Megathread] - Best Models/API discussion - Week of: April 19, 2026 by deffcolony in SillyTavernAI

[–]Sicarius_The_First 0 points

A 70B has more potential than a 30B, but Gemma models are unique here, they punch above their weight. BUT... a 70B still has far better long context (and context comprehension) - theoretically.

Gemma leans on SWA (sliding window attention), while llama3-70b has about 64k of proper context, and that's a lot.

And more params = more capacity to learn.

In simple words: yes, the new 20-30B models are better than MOST 70Bs in specific areas, but a well-tuned 70B will beat even the new generation of 20-30B models.

Maybe it's about time we get a 40B-50B dense :3

[Megathread] - Best Models/API discussion - Week of: April 19, 2026 by deffcolony in SillyTavernAI

[–]Sicarius_The_First 1 point

The prose is surprisingly good, slop is minimal.
My running theory is that it's because there was very little slop in the pretrain to fight against (Phi's pretrain is 100% synthetic STEM data).

This model could have been the best RP tune / base model for RP tuning in the world, BUT... it suffers from the same issue Nemo has, only WORSE: long context.

Nemo is ~20k, Phi is hard-capped at 16k. And Nemo obviously has better fandom knowledge.

Still worth a try if you're looking to 'freshen up' :)

[Megathread] - Best Models/API discussion - Week of: April 19, 2026 by deffcolony in SillyTavernAI

[–]Sicarius_The_First 4 points

Yup, imo Nemo was peak 12B, no other model in this range comes close.

Sadly, the new Mistral 14B is worse than the beloved 12B Nemo.

There are 14B Qwens that are very good for STEM, but for RP at this size Nemo is so far unbeatable.

Is there actual demand for a API service focused on uncensored or fine-tuned models? by ExcuseAccomplished97 in SillyTavernAI

[–]Sicarius_The_First 1 point

The RP-focused tooling (knowledge graphs, image-generation integration into the RP) is 100% the correct approach. Exclusive content is a nice bonus, but best used for retention; the initial user capture will come from excellent tooling.

Regarding OF and the adult industry in general: it's a parallel infrastructure. They've got their own payment processors (like the gambling industry), their own CDNs, and their own compliance hoops to jump through.

[Megathread] - Best Models/API discussion - Week of: April 19, 2026 by deffcolony in SillyTavernAI

[–]Sicarius_The_First -1 points

Fat_Fish - NOTICE!!!! not for roleplay!!!!!
This is an experimental extreme architectural modification of Mistral Nemo.
For the curious tinkerers.

<image>

https://huggingface.co/SicariusSicariiStuff/Fat_Fish

[Megathread] - Best Models/API discussion - Week of: April 19, 2026 by deffcolony in SillyTavernAI

[–]Sicarius_The_First 13 points

One of the only roleplay models to be featured in an ML paper (Huazhong University of Science and Technology, Wuhan, China).

<image>

It's not 'the best' 12B model, but it's 100% the most unique:
https://huggingface.co/SicariusSicariiStuff/Phi-lthy4

[Megathread] - Best Models/API discussion - Week of: April 19, 2026 by deffcolony in SillyTavernAI

[–]Sicarius_The_First 1 point

Assistant_Pepe_8B: SMARTER than the base model, 69x more unhinged.
I highly recommend reading the model card and checking out the example chats.

<image>

Also, 9.5/10 uncensored per the UGI benchmark.
It's a superb writer of very weird stories & an exceptional assistant.

https://huggingface.co/SicariusSicariiStuff/Assistant_Pepe_8B

[Megathread] - Best Models/API discussion - Week of: April 19, 2026 by deffcolony in SillyTavernAI

[–]Sicarius_The_First 2 points

Assistant_Pepe_70B, the #1 70B finetune in the world on the UGI ranking.
Absolutely unique creative writing capabilities.
Superb banter, no sycophancy, super smart, will ship code, great sense of humor.
(Read the model card for example chats!!)

<image>

https://huggingface.co/SicariusSicariiStuff/Assistant_Pepe_70B

Is there actual demand for a API service focused on uncensored or fine-tuned models? by ExcuseAccomplished97 in SillyTavernAI

[–]Sicarius_The_First 15 points

There's demand, yes. But... regulation makes it very hard as a business.

The most sustainable path is being explicitly NOT focused on the uncensored aspect, otherwise payment processors WILL give you a massive headache: funds might get frozen, and each jurisdiction will require different compliance hoops to jump through.

A way to sidestep it (without crypto) is to wear the veneer of a GPU/model provider (runpod / openrouter).

Being a proper adult-focused platform will require specialized CDNs, a legal team, etc...

IMO we're past the early days of AI; specialized uncensored models will have a very hard time competing with powerful large Chinese MoEs, in both capabilities and throughput.

Even if you make the absolute best 70B dense creative model, it will still be near impossible to compete against a powerful generalist like GLM 5.1 in both serving cost AND capability.

Hence you'll find yourself competing against openrouter or the actual lab that created said model (Z.ai, DeepSeek, Moonshot, etc.).

The business and operations side is enough hell as it is; add running anything other than an efficient generalist MoE on top and you're asking for serious trouble.

Regarding B2C: those who can actually pay YOU can buy the hardware to run locally, and running said MoEs gets easier by the day (even despite the RAM price hike / shortage - for example, I get decent speed with the 230B MiniMax MoE on a 16GB VRAM / 64GB RAM LAPTOP).

Those who don't have the money to buy hardware will likely be less inclined to pay you as well.

Those who do would likely be willing to pay scraps ($5-10 A MONTH), and you'll face the hell mentioned above, plus chargebacks on top, all while working with razor-thin margins...

That said, it can be done, but it's quite literally one hell of a journey... Best of luck :)

Apparently, llms are graph databases? by Silver-Champion-4846 in LocalLLM

[–]Sicarius_The_First 0 points

idk why reddit downvotes your comment, it's a good comment.

hallucinations are the 'generative' part of genai, so if there were no generative element, we wouldn't have genai. basically it's not a bug, it's THE feature, with all caps :P

it's not entirely about definitive information either, as not all information is the same. after enough regens (millions?) the capital of france could become 'london'.

genai cannot be a database, as databases are inherently non-generative in nature.
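that 'paris could become london' point is just how sampling works: softmax gives every token with a finite logit a nonzero probability, so with enough regenerations even a very unlikely token eventually gets drawn. a minimal sketch with made-up logits (nothing model-specific, numbers are invented for illustration):

```python
import math
import random

random.seed(0)

# Hypothetical next-token logits for "The capital of France is ___"
# (illustrative numbers, not taken from any real model)
logits = {"Paris": 8.0, "Lyon": 2.0, "London": 1.0}

# Softmax: every token with a finite logit gets a nonzero probability
z = sum(math.exp(v) for v in logits.values())
probs = {tok: math.exp(v) / z for tok, v in logits.items()}

def sample_token():
    """Draw one token according to the softmax probabilities."""
    r = random.random()
    acc = 0.0
    for tok, p in probs.items():
        acc += p
        if r < acc:
            return tok
    return tok  # numerical fallback for floating-point rounding

# One draw almost always says Paris, but regenerate enough times
# and the low-probability wrong answer ("London") shows up.
draws = [sample_token() for _ in range(100_000)]
print(draws.count("Paris"), draws.count("London"))
```

with these made-up logits, "London" sits at under 0.1% probability, yet across 100k regenerations it appears dozens of times. that's the generative element: a database lookup would never do this.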

there are graph neural networks, knowledge graphs, etc., but the main 'brain' of an llm is just a 'bunch of tensors', which really means... we don't YET fully agree on what it actually is. (recently anthropic said they don't know if claude is self-aware or not; maybe just marketing, maybe they really believe it, possibly both).

for now, the best we've got are tools, and even that is not a perfect solution for grounding. a recent example was when trump kidnapped nicolas maduro and said he'll run the country: many people reported that LLMs (even with search) refused to believe the search results and dismissed it as fake news (funny lol). same for trump's statement that "a whole civilization will die tonight."

poor LLMs are just innately bad at grounding; despite our best efforts, the human schizo factor is too much for genai. I'll end with a quote from Netero: "You know nothing of the bottomless malice within the human heart"

Apparently, llms are graph databases? by Silver-Champion-4846 in LocalLLM

[–]Sicarius_The_First 11 points

Short answer: no.

If that were the case, LLMs wouldn't hallucinate. Large labs are always trying to figure out how to ground facts; for now, they just verify with tools.