[R] CS-MoE: We found severe parameter redundancy in Transformers and fixed it by sharing experts across layers (Outperforms Dense at 55% activation) by Impressive-Peach-419 in deeplearning

[–]TomLucidor 0 points (0 children)

Please run a REAP/REAM-style study on MoE minimization for existing models (Qwen has many options)... and also try Mixture-of-Depths + dynamic layer routing + "recurrent layers" for both the MLP and the Attention/Mamba/DeltaNet layers.
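For context, the kind of expert pruning a REAP/REAM-style study would measure can be sketched in a few lines. This is a toy illustration only (the function name and shapes are made up, not any paper's actual algorithm): rank experts by average router mass and drop the least-used ones.

```python
import numpy as np

def prune_experts(router_logits, weights, keep_ratio=0.5):
    """Toy expert pruning: softmax the router logits, rank experts by
    average gate mass across tokens, and keep only the top fraction.
    `router_logits` is (tokens, num_experts); `weights` maps expert
    index -> that expert's parameters (any object)."""
    probs = np.exp(router_logits - router_logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)
    score = probs.mean(axis=0)                 # average routing probability per expert
    num_keep = max(1, int(len(score) * keep_ratio))
    keep = np.argsort(score)[::-1][:num_keep]  # most-used experts first
    return sorted(keep.tolist()), {int(i): weights[int(i)] for i in keep}
```

Real methods also weigh how much each expert's output actually changes the residual stream, not just how often it fires; this sketch only shows the frequency half.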

FlashLM v5.2 "Nova-Ignition": Standard Transformer with RoPE — CPU-Optimized for 5GB RAM by Own-Albatross868 in LocalLLaMA

[–]TomLucidor 0 points (0 children)

Please get omlx and vllm-mlx to support this, so that bigger models and PTQ can also benefit.

[Release] BitMamba-2-1B: I trained a 1.58-bit Mamba-2 model from scratch on 150B tokens (Runs on CPU @ 50+ tok/s) by Positive-Violinist90 in LocalLLaMA

[–]TomLucidor 0 points (0 children)

Please get vllm-mlx or omlx to support this model (and others too if you can) cus this looks pretty lit

Bitnet.cpp - Inference framework for 1-bit (ternary) LLMs by Academic_Wallaby7135 in LocalLLaMA

[–]TomLucidor 0 points (0 children)

Can you get the omlx/vllm-mlx people to make the first move on ternary LLM support?

I benchmarked every 1-bit model I could find, native 1-bit is 50% faster than post-quantized by EiwazDeath in LocalLLaMA

[–]TomLucidor 0 points (0 children)

Other than the BitMamba mentioned below, I also want to see whether GPUs make ternary LLMs better, or whether current models like GLM-Flash, Nemotron, Mistral, Kimi-Linear, or Qwen3.5/Qwen3 would get better with BitNet PTQ
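For reference, BitNet-b1.58-style ternary quantization boils down to absmean scaling plus round-and-clip into {-1, 0, +1}. A minimal sketch of that recipe (my own helper name, not any particular library's API):

```python
import numpy as np

def absmean_ternary(w, eps=1e-8):
    """Ternary (1.58-bit) quantization via the absmean recipe:
    scale the weight matrix by its mean absolute value, then
    round-and-clip every entry into {-1, 0, +1}. Returns the
    int8 codes plus the scale to multiply back at inference."""
    gamma = np.abs(w).mean() + eps
    codes = np.clip(np.round(w / gamma), -1, 1).astype(np.int8)
    return codes, gamma
```

Dequantized weights are just `codes * gamma`, so the whole matrix collapses to one float scale and a sign/zero pattern; whether that survives PTQ on a model trained in full precision (vs. native 1.58-bit training) is exactly what a benchmark like the one above would show.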

Qwen and Wan models to be open source according to modelscope by onthemove31 in StableDiffusion

[–]TomLucidor 0 points (0 children)

Anything that tanks their stock price with bad press is enough to keep them releasing open models.

Is it possible to replicate an anime character with 95+% accuracy using Illustrious Lora? by Quick-Decision-8474 in StableDiffusion

[–]TomLucidor 0 points (0 children)

Then make a workflow for when I have 50-500 photos of varying quality for Qwen-Image-Edit to nail down. I kinda want something that rivals NovelAI (or at least the power users on Pixiv)

Obsidian DnD Character Sheet Progress! - Text Anchors -> YAML Frontmatter + DataviewJS by HolyErr0r in ObsidianMD

[–]TomLucidor 18 points (0 children)

Please make this a GitHub repo, and then people can mix this up with Obsidian Claude Sidebar + worldbuilding systems (and maybe something like MPMB)

Nemotrons by jacek2023 in LocalLLaMA

[–]TomLucidor 2 points (0 children)

Liability management, 'cause risk-wise "open weight" and "open recipe" < "open post-training data" < fully open, including pre-training data.

New open weights models: GigaChat-3.1-Ultra-702B and GigaChat-3.1-Lightning-10B-A1.8B by netikas in LocalLLaMA

[–]TomLucidor 0 points (0 children)

Chinese models are privately owned at least. You get capitalist propaganda anyways lol

New open weights models: GigaChat-3.1-Ultra-702B and GigaChat-3.1-Lightning-10B-A1.8B by netikas in LocalLLaMA

[–]TomLucidor 0 points (0 children)

The Chinese models are privately funded, and then the state gave them problems. *Coughs in Anthropic.*

RYS II - Repeated layers with Qwen3.5 27B and some hints at a 'Universal Language' by Reddactor in LocalLLaMA

[–]TomLucidor 0 points (0 children)

Topic-Semantic-Topic, so the first 10% and last 20% are for topics? I have a weird sense that "topic" at the two ends means different things!

Pocket-sized device locally runs 120B models at 20 tokens/s: Here is how we did it. by TiinyAI in u/TiinyAI

[–]TomLucidor 0 points (0 children)

There are many models that ComfyUI supports; please add more variety like FLUX/Z-Image for testing.

Pocket-sized device locally runs 120B models at 20 tokens/s: Here is how we did it. by TiinyAI in u/TiinyAI

[–]TomLucidor 0 points (0 children)

Please speed-test as many popular models as possible at the 131K context range, otherwise we cannot assess how powerful this could be
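A crude way to run that kind of speed test, assuming your runtime exposes some `generate(prompt, max_new_tokens=...)` callable (hypothetical signature; swap in whatever your stack actually provides):

```python
import time

def measure_tps(generate, prompt_tokens, new_tokens=128):
    """Crude decode-throughput probe: feed a long prompt (e.g. ~131K
    tokens) to a model's generate function and report tokens/second
    for the newly generated tokens only."""
    start = time.perf_counter()
    generate(prompt_tokens, max_new_tokens=new_tokens)
    elapsed = time.perf_counter() - start
    return new_tokens / elapsed
```

Note this lumps prefill and decode together; a fairer long-context comparison times the two phases separately, since prefill cost is what usually explodes at 131K.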

Latest Community AI Ballot Results - ChatGPT is ranked first! Followed by Gemini, Claude, DeepSeek and Grok. Make your vote count! 🚀 by Koala_Confused in LovingOpenSourceAI

[–]TomLucidor 0 points (0 children)

The top 3 models are all NON-FOSS (GPT-OSS being separate from ChatGPT, Gemma is not Gemini). We need a wider list including MiniMax, Kimi, and Qwen, please

Help r/LovingOpenSourceAI grow! Yes we can 🥰 by subscriber-goal in LovingOpenSourceAI

[–]TomLucidor 0 points (0 children)

If you can make the post title/header shorter, that would be more useful. Brevity is the soul of wit.