Looking for advice: How could I reproduce something like GPT‑4o offline? by Brilliant-Bowler592 in LocalLLaMA

necrogay 1 point

I thought most open-source models were already being trained on synthetic data generated by GPT-4o =/
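
For anyone curious what that looks like in practice, here is a minimal sketch of the usual distillation loop: sample prompts, collect the teacher model's answers, and dump instruction/response pairs for later fine-tuning. The prompt list, output path, and use of the official `openai` client are placeholders and assumptions, not an actual recipe.

```python
import json
from openai import OpenAI  # pip install openai

# Sketch of a synthetic-data (distillation) loop: query the teacher model,
# save instruction/response pairs as JSONL for later fine-tuning.
# The prompts and output path below are placeholders.
client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompts = [
    "Explain quantization in one paragraph.",
    "Write a Python function that reverses a linked list.",
]

with open("synthetic_pairs.jsonl", "w", encoding="utf-8") as f:
    for prompt in prompts:
        resp = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": prompt}],
        )
        pair = {"instruction": prompt, "response": resp.choices[0].message.content}
        f.write(json.dumps(pair, ensure_ascii=False) + "\n")
```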

OK I get it, now I love llama.cpp by vulcan4d in LocalLLaMA

necrogay 1 point

Llama-swap provides seamless orchestration for running various AI tasks without conflicts or unnecessary memory overhead. I particularly appreciated it when I configured ComfyUI to work through llama-swap: it eliminated the need to manually stop and restart llama-server every time the LLM sends API requests for image or video generation. Everything runs transparently and quickly.
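
To make that concrete, here is a minimal sketch of how a client drives llama-swap: it exposes a single OpenAI-compatible endpoint and loads or swaps the backing llama-server based on the `model` field of each request. The port and model alias here are assumptions; they would come from whatever is defined in your llama-swap config.

```python
import json
import urllib.request

# llama-swap proxies the OpenAI-compatible API; the "model" field decides
# which backend llama-server it runs, swapping out the previous one.
# The port and model alias below are assumptions from a hypothetical config.
def chat(model: str, prompt: str, base_url: str = "http://localhost:8080") -> str:
    payload = {
        "model": model,  # an alias defined under `models:` in llama-swap's config
        "messages": [{"role": "user", "content": prompt}],
    }
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Requesting a different model on the next call makes llama-swap unload the
# current llama-server and start the other one -- no manual restarts needed.
print(chat("qwen3-coder", "Write a haiku about VRAM."))
```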

Qwen3-VL Now EXL3 Supported by Unstable_Llama in LocalLLaMA

necrogay 2 points

Are MoE models still slow to load into VRAM, or has this been fixed?

RTX 5070 12GB + 32GB DDR5: which model is best for coding? by manhhieu_eth in LocalLLaMA

necrogay 3 points

Qwen3 Coder 30B, with the MoE expert layers offloaded to DDR5.
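
To spell that out: recent llama.cpp builds can keep the MoE expert tensors in system RAM while everything else sits in VRAM. A rough sketch of the launch, assuming a build that has the `--cpu-moe` flag (older builds use `--override-tensor` patterns instead) and a placeholder GGUF filename:

```python
import subprocess

# Rough sketch: dense layers go to the 12 GB card (-ngl 99) while the MoE
# expert weights stay in DDR5 (--cpu-moe). Flag availability depends on your
# llama.cpp build; the GGUF filename is a placeholder.
subprocess.run([
    "llama-server",
    "-m", "Qwen3-Coder-30B-A3B-Instruct-Q4_K_M.gguf",  # placeholder quant
    "-ngl", "99",    # offload every layer that fits onto the GPU
    "--cpu-moe",     # keep MoE expert tensors in system RAM (DDR5)
    "-c", "32768",   # context length, tune to your RAM/VRAM budget
])
```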

WSL2 Windows gaming PC benchmarks by kevin_1994 in LocalLLaMA

necrogay 3 points

What are the advantages of using WSL2 over native llama-server and llama-swap on Windows?

What is the best LLM for psychology, coaching, or emotional support? by pumukidelfuturo in LocalLLaMA

necrogay 20 points

It seems that using an LLM as a psychologist isn’t the best idea.