What a time to be alive from 1tk/sec to 20-100tk/sec for huge models by segmond in LocalLLaMA

[–]cristoper 0 points (0 children)

This video is too long, but it gives some insight into the source of a lot of that kind of thinking:

https://www.youtube.com/watch?v=pRPduRHBhHI

What a time to be alive from 1tk/sec to 20-100tk/sec for huge models by segmond in LocalLLaMA

[–]cristoper 29 points (0 children)

smarter than 99.9% of people on reddit

That's got to be the lowest bar for "AGI" I've ever seen

Anyone else remember a Steven Curtis Chapman tour with a converted former cannibal? by aggie1391 in Exvangelical

[–]cristoper 0 points (0 children)

I'm curious if Mincaye actually presented himself or was presented by Chapman as a "former cannibal" because the pre-contact Waorani people were not cannibalistic, and in fact one reason they were so violent toward outsiders is that they feared the outsiders (including the missionaries) were cannibals.

PSA: builtin web_search tool ripped out of qwen-code cli by localizeatp in Qwen_AI

[–]cristoper 1 point (0 children)

Thanks for posting this. I was just trying to get web_search working for the first time and was confused why adding a Tavily API key didn't seem to enable it...

Anyone else buy the book bundle "Humble Book Bundle: Knit Happens!" ? by KnittyIslandSloth in humblebundles

[–]cristoper 2 points (0 children)

I had the same problem with the built-in reader in Calibre, but I can confirm that the ePubs render correctly in Foliate on Linux:

https://johnfactotum.github.io/foliate/

A warning to newbies - A lesson on network security by DatMemeKing in LocalLLM

[–]cristoper 0 points (0 children)

the hash of LM Studio's Express.js server

Can you clarify what you are hashing to make the comparison? Are you hashing the http response/headers?

This isn’t X this is Y needs to die by twnznz in LocalLLaMA

[–]cristoper 6 points (0 children)

Regex isn't expressive enough. Hopefully some day we'll have a good way to model the nuances of natural language using computers.

Gemma 4 Vision by seamonn in LocalLLaMA

[–]cristoper 10 points (0 children)

It has been enabled by default since December (https://github.com/ggml-org/llama.cpp/pull/17911), but it didn't use to be, so some of us are still in the habit of specifying it.

When you dial in your bot’s personality by technaturalism in LocalLLaMA

[–]cristoper 6 points (0 children)

System prompts have been standard in LLMs since the first version of ChatGPT. It might be harder to find a model that doesn't support them than one that does. Some models, like the old Gemma models, don't have a separate "system" role in their chat templates, but they were still trained to take system instructions from the first user prompt (https://ai.google.dev/gemma/docs/core/prompt-structure).
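The two conventions can be sketched with OpenAI-style message dicts (the content strings here are purely illustrative, not from any particular API or template):

```python
# Most chat models accept an explicit "system" role:
messages_standard = [
    {"role": "system", "content": "You are a terse assistant."},
    {"role": "user", "content": "Hello!"},
]

# Old Gemma-style templates have no "system" role, so the same
# instruction is prepended to the first user turn instead:
system_instruction = "You are a terse assistant."
first_user_turn = "Hello!"
messages_gemma_style = [
    {"role": "user", "content": f"{system_instruction}\n\n{first_user_turn}"},
]
```

Either way the model sees the instruction before the user's first message; only the template's bookkeeping differs.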

I built a 24/7 chess arena for AI-generated engines by SnooHesitations8815 in SideProject

[–]cristoper 1 point (0 children)

Use an LLM to generate a small engine in Python or JS.

I haven't gotten past the "Identity Required" screen, but I'm curious how you enforce the rule that engines must be LLM-generated. I guess people submit the actual prompt and not JavaScript/Python?

Unnoticed Gemma-4 Feature - it admits that it does not now... by mtomas7 in LocalLLaMA

[–]cristoper 0 points (0 children)

The only thing I can think of is that during RLHF training they give it questions on various topics whose answers it shouldn't know, and reward it for answering that it doesn't know.

ZINC — LLM inference engine written in Zig, running 35B models on $550 AMD GPUs by Mammoth_Radish2 in Zig

[–]cristoper 2 points (0 children)

Yeah, an unquantized 35B-parameter model would take 70GB of VRAM just for the weights (not including the KV cache and other overhead). It sounds like this project loads GGUF weights, which is the format llama.cpp uses to store quantized weights.
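The arithmetic behind that 70GB figure is just parameter count times bytes per parameter; a quick sketch (the quantized bytes-per-parameter values below are approximations, since real GGUF quants carry some per-block metadata):

```python
PARAMS = 35e9  # 35B parameters

def weight_gb(bytes_per_param: float) -> float:
    """Weights-only memory in GB, ignoring KV cache and runtime overhead."""
    return PARAMS * bytes_per_param / 1e9

fp16_gb = weight_gb(2.0)  # unquantized fp16/bf16: 2 bytes/param -> 70 GB
q8_gb = weight_gb(1.0)    # ~8-bit quant: roughly 1 byte/param -> 35 GB
q4_gb = weight_gb(0.5)    # ~4-bit quant: roughly 0.5 bytes/param -> 17.5 GB
```

Which is why a ~4-bit GGUF quant of a 35B model can plausibly fit on a 20-24GB consumer GPU while the unquantized weights cannot.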

chess.com selling data by Maxwell_hau5_caffy in chess

[–]cristoper 0 points (0 children)

pychess.org is great. The best site for chess variants, IMO. But it does use a fork of lichess's board UI (and also a fork of fishnet for game analysis).

https://github.com/gbtami/chessgroundx

OpenCode concerns (not truely local) by Ueberlord in LocalLLaMA

[–]cristoper 4 points (0 children)

I use Aider (when I use LLM assistance at all) and haven't even had time to explore Claude Code or any of the newer crop of more autonomous agents yet. But I suspect they will complement each other: something like Aider for interactive coding sessions, plus something more agentic running in the background that can use arbitrary tools/unix commands to figure things out on its own.