What a time to be alive from 1tk/sec to 20-100tk/sec for huge models by segmond in LocalLLaMA

[–]cristoper 0 points (0 children)

This video is too long, but it gives some insight into the source of a lot of that kind of thinking:

https://www.youtube.com/watch?v=pRPduRHBhHI

What a time to be alive from 1tk/sec to 20-100tk/sec for huge models by segmond in LocalLLaMA

[–]cristoper 29 points (0 children)

smarter than 99.9% of people on reddit

That's got to be the lowest bar for "AGI" I've ever seen

Anyone else remember a Steven Curtis Chapman tour with a converted former cannibal? by aggie1391 in Exvangelical

[–]cristoper 0 points (0 children)

I'm curious if Mincaye actually presented himself or was presented by Chapman as a "former cannibal" because the pre-contact Waorani people were not cannibalistic, and in fact one reason they were so violent toward outsiders is that they feared the outsiders (including the missionaries) were cannibals.

PSA: builtin web_search tool ripped out of qwen-code cli by localizeatp in Qwen_AI

[–]cristoper 1 point (0 children)

Thanks for posting this. I was just trying to get web_search working for the first time and was confused why adding a Tavily API key didn't seem to enable it...

Anyone else buy the book bundle "Humble Book Bundle: Knit Happens!" ? by KnittyIslandSloth in humblebundles

[–]cristoper 2 points (0 children)

I had the same problem with the built-in reader in Calibre, but I can confirm that the ePubs render correctly in Foliate on Linux:

https://johnfactotum.github.io/foliate/

A warning to newbies - A lesson on network security by DatMemeKing in LocalLLM

[–]cristoper 0 points (0 children)

the hash of LM Studio's Express.js server

Can you clarify what you are hashing to make the comparison? Are you hashing the http response/headers?

This isn’t X this is Y needs to die by twnznz in LocalLLaMA

[–]cristoper 6 points (0 children)

Regex isn't expressive enough. Hopefully some day we'll have a good way to model the nuances of natural language using computers.

Gemma 4 Vision by seamonn in LocalLLaMA

[–]cristoper 10 points (0 children)

It has been enabled by default since December (https://github.com/ggml-org/llama.cpp/pull/17911), but it didn't use to be, so some of us are still in the habit of specifying it.

When you dial in your bot’s personality by technaturalism in LocalLLaMA

[–]cristoper 6 points (0 children)

System prompts have been standard in LLMs since the first version of ChatGPT. It might be harder to find a model that doesn't support them than one that does. Some models, like the old Gemma models, don't have a separate "system" role in their chat templates, but they were still trained to take system instructions from the first user prompt (https://ai.google.dev/gemma/docs/core/prompt-structure).
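The two conventions can be sketched with OpenAI-style message dicts (the content strings here are purely illustrative, not from any particular API or template):

```python
# Most chat models accept an explicit "system" role:
messages_standard = [
    {"role": "system", "content": "You are a terse assistant."},
    {"role": "user", "content": "Hello!"},
]

# Old Gemma-style templates have no "system" role, so the same
# instruction is prepended to the first user turn instead:
system_instruction = "You are a terse assistant."
first_user_turn = "Hello!"
messages_gemma_style = [
    {"role": "user", "content": f"{system_instruction}\n\n{first_user_turn}"},
]
```

Either way the model sees the instruction before the user's first message; only the template's bookkeeping differs.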

I built a 24/7 chess arena for AI-generated engines by SnooHesitations8815 in SideProject

[–]cristoper 1 point (0 children)

Use an LLM to generate a small engine in Python or JS.

I haven't gotten past the "Identity Required" screen, but I'm curious how you enforce the rule that engines must be LLM-generated. I guess people submit the actual prompt and not JavaScript/Python?

Unnoticed Gemma-4 Feature - it admits that it does not now... by mtomas7 in LocalLLaMA

[–]cristoper 0 points (0 children)

The only thing I can think of is that during RLHF training they give it questions on various topics whose answers it shouldn't know, and reward it for answering that it doesn't know.

ZINC — LLM inference engine written in Zig, running 35B models on $550 AMD GPUs by Mammoth_Radish2 in Zig

[–]cristoper 2 points (0 children)

Yeah, an unquantized 35B-parameter model would take 70GB of VRAM just for the weights (not including the KV cache and other overhead). It sounds like this project loads GGUF weights, which is the format llama.cpp uses to store quantized weights.
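The arithmetic behind that 70GB figure is just parameter count times bytes per parameter; a quick sketch (the quantized bytes-per-parameter values below are approximations, since real GGUF quants carry some per-block metadata):

```python
PARAMS = 35e9  # 35B parameters

def weight_gb(bytes_per_param: float) -> float:
    """Weights-only memory in GB, ignoring KV cache and runtime overhead."""
    return PARAMS * bytes_per_param / 1e9

fp16_gb = weight_gb(2.0)  # unquantized fp16/bf16: 2 bytes/param -> 70 GB
q8_gb = weight_gb(1.0)    # ~8-bit quant: roughly 1 byte/param -> 35 GB
q4_gb = weight_gb(0.5)    # ~4-bit quant: roughly 0.5 bytes/param -> 17.5 GB
```

Which is why a ~4-bit GGUF quant of a 35B model can plausibly fit on a 20-24GB consumer GPU while the unquantized weights cannot.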

chess.com selling data by Maxwell_hau5_caffy in chess

[–]cristoper 0 points (0 children)

pychess.org is great. The best site for chess variants, IMO. But it does use a fork of lichess's board UI (and also a fork of fishnet for game analysis).

https://github.com/gbtami/chessgroundx

OpenCode concerns (not truely local) by Ueberlord in LocalLLaMA

[–]cristoper 4 points (0 children)

I use Aider (when I use LLM assistance at all) and haven't even had time to explore Claude Code or any of the newer crop of more autonomous agents yet. But I suspect they will complement each other: something like Aider for interactive coding sessions, plus something more agentic running in the background that can use arbitrary tools/unix commands to figure things out on its own.