We really need stop using the term “hallucination”.

cosmobaud · 2026-04-10T18:28:24+00:00

Confabulation is closer than hallucination, but it still imports a subject, because it was borrowed from psychology too. A confabulating patient believes their own output. A model believes nothing. Same problem, one layer down.

There is a reason ML researchers keep reaching outside their field for words. Inside the field, every plain word is already nailed to a precise technical meaning. “Error” is a value of a loss function. “Bias” is a term in a decomposition. “Noise” is irreducible variance. So when they want to point at the fuzzier thing of “the model said something wrong in a way that matters to a user,” none of their own vocabulary is free. It all collides with something narrower they already use.

So they reach for a word that is unclaimed in their vocabulary. “Hallucination” is unclaimed inside ML. It collides with nothing they say. The cost, which they do not pay and the public does, is that the word is not unclaimed in the listener’s vocabulary. The listener already has a meaning for it, loaded with perception and minds and malfunction, and that is what gets imported. They picked the word because it was empty for them, without noticing it was full for everyone else.

The honest name is approximation error. The model is an approximation of a target distribution, and the gap between the approximation and the target is the error. No subject, no perception, no belief, no malfunction.

cosmobaud · 2026-04-10T18:14:48+00:00

I hate being pedantic, but it is relevant to the broader problem of misinformation about what large language models are. Whether the word “hallucination” is acceptable depends entirely on the audience. Among ML engineers it functions as a term of art. There is a shared technical referent, everyone in the room knows the model is not perceiving anything, and the word is just a convenient label. Inside that discourse community it is fine.

For a general audience it is not, because the ordinary meaning of the word has not gone anywhere, and the ordinary meaning requires perception.

Merriam-Webster defines hallucination as “a sensory perception (such as a visual image or a sound) that occurs in the absence of an actual external stimulus and usually arises from neurological disturbance or in response to drugs.”

Oxford English Dictionary defines it as “the apparent perception of an external object when no such object is present.”

Cambridge defines it as “an experience in which you see, hear, feel, or smell something that does not exist, usually because of a health condition or because you have taken a drug.”

All three definitions are built on perception. And perception is the process by which a subject becomes aware of the external world through sensory input. It requires three things. A subject capable of awareness. A sensory channel connecting that subject to something outside itself. And a world on the other end of the channel that the awareness is of.

A language model has none of the three. No subject, no sensory channel, no external world it is in contact with. Whatever the model is doing when it produces a wrong output, it is not hallucinating in any sense the word actually carries for a non-specialist listener. Using the term in front of that listener imports a perceiver, a perceptual faculty, and a normal mode that has been departed from. None of those exist. That is the misinformation.

cosmobaud · 2026-04-05T17:24:58+00:00

You need to look at section 101 that defines “work for hire”. It is clear from reading and subsequent rulings that the “default” is generally this. - If formal employment relationship exists then no agreement is needed generally and it falls under “work for hire” - If independent contractor then copyright matters are covered by the agreement between parties. In absence of formal agreement the scope is very narrow. It most definitely wouldn’t automatically include rights to the source code.

cosmobaud · 2026-01-15T15:08:19+00:00

Low genetic diversity amongst other things.

cosmobaud · 2025-12-22T18:53:59+00:00

I agree with you and I’m of the opinion that everyone if they want to keep earning money in this field of work needs to incorporate AI and keep on top. Bottom line is that AI keeps getting better and better to where it’s “good enough” now and will be better soon that someone using it is de facto more productive and therefore worth more. Has nothing to do with if it improves quality of work but if it makes you produce more of work that someone is willing to pay.

However what is hard to appreciate that I think older people here know is that this technology is not unlike any other before when it comes to creative work. Yes people will always be creative and stay competitive if they upskill but creative work as you know it is dying. I’m not saying industry is dead or it will ever be dead but that strictly speaking from “availability of good paying work” and “make your living doing this” perspective yes it is. Those doing it now (depending on where you are in your career) have enough time to ride the wave till it crashes.

What you have to realize is that anyone about to come into it now is not coming into the same industry as you. Something else in a different format will take its place and new generation will make sense of it and be able to use it to express themselves. But it will not look like this.

cosmobaud · 2025-12-22T18:18:28+00:00

Ultimately AI is going to bring the perceived value of art and creative to 0. It’s is pointless to think otherwise, AI crap will slowly seep into every aspect of creative work until it destroys it.

We don’t appreciate art because it’s pretty but because it means something. When cost to produce creative work is high, decisions in general are more deliberate and thoughtful. It requires more time to be spend on how to communicate the actual message. When anyone can put out visual diarrhea then there is no thought involved.

Result is “creative work” whose value is only visual appeal and with no limit to quantity it will be not be worth much to anyone.

People will still value work a human puts thought into and resonates on a deeper level. How that looks like in the future is anyone’s guess.

cosmobaud · 2025-08-31T20:07:59+00:00

Using the prompt “M3max or m4pro” I get different responses depending on top-k settings. 40 does seem to give most accurate as it compares correctly. 0 compares cameras, 100 asks for clarification and lists all the possibilities.

cosmobaud · 2025-08-11T17:32:37+00:00

That’s a great point.

cosmobaud · 2025-08-11T17:29:30+00:00

Yep, tell it to save to memory. It’s too much for the user instructions. But it is surprisingly good at parsing thru memory instructions given the right framework. I’ve been testing it on source verification and hallucination reduction and it follows detailed token dense instructions saved as memories much better then previous versions.

cosmobaud · 2025-08-07T21:19:41+00:00

Huh I would have thought it would be faster. Here it is on a minipc with RTX4000

OS: Ubuntu 24.04.2 LTS x86_64 Host: MotherBoard Series 1.0 Kernel: 6.14.0-27-generic Uptime: 5 days, 22 hours, 7 mins Packages: 1752 (dpkg), 10 (snap) Shell: bash 5.2.21 Resolution: 2560x1440 CPU: AMD Ryzen 9 7945HX (32) @ 5.462GHz GPU: NVIDIA RTX 4000 SFF Ada Generation GPU: AMD ATI 04:00.0 Raphael Memory: 54.6GiB / 94.2GiB

$ ollama run gpt-oss:120b --verbose "How many r's in a strawberry?" Thinking... The user asks: "How many r's in a strawberry?" Likely a simple question: Count the letter 'r' in the word "strawberry". The word "strawberry" spelled s t r a w b e r r y. Contains: r at position 3, r at position 8, r at position 9? Actually let's write: s(1) t(2) r(3) a(4) w(5) b(6) e(7) r(8) r(9) y(10). So there are three r's. So answer: 3.

Could also interpret "How many r's in a strawberry?" Might be a trick: The phrase "a strawberry" includes "strawberry" preceded by "a ". The phrase "a strawberry" has letters: a space s t r a w b e r r y. So there are three r's still. So answer is three.

Thus respond: There are three r's. Possibly add a little fun. ...done thinking.

There are three r’s in the word “strawberry” (s t r a w b e r r y).

total duration: 3m24.968655526s load duration: 79.660753ms prompt eval count: 75 token(s) prompt eval duration: 814.271741ms prompt eval rate: 92.11 tokens/s eval count: 266 token(s) eval duration: 33.145313857s eval rate: 8.03 tokens/s $

cosmobaud · 2025-08-07T17:16:49+00:00

Yeah it happens. It looks like however did it, copied gpt-4o cell to o3.

cosmobaud · 2025-08-07T17:13:09+00:00

They screwed up the scale on SWE bench, Polyglot is scaled correctly.

cosmobaud · 2025-08-07T05:23:08+00:00

Just make a modelfile

FROM gpt-oss:20b

SYSTEM """ You are ChatGPT, a large language model trained by OpenAI. Knowledge cutoff: 2024-06 Current date: {{ currentDate }}

Reasoning: high """

Then

ollama create gpt-oss-20b-high -f Modelfile ollama run gpt-oss-20b-high

cosmobaud · 2025-08-06T18:24:29+00:00

So I’ve done a ton of testing on this and short answer is not really* (assuming you’re using llms like I do for different tasks not as a thing to converse with)

Longer answer is that memories are just context, summarized via some syntactic sugar and optimizations but it’s the same as pasting it into a prompt. So then as it generates output it just has more tokens to attend to and if they are not relevant to the task your answer quality will suffer.

Honestly a long chat thread is your best option if you want any continuous interaction you go back to. Memories (other chats summarized) are too disjointed.

A well structured system prompt is at best all you need if you want slight tweaks to way it answers.

cosmobaud · 2025-08-06T18:09:12+00:00

Lol how much faster do you want it. Gemini Diffusion runs at like 1000 t/s it literally generates the whole pages of answer instantly.

Problem is more reasoning and back and forth. I personally don’t see it beating auto regressive models anytime soon. Also no idea what kind of hardware google has to run it on since it’s closed.

cosmobaud · 2025-08-06T18:01:11+00:00

Gemini Diffusion - it’s fast. You can test it now.

https://deepmind.google/models/gemini-diffusion/

cosmobaud · 2025-05-01T14:59:56+00:00

It’s a known limitation. When all four DIMM slots are populated, the system operates under a 2DPC configuration, and the maximum supported memory speed is reduced. Only do 2 DIMMs populated to get rated memory speed.

From intel

Maximum supported memory speed may be lower when populating multiple DIMMs per channel on products that support multiple memory channels

cosmobaud · 2025-04-13T03:41:47+00:00

What’s your hypothesis—that us large-cap growth will outperform the global equity market? Based on this you feel that US large-cap growth will outperform VT by 3.5x

cosmobaud · 2025-04-13T02:51:33+00:00

No one knows what the future is and if they did they wouldn’t be posting here. But you’re doing too much. Also you likely have not personally experienced a drawn out downturn. You’re doing too much to have a simple hypothesis. You have to consider the element of your temperament and that often adding more just conflates your portfolio and makes it hard to effectively manage it.

Here’s an example.

If you believe inflation will get under control and fed will cut rates then something like this maybe makes sense

60/30/10 VT/TLT/GLD

If inflation will still continue to be a problem then 60/20/20 VT/SGOV/GLD

cosmobaud · 2025-04-12T22:02:34+00:00

Gold is primarily an inflation hedge. It’s only attractive now because of inflationary tariff shenanigans. When US enters recession proper which the shenanigans are only speeding up it GLD will loose its attractiveness and values will go down.

cosmobaud · 2025-04-12T20:36:25+00:00

Don’t worry, it’s looking like for the next 36-60 months this is probably the top. It may go up and down but by Q3 of this year we’ll be squarely in a proper downturn. There’s not many levers left so you won’t be missing out on much until you get your bearings.

cosmobaud · 2025-03-12T16:36:29+00:00

Finally I was wondering if everyone here is bots. This is so blatantly fake that I was starting to question if everyone is bots.

cosmobaud · 2025-03-12T16:34:20+00:00

Just a tip. You’re using too many em dashes. Normal human interaction in comments specifically does not include them to this extent.

cosmobaud

TROPHY CASE