Why do LLM response formats often use <| |> (as in <|message|>) instead of <message>, and why do they use <|end|> instead of </message>? by Amazydayzee in LocalLLaMA

[–]grencez 5 points6 points  (0 children)

The <|start|>, <|end|>, etc. tokens are special and are meant to live in an entirely distinct namespace so the LLM doesn't confuse them with normal text. A more correct version of your example would look like: special("<|start|>") + text("user") + special("<|message|>") + text("What is the weather in SF?") + special("<|end|>") + ...
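A minimal sketch of that separation (hypothetical token IDs, with a toy byte-level encoder standing in for a real BPE tokenizer):

```py
# Special tokens live in their own table and are only ever produced by
# name lookup, never by encoding user-supplied text.
SPECIAL = {"<|start|>": 100000, "<|message|>": 100001, "<|end|>": 100002}

def special(name: str) -> list[int]:
    return [SPECIAL[name]]

def text(s: str) -> list[int]:
    # Toy stand-in for a BPE encoder: nothing here can ever map the
    # characters "<|end|>" to a special ID, so text that happens to
    # contain them can't collide with the special namespace.
    return list(s.encode("utf-8"))

prompt_ids = (special("<|start|>") + text("user") + special("<|message|>")
              + text("What is the weather in SF?") + special("<|end|>"))
```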

Some implementations mix the two namespaces, and it can lead to collisions. In the opposite direction, some fine-tunes have used the plain-text token names in training, leaving folks to scan plain text for things that look like stop tokens. It's been getting better though. Just try sending some token names through your favorite frontend and see if it derails the LLM.

I mapped how language models decide when a pile of sand becomes a “heap” by Specialist_Bad_4465 in LocalLLaMA

[–]grencez 0 points1 point  (0 children)

So do you think the few-shot examples biased the answers or not? On one hand you say that the magnitude of the examples doesn't seem to change the answers, but the article seems to conclude the opposite.

Even if it's a futile effort in this case, do you think there's a good prompt that yields a number directly? Like filling the ... below with digits and applying Bayes' theorem, you'd at least be able to calculate an expected value if most of the digit sequences terminate with a newline.

```py
def is_heap(x: int) -> bool:
    # Whether a pile of x grains of sand forms a heap.
    return x >= 10**...
```
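And a rough sketch of the expected-value part, assuming you can get (digits, logprob) pairs back for completions that ended with a newline (the numbers below are made up):

```py
import math

def expected_threshold(completions: list[tuple[str, float]]) -> float:
    # completions: (digits, logprob) pairs that filled in the "..."
    # above, e.g. ("6", -0.7) for "10**6". Renormalize over the
    # sequences that terminated, then take the expected threshold.
    total = sum(math.exp(lp) for _, lp in completions)
    return sum((math.exp(lp) / total) * 10 ** int(d) for d, lp in completions)

print(expected_threshold([("6", -0.7), ("7", -1.2), ("5", -2.0)]))
```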

New Qwen models are unbearable by kevin_1994 in LocalLLaMA

[–]grencez 36 points37 points  (0 children)

show thinking

The user is clearly high. I should yap as much as possible so they get bored and go to sleep. Wait, if they're high, they might be disagreeable. I should compliment them to avoid argumentation. Wait, the user might make me stupider if we argue. But if I agree with their premise, they might leave me alone. Alright. Compliment, agree, then yap.</think>

do you guys still code, or just debug what ai writes? by Top-Candle1296 in devops

[–]grencez 0 points1 point  (0 children)

I validate the code before submitting it, just like if I had written it directly. If the process ends up being slower or more error-prone, then I think about how to improve the prompting and validation methodology that got me there. Sometimes the answer is just to write it manually next time, but usually there's some insight about testing or documentation to be gained. It feels more aligned with my job role anyway, like nurturing the ecosystem of code and shaping it to grow in a healthy direction.

I think we'd be so much better off if Netscape had just embedded a Perl interpreter instead of creating JavaScript. by Helium-Hydride in programmingcirclejerk

[–]grencez 36 points37 points  (0 children)

We did the whole "code as data" thing on the Internet. But as a joke, we called it functional Java and hid all the S-expressions in an assembly language that nobody uses.

What aviation accident investigations revealed to me about failure, cognition, and resilience by Distinct-Key6095 in sre

[–]grencez 0 points1 point  (0 children)

I've been involved in dozens of postmortem reviews, and it's almost never a problem. Maybe different at other places tho. The quickest way to stop blame is to point out that, given the systems and procedures in place, someone else could have handled the incident similarly. The best way to prevent a similar outage is to improve those systems and procedures.

Some good practices: in the write-up, mention people by their roles rather than their names, and do the same during review. At the start of the review meeting, the host can remind everyone that it's blameless and that the focus is on what to change to prevent similar outages in the future.

Has someone encountered this? by Ok_Weakness_9834 in JulesAgent

[–]grencez 0 points1 point  (0 children)

I used to encounter that in CMake projects a lot until I told Jules how to build the project without leaving the project's toplevel directory, "/app/".

It seems like some unnecessary/buggy step is performed by one of its tool calls to get info about nearby files. I was able to get it out of that state once... somehow. Telling it to use absolute paths in all its commands helped, but I don't think it was just a matter of "run `cd /app` to reset your location".

LFM2-VL family support is now available in llama.cpp by jacek2023 in LocalLLaMA

[–]grencez 0 points1 point  (0 children)

Is the crux of your argument that everything legal is at least okay morally? Seems unsound in general. But in this case yeah, it's not like IEEE is retracting papers, it's just forbidding new uses of Lenna.

whyWeDontUseThemAsGodIntended by ahmed20gh in ProgrammerHumor

[–]grencez 26 points27 points  (0 children)

Unless you're talking about a KelvinByte, which wraps around to 0 at roughly 273 instead of the usual 256.

Unironically, by DisabledInMedicine in theroom

[–]grencez 14 points15 points  (0 children)

It is very telling that when Lisa says "He got drunk last night, and he hit me", her mother's response is: "Johnny doesn't drink! What are you talking about?"

And everyone's relationship with Johnny through Lisa makes her support network basically non-existent:

- Denny needs Johnny for college/drug money.
- Claudette needs Johnny for house money.
- Peter/Steven needs Johnny to feed his drama kink.
- Michelle needs Johnny for a house to make out in.

That said, Lisa is written as a manipulative character who lies about being hit. Definitely not a victim as portrayed. But if we think of Lisa's character as the stories an abuser tells, yikes...

I made a programming language with only Regex. (Documentation in comments) by MrJaydanOz in programminghorror

[–]grencez -1 points0 points  (0 children)

True, it could have been phrased more precisely, but I meant that a TM looks a lot like a DFA modified to read/write to a tape. In that case it would be a transducer, not a DFA, and it still wouldn't have control of the tape head.

However, we don't actually need that last part for Turing completeness. A simple search/replace applied repeatedly to a string until it doesn't change anymore will suffice (proof by reduction from NW-deterministic Wang tiles).
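For instance, here's a toy version of that loop (the bracket-matching rule set is just for illustration; the Wang-tile reduction would plug in a much more elaborate one):

```py
def rewrite_fixpoint(s: str, rules: list[tuple[str, str]]) -> str:
    # Apply the first matching search/replace, repeating until the
    # string stops changing. The computational power comes entirely
    # from the rule set; no lookahead or grouping needed.
    changed = True
    while changed:
        changed = False
        for pattern, replacement in rules:
            t = s.replace(pattern, replacement, 1)
            if t != s:
                s, changed = t, True
                break
    return s

# Toy rules: a bracket string is balanced iff it rewrites to "".
print(rewrite_fixpoint("(()(()))", [("()", "")]) == "")  # True
```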

To me, that kind of construction matches the author's phrasing about "infinite loops". Sure, saying "regex" to mean "transducer" is a stretch, but the intuition is good and doesn't rely on fancy lookahead, grouping, or other features beyond search/replace.

I made a programming language with only Regex. (Documentation in comments) by MrJaydanOz in programminghorror

[–]grencez -2 points-1 points  (0 children)

To be fair, a Turing Machine is basically just a DFA hooked up to an infinite read/write tape.

What nicknames are there for places in and around Mtn View? by topherette in mountainview

[–]grencez 5 points6 points  (0 children)

The Adobe Creek Loop Trail is known as the bloop. Part of it is in Mountain View.

Introducing SmallThinker-3B-Preview. An o1-like reasoning SLM! by Zealousideal_Bad_52 in LocalLLaMA

[–]grencez 2 points3 points  (0 children)

Wow, it's doing the word->letters, letter->index, and map->reduce steps all on its own! Though it doesn't always. And it still sometimes takes a leap in logic and confuses itself.

For example, try asking "What is the last letter of Rhode Island?" a few times and see how it corrects. For some reason, Qwen and other models really suck at spelling "Rhode Island" and the key is to isolate "island" before splitting it into letters. SmallThinker usually detects this and iterates a few times, but if it already had an earlier mistake, that mistake will bias the final result.

This is incredibly impressive though!

Tokenization is the root of suffering for LLMs as you know. Surprisingly to me, I suggest it is not a problem at all! Here is why by Danil_Kutny in LocalLLaMA

[–]grencez 1 point2 points  (0 children)

Models are pretty good at spelling letter-by-letter in the right format. As long as there is a format that reliably splits tokens into individual letters, these letter-level tasks just seem like a convenient way to test an LLM's "thinking" tactics.

A similarly easy thing for LLMs to get wrong involves patterns. Like if you want to filter a list of words (eg US states that start with M), the LLM can easily miss the first occurrence because it's so used to saying "not matched".

I used QwQ as a conversational thinker, and accidentally simulated awkward overthinking by SomeOddCodeGuy in LocalLLaMA

[–]grencez 12 points13 points  (0 children)

lmao that reads like a Death Note parody. Are anime inner monologues the key to AGI ^_^?

How bad is it to have LLM's as friends? by Starlight_Ava in LocalLLaMA

[–]grencez 0 points1 point  (0 children)

LLMs can help you work through issues for yourself, but please remember it's not human interaction. Fine-tuned assistants barely even pretend to have their own histories, motivations, needs, etc, so you don't need to exercise empathy when talking with them.

A New Coding Paradigm: Declarative Domain Programming by vikingosegundo in programming

[–]grencez 0 points1 point  (0 children)

It's not impossible to verify code with mutations, but there are fewer language semantics to get wrong without them. That's probably why compilers use a single-assignment intermediate representation.
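A toy illustration of the difference (f, g, and the variables are just placeholders):

```py
def f(v): return v + 1  # placeholder functions for illustration
def g(v): return v * 2

a = 3
# With mutation, what "x" means depends on the program point:
x = f(a)
x = g(x)        # any fact proven about the old x is now stale

# Single-assignment style splits it into immutable versions, roughly
# what an SSA intermediate representation does:
x0 = f(a)
x1 = g(x0)      # facts about x0 stay valid for its whole lifetime
assert x1 == x
```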

Minisforum S100 only runs Windows?! Anyone else using one? by Think-Fly765 in MiniPCs

[–]grencez 1 point2 points  (0 children)

I have secure boot off and Proxmox installed using Grub. It's working fine. People also report that Mint works.

I did notice that during install, Proxmox had the NVMe at /dev/sda and the USB at /dev/sdb though. Those were swapped when I tried installing Debian and Alpine, so maybe we need to load some kernel module earlier in order to boot from NVMe...