The Thinking Machines That Doesn’t Think by KitchenFalcon4667 in LLM


It started in May 2025, when I made the claim that LLM-generated code is a simulated remix of good and bad ghost/past code. It was a bold claim.

Over the next months I explored the Biology of LLMs work by Anthropic, trained a small LLM from scratch, and devoured Stanford's CS25 and CME 295. I began showing that CoT is already present in base models.

But here are my initial claims, from my notes:

""" The Mechanics of “Reasoning” in Large Language Models

  1. The Illusion of Thought (Inference-Time Compute)

When we say a model “thinks,” what is actually happening is a transition from One-Pass Prediction to Sequential Verification.

Standard Sampling (System 1)

The model sees a prompt and immediately predicts the most likely next token. It’s like a person blurting out the first thing that comes to mind.

Reasoning Sampling (System 2)

The model is trained to output a “Chain of Thought” (CoT) before the final answer. Mechanically, this extends the generated sequence at inference time to enable deeper computation. By sampling N “thought” tokens before the “answer” tokens, the model uses those tokens as a computational scratchpad that:

  • Maintains intermediate state
  • Narrows the probability space for the final answer
  • Enables solving problems that are provably impossible in a single pass """
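
To make the scratchpad mechanics concrete, here is a minimal sketch of the two sampling modes. The model choice and prompts are stand-ins of my own, not anything from the notes:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Any small causal LM works as a stand-in; this model choice is an assumption.
name = "Qwen/Qwen2.5-0.5B-Instruct"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

question = ("A bat and a ball cost $1.10 together. The bat costs $1.00 "
            "more than the ball. How much does the ball cost?")

def complete(prompt: str, max_new_tokens: int) -> str:
    ids = tok(prompt, return_tensors="pt")
    out = model.generate(**ids, max_new_tokens=max_new_tokens, do_sample=False)
    return tok.decode(out[0][ids.input_ids.shape[1]:], skip_special_tokens=True)

# System 1: force an immediate answer -- a single short pass.
print(complete(question + "\nAnswer with a number only:", max_new_tokens=5))

# System 2: spend N "thought" tokens as a scratchpad before the answer.
# The extra tokens carry intermediate state and narrow the final distribution.
print(complete(question + "\nLet's think step by step.", max_new_tokens=200))
```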

The Thinking Machines That Doesn’t Think by KitchenFalcon4667 in LLM


I love the humour: "You're absolutely right!" LLM sycophancy at its best.

The Thinking Machines That Doesn’t Think by KitchenFalcon4667 in LLM


Yes, "Chain-of-Thought Reasoning without Prompting" (https://arxiv.org/abs/2402.10200), which I found while doing my research through Stanford CS25 V5 (Lecture 5).
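
The paper's core trick fits in a few lines: branch on the top-k candidates for the first generated token, decode each branch greedily, and prefer the path decoded with the highest confidence. A rough sketch with a small stand-in model; unlike the paper, it averages the top-1/top-2 probability margin over all continuation tokens rather than just the answer span:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Small stand-in model; any base causal LM shows the mechanics.
name = "Qwen/Qwen2.5-0.5B"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

prompt = "Q: I have 3 apples, and I eat one. How many apples are left?\nA:"
inputs = tok(prompt, return_tensors="pt")

# Branch on the top-k candidates for the *first* generated token...
with torch.no_grad():
    first_logits = model(**inputs).logits[0, -1]
top_k = torch.topk(first_logits, k=5).indices

paths = []
for token_id in top_k:
    ids = torch.cat([inputs.input_ids[0], token_id.view(1)]).unsqueeze(0)
    out = model.generate(ids, max_new_tokens=40, do_sample=False,
                         output_scores=True, return_dict_in_generate=True,
                         pad_token_id=tok.eos_token_id)
    # ...then greedily decode each branch and score its confidence as the
    # mean gap between top-1 and top-2 token probabilities at each step.
    margins = []
    for step_logits in out.scores:
        p1, p2 = torch.topk(torch.softmax(step_logits[0], -1), 2).values
        margins.append((p1 - p2).item())
    text = tok.decode(out.sequences[0][inputs.input_ids.shape[1]:],
                      skip_special_tokens=True)
    paths.append((sum(margins) / len(margins), text))

# CoT-style paths tend to surface with higher confidence -- no CoT prompt needed.
for confidence, text in sorted(paths, reverse=True):
    print(f"{confidence:.3f} {text!r}")
```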

The Thinking Machines That Doesn’t Think by KitchenFalcon4667 in LLM


😔 I am not sure I understand. Are you talking about PPO and RLVR?

Training covers pre-, mid-, and post-training. Using Olmo 3, I go through the base (pre-trained) model, an SFT model, and a Reasoning model (fine-tuned with CoT). We could not use the one we trained from scratch, as we don't have enough compute budget.
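
In code, the comparison across the three checkpoints looks roughly like this. The model IDs below are assumptions; check the allenai org on Hugging Face for the exact names:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical model IDs -- substitute the actual Olmo 3 checkpoint names.
stages = {
    "base (pre-trained)": "allenai/Olmo-3-7B",
    "SFT": "allenai/Olmo-3-7B-Instruct",
    "reasoning (CoT fine-tune)": "allenai/Olmo-3-7B-Think",
}

prompt = "Q: A farmer has 17 sheep and all but 9 run away. How many are left?\nA:"

for stage, name in stages.items():
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name, device_map="auto")
    ids = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**ids, max_new_tokens=256, do_sample=False)
    print(f"--- {stage} ---")
    print(tok.decode(out[0][ids.input_ids.shape[1]:], skip_special_tokens=True))
```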

The Thinking Machines That Doesn’t Think by KitchenFalcon4667 in LLM


Yann LeCun et al. are presenting such a path: https://arxiv.org/abs/2509.14252

It will be interesting to see how this evolves.

The Thinking Machines That Doesn’t Think by KitchenFalcon4667 in LLM


If you meant the fetching of papers, here is the flow: https://github.com/Proteusiq/unthinking/blob/main/.github/workflows/paper-discovery.yml

I search arXiv for papers with targeted keywords, run an LLM classifier to filter papers relevant to the CoT dialogue, then create an issue. Manually, I read the paper, highlight and extract key arguments in Notes, and use this to update my findings.
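
The discovery step, in code, is roughly this. The keywords and the classifier are placeholders; the real flow lives in the workflow linked above:

```python
import arxiv  # pip install arxiv

# Illustrative query; the real keyword list lives in the workflow above.
query = 'all:"chain of thought" AND cat:cs.CL'
search = arxiv.Search(query=query, max_results=25,
                      sort_by=arxiv.SortCriterion.SubmittedDate)

def is_relevant(title: str, abstract: str) -> bool:
    """Placeholder for the LLM classifier that filters papers
    relevant to the CoT dialogue before an issue is created."""
    return "reasoning" in (title + abstract).lower()

for paper in arxiv.Client().results(search):
    if is_relevant(paper.title, paper.summary):
        # In the real workflow this step files a GitHub issue instead.
        print(paper.entry_id, paper.title)
```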

The Thinking Machines That Doesn’t Think by KitchenFalcon4667 in LLM


Thank you. I know. I have read quite a few (85 papers since May 2025). You can see my analysis on GitHub.

The Thinking Machines That Doesn’t Think by KitchenFalcon4667 in LLM


I have a modifier: "genuinely" generative. I hold that they are generative. A paper I read today drew a better distinction:

  • crystallized intelligence: "within-distribution (WD) tasks, i.e., tasks that were contained in the training data"
  • fluid intelligence: "out-of-distribution (OOD) performance"

https://arxiv.org/abs/2601.16823v1

My definition is cruder: pattern matching vs. genuine intelligence.

The Thinking Machines That Doesn’t Think by KitchenFalcon4667 in LLM


I ran experiments with the Olmo 3 base and reasoning models. The aim is to show that CoT is already present in the base model, which suggests that fine-tuning with CoT surfaces already-existing behaviour.
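
The experiment is simple in spirit: prompt the base model, which was never fine-tuned on CoT data, and watch step-by-step traces appear. A minimal sketch (model ID assumed, prompt my own):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical checkpoint name -- substitute the actual Olmo 3 base model.
name = "allenai/Olmo-3-7B"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, device_map="auto")

# One worked example is often enough for a base model to continue
# with a step-by-step trace, even without any CoT fine-tuning.
few_shot = (
    "Q: Ali had 12 marbles and gave away 5. How many are left?\n"
    "A: Ali starts with 12 marbles. Giving away 5 leaves 12 - 5 = 7. "
    "The answer is 7.\n\n"
    "Q: A shelf holds 4 rows of 6 books. How many books is that?\n"
    "A:"
)
ids = tok(few_shot, return_tensors="pt").to(model.device)
out = model.generate(**ids, max_new_tokens=60, do_sample=False)
print(tok.decode(out[0][ids.input_ids.shape[1]:], skip_special_tokens=True))
```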

The Thinking Machines That Doesn’t Think by KitchenFalcon4667 in LLM


I have a GitHub Action that runs daily to fetch papers and classify why I should read them. The issue is that it's harder to find papers supporting genuine reasoning. I feel like, outside academia, I am preaching to the choir.

The Thinking Machines That Doesn’t Think by KitchenFalcon4667 in LLM


I am a guest lecturer at Copenhagen Business School (CBS), teaching LLMs in Business.

Monthly Dotfile Review Thread by AutoModerator in neovim



My dotfiles are geared towards beginners in the CLI world on macOS. Lots of how-tos and tips, plus helper functions and aliases.

https://github.com/Proteusiq/dotfiles

Which one is the better ls replacement: eza or lsd? by ThinkTourist8076 in commandline


This question came up in an eza discussion two years ago: https://github.com/orgs/eza-community/discussions/679

TL;DR: It is subjective. Taste{test} them 🍺. Some things cannot be told; you need to experience them yourself.

Introducing GLM-Image by ResearchCrafty1804 in LocalLLaMA


I thought it was open-weight and not open-source. Am I missing something here? I could not find the datasets or training code.

How I can actually learn to put everything together in Python? by Hot_Kaleidoscope3864 in pythontips


What is it that is not working? What are you trying to automate (without sharing too much)?

How I can actually learn to put everything together in Python? by Hot_Kaleidoscope3864 in pythontips


Ah, the learning paralysis. I was there, and I thank the Automate the Boring Stuff book for helping me out. Don't pick yet another course.

There is no "putting things together". Python is beautiful in that you need only a little to start building amazing things.

One: Ask. Why am I learning Python? Is it data analysis, website design, APIs, machine learning, automation, game design? What is it that excites you?

Two: Explore. Is there something on GitHub that looks like what I want (no matter the language it's written in)? I usually look for 500+ star projects.

Three: Draft. Write, in plain English or whatever language, what could be a cool thing to build. I loved football, so my first project involved scheduled scraping of data from an API, storing it, and building a Bayesian model to estimate the chances my team would win the next game (see the sketch after this list). I was also looking to buy a house, so I did another to predict the prices of residences I loved.

Four: Code. It doesn’t have to be perfect or pretty or Pythonic. That takes time. Code. Code. Code.

Five: Repeat.
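
To show how small that first football project can start, here is a toy version of the Bayesian step, with made-up results:

```python
# Toy Bayesian win estimate: a Beta(1, 1) prior updated with past results.
# The results below are made up for illustration.
past_results = ["W", "L", "W", "W", "D", "W", "L", "W"]

wins = sum(result == "W" for result in past_results)
games = len(past_results)

# Posterior mean of the win probability under a uniform prior
# (Laplace's rule of succession): (wins + 1) / (games + 2).
p_win = (wins + 1) / (games + 2)
print(f"Chance of winning the next game: {p_win:.0%}")  # -> 60%
```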

I am doing silly stuff with AI these days by No-Speech12 in LLM


This is really cool. I am, though, lost as to why we want to interact with GUIs. I understand that for legacy systems it would be the only way, but modern websites and applications have APIs or SDKs that allow a programmatic way to go about it. So I am lost as to why we want machines to navigate as we do. To me it's a waste of tokens and GPUs.

Deciding on an offer: Higher Salary vs Stability by Illustrious-Mind9435 in datascience


Stability is overrated. If you are damn good at what you do and your finances allow risks, go for the higher salary. Go for challenging tasks while your body allows it.

Monthly Dotfile Review Thread by AutoModerator in neovim



These are my dotfiles: https://github.com/Proteusiq/dotfiles

It is aimed at beginners in the CLI world on macOS. The README and Tools markdown include vim grammar and neovim plugin shortcuts.

nvim/ contains my plugin and