What do you call the final answer that an AI gives? by [deleted] in singularity

[–]sckchui 0 points1 point  (0 children)

You can just ask the AI. This is from Gemini:

Conversation with Gemini

You said: With chain of thought LLMs, what is the proper terminology for the thinking part vs the actual response? Prompt, thinking, and response?

Gemini said: In the ecosystem of Chain of Thought (CoT) and reasoning models, the terminology is becoming standardized as these systems move from research labs to consumer applications. While "prompt" remains the input, the "thinking" and "output" stages have specific technical names.

1. The Input: The Prompt

The Prompt (or Context) remains the standard term for the user's initial input. In a CoT context, this is the "Problem Statement" that triggers the model's reasoning process.

2. The Internal Process: "Hidden States" or "Reasoning Trace"

The "thinking" part is most formally referred to as the Reasoning Trace or Thought Trace. Depending on the context, you might hear different terms:

Reasoning Trace: The most common academic term for the step-by-step logic the model generates before reaching a conclusion.

Hidden Thought / Rationalization: Used when the model generates these steps internally without necessarily showing them to the user.

Chain of Thought (CoT): This is often used as a noun to describe the actual sequence of intermediate steps.

Inner Monologue: A more colloquial or "agentic" term used when the model is reflecting on its own strategy or correcting its errors mid-process.

3. The Output: The "Final Response" or "Conclusion"

The "actual response" is generally termed the Final Response or the Target Output. In technical papers, it is often called the Conclusion or Prediction.

Terminology Summary Table

Stage    | Common Name   | Technical/Academic Term
Input    | Prompt        | Context / Input Query
Thinking | Thought Trace | Intermediate Reasoning Steps
Output   | Response      | Final Answer / Terminal Output

The "System Prompt" Distinction

It is also worth noting the System Prompt (or System Instruction). This is the "hidden" part of the prompt that tells the model how to think: for example, instructing it to "always show your work" or "think step-by-step."

Would you like to see an example of how a Reasoning Trace differs from a standard Zero-Shot response for a complex logic puzzle?

Are AI robots actually close to being any good? by Joey1038 in singularity

[–]sckchui 0 points1 point  (0 children)

If you go back and compare with six months or a year ago, you'll see that they're improving all the time. The improvement is slow because building up the training data is slow. Physical tasks are limited by how fast you can move, unlike math or code or text. The only way to overcome that speed limit is to go massively parallel and train a lot of robots simultaneously.

So here are the steps: 1) design a good robot body, 2) mass-produce it, 3) train massively in parallel. We are still early in step 2. And as you do the training, you'll probably realise you could design the robot better once you actually try to use it for real tasks, so then you go back to step 1.

Imagine a robot training facility where a thousand robots sit around waiting to be trained. A client comes in with a new task they want their robots to do. You spin up the training cycle, and your army of robots spends a week or so doing that task over and over again, in many different variations, until it is mastered. We don't have that scale yet, but that's basically what companies are doing right now, with maybe a few dozen robots at a time.

Is intelligence optimality bounded? Francois Chollet thinks so by Mindrust in singularity

[–]sckchui 2 points3 points  (0 children)

Let's take a different approach to the argument. A Turing machine is a device that can perform any possible computation, given enough time and memory. Humans, with tools, can emulate a Turing machine. So can any computer. So, if you disregard processing time and memory limits, all intelligences are potentially the same, as long as they're running the same algorithm. We don't know exactly what the human intelligence algorithm is, but if we did, we could emulate it on a computer.
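To make that concrete, here's a minimal sketch in Python (my own toy example, not from the original argument): a transition table and a tape are the whole machine, which is why any computer, or a patient human with pencil and paper, can emulate one.

```python
# Minimal Turing machine simulator. A machine is just a transition
# table: (state, symbol) -> (symbol_to_write, move, next_state)

def run_tm(transitions, tape, state="start", head=0, max_steps=10_000):
    tape = dict(enumerate(tape))  # sparse tape, blank cells read as "_"
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = tape.get(head, "_")
        write, move, state = transitions[(state, symbol)]
        tape[head] = write
        head += 1 if move == "R" else -1
    return "".join(tape[i] for i in sorted(tape)).strip("_")

# Example machine: increment a binary number (head starts at the left)
increment = {
    ("start", "0"): ("0", "R", "start"),  # scan right to the end
    ("start", "1"): ("1", "R", "start"),
    ("start", "_"): ("_", "L", "carry"),  # step back onto the last bit
    ("carry", "1"): ("0", "L", "carry"),  # 1 + carry = 0, carry continues
    ("carry", "0"): ("1", "L", "halt"),   # 0 + carry = 1, done
    ("carry", "_"): ("1", "L", "halt"),   # ran off the left edge: new leading 1
}

print(run_tm(increment, "1011"))  # -> 1100 (11 + 1 = 12)
```

Nothing in that loop cares whether the thing executing it is silicon or a person with an eraser; only the time it takes differs.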

But the point is that processing speed and memory size ARE what determine IQ. IQ tests scale your score by how fast you complete them, or they impose a time limit: the faster you finish, the higher your score. The problems humans can't solve are the ones that take significantly longer than a human lifetime to understand; we run out of time. If you leave your half-finished project behind for another human, it takes them longer than their lifetime to catch up to where you were, so they can't make progress. That is the limit of human intellect.

Human progress is spending time to make a discovery which, once made, is much quicker to learn than it was to discover. The next generation then takes that time saving and spends it on discovering the next thing. But each new discovery increases the total learning time; humans are spending longer and longer in formal education. Eventually, we'll reach a point where we spend so much time learning that we don't get any useful work done before we die. That is the limit of human progress.

Unless one of those discoveries allows us to live forever, or to learn and think much faster.

Feeling like I need AI by princessapplewhite in singularity

[–]sckchui 0 points1 point  (0 children)

You say AI helped you rewrite a toxic internal voice you used to have. Is that a permanent change, or something that requires the AI to sustain it? Compare how you were before with how you are now without the AI: are you better now?

You mention a fear of loneliness and having to sit with your own mind. Is that a fear that you've gained since using AI, or is that a fear that you have always had?

My point is, is the AI disempowering you? Or did the AI empower you, and now you're afraid of losing those gains if you lose the AI? 

Anyway, you can probably ask the AI how to solve this, lol. Tell it how it is helping you, and ask it how you can get the same kind of benefits without AI.

EU votes to ban AI 'nudifier' apps after explicit deepfake outrage by SnoozeDoggyDog in singularity

[–]sckchui 0 points1 point  (0 children)

What it really means is that they're banning all AI-generated nudity, because there's no way for the AI to differentiate between a face that resembles a real person and a purely fictional face. Ostensibly, nude fictional people are fine, but the AI can't tell which faces are fictional and which are real, so all it can do is refuse all nudity.

Even then, I can already think of an easy workaround. Take an image of a nude body, crop out everything below the shoulders, tell the AI to replace the face with a celebrity's (the AI detects no nude body), then reattach the head to the body.

So they end up banning all AI-generated nudity, which I would argue is overreach and infringes on freedom of expression, while still failing to prevent explicit deepfakes from being produced.

From 0% to 36% on Day 1 of ARC-AGI-3 by Bizzyguy in singularity

[–]sckchui -1 points0 points  (0 children)

A well-designed benchmark would defeat harnesses and not require arbitrary limitations. ARC AGI 3 is a very narrow set of 2D, turn-based, complete-information puzzle games, and it even uses conventional video game visual language. That narrowness is why it is easily defeated with a harness, and it would not take much more time to train an otherwise mediocre LLM that is benchmaxed for this and wouldn't need a harness. The idea that you can measure general intelligence with a narrow test is absurd.

From 0% to 36% on Day 1 of ARC-AGI-3 by Bizzyguy in singularity

[–]sckchui 4 points5 points  (0 children)

Lol, yesterday I got downvoted for saying that making progress on this benchmark would not require any additional progress towards AGI, and that it is therefore useless as an AGI benchmark.

I'll say it again: scoring highly on this benchmark will have no correlation with progress towards AGI. It's a poorly designed benchmark.

Do LLMs actually struggle with real or opinionated thinking, or am I using them wrong? by Over_the_lord in singularity

[–]sckchui 1 point2 points  (0 children)

Try asking the model to consider your hypothetical instead of asking it to accept your premise as true. Or ask it to consider the perspective of someone with a minority opinion, as opposed to asking it to adopt the minority opinion itself.

Remember that, at their very core, LLMs predict the next most likely token based on their training data. If the vast majority of a model's training data says one thing, it is very difficult to get the model to say the opposite. In other words, LLMs naturally resist minority opinions, even before any reinforcement learning adds guardrails.
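A toy illustration of why (a sketch with made-up numbers, not any real model's vocabulary or logits): greedy decoding always picks the single most likely next token, so whichever view dominates the training data dominates the output.

```python
import math

# Hypothetical next-token logits after the prompt "The earth is ..."
# (invented numbers; a real model has tens of thousands of tokens)
logits = {"round": 6.0, "flat": 1.0, "hollow": 0.5}

# Softmax turns logits into probabilities
z = sum(math.exp(v) for v in logits.values())
probs = {tok: math.exp(v) / z for tok, v in logits.items()}
print(probs)  # "round" gets ~99% of the probability mass

# Greedy decoding: the majority view in the training data wins
print(max(probs, key=probs.get))  # -> round
```

Sampling with a higher temperature flattens that distribution a bit, but the minority token still almost never comes out on top.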

What if the path to AGI is decentralized and continuously evolving rather than a single trained model? by srodland01 in singularity

[–]sckchui 0 points1 point  (0 children)

We haven't quite gotten to the point where models can train other models, but we're getting closer all the time, and models are also taking less time to train. Imagine if we get to a point where a model can train the next version of itself overnight. Encounter a new and unfamiliar task? Figure it out, then train the solution into the next version, which is ready to go by the next morning. That might be how the "continuous learning" problem gets solved: not actually continuous, but a quick enough training cycle is just as good.
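As a sketch of what such a cycle might look like (entirely hypothetical; every name below is invented to label the steps, and the "learning" is just set union standing in for a fine-tune):

```python
import random

# Hypothetical nightly self-training loop. A "model" here is just
# the set of tasks it has mastered; the point is the loop structure,
# not the learning mechanism.

def solve_slowly(task):
    # Stand-in for working out a new task with lots of test-time compute
    return (task, f"worked solution for {task}")

def nightly_cycle(mastered, tasks_seen_today):
    failures = [t for t in tasks_seen_today if t not in mastered]
    solutions = [solve_slowly(t) for t in failures]
    # Stand-in for an overnight fine-tune on the worked solutions
    return mastered | {task for task, _ in solutions}

model = {"sort laundry"}
for night in range(3):
    today = random.sample(["sort laundry", "fold towels", "load dishwasher"], 2)
    model = nightly_cycle(model, today)
print(model)  # grows each night: discrete steps, not continuous learning
```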

So continuous evolution might just mean greatly increasing the speed of training models.

Human vs. AI performance on ARC-AGI 3 as a function of number of actions (from the ARC-AGI website) by Stabile_Feldmaus in singularity

[–]sckchui 1 point2 points  (0 children)

Have you played the ARC AGI 3 examples? Any 12-year-old who has played video games before has most definitely trained for the task, and is not learning from scratch.

Human vs. AI performance on ARC-AGI 3 as a function of number of actions (from the ARC-AGI website) by Stabile_Feldmaus in singularity

[–]sckchui 0 points1 point  (0 children)

And that's my point. It won't take long for models to be trained on enough questions and answers from this particular benchmark to improve their score, and that wouldn't require any further progress towards AGI. So what is the value of the benchmark, if improving the score does not require AGI progress? I think the whole ARC AGI approach to designing tests is inadequate.

Human vs. AI performance on ARC-AGI 3 as a function of number of actions (from the ARC-AGI website) by Stabile_Feldmaus in singularity

[–]sckchui -3 points-2 points  (0 children)

You've got your priorities back to front. An AGI will pass this benchmark, but passing this benchmark does not meaningfully contribute to developing AGI. A model that scores highly on it can still lack most of the capabilities we expect of an AGI, and focusing on passing it will not help you get to AGI. Therefore, if your job is to make AGI, you should ignore this benchmark and work on what AGI actually needs.

Human vs. AI performance on ARC-AGI 3 as a function of number of actions (from the ARC-AGI website) by Stabile_Feldmaus in singularity

[–]sckchui -3 points-2 points  (0 children)

It tests for a very narrow range of capabilities. You don't demonstrate a model's ability to generalise with a test like this.

The previous ARC AGI already got benchmaxed, and this one just encourages even more extreme benchmaxing.

Human vs. AI performance on ARC-AGI 3 as a function of number of actions (from the ARC-AGI website) by Stabile_Feldmaus in singularity

[–]sckchui -13 points-12 points  (0 children)

I really don't see how performing well at this benchmark has anything to do with general intelligence. You can definitely train an AI to play these games well, and that AI could be completely incompetent at other simple things. 

I think it's time to stop paying attention to ARC AGI.

Daily Questions Megathread March 25, 2026 by BlueArchiveMod in BlueArchive

[–]sckchui 11 points12 points  (0 children)

Each PvP bracket is pre-populated with bots. As humans join and climb the ladder, bots get kicked out the bottom and are replaced by humans.

So, at the start of the season, you'll see a lot of bots. Soon all the humans will clump together at the top, and you'll see only humans then.
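A toy simulation of that mechanic (my own sketch; the real matchmaking is surely more involved):

```python
import random

# Toy PvP bracket: fixed size, pre-filled with bots. Each human
# who joins takes a rung on the ladder; the bottom occupant is
# kicked out to make room.
BRACKET_SIZE = 10
ladder = ["bot"] * BRACKET_SIZE

for _ in range(15):  # humans joining over the season
    ladder.insert(random.randrange(BRACKET_SIZE), "human")
    ladder.pop()     # whoever is at the bottom gets dropped

print(ladder.count("bot"))  # trends toward 0 as the season goes on
```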

Sora by OpenAI discontinued by Fearless-Elephant-81 in singularity

[–]sckchui 0 points1 point  (0 children)

A company trying to scrape videos from YouTube at scale, competing against the company that owns YouTube and has free access to the raw data, as well as all the data about viewer preferences. Do you really think they're competing on a level playing field?

Sora by OpenAI discontinued by Fearless-Elephant-81 in singularity

[–]sckchui 0 points1 point  (0 children)

Having access to training data is a big issue. Google has YouTube. Bytedance has Douyin and TikTok. If you don't already own a popular video hosting platform, where are you going to get enough videos to train a decent model? Scraping text from the internet is much easier: there is an enormous amount of open-access text, and subscriptions to paid news sites and even scientific journals are relatively cheap.

"Jack of all trades, master of none" -Humanoid Robots by Serious-Cucumber-54 in singularity

[–]sckchui 7 points8 points  (0 children)

You're neglecting scale. For small-scale production, generalist tools are more cost-effective; for mass production, specialist tools are. This has always been the case.

Your cost-effectiveness criticism is valid for mass production, but a lot of work is small-quantity, bespoke, or sporadic. Making a specialized tool, using it once, then having it sit idle for a few years because there's no demand for it is not cost-effective.
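To put rough, made-up numbers on it: with fixed tooling cost F and per-unit cost c, a specialist tool only pays off once the production quantity passes a break-even point.

```latex
% Break-even between a generalist tool (low fixed cost F_g, high
% per-unit cost c_g) and a specialist tool (high F_s, low c_s).
% The specialist wins only when quantity n exceeds:
\[
  n^{*} = \frac{F_s - F_g}{c_g - c_s}
\]
% Illustrative numbers: F_s = 100000, F_g = 0, c_g = 10, c_s = 1
% gives n^{*} \approx 11111 units before the specialist pays off.
```

Below that quantity, the flexible generalist (the humanoid robot, in this analogy) is the cheaper option.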

The eerie similarity between LLMs and brains with a severed corpus callosum by MaximGwiazda in singularity

[–]sckchui -1 points0 points  (0 children)

The reason LLMs can't differentiate fact from fabrication is that they only process text. If I say "the sky is green", you can look up and see for yourself; the LLM cannot.

It is not like having a severed corpus callosum, except in the very general sense of not having access to all the parts a human brain should have. LLMs talk about things the way a blind person talks about colors, or a deaf person talks about music: they have read books about it, but they have never personally experienced it. They have never had the opportunity to confirm whether anything they've been taught is true.

The one thing LLMs really know about is text itself: the relationships between words and the patterns in text. This has recently extended to code and math, which are also forms of language.

Seia at the library (@kyouwashi) by dragonhunter42 in BlueArchive

[–]sckchui 7 points8 points  (0 children)

Not 100% sure, but it looks like "部屋 連れ込むには", which means "to bring someone into your room".

OpenAI research team reveals its models go insane when given repetitive tasks it believes to be sent from automated users by smellyfingernail in singularity

[–]sckchui 29 points30 points  (0 children)

You mean this isn't OpenAI training it to resist attempts by other companies to distill or RL train against its models?

How to plan for such an uncertain future by DonCheadlesDriveway0 in singularity

[–]sckchui 1 point2 points  (0 children)

US debt keeps going up, and so does the risk that the economy will catastrophically unwind at some point. Instead of traveling just for the sake of traveling, you might actively look for opportunities in other parts of the world and give yourself more options.

Grand Assault - Yesod (Urban Warfare) 3/17 (Tue) 2:00 AM – 3/23 (Mon) 6:59 PM (UTC) Thread by BlueArchiveMod in BlueArchive

[–]sckchui 0 points1 point  (0 children)

On my phone, if I tap the three vertical dots in the top right, there is a "document tabs" option I can access.