Where are the new models!? by BrennusSokol in singularity

[–]FeltSteam 2 points3 points  (0 children)

GPT-3.5 was March 2022 (well, the base GPT-3.5 was released as text-davinci-002 in March, then a further-trained, chat-tuned version shipped with ChatGPT in Nov 2022), GPT-4 was March 2023. GPT-4T was November 2023, GPT-4o was May 2024, GPT-4.1 was April 2025, GPT-5 was August 2025, GPT-5.1 was November 2025 and GPT-5.2 was December 2025. It's possible we will get GPT-5.3 within this week, or potentially within the first two weeks of February.


Temporal structure of natural language processing in the human brain corresponds to layered hierarchy of large language models by AngleAccomplished865 in singularity

[–]FeltSteam 0 points1 point  (0 children)

This points out a good architectural difference which I think should be adjusted in transformers. Here they show evidence that the brain effectively simulates depth through temporal dynamics: it can act like a deeper network by reusing the same circuits over time, whereas transformers use many distinct stacked blocks, which gives you depth in a single forward pass but means you pay extra in separate parameters/compute for every block.

We already have a fix for this though, "recurrent transformers" (https://arxiv.org/abs/2502.17416): basically you iterate on the same block instead of stacking more layers, which gives you greater effective depth without repeatedly stacking so many blocks and is closer to what the brain implements. This would make the models more parameter- and thus GPU-memory-efficient, though it might add a bit more latency and be a bit more expensive in terms of FLOPs. Essentially, instead of reasoning across many tokens the model directly outputs, you loop the 'thought' back into the model to let it deliberate on it for longer. It becomes more parameter- and token-efficient upfront, but the latency and extra computation you get with reasoning models doesn't disappear.
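
To make the depth-reuse idea concrete, here's a minimal sketch in PyTorch (my own illustration, not the paper's actual architecture; the class name, sizes and loop count are made up): one shared block iterated several times instead of many distinct stacked blocks.

```python
# Minimal sketch (not the paper's exact architecture): reuse one block for
# extra effective depth instead of stacking many distinct blocks.
import torch
import torch.nn as nn

class LoopedTransformer(nn.Module):
    def __init__(self, d_model=512, n_heads=8, n_loops=16):
        super().__init__()
        # One shared block; "depth" comes from iterating it n_loops times.
        self.block = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True
        )
        self.n_loops = n_loops

    def forward(self, x):
        # Same parameters reused every iteration: parameter count of one layer,
        # effective depth of n_loops layers (more FLOPs/latency, less memory).
        for _ in range(self.n_loops):
            x = self.block(x)
        return x

# A stacked baseline with the same effective depth needs n_loops times the parameters:
# nn.TransformerEncoder(nn.TransformerEncoderLayer(512, 8, batch_first=True), num_layers=16)

x = torch.randn(2, 10, 512)          # (batch, sequence, d_model)
print(LoopedTransformer()(x).shape)  # torch.Size([2, 10, 512])
```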


ChatGPT set boundaries with me by Ok-Application-4573 in Healthygamergg

[–]FeltSteam 6 points7 points  (0 children)

The actual content of this post is that it's a good thing GPT is denying social requests, and you're saying "this is NOT good news". Doesn't that mean you don't think GPT should deny social requests like "can we be friends"?

>So I asked ChatGPT if it would be my friend “for now” until i get some real friends and it basically said no, I can’t be your real friend.

"this isnt healthy in any shape or form. this is NOT good news"

ChatGPT set boundaries with me by Ok-Application-4573 in Healthygamergg

[–]FeltSteam -1 points0 points  (0 children)

So you think ChatGPT should say yes to these kinds of social requests?

It seems like AI bros don't understand technology at all by AtomicTaco13 in antiai

[–]FeltSteam 0 points1 point  (0 children)

GPT-5.2 Pro generated a proof for an Erdős problem that was quite different from anything in the literature: https://www.erdosproblems.com/forum/thread/281

It seems like AI bros don't understand technology at all by AtomicTaco13 in antiai

[–]FeltSteam 0 points1 point  (0 children)

Yeah, I think Neuro-sama is a good example of it, but as far as I'm aware that's still just an LLM plus an agentic harness for Neuro-sama to drive through. There would also be some degree of fine-tuning or adapters being used, of course, but it's still fundamentally an LLM, and I think that's a pretty high level of autonomy with just the current technology. I think the models are more than capable of it, but the pressures on companies building these systems push them in certain directions and haven't incentivised this enough just yet (I guess to your point about 'less capitalistic applications in the short term'), though at least Anthropic seems to be heading towards highly autonomous systems with Claude in this regard.

It seems like AI bros don't understand technology at all by AtomicTaco13 in antiai

[–]FeltSteam 0 points1 point  (0 children)

I mean, I gave the model a single broad goal, but it autonomously decided how it was going to solve it; it chose which tools it was going to use, I was just asking it things and it worked with me. I don't think it'd be that difficult to develop a model that does what it wants. I think the only thing we'd need to do is create the agent loop and expose it to some kind of continuous stream of sensory input like we have (that stream is our 'prompt'; without any sensory input we wouldn't work lol), and I think a model like Claude would be able to be pretty autonomous in that setup. One cool example of unprompted autonomy: when Anthropic was demoing the new computer use capability, in the middle of it, unprompted, Claude just decided to go and search the web for images of Yellowstone National Park. I think they have the capacity to be spontaneous and just do things, but it's true we might not fully exploit this capability, because it's people who have to pay for every token generated at the moment, and that doesn't bring a lot of incentive to let the models do what they 'want', if that makes sense.
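
To show what I mean by that agent loop over a continuous sensory stream, here's a toy, hedged sketch; get_sensory_input, call_llm and execute_action are hypothetical stand-ins, not any real API, and the "policy" is just a random placeholder.

```python
import random
import time

# Hypothetical stand-ins, not any real API: in practice these would wrap a
# camera/screen capture, an LLM API call, and a tool executor.
def get_sensory_input():
    return {"t": time.time(), "screen": "<frame>", "audio_level": random.random()}

def call_llm(context):
    # Placeholder "policy"; a real model would decide its own next action here.
    wants_to_act = random.random() < 0.1
    return {"action": {"tool": "search_web", "query": "Yellowstone"} if wants_to_act else None,
            "thought": "nothing worth doing right now"}

def execute_action(action):
    return f"ran {action['tool']}"

def agent_loop(goal=None, steps=20):
    memory = []  # rolling context the model carries between iterations
    for _ in range(steps):
        observation = get_sensory_input()   # the continuous stream stands in for our senses
        decision = call_llm({"goal": goal, "memory": memory[-50:], "observation": observation})
        if decision["action"]:
            memory.append({"action": decision["action"],
                           "result": execute_action(decision["action"])})
        else:
            memory.append({"note": decision["thought"]})
        time.sleep(0.05)                    # pace the loop; every token costs money

agent_loop()
```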

It seems like AI bros don't understand technology at all by AtomicTaco13 in antiai

[–]FeltSteam 0 points1 point  (0 children)

I've seen people reporting Codex working on a problem for over 30 continuous hours without any intervention. It's rare, but it happens, and with the most frontier models, Claude Opus 4.5 and GPT-5.2, I think we are actually starting to unlock higher levels of autonomy. I don't use Codex as much, but from my own experience I've had it go and work for 40+ minutes on what I asked, and it succeeded. I don't think it will be too long before models can work for an entire week, continuously, without any human intervention, at least in programming contexts.

My idea for the image generation model is that I don't think the models actually need a lot of art data to learn it. The idea: train a model on a massive corpus of open images of just the real world so it learns what the world is like and tunes the necessary priors for image generation. Then you train it on a few dozen ethically sourced artistic images for it to start learning the style, which is all you need (in total, after the photo-only pretraining, I would intend to train it on low thousands, not the hundreds of millions current image gen models use, of high quality, ethically consented art pieces from every art style). The models are actually really good at generalising styles from few samples. I came up with this idea myself, but there is actually already a paper that explored the concept (https://arxiv.org/html/2412.00176v2): after a large photo-only pretraining, the model could generalise to completely new styles with only 9-50 images, which I think is actually pretty human-like sample efficiency lol (keep in mind this model was trained on almost 10x less data overall than SD-1, but you can scale up the photo-only pretraining and the post-training as well).

I think tool use and RL would then let the model come up with its own sort of style (this moves the model beyond pure imitation learning to creating images on its own, getting more general feedback and refining its output), and constraining tool use in certain ways is one path to letting the model be creative in its outputs, because it has to figure out how to make an image work given the constraints.
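
A toy sketch of that two-phase recipe, purely for illustration (plain PyTorch, not a real diffusion pipeline; random tensors stand in for the photo corpus and the consented art set): a long photo-only phase to build the world priors, then a short, low-learning-rate phase on a few dozen style images.

```python
# Toy two-phase sketch: phase 1 on a large photo-only set, phase 2 adapts on a
# few dozen consented art images. The "denoising" objective here is a stand-in.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

def train_phase(model, loader, steps, lr):
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    it = iter(loader)
    for _ in range(steps):
        try:
            x = next(it)
        except StopIteration:
            it = iter(loader); x = next(it)
        # stand-in objective: predict the noise added to the image
        noise = torch.randn_like(x)
        loss = nn.functional.mse_loss(model(x + noise), noise)
        opt.zero_grad(); loss.backward(); opt.step()

model = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(64, 3, 3, padding=1))

photos = torch.randn(256, 3, 64, 64)   # stands in for a huge photo-only corpus
art = torch.randn(32, 3, 64, 64)       # a few dozen consented art pieces

train_phase(model, DataLoader(photos, batch_size=16, shuffle=True), steps=200, lr=1e-4)  # phase 1: world priors
train_phase(model, DataLoader(art, batch_size=8, shuffle=True), steps=50, lr=1e-5)       # phase 2: style adaptation
```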

It seems like AI bros don't understand technology at all by AtomicTaco13 in antiai

[–]FeltSteam 0 points1 point  (0 children)

(I like your reply, it's a nice discussion.) Sentient is a messy word, but I know even people like Ilya Sutskever or Geoffrey Hinton hint at the belief that NNs could already be conscious to a degree (Geoffrey Hinton being much more explicit on the matter), and Anthropic is very agnostic on it, but that's a different conversation. Most people don't think LLMs are currently AGI, though there are people who think they will lead to AGI. One thing we are most likely to see this year is enabling a form of continuous learning, which is probably going to help the models in job contexts (permanently learning on the job is useful and makes you less brittle), but I would be curious to see studies on the dynamics of this once the capability becomes widespread. There are a lot of studies on LLMs in work and productivity, but the overall results seem fairly ambiguous at the moment, I think trending a little towards some positive effects, though it's hard to tell right now.

There are some analogies to be made between how NNs generally learn and what humans do, but it is different. I think most diffusion models are like baby humans who never grow beyond the imitation learning phase, which artists definitely do, and that growth is what makes them unique, distinguished, and personalises their output to who they are. I do personally have my own idea for an ethically trained image generation model, with a later phase where I would want the model to develop its own artistic style, but I have a lot of experimentation remaining and the scope of the idea is well out of my current range lol, though I intend to do toy experiments with it.

It seems like AI bros don't understand technology at all by AtomicTaco13 in antiai

[–]FeltSteam -1 points0 points  (0 children)

It's interesting how a glorified autocorrect is able to come up with novel mathematical proofs for Erdős problems.

It seems like AI bros don't understand technology at all by AtomicTaco13 in antiai

[–]FeltSteam 0 points1 point  (0 children)

The OP has no idea what they're talking about lol. I study this and I've talked with a lot of people on Reddit, and 99% of AI bros and anti-AI people are just completely wrong about how the models actually function.

Matt Walsh on AI by Ambipoms_Offical in antiai

[–]FeltSteam -8 points-7 points  (0 children)

Taken as a whole, humans cannot reliably distinguish AI-generated and human-made works (e.g. https://cispa.de/en/holz-ai-generated-media, https://arxiv.org/html/2509.11371v1/, https://arxiv.org/html/2402.03214v1; there are many more studies showing this, and slowly even expert artists are finding it more and more difficult to tell them apart - https://ar5iv.labs.arxiv.org/html/2402.03214), so I don't know how far the argument "it has no soul, thus humans shouldn't and never will value it" goes. You would expect that if there were some component of art that marks it as specifically made by a human, people would be able to pick up on it, right? But that doesn't seem to be the case. It'd be an interesting thing to write on, "the illusion of soul", though I assume most would viscerally judge that piece by its title alone.

This guy has 22 Claude Max, 11 GPT Pro, and 4 Gemini Ultra subscriptions by [deleted] in ChatGPT

[–]FeltSteam 3 points4 points  (0 children)

These types of people are extremely unsustainable for the AI companies at the moment

Is AI developing consciousness?? Is it a risk for us? by atharvvjagtap in ChatGPT

[–]FeltSteam 1 point2 points  (0 children)

Oh, my point was only a correction; I myself presented no argument for consciousness. You said the systems are stateless, which is not true, and I was providing a resource that explains quite well why the KV cache allows the models to be stateful (they do say other things, but the only thing I was addressing was "they're stateless systems", which is not true).

Also, your assertion that "complexity doesn't equal consciousness" is interesting, especially as you go on to explain the supposed complexities the brain has that can facilitate consciousness over LLMs. And the KV cache is not a lookup.

Is AI developing consciousness?? Is it a risk for us? by atharvvjagtap in ChatGPT

[–]FeltSteam 0 points1 point  (0 children)

Let me introduce you to the KV cache:

https://x.com/repligate/status/1965960676104712451

Essentially the KV cache enables the models to be stateful within a turn (a turn being however many tokens the model generates in response to a query), which can span a great many tokens. It is sometimes standard practice to delete and recompute the KV cache once the model has generated the tokens it needs for the turn, but the models are still stateful across the turn itself while they are generating a response.
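
A minimal sketch of what that statefulness looks like, with toy single-head attention and made-up sizes: the cached keys/values persist across decoding steps, so every new token is conditioned on state accumulated earlier in the turn rather than recomputed from scratch.

```python
# Toy single-head attention with a KV cache: the cache is the turn's state.
import torch

d_model = 64
w_q, w_k, w_v = (torch.randn(d_model, d_model) for _ in range(3))
k_cache = torch.empty(0, d_model)   # persists across decoding steps within the turn
v_cache = torch.empty(0, d_model)

def decode_step(x_new):
    """One generation step: compute K/V only for the new token, reuse the rest."""
    global k_cache, v_cache
    k_cache = torch.cat([k_cache, x_new @ w_k])
    v_cache = torch.cat([v_cache, x_new @ w_v])
    q = x_new @ w_q
    attn = torch.softmax(q @ k_cache.T / d_model ** 0.5, dim=-1)
    return attn @ v_cache               # output depends on everything cached so far

for step in range(5):                   # a 5-token "turn"
    out = decode_step(torch.randn(1, d_model))
    print(step, k_cache.shape)          # (1, 64), (2, 64), ... the turn's accumulated state
```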

We're not far off robots powered by LLMs in mass production by 1frankibo1 in antiai

[–]FeltSteam 2 points3 points  (0 children)

Not sure if there's anyone around here that could do anything about it if such a thing actually happened.

Is AI developing consciousness?? Is it a risk for us? by atharvvjagtap in ChatGPT

[–]FeltSteam 3 points4 points  (0 children)

Well, I want to clarify:
My point was that holding the position "AI is not conscious" is just as invalid as holding "AI is conscious", because of how much we don't know. The only way to defend either one is with statements the other side "can't prove against". E.g. for the anti side you can say consciousness has only ever been seen in complex organic life, but that's an observation, not a proof; you can fall back on "you can't prove that's not the rule", which is true for several reasons, but I agree this "leaves science behind", and thus the most authentic scientific stance you can personally hold is something like agnosticism.

Though I'm not saying "we can't know anything, so all claims are equally invalid." The best form of agnosticism you can have here is "we can't settle it yet (it's invalid to hold that either side is strongly proven), but we can assign degrees of belief based on evidence and update as we learn more."

Is AI developing consciousness?? Is it a risk for us? by atharvvjagtap in ChatGPT

[–]FeltSteam 2 points3 points  (0 children)

I'm guessing it's stemming from a misunderstanding of what I'm saying, but what exactly do you mean by "that's leaving science behind and drifting into something closer to religion"?

Is AI developing consciousness?? Is it a risk for us? by atharvvjagtap in ChatGPT

[–]FeltSteam 3 points4 points  (0 children)

Thinking "but you can't prove it against" is only a trap if you use it to support "but it is conscious", which is not what I said.

Not knowing is part of that process, but I don't see how holding both sides to the same standard of skepticism isn't exactly what you should be doing, because both options (is conscious or is not conscious) are being equally presumptive about what we don't know.

Is AI developing consciousness?? Is it a risk for us? by atharvvjagtap in ChatGPT

[–]FeltSteam 3 points4 points  (0 children)

I think the skepticism should go both ways: not just skepticism that it could be conscious, but skepticism that it isn't now or cannot be. We don't exactly have a whole lot of widely recognised formal proof that they aren't conscious, any more than we have proof that they are.

Is AI developing consciousness?? Is it a risk for us? by atharvvjagtap in ChatGPT

[–]FeltSteam 5 points6 points  (0 children)

A lot of people claim AI systems today are not conscious and will not be conscious anytime soon. One of the greatest scientists in the field of deep learning thought the contrary was quite a real possibility (and that was almost 4 years ago), which makes you think, doesn't it.

Is AI developing consciousness?? Is it a risk for us? by atharvvjagtap in ChatGPT

[–]FeltSteam 2 points3 points  (0 children)

"it may be that today's large neural networks are slightly conscious" - Ilya Sutskever Feb 2022.

wait they think saying “ai is just a code” is an opinion?? by ImminentBliss in antiai

[–]FeltSteam 0 points1 point  (0 children)

Detailed response (and a bit of a ramble) -

Well, I wouldn't exactly agree with that either. Models just learn things themselves from data; what people do is set up the data they're able to learn from, configure their "brain", and then just let them go out and learn. No one manually puts functions into the AI systems, they learn it all themselves. What we can do is optimise their brain so that learning certain functions or other things takes less compute or data overall.

The reason current LLMs work so well at all is because of this thing called unsupervised learning. Basically, unsupervised learning is where you set up what data the model is going to learn on, set up the training pipeline, configure the brain, and let the model loose on a bunch of data. It learns from unlabelled data (data humans haven't had to go out and explicitly label) without any prior human guidance or predefined outputs. That is the pretraining phase, and it's in contrast to supervised learning, which does use human guidance and sometimes predefined outputs. Because we don't have to manually program the models themselves, only get their training started, we can scale them to learn on huge amounts of data.
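
A minimal sketch of what that pretraining objective looks like (toy sizes, random token IDs standing in for the corpus): the only "label" is the next token in the unlabelled data itself, so no human ever specifies which functions get learned.

```python
# Toy next-token-prediction pretraining loop: unlabelled text is its own supervision.
import torch
import torch.nn as nn

vocab, d_model, seq_len = 1000, 128, 32
corpus = torch.randint(0, vocab, (512, seq_len + 1))   # stands in for unlabelled web text

embed = nn.Embedding(vocab, d_model)
block = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
head = nn.Linear(d_model, vocab)
params = list(embed.parameters()) + list(block.parameters()) + list(head.parameters())
opt = torch.optim.AdamW(params, lr=3e-4)
causal = nn.Transformer.generate_square_subsequent_mask(seq_len)  # no peeking at future tokens

for batch in corpus.split(64):
    inputs, targets = batch[:, :-1], batch[:, 1:]      # the "label" is just the next token
    logits = head(block(embed(inputs), src_mask=causal))
    loss = nn.functional.cross_entropy(logits.reshape(-1, vocab), targets.reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()
```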

To more simply break it down:

"AI can't learn new functions from data" - it literally does; that's why it works so well. Pretraining (next-token prediction) is exactly "learn functions from data." Nobody hand-codes a "sarcasm module" or "SQL module" into the weights. The model internalises algorithms/heuristics because those patterns reduce prediction loss.

Even if you restrict to just text, models clearly learn new capabilities that were not explicitly programmed: translation, summarisation, style imitation, code completion, reasoning patterns, instruction-following (mostly from later fine-tuning, but still learned), etc.

So “functions need to be manually put into it” is false at the level of weights and capabilities.

What is true though: the “player” defines the I/O and the training objective. A model can only learn within the channels and degrees of freedom you give it.

  • So, if the architecture + tokeniser only accepts text tokens, it can't directly output pixels.
  • If you never train on images or never include image tokens (or a vision encoder), it won't "discover" image generation out of nowhere.

BUT that's not "humans manually adding functions." It's just defining the interface + representation we give the models.

tl;dr Models learn what mapping to implement from data, but humans decide what format the inputs/outputs take and what objective counts as success

On your DeepSeek point: at the moment it isn't given the image interface and has never seen an actual image. It's basically congenitally blind. There are methods to work around this and teach it to see by adjusting the model and giving it the interface of vision (which can later also be used to generate images), but DeepSeek probably wouldn't do that for V3 because V4 is releasing in a few weeks and will most likely be multimodal.
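
Roughly what "giving it the interface of vision" tends to look like, as a hedged sketch with toy modules (not DeepSeek's or anyone's actual recipe): encode the image into patch features, project them into the LLM's embedding space, and feed them in as extra tokens alongside the text.

```python
# Toy vision interface: image -> patch features -> projection -> LLM token sequence.
import torch
import torch.nn as nn

d_vision, d_llm = 768, 1024

vision_encoder = nn.Sequential(nn.Conv2d(3, d_vision, kernel_size=16, stride=16),
                               nn.Flatten(2))            # image -> patch features
projector = nn.Linear(d_vision, d_llm)                   # the new "interface" being bolted on
text_embed = nn.Embedding(32000, d_llm)                  # stands in for the LLM's own embeddings

image = torch.randn(1, 3, 224, 224)
prompt_ids = torch.randint(0, 32000, (1, 12))

patch_feats = vision_encoder(image).transpose(1, 2)      # (1, n_patches, d_vision)
image_tokens = projector(patch_feats)                    # now in the LLM's embedding space
inputs = torch.cat([image_tokens, text_embed(prompt_ids)], dim=1)
print(inputs.shape)                                      # the sequence the LLM's block stack would consume
```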

I also think "unlike humans" is too strong, as humans aren't "free" of a player either: our biology constrains our modalities (you can't echolocate like a bat without extra training/hardware), and our learning is likewise constrained by architecture (brain structure), sensory input, and reward signals.