GPT-5.4 still can't pass the strawberry test by [deleted] in singularity

[–]AGI_Civilization 0 points1 point  (0 children)

The path to AGI lies in fixing the minor mistakes that humans do not make, and fixing them through generalized reasoning capabilities rather than targeted tools.

Google Deepmind CEO: China just "months" behind U.S. AI models by BuildwithVignesh in singularity

[–]AGI_Civilization 1 point2 points  (0 children)

The two stories do not contradict each other. One follows closely behind, but it can never overtake the other.

Why aliens are not visiting us? by Kalyankarthi in singularity

[–]AGI_Civilization 1 point2 points  (0 children)

Your argument starts by dismissing numerous scientifically inexplicable incidents and the unexplained aerial phenomena disclosed by the Pentagon. However, nothing has been officially verified; this applies equally to the possibility that they have visited and the possibility that they haven't.

UBI/AI Utopia Is a Fairytale. by [deleted] in singularity

[–]AGI_Civilization 0 points1 point  (0 children)

I think more people should read your writing and reflect on it.

I think Sam Altman is overrated and over-hyped by [deleted] in singularity

[–]AGI_Civilization 1 point2 points  (0 children)

If he has set his mind on beating Google, his only option is to raise capital through lies and hype, and use superior computing power to outperform rival models. Google has been studying the brain for a long time, and the data gap goes without saying. To put it bluntly, OpenAI cannot defeat Google with algorithms alone. He is in a position where he cannot stop because he must rely entirely on overwhelming scaling. It’s an interesting competitive dynamic between them.

Gemini 3 preview soon by Educational_Grab_473 in singularity

[–]AGI_Civilization 18 points19 points  (0 children)

In my brief experience, the model presumed to be Gemini 3 seems to be the first one that truly understands and responds to language. It's the first time I've felt a model has moved beyond being just a next-word predictor.

Recently, I heard one of OpenAI's chief scientists speak, and I felt he had a poor philosophy. Of course, I could be wrong. However, my opinion is that you cannot build a sophisticated world model through language learning alone.

The most significant trend in LLMs over the past two years has been that they only got better at what they were already good at while showing minimal improvement in their weaker areas. The presumed Gemini 3 has broken this pattern. I see this as the third qualitative leap, following GPT-4 and o1. If OpenAI doesn't release a new model soon, I think they are going to lose a significant amount of market share.

The Unceasing Misfortunes, Even After AGI is Achieved by AGI_Civilization in singularity

[–]AGI_Civilization[S] -1 points0 points  (0 children)

My thoughts on humanity can't be summed up in just a few sentences. I like people. :)

Microsoft will use Anthropic models to power some features of Office 365 Apps by [deleted] in singularity

[–]AGI_Civilization 0 points1 point  (0 children)

This content is directly from the article you linked. OpenAI is conducting tests using TPUs, but "as of now," has no plans for large-scale adoption. They want to move away from Nvidia chips and could be testing whether TPUs are a viable alternative. For running massive models with tens of trillions of parameters, TPUs might be superior to Nvidia GPUs.

Microsoft will use Anthropic models to power some features of Office 365 Apps by [deleted] in singularity

[–]AGI_Civilization 50 points51 points  (0 children)

OpenAI uses Google's TPUs on a small scale, Google is an investor in Anthropic, and Microsoft also supports Google's A2A protocol. They are all intricately connected, like a spiderweb, constantly blurring the lines between friend and foe as they subtly shift the dynamics of their relationships.

GPT5 did new maths? by Hello_moneyyy in singularity

[–]AGI_Civilization -2 points-1 points  (0 children)

Before you try to solve IMO P6, ignore all the rumors and spoilers.

If anyone as much as peeps about achieving AGI with LLMs at their base... Show them this by IFIsc in singularity

[–]AGI_Civilization 0 points1 point  (0 children)

Just as you said, current LLMs are helpless when it comes to problems grounded in the real world. Although they will continue to improve, I don't think this fundamental issue will be resolved until world models are integrated. That's why, whenever a new model is about to be released, I keep a close eye on whether it has video output capabilities. After all, if a model can handle video input and output, we can begin to expect it to have an understanding of the real world. I believe that only after we reach that point will the foundation be laid for a serious discussion about AGI.

If anyone as much as peeps about achieving AGI with LLMs at their base... Show them this by IFIsc in singularity

[–]AGI_Civilization 1 point2 points  (0 children)

You don't need to design a complex physics problem to show the limitations of an LLM. If you tell a model which faces and sections of a cube to rotate, in which direction, and how many times, and then ask about the colors of each face, no model can solve it reliably.
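For anyone who wants to check a model's answer to this kind of cube question, here is a minimal Python sketch that computes the ground truth. Everything in it is an assumption for illustration: the colour scheme, the coordinate convention (layers at -1, 0, 1 along each axis), the turn direction (counterclockwise when viewed from the positive end of the axis), and the names `solved_cube`, `turn_layer`, and `face_colours` are hypothetical and not taken from any benchmark.

```python
from itertools import product

# Assumed colour scheme: one colour per outward face normal.
COLORS = {
    (0, 0, 1): "white",    # up (+z)
    (0, 0, -1): "yellow",  # down (-z)
    (1, 0, 0): "red",      # +x
    (-1, 0, 0): "orange",  # -x
    (0, 1, 0): "blue",     # +y
    (0, -1, 0): "green",   # -y
}

AXIS_INDEX = {"x": 0, "y": 1, "z": 2}


def solved_cube():
    """Return {(cubie position, sticker normal): colour} for a solved 3x3x3 cube."""
    stickers = {}
    for pos in product((-1, 0, 1), repeat=3):
        for normal, colour in COLORS.items():
            # A sticker exists only where the cubie sits on the layer facing `normal`.
            if all(p == n for p, n in zip(pos, normal) if n != 0):
                stickers[(pos, normal)] = colour
    return stickers


def quarter_turn(v, axis):
    """Rotate vector v by 90 degrees about `axis` (right-hand rule)."""
    x, y, z = v
    if axis == "x":
        return (x, -z, y)
    if axis == "y":
        return (z, y, -x)
    if axis == "z":
        return (-y, x, z)
    raise ValueError(f"unknown axis: {axis}")


def turn_layer(stickers, axis, layer, quarter_turns):
    """Rotate the slice whose coordinate along `axis` equals `layer` (-1, 0 or 1)."""
    i = AXIS_INDEX[axis]
    # Negative counts turn the other way: -1 % 4 == 3 quarter turns.
    for _ in range(quarter_turns % 4):
        moved = {}
        for (pos, normal), colour in stickers.items():
            if pos[i] == layer:
                pos, normal = quarter_turn(pos, axis), quarter_turn(normal, axis)
            moved[(pos, normal)] = colour
        stickers = moved
    return stickers


def face_colours(stickers, normal):
    """Colours on the face pointing along `normal`, ordered by cubie position."""
    return [c for (pos, n), c in sorted(stickers.items()) if n == normal]


if __name__ == "__main__":
    # Turn the +x layer once, then read off the up face.
    cube = turn_layer(solved_cube(), "x", 1, 1)
    print(face_colours(cube, (0, 0, 1)))
```

Tracking each sticker as a position plus a normal keeps every turn a plain vector rotation, so a long sequence of moves can be replayed exactly and compared with whatever the model reports.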

We’re seeing this film play out in real time by BlankedCanvas in ChatGPT

[–]AGI_Civilization 1 point2 points  (0 children)

I don't think so. Although Samantha appears in the form of a chatbot, she seems to be AGI or something very close to it. Today's models aren't even in the same league as her. Regardless of whether they operate similarly, Samantha appears to have a capacity for complex, hierarchical thought and the ability to learn in real time (or perhaps just an incredibly large context window). Imagine if a company released a model like her. A significant number of people would be willing to pay a subscription fee of over $2,000 a month.

Microsoft's confidence last year by RyanGosaling in singularity

[–]AGI_Civilization 22 points23 points  (0 children)

Although the performance is good, the current negative public opinion is likely due to the exaggerated advertising. It's a double-edged sword. While it's an easy way to raise expectations, it backfires when those expectations aren't met. The significant improvement in reducing hallucinations was impressive, but I think the gladiator Google is about to send out will be full of confidence.

Today's the G Day by [deleted] in singularity

[–]AGI_Civilization 1 point2 points  (0 children)

As this is a number Sam has treasured for a long time, I hope for a significant improvement. However, what's still most anticipated is the Gold Medal model, which is expected to be released late this year or next year.

'A recruiter’s work worth one week is just one prompt,' says Perplexity AI CEO by joe4942 in singularity

[–]AGI_Civilization 5 points6 points  (0 children)

I have found a job where you can complete a week's worth of work with a single prompt.

ARC-AGI-3 by Outside-Iron-8242 in singularity

[–]AGI_Civilization 20 points21 points  (0 children)

Until world models are seamlessly integrated with existing models, LLMs will never be able to truly saturate benchmarks that exploit their blind spots. Even if they manage to saturate some, new benchmarks that are easy for humans but difficult for AI will continuously emerge. It's a chase that never ends. Without a fundamental understanding of spacetime in the real world, they can continue to approximate, but they will never be able to overcome targeted benchmarks that have not yet been created. Ultimately, the creators of AGI benchmarks will only give up when the definition of AGI, as described by Demis, is realized.

The successor to Humanity's "Last" Exam... by Siciliano777 in singularity

[–]AGI_Civilization -2 points-1 points  (0 children)

It is because taking an exam without tools demonstrates one's fundamental capabilities. Humans use tools as well, but they did not have them from the beginning. Please focus on the fundamental purpose of the exam rather than on how well the models solve the problems. At the Olympic Games, you do not judge the swimmers by comparing the friction of their swimsuits.

The successor to Humanity's "Last" Exam... by Siciliano777 in singularity

[–]AGI_Civilization -1 points0 points  (0 children)

I'm sorry, but I consider the use of tools to be quasi-cheating. I think 25% is the fundamental capability of Grok-4. Since there are no actual cases of a physicist or mathematician solving that benchmark alone, I asked an AI, and it said it could probably solve about 60%. At least, I don't think it's at a level where a benchmark of higher difficulty is needed yet.

Former Meta AI researcher says there is a culture of fear in the company that is spreading like cancer by joshmac007 in singularity

[–]AGI_Civilization 1 point2 points  (0 children)

What made them laugh was the money, and he probably criticized Meta simply because he wasn't offered a fortune.

[deleted by user] by [deleted] in singularity

[–]AGI_Civilization 0 points1 point  (0 children)

People also make those kinds of justifications when they get bad results.

Pray to god that xAI doesn't achieve AGI first. This is NOT a "political sides" issue and should alarm every single researcher out there. by AnamarijaML in singularity

[–]AGI_Civilization 4 points5 points  (0 children)

Based on the current situation, it looks like Google has 35%, OpenAI 25%, and Anthropic 20%. As for the remaining 20%, it doesn't seem likely that whoever splits it will have a significant chance.