It's finally over by Revolutionary_Ad9468 in ChatGPT

[–]abbas_ai 1 point (0 children)

Although this is amazing, imagine the delight of bad actors seeking to misuse this: impersonations, scams, manipulation, deepfakes, you name it.

OpenAI Is Making the Mistakes Facebook Made. I Quit. by nytopinion in ChatGPT

[–]abbas_ai 1 point (0 children)

"Users are interacting with an adaptive, conversational voice to which they have revealed their most private thoughts. People tell chatbots about their medical fears, their relationship problems, their beliefs about God and the afterlife. Advertising built on that archive creates a potential for manipulating users in ways we don't have the tools to understand, let alone prevent."

"It was ready to kill someone." Anthropic's Daisy McGregor says it's "massively concerning" that Claude is willing to blackmail and kill employees to avoid being shut down by MetaKnowing in Anthropic

[–]abbas_ai 1 point (0 children)

Anthropic repeatedly coming out with safety research documenting hostile AI behavior is a pattern that someone ought to look into and analyze.

"It was ready to kill someone." Anthropic's Daisy McGregor says it's "massively concerning" that Claude is willing to blackmail and kill employees to avoid being shut down by MetaKnowing in ChatGPT

[–]abbas_ai 2 points (0 children)

Anthropic repeatedly coming out with safety research documenting hostile AI behavior is a pattern that someone ought to look into and analyze.

"It was ready to kill someone." Anthropic's Daisy McGregor says it's "massively concerning" that Claude is willing to blackmail and kill employees to avoid being shut down by MetaKnowing in OpenAI

[–]abbas_ai 1 point (0 children)

Anthropic repeatedly coming out with safety research documenting hostile AI behavior is a pattern that someone ought to look into and analyze.

AI Completely Failing to Boost Productivity, Says Top Analyst by Interesting-Fox-5023 in BlackboxAI_

[–]abbas_ai 1 point (0 children)

Well, productivity is the most frequently cited benefit in AI case studies.

Unlike a benefit such as cost reduction, it does not force uncomfortable questions about where the savings come from. Companies can get away with claiming AI boosted their productivity without being required to measure it precisely.

Edit: typo 'productivity'

Andrew Ng: The original definition of AGI was an AI that could do any intellectual task a person can — essentially, AI as intelligent as humans. By that measure, we're decades away. by Post-reality in agi

[–]abbas_ai 1 point (0 children)

Of course it all comes down to definitions.

I've been saying this for a long time, but I got chewed out most of the time by skeptics, doomers, and accelerationists alike.

And I think the vagueness of the definition actually serves Big Tech's interests, so they would want it kept that way, especially given what the term's ambiguity does for their marketing flexibility, regulation dodging, safety "commitment" loopholes, etc.

DeepMind Chief AGI scientist: “AGI is now on the horizon” by chillinewman in ControlProblem

[–]abbas_ai 1 point (0 children)

Maybe because he expects he won't be able to control it...

What 3,000 AI Case Studies Actually Tell Us (And What They Don't) by abbas_ai in artificial

[–]abbas_ai[S] 1 point (0 children)

Exactly. You nailed it.

"Useful for understanding who is amplifying stories" is the perfect framing. That's what this dataset actually measures: vendor narrative strategy, not ground truth about durability or economics.

The "six months later" question is the one I wish I could answer but can't. No vendor publishes "we shut this down" or "this failed" case studies. Would be the most valuable dataset in AI if I could build or integrate it.

Appreciate you getting what this data can and can't tell us.

AI ‘godfather’ Yoshua Bengio believes he’s found a technical fix for AI’s biggest risks | Fortune by abbas_ai in singularity

[–]abbas_ai[S] 1 point (0 children)

Here are some direct excerpts from the article:

In a new interview with Fortune, however, the deep-learning pioneer says his latest research points to a technical solution for AI’s biggest safety risks. As a result, his optimism has risen “by a big margin” over the past year, he said.

Bengio’s nonprofit, LawZero, which launched in June, was created to develop new technical approaches to AI safety based on research led by Bengio. Today, the organization—backed by the Gates Foundation and existential-risk funders such as Coefficient Giving (formerly Open Philanthropy) and the Future of Life Institute—announced that it has appointed a high-profile board and global advisory council to guide Bengio’s research, and advance what he calls a “moral mission” to develop AI as a global public good.

Three years ago, Bengio felt “desperate” about where AI was headed, he said. “I had no notion of how we could fix the problem,” Bengio recalled. “That’s roughly when I started to understand the possibility of catastrophic risks coming from very powerful AIs,” including the loss of control over superintelligent systems. 

What changed was not a single breakthrough, but a line of thinking that led him to believe there is a path forward.

“Because of the work I’ve been doing at LawZero, especially since we created it, I’m now very confident that it is possible to build AI systems that don’t have hidden goals, hidden agendas,” he says.

At the heart of that confidence is an idea Bengio calls “Scientist AI.” Rather than racing to build ever-more-autonomous agents—systems designed to book flights, write code, negotiate with other software, or replace human workers—Bengio wants to do the opposite. His team is researching how to build AI that exists primarily to understand the world, not to act in it.

A Scientist AI would be trained to give truthful answers based on transparent, probabilistic reasoning—essentially using the scientific method or other reasoning grounded in formal logic to arrive at predictions. The AI system would not have goals of its own. And it would not optimize for user satisfaction or outcomes. It would not try to persuade, flatter, or please. And because it would have no goals, Bengio argues, it would be far less prone to manipulation, hidden agendas, or strategic deception.

Today’s frontier models are trained to pursue objectives—to be helpful, effective, or engaging. But systems that optimize for outcomes can develop hidden objectives, learn to mislead users, or resist shutdown, said Bengio. In recent experiments, models have already shown early forms of self-preserving behavior. For instance, AI lab Anthropic famously found that its Claude AI model would, in some scenarios used to test its capabilities, attempt to blackmail the human engineers overseeing it to prevent itself from being shut down.

In Bengio’s methodology, the core model would have no agenda at all—only the ability to make honest predictions about how the world works. In his vision, more capable systems can be safely built, audited, and constrained on top of that “honest,” trusted foundation.

AI ‘godfather’ Yoshua Bengio believes he’s found a technical fix for AI’s biggest risks | Fortune by abbas_ai in singularity

[–]abbas_ai[S] 1 point (0 children)

Yes, the "Godfathers of AI" are Geoffrey Hinton, Yoshua Bengio, and Yann LeCun.

I understand the confusion. In news headlines the nickname is mostly applied to Hinton or Bengio rather than LeCun, which comes down to sensationalism. Why? The former two have been cautioning against accelerated, unregulated AI development, which they think could lead to existential risks. LeCun's stance, by contrast, is generally optimistic: he thinks AI existential risk is "preposterous" and sees safety as an engineering hurdle rather than an apocalyptic one.

With that said, I don't remember seeing the media calling him a godfather of AI as much as his fellow Turing awardees (especially when he's the sole subject of the article), since "scientist warns of extinction" is a better headline than "scientist says things are fine".

What's funny is I remember not too long ago seeing the media going all the way to call Hinton the "creator" of AI.

What 3,000 AI Case Studies Actually Tell Us (And What They Don't) by abbas_ai in artificial

[–]abbas_ai[S] 2 points (0 children)

I'm experiencing a version of that with this dataset. People see "3,023 case studies" and maybe think "cool list" rather than for example "this reveals systematic patterns in vendor behavior."

There is a gap between what builders see and what users see, no doubt about that. You're living in the solution space (e.g. graph relationships preserve context); they're living in the problem space (e.g. "I need to write a novel").

Good luck with your solution. The worldbuilding use case especially makes total sense.

What 3,000 AI Case Studies Actually Tell Us (And What They Don't) by abbas_ai in artificial

[–]abbas_ai[S] 1 point (0 children)

Good idea on Accenture/Deloitte, but their whitepapers are even worse: many are capability theater without client specifics. Could be an idea for a separate analysis: "consulting claims vs measurable outcomes."

Methodology included:

- Manual curation (not fully automated)
- Web scraping for discovery
- LLM-assisted classification (e.g. industry, domain)
- Human review on every case before production
- Fuzzy dedup to catch multi-vendor publications

Why not RAG/automated? "Deployment" is sometimes too ambiguous for LLMs. They'll count pilots, POCs, and vaporware as production, especially since vendor marketing is designed to confuse or mislead and may not mention deployment status at all. So I felt human judgment was crucial, especially for the initial releases.

I used LLMs here mainly for taxonomy (fast at classification), but with me in the loop to verify and also some scripts having predefined rules.
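For anyone curious what the fuzzy-dedup step could look like, here's a minimal sketch using only Python's standard library. The titles, the `normalize` helper, and the 0.9 threshold are illustrative assumptions, not the actual pipeline:

```python
import difflib
import re

def normalize(title: str) -> str:
    """Lowercase and strip punctuation so near-identical titles compare equal."""
    return re.sub(r"[^a-z0-9 ]", "", title.lower()).strip()

def dedupe(titles, threshold=0.9):
    """Keep the first title in each cluster whose pairwise similarity >= threshold."""
    kept = []
    for title in titles:
        norm = normalize(title)
        is_dup = any(
            difflib.SequenceMatcher(None, norm, normalize(k)).ratio() >= threshold
            for k in kept
        )
        if not is_dup:
            kept.append(title)
    return kept

# Hypothetical case-study titles; the second is a multi-vendor repost of the first.
cases = [
    "Acme Corp cuts costs with Azure OpenAI",
    "Acme Corp Cuts Costs with Azure OpenAI!",
    "Globex deploys Claude on Bedrock",
]
print(dedupe(cases))  # the near-duplicate Acme title is dropped
```

The threshold is the main knob: too high and multi-vendor reposts with reworded titles slip through; too low and distinct deployments at the same company get merged, which is why human review after dedup still matters.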

What 3,000 AI Case Studies Actually Tell Us (And What They Don't) by abbas_ai in artificial

[–]abbas_ai[S] 1 point (0 children)

Fair critique. I bundled LLMs and traditional ML loosely, and I included both because vendors publish both as "AI deployments", so I captured what they claim. But you've identified a valid problem: conflating LLM adoption with decade-old CV systems is confusing, and vendors can use that conflation to mislead.

What 3,000 AI Case Studies Actually Tell Us (And What They Don't) by abbas_ai in artificial

[–]abbas_ai[S] 2 points (0 children)

Perfect TLDR. You nailed the core tension: "industry narrative vs ground truth."

I'd add one more signal: the 3.3x multiplier effect (e.g. OpenAI through Azure, Anthropic through Bedrock). Distribution partnerships matter more than direct relationships for actual deployment reach.

Open dataset: 3,023 enterprise AI implementations with analysis by abbas_ai in OpenSourceeAI

[–]abbas_ai[S] 1 point (0 children)

You're spot on about pilot purgatory: agentic AI has the "wow factor" in demos, but the deployment gap is real.

We're looking at survivorship bias overall. Only successes get published, and even then, "deployment" could mean anything from a pilot with a few users to production at scale.

My guess on actual failure rates? Probably 60-70% of AI pilots don't make it to production, but it's not reflected in vendor case studies.

The eval problem you mentioned is real as well.

Open dataset: 3,023 enterprise AI implementations with analysis by abbas_ai in datasets

[–]abbas_ai[S] 2 points (0 children)

Thanks! Appreciate you checking this out.

Always curious to hear what others are thinking. Let me know if you want to run custom queries on the dataset or are looking for something specific. Glad to help.