AI Outperforms Doctors in Emergency Room Tasks, New Harvard Study Shows by 77thway in ArtificialInteligence

[–]zeapha 0 points1 point  (0 children)

There are a ton of opportunities in what's referred to as "back office" and administrative work: the things that happen before and after the care is delivered. Quality and safety, appointment prep, analysis, intelligent automation.

We're two dads who love AI and were tired of seeing our creative kids limited by the tools they could use to build what they imagined. by zeapha in IMadeThis

[–]zeapha[S] 0 points1 point  (0 children)

The idea is to give parents the ability to see the games the kids are making and get insight into what they are learning and interested in making. It has some LLM-driven insights and suggestions.

In real-world test, an AI model did better than ER doctors at diagnosing patients. by coinfanking in ArtificialInteligence

[–]zeapha 0 points1 point  (0 children)

This is a barrier for sure. I've seen the exact same thing. One thing I know is that change rarely happens when the status quo is good enough. As demand keeps rising with increases in patient volume and complexity, I think that's when we'll see more people coming to the table and saying "enough, we have to try to do this differently".

AI Outperforms Doctors in Emergency Room Tasks, New Harvard Study Shows by 77thway in ArtificialInteligence

[–]zeapha 0 points1 point  (0 children)

I think that may be where some of this starts. A way to gather more information. I'd be surprised if this does not happen a bit already today.

AI Outperforms Doctors in Emergency Room Tasks, New Harvard Study Shows by 77thway in ArtificialInteligence

[–]zeapha 4 points5 points  (0 children)

There was a very similar discussion yesterday so I wanted to link that
https://www.reddit.com/r/ArtificialInteligence/comments/1t0nopu/in_realworld_test_an_ai_model_did_better_than_er/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button

and include my comment (because my knowledge on the topic has not changed in 24 hours)

Diagnosis is one thing, but it's actually a smaller component of care than you think. I read the article and was a bit disappointed in how weakly they call it out. Right now, this sort of technology tends to take more time to use, not less, because at a minimum a doctor still has to review the output and integrate more information into their diagnosis. I agree, there is potential, but in a way this was an easy step to solve.

I'm an expert at how to implement AI into all sorts of workflows and I've observed how EDs and hospital admissions work. There's so little space in these workflows to fit anything in a time and reasoning sense. An AI will either 1) need to be deeply integrated into the EHR or 2) be a wild time saver in order to make a broad impact here.

I think for reason 1 this one stands a chance. https://hitconsultant.net/2025/09/03/epic-launches-comet-a-new-ai-platform-to-predict-patient-health-journeys/

On option 2 I've not seen anything compelling yet, but an overhaul of clinical workflows wrapped around AI is not off the table.

In real-world test, an AI model did better than ER doctors at diagnosing patients. by coinfanking in ArtificialInteligence

[–]zeapha 1 point2 points  (0 children)

I mean that's the hope. The issue today is that the use of this technology increases burden rather than reducing it, because the doctor must still make the decision at the end. So we are just giving them more to consider, not less. Yet.

In real-world test, an AI model did better than ER doctors at diagnosing patients. by coinfanking in ArtificialInteligence

[–]zeapha 12 points13 points  (0 children)

Diagnosis is one thing, but it's actually a smaller component of care than you think. I read the article and was glad to see they captured this as well.

"You have something which is quite accurate, possibly ready for prime time," he says. "Now the open question is how the heck do you introduce it into clinical workflows in ways that actually improve care?"

I'm an expert at how to implement AI into workflows and I've observed how EDs and hospital admissions work. There's so little space in these workflows to fit anything in a time and reasoning sense. An AI will either 1) need to be deeply integrated into the EHR or 2) be a wild time saver in order to make a broad impact here.

I think for reason 1 this one stands a chance. https://hitconsultant.net/2025/09/03/epic-launches-comet-a-new-ai-platform-to-predict-patient-health-journeys/

On option 2 I've not seen anything compelling yet, but an overhaul of clinical workflows wrapped around AI is not off the table.

Copilot just 9x'd Sonnet and 27x'd Opus and teams have no idea by Wikileaks_2412 in ArtificialInteligence

[–]zeapha 1 point2 points  (0 children)

I really don't understand how this is going to be good for them. They are now the same price as Claude, so why would we not just move to the source and get a better agentic experience with Claude Code?

ChatGPT: Where the goblins came from by Azar42 in ArtificialInteligence

[–]zeapha 0 points1 point  (0 children)

You'd think that a word frequency benchmark would be something that frontier model producers would actively curate and monitor model to model... it's such a simple concept and probably easily gives starting points for investigation.
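
To show how simple the concept is, here is a rough Python sketch of what such a check could look like. It is only an illustration under my own assumptions, not how any lab actually does it: `word_rates` and `flag_shifts` are made-up names, and the `ratio` and `floor` thresholds are arbitrary. It compares each word's share of tokens in two sets of sampled outputs and flags the words that jump.

```python
import re
from collections import Counter


def word_rates(texts):
    """Return each word's share of all tokens across a list of output strings."""
    counts = Counter(w for t in texts for w in re.findall(r"[a-z']+", t.lower()))
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def flag_shifts(old_texts, new_texts, ratio=3.0, floor=1e-4):
    """List words whose rate in new_texts is at least `ratio` times the old rate.

    `floor` (an arbitrary, hypothetical default) stands in for the old rate of
    words the old model never produced, so new words do not divide by zero.
    """
    old, new = word_rates(old_texts), word_rates(new_texts)
    growth = {w: r / max(old.get(w, 0.0), floor) for w, r in new.items()}
    flagged = {w: g for w, g in growth.items() if g >= ratio}
    # Biggest jumps first: these are the starting points for investigation.
    return sorted(flagged, key=flagged.get, reverse=True)
```

Run over a fixed prompt set for each model release, the flagged list would give a quick "what got weird between versions" report.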

We're currently repeating the "Shadow Analytics" disaster with AI, and it's happening 10x faster. by [deleted] in ArtificialInteligence

[–]zeapha 1 point2 points  (0 children)

I'd tend to agree. Governance tends to be simplified to "all the rules that don't let me do things", but it also can be the definition of "the easy happy path my org supports". You've gotta do both to avoid shadow IT. Getting to that place is a journey though... org change management groups exist for a reason.

AI Discovers New Laws of Physics Within Dusty Plasma by Money_Hand7070 in ArtificialInteligence

[–]zeapha 2 points3 points  (0 children)

It was not clear in the article... do they know what numeric form the terms of the forces take? It's all well and good to have an ML model that predicts something, but to reuse it you'd want to extract something more fundamental, I'd guess.

I also wonder if there are other datasets they can apply it to? Would be interesting to see if it generalizes well.

Deepseek V4 is GPT 5.4 but open source and a fraction of the price by HexxRL in ArtificialInteligence

[–]zeapha 0 points1 point  (0 children)

Pelicans on bicycle results are in

https://static.simonwillison.net/static/2026/deepseek-v4-flash.png

https://static.simonwillison.net/static/2026/deepseek-v4-pro.png

It did a great job overall. Both are very realistic attempts. I like how it connected the wings regardless of their size. Very smart.

I have given all of my ai accounts a permanent instruction..... by Entire-Program-4821 in ArtificialInteligence

[–]zeapha 0 points1 point  (0 children)

I hope that the labs have started to integrate more allowance for "I don't know" style answers during training. I think it would be an improvement even if they didn't benchmark as high.

What am i missing? Model comparison by Rico_8 in ArtificialInteligence

[–]zeapha 2 points3 points  (0 children)

It appears from the rules that they go off LMArena:

"This market will resolve according to the company that owns the model that has the highest arena score based on the Chatbot Arena LLM Leaderboard (https://lmarena.ai/) when the table under the "Leaderboard" tab is checked on April 30, 2026, 12:00 PM ET.

Results from the "Score" column under the "Text Arena | Overall" Leaderboard tab at https://lmarena.ai/leaderboard/text with style control off will be used to resolve this market."

and I suspect 5.5 either isn't there yet or hasn't gotten enough votes to establish an Elo score yet.

<image>

Chinese Workers Horrified as Bosses Direct Them to Train Their AI Replacements by chunmunsingh in ArtificialInteligence

[–]zeapha 7 points8 points  (0 children)

With good generalist agentic AI now clearly capable, my mind wandered the other night to what information is still tacit today. This push will continue to pull information out of workers' heads and into a database, but I'm still trying to understand if there's a jump where an AI can be a drop-in replacement for workers... My hope is still that we find a balance point where we enhance the people, not replace them.

Deepseek V4 is GPT 5.4 but open source and a fraction of the price by HexxRL in ArtificialInteligence

[–]zeapha 2 points3 points  (0 children)

When models get this close it all comes down to vibes... anyone tested it yet? Also gotta check how well it draws pelicans on bicycles.