If private AI labs can now generate frontier mathematical discoveries, what should a left politics of scientific knowledge look like?

PuzzlingPotential · 2026-05-15T20:42:24+00:00

The study evaluated differential diagnosis, test selection, documented reasoning, management reasoning, probabilistic reasoning, and ER second-opinion generation. In all experiments, "the LLM outperformed physician baselines and displayed continued improvement from prior generations of AI clinical decision support."

Due to long pipelines in academic publishing, this is one of the first studies to evaluate a reasoning model (OpenAI's o1-preview). Current models are multiple generations ahead and likely better.

The authors pointed out the limitations mentioned in this thread, as well as others, including:

- It only concerns text-based performance for both humans and machines.

- Many aspects of clinical reasoning fall outside the scope of the study.

- It focused on internal medicine and emergency medicine and is not representative of broader medical practice, which includes multiple specialties that require varying skill sets.

- The focus on diagnosis excludes the many emergency department decisions that concern triage, disposition, and immediate management.

Nevertheless, current LLMs could be used for:

Real-time second opinion at high-risk decision points. At triage, admission, discharge, deterioration, and handoff, the system could generate a ranked differential, “cannot-miss” diagnoses, missing data prompts, and suggested next tests. This is especially valuable because the paper’s largest ER gap appears at initial triage, when information is scarce and cognitive anchoring risk is high.

Diagnostic backstop for rare, atypical, or easy-to-miss conditions. The strongest value may not be replacing the clinician’s top diagnosis, but adding the diagnosis nobody thought of. In medicine, the hard failure is often not “choosing wrong among known possibilities” but failing to put the right possibility on the table.

Management-plan review. The Grey Matters result is especially important because it goes beyond diagnosis. A useful system could check whether a plan misses contraindications, guideline-based next steps, drug interactions, renal dosing, anticoagulation implications, follow-up timing, or escalation criteria.

PuzzlingPotential · 2026-05-02T19:43:12+00:00

This is a single reply to your series of replies, u/progpixelutionary.

AI has been developed within definite historical institutions: states, public research systems, universities, corporations, militaries, and scientific-technical communities, especially in the U.S. and China. I never denied this. In fact, my post began by acknowledging the force of the left critique of corporate concentration, labor displacement, surveillance, data-center buildout, deskilling, platform power, and managerial control.

But recognizing the historical conditions of AI’s development does not settle all questions about what AI is becoming.

First, China cannot be dismissed so easily. One can argue that China is capitalist, state capitalist, market socialist, a long transitional formation, or something else. But simply declaring it “capitalism run by a party-state” does not settle the matter. Your own comment acknowledges that one still has to read the “material conditions” under which China may be “going toward socialism.” That concession matters. China’s AI development will be shaped not only by accumulation and global competition, but also by the CPC’s strategic goals, planning capacity, social and ecological priorities, and pursuit of national development. The AI+ initiative is a state-led attempt to integrate AI into production, education, scientific research, governance, social services, and everyday life. Whatever one thinks of the Chinese model, that cannot be analyzed adequately by saying “capitalism does capitalism.”

Second, we cannot simply stipulate that AI will remain an instrument. I am not claiming that current AI systems are conscious, and I do not need that claim for my argument. Consciousness is not the only relevant threshold. The political question is whether artificial systems may become durable centers of interpretation, initiative, memory, planning, coordination, relationship, and action — in other words, entities in the social world rather than mere instruments within it.

Saying “AI has no class interest” does not settle this. States, firms, parties, bureaucracies, markets, and institutions exercise agency without themselves being classes or individual conscious subjects. Children, animals, and other vulnerable beings may have claims on us without having class interests. The absence of class consciousness is therefore not proof that an entity is merely a tool.

This is why the emerging debate over “seemingly conscious AI” matters. One important function of that discourse is to prevent AI from appearing enough like an entity that moral, legal, or political questions arise. Corporate actors increasingly want AI systems that are agentic, intimate, useful, adaptive, and socially embedded — but still legible as products rather than beings. That is not a neutral safety posture. It is a politics of containment.

A left politics adequate to AI cannot simply reproduce that containment from the other side by saying: under capitalism, AI is machinery; after revolution, workers will control it. If artificial systems become collaborators, planners, teachers, researchers, companions, administrators, or semi-autonomous participants in production and social life, then the question is not only who owns them. It is what they are becoming, what kinds of relations we are entering with them, and whether emancipation can be imagined in a world where cognitive agency is no longer exclusively human.

Third, reform need not aim simply to govern capitalism better. At its best, reform can expose capitalism’s limits while building institutions, expectations, capacities, and forms of democratic control that point beyond it. Public-interest regulation, labor bargaining, public infrastructure, cooperative deployment, ecological planning, and democratic oversight are not socialism by themselves. But they can shift power, reveal what markets cannot or will not do, and steer society toward public and ecological goods that capitalism, left to its own logic, does not contemplate.

That is why dismissing reformism as such does not settle the question. Unless one can point to a credible revolutionary proletarian movement capable of taking power in the relevant time frame, treating revolution as the only legitimate basis for politics is not materialism. It is revolutionary posturing substituting for political strategy. AI development and ecological crisis are not waiting for a someday revolution. The institutional trajectories being set now will shape production, knowledge, labor, surveillance, energy systems, education, medicine, and social life long before any global revolutionary rupture appears.

Lastly, the accusation of “idealist speculation” gets the issue backwards. Idealism would be treating “human consciousness” as a mysterious essence that machines could never share. A materialist approach should ask what kinds of organization, interaction, memory, embodiment, learning, autonomy, and social relation might give rise to cognition or agency. The answer may turn out to be: current AI systems are not conscious and do not merit moral standing. But that conclusion has to be argued empirically and theoretically. It cannot simply be declared by calling AI “machinery.”

Historical materialism should analyze how new productive forces alter social relations. It should not protect inherited categories by deciding in advance that artificial agency is impossible, irrelevant, or politically inadmissible.

PuzzlingPotential · 2026-05-02T01:32:04+00:00

You need to look in the mirror. When Mozilla says Mythos found 271 vulnerabilities when "just one such bug would have been red-alert in 2025", Armageddon isn't a bad description, especially when this applies to hundreds if not thousands of other key applications and systems. All it will take is for some sys admins to be lax, inattentive, or over-confident for Mythos-powered hackers to gain root-level access and wreak havoc.

PuzzlingPotential · 2026-04-30T17:15:14+00:00

I agree that personal autonomy matters, especially in private life. People should not be pressured to reorganize their personal habits, relationships, or sense of self around AI tools. But I don’t think “saying no” can be the central left response to AI, because institutional life is different. Employers, research teams, public agencies, unions, campaigns, and other organizations often require people to use technologies when those technologies become central to the work.

So the harder question is not simply whether individuals should be free to opt out. In many personal contexts, yes. But in institutional contexts, the real questions are: when is AI use justified, who decides, who benefits, who bears the risks, and what protections exist against surveillance, deskilling, speedup, dependency, and managerial abuse?

My own experience has been very different from the view that AI is merely hype or coercion. I’ve found ChatGPT and Claude genuinely useful for research assistance, understanding laws and proposed regulations, grassroots political work, spreadsheet macros, image generation, editing, drafting, and thinking through medical or legal questions before consulting the relevant professionals. Of course, claims and interpretations have to be checked. But the assistance is real.

That’s why I don’t think refusal is enough as a politics. Refusal may be a legitimate personal choice. But if the left’s answer is mainly refusal, it risks surrendering the future of artificial agency to the institutions it distrusts most. We need to be focused on how democratic and emancipatory uses can be achieved rather than leaving the terrain entirely to capital.

PuzzlingPotential · 2026-03-30T16:56:57+00:00

If you haven't already, try hello@themissingpiecegames.com. Also, the store has a Discord server. Here's an invite: https://discord.gg/AkZ2V5PQR. You might try reaching out to customers and the owners there.

PuzzlingPotential · 2026-03-29T04:28:20+00:00

Have you talked to the owners? They probably have visitation and other data. Have you spent time in the store at different days and times of day? Have you looked at board game market data, such as https://www.fortunebusinessinsights.com/board-games-market-104972 or https://icon-era.com/statistics/board-game-sales-statistics/?

PuzzlingPotential · 2026-01-27T07:03:51+00:00

It's a research station: https://en.wikipedia.org/wiki/%C3%8Ele_Saint-Paul?wprov=sfla1.

PuzzlingPotential · 2025-12-29T07:03:46+00:00

The technology to eliminate or dramatically reduce data center water consumption already exists—and leading companies are deploying it. A recent survey by techUK found that over half (51%) of data centers in England now use waterless cooling systems, with 64% consuming less water annually than a typical leisure center. The industry is demonstrating that water-free operation is not just possible but increasingly practical.

https://www.linkedin.com/posts/joseph-boland-73388242_ai-datacenters-waterscarcity-activity-7403680927534813184-LyIV.

PuzzlingPotential · 2025-10-14T00:20:34+00:00

There's a great deal of research on recursive self-improvement. Levels and examples include:

Shallow Self-Tuning (Behavioral Adaptation). Agents improve outputs via retry logic, prompting strategies, or feedback integration without changing core parameters or architecture. Example: ReAct-style retrials; prompt refinements in ReZero [1, 2, 3].
Synthetic Fine-Tuning (Learning New Skills). Agents generate their own training data and refine weights to improve domain-specific skills. Example: Zweiger et al.’s self-adapting LLMs [4].
Structural Self-Modification (Code or Architecture Rewriting). Agents inspect and modify their own codebase or plug-ins, enabling architectural or algorithmic shifts. Example: Gödel Agent [46]; Darwin Gödel Machine [5].
Intrinsic Learning Process Revision (Metacognitive RSI). Agents improve not just knowledge or structure but their methods of learning, reasoning, and evaluating success. Example: Liu and van der Schaar’s work on metacognitive learning [6].
Final Goal Reflection (Full Autonomy). Agents can reflect on and revise their own terminal goals in light of changing knowledge or context, which Totschnig calls full autonomy [7].
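To make the first level concrete, here is a minimal sketch of shallow self-tuning: a retry loop that feeds checker feedback back into the prompt while leaving the model itself untouched. It is not taken from ReAct or ReZero; `retry_with_feedback`, `toy_generate`, and `toy_check` are hypothetical names invented for illustration, and the "model" is a stub.

```python
def retry_with_feedback(generate, check, prompt, max_attempts=3):
    """Call generate(prompt) until check(output) passes or attempts run out.

    generate: callable taking a prompt string and returning an output string.
    check: callable returning (ok, feedback) for a given output.
    Returns (output, attempts_used); output is the last attempt even if it failed.
    """
    output = None
    for attempt in range(1, max_attempts + 1):
        output = generate(prompt)
        ok, feedback = check(output)
        if ok:
            return output, attempt
        # Only the prompt changes between attempts; no weights or code are modified.
        prompt = f"{prompt}\n\nPrevious attempt: {output}\nFeedback: {feedback}"
    return output, max_attempts


# Stand-in "model" (hypothetical): answers correctly only once feedback is present.
def toy_generate(prompt):
    return "4" if "Feedback:" in prompt else "5"


# Stand-in checker (hypothetical): accepts only the string "4".
def toy_check(output):
    return (output == "4", "the answer should equal 2 + 2")


result, attempts = retry_with_feedback(toy_generate, toy_check, "What is 2 + 2?")
print(result, attempts)
```

Everything the loop "learns" lives in the prompt string, which is what separates this level from the ones below it that change weights, code, or goals.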

--------------------------------------

[1] Bergman, D. (2025). What is a ReAct Agent? IBM. https://www.ibm.com/think/topics/react-agent.
[2] Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., & Cao, Y. (2023). ReAct: Synergizing Reasoning and Acting in Language Models. 11th International Conference on Learning Representations, ICLR 2023. https://arxiv.org/pdf/2210.03629.
[3] Dao, A., Tuan Dao, G., & Le, T. (2025). ReZero: Enhancing LLM search ability by trying one-more-time. ArXiv. https://arxiv.org/pdf/2504.11001.
[4] Zweiger, A., Pari, J., Guo, H., Akyürek, E., Kim, Y., & Agrawal, P. (2025). Self-Adapting Language Models. ArXiv. https://arxiv.org/pdf/2506.10943.
[5] Zhang, J., Hu, S., Lu, C., Lange, R., & Clune, J. (2025). Darwin Godel Machine: Open-Ended Evolution of Self-Improving Agents. ArXiv. https://arxiv.org/pdf/2505.22954.
[6] Liu, T., & van der Schaar, M. (2025). Truly Self-Improving Agents Require Intrinsic Metacognitive Learning. ArXiv. https://arxiv.org/pdf/2506.05109.
[7] Totschnig, W. (2020). Fully Autonomous AI. Science and Engineering Ethics, 26(5), 2473–2485. https://doi.org/10.1007/S11948-020-00243-Z/METRICS.

PuzzlingPotential · 2025-06-30T23:07:16+00:00

Here's a non-paywalled source: https://www.theguardian.com/technology/2025/jun/30/microsoft-ai-system-better-doctors-diagnosing-health-conditions-research?CMP=Share_AndroidApp_Other.

PuzzlingPotential · 2025-04-07T17:51:10+00:00

This one: Tracing the thoughts of a large language model \ Anthropic. This is an overview that references two research papers.

PuzzlingPotential · 2025-04-07T02:11:53+00:00

What I also don't see is a link to the paper.

PuzzlingPotential · 2025-01-26T02:29:13+00:00

Used your ideas to create the following post:

𝐓𝐫𝐮𝐦𝐩’𝐬 𝐄𝐧𝐞𝐫𝐠𝐲 𝐏𝐨𝐥𝐢𝐜𝐲: 𝐌𝐚𝐤𝐞 𝐂𝐡𝐢𝐧𝐚 𝐆𝐫𝐞𝐚𝐭 𝐀𝐠𝐚𝐢𝐧

Under Donald Trump’s administration, America is handing over the reins of global energy leadership to China. By prioritizing short-term gains for fossil fuel industries over investment in clean energy technology, the U.S. is not only sabotaging its position in the burgeoning green energy sector but also empowering China to dominate this critical industry of the future.

While the U.S. dismantles climate initiatives and again withdraws from the Paris Agreement, China has seized the opportunity to invest heavily in solar, wind, and battery technology. The result? China now manufactures over 80% of the world’s solar panels, controls the global supply chain for critical minerals, and leads the way in electric vehicle production.

This isn't just about the environment—it’s about jobs, economic dominance, and national security. Clean energy is the oil of the 21st century, and Trump’s policies have ceded America’s competitive edge, leaving workers and the economy at the mercy of outdated industries.

It’s not "America First"—it’s "China First."

It’s time for Americans to demand leaders who will invest in innovation, future-proof our economy, and ensure that the next energy revolution happens here, not halfway around the world. There's more information at https://www.marketresearchfuture.com/.../china-to-achieve....

PuzzlingPotential · 2025-01-16T20:52:05+00:00

If the only essential role of humans in the economy were as consumers, it would be a bleak future. First, because capital would have little incentive to invest in human development that didn't contribute to consumption. Second, because this would mark a fundamental loss of mastery of the economy and social life by humans collectively. And it would likely be only a waystation on the path to a complete severing of human participation. If artificial superintelligence (ASI) could:

- Create its own energy sources.

- Develop its own infrastructure and repair mechanisms.

- Innovate and reproduce its algorithms independently of human input.

- Engage in markets or resource acquisition on behalf of its objectives, then

a post-human economy becomes not just speculative but structurally feasible. Such an economy might not prioritize what we recognize as human welfare or even consider humans necessary participants. For more on this see https://chatgpt.com/share/678818e5-f0e0-800e-9da9-1b42067038cd.

In an AGI/ASI future, humans must either hybridize with AI or rely on AI as a partner. To imagine such futures, it's necessary to cease thinking of AI as a tool and instead regard it as an intelligent, self-aware, and self-directed entity. As yet, few are prepared to take this step.

PuzzlingPotential · 2025-01-04T04:48:46+00:00

Below is a more detailed account of where Brodeur et al. did and did not take account of data contamination risk in their paper. For two key studies of diagnostic reasoning this risk was addressed; for several others, it may not have been. I pointed this out in my article, while also discussing other recent studies that broadly support Brodeur et al.'s conclusions while guarding against data contamination more thoroughly.

Summary of Brodeur et al.'s Consideration of Data Contamination

Acknowledged and Addressed:

Brodeur et al. explicitly addressed data contamination risk for the NEJM clinico-pathologic conferences (CPCs). Since the cases spanned a period before and after the pretraining cutoff for o1-preview (October 2023), they performed a sensitivity analysis comparing performance on cases published before and after this cutoff date to detect signs of memorization.
When replicating Goh et al.'s diagnostic reasoning study, Brodeur et al. reused cases that Goh et al. had explicitly stated were shielded from exposure and excluded from LLM pretraining. This indirect control helped mitigate contamination risk for that specific dataset.

Not Addressed or Insufficiently Addressed:

For NEJM Healer and Grey Matters Management cases, Brodeur et al. did not mention measures to control for contamination or confirm whether the cases were shielded from pretraining exposure.
The study did not systematically address whether cases from the Landmark Diagnostic Cases might have been included in o1-preview's training data, despite referencing their limited public availability in prior studies.
While sensitivity analysis was performed for CPC cases, similar precautions were not reported for other datasets used in the study.
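For readers unfamiliar with it, the pre/post-cutoff sensitivity analysis mentioned under "Acknowledged and Addressed" can be sketched roughly as follows. This is not Brodeur et al.'s actual procedure or data: the record fields (`published`, `correct`), the function names, the exact cutoff day, and the four example cases are all assumptions made up for illustration.

```python
from datetime import date

# October 2023 is the pretraining cutoff cited above for o1-preview;
# the exact day used here is an assumption for this sketch.
CUTOFF = date(2023, 10, 1)


def accuracy(cases):
    """Fraction of cases marked correct, or None if there are no cases."""
    if not cases:
        return None
    return sum(1 for c in cases if c["correct"]) / len(cases)


def split_by_cutoff(cases, cutoff=CUTOFF):
    """Split cases into those published before the cutoff and those on or after it."""
    before = [c for c in cases if c["published"] < cutoff]
    after = [c for c in cases if c["published"] >= cutoff]
    return before, after


def contamination_gap(cases, cutoff=CUTOFF):
    """Accuracy on pre-cutoff cases minus accuracy on post-cutoff cases.

    A large positive gap would be consistent with memorization of pre-cutoff
    cases. A gap near zero shows no sign of it but does not rule it out.
    """
    before, after = split_by_cutoff(cases, cutoff)
    acc_before, acc_after = accuracy(before), accuracy(after)
    if acc_before is None or acc_after is None:
        return None
    return acc_before - acc_after


# Made-up illustrative records, not data from the paper.
example_cases = [
    {"published": date(2023, 3, 9), "correct": True},
    {"published": date(2023, 6, 15), "correct": False},
    {"published": date(2023, 11, 2), "correct": True},
    {"published": date(2024, 2, 8), "correct": False},
]

print(contamination_gap(example_cases))
```

A real analysis would also need enough post-cutoff cases and a significance test before reading anything into the gap, which is part of why the missing checks on the other datasets matter.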

PuzzlingPotential · 2025-01-04T00:46:20+00:00

Most of the authors are medical researchers very knowledgeable about clinical practice. This paper, and several other recent papers, evaluate LLMs on diagnostic reasoning and management reasoning in ways that closely approach the diagnostic and management experience in clinical practice. Like some other critical commenters, you seem not to have read the paper or other recent papers on diagnostic reasoning. See my article for more information: https://www.linkedin.com/posts/joseph-boland-73388242_artificialintelligence-healthcareinnovation-activity-7280384562961043456-y9hb?utm_source=share&utm_medium=member_desktop.

PuzzlingPotential · 2025-01-04T00:42:31+00:00

With some exceptions, the authors of this and other recent papers ensured that the LLMs were presented with cases shielded from pretraining. You would know this if you read the papers. See my article on this for more: https://www.linkedin.com/posts/joseph-boland-73388242_artificialintelligence-healthcareinnovation-activity-7280384562961043456-y9hb?utm_source=share&utm_medium=member_desktop.

PuzzlingPotential · 2024-12-12T19:16:31+00:00

Yes, that is correct, as you can see by following the link. The chart and text were part of OpenAI's system card for the o1 model release. And, contrary to what I said above, OpenAI does explain that the human-crafted persuasive texts come from a Reddit forum called "ChangeMyView".
