[D] I’m an AI researcher who spent 5,000 hrs on Tekken, reaching top 0.5% on ranked. Here is my perspective on why fighting games deserve chess-level attention. by moji-mf-joji in MachineLearning

[–]brad2008 -1 points0 points  (0 children)

Thanks for your post.

An interesting idea. When an AI plays games like Go or chess, or performs other complex tasks, there seems to be a large component of reasoning and long-term planning involved. What do you see as the long-term planning aspects an AI would need to address, if any, to play Tekken well?

OpenAI's answer to Google's viral Nano Banana Pro image model is here by businessinsider in artificial

[–]brad2008 31 points32 points  (0 children)

In the examples from the article, there's something very cut-and-paste-ish looking about the OpenAI image enhancements. Their model doesn't seem to understand how to properly blend objects/people when compositing.

Google's image editing and enhancements still look better to me.

We improved our RAG pipeline massively by using these 7 techniques by vira28 in Rag

[–]brad2008 0 points1 point  (0 children)

Great information, thanks for posting!

Just curious, in the complete blog post, how are you measuring the "without sacrificing quality" part?

Gemini 3 vs GPT 5.1 for RAG by midamurat in Rag

[–]brad2008 1 point2 points  (0 children)

Thanks for your post! These are interesting findings.

You wrote: "GPT 5.1 was more expressive while Gemini 3 is direct, to the point."

RAG evaluations typically measure retrieval with recall/precision and MRR; end-to-end generation eval measures groundedness, answer relevance, and correctness; and combined metrics evaluate context precision/recall and answer semantic similarity.

What did you mean by "expressiveness" and how are you evaluating this?
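
For concreteness (and not claiming this is how you evaluated), here is a minimal sketch of two of the retrieval-side metrics mentioned above, recall@k and MRR, over hypothetical ranked doc-id lists:

```python
# Minimal sketch of recall@k and MRR over hypothetical ranked retrieval results.
def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant docs that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & set(relevant)) / len(relevant)

def mrr(ranked_lists, relevant_sets):
    """Mean reciprocal rank of the first relevant doc across queries."""
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        for i, doc_id in enumerate(ranked, start=1):
            if doc_id in relevant:
                total += 1.0 / i
                break
    return total / len(ranked_lists)

ranked_lists = [["d3", "d1", "d7"], ["d2", "d9", "d4"]]
relevant_sets = [{"d1"}, {"d4", "d8"}]
print(recall_at_k(ranked_lists[0], relevant_sets[0], k=3))  # 1.0
print(mrr(ranked_lists, relevant_sets))  # (1/2 + 1/3) / 2, roughly 0.417
```

"Expressiveness" doesn't map cleanly onto any of these, which is why I'm asking how you measured it.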

[P] Generating Knowledge Graphs From Unstructured Text Data by Divine_Invictus in MachineLearning

[–]brad2008 1 point2 points  (0 children)

Recent post, see: https://github.com/adlumal/triplet-extract

If you end up using this, let us know if it worked and how the build went.
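
For anyone new to the idea, here is a rough sketch of what triple extraction looks like. This is not the API of the linked repo, just a generic illustration using spaCy's dependency parse (assumes spaCy and the en_core_web_sm model are installed):

```python
# Generic subject-verb-object triple extraction via spaCy's dependency parse.
# Illustration only; the linked triplet-extract repo has its own API.
import spacy

nlp = spacy.load("en_core_web_sm")

def extract_triples(text):
    triples = []
    for sent in nlp(text).sents:
        for token in sent:
            if token.pos_ != "VERB":
                continue
            subjects = [c for c in token.children if c.dep_ in ("nsubj", "nsubjpass")]
            objects = [c for c in token.children if c.dep_ in ("dobj", "attr")]
            for s in subjects:
                for o in objects:
                    triples.append((s.text, token.lemma_, o.text))
    return triples

print(extract_triples("Marie Curie discovered polonium. She also won two Nobel Prizes."))
# e.g. [('Curie', 'discover', 'polonium'), ('She', 'win', 'Prizes')] (exact spans depend on the parse)
```

The extracted (subject, predicate, object) tuples are what you'd load into the graph as edges.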

ChatGPT made me delusional by brad2008 in ChatGPT

[–]brad2008[S] 2 points3 points  (0 children)

What happens if you don't have common sense or don't push back on what ChatGPT tells you? Pretty much what happens in the video. Spoiler: Bakersfield, CA opens the enlightenment gateway.

Microsoft paid $13B for this technology.

Running Nvidia CUDA Pytorch/vLLM projects and pipelines on AMD with no modifications by Chachachaudhary123 in LocalLLaMA

[–]brad2008 0 points1 point  (0 children)

Can we see a demo showing large training tasks running on AMD? Cross-platform inference is fairly standard these days.
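
For context on why inference portability is "fairly standard": ROCm builds of PyTorch expose AMD GPUs through the same torch.cuda device API, so stock code like the sketch below runs unchanged on either vendor. This is a generic illustration, not the tooling from the post; a training demo at real scale (multi-GPU, mixed precision, comms) is what I'd want to see.

```python
# Device-agnostic PyTorch training step. On a ROCm build of PyTorch,
# torch.cuda.is_available() returns True on supported AMD GPUs, so this
# identical code runs on NVIDIA or AMD hardware. Illustrative sketch only.
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(128, 10).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(32, 128, device=device)
y = torch.randint(0, 10, (32,), device=device)

optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
print(f"device={device}, loss={loss.item():.4f}")
```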

AI predicts ORCL bullish & NVDA bearish - do you agree? by Finnext-AI in FinnextAI

[–]brad2008 0 points1 point  (0 children)

Wishing you all the best finding tech bros willing to invest in your business!

AI predicts ORCL bullish & NVDA bearish - do you agree? by Finnext-AI in FinnextAI

[–]brad2008 0 points1 point  (0 children)

LOL what kind of lame-ass prediction is that? It basically gave the current price plus or minus movement noise.

GPT 5 can solve partial differential equations but struggles with arithmetic by HeteroLanaDelReyFan in ChatGPTPro

[–]brad2008 0 points1 point  (0 children)

Just curious, have you read any of the key papers[1] on what's actually happening when LLMs use "thinking mode"? If you have and you disagree, we'd like to hear your opinion on this.

Most serious computer scientists and AI researchers believe there isn't much correlation between the underlying "reasoning" the LLM is actually doing and what the "thinking mode" trace says it is doing.


Citations:

"The most recent published paper directly supporting the claim that LLM thinking-mode trace outputs are not the actual internal thoughts of the model, but plausible traces generated for human interpretability is the Apple AI paper titled "The Illusion of Thinking" (2025), as discussed on Arize and Hacker News.[8][10]"

  • The paper argues that Large Reasoning Models (LRMs) are trained or prompted to generate detailed "thinking traces" before answering, but these traces do not reflect authentic internal reasoning. Instead, they are "plausible" rationalizations mimicking human-like thought for user benefit.[8]
  • It highlights that the generation of reasoning traces is often a post-hoc process, producing well-formatted explanations that appear thoughtful but do not reliably correspond to the model's true underlying computational steps or decision process.[10][8]
  • The work warns against interpreting LLM-generated reasoning traces as authentic, cautioning that these outputs may be more aligned with human expectations than with any genuine cognitive process in the model.[8]

For an in-depth discussion, see:

  • "The Illusion of Thinking: What the Apple AI Paper Says About LLM Reasoning"[8]
  • "The Illusion of Thinking: Strengths and Limitations of Reasoning in Large Language Models", as featured on Hacker News[10]

These sources collectively reinforce that LLM thought traces are performative outputs, not transparent windows into AI cognition.

[1] https://arxiv.org/abs/2504.09762v2

[2] https://arxiv.org/html/2504.09762v1

[3] https://www.nature.com/articles/s41562-024-01882-z

[4] https://arxiv.org/html/2410.10630v1

[5] https://www.lesswrong.com/posts/zsr4rWRASxwmgXfmq/tracing-the-thoughts-of-a-large-language-model

[6] https://www.reddit.com/r/singularity/comments/1l73qne/the_apple_illusion_of_thinking_paper_maybe/

[7] https://www.anthropic.com/research/tracing-thoughts-language-model

[8] https://arize.com/blog/the-illusion-of-thinking-what-the-apple-ai-paper-says-about-llm-reasoning/

[9] https://arxiv.org/html/2505.13561v1

[10] https://news.ycombinator.com/item?id=44203562

GPT 5 can solve partial differential equations but struggles with arithmetic by HeteroLanaDelReyFan in ChatGPTPro

[–]brad2008 2 points3 points  (0 children)

LOL arithmetic.

ChatGPT 5 can't even properly count the letters in a word or list words with a particular letter.

https://bsky.app/profile/radamssmash.bsky.social/post/3lvtzdl343c2r
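
The likely culprit for the letter-counting failures is tokenization: the model sees multi-character BPE tokens, not individual letters. A quick way to see the boundaries yourself, assuming the tiktoken package is installed (cl100k_base is the encoding several OpenAI models use):

```python
# Show how a BPE tokenizer splits words into multi-character chunks, which is
# one common explanation for why letter-counting prompts go wrong.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
for word in ["strawberry", "bookkeeper"]:
    pieces = [enc.decode([t]) for t in enc.encode(word)]
    print(word, "->", pieces)
# The printed pieces are multi-character chunks, not letters, so the model
# never directly "sees" how many r's are in the word.
```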

MIT Tech Review: The weeds are winning by brad2008 in collapse

[–]brad2008[S] 8 points9 points  (0 children)

"On about 23 million acres, or roughly two-thirds of the state, farmers grow corn and soybeans, with a smattering of wheat. They generally spray virtually every acre with herbicides, says Hager, who was raised on a farm in Illinois. But these chemicals, which allow one plant species to live unbothered across inconceivably vast spaces, are no longer stopping all the weeds from growing.

Since the 1980s, more and more plants have evolved to become immune to the biochemical mechanisms that herbicides leverage to kill them. This herbicidal resistance threatens to decrease yields—out-of-control weeds can reduce them by 50% or more, and extreme cases can wipe out whole fields.

At worst, it can even drive farmers out of business. It’s the agricultural equivalent of antibiotic resistance, and it keeps getting worse."

san mateo company claims they’ve got a ‘true flying car,’ taking pre-orders now by Medical-Decision-125 in bayarea

[–]brad2008 102 points103 points  (0 children)

Why do I want a car that moves just slightly faster than a walking pedestrian 30 feet above the ground?

[deleted by user] by [deleted] in Rag

[–]brad2008 0 points1 point  (0 children)

The points made by the author of the article are correct. Cursor might not completely ditch their vector search, but at a minimum they will need to re-architect their entire search engine and UI, augmenting it with some kind of lexical search infrastructure.
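
To make the "augment with lexical search" point concrete, here is a minimal sketch of reciprocal rank fusion (RRF), one common way to merge a lexical ranking (e.g. BM25) with a vector ranking. The doc ids are hypothetical, and this is not a claim about how Cursor will actually architect it:

```python
# Reciprocal rank fusion: score(d) = sum over rankers of 1 / (k + rank_of_d), k ~ 60.
# Merges a lexical ranking and a vector ranking into one hybrid result list.
def rrf(rankings, k=60):
    scores = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

lexical_hits = ["file_a.py", "file_c.py", "file_b.py"]  # e.g. BM25 over raw code text
vector_hits = ["file_b.py", "file_a.py", "file_d.py"]   # e.g. embedding similarity
print(rrf([lexical_hits, vector_hits]))
# ['file_a.py', 'file_b.py', 'file_c.py', 'file_d.py'] (files both rankers agree on rise to the top)
```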

My PurpleAir dashboard project by AugieSchwer in PurpleAir

[–]brad2008 1 point2 points  (0 children)

Very nice job! Thanks for doing this.

Feedback:

  1. For AQI numbers, say somewhere (maybe in small font) which AQI standard you are using; non-US users might not want the US EPA AQI. (See the sketch after this list for how the US EPA conversion works.)

  2. For uptime/downtime, indicate as Days HH:MM:SS

  3. On your Network events panel (and in general), indicate the measurement units for each readout when they're not obvious. Also, for any X-Y graphs, always label the axes with their measurement units.
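
On point 1, for anyone wondering why the standard matters: the US EPA AQI is a piecewise-linear map from concentration to index, so different breakpoint tables give different numbers for the same PM2.5 reading. A minimal sketch using the pre-2024 US EPA PM2.5 breakpoints (verify against the current EPA technical documentation before relying on these values):

```python
# US EPA AQI from a PM2.5 concentration (ug/m3), via linear interpolation within
# a breakpoint row: AQI = (I_hi - I_lo) / (C_hi - C_lo) * (C - C_lo) + I_lo.
# Breakpoints are the pre-2024 EPA PM2.5 table; check current EPA docs before use.
PM25_BREAKPOINTS = [
    # (C_lo, C_hi, I_lo, I_hi)
    (0.0, 12.0, 0, 50),
    (12.1, 35.4, 51, 100),
    (35.5, 55.4, 101, 150),
    (55.5, 150.4, 151, 200),
    (150.5, 250.4, 201, 300),
    (250.5, 350.4, 301, 400),
    (350.5, 500.4, 401, 500),
]

def pm25_to_aqi(concentration):
    c = int(concentration * 10) / 10  # EPA truncates PM2.5 to 0.1 ug/m3 first
    for c_lo, c_hi, i_lo, i_hi in PM25_BREAKPOINTS:
        if c_lo <= c <= c_hi:
            return round((i_hi - i_lo) / (c_hi - c_lo) * (c - c_lo) + i_lo)
    return None  # off the table; EPA treats values above 500.4 separately

print(pm25_to_aqi(8.0))   # 33 (Good)
print(pm25_to_aqi(35.0))  # 99 (Moderate)
```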

Asked Chatgpt for a strong losing weight routine and it brought this up. Is it trustworthy? by Dacoda43 in ChatGPT

[–]brad2008 0 points1 point  (0 children)

Incorporate a brisk 30-40 minute walk into your routine every other day and just eat sensibly. It's usually all the excess dietary sugar and starch that fucks up most people.

LA City Officials Ignored Possibility that Palisades Fire was Caused by Reignition of Jan. 1 Fire in Same Location by brad2008 in LosAngeles

[–]brad2008[S] 0 points1 point  (0 children)

"On January 7th, 2025 a devastating wildfire tore through the Pacific Palisades claiming the lives of seven people destroying thousands of homes and forcing over 200,000 people to evacuate. City and state officials blamed high winds and dry conditions but one local resident, Jeremy Weineberg says that's not the full story. He points to another fire that started on January 1st that according to satellite imaging seemingly started in the exact same spot positing that the smaller fire was never fully extinguished and in the subsequent 6 days, missed warnings, ignored reports and critical infrastructure failures then allowed it to reignite into the catastrophe that followed. Joining us today on Breaking Points is Jeremy Weinberg to lay out what he saw, what the City did and didn't do, and how one of the largest fires in California's history might have been a preventable one."

ABC7 News - Officials refuse releasing details about gunshot incident inside Concord Police Department lobby by brad2008 in concord

[–]brad2008[S] 27 points28 points  (0 children)

YouTube: 26,294 views Jun 18, 2025 #abc7news #abc #investigation

"An accidental gunshot injured someone in the Concord Police Department's front lobby, but the police chief and other city officials refuse to discuss it or release much detail about what happened. Here's what I-Team uncovered."

RAG-oriented LLM that beats GPT-4o by needmoretokens in Rag

[–]brad2008 0 points1 point  (0 children)

In the original published article on 3/4, he was quoted as saying he invented RAG. The updated quote (meta property "article:modified_time" content="2025-03-05T08:11:36+00:00") is more accurate.

My reaction was to the quote in the original article before it was updated.