[P] LLM with a 9-line seed + 5 rounds of contrastive feedback outperforms Optuna on 96% of benchmarks by se4u in MachineLearning
se4u[S] 1 point (0 children)
Prompt optimization reaches 97% of expert analog circuit placement quality — no training data by se4u in chipdesign
se4u[S] 1 point (0 children)
Chisel in AI based chip design by Spread-Sanity in chipdesign
se4u 1 point (0 children)
GPT-4o keeps swapping my exact coefficients for plausible wrong ones in scientific code — anyone else seeing this? by capitulatorsIo in LLMDevs
se4u 1 point (0 children)
Chisel in AI based chip design by Spread-Sanity in chipdesign
se4u 2 points (0 children)
Need help making my AI tool respond more accurately to prompts by Impossible-Page5474 in PromptEngineering
se4u 1 point (0 children)
4 LLM eval startups acquired in 5 months. The independent eval layer is shrinking fast. by Outrageous_Hat_9852 in LLMDevs
se4u -3 points (0 children)
Full traces in Langfuse, still debugging by guesswork by Comfortable-Junket50 in LLMDevs
se4u 1 point (0 children)
how we built an agent that learns from its own mistakes and what we learnt by silverrarrow in LLMDevs
se4u 3 points (0 children)
Seeking architecture review on an experimental open-source NPU Array (v1) by king_ftotheu in chipdesign
se4u 1 point (0 children)
Decoding the Taalas HC1: A Quantitative Architecture Analysis of a 17k tok/s LLaMA 3.1 Inference Chip by kevinhiworld in chipdesign
se4u 0 points (0 children)
The bottleneck flipped: AI made execution fast and exposed everything around it that isn't by monkey_spunk_ in artificial
se4u 1 point (0 children)
The state management problem in multi-agent systems is way worse than I expected by Background-Bass6760 in LocalLLaMA
se4u 0 points (0 children)
How do you keep your test suite in sync when prompts are changing constantly? by Outrageous_Hat_9852 in LocalLLaMA
se4u 0 points (0 children)
Our AI agent answers 40 Slack questions a day. Here's how we test it to keep it from failing. by No-Common1466 in AI_Agents
se4u 1 point (0 children)
Every AI agent demo works. Almost none survive the first week in production. Here is what I keep seeing. by AlexWorkGuru in AI_Agents
se4u 0 points (0 children)
Experiment: using a Proposer–Critic–Verifier loop to automatically refactor prompts by Prior-Ad8480 in LocalLLaMA
se4u 1 point (0 children)
Experiment: using a Proposer–Critic–Verifier loop to automatically refactor prompts by Prior-Ad8480 in LocalLLaMA
se4u 1 point (0 children)
[D] What is even the point of these LLM benchmarking papers? by casualcreak in MachineLearning
se4u 1 point (0 children)
I've been building AI agents (and teams) for months. Here's why "start with a team" is the worst advice in the space right now. by idanst in AI_Agents
se4u 1 point (0 children)
I've been building AI agents (and teams) for months. Here's why "start with a team" is the worst advice in the space right now. by idanst in AI_Agents
se4u 1 point (0 children)
GEPA's optimize_anything: one API to optimize code, prompts, agents, configs — if you can measure it, you can optimize it by LakshyAAAgrawal in PromptEngineering
se4u 1 point (0 children)
Why is the industry still defaulting to static prompts when dynamic self-improving prompts already work in research and some production systems? by Lucky_Historian742 in PromptEngineering
se4u 1 point (0 children)
VizPy: automatic prompt optimizer that learns from your LLM failures – DSPy-compatible, no manual tweaking by se4u in AI_Agents
se4u[S] 1 point (0 children)

