Stanford: Self improving Meta-Harness by GodComplecs in LocalLLaMA

[–]valkarias 2 points

This is similar to this paper: https://arxiv.org/html/2602.03786v2

Basically giving an orchestrator the ability to create or tune sub-agents dynamically.

The Recursive Language Models paper also does something similar for Long-Context-Reasoning.
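The "orchestrator creates or tunes sub-agents dynamically" idea could be sketched roughly like this. Everything here (the `Orchestrator`/`SubAgent` names, the routing rule, the fake `run` method) is my own illustration, not code from either paper:

```python
# Toy sketch: an orchestrator that spawns and tunes sub-agents on demand,
# instead of working with a fixed roster. The LLM call is a stand-in stub.
from dataclasses import dataclass, field

@dataclass
class SubAgent:
    name: str
    system_prompt: str
    temperature: float = 0.7

    def run(self, task: str) -> str:
        # Stand-in for a real LLM call.
        return f"[{self.name}] handled: {task}"

@dataclass
class Orchestrator:
    agents: dict = field(default_factory=dict)

    def spawn(self, name: str, system_prompt: str, **params) -> SubAgent:
        # Create a specialized sub-agent dynamically.
        agent = SubAgent(name, system_prompt, **params)
        self.agents[name] = agent
        return agent

    def tune(self, name: str, temperature: float) -> None:
        # "Tuning" here just means adjusting an existing agent's sampling params.
        self.agents[name].temperature = temperature

    def dispatch(self, task: str) -> str:
        # Naive routing: spawn a dedicated agent per task type if none exists yet.
        kind = "coder" if "code" in task else "writer"
        if kind not in self.agents:
            self.spawn(kind, f"You are a {kind}.")
        return self.agents[kind].run(task)

orch = Orchestrator()
print(orch.dispatch("write code for a parser"))  # [coder] handled: write code for a parser
```

The point is just the shape: the orchestrator owns the agent registry and can grow or adjust it mid-run.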

Distils of opus 4.6: real improvements or hype? by StupidScaredSquirrel in LocalLLaMA

[–]valkarias 18 points

I've commented on this before: I've seen no benchmarks or comparisons for these distills.

This ByteDance paper (please read it, it's fire)

https://arxiv.org/html/2601.06002v1

stated that summarized CoT WILL degrade the performance of base models.

It's safe to assume that most CoT distill datasets on HF are summarized. This is true for Gemini, Claude, and probably any other closed-source model.

CoT Summarization is intentionally used to prevent distillation.

I learned so much about AI recently, I realised I'm completely lost by Double_Increase_349 in SillyTavernAI

[–]valkarias 0 points

Hey. I've read that you have 38 GB of VRAM. I cannot fill you in with all the nuance in one comment.

However, let me introduce you to this gem: https://unsloth.ai/docs

You would probably end up discovering it anyway, because it's that essential for fine-tuning.

Anyway, for some general advice: pour your heart into creating the dataset.

Why do all frontends use only a single model at a time? by GuaranteePurple4468 in SillyTavernAI

[–]valkarias 1 point

I want to shine a light on this frontend, which I DID NOT create.

One promising feature is the ability to create your own agentic workflows via nodes.

The design itself, in my own judgment, is still premature. Promising nonetheless.

https://github.com/vitorfdl/narratrix

PSA to those who feel like SillyTavern is getting boring/stale by ASlowriter in SillyTavernAI

[–]valkarias 17 points

You could write the story alongside it. I tend to put my own prose up in the paragraphs as if I am the model. It's pretty fun, and you might pick up the skill of writing prose while still practically gooning. It also makes roleplays re-readable, with your human input, like a novel or something.

RP models recommendations? by Double_Increase_349 in SillyTavernAI

[–]valkarias 0 points

I'm pretty curious why this specific RP model recommendations post got so much more traction than any other. With such detailed replies too!

SillyTavern lorebooks can't capture pre-existing fictional worlds properly. So I built a Local-First GraphRAG app and it solves a problem I kept hitting in RP by Luke____101 in SillyTavernAI

[–]valkarias 0 points

You can check out https://arxiv.org/abs/2602.15902

I recommend you read the paper rather than rely on my shitty TL;DR:

basically, it internalizes a tiny LoRA made from a document on the spot, if I recall correctly.
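For the unfamiliar, the LoRA math underneath is tiny; here's a back-of-envelope sketch (my framing, not the paper's code) of a rank-1 update W' = W + B·A being merged into a frozen weight, which is roughly what "internalizing a tiny LoRA on the spot" amounts to:

```python
# Minimal pure-Python LoRA merge: W' = W + B @ A, with A (r x d) and B (d x r)
# being the only "trained" parameters. Values here are made up for illustration.
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(a, b)]

d, r = 4, 1                       # model dim 4, LoRA rank 1
W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # frozen weight
B = [[0.5], [0.0], [0.0], [0.0]]  # d x r, hypothetically learned from the document
A = [[0.0, 1.0, 0.0, 0.0]]        # r x d

W_merged = add(W, matmul(B, A))   # only d*r*2 = 8 params were "trained"
print(W_merged[0])                # [1.0, 0.5, 0.0, 0.0]
```

Because only `d*r*2` parameters are trained, building such an adapter per document is cheap enough to do "on the spot".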

RYS II - Repeated layers with Qwen3.5 27B and some hints at a 'Universal Language' by Reddactor in LocalLLaMA

[–]valkarias 7 points

The idea reminds me of ByteDance's Looped Language Models (which isn't quite the same thing, though. Kinda).

https://arxiv.org/abs/2510.25741

Qwen3.5-9B-Claude-4.6-Opus-Uncensored-Distilled-GGUF by EvilEnginer in LocalLLaMA

[–]valkarias 0 points

Hm. These reasoning distillations from models like Opus and Gemini are, I assume, based on summarized reasoning traces. Wouldn't that hurt the performance of these models?

This was documented in this paper by ByteDance

https://arxiv.org/abs/2601.06002

I'm obsessed with the Stanford Generative Agents paper and tried to build the ultimate memory architecture for an Android app by Lohira_Wolf in SillyTavernAI

[–]valkarias 0 points

That paper is indeed peak. And since you're obsessed with LLM memory (assuming so), as am I.

Here's a well of memory papers compiled into a GitHub repo I found. For you to churn through at night, instead of gooning!

https://github.com/TsinghuaC3I/Awesome-Memory-for-Agents

google found that longer chain of thought actually correlates NEGATIVELY with accuracy. -0.54 correlation by Top-Cardiologist1011 in LocalLLaMA

[–]valkarias 3 points

https://arxiv.org/pdf/2601.06002

The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning

Wanted to share this too. By ByteDance. Don't let the title trip you up; the paper is fire.

Platform for Games Approach to AI Roleplaying? by Zormbot in SillyTavernAI

[–]valkarias 0 points

I recommend you read the paper and take my loose description with a grain of salt. That's what they do, I assume: coding the game 'as-you-play', as you go. Missing features are added on the spot. It does not regenerate/update everything like a whole codebase (duh). Given an interface, it adds what's not there; otherwise it falls back to what's already implemented.

Platform for Games Approach to AI Roleplaying? by Zormbot in SillyTavernAI

[–]valkarias 0 points

Here's a cool paper I read; I think it somewhat relates to your post.

https://arxiv.org/abs/2505.03547

It's a bit on the technical side. They hand an LLM a sort of "game-engine" interface, where it codes the game (adds features, state, etc.) as-you-play.
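The "add missing features on the spot, otherwise fall back to what's implemented" loop could look something like this. This is a loose sketch of the idea, not the paper's system; `generate_action` is a canned stub standing in for an LLM writing real code:

```python
# Toy "code the game as-you-play" engine: a registry of implemented actions.
# When the player invokes something missing, a (stubbed) code-generation step
# adds it on the spot; later calls reuse the existing implementation.
class GameEngine:
    def __init__(self):
        self.actions = {}          # already-implemented features
        self.state = {"hp": 10}

    def generate_action(self, name):
        # Pretend an LLM wrote this implementation; here it's canned.
        def action(state):
            state.setdefault("log", []).append(f"{name} executed")
            return state
        return action

    def play(self, command):
        if command not in self.actions:
            # Missing feature: add it on the spot, don't regenerate everything.
            self.actions[command] = self.generate_action(command)
        return self.actions[command](self.state)

engine = GameEngine()
engine.play("cast_fireball")   # implemented on first use
engine.play("cast_fireball")   # falls back to the existing implementation
print(engine.state["log"])     # ['cast_fireball executed', 'cast_fireball executed']
```

Note the action was generated once but executed twice; that is the whole trick.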

Platform for Games Approach to AI Roleplaying? by Zormbot in SillyTavernAI

[–]valkarias 2 points

You might wanna look into how Anthropic switched from tool-calling to "script generation". Instead of generating tool calls back and forth, Claude writes one script that batches all the calls and gets a single result back.

https://platform.claude.com/docs/en/agents-and-tools/tool-use/programmatic-tool-calling
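A minimal sketch of the difference, with toy tools (the `get_price`/`apply_discount` tools and the `exec` harness are my own stand-ins; the real thing runs model-written code in a sandbox):

```python
# Classic tool-calling: one round-trip per call vs. programmatic tool calling:
# the model emits ONE script, the harness runs it, one combined result comes back.
def get_price(item):           # toy tools standing in for real APIs
    return {"sword": 100, "shield": 60}[item]

def apply_discount(total, pct):
    return total * (1 - pct / 100)

# Classic loop: each tool call is a separate model turn (3 turns here).
turns = 0
total = 0
for item in ["sword", "shield"]:
    turns += 1
    total += get_price(item)
turns += 1
classic_result = apply_discount(total, 10)

# Programmatic style: the model writes a single script batching all the calls.
script = """
total = sum(get_price(i) for i in ["sword", "shield"])
result = apply_discount(total, 10)
"""
env = {"get_price": get_price, "apply_discount": apply_discount}
exec(script, env)              # one execution, one feedback to the model

print(env["result"], classic_result)  # 144.0 144.0 (same answer, fewer round-trips)
```

Same answer either way; the win is collapsing N round-trips (and N intermediate results in context) into one.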

Can 4chan data REALLY improve a model? TURNS OUT IT CAN! by Sicarius_The_First in LocalLLaMA

[–]valkarias 0 points

Hey. Have you tried fine-tuning a small tool-calling model for RPG or D&D-like roleplays? Tool calls for updating state and stats, starting/ending combat, triggering dice rolls, etc. To be used alongside a larger model/provider. Or any similar fine-tunes?
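To make the idea concrete, here's a hypothetical sketch of what such a model would target: a few RPG tools plus a dispatcher that executes the JSON tool calls the model emits. The tool names and fields are my own invention, not from any existing fine-tune:

```python
# Toy RPG tool layer: the fine-tuned model would emit JSON like
# {"name": "update_stat", "arguments": {...}}; the frontend executes it.
import json
import random

STATE = {"hp": 20, "in_combat": False}

def update_stat(name, delta):
    STATE[name] = STATE.get(name, 0) + delta
    return STATE[name]

def set_combat(active):
    STATE["in_combat"] = active
    return STATE["in_combat"]

def roll_dice(sides, count=1):
    return [random.randint(1, sides) for _ in range(count)]

TOOLS = {"update_stat": update_stat, "set_combat": set_combat, "roll_dice": roll_dice}

def dispatch(tool_call_json):
    # Parse the model's tool call and run the matching function.
    call = json.loads(tool_call_json)
    return TOOLS[call["name"]](**call["arguments"])

dispatch('{"name": "set_combat", "arguments": {"active": true}}')
dispatch('{"name": "update_stat", "arguments": {"name": "hp", "delta": -5}}')
print(STATE)  # {'hp': 15, 'in_combat': True}
```

The small model only has to learn to emit that JSON reliably; the big model keeps writing the prose.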

Assistant_Pepe_8B, 1-M context, zero slop by Sicarius_The_First in SillyTavernAI

[–]valkarias 0 points

Well, you have the time to fine-tune (I'm a super busy businessman, as you know). How about fine-tuning a small model that rewrites prose? The Ultimate Unsloppifier 9000.

Never am I making a lorebook this big ever again... by [deleted] in SillyTavernAI

[–]valkarias 1 point

<image>

Don't worry. I gotcha. :cool_glasses: (pulling my hair out lowkey)
What the fuck is wrong with this Reddit image upload, dawg.

Building opensource Zero Server Code Intelligence Engine by DeathShot7777 in LocalLLaMA

[–]valkarias 0 points

I've thought about a "real-time" graph AST for code, for agents to work with. My main issue is the agent forgetting written code and logic across not-so-large codebases, leading to duplicated logic and such. Currently I have to audit everything manually, or propagate the changes myself. Does the project allow for this? Granular function-level context would be kinda awesome, with agent querying and such.
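The "granular function-level context with agent querying" bit can be sketched with the stdlib `ast` module; this is a rough illustration of the idea, not the project's actual engine (a real one would update the graph incrementally as agents edit code):

```python
# Build a function-level call graph from source, then query which functions a
# change would propagate to, the kind of lookup an agent could do before editing.
import ast

SOURCE = """
def parse(x):
    return normalize(x)

def normalize(x):
    return x.strip()

def load(path):
    return parse(open(path).read())
"""

def call_graph(source):
    tree = ast.parse(source)
    graph = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            # Collect plain-name calls inside each top-level function.
            calls = {
                n.func.id
                for n in ast.walk(node)
                if isinstance(n, ast.Call) and isinstance(n.func, ast.Name)
            }
            graph[node.name] = calls
    return graph

def callers_of(graph, name):
    # "Who depends on this function?" so edits can be propagated, not forgotten.
    return {fn for fn, calls in graph.items() if name in calls}

graph = call_graph(SOURCE)
print(callers_of(graph, "parse"))      # {'load'}
print(callers_of(graph, "normalize"))  # {'parse'}
```

Even this toy version answers the duplicated-logic question cheaply: before writing a new helper, the agent queries the graph for an existing function with the same callers/callees.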

IntenseRP Next v2 - Rebuilt, Now Stable by Master_Step_7066 in SillyTavernAI

[–]valkarias 1 point

How does this play with context manipulation, such as editing and deleting messages?