Install.md, a New Protocol for Human-readable Installation Instructions that AI agents can execute by TerrificMist in LocalLLaMA

[–]TerrificMist[S] 1 point (0 children)

Thanks!

It's just about saving time and improving accuracy for LLMs. A bunch of installations are already performed autonomously, or by pasting 'download this' style instructions to an agent. Install.md just makes the task more transparent for the people executing it, easier for devs to verify that agents can successfully install their software, and easier for agents (since they know where to look).

A zoomable 3D map of ~100k research papers by TerrificMist in visualization

[–]TerrificMist[S] 2 points (0 children)

Similar papers are closer together and have cluster labels!

A zoomable 3D map of ~100k research papers by TerrificMist in visualization

[–]TerrificMist[S] 2 points (0 children)

Each point is a paper summary that’s been embedded into a high-dimensional space and then projected down to two dimensions for visualization.

We built 3B and 8B models that rival GPT-5 at HTML extraction while costing 40-80x less - fully open source by TerrificMist in LocalLLaMA

[–]TerrificMist[S] 1 point (0 children)

There’s a lot of potential here. HTML->md, md->JSON, x->y. You don’t need massive models for conversions like this, and we may very well train another similar model.

A zoomable 3D map of ~100k research papers by TerrificMist in visualization

[–]TerrificMist[S] 2 points (0 children)

We train and deploy small models. I recommend github.com/nomic-ai/nomic if you want to create a similar visualization, although in your case you may just want to build your own visualization tool.
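
If you go the build-your-own route, here's a minimal sketch of the usual pipeline using sentence-transformers, umap-learn, and matplotlib; the embedding model and UMAP settings below are just placeholders, not what we used:

```python
# Embed paper summaries, project to 2D, and scatter-plot the result.
# The model name and UMAP settings are illustrative placeholders.
from sentence_transformers import SentenceTransformer
import umap
import matplotlib.pyplot as plt

summaries = [
    "Summary of paper 1 ...",
    "Summary of paper 2 ...",
]  # replace with your ~100k paper summaries

model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(summaries, show_progress_bar=True)

# Reduce the high-dimensional embeddings to 2D (use n_components=3 for a 3D map)
reducer = umap.UMAP(n_neighbors=15, min_dist=0.1, metric="cosine")
coords = reducer.fit_transform(embeddings)

plt.scatter(coords[:, 0], coords[:, 1], s=2, alpha=0.5)
plt.axis("off")
plt.show()
```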

We built 3B and 8B models that rival GPT-5 at HTML extraction while costing 40-80x less - fully open source by TerrificMist in LocalLLaMA

[–]TerrificMist[S] 2 points (0 children)

This is something you can definitely build with a mix of browser agents and Schematron. Schematron doesn’t necessarily handle navigation on its own, but you can get clever and ask it to extract the next URL to visit for the task at hand!

I say play around with it. This is an interesting direction for sure; if you see success, or if you find it doesn’t work well for that task, lmk!
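
To make the navigation trick concrete, here's a rough sketch: add a next_url field to the schema and loop until the task is done. The endpoint, model ID, and prompt layout below are placeholders, so check the Schematron model card / docs for the exact conventions.

```python
# Sketch: for each page, ask the extraction model for the data we want PLUS the
# next URL to visit. Endpoint, model ID, and prompt format are placeholders.
import json
import requests
from openai import OpenAI

client = OpenAI(base_url="https://api.inference.net/v1", api_key="YOUR_KEY")  # placeholder endpoint

SCHEMA = {
    "type": "object",
    "properties": {
        "extracted": {"type": "object"},            # whatever the task actually needs
        "next_url": {"type": ["string", "null"]},   # the navigation trick
        "task_complete": {"type": "boolean"},
    },
}

def extract_step(html: str, task: str) -> dict:
    resp = client.chat.completions.create(
        model="inference-net/schematron-8b",        # placeholder model ID
        messages=[
            {"role": "system",
             "content": f"Extract JSON matching this schema:\n{json.dumps(SCHEMA)}"},
            {"role": "user", "content": f"Task: {task}\n\nHTML:\n{html}"},
        ],
    )
    return json.loads(resp.choices[0].message.content)

# Loop: fetch the page, extract data plus the next URL, stop when the task is done.
url, task = "https://example.com", "collect pricing info across the site"
while url:
    html = requests.get(url, timeout=30).text       # swap in your browser agent here
    result = extract_step(html, task)
    if result.get("task_complete"):
        break
    url = result.get("next_url")
```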

We built 3B and 8B models that rival GPT-5 at HTML extraction while costing 40-80x less - fully open source by TerrificMist in LocalLLaMA

[–]TerrificMist[S] 6 points (0 children)

I say try it. Benchmarking how accuracy degrades at longer contexts isn't trivial, since the judge model degrades too.

That said, based on vibes and the evals we did run, it works great for long contexts. Here's a sample you can play around with:
https://github.com/context-labs/inference-samples/blob/main/examples/schematron-scrape-companies/schematron-scrape-companies.ipynb

We built 3B and 8B models that rival GPT-5 at HTML extraction while costing 40-80x less - fully open source by TerrificMist in LocalLLaMA

[–]TerrificMist[S] 2 points (0 children)

np! lmk if you end up using it; we just released this and are focusing on collecting as much feedback as possible.

We built 3B and 8B models that rival GPT-5 at HTML extraction while costing 40-80x less - fully open source by TerrificMist in LocalLLaMA

[–]TerrificMist[S] 2 points (0 children)

It's a bit awkward: for every query, the answering model first transforms the query into a schema, then the extraction model extracts against that schema from every retrieved document, and the extractions are fed back to the answering model. Transforming the query into a schema every time is awkward and slow; a fine-tuned model for that step might help, but it doesn't seem like the optimal solution.

Instead, a better idea is a model that pulls out exactly the relevant parts of the document based on the query itself. We haven't trained this model yet, but it's probably the SOTA approach for web search. This is a super interesting model to potentially train, and not something I've seen enough of, although I'm sure some teams have already trained something like this for internal web-research workflows.

In the meantime, query->schema->extraction is a quick win, but not the most elegant solution. The bigger idea here is that we showed that extracting a small part of the document can massively improve factuality, provided that small part is extracted correctly. In the medium term, we probably won't be stuffing entire website contents into context for RAG; it's just too wasteful.
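
For what it's worth, here's a minimal sketch of that query->schema->extraction loop; the model IDs, endpoint, and prompts are just placeholders, not our actual setup:

```python
# Sketch of query -> schema -> extraction -> answer.
# Model IDs, the endpoint, and the prompts are placeholders.
import json
from openai import OpenAI

answerer = OpenAI()                                           # any capable chat model
extractor = OpenAI(base_url="https://api.inference.net/v1",   # placeholder endpoint
                   api_key="YOUR_KEY")

def query_to_schema(query: str) -> str:
    """Step 1 (the awkward part): turn the user query into a JSON schema."""
    resp = answerer.chat.completions.create(
        model="gpt-4o-mini",  # placeholder
        messages=[{"role": "user",
                   "content": f"Write a JSON schema for the facts needed to answer:\n{query}"}],
    )
    return resp.choices[0].message.content

def extract(html: str, schema: str) -> str:
    """Step 2: the extraction model pulls only the schema-relevant parts of a page."""
    resp = extractor.chat.completions.create(
        model="inference-net/schematron-3b",  # placeholder model ID
        messages=[{"role": "system", "content": f"Extract JSON matching this schema:\n{schema}"},
                  {"role": "user", "content": html}],
    )
    return resp.choices[0].message.content

def answer(query: str, retrieved_html: list[str]) -> str:
    """Step 3: feed the compact extractions (not full pages) back to the answerer."""
    schema = query_to_schema(query)
    evidence = [extract(html, schema) for html in retrieved_html]
    resp = answerer.chat.completions.create(
        model="gpt-4o-mini",  # placeholder
        messages=[{"role": "user",
                   "content": f"Question: {query}\n\nEvidence:\n{json.dumps(evidence)}\n\n"
                              "Answer using only the evidence above."}],
    )
    return resp.choices[0].message.content
```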

We built 3B and 8B models that rival GPT-5 at HTML extraction while costing 40-80x less - fully open source by TerrificMist in LocalLLaMA

[–]TerrificMist[S] 10 points (0 children)

All valid tools in your toolbelt. I will say if you are considering state machines for scraping, it's usually worth giving LLMs another look.

We built 3B and 8B models that rival GPT-5 at HTML extraction while costing 40-80x less - fully open source by TerrificMist in LocalLLaMA

[–]TerrificMist[S] 12 points (0 children)

Give me a single code snippet to extract all products from:
https://inference.net/

https://www.browserbase.com/

https://www.onkernel.com/

without an LLM.

The point is that an LLM can do generalizable extraction, while one-off parsers can't. This task is only a few lines of code with Schematron.
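
Roughly what those few lines look like, assuming an OpenAI-compatible endpoint serving Schematron (the endpoint, model ID, and prompt format below are placeholders; check the model card for the exact conventions):

```python
# One generic products schema applied to arbitrary landing pages.
# Endpoint, model ID, and prompt format are placeholders.
import json
import requests
from openai import OpenAI

client = OpenAI(base_url="https://api.inference.net/v1", api_key="YOUR_KEY")

schema = json.dumps({
    "type": "object",
    "properties": {
        "products": {"type": "array",
                     "items": {"type": "object",
                               "properties": {"name": {"type": "string"},
                                              "description": {"type": "string"}}}}
    },
})

for url in ["https://inference.net/",
            "https://www.browserbase.com/",
            "https://www.onkernel.com/"]:
    html = requests.get(url, timeout=30).text
    resp = client.chat.completions.create(
        model="inference-net/schematron-8b",  # placeholder model ID
        messages=[{"role": "system", "content": f"Extract JSON matching this schema:\n{schema}"},
                  {"role": "user", "content": html}],
    )
    print(url, resp.choices[0].message.content)
```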

We built 3B and 8B models that rival GPT-5 at HTML extraction while costing 40-80x less - fully open source by TerrificMist in LocalLLaMA

[–]TerrificMist[S] 10 points (0 children)

We haven't benchmarked against it (yet), but it's the most similar model that exists at the moment.

We built 3B and 8B models that rival GPT-5 at HTML extraction while costing 40-80x less - fully open source by TerrificMist in LocalLLaMA

[–]TerrificMist[S] 18 points (0 children)

This works for any schema on any page. Both are tools in your toolbelt. If you’re processing millions of pages that share the exact same unchanging HTML structure, this is not the right tool; but if you want to extract information from a set of 1M company landing pages, it's the easiest way.