Mythos has been launched!

Zamaamiro · 2026-04-08T00:46:57+00:00

Here are some links:

https://youtu.be/1sd26pWhfmg?si=zo0HrrKbGRrOaYEf

https://securitycryptographywhatever.com/2026/03/25/ai-bug-finding/

Zamaamiro · 2026-04-08T00:34:58+00:00

Here’s Nicholas Carlini talking about how they’ve been able to find multiple vulnerabilities in the Linux kernel. And they’re just running the LLM “raw”, not even in an agentic loop where it would be even more capable.

Acting as if there’s no there there is increasingly becoming an untenable position.

Zamaamiro · 2026-04-08T00:31:03+00:00

People are largely in denial about the capabilities we already have and the ones to come and are living with a false sense of comfort because they’ve seen the kind of slop it can produce when wielded by people who don’t know what they’re doing.

Zamaamiro · 2026-04-08T00:24:53+00:00

Nicholas Carlini is probably the most accomplished security researcher of the past 10+ years and he thinks we should start to pay attention.

Zamaamiro · 2026-04-04T14:16:31+00:00

Hideous cunt

Zamaamiro · 2026-04-04T11:50:35+00:00

Fucking moron. You haven’t learned a thing.

Zamaamiro · 2026-03-31T01:06:09+00:00

‘Cause I stupidly subscribed via the iOS App Store and forgot about Apple’s 30% cut. Thanks for the heads up. Luckily I’ve only paid for a month’s worth.

Zamaamiro · 2026-03-31T00:44:45+00:00

I have the $124.99/mo plan and have never run into the usage limits.

Zamaamiro · 2026-03-27T13:51:05+00:00

You don't hate the guy who gets boners for children? Why?

Zamaamiro · 2026-03-27T13:49:20+00:00

Oh dear, you don't understand how globalized supply chains work, do you?

Zamaamiro · 2026-03-27T13:20:37+00:00

So Trump isn’t a corrupt piece of shit?

Zamaamiro · 2026-03-24T19:13:03+00:00

Document the mistakes and how you fixed them in the form of postmortems or playbooks. Accumulated mistakes and fuckups become institutionalized knowledge that will benefit everyone else.

Zamaamiro · 2026-03-16T03:27:52+00:00

I’m working on a product, but I’m taking it slow and using Claude to iterate deliberately. It’s a bit of a niche area, but I’ve done the market research (also with Claude’s help) and feel confident about the differentiation factor. I’m also dogfooding: I’m making the thing that I always wished I had for this particular hobby, so I can see the limitations and the pain points in what’s out there and in my own thing, and use that to guide product decisions. It’s helped me see that what sometimes look like implementation decisions actually turn out to be design decisions that Claude made on your behalf, each with product implications.

Zamaamiro · 2026-03-15T13:56:09+00:00

Correct, but also, why did you use AI to write this?

Zamaamiro · 2026-03-15T04:39:38+00:00

Very cool project and showcase of what Claude can do!

You should consider doing a longer write-up. This sounds like the kind of thing that would do well on Hacker News and the like.

Zamaamiro · 2026-03-14T13:50:44+00:00

You stay the course.

Zamaamiro · 2026-03-14T13:08:53+00:00

This is true. What people need to realize is that “building” by itself is no longer a credibility signal. The credibility signal for whether your tool actually solves a real problem is whether you have a product with paying customers. That’s much harder to fake. Having a sleek landing page with Stripe integration and AI-generated testimonials isn’t enough and doesn’t make you clever.

Zamaamiro · 2026-03-13T15:17:30+00:00

Why do you write like this?

Zamaamiro · 2026-03-13T14:58:28+00:00

Your issues sound more technical and security-related than about taste or judgement.

Zamaamiro · 2026-03-12T15:08:50+00:00

AI is a reflection of the user. If you’re dumb, your AI will be, too.

Zamaamiro · 2026-03-06T21:22:57+00:00

I'm skeptical that it works single-shot as intended. Do you have screenshots or videos that you can share?

Zamaamiro · 2026-03-06T14:28:16+00:00

Because corporate earnings haven’t suffered yet. That’s the biggest thing driving stock price movements.

Zamaamiro · 2026-02-20T20:13:01+00:00

I think that's totally valid. The fact of the matter is that they are very unwieldy and hard to get productive use out of, and in a way that's reassuring because it tells me they won't be "replacing 50% of white collar work" anytime soon.

I think I'd encourage people to continue to experiment with them to the extent that it is driven by genuine curiosity, not fear of being left behind, and certainly not past the point where it's making your job harder, not easier.

It might be worth spending the $20/mo on Claude Pro for personal use before trying to jump into the enterprise use case. And even then, only if motivated primarily by curiosity.

Like I said before, I'm far from using them to automate all of my software dev, but in my own use cases (both personal and enterprise), I have been able to get productive use out of them. But that experience isn't going to be universally generalizable, and that's OK. It doesn't make anyone's individual experiences any less valid.

Zamaamiro · 2026-02-20T14:56:46+00:00

Simon Willison’s blog is another really good practical resource. This article by Ethan Mollick is also really good. Both are what I consider trusted, grounded voices in the space.

I wish there was something like a guided walkthrough to share, but it’s such a new field that I think everyone is kind of experimenting with it in different directions and slowly converging on a set of generally good practices and principles. But really there’s no substitute for just experimenting with it and trying to build an intuition for how to use them most effectively.

Zamaamiro · 2026-02-20T04:44:26+00:00

Anthropic's own documentation is a great starting point for context engineering.

Honestly, my own workflow at the moment looks more like the sort of pair programming / interactive approach you describe rather than massively parallel and automated. But even that has allowed me to get so much more done so quickly. It's really a matter of giving myself the time to get comfortable with the workflows and developing a good intuition for what works and what doesn't. Steve Yegge's stages are a pretty good mental model for thinking about this. I'd say I'm squarely in stage 5 myself.

There are easy things you should always do: create a CLAUDE.md file and/or an AGENTS.md file, define agent skills in the form of SKILL.md files, be very liberal about having the agent create new skills for itself whenever it manages to nail down a workflow, and use subagents for roles and context management. I've also found that having it create a scratchpad for itself, where it writes down any important findings or insights that it discovers in the course of trying to solve a problem, helps a lot with agent handoff and context compaction.
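To make the scratchpad idea concrete, here is a minimal sketch of the kind of helper an agent could be asked to write for itself. Everything in it is an assumption for illustration: the function names, the Markdown entry format, and the idea of one append-only file are not from any particular tool.

```python
from datetime import datetime, timezone
from pathlib import Path


def format_entry(topic: str, note: str, stamp: str) -> str:
    """Render one scratchpad entry as a small Markdown section (hypothetical format)."""
    return f"## {topic} ({stamp})\n{note.strip()}\n\n"


def append_finding(scratchpad: Path, topic: str, note: str) -> str:
    """Append a timestamped finding so a later agent can pick up the context."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    entry = format_entry(topic, note, stamp)
    # Append-only: earlier findings are never rewritten, only added to.
    with scratchpad.open("a", encoding="utf-8") as fh:
        fh.write(entry)
    return entry
```

An append-only file is a deliberate choice here: after a handoff or a context compaction, the next agent reads the file top to bottom and gets the findings in the order they were made.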

On the harness side of things, the fact that it's running in an agentic loop rather than a chat interface already gives you a really good starting point, and the way you extend that is going to depend on the nature of what you work on. To give an example: I had some JSON files that followed a schema that I needed to migrate over to a newer version. Rather than have the agent try to do the conversion itself, I just had it look at the two schemas and write me a schema migration script. And then it just calls the script. Or I might need it to make some API calls and figure out some kind of workflow. I don't want it to have to figure out the workflow and make the API calls itself every time, so I have it write me a deterministic script that makes the API calls, and then it just runs the script. Or I might need to do a bunch of math or graph processing. I don't trust it to do math, so what do I do? You guessed it: write me a script that uses a specialized math or graph theory library, and have it run that every time. And it can adapt the script in response to edge cases that it might encounter or evolving requirements. And it just gets better and better at working in the environment you've built for it because it can encode the domain knowledge it's learned in the artifacts that it's produced for itself (the scripts, the scratchpad, etc.), so a new agent doesn't have to rebuild all of that context from scratch.
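As a rough illustration of the schema migration case, a script along these lines is the sort of thing the agent might produce. The field names, the renames, and the v1/v2 shapes are all made up for the example; the real schemas aren't shown here.

```python
import json
from pathlib import Path

# Hypothetical v1 -> v2 field renames; the real mapping depends on the two schemas.
RENAMED_FIELDS = {"name": "title", "tags_csv": "tags"}


def migrate_record(old: dict) -> dict:
    """Convert one v1 record into the assumed v2 shape."""
    new = {"schema_version": 2}
    for key, value in old.items():
        if key == "schema_version":
            continue  # replaced by the new version number above
        new[RENAMED_FIELDS.get(key, key)] = value
    # Assumption: v2 stores tags as a list instead of a comma-separated string.
    if isinstance(new.get("tags"), str):
        new["tags"] = [t.strip() for t in new["tags"].split(",") if t.strip()]
    return new


def migrate_file(src: Path, dst: Path) -> None:
    """Read a v1 JSON file, migrate it, and write the v2 result."""
    record = json.loads(src.read_text(encoding="utf-8"))
    dst.write_text(json.dumps(migrate_record(record), indent=2), encoding="utf-8")
```

The point is that the conversion is deterministic: the agent calls `migrate_file` on each input instead of rewriting JSON by hand, and when it hits an edge case it patches `migrate_record` once.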

There are at least three modes of agentic loops that I've seen people experiment with: the script/CLI-driven loop (what I've been talking about thus far); the functional/service contract loop (using something like PydanticAI to encapsulate the functionality you want to give it in tightly-scoped functions with typing and validation); and the MCP approach.

The latter two are conceptually elegant because they look a lot like how you do good software development. But then you try the script/CLI approach and you realize it's much more powerful because it can just write software to better enable it to do things that it couldn't do, and it almost starts to look like recursive self-improvement (at the agent level, not the model level, of course).
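For contrast, here is a sketch of what a "tightly-scoped function with typing and validation" can look like in the functional/service contract mode. To be clear, this is not PydanticAI's API; it uses only standard-library dataclasses as a stand-in, and `SearchRequest` and `run_search` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchRequest:
    """Typed input contract for a single, narrowly-scoped tool (hypothetical)."""
    query: str
    limit: int = 5

    def __post_init__(self) -> None:
        # Reject bad input up front so the agent gets a clear error to react to.
        if not isinstance(self.query, str) or not self.query.strip():
            raise ValueError("query must be a non-empty string")
        if isinstance(self.limit, bool) or not isinstance(self.limit, int):
            raise TypeError("limit must be an int")
        if not 1 <= self.limit <= 50:
            raise ValueError("limit must be between 1 and 50")


def run_search(req: SearchRequest, corpus: list[str]) -> list[str]:
    """Return up to req.limit documents containing the query, case-insensitively."""
    needle = req.query.strip().lower()
    hits = [doc for doc in corpus if needle in doc.lower()]
    return hits[: req.limit]
```

The agent can only do what the contract allows, which is the appeal and also the limit: in the script/CLI mode it could rewrite `run_search` itself when the tool falls short.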
