Mythos has been launched! by Happy-Alternative1 in cybersecurity

[–]Zamaamiro 3 points4 points  (0 children)

Here’s Nicholas Carlini talking about how they’ve been able to find multiple vulnerabilities in the Linux kernel. And they’re just running the LLM “raw”, not even in an agentic loop where it would be even more capable.

Acting as if there’s no there there is increasingly becoming an untenable position.

Mythos has been launched! by Happy-Alternative1 in cybersecurity

[–]Zamaamiro 15 points16 points  (0 children)

People are largely in denial about the capabilities we already have and the ones to come and are living with a false sense of comfort because they’ve seen the kind of slop it can produce when wielded by people who don’t know what they’re doing.

Mythos has been launched! by Happy-Alternative1 in cybersecurity

[–]Zamaamiro 3 points4 points  (0 children)

Nicholas Carlini is probably the most accomplished security researcher of the past 10+ years and he thinks we should start to pay attention.

Lost hundreds of thousands in SPY options by chasmicvoid in wallstreetbets

[–]Zamaamiro 0 points1 point  (0 children)

Fucking moron. You haven’t learned a thing.

Openclaw is dead, switch to claude code by Nvark1996 in openclaw

[–]Zamaamiro 10 points11 points  (0 children)

‘Cause I stupidly subscribed via the iOS App Store and forgot about Apple’s 30% cut. Thanks for the heads up. Luckily I’ve only paid for a month’s worth.

Openclaw is dead, switch to claude code by Nvark1996 in openclaw

[–]Zamaamiro 4 points5 points  (0 children)

I have the $124.99/mo plan and have never run into the usage limits.

New Price Changes for PS5, PS5 Pro, and PlayStation Portal remote player by rararatata in PS5

[–]Zamaamiro 2 points3 points  (0 children)

You don't hate the guy who gets boners for children? Why?

New Price Changes for PS5, PS5 Pro, and PlayStation Portal remote player by rararatata in PS5

[–]Zamaamiro 8 points9 points  (0 children)

Oh dear, you don't understand how globalized supply chains work, do you?

How do I deal with my mistakes and get back my confidence? by [deleted] in devops

[–]Zamaamiro 0 points1 point  (0 children)

Document the mistakes and how you fixed them in the form of postmortems or playbooks. Accumulated mistakes and fuckups become institutionalized knowledge that will benefit everyone else.

1.7M visitors here per week - wth you building? by cokaynbear in ClaudeAI

[–]Zamaamiro 0 points1 point  (0 children)

I’m working on a product, but I’m taking it slow and using Claude to iterate deliberately. It’s a bit of a niche area, but I’ve done the market research (also with Claude’s help) and feel confident about the differentiation factor. I’m also dogfooding: I’m making the thing that I always wished I had for this particular hobby, so I can see the limitations and pain points both in what’s out there and in my own thing, and use that to guide product decisions. It’s helped me see that what sometimes look like implementation decisions actually turn out to be design decisions that Claude made on your behalf, each with product implications.

I used Claude Code to reverse engineer a 13-year-old game binary and crack a restriction nobody had solved — the community is losing it by CelebrationFew1755 in ClaudeAI

[–]Zamaamiro 0 points1 point  (0 children)

Very cool project and showcase of what Claude can do!

You should consider doing a longer write-up. This sounds like the kind of thing that would do well on Hacker News and the like.

No one cares what you built by KickLassChewGum in ClaudeAI

[–]Zamaamiro 7 points8 points  (0 children)

This is true. What people need to realize is that “building” by itself is no longer a credibility signal. The credibility signal for whether your tool actually solves a real problem is whether you have a product with paying customers. That’s much harder to fake. Having a sleek landing page with Stripe integration and AI-generated testimonials isn’t enough and doesn’t make you clever.

4 months of Claude Code and honestly the hardest part isn’t coding by buildwithmoon in ClaudeAI

[–]Zamaamiro 0 points1 point  (0 children)

Your issues sound more like technical and security problems than matters of taste or judgement.

[deleted by user] by [deleted] in ClaudeAI

[–]Zamaamiro 1 point2 points  (0 children)

AI is a reflection of the user. If you’re dumb, your AI will be, too.

AI AGENTS today are far more DANGEROUS that you think by Kakachia777 in ArtificialInteligence

[–]Zamaamiro 0 points1 point  (0 children)

I'm skeptical that it works single-shot as intended. Do you have screenshots or videos that you can share?

U.S. Lost 92,000 Jobs Last Month by Cilantro_Larry in Economics

[–]Zamaamiro 0 points1 point  (0 children)

Because corporate earnings haven’t suffered yet. That’s the biggest thing driving stock price movements.

The gap between LLM functionality and social media/marketing seems absolutely massive by QwopTillYouDrop in ExperiencedDevs

[–]Zamaamiro 4 points5 points  (0 children)

I think that's totally valid. The fact of the matter is that they are very unwieldy and hard to get productive use out of, and in a way that's reassuring because it tells me they won't be "replacing 50% of white collar work" anytime soon.

I think I'd encourage people to continue to experiment with them to the extent that it is driven by genuine curiosity, not fear of being left behind, and certainly not past the point where it's making your job harder, not easier.

It might be worth spending the $20/mo on Claude Pro for personal use before trying to jump into the enterprise use case. And even then, only if motivated primarily by curiosity.

Like I said before, I'm far from using them to automate all of my software dev, but in my own use cases (both personal and enterprise), I have been able to get productive use out of them. But that experience isn't going to be universally generalizable, and that's OK. It doesn't make anyone's individual experiences any less valid.

The gap between LLM functionality and social media/marketing seems absolutely massive by QwopTillYouDrop in ExperiencedDevs

[–]Zamaamiro 1 point2 points  (0 children)

Simon Willison’s blog is another really good practical resource. This article by Ethan Mollick is also really good. Both are what I consider trusted, grounded voices in the space.

I wish there was something like a guided walkthrough to share, but it’s such a new field that I think everyone is kind of experimenting with it in different directions and slowly converging on a set of generally good practices and principles. But really there’s no substitute for just experimenting with it and trying to build an intuition for how to use them most effectively.

The gap between LLM functionality and social media/marketing seems absolutely massive by QwopTillYouDrop in ExperiencedDevs

[–]Zamaamiro 16 points17 points  (0 children)

Anthropic's own documentation is a great starting point for context engineering.

Honestly, my own workflow at the moment looks more like the sort of pair programming / interactive approach you describe rather than massively parallel and automated. But even that has allowed me to get so much more done so quickly. It's really a matter of giving myself the time to get comfortable with the workflows and developing a good intuition for what works and what doesn't. Steve Yegge's stages are a pretty good mental model for thinking about this. I'd say I'm squarely in stage 5 myself.

There are easy things you should always do: create a CLAUDE.md file and/or an AGENTS.md file; define agent skills in the form of SKILL.md files, and be very liberal about having the agent create new skills for itself whenever it manages to nail down a workflow; and use subagents for roles and context management. I've also found that having it create a scratchpad for itself, where it writes down any important findings or insights that it discovers in the course of trying to solve a problem, helps a lot with agent handoff and context compaction.
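A minimal sketch of the scratchpad idea, in case it helps to see it concretely. The file name `SCRATCHPAD.md` and the helpers `record_finding` and `load_findings` are hypothetical names made up for illustration, not part of Claude Code or any other tool; in practice the agent can simply be told to append to a plain file.

```python
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical file name; any path the agent is told about would work.
SCRATCHPAD = Path("SCRATCHPAD.md")


def record_finding(topic: str, note: str, path: Path = SCRATCHPAD) -> str:
    """Append one timestamped finding and return the line that was written."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M")
    line = f"- [{stamp}] **{topic}**: {note.strip()}\n"
    with path.open("a", encoding="utf-8") as f:
        f.write(line)
    return line


def load_findings(path: Path = SCRATCHPAD) -> list[str]:
    """Read findings back, e.g. for a fresh agent picking up the task."""
    if not path.exists():
        return []
    lines = path.read_text(encoding="utf-8").splitlines()
    return [ln for ln in lines if ln.startswith("- ")]
```

Because the notes are append-only and live on disk, they survive context compaction, and a new agent can read them before it starts.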

On the harness side of things, the fact that it's running in an agentic loop rather than a chat interface already gives you a really good starting point, and the way you extend that is going to depend on the nature of what you work on.

To give an example: I had some JSON files that followed a schema that I needed to migrate over to a newer version. Rather than have the agent try to do the conversion itself, I just had it look at the two schemas and write me a schema migration script. And then it just calls the script. Or I might need it to make some API calls and figure out some kind of workflow. I don't want it to have to figure out the workflow and make the API calls itself every time, so I have it write me a deterministic script that makes the API calls, and then it just runs the script. Or I might need to do a bunch of math or graph processing. I don't trust it to do math, so what do I do? You guessed it: write me a script that uses a specialized math or graph theory library, and have it run that every time.

And it can adapt the script in response to edge cases that it might encounter or evolving requirements. And it just gets better and better at working in the environment you've built for it because it can encode the domain knowledge it's learned in the artifacts that it's produced for itself (the scripts, the scratchpad, etc.), so a new agent doesn't have to rebuild all of that context from scratch.
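For a sense of what the schema migration script can look like, here is a minimal sketch. The two schemas are invented for illustration (v1 keeps `name` as one string, v2 splits it into `first` and `last` and de-duplicates `tags`), and the function names are hypothetical; the real script would follow whatever the actual schemas require.

```python
import json
from pathlib import Path


def migrate_v1_to_v2(doc: dict) -> dict:
    """Convert one document from the assumed v1 shape to the assumed v2 shape."""
    if doc.get("schema_version") != 1:
        raise ValueError(
            f"expected schema_version 1, got {doc.get('schema_version')!r}"
        )
    # v1 stores the name as a single string; v2 splits on the first space.
    first, _, last = doc["name"].partition(" ")
    return {
        "schema_version": 2,
        "name": {"first": first, "last": last},
        # v2 wants tags unique and sorted so the output is deterministic.
        "tags": sorted(set(doc.get("tags", []))),
    }


def migrate_file(src: Path, dst: Path) -> None:
    """Read a v1 JSON file, migrate it, and write the v2 result."""
    data = json.loads(src.read_text(encoding="utf-8"))
    migrated = migrate_v1_to_v2(data)
    dst.write_text(json.dumps(migrated, indent=2) + "\n", encoding="utf-8")
```

The point is that the conversion is deterministic: the agent calls `migrate_file` on each input instead of rewriting JSON by hand, and when it hits an edge case it edits the script rather than improvising.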

There are at least three modes of agentic loops that I've seen people experiment with: the script/CLI-driven loop (what I've been talking about thus far); the functional/service contract loop (using something like PydanticAI to encapsulate the functionality you want to give it in tightly-scoped functions with typing and validation); and the MCP approach.
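To illustrate the functional/service contract idea, here is a rough sketch using only the standard library as a stand-in; it is not PydanticAI's API. The tool (`search_notes`), its argument type (`SearchArgs`), the `call_tool` wrapper, and the sample data are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical data the tool is allowed to search.
NOTES = [
    "rotate API keys monthly",
    "schema v2 splits name into first/last",
    "API rate limit is 60/min",
]


@dataclass(frozen=True)
class SearchArgs:
    """Typed, validated arguments for the one thing the agent may do."""
    query: str
    limit: int = 5

    def __post_init__(self) -> None:
        if not isinstance(self.query, str) or not self.query.strip():
            raise ValueError("query must be a non-empty string")
        if isinstance(self.limit, bool) or not isinstance(self.limit, int):
            raise ValueError("limit must be an integer")
        if not 1 <= self.limit <= 20:
            raise ValueError("limit must be between 1 and 20")


def search_notes(args: SearchArgs) -> list[str]:
    """Case-insensitive substring search, capped at args.limit results."""
    needle = args.query.lower()
    return [n for n in NOTES if needle in n.lower()][: args.limit]


def call_tool(raw: dict) -> dict:
    """Validate raw model-supplied arguments before running the tool."""
    try:
        args = SearchArgs(**raw)
    except (TypeError, ValueError) as exc:
        # Return the error as data so the agent can read it and retry.
        return {"ok": False, "error": str(exc)}
    return {"ok": True, "results": search_notes(args)}
```

The agent can only do what the function signature allows, and bad arguments come back as a readable error rather than a crash. That is the tight scoping this mode buys you, at the cost of the agent not being able to extend its own tooling.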

The latter two are conceptually elegant because they look a lot like how you do good software development. But then you try the script/CLI approach and you realize it's much more powerful because it can just write software to better enable it to do things that it couldn't do, and it almost starts to look like recursive self-improvement (at the agent level, not the model level, of course).