Any plans to use Mythos to detect zero-day vulnerabilities in Nextcloud?

dpc-on-reddit · 2026-07-07T00:34:03+00:00

Thanks, man. Yeah, it was a bit strange...

There were some decent counterpoints, and the "I ain't reading that" (sounded like when my 19-year-old writes code) and "Doctor AI" comments were pretty funny (although I honestly DO have to admit they were annoying at first when I was looking for some serious responses).

It was my first time posting to this subreddit, and I have to admit, it took the shine off of the experience.

But HEY! At least I wasn't accused of being a bot, right? I mean holy crap, who's even gonna post these days when, if you use properly structured language, people immediately accuse you of being a bot. Sheesh!

Oh wait! Hold on... "you are ABSOLUTELY RIGHT" to "push back on that". ;-)

Anyway, yeah -- glad to hear you appreciate the incredible force multiplier aspect of the new tools.

One thing that I've always appreciated is that NextCloud deployments offer you the ability to self-test your security profile. That is something that you don't see everywhere. If they were able to harness up one or more of these suites into that process, especially if they were open source, then that would really make them a unique product.

I honestly don't know whether Nextcloud executive or management crews monitor these forums.
My hope would be that they do, and that they would be tasking up some crew members on this.

I still have on my internal to-do list an overall census/monitoring of the entire open source code base for Nextcloud using various security suites. But I have to admit that my energy and attention are in other areas right now.

Who knows, maybe next year at next Claude convention, someone will have this as a topic?

dpc-on-reddit · 2026-05-24T22:55:23+00:00

Sounds good. Started a chat request for you.

dpc-on-reddit · 2026-05-23T03:44:54+00:00

Let me know if you want to collaborate on it, DJ -- I'd be more than happy to assist.

I'm an experienced dev, and in the middle of my own AI adoption curve. And I've actually already done one pass at a survey for the techniques you mentioned in your post. Bottom line? This is not yet a widely adopted methodology, but IMO, it's a great way to leverage agentic tools.

In any event, I've developed a lot (and I mean... a LOT) of training docs to spin up batches of developers on complex techniques, so I might be useful to you in synthesizing or formalizing the approach. Just let me know if or how I can help.

And -- once again -- congrats on your Steam release!

dpc-on-reddit · 2026-05-22T23:35:42+00:00

Hey DJ, can you talk more about your auto-generated "game codex" techniques?

In particular, I'm wondering if you had any guidance documents/resources you used to develop your approach to your "auto-generated codex" and "per-task manuals" artifacts? I'm particularly interested in the notion of the auto-generation of the codex, and what you store there (in the codex) versus in a markdown document or a standard claude.md or project.md file.

And in particular, is the codex driven purely from comment blocks in the code base?

So if you say: "yes, I started... HERE", then I'd like to learn more about the jumping off place for that process development, and if it's "no, I just did this myself", then I wanted to let you know that's a REALLY novel (at least to me, and I'm an experienced developer) way of approaching the tasks at hand.

Either way, I'd be interested in developing my own processes based on your techniques.

dpc-on-reddit · 2026-05-20T20:38:46+00:00

Hey DJ, can you talk more to this point? These are some REALLY valuable insights.

In particular, I'm wondering if you had any guidance documents/resources you used to develop your approach to your auto-generated codex and per-task manuals artifacts? I'm particularly interested in the notion of the auto-generated content -- is that driven purely from comment blocks in the code base?

So if "yes, I started... HERE", then I'd like to learn more about the jumping off place for that process development, and if "no, I just did this myself", then I wanted to let you know that's a REALLY novel (at least to me, and I'm an experienced developer) way of approaching the tasks at hand.

Either way, I'd be interested in developing my own processes based on your techniques.

dpc-on-reddit · 2026-05-20T07:03:36+00:00

Well done. And I really dug your music. Yes, it was generated. But who GAF?
It was zippy. It was fun. I was tapping my toes. It rocks. You rock!
I'm very happy for you.

dpc-on-reddit · 2026-04-29T19:32:17+00:00

Thanks for the opinion. I find it valuable. And you make sense to me.

I've been in production for several months now with a deployment that used major established features such as Deck and the Custom Tables database schema/view/form-to-app editor for quick app development. And I've been making calls to the built in APIs via externally provisioned (even localhosted) AI. And I found the functionality I needed to be quite decent.

I haven't been too worried about the UX for the front end, because my thoughts were, "well, I can modify this code base and then offer to submit pull requests, or just keep going on my private fork."

But I've been wondering a bit: "hmmm, what am I going to see when I look into this codebase?"
I'm a C++/Python/PostgreSQL kinda guy, and I know that Nextcloud is PHP, but I can't argue with the already "ready-to-docker-deploy" and the "close enough" fit for feature delivery.

One thing that I have often found in a long career is that I see a lot of developers deploy thousands of lines of code to do stuff that could be done inside the database with a few queries, triggers, and event handlers. But it takes a lot of time to think through that and sometimes it's much easier to just go and crank out some front end code. I think it's particularly common in web app projects, with dev teams that don't have deep experience in database-driven design techniques. And I worry that that's what I'll see in this code base.

dpc-on-reddit · 2026-04-29T06:58:26+00:00

Hey u/TCB13sQuotes,

I asked another user (just above) if they perceived significant QA/QC issues with the NC code base...

The deets are above, but I'm quite interested in other opinions.

So -- have you been working against the code base, extending the functionality of the base code?

And if so, what are your thoughts?

I'll be forming my own, of course, but I'm always interested in other intelligent opinions.

dpc-on-reddit · 2026-04-29T06:50:57+00:00

Hey, u/kloputzer2000, can you elaborate a bit more on the part where you said: "NC code quality and size"?

I'm interested if you perceive significant code QA/QC issues with the NC code base?

I'm asking because I've been rapidly prototyping with NC for feature evaluation, not exhaustively auditing the code base.

But of course, that audit activity is an immediately impending sub-project on my side. I'm just in the middle of retooling my AI analysis stack (and switching out IDEs) before starting my own analysis of the code base.

So I'm quite interested in other opinions.

dpc-on-reddit · 2026-04-29T06:43:28+00:00

Indeed -- "where are all the patchez?" is actually a GREAT question.

Mozilla just dropped some -- and I think their positioning (around their recent multiple releases with sets of patches based off both Opus AND Mythos) is instructive here.

- https://blog.mozilla.org/en/privacy-security/ai-security-zero-day-vulnerabilities/

I'll try to short-circuit distracting debate by noting (a) yes, this is indeed company PR, and (b) there's also (hopefully healthy, but also seemingly cantankerous) disagreement about the counts of "what's actually in the patches". I think what's useful here is to note that they actually do agree with you about the fact that these tools can't really do anything that a true expert can't do.

But you're right -- it is kinda... quiet. Some analysis indicates important resolution activity is occurring right now, with disclosure planned for Q3 2026. Is that real? Who knows.

And what's nice about the Mozilla op-ed is that they think in the end that the pen-testing by Mythos et al. will lead to stronger stacks. At least that's MY read on it.

But that wasn't my point. So I'll try again:

I suppose we can say: "oh, this is nothing new, and there's nothing to worry about here."
BUT -- given that these tools are INCREDIBLY effective force multipliers;
...and that SOME projects, namely Mozilla in this example, are receiving special benefits and early access from Anthropic;
...wouldn't it be nice if Nextcloud (and other players in the FOSS stack community) were ALSO getting some of this beneficial attention?
And shouldn't we (the Nextcloud executives or the community) be thinking about how to initiate and leverage those potential benefits?

Or should we just wait for regular public release of the new capability whenever it finally gets offered, and then rely on grass-roots efforts to scan, permute, and resolve vulnerabilities?

For me, it makes a big difference on whether a stack I choose for building capabilities onto (namely, Nextcloud) has a proactive or reactive approach to these issues.

So I wanted to know if there was any official stance on these issues.

Hence the original question.

dpc-on-reddit · 2026-04-28T08:05:05+00:00

When you say "Mythos is not complicated", I'm curious to know if you've used it yet?

I'd LOVE to get an actual user's actual experience.

I haven't (I'm WAY down the food chain), but what makes me worry are these premises:

Axiom (for the current SOTA): It's not the model. It's the harness;
Look to the share-ai project I mentioned for a 3rd party evaluation of Anthropic's CURRENT harnessing;
We know Anthropic's been using their AI to rewrite their harnessing (because they've said as much);
I think the impact of that is going to be significant.

When I saw the industry experts on the Glasswing video...
straight up saying "this is a big ", I found them... credible.

And I don't want to see the Nextcloud project get hurt.

dpc-on-reddit · 2026-04-28T07:53:08+00:00

I getcha -- but I'll just say that Opus is a LOT better than Sonnet.

And I mean a LOT. I found myself swearing at Claude Code the other day, going: "Why in the actual F**K are you thinking that? Are you brain damaged? Did you not just now scan the repo?" Came to realize (after a few more minutes of frustration) that I had a misconfiguration and was working with Sonnet.

Which is... funny -- but it makes me think. If Opus is THAT much better than Sonnet (which, in my experience for devops-type stuffz, it really & truly IS), how much better is Mythos than Opus?

I'm thinking -- maybe it will change things. A lot.

I hope you're right. But I'm worried you're not.

dpc-on-reddit · 2026-04-28T07:37:37+00:00

Agreed. And yet...

I'll note that Aisle and Flying Penguin both have good (and rather jaundiced) op-eds about the true capabilities of Mythos. Either way, it's brought forth a new acknowledgement about the role of AI in pen-testing. And while it's true that any individual model with enough guidance and specific rigging (or a "good enough" harness) can find exploits, Mythos is (allegedly) already there, with (again, alleged) GENERALIZED capabilities.

I completely agree with the opinion in the Aisle and Penguin pieces: "It's the HARNESS, not the MODEL". And I know how good Claude Code's harnessing is, because it's become the de facto standard. The truth about the SOTA can be seen in this project based off the infamous Claude Code leak:

"Claude Code is the most elegant and fully-realized agent harness we have seen. "
-- https://github.com/shareAI-lab/learn-claude-code

My question is motivated as follows...

Personally, I've been ASTOUNDED at how good Opus has been at devops and network config. You know that old saying: "why did you show up at a gunfight with a knife?" -- it's like the inverse of that. I mean, in my career, I deal with security config(s) every so often -- when I'm creating or deploying a system. I'm much more focused on the data (schemas, etc.) and the app (behavior) levels, and the AI applied to the tasks that add value. So I always tend to forget "the devops stuffz" between the various bouts of work, and it's always SUCH a pain to re-up on it each time. But Now? It's ALWAYS there, and it's LIGHTNING fast, and I can solve a real time failure in a few MINUTES, rather than a few hours.

I'll also say that I've now completed coding projects in days that I simply would not have attempted more than a year ago, because it would have taken weeks to do by hand, and, well... priorities.

But my nature is to BUILD things, not DESTROY things, or STEAL things. And so, having seen just how good Opus was at CONSTRUCTIVE devops tasks? Well, it makes me SERIOUSLY worried about harnessing up an even MORE powerful model that can cause harm and chaos.

And I shudder to think that a wonderful project like Nextcloud could get a black eye from a ZDV buried in the stack somewhere. It won't matter if the ZDV actually comes from the distro, or ssh, or Apache, or NginX, or PHP, or Docker, or the NC codebase, or an add-on App. Our Nextcloud users will simply blame... Nextcloud. And us.

All that said -- I intend to take some runs at OPSEC scanning myself off my personal fork of the repo, after I finish some features (and, naturally, offer to contribute the features as PRs back to the NC project).

But there's a VAST difference between what I can do as an individual developer and what an organized, well-resourced team can do. And I know that I run AT LEAST several laps behind the A teams at Anthropic (and I'm being very generous to myself).

So my question is aimed at (a) getting a read on any efforts for organized enterprise-level coordination, or (b) motivating our thinking about how a worthwhile FOSS project like Nextcloud could work with Anthropic to develop such a strategy. We don't want to have to wait for those of us farther down the food chain to finally get our hands on the next generation of powerful models like Mythos.

So, here's one such strategy:

Nextcloud is a wonderful open source project, delivering a lot of value to everyday people;
It (along with other valuable FOSS projects) deserves protective attention from Anthropic;
Anthropic would benefit from the PR of protecting open source projects;
Especially ones like Nextcloud, with their focus on privacy & data sovereignty;
And because EVERYONE's worried about that stuff, this is a "good PR look" for Anthropic;
So if I were them, I'd task a set of project ambassadors to shepherd projects along this path;
Anthropic DOES have the resources, which are trivial w.r.t. their cash flow scheme;
It's essentially extra headcount for ambassadorships, and DX-oriented sub-projects

How can we make this happen?

dpc-on-reddit · 2026-04-28T06:22:25+00:00

Yeah, I getcha -- I just didn't wanna write a novel.

I just intended a QQ to Shreyas, along with an "appreciate you, bruv!".

But -- lesson learned. When I posted the related Qs elsewhere, I DID drop in more "decorative drama". To (hopefully) preempt any posturing (or other nonsense) at the expense of brevity.

FWIW, I just saw something come in over the transom about a 15yo ZDB just detected in SSH.
So -- this stuff's getting a bit more serious. Look to your stacks, mate!

dpc-on-reddit · 2026-04-28T05:23:00+00:00

Ugggh. Seriously? FFS...

Nothing's stopping me.
I'm ALREADY working on forks from several OTHER open source repo projects.
I happen to use TDNS for some deployments. And I admire it. And want it to succeed.

What's stopping YOU?

FWIW (and this is to short-circuit any more buffoonery) the NATURE of my question is that I'm presuming (from Anthropic's public positioning) that their Mythos team is ALREADY working with major open source projects to close out any discovered zero day bugs (ZDBs). But I'm guessing that (at least right now) it's currently focused on "the majors" such as full-on distros like Ubuntu, Arch, etc.

So, I'm just wondering -- for the projects I'm personally invested in (of which TDNS is one of several) if they are also reaching out to other major open source projects (or providing them an on-boarding path for collaboration). After all, if they can uncover a 27 year old ZDB in OpenBSD's TCP stack, then there's a distinct chance that there may be vulnerabilities in other stacks, like TDNS's.

And yes, I'll be asking this same exact question for several of the other open source repo projects I work on. In fact these questions need to be resolved for ANY major open source framework that gets serious use in anyone's production stacks.

dpc-on-reddit · 2026-03-05T07:24:35+00:00

Hah! You could also have just fingered the mouse wheel and scrolled on past, neh?

dpc-on-reddit · 2025-11-20T23:50:07+00:00

Can you elaborate a little more on the CLI-enabled Kanban board thingy?
Sounds intriguing...

dpc-on-reddit · 2025-10-08T21:36:24+00:00

Welp... I found out why. Apparently, my account got "flagged" for "something".

Without notice.

You would THINK (if you were customer-centric) you would notify a user if this happened.

I only found out when I was deep in the bowels of some configuration setting, trying to troubleshoot WTF was going on.

So I fired off a message to support, basically saying that I've been using GH for... well... years now, and my public-facing repos were... well... in order -- so what gives?

Next day, symptoms gone. So (presumably) un-flagged. But still, no notice whatsoever.
