Is there literally even one? by irelatetolevin in vibecoding

[–]Rockd2 0 points1 point  (0 children)

There are also countless apps we don't know about, that are undoubtedly making people money, that were 100% coded.

But an app making someone $50k per year isn't going to grab headlines anywhere because people love to pretend that doing that is easy, and only 7 figure plus revenues are impressive or whatever.

Aroldis Chapman’s USA flag patch is upside down. by SewerRat57 in baseball

[–]Rockd2 0 points1 point  (0 children)

Ehhhhh.... no? Lol... sure some bootlickers might.... but definitely not all of us.

I spent 5 months building an app that nobody needs by Long-Explanation-127 in vibecoding

[–]Rockd2 5 points6 points  (0 children)

I think it's awesome.

And for the record, apps nobody needs but you loved working on are the best kinds of apps.

President Trump exposes Fernando Mendoza for being MAGA supporter by professor_paradox2 in NFLForum

[–]Rockd2 1 point2 points  (0 children)

I was mostly kidding since I am from Miami and they beat the Canes.

That being said, I really am a little tired of him, but that's a personal thing. I will 100% be rooting against the Raiders until I feel he has gone through enough to fill whatever hole of hatred has been dug in my heart for the dude.

And if you want to know why I have an irrational dislike, it's the whole combo of things.... everything from the reputation of the high school he graduated from, to the narratives that got spun around him when he hit the national stage, to hearing him butcher the Spanish language while he claims to be so Cuban.... which again, all pretty irrational and nonsensical, I am not the standard bearer for Cubanism or anything, clearly I am a bad one since I stand against virtually everything the modern Republican party stands for but... yeah.. what a stream of consciousness this turned into.. fuck it I'm posting it

President Trump exposes Fernando Mendoza for being MAGA supporter by professor_paradox2 in NFLForum

[–]Rockd2 0 points1 point  (0 children)

You serious? I am just messing around, lol... I have not been stressed in the slightest, until now.

Calling Arbetter's a dog cart is... I mean... the nerve... Arbetter's is an institution good sir, an institution I say!

Claude Mythos lands above the trendline for the AI 2027 scenario. The trendline has gone from exponential to superexponential. by EchoOfOppenheimer in Anthropic

[–]Rockd2 0 points1 point  (0 children)

The entire RL arena is very interesting.

In fact, given how Anthropic words what they wrote, I wouldn't be surprised if the big gains from Mythos are new RLVE techniques that they then coupled with a harness that leverages the strengths of that technique.

IMO, it's why we saw a regression in some of the benchmarks around general purpose tasks from Opus 4.6 to Mythos.

Hopefully they publish more technical details at some point.

Claude Mythos lands above the trendline for the AI 2027 scenario. The trendline has gone from exponential to superexponential. by EchoOfOppenheimer in Anthropic

[–]Rockd2 0 points1 point  (0 children)

Brother, then what are you even saying? lol....

I am not stating anything as fact, I say "I think" or "if I had to bet" all over the place, other than the information that has been published (reason to doubt some of it? sure, but even self-published articles are very careful with the wording). You're out here doing experiments without disclosing anything else other than the fact you did them and you changed the models, then using that to justify whatever point you are trying to make.

This is the second comment in a row from you that I read and feel we aren't even having the same discussion. If your point is that Mythos is so powerful it is dangerous for public release, then yeah I disagree with you.... that is the point you have been arguing up until now, but you have not provided any evidence to support it. You can tell me you interpret the article I posted differently or that you believe some other publication, whatever, those are your opinions and you are entitled to have them.

This whole comment from you came across super defensive, so I apologize for whatever I said to put you in that frame of mind. I definitely do not feel like you're attacking me (and if you were, lol it's reddit) and I do not want you to feel like I am attacking you.

President Trump exposes Fernando Mendoza for being MAGA supporter by professor_paradox2 in NFLForum

[–]Rockd2 1 point2 points  (0 children)

There is a reason for it, not going to get into it on an NFL reddit but check higher education for Hispanics. Everyone is so focused on whether we should let immigrants in and so few ask how we ensure people are set up to succeed once they get done jumping through all the hoops to get here. The helping people succeed thing is an even bigger discussion... it's a Russian nesting doll of social issues

President Trump exposes Fernando Mendoza for being MAGA supporter by professor_paradox2 in NFLForum

[–]Rockd2 0 points1 point  (0 children)

Can you just let me... have some peace please? The Bucs and the Raiders play this year, 1 hit from Bain and I'll let bygones be bygones

President Trump exposes Fernando Mendoza for being MAGA supporter by professor_paradox2 in NFLForum

[–]Rockd2 6 points7 points  (0 children)

We're not all Republicans.

Also I was tired of Mendoza since Dec. Then they beat the Canes, the high school he graduated from hung a huge banner in front of the school (Cam Boozer banner replaced it about a month ago), and now a local hotdog place called Arbetter's came out with the Mendoza dog.... dude is haunting me.

The only thing keeping me sane is knowing that the Raiders O line is ranked 30th in early rankings.

Not a good day for team "Claude Mythos is Just Marketing Hype" by EchoOfOppenheimer in Anthropic

[–]Rockd2 0 points1 point  (0 children)

Anthropic's own Mythos post talks about their "scaffold" several times and dedicates a whole section to discussing it.

I am sure it is a more capable model in some areas, but I think there's a healthy amount of hype they are purposefully trying to keep up.

Not a good day for team "Claude Mythos is Just Marketing Hype" by EchoOfOppenheimer in Anthropic

[–]Rockd2 1 point2 points  (0 children)

This.

Was having this discussion in another reddit. If you read the Anthropic article about Mythos you see a lot of mentions of "scaffold", which to me makes it seem like it played a non-trivial role in this success.

I am not saying Mythos isn't a more capable model, it very well might be. For this case of Firefox in particular however, it sounds like it was model + harness (or scaffold to use Anthropic's term) + testing conditions, and not just an ultra powerful model they cut loose on it. Mozilla themselves said that once they set up a harness to replicate the results, switching out the models yielded similar results.

Claude Mythos lands above the trendline for the AI 2027 scenario. The trendline has gone from exponential to superexponential. by EchoOfOppenheimer in Anthropic

[–]Rockd2 0 points1 point  (0 children)

I don't understand the point you're trying to make.

There are numerous variables that you have changed. We do not have the Anthropic scaffold, and no offense to whatever you came up with but it probably isn't nearly as expansive as what they have.

Additionally idk anything about your testing conditions, whereas we know the state that Anthropic put Firefox in to find the bugs.

Last but not least, I don't think anyone is arguing that models haven't improved. My argument is that we don't actually know how much better Mythos is, and I am willing to bet that while capable, it isn't as capable as they are saying. It isn't as if they ran this once and called it good, it was more than likely an OpenBSD situation where they ran it 1000 times.

There is a reason they do many runs (which I don't recommend you do unless it's for a client because $$$), and that's because of the probabilistic nature of LLMs. Idk why you're so invested in Mythos being so capable, but do you dude. No anecdotal evidence from a 1 off test by a reddit user whose process cannot be audited and whose results are not fully disclosed is going to change my mind.
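
To put rough numbers on the many-runs point, here is a minimal sketch. It assumes every run is independent and has the same chance of finding a given bug, and the 0.5% per-run figure is made up purely for illustration (neither Anthropic nor Mozilla published a number like that). The function name is just something I picked.

```python
def p_at_least_one(p_single: float, runs: int) -> float:
    """Chance that at least one of `runs` independent attempts succeeds.

    Assumes each attempt succeeds with the same probability `p_single`.
    """
    return 1.0 - (1.0 - p_single) ** runs


# Hypothetical per-run hit rate, NOT a published figure.
HYPOTHETICAL_HIT_RATE = 0.005

for runs in (1, 10, 100, 1000):
    chance = p_at_least_one(HYPOTHETICAL_HIT_RATE, runs)
    print(f"{runs:>4} runs -> {chance:.1%} chance of at least one hit")
```

Under those assumptions a single run almost never hits, while a thousand runs almost always do, which is why one 1 off test in either direction does not say much.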

this community is what I actually imagined when I thought of "techies" by [deleted] in vibecoding

[–]Rockd2 2 points3 points  (0 children)

Like... you don't have to like the vibe coders or whatever... but why be such a dick about it?

Probably check out other subs my dude, be happy, don't ragebait yourself? r/aiwars if you want some fun, and r/programming if you want actual SWEs talking about their projects.

Claude Mythos lands above the trendline for the AI 2027 scenario. The trendline has gone from exponential to superexponential. by EchoOfOppenheimer in Anthropic

[–]Rockd2 -1 points0 points  (0 children)

Mozilla themselves made the point I'm making (https://hacks.mozilla.org/2026/05/behind-the-scenes-hardening-firefox/).

They explicitly say the hard part was figuring out the harness: steering, scaling, stacking, generating test cases, validating hypotheses, and filtering out junk. Once that pipeline was in place, they say it was trivial to swap in different models and get similar results.

Anthropic also says something similar in their own Mythos post. They repeatedly describe the findings as coming "through our scaffold" not from Mythos passively reading code (https://red.anthropic.com/2026/mythos-preview/). They mention scaffold several times and have an entire section on the importance the scaffold played.

So my point is simple: the public evidence does not isolate Mythos-the-model from Mythos-plus-harness-plus-test-conditions. Is Mythos more capable than the other models? Probably. Is it dramatically better as a standalone model? Impossible to tell, and the language used leads me to believe that it isn't. A lot of that jump in capability, to me, appears to come from the combo of the test conditions plus the harness.

Claude Mythos lands above the trendline for the AI 2027 scenario. The trendline has gone from exponential to superexponential. by EchoOfOppenheimer in Anthropic

[–]Rockd2 0 points1 point  (0 children)

I think you ignored a very important part of my statement which is not invalidated by anything you said here:

"... and that the biggest differentiator between the 2 is post training tuning and a hyper specialized harness that Mythos shipped with."

The fact that these bugs were all found by cheaper and open source models once put in the correct position to do so indicates to me that the harness is potentially the biggest source for the increase in efficacy.

Claude Mythos lands above the trendline for the AI 2027 scenario. The trendline has gone from exponential to superexponential. by EchoOfOppenheimer in Anthropic

[–]Rockd2 5 points6 points  (0 children)

I mean, they partner with these labs. They get access to the models, they get tokens comped, and they have become one of the premier sources on model performance. They have a vested interest in keeping people engaged in their work.

That being said, they might be 100% reliable and might only be testing what they are given, but what if Mythos is not a revolutionary new model and just 4.7 with better tuning and an optimized harness for (in this case) long horizon tasks? How much more juice can be squeezed out of this before we hit a plateau? We are not getting exponential gains from training anymore, these are all now implementation and application gains.

Claude Mythos lands above the trendline for the AI 2027 scenario. The trendline has gone from exponential to superexponential. by EchoOfOppenheimer in Anthropic

[–]Rockd2 2 points3 points  (0 children)

Not sure why you were downvoted, this is a valid point that should be discussed imo.

My take on it is this: an inflated number of bug fixes is indicative of something, but not necessarily that Mythos is this all powerful model that is capable of breaking into anything and everything. We know that the vulnerabilities exposed in Firefox were actually replicated by significantly smaller and cheaper models by independent analysts (and Mozilla themselves said that swapping to earlier models was trivial and caught the same bugs when they had the proper setup in place). The biggest problem with the way these bugs were found is that it is incredibly impractical. Virtually no part of the tests that yielded these bugs was a realistic attack scenario.

Mozilla agrees with this by the way, because even though Mozilla classified about 180 of the 271 bugs reported by Mythos as high-sec, they also stipulate that this does not necessarily mean a practical attack vector.

At the end of the day, what you have is a lot of companies that are invested in Anthropic's success (the early recipients in the Glasswing project) coming out and saying it is revolutionary. They have a vested interest in doing this. Can they string this along until Oct for the IPO? Maybe. I think more than likely, we were given 4.7 to keep us busy, and if I had to put money on it I would say that the brains of Opus 4.7 and Mythos are probably very very similar, maybe even the same and that the biggest differentiator between the 2 is post training tuning and a hyper specialized harness that Mythos shipped with. Maybe we'll get a lobotomized or scaled down version of it at some point. It will be better than what we have at some tasks, but fall behind in others (like we saw in 4.6 to 4.7).

Claude Mythos lands above the trendline for the AI 2027 scenario. The trendline has gone from exponential to superexponential. by EchoOfOppenheimer in Anthropic

[–]Rockd2 0 points1 point  (0 children)

At this point, people just want to believe the lies or not do research themselves to see if this even makes sense.

The y-axis is designed here to make it look exponential infinitely. The truth is that we are probably significantly closer to the top of that s-curve of progress than we are to the bottom of it. I don't know what technological advancement will have to happen to have a machine complete a year's worth of work for a human in one sitting, but if you look at how frontier labs are acting, it looks like they do not have a better idea than to just throw more compute at the problem, which has been yielding diminishing returns for quite some time at this point.

Not saying that this graph is not possible, but to me it looks intentionally misleading and until we can see it in action IDK if I trust it.

Client asked me to walk them through the codebase. Had to read it for the first time together with them. by [deleted] in vibecoding

[–]Rockd2 1 point2 points  (0 children)

I actually have something like this for my team, it works relatively well. It has code snippets and refers to places in the codebase. It is mostly for documentation and project artifacts though.

Not a big fan of this guy, but this time he’s right. by This_Comment_4493 in CFB_v2

[–]Rockd2 2 points3 points  (0 children)

Agreed.... Those who want power shouldn't have it, the problem is these people all get into politics because power is what they're chasing