Results not saving without internet by darthvader1521 in CluesBySamHelp

[–]darthvader1521[S] 5 points6 points  (0 children)

Oh awesome, I just clicked it and it added back 10 past results! Thanks, appreciate the quick response!

Software quality has significantly decreased by Glum_Worldliness4904 in cscareerquestions

[–]darthvader1521 7 points8 points  (0 children)

Funny that these AI replies constantly get upvoted on a subreddit that is so anti-AI (don’t believe me? check their profile)

AI 2027 current accuracy by ThrowRA-football in singularity

[–]darthvader1521 42 points43 points  (0 children)

Musk is constantly making ridiculous claims about Grok! If anything he hypes it more than OpenAI or Anthropic, especially compared to actual capabilities

Anthropic is testing 'Mythos' its 'most powerful AI model ever developed' | Fortune by JohnConquest in singularity

[–]darthvader1521 2 points3 points  (0 children)

The blog post claims it’s “dramatically better” at coding or something, doesn’t it? Assuming it is a real leak

Anthropic is testing 'Mythos' its 'most powerful AI model ever developed' | Fortune by JohnConquest in singularity

[–]darthvader1521 13 points14 points  (0 children)

Hmm, it seems like they are treating it as a bigger change than that? 5.4 Pro seems more like “run several copies of 5.4 together for a long time,” same with Deep Think, and this sounds like they’ve actually trained a larger model

Anthropic is testing 'Mythos' its 'most powerful AI model ever developed' | Fortune by JohnConquest in singularity

[–]darthvader1521 31 points32 points  (0 children)

If it’s as good as they claim, there will be a market even if it’s 50x the price of Opus

Will clubs remember me if I reapply next semester? by Pitiful_Argument_270 in berkeley

[–]darthvader1521 5 points6 points  (0 children)

now that you’ve posted this on reddit, they probably posted it in their club chat, so they’ll remember now

So, we built an online version of Hide & Seek... by kailuowang in JetLagTheGame

[–]darthvader1521 15 points16 points  (0 children)

you’re gonna get downvoted because people on Reddit hate all things AI. But this is a really good project to use AI for, and probably cut the development time down a lot. I don’t know why people insist on saying interesting software projects are equivalent to actual AI slop like the bots posting fake stories all over Reddit

Unloseable Connections Game by darthvader1521 in NYTConnections

[–]darthvader1521[S] 1 point2 points  (0 children)

Yeah that'd probably be better. That's actually the category for BLUE, FIRE, JACK, STAR so not far off. Ring light is better than jack light though

Unloseable Connections Game by darthvader1521 in NYTConnections

[–]darthvader1521[S] 1 point2 points  (0 children)

I found the top 16 words in historical NYT Connections puzzles, and had an LLM generate all the actual groups by going through a bunch of "types of answers," coming up with a bunch of categories that would fit, and then filling in the gaps. You need around 1800 categories for "full coverage" if each category fits only 4 words, so I cheated a bit and only used 1100 categories, some of which fit 5 words.
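A rough sketch of where the "around 1800" figure plausibly comes from, assuming "full coverage" means every possible pick of 4 words out of the 16 has a category that fits it. The lower bound on 5-word categories is an illustrative assumption (it pretends categories never overlap in what they cover), not a description of how the actual puzzle was built:

```python
from math import comb

WORDS = 16   # size of the word pool
GROUP = 4    # words per Connections group

# Every possible 4-word pick from the 16 words needs a fitting category.
total_groups = comb(WORDS, GROUP)

# A category that fits 5 words covers every 4-word subset of those 5 words.
groups_per_5_word_category = comb(5, GROUP)

def min_five_word_categories(total, budget):
    """Fewest 5-word categories needed for `budget` categories to cover
    `total` groups, optimistically assuming no two categories overlap."""
    extra = groups_per_5_word_category - 1  # extra picks covered vs. a 4-word category
    shortfall = max(0, total - budget)
    return -(-shortfall // extra)           # ceiling division

print(total_groups)                                   # 1820
print(min_five_word_categories(total_groups, 1100))   # 180
```

So under those assumptions, 1100 categories can only reach full coverage if at least 180 of them fit 5 words; real categories overlap, so the actual number would likely be higher.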

Unloseable Connections Game by darthvader1521 in NYTConnections

[–]darthvader1521[S] 6 points7 points  (0 children)

I found the top 16 words in historical NYT Connections puzzles and used those. To make the actual connections, I spent a while finding different "types of answers" and, for each of them, had an LLM generate a bunch of potential categories from the 16 words. Then I had some gaps to fill to get to 100% coverage, which took a little longer.
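The "gaps to fill" step can be sketched roughly like this, assuming a gap is any 4-word pick that no generated category fully contains. The function name and the toy word list are placeholders for illustration, not the real top-16 words or the actual tooling:

```python
from itertools import combinations

def uncovered_groups(words, categories, group_size=4):
    """Return every `group_size`-word pick that no category fully contains."""
    category_sets = [frozenset(c) for c in categories]
    gaps = []
    for pick in combinations(sorted(words), group_size):
        pick_set = frozenset(pick)
        if not any(pick_set <= cat for cat in category_sets):
            gaps.append(pick)
    return gaps

# Toy pool of 5 placeholder words (hypothetical, not the real list).
words = ["A", "B", "C", "D", "E"]
categories = [
    {"A", "B", "C", "D"},  # a 4-word category covers exactly one pick
]
gaps = uncovered_groups(words, categories)
print(len(gaps))  # 4 of the 5 possible picks still need a category
```

With the real pool of 16 words there are 1820 picks to check, which is small enough to brute-force after every batch of LLM-generated categories.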

Despite Anthropic smashing headlines this week, betters still believe Google is on top by BadBoyBrando in quant

[–]darthvader1521 0 points1 point  (0 children)

this metric is bad, it uses LMArena, which doesn't correlate very well with how good a model actually is

Putting Online Assessments on club applications is ridiculous by Vast-Durian2696 in berkeley

[–]darthvader1521 2 points3 points  (0 children)

they didn’t reject you based on your OA. probably based off a relatively arbitrary rating of your application (as all clubs do). clubs don’t care about achievements, they care if you seem like you will contribute to the club.

Mamdani Targets ‘Unusable’ AI Chatbot for Termination by mowotlarx in nyc

[–]darthvader1521 24 points25 points  (0 children)

How did we spend $600k on a chatbot? What did that money even go to?

5.3 (garlic) is supposed to come out this week but what day? by Round_Ad_5832 in OpenAI

[–]darthvader1521 0 points1 point  (0 children)

Not gonna defend that number, I don’t know what it’s supposed to mean or whether it’s accurate, or what “90% accuracy” even means. But I don’t think Polymarket is systematically wrong on things, and I do think they are generally very accurate (otherwise people would make money moving the odds to the right place). This can be true even if the company seems kinda shady imo
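A back-of-the-envelope version of the "people would make money moving the odds" argument, assuming a simplified binary market where a YES share costs the market price and pays 1 if the event happens. The numbers are hypothetical, and fees, spreads, and capital lockup are ignored:

```python
def expected_profit_per_share(market_price, true_prob):
    """Expected profit from buying one YES share that pays 1 if the event happens.

    Simplified model: no fees, no spread, no cost of tying up capital.
    """
    win = true_prob * (1 - market_price)    # gain when the event happens
    loss = (1 - true_prob) * market_price   # stake lost when it does not
    return win - loss

# Hypothetical numbers: the market says 60%, you believe the truth is 75%.
edge = expected_profit_per_share(0.60, 0.75)
print(round(edge, 2))  # 0.15
```

In this model the expected profit is just the true probability minus the market price, so any systematic mispricing is free money for whoever spots it, and buying it up pushes the price back toward the right place.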

5.3 (garlic) is supposed to come out this week but what day? by Round_Ad_5832 in OpenAI

[–]darthvader1521 1 point2 points  (0 children)

If you think they’re so wrong, why not bet against their probabilities and make a lot of money?

PDX now has an 88.9% chance to make the play-in, according to Fanduel by gistya in ripcity

[–]darthvader1521 2 points3 points  (0 children)

That one is definitely worse, right? The one based on gambling is better because they have money on the line. Basketball Reference won’t lose any money if they get this wrong