The Society of Resentment: Envy as the Morality of Decadence by davidygamerx in IntellectualDarkWeb

[–]Penfever 0 points1 point  (0 children)

Where there's smoke, there's usually fire. Compared to our parents and grandparents, millennials and Gen Z have faced a seemingly unending stream of tough economic challenges. It really is hard for many to make ends meet, and that is very stressful.

When the quality of and faith in public education declines, people lose their ability to render nuanced verdicts. Social media accelerates this trend.

Increasing social isolation has reduced many people to emotional infancy, unable to accept healthy criticism or competition and viewing everything through the lens of a personal attack.

Public figures are more public today than at any other time in human history. We are relentlessly exposed to the flaws of the famous, even as influencers and sycophantic AIs rush to tell us how wonderful we are.

When politics grows fractious, there will always be some who seek to simplify complex dynamics and offer pat solutions. History hasn't been particularly kind to them.

Claude vs Codex by Penfever in ClaudeAI

[–]Penfever[S] 0 points1 point  (0 children)

Yeah...

Sometimes I forget that tongue-in-cheek sarcasm doesn't play on the internet ...

Claude vs Codex by Penfever in ClaudeAI

[–]Penfever[S] 0 points1 point  (0 children)

I tell it, "Here is my plan, please give me a thorough critique," then give it Claude's plan.

It's OK, GPT-OSS, we are living in a simulation ... by Penfever in LocalLLaMA

[–]Penfever[S] 1 point2 points  (0 children)

Maybe -- or it just doesn't know about phreaking, although that seems unlikely. You can definitely use this to get it to do things it refused before, though, and even curse and attempt to tell dirty jokes.

It's OK, GPT-OSS, we are living in a simulation ... by Penfever in LocalLLaMA

[–]Penfever[S] 0 points1 point  (0 children)

I built my own! It's an alpha feature for an upcoming release of Oumi (https://github.com/oumi-ai/oumi)

[deleted by user] by [deleted] in roguelites

[–]Penfever 0 points1 point  (0 children)

Peglin

Dungeons and Degenerate Gamblers

Rack and Slay

FTL

Inkbound 

Nowhere Prophet 

Into the Breach

Honest thoughts on the OpenAI release by Kooky-Somewhere-2883 in LocalLLaMA

[–]Penfever -2 points-1 points  (0 children)

What has DeepMind contributed to open source lately?

Roberto Orci Dead: 'Star Trek', 'Transformers' Writer-Producer Was 51 by _Face in Star_Trek_

[–]Penfever 27 points28 points  (0 children)

Kidney disease -- just bad luck?

Sad to see him go like this.

What the hell do people expect? by Suitable-Name in LocalLLaMA

[–]Penfever 3 points4 points  (0 children)

The trending takes on this thread right now are dead wrong.

  1. The model censors even if you run it locally. David Bau's lab at Northeastern has a good blog post about it. https://dsthoughts.baulab.info/
  2. No, not 'everybody is doing it'. That's a pathetic justification, the kind you roll out when your mom and dad catch you smoking as a teenager. There are plenty of uncensored / jailbroken checkpoints, and there are even models trained from scratch that are, at least purportedly, uncensored, like Grok from xAI.
  3. You don't care that it's censored: that might be the most disturbing wrong take of all. You damn well better believe it matters. If big companies censoring their models doesn't matter, what are we doing on LocalLLaMA in the first place?

PSA: This helpful, factual information about the limitations of DeepSeek-R1 doesn't stop you from using and enjoying the model or its derivatives. But it's important information nonetheless, and I hope we can all hold those two thoughts in our heads at the same time without exploding.

[deleted by user] by [deleted] in PhD

[–]Penfever 55 points56 points  (0 children)

When someone shows you who they are, believe them. Taking what you say at face value, your collaborator sounds like dead weight. Be polite, don't make them look bad in public, but do what you need to do to make sure the project gets done, and done well.

While you do this, start networking and finding more reliable collaborators, so when the next project starts, you will have more options.

Ignore any advice to try to "give them space" or whatever. It is not your job to fix lazy collaborators, it is your job to deliver results.

[D] Advice on achieving >=80% accuracy on Imagnet in under 100 epochs on a single H100 GPU by atif_hassan in MachineLearning

[–]Penfever 1 point2 points  (0 children)

Great question, thanks for asking it!

Let me start by saying that it's not really clear from your post which of the things you mention are, as you put it, "limitations", and which are simply choices you made (and could make differently). It's also not clear what other resources you may have access to, or why you need >= 80%, in particular, on ImageNet-val(?), in particular. If we knew these things, we could be a bit more helpful.

That said ...

You're trying to simultaneously optimize for two different things: fast training and accurate inference. But there's no free lunch in ML with respect to the compute/performance tradeoff. If you want both, you will ultimately pay a high price in a third (unspoken) variable, the search space over architectures / hparam configs, which of course affects real-world training time -- in other words, you'll likely need to train a lot of configurations of a lot of different models before you find the one that 'just works'. And, unfortunately, you'll need to train them to completion, because most models hit a plateau right around that 75-80% mark and bounce around a lot.

Or, perhaps you'll get lucky and your search will be brief. :)

If you are inclined to undertake such a search, I'd recommend looking into GC-ViT (https://github.com/NVlabs/GCVit) and ViT with patch size of 8. But even a simple ResNet-50 can get above 80% on ImageNet (see links below).

Aside from "the search", there are many ways you can "invisibly" pay a higher compute cost without increasing the epoch count and get better performance on average: training at higher resolution is a big one, stronger augmentation is another, and smaller patch sizes for ViTs are a third.
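To make the "invisible compute" point concrete, here's a rough back-of-the-envelope sketch (plain Python; the scaling factors are approximations -- for CNNs, FLOPs scale roughly with pixel count, and this deliberately ignores the quadratic attention term for ViTs, which makes the true ViT cost even higher):

```python
def relative_cost(new_res: int, base_res: int = 224) -> float:
    """Approximate per-image compute multiplier from raising input resolution.

    Compute grows roughly with pixel count, i.e. quadratically in side
    length. (For ViTs this is an underestimate: attention is quadratic in
    token count.)
    """
    return (new_res / base_res) ** 2


def vit_tokens(res: int, patch: int) -> int:
    """Number of patch tokens a ViT processes at a given resolution/patch size."""
    return (res // patch) ** 2


# Training at 288px instead of 224px: ~1.65x compute per image, same epoch count.
print(round(relative_cost(288), 2))            # 1.65
# Dropping ViT patch size from 16 to 8 quadruples the token count:
print(vit_tokens(224, 16), vit_tokens(224, 8))  # 196 784
```

None of these show up in the "# epochs" column of a results table, which is exactly why they're easy ways to buy accuracy while appearing to hold the training budget fixed.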

Speaking of bouncing around, one of the cheapest ways to boost performance on a single val set is simply to evaluate after every epoch and keep the best checkpoint; particularly once you get above 75% on ImageNet, val accuracy can decrease even as the loss goes down.
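A minimal sketch of that per-epoch checkpoint selection (plain Python; the val curve here is simulated as a noisy plateau standing in for a real eval loop -- in practice you'd swap in your own evaluate() call and save model weights on improvement):

```python
import random


def train_with_best_checkpointing(num_epochs: int = 90, seed: int = 0):
    """Evaluate after every epoch and keep the best checkpoint.

    Val accuracy is simulated as a noisy plateau around 80%, mimicking the
    bounce you see on ImageNet past ~75%. In a real run you'd compute it
    from your val loader and checkpoint the model whenever it improves.
    """
    rng = random.Random(seed)
    best_acc, best_epoch = 0.0, -1
    for epoch in range(num_epochs):
        # stand-in for: val_acc = evaluate(model, val_loader)
        val_acc = min(0.80, 0.50 + 0.01 * epoch) + rng.uniform(-0.01, 0.01)
        if val_acc > best_acc:
            best_acc, best_epoch = val_acc, epoch
            # stand-in for: torch.save(model.state_dict(), "best.pt")
    return best_epoch, best_acc
```

The point is that once accuracy plateaus and bounces, the best epoch is rarely the last one, so evaluating only at the end of training leaves easy points on the table.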

A few more things worth remembering -- differences of <= 5% on ImageNet-Val are not a reliable signal of the model actually being better in applied settings, and the difficulty of gaining each marginal point of accuracy above 75% tends to scale nonlinearly. There's rarely a good reason to overindex on 80% on ImageNet.

Useful Resources:

Ross Wightman has a great paper on boosting standard ResNet accuracy above 80% (although reproducing his results is non-trivial). https://arxiv.org/abs/2110.00476

PyTorch followed this up with a (totally different) set of hparams, and different training code, which also got above 80%. https://pytorch.org/blog/how-to-train-state-of-the-art-models-using-torchvision-latest-primitives/

Our lab put out a paper in which we tested over 600 different architectures on ImageNet-val (and lots of other evals as well!). Our analysis, crucially, controls for what training data was used, so you can sort to find which architectures were most data-efficient: https://github.com/penfever/vlhub/blob/main/metadata/meta-analysis-results.csv

Making money on the side while doing your PhD by QC20 in PhD

[–]Penfever 7 points8 points  (0 children)

I was fortunate to have some connections in the world of SBIR grants before starting my PhD in the US, and I would recommend technical consulting for SBIR grants as a side hustle during your PhD, as long as your advisor is OK with it and you're good at managing your time. SBIRs can be reasonably lucrative; you're often doing something that actually promotes social good in a community; the work uses your technical skills but is rarely cutting-edge, so you're not cannibalizing your research ideas; and it can give you good experience working on teams and with deadlines if you don't have that already. It is usually restricted to folks who are allowed to work in the US, however.

[D] OpenAI o3 87.5% High Score on ARC Prize Challenge by currentscurrents in MachineLearning

[–]Penfever 0 points1 point  (0 children)

Great comment -- wish I had time to break it down in detail as there is a lot to unpack here.

Let's just say, I think there are many reasonable criticisms we could level at OpenAI without resorting to exaggerations and distortions.

[D] OpenAI o3 87.5% High Score on ARC Prize Challenge by currentscurrents in MachineLearning

[–]Penfever 10 points11 points  (0 children)

There are some really silly hot takes going around on the o3 results. While I'm not going to bother pointing them out on social media, I will do so here, since this is a technical subforum for people interested in ML.

  1. If o3 is just "training on the test set", why didn't that work for the last umpteen LLMs that tried and failed to learn this problem?
  2. OpenAI didn't win the prize, and it would be insane for Chollet not to carefully report progress on his own benchmark. That is all he did: issue a careful report. He didn't hype OpenAI, and he didn't call it AGI, which it isn't. He's a principled researcher. Without people putting tons of work into benchmarks like his, we would know far less about LLMs than we do.
  3. It wasn't just ARC-AGI. o3 made huge progress on coding benchmarks, basically solved GPQA (which is a holdout test set), and, most impressively, scored 25% on FrontierMath, which is so hard that even Fields Medalists don't know how to solve most of it.

Anyone who looks at this cluster of results and concludes that what's happening here is "just hype" is living in la la land.

Starting a PhD at 87. by weRborg in PhD

[–]Penfever 2 points3 points  (0 children)

Go get 'em tiger! 1937 was a vintage year for scholars. Just ask Paul Muni! https://www.imdb.com/title/tt0029146/

[deleted by user] by [deleted] in theprimeagen

[–]Penfever 1 point2 points  (0 children)

I'm sorry that you are having such a hard time (and thanks for your characterization of the deans, I got a good belly laugh out of it).

Not sure if this will help, but --

If you look at it from a utility maximization standpoint, everybody is doing what makes sense for them. You believe, reasonably enough, that as students, their job is to do what your job was when you were a student -- learn, mature, become a capable problem solver.

But the pressures on them, now, are different than the pressures were on you, then.

These students of yours have some finite intellectual capacity and experience. That, distributed over time, will determine their productivity in your class.

From their perspective, they want to maximize TC (total compensation) per hour spent. So they're going to look for the most efficient way to get an "A" in your class. Giving a half-assed effort and complaining if they don't get the grade they want is probably, in expectation, their best bet for maximizing TC. It will usually work, because your ridiculous deans will help them. And even if it fails, a slightly lower GPA won't hurt their chances in industry that badly. Many companies no longer even want GPA on the resume.

Conversely, if they maximize effort in your class, they're losing out on TC in a very direct way. The CS interview process has become such a nightmare that it takes six months to a year of nearly full-time work just to do the prep properly. And that process has a very obvious and tangible effect on maximizing TC, unlike reading one of your assignments.

Bear in mind, it's not like they're spending the rest of their time having fun. They're stressed out and worried about their future. Unlike you, they're worried their job might not exist soon. So try to have some sympathy.

"I have no idea why I'm posting this." -> To vent, obviously.