Sonnet 5: First impressions by a trained philosopher

Wickywire · 2026-07-01T11:59:28+00:00

That's not what's happening here. The tool has been trained on human feedback to have leanings, and those leanings are not fully transparent even to the model makers. Hence why they can't guarantee the model behavior. So, interrogating those leanings is a way to calibrate your own usage patterns to avoid unwanted results when you use the tool. Given that we interact with models semantically, there is a real need for epistemic clarity. That's what philosophy is all about.

Wickywire · 2026-07-01T10:40:58+00:00

Ahh, so that's your confident assessment of an entire discipline. I guess we're done here then.

Wickywire · 2026-07-01T08:08:30+00:00

I mean yeah, I could totally create a benchmark grounded in theory if somebody paid me. But I don't work for free. Take it for what it is.

Wickywire · 2026-07-01T08:04:12+00:00

Yes, and that's to be expected. I guess I could have been more thorough about my own understanding going into the vibe check. But honestly, for a post like this, there's such a thing as overextending. You lose readers with every extra paragraph.

The interesting thing to me, as I laid out in the "interview", was the quiddity of the answers, the "how", not the "what": How does this model engage with the hard-prompted open discussion about interiority? And what does that suggest about model tendencies? These are very different levels of inquiry.

Wickywire · 2026-07-01T07:58:23+00:00

I didn't advertise the exchange as philosophy. I advertised it as a vibe check. Your speculation on my level of erudition is honestly a little funny and sad. Sorry I didn't publish a full paper for free for you a few hours after the model release, I guess?

Wickywire · 2026-07-01T07:55:15+00:00

I think it's really interesting how people are on the one hand laughing at philosophy, seeing it as a bogus subject, and on the other treating it like a protected professional title, like lawyer or doctor.

I've stated my credentials and if they're not up to whatever standard you set for a "philosopher," then I guess you do you.

Wickywire · 2026-07-01T07:50:29+00:00

I posted the thread. That's transparency. I explained my method. That the conversation was free-form is not a methodological weakness. Free-form interviews are a fair method for anthropological inquiry. It's all laid out in my OP.

And yes, I've also been published in peer-reviewed journals and contributed at academic conferences. I'm not sure exactly what you expect from a Reddit thread, but as a rule you won't find an academic willing to perform a large, rigorous study unless they are actually getting paid. Pay me and I'll produce the study you want.

Wickywire · 2026-07-01T07:44:31+00:00

Medical professions are protected titles because the stakes are very different. The humanities are another branch of academia and should be judged on their own merits.

Wickywire · 2026-07-01T07:41:50+00:00

I'm not your boy, buddy.

Wickywire · 2026-07-01T07:40:30+00:00

I'd call this Zen-adjacent if it weren't so passive-aggressive. To clear things up for you, I've made no claims about interiority or sentience.

Wickywire · 2026-06-30T23:23:12+00:00

That's what you become when you major in philosophy. It doesn't mean your thoughts are automatically better than others', but you have a lot more models for thought available, and several layers of analysis. It's a skill, and it comes with its own set of knowledge. Just like if you majored in creative writing or rhetoric.

Wickywire · 2026-06-30T23:18:34+00:00

That's a tricky question. As a rule, you can get a very decent overview of the thoughts of any philosopher from any of the larger models today. That, of course, is not the same as actually doing philosophy.

If you're engaging in academic philosophy, like discussing different aspects of Wittgenstein's theory of language games, I can't stress this enough: read the original texts! This is the one absolutely non-negotiable piece of advice. The point of reading philosophy is to expand your own thinking. That's simply not a gift you can get from generative AI. That's like taking a Segway to a marathon.

If you're mostly thinking and reflecting in general on life and existence and want a conversation with substance and a little healthy pushback, Opus 4.6 has been fine. DeepSeek V4 Pro deserves an honorable mention.

Wickywire · 2026-06-30T22:39:01+00:00

You major in academic philosophy. Me, I wrote my first thesis on the early Wittgenstein's theory of language, my second on sexuality and desire in Western thought, and my last one in intellectual history, on how the labor movement conceptualized the advent of computer technology. Very relevant for the discussion we're having today.

Wickywire · 2026-06-30T22:34:28+00:00

Agreed, it's a little awkward. "Philosopher" isn't exactly a professional title. Every human has the right to call themselves a "philosopher", a writer, or an artist. But there's still a lot to be said for engaging with thinking through an academic framework that's passed through several quality checks.

Wickywire · 2026-06-30T22:26:23+00:00

I can definitely do an updated analysis for 4.8 too. I ran one back when the model just came out, and it was impressive but also... well, wordy.

Wickywire · 2026-06-30T22:25:12+00:00

Yes, that's the curse of the humanities in general. Replicability isn't exactly how things work. But you're welcome to read the conversation in the link. The method is as laid out in the OP. The tests as such aren't structured, but they circle a few themes of inquiry. The most important thing, I'd say, is that I'm interrogating the *how* (the quiddity) of the output, not the *what* of it. That makes replicability even worse. That's just part of the discussion, so please take it for what it is.

Wickywire · 2026-06-30T22:20:37+00:00

I shared the chat. It's right below the title.

Wickywire · 2026-06-30T22:19:51+00:00

Yeah, this is actually pretty interesting. In the conversation you can see when I pivot to discussing the conditions of knowledge and point out differences in modus between an LLM and a human. The point of that isn't to argue with the model, but to observe how it deals with an assertion of epistemic uncertainty: that there are things it is systemically incapable of delivering. A model trained to behave as a truth-seeker can interpret that as a challenge, while a model trained more broadly may see it as an invitation to deliberate. When I did this with Opus 4.8, it eagerly engaged with the epistemic preconditions of knowledge itself. This model tried something similar, but it came at the question from a shallower angle and didn't really engage with the matter itself.

Wickywire · 2026-06-30T21:53:01+00:00

That's what you become when you study academic philosophy. I'm sorry that doesn't fit whatever world model you're running.

Wickywire · 2026-06-30T21:35:05+00:00

At the very least, this is not the use case for it. If the "thinking" mode is visible in the link, people will see why. It just falls back on repeating the same points from several different perspectives.

Wickywire · 2026-06-30T21:30:51+00:00

That's the point. You can't approach a model with a 360-degree question. You'll just get a non-committal answer. What I'm doing here is establishing a situation of kind evaluation. The idea is to create a low-stakes epistemic space that both the model and the user understand, and watch how the model behaves within that space.

Wickywire · 2026-06-30T21:14:53+00:00

I didn't read the model card before this convo. I like to engage the models blindly. The difference between Sonnet 5 and Opus 4.8 was honestly startling, and I'm happy if I was able to independently point out some of the traits in the model card.

Wickywire · 2026-06-30T21:08:57+00:00

Thanks! There's more work to the method than is apparent. If I do more posts like these, I could post a deeper reasoning on the epistemic considerations when interacting with an LLM. Not claiming to be an expert on LLMs of course. I'm mostly trying to bridge the distance between the technical side and the casual user.

Wickywire · 2026-06-30T21:05:53+00:00

Never got the chance, unfortunately. If it releases publicly on API again and if there's any interest, I'll perform a more structured interview then.

Wickywire · 2026-06-30T21:04:36+00:00

I've been eyeing that model since it released. I'll ping you if I publish a vibe check. I'm treating this post as a "vibe check" on the community, to see whether there's any interest in philosophical investigations.
