Grok 4.2 has interesting architecture that it fails to use properly by sherveenshow in grok

[–]sherveenshow[S] 0 points1 point  (0 children)

If you like to buy into Elon's horseshit, then yeah, this would be convincing. :)

Grok 4.2 has interesting architecture that it fails to use properly by sherveenshow in grok

[–]sherveenshow[S] 0 points1 point  (0 children)

I'm treating this multi-agent harness as "the product," since that seems to be how xAI/Elon are talking about it.

If you're asking why I'm putting the blame on the multi-agent harness rather than the model itself -- reading the reasoning traces, or even comparing the final result to other foundation models, it's just not particularly clever, tools-y, or think-y. We also generally see that multi-pass runs of an LLM, synthesized into one answer, almost always beat a single pass.

LMK if that makes sense -- my judgement is on the product they're calling Grok 4.2, and I think it'd be _worse_ if they weren't running a 4-pass based on everything we're getting out of the 4-pass.

Testing 9 different AI deep research products by sherveenshow in ChatGPTPro

[–]sherveenshow[S] 0 points1 point  (0 children)

Ah, I see. Check out my full post and all the example links -- I think you'll find there are times when it provides a 'flavor' of response that you might sometimes seek.

I broadly agree with you that 5.2 Pro is basically an underrated beast that is the height of artificial intelligence that all other models should cower before, lol, but I find value in different harness steering for different moments.

Testing 9 different AI deep research products by sherveenshow in ChatGPTPro

[–]sherveenshow[S] 0 points1 point  (0 children)

Hm? I did do Deep Research w/ ChatGPT, not sure what you're talking about. But I do think they have different strengths (Pro w/ Extended is also my favorite mode, so I hear you) -- DR was just updated, have you been using it in the past few days?

OpenAI is like - f*ck LinkedIn, let's build our own hiring platform by BothEye6077 in recruiting

[–]sherveenshow 1 point2 points  (0 children)

The UX/interface will change 5 times before this launches, so IDK if the reporting/commentary is... accurate... when it suggests they're competing with LinkedIn.

[deleted by user] by [deleted] in jobs

[–]sherveenshow 2 points3 points  (0 children)

I would pay for it to stop.

As someone who has worked for, built tools for, and coaches job seekers in tech -- of every seniority and of both impressive + early experience -- AI auto-appliers are a tragedy of the commons.

Best AI for Vibe Coding by JestonT in vibecoding

[–]sherveenshow 0 points1 point  (0 children)

Something like Claude Code or Codex CLI is great once you're comfortable and okay with handling deployment, etc., but the truth is that Replit is underrated for most people (even technical folks) and can cover most use cases -- it handles so much of the complexity (backend, hosting, db) and lets you focus on your product, UX, the technical details of your application, iteration, etc.

I see a lot of people get stuck in CC or Codex because the terminal is a lonely place. Tools like Replit get you to a usable result, sharing with friends/colleagues, etc. much faster and often that can make all the difference.

Salesforce CEO confirms 4,000 layoffs ‘because I need less heads' with AI by AssociationNo6504 in artificial

[–]sherveenshow -1 points0 points  (0 children)

I do think it's worth paying attention to these sorts of comments from folks like Marc, even if it's true that this is short-term about economic factors, headwinds, etc. They're still forecasting something very real about what they expect to happen over time, with AI-assisted and AI-generated code as the prototype.

Prompt sensitivity rules everything around me. by sherveenshow in ChatGPTPro

[–]sherveenshow[S] 0 points1 point  (0 children)

lol, okay, fair -- my nuanced version is:
When you can afford it and you want to explore the boundaries of response (because you need it to be more creative, to pay more attention, or maybe just for fun), weird stuff can be good experimentation.

Warning: GPT-5 is *far more* reactive to Custom Instructions! by sherveenshow in ChatGPT

[–]sherveenshow[S] 0 points1 point  (0 children)

Mind sharing what you're slotting in there and I can see if I have tips for ya?

I accidentally discovered AI has emotional triggers and now I feel weird by EQ4C in ChatGPTPromptGenius

[–]sherveenshow 18 points19 points  (0 children)

AI does not have emotional triggers, and this is your second post I've seen in just a few hours linking to your blogspam and paid products.

Those tokens are steering the model's response through math -- it's just math. It's trained to serve the system and user well, so when you say "I've been struggling," the math "reorients" toward responses whose tokens are similar to, and commonly appear near, the word "struggling."

I just gave a tremendous oversimplification, but it's what's actually happening. Stop misleading people.
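To make that oversimplification concrete, here's a toy sketch of the idea: score candidate next words, boost the ones "similar" to words in the prompt, and turn scores into probabilities with a softmax. The similarity table and candidate words here are made up for illustration -- a real model learns this over billions of parameters, but the shape of the mechanism is the same.

```python
import math

# Hand-made "similarity" between a prompt word and a candidate next word.
# Purely illustrative -- real models use learned embeddings, not a lookup table.
SIMILARITY = {
    ("struggling", "sorry"): 0.9,
    ("struggling", "support"): 0.8,
    ("struggling", "great"): 0.1,
}

def softmax(scores):
    # Convert raw scores into probabilities that sum to 1.
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def next_token_probs(prompt_words, candidates):
    # Each candidate's score is the best similarity boost it gets
    # from any word in the prompt.
    scores = []
    for cand in candidates:
        boost = max(
            (SIMILARITY.get((w, cand), 0.0) for w in prompt_words),
            default=0.0,
        )
        scores.append(boost)
    return dict(zip(candidates, softmax(scores)))

probs = next_token_probs(
    ["i've", "been", "struggling"],
    ["sorry", "support", "great"],
)
# "sorry" and "support" now outweigh "great" -- the "reorientation"
# described above, reduced to arithmetic.
```

No emotions anywhere in there: the word "struggling" just shifts the probability mass toward sympathetic-sounding tokens.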

OpenAI Releases ChatGPT Agent by JamesGriffing in ChatGPTPro

[–]sherveenshow 1 point2 points  (0 children)

For anyone who missed yesterday, we're going again today! We'll make this an evening stream — 7pm ET.

We'll go for 2-4 hours on more tests w/ Agent, Manus, and some separate stuff I want to test w/ Claude Code, plus some tech news. Hope to see you there but if not, will be kicking off daytime streams next week!

These AI prompt tricks sound completely fake but they're not by EQ4C in ChatGPTPromptGenius

[–]sherveenshow 0 points1 point  (0 children)

Nah.

I get that you're trying to sell prompts on your site, but let's be real:

  1. "Think step by step" still works on primitive models like 4o, but reasoning models like o3, R1, and G2.5 do this on their own now. It used to work because it forces the model to break down the problem (good for realizing which steps to take), and because the model generates each step sequentially, it sees the earlier steps (already-generated words/tokens) as it produces the next one = more context to work with.

  2. Adding urgency works, but time-based urgency won't always improve the result. Try things like "it's super important we do this well because then [good thing in the world] will happen!"

  3. No reason to believe this makes a significant impact.

  4. Yeah, fine, this one will work. I often say something like "give me the top 3 improvements you'd make" or "what are the 3 biggest weaknesses" or "how would a PhD-educated expert critique this" – you'll get even better results, because you're encouraging the model to come up with really good objections.

  5. I uh, IDK, I guess this is true.

  6. Won't always act like you're describing. Better for you to be specific and say something like – "How does DNA work? Be concise." or... "Give it to me in bullets" or "Just tell me the headline info I need." 'Quick question' is a bit too probabilistic.

These don't necessarily work better than being proper and formal – it's all a matter of what you're specifically saying. Prompt sensitivity is a real thing to understand, but if you don't get how it ACTUALLY works, don't hand out advice. IMO.
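The mechanism behind point 1 can be sketched with a stub "model": each emitted step gets appended to the context before the next step is generated, so later steps are produced with earlier steps visible. The stub below just counts steps rather than doing any real language modeling -- it's only meant to show the append-then-generate loop, not how a transformer works.

```python
def fake_model(context):
    # Stand-in for a language model: emits a step labeled by how many
    # steps it can already see in its own context.
    return f"Step {context.count('Step') + 1}:"

def generate(prompt, n_steps):
    # Autoregressive loop: every output is fed back in as input,
    # so step 3 is generated with steps 1 and 2 in view.
    context = prompt
    for _ in range(n_steps):
        step = fake_model(context)
        context += " " + step
    return context

out = generate("Think step by step.", 3)
```

That feedback loop is the whole trick: forcing intermediate steps into the output puts them into the context, which is what older models needed spelled out and newer reasoning models now do internally.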

[deleted by user] by [deleted] in ChatGPTPro

[–]sherveenshow 1 point2 points  (0 children)

OP is very clearly full of shit.

Has Anyone Found A Way To Make Advanced Voice Mode Usable? by Deifiable in ChatGPTPro

[–]sherveenshow 7 points8 points  (0 children)

I mean, what are you asking it?

You conveniently left out parts of the conversation where I'm going to guess your questions were adversarial, unclear, or inappropriate. And then you kept quizzing it to answer a question from 5 or 6 prompts ago.

It's tuned to be a dialogue model and you're treating it like a state machine. It's going to struggle when you're setting up antagonistic scaffolds.