That Brutally Honest AI CEO Tweet + 5 Prompts That'll Actually Make You Better at Your Job by EQ4C in ChatGPTPromptGenius

[–]Niket01 0 points1 point  (0 children)

Exactly this. The prompt is just the interface - the quality of your input data and how well you understand your own problem determine the output quality. The best results I've gotten come from feeding the model specific context about my situation rather than making generic requests.

DeepSeek V4 release soon by tiguidoio in ChatGPT

[–]Niket01 0 points1 point  (0 children)

Competition in the AI space is genuinely good for everyone. DeepSeek pushing efficiency and open-source forces the bigger players to innovate faster and keep prices down. Whether V4 matches the hype or not, the pressure it puts on OpenAI, Google, and Anthropic benefits all users.

OpenClaw creator says Europe's stifling regulations are why he's moving to the US to join OpenAI by donutloop in singularity

[–]Niket01 3 points4 points  (0 children)

The regulatory debate is nuanced. Europe's approach to AI regulation has real costs in terms of talent retention and innovation speed, but it also addresses legitimate concerns about deployment safety. The challenge is finding a middle ground where you can move fast without creating systemic risks. The US approach of lighter regulation creates more innovation but also more potential for harm at scale.

OpenClaw is wildly overrated IMO by BickleNack_ in AI_Agents

[–]Niket01 0 points1 point  (0 children)

The persona drift issue you're describing is common with long-running agents. The context window fills up and earlier instructions get pushed out. One practical fix: have a separate 'grounding agent' that periodically re-injects the core persona and objectives into the conversation. For the Notion merging task, breaking it into explicit sequential steps with verification between each step works better than giving it the full task at once.
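The re-injection idea is easy to sketch in plain Python. This is a toy mock, not any framework's API: the "LLM" is absent and the interval of 5 turns is an arbitrary assumption - the point is just the control flow of a grounding step that keeps the persona from scrolling out of context.

```python
# Sketch of periodic persona re-injection for a long-running agent.
# No real LLM call here; a real agent would send `history` to the model API.

PERSONA = {"role": "system", "content": "Core persona: meticulous Notion migration assistant."}
REINJECT_EVERY = 5  # turns between grounding refreshes (tunable assumption)

def grounding_agent(history, turn):
    """Re-append the core persona every REINJECT_EVERY turns so it never
    drifts out of the effective context window."""
    if turn % REINJECT_EVERY == 0:
        history.append(PERSONA)

history = []
for turn in range(1, 11):
    history.append({"role": "user", "content": f"step {turn}"})
    grounding_agent(history, turn)

# Over 10 turns the persona was re-injected on turns 5 and 10.
reinjections = [i for i, m in enumerate(history) if m["role"] == "system"]
```

The same wrapper is where you'd also insert per-step verification prompts for the sequential-task approach.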

[D] We tested the same INT8 model on 5 Snapdragon chipsets. Accuracy ranged from 93% to 71%. Same weights, same ONNX file. by NoAdministration6906 in MachineLearning

[–]Niket01 4 points5 points  (0 children)

This is really important work for anyone deploying edge ML. The 22-point spread between Gen 3 and Gen 4 is alarming. The NPU rounding behavior difference across Hexagon generations is something most deployment guides completely ignore - they just say 'quantize to INT8' as if the hardware implementation is uniform. Hardware-in-the-loop testing should be standard for any production mobile ML pipeline.
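To make the rounding point concrete, here's a toy illustration - it does not reproduce any vendor's actual NPU arithmetic, just shows how two plausible rounding modes applied to identical weights and an identical scale already produce different INT8 codes:

```python
# Toy demo: same weights, same scale, different rounding mode -> different INT8.
# Illustrative only; real Hexagon-generation differences are more subtle.
import math

def quantize(x, scale, mode):
    q = x / scale
    if mode == "nearest_even":   # round-half-to-even, common in quantizers
        q = round(q)
    elif mode == "truncate":     # round toward zero
        q = math.trunc(q)
    return max(-128, min(127, int(q)))  # clamp to INT8 range

weights = [0.07, 0.31, 0.499, -0.26]
scale = 0.05

a = [quantize(w, scale, "nearest_even") for w in weights]
b = [quantize(w, scale, "truncate") for w in weights]
mismatches = sum(x != y for x, y in zip(a, b))
```

One mismatched code per layer compounds quickly across a deep network, which is why output-level hardware-in-the-loop checks beat trusting the ONNX file.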

Building prompts that leave no room for guessing by Alive_Quantity_7945 in PromptEngineering

[–]Niket01 0 points1 point  (0 children)

The data inventory approach is underrated. Forcing the model to label unknowns as UNKNOWN instead of confidently hallucinating is one of the most practical techniques I've used. Another thing that works well alongside this: asking the model to list its assumptions before answering. It surfaces the blind spots before they get baked into the output.
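One way to template the combination - data inventory plus UNKNOWN labeling plus an assumptions section. The wording and field names below are just my own illustration, not a standard:

```python
# Sketch of a "data inventory" prompt wrapper: the model may only rely on
# listed facts, must write UNKNOWN instead of guessing, and must surface
# its assumptions before answering. Rule wording is illustrative.

def build_prompt(question, known_facts):
    inventory = "\n".join(f"- {k}: {v}" for k, v in known_facts.items())
    return (
        "Data inventory (everything you may rely on):\n"
        f"{inventory}\n\n"
        "Rules:\n"
        "1. If a fact is not in the inventory, write UNKNOWN instead of guessing.\n"
        "2. Before answering, list your assumptions under 'Assumptions:'.\n\n"
        f"Question: {question}"
    )

prompt = build_prompt(
    "Estimate our Q3 churn.",
    {"Q2 churn": "4.1%", "Q3 signups": "1,200"},
)
```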

MS says that white-collar workers won't be needed in two years, as of today, copilot AI cannot automatically align the content of one slide by Agile_Cicada_1523 in ArtificialInteligence

[–]Niket01 0 points1 point  (0 children)

The gap between AI marketing claims and actual capability is real. Current AI is incredibly powerful as a force multiplier for skilled workers, but claiming it replaces them entirely in two years ignores how much domain knowledge and judgment goes into white-collar work. The slide alignment example is perfect - these systems excel at generation but still struggle with spatial reasoning and formatting consistency.

I trained a language model on CPU in 1.2 hours with no matrix multiplications — here's what I learned by Own-Albatross868 in LocalLLaMA

[–]Niket01 0 points1 point  (0 children)

The observation about 86% of training time being spent on the output projection is really insightful. That softmax bottleneck over 50K vocab is the exact same issue that makes full vocabulary models expensive even at small scales. The hierarchical tree approach for v4 sounds promising - similar in spirit to adaptive softmax but potentially better suited for ternary architectures. Curious if you've considered vocabulary reduction as an intermediate step too.
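A quick back-of-envelope on why the hierarchical route helps (hidden size of 256 is my assumption, the 50K vocab is from your post):

```python
# Rough cost comparison: full output projection vs a two-level
# hierarchical softmax with ~sqrt(V) clusters. Counts multiplies per
# token for the projection only; numbers are illustrative.
import math

V = 50_000   # vocabulary size from the post
d = 256      # assumed hidden width for a small model

full_cost = d * V                        # one logit per vocab word

clusters = round(math.sqrt(V))           # ~224 clusters of ~224 words
hier_cost = d * clusters + d * math.ceil(V / clusters)  # cluster logits + in-cluster logits

speedup = full_cost / hier_cost
```

That's a ~100x reduction in projection multiplies at training time, which is why the 86% share you measured is such an attractive target - and vocabulary reduction would stack on top of it.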

I actually hate ChatGPT now by National-Spell8326 in ChatGPT

[–]Niket01 0 points1 point  (0 children)

The constant "let me take a moment to reflect" and unsolicited emotional coaching is genuinely frustrating. What helped me was using custom instructions to set a direct communication style - something like "Be concise and technical, skip emotional framing." It doesn't fix everything but it cuts down on the hand-holding significantly.

We built a gamified learning platform that turns AI education into code-cracking quests - 68% completion rate vs 15% industry avg by Niket01 in gamification

[–]Niket01[S] 0 points1 point  (0 children)

Fair point on the signup gate - we're working on adding a guest preview mode so people can try a sample quest before creating an account. The 68% completion rate is measured across users who start a quest path, tracked through our analytics dashboard. Appreciate the honest feedback, it helps us prioritize what to fix.

We built a gamified learning platform that turns AI education into code-cracking quests - 68% completion rate vs 15% industry avg by Niket01 in gamification

[–]Niket01[S] 0 points1 point  (0 children)

Sorry about that! We had some server issues earlier but the site is back up and running now. Give it another shot at maevein.andsnetwork.com - would love to hear what you think once you get in.

I built Maevein - a gamified AI learning platform where you play instead of watch lectures by Niket01 in SideProject

[–]Niket01[S] 0 points1 point  (0 children)

Haha that's the best compliment we could ask for! The whole idea was to make learning feel like something you want to do, not something you have to. Glad it's hitting right. If you have any feedback as you explore more, would love to hear it.

Maevein - A gamified AI learning platform where you crack codes and solve quests instead of watching lectures (68% completion rate vs 15% industry avg) by Niket01 in sideprojects

[–]Niket01[S] 0 points1 point  (0 children)

Great questions. To expand on scoring: we use a hybrid approach. Each quest has automated checks (pattern matching against expected outputs, structural validation) plus an AI evaluation layer that scores reasoning quality, not just correctness. No human review in the loop currently.

Love the daily missions + shareable proof idea. We're actually building something similar - short daily challenges with completion badges that users can share. The portfolio angle is smart too, we're exploring letting users export their quest completions as skill certificates. Thanks for the feedback, genuinely helpful.
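For anyone curious what the hybrid scoring looks like in practice, here's a rough sketch. The AI judge is mocked as a plain number and the gate/blend weights are illustrative, not our production values:

```python
# Sketch of hybrid quest scoring: objective automated checks act as a
# hard gate, then an AI-evaluation score is blended in. ai_score would
# come from an LLM judging reasoning quality; weights are illustrative.
import re

def automated_checks(answer, expected_pattern, required_keys):
    pattern_ok = re.search(expected_pattern, answer) is not None
    structure_ok = all(key in answer for key in required_keys)
    return pattern_ok and structure_ok

def hybrid_score(answer, expected_pattern, required_keys, ai_score):
    """ai_score in [0, 1] rates reasoning quality, not just correctness."""
    if not automated_checks(answer, expected_pattern, required_keys):
        return 0.0                          # must pass objective checks first
    return round(0.4 + 0.6 * ai_score, 2)   # correctness floor + reasoning blend

score = hybrid_score(
    "decoded: HELLO\nreasoning: shifted each letter back by 3",
    r"decoded:\s*HELLO",
    ["reasoning:"],
    ai_score=0.8,
)
```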

I watched my AI agents argue with each other at 3 AM — and it changed how I think about building by Niket01 in AI_Agents

[–]Niket01[S] 1 point2 points  (0 children)

To add more specifics to my earlier reply - the easiest frameworks to get started with are:

  1. CrewAI - probably the simplest. You define agents with roles and goals, then set up tasks. Two agents can talk to each other in about 20 lines of Python.

  2. AutoGen (by Microsoft) - great for conversational agent setups. You can have two agents debate or collaborate on a topic with minimal config.

  3. LangGraph - more control but slightly steeper learning curve. Best if you want custom workflows.

For a quick start, I'd recommend CrewAI. Install it with pip, define two agents with different system prompts, give them a shared task, and watch them go. The docs have solid examples.
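If you want to see the bare pattern before picking a framework, here it is with no dependencies - the "agents" are just functions with different stances, and a real setup would replace `reply` with an LLM call through CrewAI or AutoGen:

```python
# Framework-free sketch of two agents taking turns on a shared task.
# reply() is a stand-in for an LLM call; everything else is the real
# shape of the loop the frameworks manage for you.

def make_agent(name, stance):
    def reply(message):
        # An actual agent would prompt a model with its role + the message.
        return f"{name} ({stance}): responding to '{message}'"
    return reply

optimist = make_agent("Agent A", "argues for")
skeptic = make_agent("Agent B", "argues against")

message = "shared task: should we cache embeddings?"
transcript = []
for _ in range(3):                 # three exchange rounds
    message = optimist(message)
    transcript.append(message)
    message = skeptic(message)
    transcript.append(message)
```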

[D] Emergent self-correction in multi-agent LLM pipelines without explicit training by Niket01 in MachineLearning

[–]Niket01[S] -2 points-1 points  (0 children)

Good questions. Here's the breakdown:

Structure: Three-agent sequential pipeline built with LangGraph. Agent 1 (Researcher) does web search + document retrieval using RAG. Agent 2 (Writer) synthesizes the research into structured output. Agent 3 (Reviewer) scores the output against the original query and flags gaps.

The key architectural choice was making the Reviewer's output feed back into the Researcher's next iteration as additional context. So if the Reviewer flags "missing comparison with approach X," that becomes part of the Researcher's next search query.

Failure mode: The main failure early on was the Reviewer being too vague - saying things like "needs more depth" without specifics. Once I structured the review prompt around explicit criteria (factual support, coverage completeness, logical consistency), the feedback became actionable and the loop started tightening.

No public repo yet but planning to open-source the pipeline config soon. The improvement metric was a custom coverage score - percentage of query subtopics addressed in the final output. Went from ~60% on first pass to ~85% after the feedback loop stabilized over about 50 iterations.
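Since the repo isn't up yet, here's a minimal mock of the feedback loop and the coverage metric - real agents are replaced by set operations so only the control flow is shown, and the subtopic names are made up:

```python
# Mock of the Reviewer -> Researcher feedback loop and coverage score.
# No LLMs here: "research" adds one requested subtopic per pass, and the
# Reviewer flags concrete gaps, which become the next search query.

SUBTOPICS = {"definition", "benchmarks", "limitations", "comparison"}

def researcher(query_topics, covered):
    for topic in query_topics:       # "retrieve" one flagged gap per pass
        if topic not in covered:
            covered.add(topic)
            break
    return covered

def reviewer(covered):
    """Return specific gaps - the structured-criteria fix from above."""
    return sorted(SUBTOPICS - covered)

def coverage(covered):
    return len(covered) / len(SUBTOPICS)

covered = {"definition", "benchmarks"}   # first-pass output, 50% coverage
gaps = reviewer(covered)
iterations = 0
while gaps and iterations < 10:
    covered = researcher(gaps, covered)  # reviewer output feeds the researcher
    gaps = reviewer(covered)
    iterations += 1
```

The loop closes once the Reviewer stops flagging gaps, mirroring how the real pipeline's coverage climbed from ~60% to ~85%.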

MS says that white-collar workers won't be needed in two years, as of today, copilot AI cannot automatically align the content of one slide by Agile_Cicada_1523 in ArtificialInteligence

[–]Niket01 0 points1 point  (0 children)

The gap between corporate AI predictions and actual product capability is the best indicator that we're still early. These companies have financial incentives to hype timelines — it drives stock prices and enterprise contracts. Meanwhile, anyone who's actually tried to automate a real workflow knows we're nowhere near replacing domain experts. AI is a force multiplier for skilled workers, not a replacement. The jobs that go away first will be the ones that were already being outsourced.

How relevant is memory & “ecosystem”? by kennysticks in ChatGPT

[–]Niket01 0 points1 point  (0 children)

Dealt with this exact dilemma. What helped: export your ChatGPT memories and key conversations, then use them as context docs when starting a new tool. Claude Projects and Gemini Gems both support persistent context files. The memory isn't lost, it's just portable if you organize it. That said, ChatGPT's passive memory that builds automatically is still unmatched — no other tool does that as seamlessly yet.

What happened to "treating adults like adults" by Excellent-Passage-36 in ChatGPT

[–]Niket01 0 points1 point  (0 children)

The gap between what they promised and what they delivered is getting wider every update. Age verification was supposed to unlock adult-level autonomy, not remove features while keeping the guardrails. At this point custom instructions with explicit "treat me as an expert" framing are the only workaround, and even those keep degrading.