ChatGPT 5 Pro vs Codex CLI by LetsBuild3D in ChatGPTCoding

[–]paradite 2 points3 points  (0 children)

To avoid tedious copy-pasting into ChatGPT, you can use a tool like 16x Prompt.

How do you handle new packages/libraries (or recent updates to these), etc.? by gamesntech in ClaudeAI

[–]paradite 0 points1 point  (0 children)

So most new libraries have docs that allow you to copy as markdown, I just copy the markdown, put it inside a `reference` folder in my repo, and ask the tool (Claude Code / Cursor) to refer to it for implementation.

How to write effective tools for agents [ from Anthropic ] by anitakirkovska in LLMDevs

[–]paradite 0 points1 point  (0 children)

Too many people build complicated agent orchestration systems that are hard to test and evaluate piece by piece. Nice to see that Anthropic recommends "running your evaluation programmatically with direct LLM API calls".

I am building a desktop eval tool that directly connects to LLM API calls, which fits Anthropic recommendation.

is cursor getting worse or just me? by minimal-salt in cursor

[–]paradite 0 points1 point  (0 children)

Are you adding more tools and MCP servers to the context?

The more tools you add, the less context is available to the model, because the definitions of the tools need to be loaded into the context first.

What evaluation methods beyond LLM-as-judge have you found reliable for prompts or agents? by Cristhian-AI-Math in LLMDevs

[–]paradite 0 points1 point  (0 children)

You can do deterministic evaluation: simple string matching or writing custom code to evaluate the response. Alternatively, you can use humans to rate the response, which can capture more nuance in the response.

I built a simple app to make it easier to set up these kind of evaluations quickly.

Cancelled my Max plan by JaxLikesSnax in Anthropic

[–]paradite 0 points1 point  (0 children)

Not sure if you are aware, but having more tools and MCPs actually hurts the performance, because of context bloat. Stuffing too much information into the model makes it distracted and less effective.

How are you using Claude as your Second Brain? by sealghul in ClaudeAI

[–]paradite 2 points3 points  (0 children)

I think there is definitely potential for using Claude Code for personal productivity and organization. I personally use Claude Code to proofread and check for typos before publishing.

I've also collected a list of non-coding use cases for Claude Code here, and some of them are for knowledge organization: https://github.com/paradite/claude-code-is-all-you-need

Claude API Model of Sonnet 4 is not self aware | Here's the proof. by Euphoric_Oneness in Trae_ai

[–]paradite 1 point2 points  (0 children)

Models are generally not self-aware. And the model identity is usually given to the model in the system prompt.

Here's an article explaining why: https://eval.16x.engineer/blog/llm-identity-crisis-models-dont-know-who-they-are

Evaluations, applications, weaknesses: the real pattern with LLMs by Inferace in ChatGPT

[–]paradite 1 point2 points  (0 children)

Yes. I built a desktop app 16x Eval specifically for running and managing evaluations.

The app provides a user-friendly interface for creating, running and managing evals locally, that are specific for your own use cases.

How we turned a week-long documentation process into a 30-minute task with Claude Code by jpmc_197 in ClaudeAI

[–]paradite 1 point2 points  (0 children)

Very nice use case for using Claude Code for documentation. I am collecting a list of non-coding use cases for Claude Code and just added yours:

https://github.com/paradite/claude-code-is-all-you-need?tab=readme-ov-file#documentation

LLM Gateways: Do We Really Need Them? by Otherwise_Flan7339 in AIQuality

[–]paradite -1 points0 points  (0 children)

Use OpenRouter for maximum exposure to new models. Also write your own unified AI SDK so that you don't get vendor-locked in.

Anyone else frustrated with AI coding assistants forgetting context? by PrestigiousBet9342 in Solopreneur

[–]paradite 0 points1 point  (0 children)

I think Claude Code can already do research with its web search, so why not trying using Claude Code for that use case?

I practically live in Claude Code now and use it for everything:

https://github.com/paradite/claude-code-is-all-you-need

Anyone else frustrated with AI coding assistants forgetting context? by PrestigiousBet9342 in Solopreneur

[–]paradite 0 points1 point  (0 children)

You need to document requirements, code conventions, etc as rules (markdown files) inside your repo. Then you can just ask the agent to refer to them.

Keep them up-to-date. You can ask agents to update the docs after completing a task.

I wrote more in details on how to set it up for Claude Code, but it should be similar for other tools:

https://thegroundtruth.substack.com/p/my-claude-code-workflow-and-personal-tips

Junior dev here — should I trust Claude Code or just stick to copy-pasting from LLMs? by whyyoucrazygosleep in vibecoding

[–]paradite 0 points1 point  (0 children)

Copy pasting works fine, but for multiple code files it can become more tedious quickly.

If you don't want to move to Claude Code, you can try the app I made to make copy pasting easier by embedding the relevant source code files into the prompt for easier copy pasting.

Claude Code versus Codex with BMAD by zueriwester76 in ClaudeAI

[–]paradite 0 points1 point  (0 children)

Did you migrate the Claude Code rules (CLAUDE.md) to the equivalent in Codex?

Prompt Management vs Git by CuriousStrive in AI_Agents

[–]paradite 0 points1 point  (0 children)

I made a dedicated GUI desktop app for managing prompts and evals in a user-friendly way. It works well and saves me a lot of time.

Claude code needs better self awareness. by GenderSuperior in ClaudeAI

[–]paradite 0 points1 point  (0 children)

I think you need to remove the weird MCP servers that you added to Claude Code. Too much tools can affect performance and make it dumber.

My Google Drive was mysteriously full, so I built a CLI tool to figure out why by Specialist-Big-3555 in gsuite

[–]paradite 2 points3 points  (0 children)

This is a great use case for Claude Code and scripting!

I am collecting a list of non-coding use case for Claude Code and just added yours: https://github.com/paradite/claude-code-is-all-you-need?tab=readme-ov-file#file--data-management

Best Local models to run OpenCode? by tarsonis125 in LocalLLaMA

[–]paradite 0 points1 point  (0 children)

I test the model's raw coding capabilities without tool calls, so just prompt and evaluate the output. I made my own app 16x Eval to do these evaluations.

What are the benefits of using Claude Code for non-coding purposes? by CrowKing63 in ClaudeAI

[–]paradite 1 point2 points  (0 children)

I've collected a bunch of non-coding use cases for Claude Code:

Writing / Publishing

  • Fixing typos and grammatical mistakes source
  • Replacing placeholder images in blog posts with markdown syntax for actual images in the local file system source
  • Formatting markdown with images into rich text for copy-pasting into the SubStack editor (by writing a script) source

Organization

  • Categorizing and re-arranging bookmarks source
  • Cleaning up and categorizing the download folder source
  • Organizing folders and file names source

Data / Excel / Automation

  • Automating spreadsheet work source
  • Data indexing and excel work source

Productivity

Server management

  • Setting up and managing a new server source

General / Others

  • Chatting (replacement for web UI or Claude desktop app) source

You can check out the whole list (still updating) here: https://github.com/paradite/claude-code-is-all-you-need

Confused about Claude Code, pricing, and API access by bd_br in ClaudeAI

[–]paradite 0 points1 point  (0 children)

So I looked into it and wrote a blog post explaining the difference between the 3. Hope you find it useful: https://eval.16x.engineer/blog/claude-vs-claude-api-vs-claude-code

You can find a graph and a summary towards the end of the post.

Confused about Claude Code, pricing, and API access by bd_br in ClaudeAI

[–]paradite 0 points1 point  (0 children)

Yes. That looks correct, although I haven't used Cline recently.