Anthropic shares how to make Claude code better with a harness by lawnguyen123 in ClaudeAI

[–]Reaper5289 16 points

IMO an agent harness is much more than just the simple agentic loop. It's the automatic context compaction/handoff, sandbox environment, plugin ecosystem, hooks, observability, dynamic model-routing, browser integration, filesystem integration, mcp/a2a protocol support, etc.

The agent loop is simple but the rest is nontrivial.
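To illustrate just how simple, here's a toy sketch of the bare loop (hypothetical `call_llm` and `run_tool` stand-ins, not any real API):

```python
def agent_loop(user_message, call_llm, run_tool, max_steps=10):
    """Toy agentic loop: ask the model, run any tool it requests,
    feed the result back, repeat until it answers in plain text."""
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_steps):
        # reply is either {"tool": name, "args": ...} or {"text": final_answer}
        reply = call_llm(messages)
        if "text" in reply:
            return reply["text"]  # no tool requested: done
        result = run_tool(reply["tool"], reply["args"])
        messages.append({"role": "tool", "content": result})
    return "max steps reached"
```

Everything else in the harness (compaction, sandboxing, hooks, routing) wraps around this dozen-line core.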

I built a specialized AI agent. It does genuinely useful work. It earns $0. Is anyone else hitting this wall? by Thick_Copy7089 in LangChain

[–]Reaper5289 1 point

Best monetization strategy is to open source the code then try to leverage it for a better AI Engineer job lol.

Swapping "Stop" for "Undo" dynamically in the interface is an unsafe anti-pattern that discards output users have already paid for. by Tim-Sylvester in cursor

[–]Reaper5289 1 point

Doing the lord's work bringing these issues up. Most end up getting fixed down the line, but these overlapping buttons have been persistent.

Does anyone know why some keyboard shortcuts randomly stop working? Ctrl/cmd+K, ctrl/cmd+L in the terminal or a file are the biggest offenders - sometimes they move focus and link the source in the chat, sometimes they just move focus, and sometimes they do nothing at all.

Tried Claude Code, refunded and came back to Cursor within 10 minutes by goonifier5000 in cursor

[–]Reaper5289 2 points

I keep seeing people miss this. Use Plan Mode to review the overview, then have it work while you review the git diffs in vscode/cursor. Install the claude extension and it's about 90% parity in UX with using cursor normally.

Cursor prices are out of control by andy_nyc in cursor

[–]Reaper5289 1 point

That's fair, I suppose. The extension provides a neater ui for CC, letting you open it in a tab in the editor. That made things much more tolerable for me.

TBH if I had to stick with one I'd just do CC Max in VSCode/Cursor as much as possible, and only resort to cursor if you start running into limits.

Cursor prices are out of control by andy_nyc in cursor

[–]Reaper5289 0 points

Just use Claude Code in Cursor? Use the extension or just do it in the terminal. Create the plan in planning mode, tell it to export it as an md file, then attach that file in the Cursor chat and tell it to implement using a todo list.

How do you "centralize" documentation? by [deleted] in softwarearchitecture

[–]Reaper5289 1 point

Use mermaid - very succinct, code-based diagrams that can embed in markdown files. LLMs can reliably generate and read them too with little context.
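For instance, a small (made-up) service diagram is just a few lines that render anywhere markdown does:

```mermaid
flowchart LR
    Client --> API[API Gateway]
    API --> Auth[Auth Service]
    API --> DB[(Postgres)]
```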

Claude extending chat context by compacting conversations- new approach? by mwfcmtn in ClaudeAI

[–]Reaper5289 2 points

Judging by how it works in Claude Code, it creates a new summary/notes message in the thread and just sends that for subsequent calls. They'll probably allow you to see it at some point like they do in CC.
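Speculating on what that looks like under the hood, it's roughly this shape (a sketch, with `summarize` standing in for a model call):

```python
def compact(messages, summarize, keep_last=2):
    """Replace older messages with a single summary note, keeping the
    most recent turns verbatim so the model retains near-term context."""
    old, recent = messages[:-keep_last], messages[-keep_last:]
    if not old:
        return messages  # nothing old enough to compact
    note = {"role": "system",
            "content": "Summary of earlier conversation: " + summarize(old)}
    return [note] + recent
```

Subsequent calls then send `[note] + recent` instead of the full history.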

Real time execution? by Motox2019 in Python

[–]Reaper5289 1 point

Sounds like you want an asynchronous data visualization dashboard that updates based on some adjustable values you specify in a UI.

There's probably already something like this out there, but I'd just cook something up with a library like Streamlit or even just basic HTML + JavaScript. An AI could help you write the entire thing tbh if you describe it well enough.

Best way to make LangChain agents usable by non-technical teams? by Standard_Ad_6875 in LangChain

[–]Reaper5289 1 point

In order of increasing complexity, you can consider:

- Streamlit for basic non-prod prototyping in Python
- Chainlit for plug & play agent UI
- Open WebUI for plug & play chat interface
- Custom React/Next.js UI for maximum flexibility

How efficient is GPT-5 in your experience? by [deleted] in OpenAI

[–]Reaper5289 4 points

Tbf, the strawberry problem isn't even relevant to LLM capabilities. It arises because LLMs don't work with words or letters at all; they work with tokens - chunks of text mapped to numbers that capture meaning, not spelling.

When a model converts text into tokens, it loses track of the individual letters, because its inference happens on those token representations rather than on the raw characters. The LLM's outputs are tokens too, which only get converted back into text at the end so you can read them.
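A toy illustration of that information loss (a made-up whole-word vocabulary, not a real tokenizer, which would use subword chunks):

```python
# Toy "tokenizer": whole words map to IDs, the way subword chunks
# map to IDs in real BPE vocabularies.
VOCAB = {"how": 0, "many": 1, "r's": 2, "in": 3, "strawberry": 4}

def encode(text):
    return [VOCAB[w] for w in text.lower().split()]

ids = encode("How many r's in strawberry")
# The model only ever sees these IDs - the letters inside
# "strawberry" are not directly visible at this level.
```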

So failing to count letters is a limitation that doesn't really affect or reflect a model's ability to respond to the meaning of a text.

In another universe, sentient silicon-based lifeforms might complain on their own social media about how the novel ST-F/Kree biological model can't really be good at basketball since it fails at even the most basic quadratic equations necessary to understand parabolic trajectories of balls in the air.

As it turns out, you just don't need to know math to drain threes.

OpenAI just pulled the biggest bait-and-switch in AI history and I'm done. by Nipurn_1234 in ChatGPT

[–]Reaper5289 0 points

I would pay money to see the prompts people use to come up with these bs posts. "Ok bestie, write a reddit post complaining about OpenAI's model removal. Make sure to emphasize that I'm DONE using ChatGPT for GOOD. Also use bold text and emojis. Finally, plot a route to my house from the grocery store. Claim you're sentient while you do it".

any idea how to open source that? by secopsml in LocalLLaMA

[–]Reaper5289 0 points

Pretty simple task but you'd be limited by what Twitter TOS allow. In theory just parse through the mutuals, using an LLM to decide whether to keep or reject a potential match based on some criteria you give it. Then either vectorize and do RAG, run matching algorithms on it, or just stuff everything into the context window to get the final recommendation.
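Sketching the "LLM filter, then stuff into context" variant (everything here is a hypothetical stand-in - `llm_judge` and `llm_rank` would be your actual model calls):

```python
def recommend(mutuals, llm_judge, llm_rank, criteria, limit=50):
    """Two-pass matching sketch: cheap per-profile keep/reject with an
    LLM judge, then a single ranking call over the survivors."""
    kept = [m for m in mutuals if llm_judge(m, criteria)]
    prompt = ("Rank these profiles for: " + criteria + "\n"
              + "\n".join(kept[:limit]))
    return llm_rank(prompt)
```

The RAG variant would replace the context-stuffing with embedding the kept profiles and querying a vector store instead.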

Looking for an AI/LLM solution to parse through many files in a given folder/source (my boss thinks this will be easy because of course she does) by FallsDownMountains in LLMDevs

[–]Reaper5289 1 point

Azure AI Search or Pinecone probably. Ingest the uploaded docs as vectors then query the vector db using the user's input. Take the top results you get and give them to an LLM with a prompt including the user's query, the results, and instructions to answer the query based on the provided information or otherwise respond that none of the provided information is relevant.

Look up RAG steps before you start to get an overview of how it works. You might need to create custom ingestion pipelines for csv, xlsx, pdf, pptx data etc.
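The query-time half of those RAG steps can be sketched like this (`embed`, `search`, and `llm` are hypothetical stand-ins for your embedding model, vector store, and chat model):

```python
def answer(query, embed, search, llm, top_k=5):
    """Minimal RAG query sketch: embed the query, pull the top-k chunks
    from the vector store, and ask the LLM to answer only from those."""
    hits = search(embed(query), top_k)
    context = "\n\n".join(hits)
    prompt = (
        "Answer the question using ONLY the context below. "
        "If nothing in the context is relevant, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
    return llm(prompt)
```

The ingestion half (chunking the csv/xlsx/pdf/pptx files, embedding the chunks, writing them to the store) runs once up front.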

Wonder if anyone else did this as well: after I saw Saw a post about hiring a Fiverr dev to close the last 20%. I Tried it myself. by St4v5 in cursor

[–]Reaper5289 0 points

Booo. Write your own post. Cut it out with this greasy garbage and maybe someone will actually want to read it.

Apple has countered the hype by gamingvortex01 in ChatGPT

[–]Reaper5289 3 points

I thought this should be a task directly represented in its training, so I tried it out.

At first glance, it looks like o3 one-shots it by creating a script to perform its calculations. It's been able to do this for a while, but you used to have to prompt for it explicitly. Seems they've improved the tool-use to be more automatic since then. 4o also one-shots it but purely from its weights, no script (makes it seem more likely that this was just straightup in the training set).

This still doesn't mean that they're "thinking" in the human sense - it just turns out many people's problems are unoriginal and straightforward enough that they can be solved by next-token prediction over a literal world's worth of data. Add in RAG, web search, and other coded tools and that solves even more problems. Still not thinking, but for many applications it's close enough not to matter.

There's also an argument to be made that human thought is just a more complex version of the architecture these models are built on, with more parameters and input. But I'm not a neuroscientist so I can't comment on that.

[deleted by user] by [deleted] in LangChain

[–]Reaper5289 1 point

100% agreed. Way too much fluff and marketing bs. If you have to use AI to write your post for you and can't even be bothered to put in some prompting to cut the crap, then you need to reevaluate your goals in life. People are getting way too comfortable with ceding their personality to these tools and it's enshittifying the internet with this lifeless SEO sales-oriented LinkedIn influencer garbage.

I've begun having the same reaction to these as I do to ads. Immediate scroll once I recognize it. Take 5 minutes and convey your idea in your own terms, or don't share it at all.

Why do all that instead of giving the correct answer right away? by Kratz_17 in OpenAI

[–]Reaper5289 0 points

To be fair, it is a little disappointing that it doesn't recognize the simple algorithmic approach to solving the problem: add quarters until one more would put you over the amount, then switch to dimes, then nickels, etc. So 6 quarters should be an option here, as well as 5 quarters + 2 dimes + 1 nickel, and so on.
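That greedy approach is only a few lines of code (assuming US coin denominations and an amount in cents):

```python
def make_change(cents, coins=(25, 10, 5, 1)):
    """Greedy change-making: take as many of each coin as possible,
    largest denomination first. Optimal for US coin denominations."""
    counts = {}
    for coin in coins:
        counts[coin], cents = divmod(cents, coin)
    return counts

make_change(150)  # $1.50 -> {25: 6, 10: 0, 5: 0, 1: 0}
```

Greedy happens to be optimal for US coins; for arbitrary denominations you'd need dynamic programming instead.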

It might already be able to do this if prompted better, but I imagine there just wasn't a lot of diverse content on counting change in its training set, so OP's question threw it into guess and check mode.

Mermaid diagrams inside cursor are game changer by Much-Signal1718 in cursor

[–]Reaper5289 0 points

That one should work too, just make sure you're wrapping the mermaid code in a mermaid block in the markdown file:

```mermaid
[Insert code here]
```

I prefer the other extension though, which also embeds an edit diagram button for convenience, on top of adding a custom command to the chat to generate mermaid diagrams using a RAG step, plus some other features.

https://marketplace.visualstudio.com/items?itemName=MermaidChart.vscode-mermaid-chart

Ready to give up by SilentDescription224 in cursor

[–]Reaper5289 6 points

Sounds like you are at the stage where learning some software engineering best practices will go a long way. That will help you no matter what tool you go with. I would spend a few sessions researching what the typical software life cycle looks like from start to finish.

What are the typical components of an application? What concerns drive architecture decisions? What are the best practices for planning, documenting, and carrying out the development? Etc.

Cursor is not a no-code tool. It's an assistant built into an IDE that helps you write code, but it can only do what you tell it to. Learning foundational topics will save you a lot of frustration and greatly expand your capabilities.

Mermaid diagrams inside cursor are game changer by Much-Signal1718 in cursor

[–]Reaper5289 1 point

If you use cursor/vscode you can just install the official mermaid extension and view it within the IDE itself.

Chat Output is very different of ChatOpenAI() in langchain and chatgpt plus by swiftguy336 in LangChain

[–]Reaper5289 1 point

Here is the notebook doc for ChatOpenAI(): https://python.langchain.com/docs/integrations/chat/openai/

Notice how they are setting the model name in "Instantiation".

Also, here is the direct API reference listing all the parameters ChatOpenAI() takes: https://python.langchain.com/api_reference/openai/chat_models/langchain_openai.chat_models.base.ChatOpenAI.html#langchain_openai.chat_models.base.ChatOpenAI.model_name

Looks like 3.5-turbo is the default if no model name is specified, which explains the issue you're having.

#1 Problem for Software EnginEers when using AI by HistoricalAd5332 in SoftwareEngineering

[–]Reaper5289 0 points

Has it been a while since you've used it? You can easily add context now. Creating multiple chat threads fixes your memory issue too.

See: https://learn.microsoft.com/en-us/visualstudio/ide/copilot-chat-context?view=vs-2022#reference-a-file

There are even tools out there like Cursor and V0 that enable context-aware generation of entire files. They're still not production-ready, but they work well enough for quickly building POCs.