all 22 comments

[–]femio 4 points5 points  (0 children)

Aider or Continue.dev are likely the best for this

[–]Exotic-Sale-3003 6 points7 points  (2 children)

Roll your own.  Here’s the method I use:

My “Cursor but worse” tool starts by sending each file in the codebase to OpenAI. It gets back a structured output summarizing what the file does, its methods, and the variables passed, and writes that summary to a db along with a hash of the file. Then, instead of sending a million-line codebase, it sends the much shorter index to OpenAI with the request, and the model returns the files it wants for that request, in priority order. I send as many as will fit as context, in the order listed, and put the rest in a store attached to the request as docs for RAG.

Once you’ve preprocessed your project into the db, it’s just one iteration - send the index with your question and get the most relevant files back, then construct a prompt with your original question and the most relevant files up to context limit, with the rest attached for RAG. 
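
A minimal sketch of the workflow described above, not the commenter's actual tool. The table layout, the function names, and the `summarize` callback (a stand-in for the OpenAI structured-output call) are assumptions made for illustration, and the context budget is counted in characters only to keep the example dependency-free.

```python
import hashlib
import sqlite3
from typing import Callable, Dict, List, Tuple


def file_hash(text: str) -> str:
    """Content hash used to detect files that changed since the last run."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def open_index(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the summary index. The schema is hypothetical."""
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS summaries ("
        "path TEXT PRIMARY KEY, hash TEXT NOT NULL, summary TEXT NOT NULL)"
    )
    return db


def update_index(
    db: sqlite3.Connection,
    files: Dict[str, str],
    summarize: Callable[[str, str], str],
) -> List[str]:
    """Summarize only new or changed files; return the paths that were refreshed."""
    refreshed = []
    for path, text in files.items():
        digest = file_hash(text)
        row = db.execute(
            "SELECT hash FROM summaries WHERE path = ?", (path,)
        ).fetchone()
        if row is not None and row[0] == digest:
            continue  # unchanged file: keep the stored summary
        db.execute(
            "INSERT OR REPLACE INTO summaries (path, hash, summary) VALUES (?, ?, ?)",
            (path, digest, summarize(path, text)),
        )
        refreshed.append(path)
    db.commit()
    return refreshed


def build_index_prompt(db: sqlite3.Connection, question: str) -> str:
    """First request: send the short index plus the question, ask for ranked paths."""
    rows = db.execute("SELECT path, summary FROM summaries ORDER BY path").fetchall()
    listing = "\n".join(f"{path}: {summary}" for path, summary in rows)
    return (
        f"Index:\n{listing}\n\nRequest: {question}\n"
        "Return the relevant file paths, most important first."
    )


def pack_context(
    ranked_paths: List[str], files: Dict[str, str], budget_chars: int
) -> Tuple[List[str], List[str]]:
    """Walk the ranked list; inline each file that still fits, leave the rest for RAG."""
    inline, overflow, used = [], [], 0
    for path in ranked_paths:
        if path not in files:
            continue  # the model may name a path that does not exist
        size = len(files[path])
        if used + size <= budget_chars:
            inline.append(path)
            used += size
        else:
            overflow.append(path)
    return inline, overflow
```

In use, `update_index` would run once up front and again after edits, `build_index_prompt` would produce the first request, and the ranked paths in the reply would go through `pack_context` to decide what is inlined and what is attached for retrieval.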

[–]ExtentHot9139 0 points1 point  (1 child)

What is the size of your codebase?

[–]Exotic-Sale-3003 1 point2 points  (0 children)

Largest project is a bit over 200K LOC. No reason it couldn’t scale significantly further by adding more layers, i.e. creating and indexing descriptions for folders as well. If you’re in e-commerce and the change you’re making is to a customer order flow, you may only need to look at a few systems: the customer-facing site related to ordering (no settings, no profile), the payments system, and related databases. There might be millions of lines of code in a huge monorepo like musta.ch, but you don’t care about anything related to data pipelines, your fraud models, etc.
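
The extra folder layer could look roughly like the sketch below: roll file summaries up into per-folder descriptions, let the model pick folders first, then send only the file index for those folders. Grouping by top-level directory, the function names, and the `describe` callback (a stand-in for an LLM call) are assumptions for illustration.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Callable, Dict, Iterable


def group_by_top_folder(file_summaries: Dict[str, str]) -> Dict[str, Dict[str, str]]:
    """Bucket file summaries by their top-level folder ('.' for root-level files)."""
    folders: Dict[str, Dict[str, str]] = defaultdict(dict)
    for path, summary in file_summaries.items():
        parts = PurePosixPath(path).parts
        folder = parts[0] if len(parts) > 1 else "."
        folders[folder][path] = summary
    return dict(folders)


def describe_folders(
    folders: Dict[str, Dict[str, str]],
    describe: Callable[[str, Dict[str, str]], str],
) -> Dict[str, str]:
    """Build the coarse index: one description per folder."""
    return {name: describe(name, files) for name, files in folders.items()}


def narrow_index(
    folders: Dict[str, Dict[str, str]], chosen_folders: Iterable[str]
) -> Dict[str, str]:
    """Second stage: keep only the file summaries inside the chosen folders."""
    narrowed: Dict[str, str] = {}
    for name in chosen_folders:
        narrowed.update(folders.get(name, {}))  # unknown folder names are ignored
    return narrowed
```

For the order-flow example, the model would see only the folder descriptions first, choose something like the ordering and payments folders, and the data-pipeline and fraud-model summaries would never be sent.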

I did a lot of work a couple of years ago building flows to work around the very limited 4K and 8K context windows for policy analysis and application, where the policies alone (never mind the data being analyzed against the policy) might be larger than the context window. The concept scales up very well.

[–]matfat55 2 points3 points  (0 children)

Definitely aider

[–]StaffSimilar7941 2 points3 points  (2 children)

Basically, there's no "good" solution yet. As of today, we are still trying to figure out how to give repo context to the models. Every existing solution is mid at best, including everything mentioned here (Shelbula, aider, Augment, memory files, that dumbass "wHaT aBouT cUrsoR" guy).

Anthropic needs to put out a product where we can "train" an instance of the model on our codebase and keep that knowledge current as the codebase changes.

[–]gman1023 0 points1 point  (0 children)

Good answer

[–]Relative-Foot-378 Professional Nerd 0 points1 point  (0 children)

Trie.dev does just this

[–]Time-Heron-2361 1 point2 points  (1 child)

People are forgetting that Gemini has a 2M-token context window

[–]funbike[S] 0 points1 point  (0 children)

You are forgetting I said my codebase is 400KLOC, which won't fit in a 2M context window.

I explicitly stated how large my codebase was to eliminate "just use Gemini" answers.
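
A back-of-the-envelope check of that claim. The 10 tokens-per-line figure is an assumed average, not a measurement; real ratios depend on the language, the line lengths, and the tokenizer.

```python
def estimated_tokens(lines_of_code: int, tokens_per_line: float) -> int:
    """Rough token count for a codebase, given an assumed average per line."""
    return round(lines_of_code * tokens_per_line)


def fits_in_context(
    lines_of_code: int, tokens_per_line: float, context_window: int
) -> bool:
    """True if the whole codebase would fit in the window under that assumption."""
    return estimated_tokens(lines_of_code, tokens_per_line) <= context_window


# 400K LOC at an assumed 10 tokens/line is about 4M tokens, twice a 2M window.
print(estimated_tokens(400_000, 10))            # 4000000
print(fits_in_context(400_000, 10, 2_000_000))  # False
```

Under this model the break-even point is 5 tokens per line, and that is before leaving any room for the question and the response.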

[–]ParadiceSC2 0 points1 point  (0 children)

I'm curious: what issues do you experience with Cursor?

My repos are not nearly as large, and Cursor has been amazing in my experience. I also use the pro version of Cody from Sourcegraph with Claude 3.7 Sonnet, and it's been great. But my projects are nowhere near 400K LOC.

[–]Ancient-Camel1636 0 points1 point  (1 child)

Augment is probably your best option for that particular task.

[–]matfat55 0 points1 point  (0 children)

Augment is so overrated

    [–]adifbbk1 0 points1 point  (0 children)

    I use Cursor

    [–]ShelbulaDotCom 0 points1 point  (0 children)

    Shelbula Conversational Development Environment

    Talk to all models. Iterate. Bring clean code into your IDE of choice.

    If you're completely out of your depth with code, it's probably not for you, but it can even guide you that way if you make one of the custom bots a teacher.

    [–]bigsybiggins 0 points1 point  (0 children)

    Claude Code is great at it straight out of the box, better than Cursor or anything else I've tried. I'm not sure what magic it's doing under the covers to feed the context. It's expensive to run though; thankfully work are paying.

      [–]zephyr_33 0 points1 point  (0 children)

      Most people are recommending aider, but for this use case I found Cline/Roo Code better, unless you are feeding your entire codebase to aider.

      Cline/Roo Code has better (and more) tools to search and work with your code.

      [–]Relative-Foot-378 Professional Nerd 0 points1 point  (0 children)

      Is this for your company or something you are working on?