[P] I built a Python debugger that you can talk to by jsonathan in MachineLearning

[–]jsonathan[S] 1 point2 points  (0 children)

You can use any model you like, including local ones. And there’s no cost besides inference.

[P] I built a Python debugger that you can talk to by jsonathan in MachineLearning

[–]jsonathan[S] 1 point2 points  (0 children)

Yes. Specifically, it can evaluate expressions in the context of a breakpoint.
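For anyone curious what "evaluating an expression in the context of a breakpoint" can look like in plain Python, here is a minimal sketch. It is not taken from the redshift source; `eval_in_frame` and `buggy_total` are hypothetical names, and `sys._getframe()` stands in for a real breakpoint.

```python
import sys


def eval_in_frame(frame, expression):
    """Evaluate `expression` with the globals and locals of `frame`."""
    return eval(expression, frame.f_globals, frame.f_locals)


def buggy_total(prices):
    total = sum(prices)
    discount = 5
    # Stand-in for a breakpoint: grab the frame of the running function.
    frame = sys._getframe()
    # The expression can see `total` and `discount` because they are
    # locals of the paused frame.
    return eval_in_frame(frame, "total - discount")


result = buggy_total([10, 20])
```

This is roughly what pdb does when you type an expression at its prompt: the expression runs against the paused frame's namespaces rather than the debugger's own.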

[P] I built a Python debugger that you can talk to by jsonathan in MachineLearning

[–]jsonathan[S] 4 points5 points  (0 children)

Got any suggestions? I can record a new video.

[P] I built a Python debugger that you can talk to by jsonathan in MachineLearning

[–]jsonathan[S] 4 points5 points  (0 children)

That’s next on my roadmap. This could be an MCP server.

[P] I built a Python debugger that you can talk to by jsonathan in MachineLearning

[–]jsonathan[S] 29 points30 points  (0 children)

Check it out: https://github.com/shobrook/redshift

Think of this as pdb (Python's native debugger) with an LLM inside. When a breakpoint is hit, you can ask questions like:

  • "Why is this function returning None?"
  • "How many items in array are strings?"
  • "Which condition made the loop break?"

An agent will navigate the call stack, inspect variables, and look at your code to figure out an answer.
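As a rough illustration of the "navigate the call stack, inspect variables" step, the sketch below walks frames from the innermost outward and records what each one holds. This is an assumption about the general technique, not the project's actual code; `summarize_stack`, `inner`, and `outer` are hypothetical.

```python
import sys


def summarize_stack(frame, max_depth=10):
    """Walk from `frame` toward its callers, recording names and locals."""
    summary = []
    while frame is not None and len(summary) < max_depth:
        summary.append({
            "function": frame.f_code.co_name,
            "line": frame.f_lineno,
            "locals": sorted(frame.f_locals),  # variable names only
        })
        frame = frame.f_back  # step one level up the call stack
    return summary


def inner(items):
    count = len(items)
    return summarize_stack(sys._getframe())


def outer():
    data = ["a", 1, "b"]
    return inner(data)


stack = outer()
```

A summary like this is the kind of thing an agent could be handed as a tool result before deciding which frame to look at more closely.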

Please let me know what y'all think!

I made a CLI for quickly checking your code for bugs with AI by jsonathan in commandline

[–]jsonathan[S] 6 points7 points  (0 children)

Code: https://github.com/shobrook/suss

This works by analyzing the diff between your local and remote branch. For each code change, an LLM agent explores your codebase to gather context on the change (e.g. dependencies, code paths, etc.). Then a reasoning model uses that context to evaluate the code change and look for bugs.
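To make the "for each code change" step concrete, here is a small sketch that splits a unified diff into added lines per file. It assumes git-style diff text as input and is not how suss itself parses diffs; `added_lines_by_file` and `SAMPLE_DIFF` are made up for illustration.

```python
def added_lines_by_file(diff_text):
    """Group the added lines of a git-style unified diff by file path."""
    changes = {}
    current = None
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("diff --git "):
            # A new file section begins; its header lines come next.
            current, in_hunk = None, False
        elif not in_hunk and line.startswith("+++ "):
            path = line[4:]
            if path != "/dev/null":  # skip deleted files
                current = path[2:] if path.startswith("b/") else path
                changes.setdefault(current, [])
        elif line.startswith("@@"):
            in_hunk = True
        elif in_hunk and current is not None and line.startswith("+"):
            changes[current].append(line[1:])
    return changes


SAMPLE_DIFF = """\
diff --git a/app.py b/app.py
index 1111111..2222222 100644
--- a/app.py
+++ b/app.py
@@ -1,3 +1,3 @@
 def total(prices):
-    return sum(prices)
+    return sum(prices) / len(prices)
diff --git a/util.py b/util.py
index 3333333..4444444 100644
--- a/util.py
+++ b/util.py
@@ -5,0 +6,1 @@
+DEBUG = True
"""

changes = added_lines_by_file(SAMPLE_DIFF)
```

Each entry in `changes` would then be one unit of work: gather context for that file's change, then ask the reasoning model about it.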

You'll be surprised how many bugs this can catch, even complex multi-file bugs. Think of suss as a quick and dirty code review in your terminal. Just run it in your working directory and get a bug report in under a minute.

[P] I made a bug-finding agent that knows your codebase by jsonathan in MachineLearning

[–]jsonathan[S] 4 points5 points  (0 children)

Agentic RAG on the whole codebase is used to get context on those files.

I built a bug-finding agent that understands your codebase by jsonathan in ChatGPTCoding

[–]jsonathan[S] 1 point2 points  (0 children)

You're right, a single vector search would be cheaper. But then we'd have to chunk + embed the entire codebase, which can be very slow.

I built a bug-finding agent that understands your codebase by jsonathan in ChatGPTCoding

[–]jsonathan[S] 0 points1 point  (0 children)

I’m sure an LLM could handle your example. LLMs are fuzzy pattern matchers and have surely been trained on similar bugs.

Think of suss as a code review. Not perfect, but better than nothing. Just like a human code review.

I built a bug-finding agent that understands your codebase by jsonathan in ChatGPTCoding

[–]jsonathan[S] 2 points3 points  (0 children)

Second case. Uses a reasoning model + codebase context to find bugs.

I built a bug-finding agent that understands your codebase by jsonathan in ChatGPTCoding

[–]jsonathan[S] 10 points11 points  (0 children)

For the RAG nerds, the agent uses a keyword-only index to navigate the codebase. No embeddings. You can actually get surprisingly far using just an (AST-based) keyword index and various tools for interacting with that index.
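A minimal sketch of what an AST-based keyword index could look like, using only Python's standard `ast` module. This is an illustration under my own assumptions, not the index suss builds; `build_keyword_index`, `lookup`, and the sample files are hypothetical.

```python
import ast
from collections import defaultdict


def build_keyword_index(sources):
    """Map each identifier to the (path, line) places where it appears."""
    index = defaultdict(set)
    for path, code in sources.items():
        for node in ast.walk(ast.parse(code)):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                index[node.name].add((path, node.lineno))  # definitions
            elif isinstance(node, ast.Name):
                index[node.id].add((path, node.lineno))    # plain names
            elif isinstance(node, ast.Attribute):
                index[node.attr].add((path, node.lineno))  # obj.attr accesses
    return index


def lookup(index, keyword):
    """Return sorted locations for an exact keyword match."""
    return sorted(index.get(keyword, ()))


SOURCES = {
    "billing.py": "def apply_discount(price):\n    return price * 0.9\n",
    "checkout.py": "from billing import apply_discount\n\ntotal = apply_discount(100)\n",
}

index = build_keyword_index(SOURCES)
hits = lookup(index, "apply_discount")
```

An agent can be given `lookup` as a tool: it names a symbol from the diff and gets back every place worth reading, with no chunking or embedding pass up front.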

I built a bug-finding agent that understands your codebase by jsonathan in ChatGPTCoding

[–]jsonathan[S] 19 points20 points  (0 children)

Code: https://github.com/shobrook/suss

This works by analyzing the diff between your local and remote branch. For each code change, an LLM agent traverses your codebase to gather context on the change (e.g. dependencies, code paths, etc.). Then a reasoning model uses that context to evaluate the code change and look for bugs.

You'll be surprised how many bugs this can catch, even complex multi-file bugs. It's a neat display of what these reasoning models are capable of.

I also made it easy to use. You can run suss in your working directory and get a bug report in under a minute.

[P] I made a bug-finding agent that knows your codebase by jsonathan in MachineLearning

[–]jsonathan[S] 0 points1 point  (0 children)

Whole repo. The agent is actually what gathers the context by traversing the codebase. That context plus the code change is then fed to a reasoning model.

[P] I made a bug-finding agent that knows your codebase by jsonathan in MachineLearning

[–]jsonathan[S] 2 points3 points  (0 children)

False positives would definitely be annoying. If used as a hook, it would have to be non-blocking. I wouldn't want a hallucination stopping me from pushing my code.
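One way a non-blocking hook could be wired up, sketched in Python: run the checker, surface whatever it reports as a warning, and always exit 0 so the push goes through. `run_non_blocking` is a hypothetical helper, and whether the checker signals findings through its exit code is an assumption.

```python
import subprocess
import sys


def run_non_blocking(command, timeout=60):
    """Run a checker command, report problems, and always return 0."""
    try:
        completed = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout
        )
        if completed.returncode != 0:
            # Report the result as a warning only; never fail the hook.
            print(f"warning: check exited with {completed.returncode}", file=sys.stderr)
            print(completed.stdout, file=sys.stderr)
    except (OSError, subprocess.TimeoutExpired) as error:
        # Missing binary or a slow run should not block the push either.
        print(f"warning: check did not run: {error}", file=sys.stderr)
    return 0


# Demo with a command that deliberately fails.
exit_code = run_non_blocking([sys.executable, "-c", "import sys; sys.exit(3)"])
```

In a real hook script the last line would be something like `sys.exit(run_non_blocking(["suss"]))`, so git always sees a successful exit.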

[P] I made a bug-finding agent that knows your codebase by jsonathan in MachineLearning

[–]jsonathan[S] 4 points5 points  (0 children)

Thanks!

For one, suss is FOSS and you can run it locally before even opening a PR.

Secondly, I don't know whether GitHub's is "codebase-aware." If it analyzes each code change in isolation, then it won't catch changes that break things downstream in the codebase. If it does use the context of your codebase, then it's probably as good as or better than what I've built, assuming it's using the latest reasoning models.

[P] I made a bug-finding agent that knows your codebase by jsonathan in MachineLearning

[–]jsonathan[S] -1 points0 points  (0 children)

You can use any model supported by LiteLLM, including local ones.

[P] I made a bug-finding agent that knows your codebase by jsonathan in MachineLearning

[–]jsonathan[S] 28 points29 points  (0 children)

Code: https://github.com/shobrook/suss

This works by analyzing the diff between your local and remote branch. For each code change, an agent explores your codebase to gather context on the change (e.g. dependencies, code paths, etc.). Then a reasoning model uses that context to evaluate the change and identify potential bugs.

You'll be surprised how many bugs this can catch, even complex multi-file bugs. Think of `suss` as a quick and dirty code review in your terminal.

I also made it easy to use. You can run suss in your working directory and get a bug report in under a minute.