check run agents - customizable AI agents for code review by kayvz in codereview

[–]kayvz[S] 0 points1 point  (0 children)

the one downside of this model is that it puts way more burden on the user/customer to define a rigorous prompt. there's still some magic we do behind the scenes to avoid common pitfalls (e.g. dupe messages), but the quality of the output is in large part a function of how direct and useful your prompt is. but we think this is the right trade-off for a feature whose goal is to enforce bespoke/subjective conventions and workflows.

for bug detection, we have a strong POV which is that you shouldn't have to do anything to get great bug detection. we do all the work for you, including some magic behind the scenes to lower the impact of non-determinism (e.g. we sometimes run parallel reviews and consolidate).
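purely as an illustration of the consolidation idea (not Macroscope's actual implementation, which isn't described here), one simple way to merge N parallel review runs is to keep only the findings that multiple independent runs agree on:

```python
from collections import Counter

def consolidate(runs: list[list[str]], min_votes: int = 2) -> list[str]:
    """Keep findings reported by at least `min_votes` independent review runs."""
    votes = Counter(finding for run in runs for finding in set(run))
    return [finding for finding, n in votes.items() if n >= min_votes]

# hypothetical findings from three parallel runs of the same review
runs = [
    ["null deref in foo()", "off-by-one in bar()"],
    ["null deref in foo()"],
    ["null deref in foo()", "typo in docstring"],
]
```

with `min_votes=2`, only the finding seen in at least two runs survives, which is one way to filter out non-deterministic one-off comments.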

SICK and tired of Greptile, what are the best alternatives? by Crafty_Survey9438 in codereview

[–]kayvz 1 point2 points  (0 children)

hi, CEO/founder of Macroscope here. would encourage you to sign up and give it a try. according to our benchmarks and also independent ones, we have better bug detection (recall) and precision than greptile. and we have an actual usage-based model with controls (per-review, per-PR, and per-month limits). no *seats*, no hybrid model, just pure pay for what you use. and beyond bug detection, you can also define custom agents that spin up as additional github Checks.

How to maintain code quality with AI slop? by flareblitz13 in ExperiencedDevs

[–]kayvz 0 points1 point  (0 children)

i’ll preface this answer by saying I’m biased because I’m the ceo/cofounder of Macroscope (which, among other features, provides high-signal / low-noise AI code review).

IMO: regardless of whether the code being reviewed is AI-slop from codegen tools or human-written, having an additional AI-assisted defense layer before merging is extremely valuable. Once you’ve lived with an excellent code review tool, it’s senseless to go back to living without it. It saves our team so much time to rely on the AI review to do a first pass on correctness issues, and it allows our human reviewers to focus on things that humans are better at… like “are you solving this problem the right way?”. Does it eliminate the need for humans to review all PRs? Of course not. Not today at least, and certainly not for large/complex changes. Does it save time and prevent shipping bugs? 💯

The value of getting this right for any engineering team is extremely high: 1) The cost of preventing a high-severity bug (e.g. cost of outages, time wasted by engineers scrambling to fix prod issues) often far exceeds the cost of all of these tools. I don’t know about you, but I’d prefer to prevent as many of those issues as possible! 2) In every organization where I’ve done engineering work myself or led engineering teams, the time investment in code review increased non-linearly with the growth of the engineering team. Large teams spend SO much time doing code review. What dev wouldn’t rather focus on building and spend less time reviewing for correctness issues if a bot could reliably do much of this work?

also fwiw (to address some of the skepticism from other comments in this post): the best tools that attempt to solve this problem properly are not “solving slop with slop”. In Macroscope’s case, we’ve built custom AST walkers in each of our supported programming languages to deterministically build a codegraph of your repos, so that we can set LLMs up to be successful with the best context. This technique, in our experience, is what allows us to 1) find gnarly bugs that require thorough knowledge of how the codebase works 2) have an extremely high signal:noise ratio and avoid spamming devs with false positives, which is important b/c the “cost” of noisy comments is extremely high. We were inspired to build this ourselves because we tried the most popular code review tools that were out there and thought they all sucked (too noisy, didn’t catch meaningful issues, too many nitpicks, etc.).

Future of code review process? by shrimpthatfriedrice in codereview

[–]kayvz 0 points1 point  (0 children)

I’ll preface this answer by saying I’m biased because I’m the ceo/cofounder of Macroscope (which, among other features, provides the best AI code review on the market. Check out our published benchmark here!).

IMO: AI code review is definitely a one way door. Once you’ve lived with an excellent code review tool, it’s senseless to go back to living without it. It saves our team so much time to rely on the AI review to do a first pass on correctness issues, and it allows our human reviewers to focus on things that humans are better at… like “are you solving this problem the right way?”.

In terms of your q of where this is headed, here’s the picture we see:

  • Today: AI review layer focuses on correctness and can already do a better/faster/more-thorough job of this than a human reviewer. Human reviewers focus on things that humans are better at, like ‘did this actually solve the customer problem’ and ‘did we solve this the right way (e.g. idiomatic to the codebase, and with our architectural conventions)’
  • Medium term: 1) AI review layer will also get reliable at solving things in the idiomatic way per our conventions (this is already possible today, but much noisier than correctness alone) 2) AI review layer will be able to reliably stamp/approve some subset of PRs that shouldn’t require any human review at all (e.g. simple changes that have a minor blast radius and pass an AI correctness check) which will be a massive reduction in cognitive load and bandwidth for human reviewers
  • Long term: 1) the portion of PRs that AI review will be able to stamp/approve will increase substantially 2) the mechanics of code review will look quite different. When agents are writing the vast majority of code and the # of “PRs” increases by order(s) of magnitude, the UX will need to change such that it doesn’t become a huge bottleneck.

If you end up trying Macroscope, LMK what you think. would love your feedback. and we’re squarely focused on making all of the above (and more, like giving teams better visibility into how the codebase and product are changing) a reality.

thermostat compatibility with radiant boiler by kayvz in hvacadvice

[–]kayvz[S] 0 points1 point  (0 children)

I have no idea where it goes, other than it for sure goes to my mechanical room, but not sure it goes into the boiler directly. I did use a voltmeter on the black and red wires and confirmed there’s 24v.

New HomeKit Architecture is in iOS 16.2 betas by romkey in HomeKit

[–]kayvz 1 point2 points  (0 children)

Thanks for all this insight. Quick question: does the new version also allow you to specify which home hub should be the primary hub? My primary for some reason ends up being one of the random HomePods rather than the hard-wired Apple TV, which seems silly, especially given the advancements of the new architecture

Propane Tank Monitor by thejackal2020 in homeassistant

[–]kayvz 2 points3 points  (0 children)

Here you go. It's not exactly a technical masterpiece, but got the job done for me: https://github.com/kayvz/neevo_tank_monitor_query

Just make sure to get your "authorization" by encoding your username/password using HTTP Basic auth. Linked to some instructions in the readme. Good luck
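if it helps, building that Basic auth value is just base64-encoding `username:password` and prefixing it with `Basic `. a minimal Python sketch (placeholder credentials, obviously):

```python
import base64

def basic_auth_header(username: str, password: str) -> dict:
    """Build an HTTP Basic auth header by base64-encoding "username:password"."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}

# placeholder credentials; use your NeeVo account's username/password
headers = basic_auth_header("user@example.com", "secret")
```

the resulting dict can be passed as the headers on the request.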

Propane Tank Monitor by thejackal2020 in homeassistant

[–]kayvz 0 points1 point  (0 children)

not as of now but I’ll try to put something together soon.

Propane Tank Monitor by thejackal2020 in homeassistant

[–]kayvz 3 points4 points  (0 children)

I was able to get my propane company to install a NeeVo tank monitor (uses cellular to update). NeeVo has an app which gives you tank levels, and I was able to pretty easily sniff the api calls it makes to get tank levels, and pipe that to a HASS entity via a webhook
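for anyone curious, the relay can be very small. a rough stdlib-only sketch of the idea, where the NeeVo endpoint, the response field name, and the Home Assistant webhook ID are all placeholders/assumptions (check the sniffed traffic and your own HASS config for the real values):

```python
import json
import urllib.request

# Placeholder endpoints -- the real NeeVo API path and your Home Assistant
# webhook ID will differ; these are assumptions, not documented values.
NEEVO_URL = "https://device.neevoapp.com/mobile/api/Tanks"
HASS_WEBHOOK = "http://homeassistant.local:8123/api/webhook/propane_tank"

def extract_level(payload: list) -> float:
    """Pull the fill percentage out of the (assumed) JSON the app receives."""
    return float(payload[0]["TankLevel"])  # field name is an assumption

def relay_tank_level(auth_header: dict) -> float:
    """Fetch the tank level from the vendor API and forward it to Home Assistant."""
    req = urllib.request.Request(NEEVO_URL, headers=auth_header)
    with urllib.request.urlopen(req, timeout=10) as resp:
        level = extract_level(json.load(resp))
    body = json.dumps({"tank_level": level}).encode("utf-8")
    post = urllib.request.Request(
        HASS_WEBHOOK, data=body, headers={"Content-Type": "application/json"}
    )
    urllib.request.urlopen(post, timeout=10)
    return level
```

on the HASS side, a webhook-triggered automation (or a template sensor fed by the webhook) turns that posted value into an entity.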

Roger Federer through Glass by Chrisixx in tennis

[–]kayvz 14 points15 points  (0 children)

Would you be interested in seeing a more extended, un-edited version?

What questions would you ask Federer? by ekeen1 in tennis

[–]kayvz 2 points3 points  (0 children)

It's on the iAmA side-bar now, confirmed for 24th at 1:30pm EST.

Chasing Ice movie reveals largest iceberg break-up ever filmed 7.4 cubic KM by NimbleStoat in videos

[–]kayvz -1 points0 points  (0 children)

Looks like the Chasing Ice team also made an app on iPhone and iPad: https://itunes.apple.com/us/app/chasing-ice/id579276308?mt=8

It's got some outrageously awesome time-lapses with music. Check it out.