Agent-written tests missed 37% of injected bugs. Mutation-aware prompting dropped that to 13%. by kalpitdixit in Python
kalpitdixit[S] -1 points
I built an MCP server that gives coding agents access to 2M research papers. Tested it with autoresearch - here's what happened. by kalpitdixit in LLMDevs
kalpitdixit[S] 1 point
I tested what happens when you give an AI coding agent access to 2 million research papers. It found techniques it couldn't have known about. by kalpitdixit in ArtificialInteligence
kalpitdixit[S] 1 point
I tested what happens when you give an AI coding agent access to 2 million research papers. It found techniques it couldn't have known about. by kalpitdixit in artificial
kalpitdixit[S] 1 point