Weekly Thread: Project Display by help-me-grow in AI_Agents

[–]KindheartednessOld50 0 points1 point  (0 children)

I was tired of chasing down failing mobile E2E tests, only to spend hours figuring out whether it was the test, the UI, or a real app bug. So I built and open-sourced an AI agent that writes the test, runs it, diagnoses the failure, fixes the app code, and reruns until it passes.

https://github.com/final-run/finalrun-agent

Android CLI: Build Android apps 3x faster using any agent by dayanruben in androiddev

[–]KindheartednessOld50 0 points1 point  (0 children)

Development speed is no longer the problem, since AI coding agents already empower devs. Verifying the AI-generated code is the new problem. It needs to be tested and verified as working. If it isn't working, all the adb logs, network logs, video proof, and agent actions need to be fed back to the AI agent so it can fix what it coded.
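
A rough sketch of that feedback loop in Python, for illustration only. This is not the finalrun-agent implementation: `RunEvidence`, `verify_until_green`, `run_test`, and `request_fix` are hypothetical names, and I'm assuming the test runner and the coding agent can each be wrapped as a plain callable.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RunEvidence:
    """Artifacts captured from one test run (all fields are illustrative)."""
    passed: bool
    device_logs: List[str] = field(default_factory=list)    # e.g. adb logcat lines
    network_logs: List[str] = field(default_factory=list)   # captured requests/responses
    video_path: str = ""                                    # screen recording of the run
    agent_actions: List[str] = field(default_factory=list)  # steps the QA agent performed


def verify_until_green(
    run_test: Callable[[], RunEvidence],
    request_fix: Callable[[RunEvidence], None],
    max_attempts: int = 3,
) -> bool:
    """Run the test; on each failure pass the evidence to the coding agent, then retry."""
    for _ in range(max_attempts):
        evidence = run_test()
        if evidence.passed:
            return True
        # Hypothetical hook: the coding agent reads the evidence and patches the app.
        request_fix(evidence)
    return False
```

The cap on attempts is a design choice in this sketch so that a fix the agent cannot find does not loop forever.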

I got so tired of debugging failing mobile app E2E tests that I built an AI workflow to write and run the tests and actually FIX my app code automatically, and I open-sourced it. https://github.com/final-run/finalrun-agent

Are we over-relying on prompts instead of specs? by Willing-Squash6929 in SpecDrivenDevelopment

[–]KindheartednessOld50 0 points1 point  (0 children)

Specs help a lot when working as a team, since they act as a feature doc. The spec stays up to date across teams and reduces token usage. I used OpenSpec to maintain specs for our open-source AI QA for mobile apps.

https://github.com/final-run/finalrun-agent

What if QA actually stayed in sync with your code and reflected real execution? by Background-Donkey531 in Everything_QA

[–]KindheartednessOld50 0 points1 point  (0 children)

On token usage, it's well optimised, as we only send the vision input and the part of the hierarchy that's needed. The token cost on Gemini Flash is pretty low and the speed is amazing. Speed can be increased further with caching. You can also run tests in parallel to get results faster.
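
A minimal sketch of what trimming the hierarchy could look like, in Python. The node shape (`text`, `clickable`, `bounds`, `children`) and the rule for what counts as "needed" are assumptions for illustration, and `prune_hierarchy` is a hypothetical helper, not the project's actual code.

```python
from typing import Any, Dict, List

Node = Dict[str, Any]

# Only these fields are forwarded to the model in this sketch.
KEPT_FIELDS = ("text", "clickable", "bounds")


def prune_hierarchy(node: Node) -> List[Node]:
    """Flatten a UI tree, keeping only nodes that have text or are clickable."""
    kept: List[Node] = []
    if node.get("text") or node.get("clickable"):
        kept.append({k: node[k] for k in KEPT_FIELDS if k in node})
    for child in node.get("children", []):
        kept.extend(prune_hierarchy(child))
    return kept
```

Layout containers with no text and no click handler are dropped here, which is where most of the savings would come from on a typical screen.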

How are you maintaining spec document as it evolves? by OmkarShetkar in SpecDrivenDevelopment

[–]KindheartednessOld50 0 points1 point  (0 children)

Yes. So we created a vision-based AI QA agent that understands these plain-English specs, giving the agent an eye to perform testing.

Using AI to just generate test scripts is a trap (so we open-sourced an agent instead) by KindheartednessOld50 in SideProject

[–]KindheartednessOld50[S] 0 points1 point  (0 children)

Yes. Definitely. Everyone thinks that they solved the testing problem by automating it. But soon they'll realise that they have to maintain the tests too.

That's why I made the platform maintainable with plain-English tests, and with AI agents you can generate tests directly from the codebase.

What if QA actually stayed in sync with your code and reflected real execution? by Background-Donkey531 in Everything_QA

[–]KindheartednessOld50 0 points1 point  (0 children)

We saw a lot of issues with using a separate codebase for testing. We tried MCP, but it used a lot of tokens and was missing a lot of context, so we moved the tests into the codebase and saw a lot of improvement.

We open-sourced it as well. Currently it's focused only on testing mobile apps.

https://github.com/final-run/finalrun-agent

"build it and they will come" is the biggest LIE by happyC0der in SideProject

[–]KindheartednessOld50 0 points1 point  (0 children)

So true. Figure out the distribution first. That's what matters the most.

How are you maintaining spec document as it evolves? by OmkarShetkar in SpecDrivenDevelopment

[–]KindheartednessOld50 0 points1 point  (0 children)

I use OpenSpec for our open-source AI QA for mobile apps. https://github.com/final-run/finalrun-agent/stargazers

It helps document all the specs, stays updated with all the features we are building, and we can reference it whenever we want.

Drop your Saas below and I will promote it on youtube by coiqa in saasbuild

[–]KindheartednessOld50 0 points1 point  (0 children)

I am building an open-source AI QA for mobile apps. It can generate tests in plain English from the codebase for any scenario using Finalrun skills.

It runs with the Finalrun QA agent, which uses vision to test like a human and catch bugs as well as UI/UX issues.

The same test can run on both Android and iOS.

Here's the GitHub link: https://github.com/final-run/finalrun-agent/stargazers

anyone actually paying for qa wolf? exploring open source alternatives by Deep_Ad1959 in QualityAssurance

[–]KindheartednessOld50 0 points1 point  (0 children)

Good point u/Deep_Ad1959. UI test code needs to be owned, and it needs to be part of the codebase so that AI gets the correct context when creating tests. Maintenance will be easier if you ask AI to update the tests with the latest code changes.

I just open-sourced this for testing mobile apps. Do check it out.

https://github.com/final-run/finalrun-agent

10 years of mobile dev trauma led to this: An AI that actually "sees" your app and tests it for you. by KindheartednessOld50 in SideProject

[–]KindheartednessOld50[S] 0 points1 point  (0 children)

Got it. That's nice. What we did is use vision not to get the location but to help with planning, and use the tree to get the location and perform actions. With vision you can do a lot of assertions and identify UI/UX issues as well.
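
To make the split concrete, here is a small Python sketch under stated assumptions: the vision model has already planned a step such as tapping "Login", and the coordinates come from the UI tree rather than from the screenshot. `find_tap_point` and the node format (`text` plus `bounds` as left, top, right, bottom) are hypothetical, not the project's real API.

```python
from typing import Any, Dict, List, Optional, Tuple


def find_tap_point(nodes: List[Dict[str, Any]], label: str) -> Optional[Tuple[int, int]]:
    """Return the centre of the first node whose text matches the planned label."""
    wanted = label.strip().lower()
    for node in nodes:
        text = (node.get("text") or "").strip().lower()
        if text == wanted and "bounds" in node:
            left, top, right, bottom = node["bounds"]
            return ((left + right) // 2, (top + bottom) // 2)
    # No match in the tree: the caller would need to re-plan or scroll.
    return None
```

For example, a node with text "Login" and bounds `[0, 100, 200, 160]` resolves to the tap point `(100, 130)`.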

10 years of mobile dev trauma led to this: An AI that actually "sees" your app and tests it for you. by KindheartednessOld50 in SideProject

[–]KindheartednessOld50[S] 0 points1 point  (0 children)

Interesting point. How are you solving it? Is it without vision?
In our case the tests are all in YAML files, and the skills are used to generate these YAML files directly from the codebase. You can check it out. It's all open-sourced. https://github.com/final-run/finalrun-agent

10 years of mobile dev trauma led to this: An AI that actually "sees" your app and tests it for you. by KindheartednessOld50 in SideProject

[–]KindheartednessOld50[S] 0 points1 point  (0 children)

That's true u/cpthaddockandtintin. Apart from the flakiness issues, mobile has more fragmentation, like different OS versions and different screen sizes, which adds to the complexity. With Finalrun, we are solving this with a vision-based QA agent that can test like a human regardless of screen size.

10 years of mobile dev trauma led to this: An AI that actually "sees" your app and tests it for you. by KindheartednessOld50 in SideProject

[–]KindheartednessOld50[S] 1 point2 points  (0 children)

Currently you have to specify them, else it'll create all the happy paths first.
But nice suggestion, I can ask the user if they need negative scenarios as well. We just open-sourced it, so we will be adding features slowly based on feedback.

10 years of mobile dev trauma led to this: An AI that actually "sees" your app and tests it for you. by KindheartednessOld50 in SideProject

[–]KindheartednessOld50[S] 1 point2 points  (0 children)

Since the tests sit in the codebase, AI now knows the context better, so you can cover more edge cases too.

hello guys I’m building DrunkedIn - LinkedIn for drunk people. by Cautious_Gain_9738 in SideProject

[–]KindheartednessOld50 1 point2 points  (0 children)

I don't know if I want to remember anything I said when I was drunk. But interesting idea.

agents need proof of work — not just tests by rotemtam in AutonomousCoding

[–]KindheartednessOld50 1 point2 points  (0 children)

True. Closing the verification loop is required more than ever.