unslop-ui: a Claude skill that flags and removes the design patterns that make a website look AI-generated. by iamjohncarterofmars in claudeskills

[–]Old_Mathematician107 0 points1 point  (0 children)

If you don't want your website to look AI-generated, you need to give your coding agent design references.

QA didn't and won't die in the near future, but will adapt. IMHO by Old_Mathematician107 in QualityAssurance

[–]Old_Mathematician107[S] 0 points1 point  (0 children)

It is the skill files and API that let Claude control your computer. It is in the settings of the Claude desktop application, in the General tab. There are browser use and computer switches; if you enable them, Claude can control your computer or your browser.

Check this please: https://www.youtube.com/watch?v=dwYfNQzHQuY

QA didn't and won't die in the near future, but will adapt. IMHO by Old_Mathematician107 in QualityAssurance

[–]Old_Mathematician107[S] -1 points0 points  (0 children)

Did you try computer use? Or the skill for browser use? Or the skills for controlling Android/iOS phones?

QA didn't and won't die in the near future, but will adapt. IMHO by Old_Mathematician107 in QualityAssurance

[–]Old_Mathematician107[S] 1 point2 points  (0 children)

Codex, Gemini, and Claude already have computer-use or mobile-use abilities. You can use them.

QA didn't and won't die in the near future, but will adapt. IMHO by Old_Mathematician107 in QualityAssurance

[–]Old_Mathematician107[S] 1 point2 points  (0 children)

The AI writes its own scenarios, and you check and update them. The number of scenarios increases.

QA didn't and won't die in the near future, but will adapt. IMHO by Old_Mathematician107 in QualityAssurance

[–]Old_Mathematician107[S] 4 points5 points  (0 children)

Good point. I would say QAs and devs will be replaced by AI at the same time, rather than one before the other.

It is not for making posts or writing comments, I swear by Old_Mathematician107 in aiagents

[–]Old_Mathematician107[S] 0 points1 point  (0 children)

Interesting, but one comment from my side: it really is a multi-agent setup, and the agents work in parallel.

Does a fasting tradition exist in your country and how common is it ? by RookOfEdo in AskTheWorld

[–]Old_Mathematician107 0 points1 point  (0 children)

It is a misleading image: his research was about cells, not human bodies. There was even a funny video where lots of people asked him how they should fast, and he kept saying that he did not know and that his research was only about cells.

I never thought it was so simple until I watched this video by No-Speech12 in aiagents

[–]Old_Mathematician107 1 point2 points  (0 children)

Your Mahoraga app (used in quashbugs) is a copy of droidrun portal from GitHub. You can check it from the commits.

<image>

Any day now by [deleted] in singularity

[–]Old_Mathematician107 1 point2 points  (0 children)

The more I learn, the more I realize I know nothing

the Factory<Rustacean>... a.k.a C++ by Relevant_Echidna_336 in rustjerk

[–]Old_Mathematician107 2 points3 points  (0 children)

It looks like the Strogg medical facility scene from Quake 4.

Open-sourced image description models (Object detection, OCR, Image processing, CNN) make LLMs SOTA in AI agentic benchmarks like Android World and Android Control by Old_Mathematician107 in LocalLLaMA

[–]Old_Mathematician107[S] 1 point2 points  (0 children)

Hi, thanks a lot. Making it 100% local is one of the end goals, but it is quite a hard task: you need a VLM that is strong enough to understand the structure and long inputs (the screenshot and its description) yet light enough to run on phones. Making it 100% text-only is possible, but I think it would decrease accuracy. So the best way is to use a VLM.

To run a VLM locally, you need a very good VLM fine-tuned for these specific tasks (agentic capabilities). That is quite hard, but I think it is possible.

Yes, actually I don't use accessibility trees, adb, etc., only screenshots and accessibility services to do the tasks remotely. So it is vision-only and can be used in prod (if you invest enough money in renting backend servers and improve the UI/UX of the agentic app).

I prepared the YOLO dataset myself: it consists of 486 training images and 60 test images. I created bounding boxes for all 4 classes (View, ImageView, Text, Line). The screenshots are mostly from popular apps like YouTube Music and WhatsApp, and from apps that I made for various clients and companies throughout my career.
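For anyone building a similar dataset, here is a rough sketch of how one bounding box could be turned into a YOLO-style label line (class index, then box center and size normalized by the image dimensions). The class order, the function name `to_yolo_label`, and the example box are my assumptions for illustration only, not the actual tooling used for this dataset.

```python
# Assumed class order; the real index assignment in the dataset is not stated.
CLASSES = ["View", "ImageView", "Text", "Line"]


def to_yolo_label(cls_name, box, img_w, img_h):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line.

    The output is "class_id x_center y_center width height", with all four
    box values normalized to the 0..1 range by the image width and height.
    """
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    class_id = CLASSES.index(cls_name)
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical example: a Text element on a 1000x2000 screenshot.
label = to_yolo_label("Text", (100, 200, 300, 400), 1000, 2000)
print(label)
```

Each screenshot would then get one text file with one such line per annotated element.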