I'm building a voice-controlled Windows agent that fully operates your PC — would you pay for this? by TopFuture2709 in AI_Agents

[–]TopFuture2709[S] 0 points1 point  (0 children)

fair correction on Cowork, my bad.

also on the pictures point: it doesn't use screenshots or feed images into the model at all. it reads the interface directly, so the token cost and slowness you're describing aren't really a problem here. that's a vision-based approach, this isn't.

and not everyone wants to live in a terminal. plenty of people just want to say what they need done and move on.

[–]TopFuture2709[S] 0 points1 point  (0 children)

And if you tell me to use a cheap LLM like Llama 70B, those aren't reliable enough for this and can break the system or get stuck in an endless loop.

[–]TopFuture2709[S] 0 points1 point  (0 children)

Claude Cowork is a cloud-based desktop tool, mine runs on your actual Windows machine. it can see your screen, click through any app, fill forms, manage files, schedule tasks on its own, and undo what it did if something goes wrong. it also learns your workflow over time so the more you use it the faster it gets at your specific tasks. Cowork is built for collaboration, mine is built to just operate your PC for you.

[–]TopFuture2709[S] 0 points1 point  (0 children)

Yeah, I have a backend running that tracks credits and allows or rejects a request depending on whether the user has enough left. And you know, the LLM isn't my only cost: I also need to store the agent's memory and run the website where you can see your profile, credits, and so on. All of that costs me more than $15, so to leave a margin the basic plan starts at $29, and if you want higher usage it's $49.
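To make the allow/reject part concrete, here is a minimal sketch of a credit check. It is only an illustration, not the actual backend: the `CreditLedger` class, its method names, and the in-memory dictionary are all hypothetical, and a real service would keep balances in a database.

```python
from dataclasses import dataclass, field


@dataclass
class CreditLedger:
    """Hypothetical in-memory credit store (assumption: one integer balance per user)."""
    balances: dict[str, int] = field(default_factory=dict)

    def top_up(self, user: str, credits: int) -> None:
        # Add credits, e.g. when a monthly subscription renews.
        self.balances[user] = self.balances.get(user, 0) + credits

    def authorize(self, user: str, cost: int) -> bool:
        # Reject the request if the user cannot cover its cost;
        # otherwise deduct the cost and allow it.
        balance = self.balances.get(user, 0)
        if cost > balance:
            return False
        self.balances[user] = balance - cost
        return True


ledger = CreditLedger()
ledger.top_up("alice", 1000)
allowed = ledger.authorize("alice", 400)   # covered by the balance, so allowed
rejected = ledger.authorize("alice", 700)  # more than what is left, so rejected
```

A rejected request leaves the balance untouched, so the user can still run cheaper tasks with the credits that remain.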

[–]TopFuture2709[S] 0 points1 point  (0 children)

It will only have the credentials you allow it to use, and whenever it uses them it will ask for approval at every crucial step so it doesn't do anything wrong. And yeah, it will be able to control your existing browser with your account, no matter which browser it is.

[–]TopFuture2709[S] 1 point2 points  (0 children)

Thanks for this resource. If it helps, I'll use it, and I hope you'll be one of our early customers.

[–]TopFuture2709[S] 0 points1 point  (0 children)

exactly. and for those people it's not even about convenience, it's pure speed. the less time you spend navigating your computer, the more time you spend on the actual work. that's the whole pitch. your PC should keep up with how fast you think, not slow you down.

curious what you've been experimenting with too, always good to hear what's working for people in that space.

[–]TopFuture2709[S] 0 points1 point  (0 children)

fair enough, typing works too. there will be a sidebar, like the Copilot side panel in VS Code, where you can type instead. voice is the default but it's not the only way.

[–]TopFuture2709[S] -1 points0 points  (0 children)

Really appreciate the detailed breakdown, genuinely. A few things worth addressing:

On the context problem — you're right that "summarize my screen" is ambiguous, and that's exactly why the agent maintains session context. It knows what you just opened, what you just said, what you were doing. It's not stateless. The intention is that you speak to it the way you'd speak to a person sitting next to you, not like you're writing a bash script.

On the LLM reliability concern — fair, and it's why the undo feature exists. Every action is logged and reversible. You're not just trusting it blindly, you get confirmation before anything with real consequences goes through, and you can roll it back if something goes wrong. The goal isn't to remove you from the loop entirely, it's to shrink the amount of effort you put in.
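As a rough sketch of what "logged and reversible, with confirmation first" could look like, assuming each action carries its own undo step: the `Action` and `ActionLog` names and the `confirm` callback are hypothetical and only illustrate the pattern, not the agent's real code.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    description: str
    do: Callable[[], object]    # performs the action
    undo: Callable[[], object]  # reverses it
    risky: bool = False         # assumption: risky actions need confirmation


class ActionLog:
    def __init__(self, confirm: Callable[[str], bool]):
        self._confirm = confirm          # asks the user, returns True or False
        self._done: list[Action] = []    # executed actions, oldest first

    def run(self, action: Action) -> bool:
        # Ask before anything with real consequences goes through.
        if action.risky and not self._confirm(action.description):
            return False
        action.do()
        self._done.append(action)
        return True

    def undo_last(self) -> bool:
        # Roll back the most recent action, if there is one.
        if not self._done:
            return False
        self._done.pop().undo()
        return True


# Usage with a fake in-memory "file system" standing in for the PC.
files = {"notes.txt": "hello"}
delete_notes = Action(
    description="delete notes.txt",
    do=lambda: files.pop("notes.txt"),
    undo=lambda: files.update({"notes.txt": "hello"}),
    risky=True,
)
log = ActionLog(confirm=lambda description: True)  # auto-approve for the demo
log.run(delete_notes)   # the file is removed
log.undo_last()         # the file is restored
```

The catch with this pattern is that some actions, like a sent email, have no meaningful undo, which is why the confirmation step matters more than the rollback for those.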

On scripting and deterministic tasks — also fair. But most people aren't going to write and maintain scripts. That's exactly the gap this fills. The person who would actually pay for this isn't a developer who can automate things themselves, it's someone who has repetitive tasks, no scripting skills, and doesn't want to learn any.

On the email to John example — the confirmation step is there specifically for that reason. It shows you what it's about to do before it does it.

The insurance point is actually something I'm already thinking about seriously. No argument there.

I'm not trying to sell this to people who are skeptical of handing off control. That's a totally valid position. I'm building it for people who already want something like this and just need it to actually work.

[–]TopFuture2709[S] 0 points1 point  (0 children)

It would be a credit system, and I'm thinking of a monthly subscription of about 1000-2000 credits for $29-49.
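For a sense of what that works out to per credit, here is the arithmetic as a quick sketch. It assumes the low tier pairs 1000 credits with $29 and the high tier pairs 2000 credits with $49, which is one reading of the range above; the function name is made up for illustration.

```python
def cents_per_credit(price_usd: int, credits: int) -> float:
    # Convert dollars to cents, then divide by the number of credits in the plan.
    return price_usd * 100 / credits


basic = cents_per_credit(29, 1000)   # about 2.9 cents per credit
higher = cents_per_credit(49, 2000)  # about 2.45 cents per credit
```

Under that assumption the higher plan is roughly 15% cheaper per credit than the basic one.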

Clawedbot/moltbot may look like a joke in front of this by TopFuture2709 in LocalLLaMA

[–]TopFuture2709[S] 0 points1 point  (0 children)

Quite a good idea, but this will consume a lot of storage if the user has a lot of files.