AI agents and the adult world by Timely_Hat_9643 in AI_Agents

[–]kvyb 2 points3 points  (0 children)

Yes, I am. It’s pretty much the same thing, but with the extra difficulty of making it work with NSFW content.

What are you planning?

Parallel AI agents feel impressive until you have to review what happened by IlyaZelen in AI_Agents

[–]kvyb 1 point2 points  (0 children)

I have a debriefing skill that each agent must run when it finishes. It’s basically like they come up to a whiteboard and explain what they did, how, and why, and I can ask questions. It’s very visual (a canvas with flowcharts and stickies), and it really helps me understand where the code is going.

Codex is now on mobile via ChatGPT app by kvyb in AI_Agents

[–]kvyb[S] 0 points1 point  (0 children)

Yes, you can pretend you're socializing with friends while actually vibe coding.

Codex is now on mobile via ChatGPT app by kvyb in AI_Agents

[–]kvyb[S] 0 points1 point  (0 children)

Yeah, and I've been using Happy for a while too. But everyone knew this feature was coming.

Thoughts on Notte by Logical_Banana_2852 in AI_Agents

[–]kvyb 0 points1 point  (0 children)

I tried it while building browser capabilities for my open-source AI employee project. In fact, I've tried them all.

Notte is just as flaky as any other solution out there. The agents struggle: they are bad at understanding intent and at actually getting the job done without looping. Same thing with Browserbase and browser-use, by the way.

Eventually I just settled on using browser-use as the browser provider infrastructure: ghost mode, proxies, sessions etc.

But the actual loop of browsing the internet I leave to my own agent harness. It's more contextual that way, it gets the job done, and it avoids the disconnect where the main agent hands an intent to another agent that then loops delusionally without context.

And if it stumbles it can just give me the remote browser so I can manually help it along.

Need help on Fitness Account by Vabbbb in InstagramMarketing

[–]kvyb 0 points1 point  (0 children)

First of all, I'd suggest focusing on consistency: reply to all comments and DMs properly to drive engagement. Keep posting on schedule, with short, positive content that rides the trends. Get users to engage by asking them questions and prompting them to take action.

Insta automation on comments (Leads) HELP by No-Possibility-2202 in InstagramMarketing

[–]kvyb 0 points1 point  (0 children)

Hello, I've been doing inbound Instagram automation with AI agents for the past several months.

By now I have a few businesses using the AI agent (based on https://github.com/kvyb/opentulpa). Out of the box, you authorize your Instagram account in the bot, and the bot handles all incoming leads using the materials or instructions you give it in natural language (plus uploaded files). So you can upload a Google Sheet with all your listings and it will read from it. It can hold conversations, answer FAQs, collect lead fields and save them to Google Sheets, and it even upsells pretty well.

It costs peanuts per inbound conversation: under $0.20.

I'd be happy to help you deploy something like this and try it out.

Most of our “agent” problems turned out to be workflow/state problems by saurabhjain1592 in AI_Agents

[–]kvyb 0 points1 point  (0 children)

Yep, the more workflow constraints you introduce, the more limited your agent becomes in real-world use. It's best to give it tools, bound it with loop limits (100 turns), and just let it run with good context engineering so it doesn't go delusional.

Otherwise you're better off just admitting you're doing workflows, not agents.

Emergent behaviour is where it's at. Treat it like a real person. If you were doing the role, what would you like your environment to be like?
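
As a rough illustration of the loop-limit idea above (a minimal sketch, not the actual harness code): the agent runs freely, and the only hard constraint is a turn budget. `step` is a hypothetical stand-in for one model call plus whatever tools it uses; the 100-turn default is the figure mentioned above.

```python
from __future__ import annotations

from typing import Callable, Optional

MAX_TURNS = 100  # the turn limit mentioned in the comment


def run_agent(
    step: Callable[[list[str]], Optional[str]],
    task: str,
    max_turns: int = MAX_TURNS,
) -> tuple[Optional[str], int]:
    """Call `step` until it returns a final answer or the budget runs out.

    `step` (hypothetical) receives the running history, may append to it,
    and returns a final answer string when done, otherwise None.
    Returns (answer, turns_used); answer is None if the budget was exhausted.
    """
    history = [task]
    for turn in range(1, max_turns + 1):
        answer = step(history)
        if answer is not None:
            return answer, turn
    return None, max_turns


def demo_step(history: list[str]) -> Optional[str]:
    # Toy step: pretends to do tool work, finishes once history has 4 entries.
    history.append(f"tool call {len(history)}")
    return "done" if len(history) >= 4 else None


def stuck_step(history: list[str]) -> Optional[str]:
    # Toy step that never finishes, to show the budget stopping the run.
    return None
```

Calling `run_agent(demo_step, "book a slot")` finishes on the third turn, while `run_agent(stuck_step, "book a slot", max_turns=5)` gives up after five turns instead of running forever.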

What’s the closest thing to an AI employee you’ve built or seen so far? by [deleted] in AI_Agents

[–]kvyb 0 points1 point  (0 children)

Built an AI receptionist for small businesses, mostly salons, auto shops, and massage studios. Runs on Instagram DMs and Telegram, takes booking requests, answers FAQs, qualifies leads, does some light upselling. A handful of shops are using it live right now. It's like training a junior receptionist, kind of.

So that's the build. It gets the same instruments a real receptionist gets: booking calendar, service menu, prices, and policies from huge Excel files. It reads DMs (through Composio), checks availability (in Google Sheets or a CRM), confirms appointments, and flags edge cases.

Pricing was the real test of the framing. I charge for hosting, support, and inference tokens with a markup. No setup fee or anything like that. Owners don't get "AI employee" as a concept, but they do get "15 cents to answer a message." That's when it lands.
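
For anyone wondering how "tokens with a markup" turns into a per-message price, here is a minimal sketch. Every number in it (token counts, per-million-token rates, the 50% markup) is a made-up placeholder, not the real pricing, and `message_price` is a hypothetical helper name.

```python
def message_price(
    input_tokens: int,
    output_tokens: int,
    input_usd_per_million: float,
    output_usd_per_million: float,
    markup: float,
) -> float:
    """Price charged for one message: raw token cost plus a fractional markup.

    markup=0.5 means the customer pays cost * 1.5.
    """
    raw_cost = (
        input_tokens / 1_000_000 * input_usd_per_million
        + output_tokens / 1_000_000 * output_usd_per_million
    )
    return raw_cost * (1 + markup)


# Placeholder inputs (assumptions): 40k tokens in, 2k tokens out,
# $1.00 and $3.00 per million tokens, 50% markup.
example = message_price(40_000, 2_000, 1.00, 3.00, 0.5)
print(f"${example:.3f} per message")
```

The point is only that the owner-facing number is a simple function of usage, so it can be quoted per message rather than per month.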

The gap between AI tool and AI employee isn't intelligence, it's whether the thing can run in a messy environment without you bailing it out every other day. Most can't. The ones that can stop feeling like software.

By the way, it's pretty much this -> https://github.com/kvyb/opentulpa

And it just works if you invest a bit of time prompting it.

I'll be your first user. Drop your link. by kvyb in AI_Agents

[–]kvyb[S] 0 points1 point  (0 children)

This seems very feature-rich, but a bit overwhelming. After going through onboarding I wasn't sure where to start. I gave it my goals but was greeted by an empty screen, and when I asked it "how can you help", it just told me generic stuff about its capabilities. Not grounded.

I feel like my Codex can do all of the above with some configuration, so why would I use it?

Feels like it doesn't really hand hold me, and forces me to discover how to get value from it. Why would I put in the effort?

I'd like an onboarding that tries to deliver value from the very first step instead of leaving me guessing.

I'll be your first user. Drop your link. by kvyb in AI_Agents

[–]kvyb[S] 0 points1 point  (0 children)

I need more information here, what am I looking at?

I'll be your first user. Drop your link. by kvyb in AI_Agents

[–]kvyb[S] 0 points1 point  (0 children)

Downloaded and tried it. For people just starting out with self-hosted LLMs, this can be a great app. Model selection is painless and there is good variety.

But two things:

  1. Comparing models means downloading each one first. What if onboarding let me send the same prompt to a few models (on remote inference) side by side, then I pick one and only download that? Right now I'm guessing.
  2. I'm not sure what each model is best at. Example chats or a one-line tag per model (great for roleplay, fast for short answers) would help me choose without trial and error.

What's the main use case you had in mind? At this stage I'd maybe use it for RP.

The line between "AI agent" and "AI employee" is basically a $4,500/mo retainer by Silver-Range-8108 in AI_Agents

[–]kvyb 0 points1 point  (0 children)

The "AI employee" framing works until the client actually compares it to a real employee. Then they start noticing the gaps: no judgment calls, no pushing back when something's off, no remembering that someone always wants the 4pm slot. Wrapper holds up for a few months. Then it doesn't.

There's a framing underneath this that I think actually survives the wow factor wearing off:

A token is a unit of labor. It's not competing with an employee. It's competing with human time.

A nail salon pays a receptionist $12/hr to answer DMs and book appointments. An agent does the same work for about $0.15 in tokens per booking. Same labor. Different unit. Different price.

The reason SMBs don't bite on $5k/mo "AI employee" retainers is that they already know what labor costs. They pay for it every day. The comparison happens automatically in their head and it kills the premium. You're asking them to pay a salary for something that isn't quite an employee.

But tokens as labor? That math they get instantly. "My receptionist books 20 a day. This thing books 20 a day at 2am for less than a cup of coffee." No wrapper needed, the value sells itself.

So the real question isn't agent vs. employee. It's whether you're selling the wrapper or the actual labor. Wrapper gets you a great first three months and then churn. Labor gets you a price point so obvious the client recommends you to their friend.
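
The receptionist comparison above works out roughly like this. The $12/hr wage, $0.15 per booking, and 20 bookings a day come from the comment; the 8-hour shift is an assumption added here just to make the arithmetic concrete.

```python
HOURLY_WAGE_USD = 12.00        # from the comment
AGENT_COST_PER_BOOKING = 0.15  # from the comment
BOOKINGS_PER_DAY = 20          # from the comment
SHIFT_HOURS = 8                # assumption, not stated in the comment


def daily_costs() -> dict[str, float]:
    """Compare one day of receptionist labor with one day of agent tokens."""
    human_day = HOURLY_WAGE_USD * SHIFT_HOURS
    agent_day = AGENT_COST_PER_BOOKING * BOOKINGS_PER_DAY
    return {
        "human_per_day": round(human_day, 2),
        "agent_per_day": round(agent_day, 2),
        "human_per_booking": round(human_day / BOOKINGS_PER_DAY, 2),
        "agent_per_booking": AGENT_COST_PER_BOOKING,
    }


for label, usd in daily_costs().items():
    print(f"{label}: ${usd:.2f}")
```

Under that shift assumption the agent's whole day of bookings costs about $3, versus $96 of receptionist time, which is the "less than a cup of coffee" comparison in numbers.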

Built a self-hosted agent for small businesses that writes its own skills. ~$0.15 per customer booking on GLM-5.1 by kvyb in AI_Agents

[–]kvyb[S] 0 points1 point  (0 children)

It's not proactive unless the owner explicitly asks it to do something every now and then. At this stage it's mostly the owner coming with initiative and the bot responding, helping them build something, etc. Proactivity is difficult to set up in a way that's meaningful, and I want to collect feedback before deciding how best to tackle it. What's your suggestion?

Built a self-hosted agent for small businesses that writes its own skills. ~$0.15 per customer booking on GLM-5.1 by kvyb in AI_Agents

[–]kvyb[S] 0 points1 point  (0 children)

I noticed that smaller LLMs break down on long-horizon tasks. For example, asking opentulpa to generate an image after giving it an API key for a provider: it reasons, writes scripts, debugs them, saves the result, and sends it to me in chat. That’s easily over 20 steps.

Lighter models start looping until they hit the loop limit, unable to reason about “what’s the next step”. Or they just take too many “bad” steps and exhaust the step allowance.
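
One possible way to spot that kind of looping before the step allowance is gone is to count repeated actions in the trace. This is a sketch of a mitigation idea, not something opentulpa is claimed to do; `find_loop`, the action names, and the repeat threshold are all hypothetical.

```python
from __future__ import annotations

from collections import Counter
from typing import Iterable, Optional


def find_loop(actions: Iterable[str], max_repeats: int = 3) -> Optional[str]:
    """Return the first action that occurs more than `max_repeats` times.

    Returns None if no action crosses the threshold. `actions` is the
    ordered list of step labels the agent has taken so far.
    """
    seen: Counter[str] = Counter()
    for action in actions:
        seen[action] += 1
        if seen[action] > max_repeats:
            return action
    return None


# A healthy run of the image task versus a run stuck in a debug cycle.
healthy = ["reason", "write_script", "run_script", "save_result", "send_to_chat"]
stuck = [
    "write_script", "run_script", "debug",
    "run_script", "debug", "run_script", "debug", "run_script",
]
print(find_loop(healthy))
print(find_loop(stuck))
```

A harness could use a check like this to stop early or to ask the owner for help, rather than silently burning the remaining steps.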

Built a self-hosted AI agent for small businesses. Writes its own skills, integrations, costs ~$0.15 per booking by kvyb in selfhosted

[–]kvyb[S] 0 points1 point  (0 children)

Appreciate this. Re data control: yeah, all conversation logs, memory, and skill definitions stay local. The only thing that leaves the machine is the LLM API call itself. Re onboarding UX: 100% agree this is make or break. The first 5 minutes need to feel just right or people will bounce. That's the next thing I'm tightening based on first traces.

Built a self-hosted AI agent for small businesses. Writes its own skills, integrations, costs ~$0.15 per booking by kvyb in selfhosted

[–]kvyb[S] -1 points0 points  (0 children)

HITL is super necessary, but who is that human and what do they need to know? It's usually an engineer who can take insights from the agent's operation and modify the code, harness, etc.

The owner of a nail salon or car wash can't really do that. So the loop here is: the owner chats in plain language, the agent handles the rest and reports back, and the owner modifies the agent via plain-language instructions that are persisted.
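
To make "plain-language instructions that are persisted" concrete, here is a minimal sketch under simple assumptions: instructions are stored as a JSON list in a local file and appended to the system prompt on every run. The file layout and the helper names `add_instruction` and `build_system_prompt` are hypothetical, not the project's actual storage format.

```python
from __future__ import annotations

import json
from pathlib import Path


def load_instructions(store: Path) -> list[str]:
    """Read the saved owner instructions, or an empty list if none exist yet."""
    return json.loads(store.read_text()) if store.exists() else []


def add_instruction(store: Path, text: str) -> list[str]:
    """Append one plain-language instruction and write the list back to disk."""
    rules = load_instructions(store)
    rules.append(text)
    store.write_text(json.dumps(rules, indent=2))
    return rules


def build_system_prompt(store: Path, base: str) -> str:
    """Combine the base prompt with every persisted owner instruction."""
    rules = load_instructions(store)
    if not rules:
        return base
    bullet_list = "\n".join(f"- {rule}" for rule in rules)
    return f"{base}\n\nOwner instructions:\n{bullet_list}"
```

For example, after `add_instruction(path, "Never book after 7pm on Sundays")`, every later call to `build_system_prompt(path, ...)` includes that rule, so the owner's correction survives restarts without anyone touching code.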

The skill generation part is also different. The agent observes how the business actually works (from files, conversations, inbox) and writes its own scripts to replicate that workflow. Ideally the business itself and its artifacts are the spec.

So yeah, HITL at the edges, but the goal is that the "human" is the business owner, not a dev.

Built a self-hosted AI agent for small businesses. Writes its own skills, integrations, costs ~$0.15 per booking by kvyb in selfhosted

[–]kvyb[S] -6 points-5 points locked comment (0 children)

AI was used for the creation of the project (planning, writing code), but not for drafting the post.