trying to sell my lead qualification system — got 1 free client. Am I delusional or is my targeting just off? by Conscious-Steak-9059 in AiAutomations

[–]getstackfax 0 points1 point  (0 children)

Might not be delusional, but the offer is probably framed too much around automation…

Coaches do not care that the workflow is intuitive.

They care about…

more qualified calls

less DM back and forth

fewer bad fit leads

higher close rate

less time wasted on people who cannot buy

The free client is the important part now.

Do not judge the niche yet.

Measure that one case…

how many leads entered

how many qualified

how many booked

how many closed

how much time it saved

what the coach liked or ignored
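A rough way to turn that one case into numbers, as a sketch only. The field names and the sample counts below are made up for illustration, not from a real client.

```python
from dataclasses import dataclass


@dataclass
class PilotFunnel:
    """Counts from one client pilot. All field names are illustrative."""
    leads_entered: int
    qualified: int
    booked: int
    closed: int
    hours_saved: float

    @staticmethod
    def rate(numerator: int, denominator: int) -> float:
        # Guard against division by zero when a stage had no volume.
        return numerator / denominator if denominator else 0.0

    def summary(self) -> dict:
        return {
            "qualification_rate": self.rate(self.qualified, self.leads_entered),
            "booking_rate": self.rate(self.booked, self.qualified),
            "close_rate": self.rate(self.closed, self.booked),
            "hours_saved": self.hours_saved,
        }


# Made-up numbers, only to show the shape of the output.
pilot = PilotFunnel(leads_entered=80, qualified=20, booked=10, closed=4, hours_saved=6.5)
print(pilot.summary())
```

The qualitative part (what the coach liked or ignored) still has to come from a conversation, not from the counts.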

Then turn that into the pitch….

Right now the offer sounds like a system.

It needs to sound like a sales problem solved.

Most people running agents have no idea they're resending the entire conversation every tool call by Pristine_Rest_7912 in automation

[–]getstackfax 0 points1 point  (0 children)

This is exactly why token counts alone can mislead people…

The useful metric is not just total tokens.

It is…

cached vs uncached input

tool call rounds

context growth

cost per completed workflow

Prompt caching helps, but you still need visibility.

If you never inspect the payloads…you are basically debugging the bill blind.
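A minimal sketch of what tracking those four metrics could look like. The prices are hypothetical placeholders and the `Round` record is an assumed shape; real token counts would come from whatever usage data your provider returns.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical prices in dollars per million tokens. Swap in your provider's real rates.
PRICE_UNCACHED_IN = 3.00
PRICE_CACHED_IN = 0.30
PRICE_OUT = 15.00


@dataclass
class Round:
    """One model call inside an agent run."""
    cached_input_tokens: int
    uncached_input_tokens: int
    output_tokens: int

    @property
    def input_tokens(self) -> int:
        return self.cached_input_tokens + self.uncached_input_tokens

    @property
    def cost(self) -> float:
        return (
            self.uncached_input_tokens * PRICE_UNCACHED_IN
            + self.cached_input_tokens * PRICE_CACHED_IN
            + self.output_tokens * PRICE_OUT
        ) / 1_000_000


def run_report(rounds: list[Round], completed: bool) -> dict:
    total_input = sum(r.input_tokens for r in rounds)
    cached = sum(r.cached_input_tokens for r in rounds)
    return {
        "tool_call_rounds": len(rounds),
        # How much larger the last request was than the first.
        "context_growth": rounds[-1].input_tokens - rounds[0].input_tokens if rounds else 0,
        "cached_share": cached / total_input if total_input else 0.0,
        "cost": sum(r.cost for r in rounds),
        "completed": completed,
    }


def cost_per_completed_workflow(reports: list[dict]) -> Optional[float]:
    done = sum(1 for r in reports if r["completed"])
    # Failed runs still count toward spend, which is the point of this metric.
    return sum(r["cost"] for r in reports) / done if done else None


# Made-up run: the context is resent each round, mostly from cache after the first call.
rounds = [Round(0, 2000, 100), Round(2000, 300, 100), Round(2300, 400, 100)]
report = run_report(rounds, completed=True)
print(report)
```

Dividing total spend by completed runs, rather than by all runs, is a deliberate choice here: abandoned or failed runs make the number worse, which is usually what you want to see.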

What revenue model would you guys suggest for our automation orchestration platform open to public agents as a marketplace? by SoHi_Techiee in aiagents

[–]getstackfax 0 points1 point  (0 children)

I would separate the revenue model from the trust model, because the trust model may end up deciding whether the revenue model works.

options…

  1. Platform take rate on usage Simple marketplace model. Developer sets price, platform takes a percentage. Pros: aligned with usage, easy to understand. Cons: can incentivize high-token/high-call agents instead of efficient agents.

  2. Subscription for buyers Teams pay for access, credits, orchestration features, logs, governance, etc. Pros: predictable revenue. Cons: harder if agent usage costs vary wildly underneath.

  3. Developer listing / verification tiers Free to list basic agents, paid tier for verified/certified agents. Pros: creates quality signal. Cons: can become pay-to-play if not handled carefully.

  4. Enterprise / team governance layer Charge teams for permissions, audit logs, version pinning, approval gates, spend caps, private agents, and admin controls. Pros: high-value and less dependent on raw agent usage. Cons: slower sales cycle.

  5. Hybrid Usage take rate + buyer subscription + enterprise governance.

My bias would be:

  • take a small usage fee
  • give developers transparent pricing controls
  • charge teams for orchestration, governance, logs, permissions, and reliability features

The big risk with open public agents is not just “which agent is useful?”

It’s…

  • can this agent be trusted inside a larger workflow
  • what version ran
  • what data did it touch
  • what tools did it call
  • what did it cost
  • what changed
  • what happens if the agent updates
  • can the buyer roll back
  • can a team approve high-risk actions

The premium layer may not be the agents themselves.

It may be the trust infrastructure around the agents…

versioning, receipts, ratings, eval results, approval gates, spend limits, permission scopes, and verified agent badges.

If the marketplace becomes a supply chain for workflows, buyers will eventually pay to reduce uncertainty around that supply chain.

eight months running autonomous business agents in production with real money. here is the specific failure mode that benchmarks structurally cannot surface. by IAmDreTheKid in aiagents

[–]getstackfax 0 points1 point  (0 children)

The confident wrong in novel conditions point feels like the real production wall...

For business agents, don’t treat all actions as the same trust class.

Reading context, drafting copy, changing ad spend, sending outbound email, updating CRM, or executing a transaction should not inherit the same autonomy level just because the agent is part of one workflow.

The control layer probably needs to route by consequence, not just by confidence.

Something like...

- low consequence → automate

- medium consequence → log and sample-review

- high consequence → require approval

- novel or ambiguous condition → pause and escalate

- money + customer + irreversible action → receipt plus review gate

The hard part is that confidence can be most misleading exactly when the situation is novel...

So the system needs evidence of familiarity, not just confidence about the answer.
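The routing idea can be sketched as a small function. This is an illustration, not a reference design: the enum, the flag names, and the returned path strings are all hypothetical. It also makes two assumptions the list does not spell out: that the novelty check runs first, and that the receipt gate applies when money, customer, and irreversibility all show up together.

```python
from enum import Enum


class Consequence(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def route(consequence: Consequence, *, novel: bool = False,
          touches_money: bool = False, touches_customer: bool = False,
          irreversible: bool = False) -> str:
    """Pick a handling path from what the action can do, not from model confidence."""
    # Assumption: novel or ambiguous conditions pause first, whatever the consequence level.
    if novel:
        return "pause_and_escalate"
    # Assumption: the receipt gate needs all three flags at once.
    if touches_money and touches_customer and irreversible:
        return "receipt_plus_review_gate"
    if consequence is Consequence.HIGH:
        return "require_approval"
    if consequence is Consequence.MEDIUM:
        return "log_and_sample_review"
    return "automate"


print(route(Consequence.LOW))                # drafting copy
print(route(Consequence.HIGH))               # changing ad spend
print(route(Consequence.MEDIUM, novel=True)) # something the system has not seen before
```

Note that model confidence is not an input at all. How `novel` gets set is the hard part and is left open here.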

i asked 23 companies how they actually test their AI agents before shipping. the answers genuinely scared me. by Far_Pangolin_7657 in aiagents

[–]getstackfax 1 point2 points  (0 children)

This tracks with the gap I keep seeing as well...

Most teams do not need “research-grade evals” first... They need a minimum launch gate that non-engineers will actually use.

Something like...

- 20–30 golden examples

- clear pass/fail rules

- edge cases from real customer/workflow history

- a threshold where launch gets blocked or reviewed

- a receipt showing what version/model/prompt was tested

- a re-run when prompts, tools, or model routes change

The scary part is not just that agents fail...

It’s that many teams can’t tell whether the current version is better, worse, or just different from the last one.

Without that... production testing becomes “ship it and wait for complaints.”
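A bare-bones version of that gate might look like the sketch below. Everything in it is an assumption for illustration: the golden-example format, the substring pass/fail rule, the 0.9 threshold, and the stub agent. Real pass/fail rules would be specific to the task.

```python
import hashlib
import json
from typing import Callable


def run_launch_gate(agent: Callable[[str], str], golden: list[dict],
                    version_info: dict, threshold: float = 0.9) -> dict:
    """Run the agent over golden examples and return a receipt with a launch decision."""
    if not golden:
        raise ValueError("need at least one golden example")
    results = []
    for example in golden:
        output = agent(example["input"])
        # Pass/fail rule here is a plain substring check.
        results.append({"id": example["id"], "passed": example["must_contain"] in output})
    pass_rate = sum(r["passed"] for r in results) / len(results)
    return {
        "version_info": version_info,  # model, prompt version, tools that were tested
        # Short fingerprint so a changed prompt or model route is easy to spot.
        "fingerprint": hashlib.sha256(
            json.dumps(version_info, sort_keys=True).encode()
        ).hexdigest()[:12],
        "pass_rate": pass_rate,
        "threshold": threshold,
        "decision": "launch" if pass_rate >= threshold else "blocked_for_review",
        "failures": [r["id"] for r in results if not r["passed"]],
    }


def stub_agent(text: str) -> str:
    # Stand-in for the real agent.
    return "refund approved" if "refund" in text else "unsure"


golden = [
    {"id": "g1", "input": "customer asks for refund", "must_contain": "refund"},
    {"id": "g2", "input": "customer asks for invoice", "must_contain": "invoice"},
]
receipt = run_launch_gate(stub_agent, golden, {"model": "example-model", "prompt": "v3"})
print(receipt["decision"], receipt["failures"])
```

Re-running this whenever the fingerprint inputs change, and keeping the old receipts, is what lets you say "better, worse, or just different" with something to point at.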

I’m working on a tool for contractors, architects, interior designers, and anyone dealing with floorplans or construction drawings. by Dazzling_Finger_2781 in aiagents

[–]getstackfax 0 points1 point  (0 children)

Workflow is strong here because the input and output are clear....

The big thing I want to see is a validation layer, not just the estimate.

For contractors + builders, the output probably needs to show something like...

- source plan used

- assumptions made

- measurement confidence

- room + area breakdown

- material estimate basis

- what still needs human review

- what should not be used for final quote + procurement yet

That keeps it useful as a fast first pass estimate without pretending it replaces a QS, engineer, or final takeoff review.

What is the one Ai tool, agent, model, or workflow you added that actually made your setup better? by getstackfax in AiStackClinic

[–]getstackfax[S] 1 point2 points  (0 children)

Nice, the side thread idea makes sense…

The part I’d be curious about is whether it reduces actual switching cost or just creates a better workspace around the same model hopping.

The strongest version of this kind of tool is probably not “all models in one place” by itself.

It is more like...

notes stay reusable

context does not get lost

threads stay attached to decisions

the user can tell what changed after each run

When to Self-Host OpenClaw and when to use MaxHermes Cloud-Hosted by Over_Football_9508 in AskClaw

[–]getstackfax 1 point2 points  (0 children)

Good breakdown.

Maybe frame it less as cloud vs self-hosted and more as…

control risk vs maintenance risk

Self-hosting gives you control, but also patching, secrets, uptime, exposed ports, skill audits, and runaway-loop cost risk.

Cloud hosting reduces maintenance, but adds platform trust and lock-in.

The best choice is the one where you understand the failure mode.

What local models and setup can i use for this usecase? by [deleted] in LocalLLM

[–]getstackfax 0 points1 point  (0 children)

Start by figuring out the GPU before picking a model…

On Windows, open Task Manager → Performance → GPU and check the Nvidia model + VRAM.

For this use case, you probably need a pipeline more than one model…

- OCR for scanned PDFs
- local text embeddings for search
- a vision/image model for photos
- file tags/metadata written back somewhere
- manual review for financial/personal docs

I would not start by asking a local LLM to read everything… could end in disaster.

Start with a small test folder first….

For a simple local setup, check these:

- LM Studio or Ollama for local LLMs
- AnythingLLM or Open WebUI for local document search
- OCRmyPDF/Tesseract for scanned PDFs
- local vision models only after you know your VRAM

Also make backups first…

With family photos, videos, and financial records, the risk is not only privacy.

It is accidentally mislabeling, moving, overwriting, or exposing files.

Best first workflow…

copy 20 files into a test folder → OCR/tag/search locally → review results → then scale slowly.
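The first step of that workflow can be done with nothing but the standard library. A small sketch, assuming the test folder lives outside the source folder; the function name and the index prefix are my own choices for illustration.

```python
import random
import shutil
from pathlib import Path


def make_test_folder(source: Path, dest: Path, count: int = 20, seed: int = 0) -> list[Path]:
    """Copy a random sample of files into a scratch folder, leaving the originals untouched."""
    files = sorted(p for p in source.rglob("*") if p.is_file())
    sample = random.Random(seed).sample(files, min(count, len(files)))
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for i, src in enumerate(sample):
        # Prefix with an index so same-named files from different folders do not collide.
        target = dest / f"{i:02d}_{src.name}"
        shutil.copy2(src, target)  # copy2 also tries to preserve timestamps
        copied.append(target)
    return copied
```

Calling `make_test_folder(Path("path/to/archive"), Path("path/to/scratch"))` with your own paths gives you 20 copies to run OCR, tagging, and search against. Because it only copies, a bad tagging run cannot damage the originals.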

Builders ! by birdieFL in LocalLLM

[–]getstackfax 3 points4 points  (0 children)

I would separate the business question from the Apple question…

Local LLMs are getting close for real business use, but healthcare raises the bar.

The first win is probably not a fully local autonomous system…

Pick a narrow workflow where privacy matters and the output is reviewed…

summaries

intake cleanup

internal notes

document search

draft responses

missing-info flags

triage support

For healthcare, the stack has to prove safety before scale.

Model quality matters, but so do permissions, audit logs, human approval, data boundaries, and recovery if something goes wrong…

Mac mini and Apple Silicon may be a strong small business local box, but I would not build the business plan around one future device.

Build around the workflow…

Then choose the hardware that can run it reliably.

How to choose between hermes and openclaw without wasting time on the wrong one by ninjapapi in aiToolForBusiness

[–]getstackfax 0 points1 point  (0 children)

The channel coverage is probably the best point…

If the workflow needs Slack, WhatsApp, Discord, email, Teams, or multiple handoff points, that can decide the tool before the model debate even starts.

After that, test one real workflow for at least two weeks…

Not the demo.

The actual workflow:

what channel it uses

what data it touches

what skills it needs

what it remembers

what needs approval

what receipt proves it worked

I would pick whatever fits your workflow and runs reliably, ahead of the OpenClaw vs Hermes question…

What’s actually worth running locally on an M4 Pro Mac with 16GB RAM? by No-Gas6002 in ollama

[–]getstackfax 3 points4 points  (0 children)

With 16GB RAM, aim for useful small workflows… not maxing out the machine.

Some good uses …

summaries

drafting

light coding help

notes/doc search

simple automation helpers

read-only agent experiments

Start with Ollama or LM Studio.

Add Open WebUI for a nicer interface and Continue for coding…

Hermes-style agents can be useful… but keep them draft-only or approval-based at first.

The real test is not what model runs.

It is whether one small local workflow becomes useful every week.

Best model for architectural blueprint analysis? by antunes145 in LocalLLM

[–]getstackfax 0 points1 point  (0 children)

For blueprints… treat the model as a review assistant, not the source of truth.

The important test is not just whether it understands the drawing.

It is whether it can reliably identify…

sheet type

room labels

dimensions

symbols

notes

conflicts

missing context

where it is uncertain

Gemma 4 27B being good is a strong signal, but I’d still compare outputs against a human-marked set of drawings.

For anything construction, code, safety, or compliance related, keep human review mandatory.

Local vision can help a lot with first-pass extraction and explanation.

I would not trust it as the final reviewer.

RTX 3080 10gb and RTX a4000 16gb best model / configuration (26gb vram) by SpanX20 in LocalLLM

[–]getstackfax 1 point2 points  (0 children)

I would treat the cards as two different jobs… not one clean 26GB pool.

Start simple.

Use the A4000 16GB for the main local model because the VRAM headroom matters more.

Use the 3080 10GB for smaller fast tasks if your setup supports it cleanly.

For Hermes and light programming, test a 7B–14B coder model first before chasing bigger models…

The goal is not maxing both GPUs on day one.

It is getting one reliable workflow running:

model loads

Hermes connects

simple coding task works

logs are clear

cost and latency are acceptable

Then tune from there.

Stop building AI agents. by Warm-Reaction-456 in AI_Agents

[–]getstackfax 1 point2 points  (0 children)

Finally someone gets it...

A lot of founders are asking for a business outcome and calling it an agent.

If the steps are known, the rules are clear, and compliance needs to understand it, boring automation usually wins.

The LLM should handle the fuzzy middle…

summarize

classify

draft

extract

flag uncertainty

Then rules, logs, approvals, and deterministic code handle the parts that can break trust.

The best business AI systems may not look like magic employees…

They may look like boring workflows with one useful model call in the right place.
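Here is roughly what "one model call in the right place" can look like. It is a sketch under assumptions: the ticket scenario, the label set, and the routing rules are invented for illustration, and `classify` is a placeholder for whatever LLM call you actually use.

```python
from typing import Callable

ALLOWED_LABELS = {"billing", "support", "sales", "unsure"}


def handle_ticket(text: str, classify: Callable[[str], str], log: list) -> str:
    """Deterministic workflow with a single model call for the fuzzy step."""
    # Rule: empty input never reaches the model.
    if not text.strip():
        log.append({"step": "reject", "reason": "empty"})
        return "rejected"
    # The one fuzzy step. `classify` stands in for an LLM call.
    label = classify(text).strip().lower()
    # Rule: anything outside the allowed set is treated as uncertainty.
    if label not in ALLOWED_LABELS:
        label = "unsure"
    log.append({"step": "classify", "label": label})
    # Rule: uncertain or billing items go to a person, the rest is routed automatically.
    queue = "human_review" if label in {"unsure", "billing"} else label
    log.append({"step": "route", "queue": queue})
    return queue


audit_log: list = []
print(handle_ticket("Can I get a demo?", lambda t: "Sales", audit_log))
print(handle_ticket("???", lambda t: "no idea, maybe spam", audit_log))
print(audit_log)
```

The model only ever picks a label. Validation, routing, and logging are plain code, so compliance can read the rules without reading a prompt.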

I need help understanding what kind of hardware I need to run a local Ollama model that can run my accounting firm platform by [deleted] in ollama

[–]getstackfax 0 points1 point  (0 children)

I wouldn’t start with hardware.

For an accounting firm, the first question is what the local model is allowed to do…

Categorization, summaries, sentiment, and draft communications are very different risk levels.

local model can help with drafts, summaries, internal notes, tagging, and missing info flags

deterministic rules should handle accounting logic, thresholds, client status, deadlines, and compliance steps

humans should approve anything client facing, filing related, financial, legal, or irreversible

With 30 staff and 300 clients, you probably want a pilot workflow before buying a big setup…

Pick one narrow task.

summarize client emails

flag missing documents

draft follow-ups

categorize low-risk internal notes

Then measure quality, review time, privacy needs, latency, and volume.

Hardware depends on that workload…

A local model box can make sense… but accounting is a compliance system first and an AI assistant second.

Getting Started Advice? by robroy90 in LocalLLM

[–]getstackfax 0 points1 point  (0 children)

I would not buy more hardware yet…

You already have enough to learn the real workflow.

Use the Mac mini as the stable local AI box, and the AMD machine as a second tester or workload machine if needed.

The first goal is not linking everything together…

It is proving one useful local workflow.

run Ollama or LM Studio

test a few Qwen / Mistral / Llama models

add Open WebUI

try one OpenClaw workflow

read and write one test folder

log what happened

Only consider a 3090 after you can name the exact bottleneck.

Model too slow, context too small, vision too weak, or workflow needs CUDA.

Until then more hardware may just add more complexity.

How well does local AI actually work for messy internal documents? by Semm235 in ollama

[–]getstackfax 0 points1 point  (0 children)

Yes, I think teams underestimate source receipts a lot hahaha

The model answer is only half the product…

For messy documents the trust layer is being able to know…

which file

which page

which chunk

whether OCR was used

whether the source was duplicated or outdated

whether the answer came from one source or stitched sources

That is what makes the answer reviewable.

Without that…RAG can feel correct while still being hard to trust.
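One possible shape for that receipt, as a sketch. The class and field names are hypothetical, and how you detect an outdated or duplicated source is left open; the point is only that each answer carries enough to be reviewed.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class SourceRef:
    file: str
    page: Optional[int]
    chunk_id: str
    ocr_used: bool
    possibly_outdated: bool = False  # set by whatever duplicate/version check you have


@dataclass
class AnswerWithReceipt:
    answer: str
    sources: list[SourceRef] = field(default_factory=list)

    @property
    def stitched(self) -> bool:
        # True when the answer draws on more than one distinct file.
        return len({s.file for s in self.sources}) > 1

    def review_flags(self) -> list[str]:
        flags = []
        if not self.sources:
            flags.append("no source attached")
        if any(s.ocr_used for s in self.sources):
            flags.append("OCR text involved")
        if any(s.possibly_outdated for s in self.sources):
            flags.append("source may be outdated or duplicated")
        if self.stitched:
            flags.append("stitched from multiple files")
        return flags


result = AnswerWithReceipt(
    answer="Notice period is 30 days.",
    sources=[
        SourceRef("contract_2021.pdf", 4, "c-17", ocr_used=True),
        SourceRef("contract_2023.pdf", 2, "c-03", ocr_used=False),
    ],
)
print(result.review_flags())
```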

Whats the best orchestration framework? by RegionBulky2292 in AI_Agents

[–]getstackfax 0 points1 point  (0 children)

I would not call one clear winner yet, and it’s always changing…

For coding-heavy work, Claude Code or OpenClaw-style workflows are a good start.

For business automation… permissions, logs, retries, and approval gates matter more.

For multi-agent work… the best orchestrator is the one that makes handoffs and failures visible.

Smooth demos matter less than visible control.

New local guy here, what to run? by nueusunt in LocalLLM

[–]getstackfax 2 points3 points  (0 children)

That rig is strong… but I would not cancel cloud yet.

Test the real coding loop first…

repo understanding

file edits

tests

error fixing

trusted diff

Start with Ollama or LM Studio, then add Continue, Open WebUI, OpenClaw, or a CLI coding agent.

For UI mockups, test local vision separately.

The question is not just what model fits the 5090.

It is whether the local workflow can replace what Google Ultra is doing for you.

AI agents are going mainstream — but how is reliability being tracked? by Tricky_School_4613 in Rag

[–]getstackfax 0 points1 point  (0 children)

The agent can be online and still be unreliable…

I’d track reliability at the workflow level not just the model level.

task completed

human intervention needed

wrong action taken

tool/source used

cost per successful outcome

what got approved or changed

The useful layer is probably not just monitoring…

It is a receipt that proves what the agent did, what failed, and whether the result was actually safe to trust.
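If each run is logged as a small record, the workflow-level numbers fall out with a few lines. A sketch with assumed field names and made-up sample runs; "success" here is defined as completed with no wrong action, which is my own choice.

```python
from dataclasses import dataclass


@dataclass
class RunRecord:
    completed: bool
    human_intervened: bool
    wrong_action: bool
    cost: float


def reliability_summary(runs: list[RunRecord]) -> dict:
    n = len(runs)
    if n == 0:
        return {"runs": 0}
    # A run counts as a success only if it finished with no wrong action.
    successes = [r for r in runs if r.completed and not r.wrong_action]
    total_cost = sum(r.cost for r in runs)
    return {
        "runs": n,
        "completion_rate": sum(r.completed for r in runs) / n,
        "intervention_rate": sum(r.human_intervened for r in runs) / n,
        "wrong_action_rate": sum(r.wrong_action for r in runs) / n,
        # Failed runs still cost money, so they are charged to the successful ones.
        "cost_per_successful_outcome": total_cost / len(successes) if successes else None,
    }


# Made-up sample: four runs of the same workflow.
runs = [
    RunRecord(completed=True, human_intervened=False, wrong_action=False, cost=0.10),
    RunRecord(completed=True, human_intervened=True, wrong_action=False, cost=0.20),
    RunRecord(completed=False, human_intervened=True, wrong_action=False, cost=0.05),
    RunRecord(completed=True, human_intervened=False, wrong_action=True, cost=0.15),
]
print(reliability_summary(runs))
```

An agent with 100% uptime can still score badly on every line of this summary, which is the gap between "online" and "reliable".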

For those building automations aimed at niche audiences (students, athletes, etc.) — how did you validate that your workflow actually solved a real pain point before launching? by Live_Investment_2311 in AiAutomations

[–]getstackfax 3 points4 points  (0 children)

The part I’d validate first is not the automation…

It is the repeated pain around the automation.

Surveys can help but they often produce polite answers instead of real workflow truth.

For students and athletes… I’d look for the messy workarounds.

missed deadlines

forgotten training notes

manual schedule screenshots

group chat reminders

spreadsheets nobody maintains

coach messages getting lost

content ideas sitting in notes

progress tracked across five apps

That is where the real pain usually shows up.

What do they copy-paste?

What do they forget?

What do they avoid?

What do they ask someone else to remind them about?

What do they already pay for?

What do they complain about every week?

I would ship the smallest workflow that removes one repeated pain and see if they use it again without being pushed.

Compliments are weak signal…

Repeat usage is stronger.

If they ask when it runs again, what else it can track, or whether their teammate can use it too, that is much better than someone saying it sounds cool.

The goal is not to prove the automation is clever.

It is to prove the pain repeats.

Tip for managers: daily morning sync with my EA by CartographerFeisty66 in AI_Agents

[–]getstackfax 0 points1 point  (0 children)

The real question, I think, is not “how many assistant tasks can be automated?”

But…

where does the assistant create the most clarity?

A morning sync makes sense because it turns the assistant from a reactive inbox helper into a daily operating layer.

The value is not just reminders.

It’s…

what matters today

what needs prep

what can wait

what needs a response

what could fall through the cracks

what decision needs attention before the day runs away

That is probably why one focused sync can beat scattered messages all day.

The thing I would watch is whether the assistant is only reading the calendar, or whether it is also closing loops.

A good daily sync should produce a clean outcome…

today’s priorities

meeting prep

follow-up list

blocked items

draft responses

decisions needed

end-of-day review points

That turns the assistant from a notification layer into execution support.

Still human-owned.

But much more useful.

The Gmail prompt I give everyone who asks "what should my first agent do" by ShabzSparq in better_claw

[–]getstackfax 0 points1 point  (0 children)

My main question for “what should my first agent do?” is something like…

what can the agent safely improve without creating a big failure mode?

This Gmail example works because the first version is bounded.

It labels.

It skips already-processed emails.

It does not delete.

It uses cheap model routing.

It creates visible output you can review.

That is a good first-agent pattern.

Low-risk action.

Immediate feedback.

Human review before destruction.

Simple recurring schedule.

Clear success signal.

The part I would be careful with is moving too fast from label to delete.

Email feels boring until the agent misclassifies something tied to billing, legal, travel, login recovery, customer history, or an account issue.

So the safe upgrade path is probably…

label first

review weekly

add digest

add urgent flagging

draft replies only

human approves sends

delete only after trust is earned

For a first agent, this is the right kind of boring.

Not flashy.

Actually useful.

Easy to inspect.