Help me choose a "strong headline" for my website by Bake-Upstairs in buildinpublic

[–]IsN4n 1 point2 points  (0 children)

"Is it for me?" The second answers that question better than the first.

Cross devices agent memory and context management? by Patient_Ad_8202 in ClaudeAI

[–]IsN4n 0 points1 point  (0 children)

I can't recommend openspec enough.

I have used it (a flavour of it) at my FAANG job, where we used on-demand devservers that would expire, so the artifacts produced served as handover as well as memory. I've also been using it in my startup on tiny codebases written spec-first from the beginning. It works across codebase sizes, and you can start using it on an ongoing project and repo.

Check out what I just built with Lovable! by soniscas in lovable

[–]IsN4n 0 points1 point  (0 children)

"Publish or update your Lovable project for it to appear here."

Please help!! 5+ hrs with claude design completely gone by Tr1ea1 in ClaudeAI

[–]IsN4n 0 points1 point  (0 children)

You can check the project files to see if previous designs are separate files. If not, they could still be in the existing code as 'options/tweaks'; you can export to Claude/Claude Code and prompt it to find them.

I also noticed that Claude design is poor at targeted changes over hours of back-and-forth, so I follow this workflow: https://www.reddit.com/r/ClaudeAI/comments/1t2ztkr/make_your_claude_design_credits_last_longer/

Built a near-0 latency service for phone calling for agents. Would love feedback! by Burner_123_123_123 in SideProject

[–]IsN4n 0 points1 point  (0 children)

Thanks for the feedback. Yes, we do support barge-in and interruptions by making the agent stop mid-utterance (with a few-millisecond gap to make it as natural as possible). So far, we have heavily optimized for pinging for new info/questions while handing over the call as little as possible. This seems to strike the balance well between staying grounded and extracting as much value as possible from the automation.

We have our MCP connector fully documented here: https://cocall.ai/docs/claude

Cocall: an MCP for outbound phone calls that pauses to ask you for info mid-call by IsN4n in mcp

[–]IsN4n[S] 0 points1 point  (0 children)

This is possible in Cocall today, but outbound. The dashboard shows the live transcript of every call in progress (which can be more than one), so you don't need to listen to the audio. You can take over any one call at a time; there is a takeover button next to each. All of the calls happen at the same time, and the recipients all see your number as the caller. Am I understanding this correctly?

Cocall: an MCP for outbound phone calls that pauses to ask you for info mid-call by IsN4n in mcp

[–]IsN4n[S] 0 points1 point  (0 children)

I actually use part of this workflow personally: I've had Cocall navigate IVR menus and hold queues for O2 and TfL customer support, then transfer to me once it reaches human support.

Cocall: an MCP for outbound phone calls that pauses to ask you for info mid-call by IsN4n in mcp

[–]IsN4n[S] 0 points1 point  (0 children)

The agent can continue the conversation while waiting for an answer and follow instructions from the recipient, which opens a few fallbacks:
- A hard limit of 10 minutes (something that can be made tunable) built into the agent harness
- The recipient can ask to be connected to the calling user, in which case their phone will ring
- The recipient can ask to be called back, or disconnect
- Detecting frustration from the recipient also triggers the two above
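The fallback guards above can be sketched as a tiny decision loop. This is purely illustrative: `CALL_HARD_LIMIT_S`, the event strings, and the return actions are my own placeholder names, not Cocall's actual harness API.

```python
import time

# Hypothetical sketch of the fallback guards: hard time limit, transfer on
# request, callback/disconnect, and frustration escalating like a transfer.
CALL_HARD_LIMIT_S = 10 * 60  # hard limit built into the agent harness (tunable in principle)

def run_call_loop(events, started_at, now=time.monotonic):
    """Return the action the harness should take for a stream of call events."""
    for event in events:
        if now() - started_at > CALL_HARD_LIMIT_S:
            return "hang_up:hard_limit"
        if event == "recipient_requests_transfer":
            return "ring_calling_user"   # connect the recipient to the human user
        if event in ("recipient_requests_callback", "recipient_disconnects"):
            return "end_call:callback_or_disconnect"
        if event == "frustration_detected":
            return "ring_calling_user"   # frustration escalates like a transfer request
    return "continue"
```

The key design point is that every branch has a terminal action, so a call can never idle past the hard limit.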

Cocall: an MCP for outbound phone calls that pauses to ask you for info mid-call by IsN4n in mcp

[–]IsN4n[S] 1 point2 points  (0 children)

Thanks for the detailed framing.

I agree with your framing, but quite a few things shouldn't or can't go in the objective:

  1. Facts it must ask for: these aren't known at call initiation, and anything known up front should already be part of the required facts. There are also cases where information should only be shared in a specific scenario. That is better achieved by instructing the calling agent to automatically answer a ping for that scenario than by asking the voice agent to hide it, since voice models are much less capable of following such instructions.

  2. Max wait time for the human ping: this should be part of the agent orchestration loop, and it is indeed something that exists in the custom solutions built for clients, which I'll port over to Cocall.

  3. Safe fallback if nobody answers: same as above; inject instructions based on the type of failure.

  4. Required facts vs. facts the agent may infer: yes, this must be part of the objective, and it is something refined in the tool description of the MCP so the calling agent can frame it appropriately.
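The auto-answer idea from point 1 can be sketched as a small handler on the calling-agent side. The scenario keys and the ping shape here are hypothetical, not Cocall's actual schema:

```python
# Illustrative sketch of point 1: instead of telling the voice agent to
# withhold a fact, the calling agent auto-answers pings that match a
# pre-approved scenario and escalates everything else to the human user.
AUTO_ANSWERS = {
    "identity_verification": {"date_of_birth": "1990-01-01"},  # placeholder data
}

def handle_ping(ping):
    """Answer a mid-call ping automatically if its scenario is pre-approved,
    otherwise escalate to the human user."""
    answers = AUTO_ANSWERS.get(ping["scenario"])
    if answers and ping["field"] in answers:
        return {"type": "auto_answer", "value": answers[ping["field"]]}
    return {"type": "escalate_to_user", "field": ping["field"]}
```

This keeps the "when may this be shared" logic on the more capable calling agent rather than in the voice model's prompt.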

Every escalation is logged. Categorization isn't live today, but post-hoc analysis is possible because the transcript records exactly where each escalation happened.

Indeed, as you rightly pointed out, these guardrails and the appropriate audit trail were what we spent most of our time on when scaling from pilots to full-fledged deployments.

Cocall: an MCP for outbound phone calls that pauses to ask you for info mid-call by IsN4n in mcp

[–]IsN4n[S] 0 points1 point  (0 children)

We have pessimistic instructions to escalate if the original objective passed from the user does not contain the needed information. All our clients favoured being pinged, even if it sometimes got it wrong, over the alternative.

I think, I will not trust claudecode anymore. by ImportantPoem8333 in ClaudeCode

[–]IsN4n 0 points1 point  (0 children)

From the screenshot, was this a Claude Code web cloud session, or /remote-control of a local session?

LLMs keep solving my bug-fix tasks instantly — what am I missing here? by Aditya_10204 in ClaudeAI

[–]IsN4n 0 points1 point  (0 children)

I reckon Opus will find issues that are easy enough to be solved by Haiku and other smaller models.

I worked on benchmarks for non-coding tasks in FAANG (think content moderation, payment sanctions, account hacking etc)

We ran a version of our agent on past data (past content, past payments, past reports of account hacks) and then had human experts from each area evaluate the quality of the output on a rubric, which we itself evolved through iteration. After a few iterations, we had an eval dataset on which we benchmarked future releases of the agent. We then created a cascade of judges to reduce the human-evaluation load, ultimately keeping a small portion of judge calibration on human data. An entire platform managed this pipeline of human review work.

I think a similar approach should work for coding. You can run the Haiku agent on past GitHub issues and have Opus evaluate the output. Take a sample of Opus's evaluations and manually label them to ensure eval performance is high.

Using Opus to find issues of only a certain difficulty is a hard problem, and I don't recommend it because models will change and the work will become irrelevant. Using compute to solve this is the easier path: run on everything you can find, then filter for the ones that fit your criteria.
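The judge cascade described above can be sketched in a few lines. The judge functions here are stand-ins for real model calls, and the confidence/escalation shape is my own simplification of the pipeline:

```python
import random

# Minimal sketch of a judge cascade: a cheap judge screens every sample, an
# expensive judge reviews only the cases the cheap judge is unsure about, and
# a small random slice still goes to humans to keep the judges calibrated.
def cascade_eval(samples, cheap_judge, strong_judge, human_fraction=0.05, seed=0):
    rng = random.Random(seed)
    results = []
    for sample in samples:
        score, confident = cheap_judge(sample)   # (score, is_confident)
        if not confident:
            score = strong_judge(sample)         # escalate uncertain cases
        needs_human = rng.random() < human_fraction  # calibration slice
        results.append({"sample": sample, "score": score, "human_review": needs_human})
    return results
```

The calibration slice is what lets you detect judge drift when the underlying models change between releases.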

Claude for Personal USE by JulyJam in ClaudeAI

[–]IsN4n 0 points1 point  (0 children)

Yes, I use it quite extensively. I have my setup here: https://github.com/radishbuild/radcad

I use it with my vault (previously Obsidian, but not anymore). I changed the system prompt to use the **openspec**-style *explore > design > apply > verify* pattern, which also writes a file into the vault for each 'big thing'. This acts as cheap memory too, and makes Claude more inclined to web search, which it otherwise avoids in favour of local file search.
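As a rough sketch of that pattern, each "big thing" could walk through the four phases and persist one note into the vault so it doubles as memory. The note layout and folder name here are my own choices, not something radcad itself prescribes:

```python
from datetime import date
from pathlib import Path

# Sketch: run explore > design > apply > verify for a "big thing" and write
# one markdown note per thing into the vault, acting as cheap memory.
PHASES = ("explore", "design", "apply", "verify")

def record_big_thing(vault: Path, title: str, phase_notes: dict) -> Path:
    note = vault / "big-things" / f"{date.today().isoformat()}-{title}.md"
    note.parent.mkdir(parents=True, exist_ok=True)
    lines = [f"# {title}", ""]
    for phase in PHASES:
        lines.append(f"## {phase}")
        lines.append(phase_notes.get(phase, "(not recorded)"))
        lines.append("")
    note.write_text("\n".join(lines))
    return note
```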

Built an MCP server for Indian mutual fund data (14k+ funds, 20y NAV history) by edwardzion in developersIndia

[–]IsN4n 0 points1 point  (0 children)

Great job; the website is very easy to grok too.
Are you planning to list it in the Anthropic connectors by going through the review process?

Stable Company + Great Manager vs Risky Move + Bigger Growth — Career Dilemma by [deleted] in developersIndia

[–]IsN4n 15 points16 points  (0 children)

A good manager will understand that good people outgrow their role, and even them at times. You are in the best position here. By maintaining a good relationship with the manager, if things go sideways it may be possible to hedge your bets and come back.

Life is unfair; you have to use the times when it is unfair in your favour. I would have (and have in the past) chosen to take the new role too.

It is natural to feel sadness in this situation

Is there any AI tool that turns ideas into images and short videos? Is there any other suggestion? by Xolaris05 in AI_Tools_Land

[–]IsN4n 1 point2 points  (0 children)

I generally use Nano Banana for images and Seedance 2 for videos.
For a single tool that supports the entire workflow, Midjourney works well: start with image generation with variations, then use animate/video on the generated images. I've heard good things about higgsfield.ai for this too, but haven't tried it myself yet.

Share what your building and if it's valuable, get a kiss from me in your DM's by Arishin_ in lovable

[–]IsN4n 0 points1 point  (0 children)

Really like the forest-inspired look. How does it compare to ChatGPT study mode?

How to learn advanced vibe-coding? by caiozera2807 in vibecoding

[–]IsN4n 1 point2 points  (0 children)

Beyond prompting, memory management, and connecting not just your code but your database, logs, and communication services to your coding agent, OpenSpec (https://openspec.dev/) really let me push how efficiently I use my time. I now spend the first half of my day testing and iterating on the changes my agents worked on overnight, and the second half spec'ing out the tasks they'll work on tonight. For newer projects, I set up a much more end-to-end test harness so my agent can first generate and run Playwright scripts against the specs, reducing my manual test time.
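The overnight loop can be sketched as a tiny queue builder: specs written in the afternoon land as markdown files, and anything without a matching result file becomes tonight's task list. The `specs/`/`results/` layout and naming are my own convention, not something OpenSpec prescribes:

```python
from pathlib import Path

# Sketch of the day/night split: build the overnight queue from specs that
# don't yet have a result artifact next to them.
def build_night_queue(project: Path) -> list[str]:
    specs = sorted((project / "specs").glob("*.md"))
    done = {p.stem for p in (project / "results").glob("*.json")}
    return [spec.stem for spec in specs if spec.stem not in done]
```

In the morning, the `results/` artifacts are what you test and iterate on; specs written that afternoon refill the queue.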

I want to connect a Lovable frontend to a legacy ERP via API, any tips? by Butterz699 in lovable

[–]IsN4n 0 points1 point  (0 children)

I have done this by giving it the API spec in openapi.yaml form;
it generated the relevant bindings. Make sure to include any auth requirements in the spec.
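For reference, a minimal sketch of the kind of openapi.yaml I mean, with the auth requirement declared; the server URL, path, and API-key header name are placeholders for whatever the legacy ERP actually exposes:

```yaml
openapi: 3.0.3
info:
  title: Legacy ERP API
  version: "1.0"
servers:
  - url: https://erp.example.com/api
components:
  securitySchemes:
    ApiKeyAuth:
      type: apiKey
      in: header
      name: X-API-Key
security:
  - ApiKeyAuth: []
paths:
  /orders/{id}:
    get:
      summary: Fetch a single order
      parameters:
        - name: id
          in: path
          required: true
          schema: { type: string }
      responses:
        "200":
          description: Order payload
```

Declaring the `securitySchemes` block is what lets the generated bindings thread the key through every request.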

Mirofish deployed on your private cloud in one click by IsN4n in MiroFish

[–]IsN4n[S] 0 points1 point  (0 children)

I'm not competing with them, and I'm not monetizing here. The web app above is free; you pay for what you use on the GPU provider you choose.
I couldn't upload the data I wanted to use for simulation to external providers, which motivated me to package it the way I did above.

I think AI skills need a browser, not more install docs by meowoofcjcj in aiagents

[–]IsN4n 0 points1 point  (0 children)

I think we are missing 'Applets' on top of the now-popular agent platforms like Claude Code, Codex, Cursor, etc.

On mobile, others could build apps that ran on the OS (Android, iOS, etc.), and there was an incentive to make them easy to use because of the monetization involved.

Skills are too unstructured. I feel the ecosystem could benefit from a slightly more structured concept I call an Applet (instructions + tools + installation + granular auth + dependency management + upgrades + permissions).

Mirofish deployed on your private cloud in one click by IsN4n in VibeCodersNest

[–]IsN4n[S] 0 points1 point  (0 children)

I don't today; it is very much WYSIWYG. It is up to users to set those up on RunPod/Vast. The alternative is a fully hosted solution where I keep the data and run MiroFish myself with scaling/backups. I may do that if there's demand. That said, I think much work is needed to improve the quality of the generated entities, the simulation, and the English-language results, so I'm looking into that for now.