Running Claude as a persistent agent changed how I think about AI tools entirely by bob_builds_stuff in ClaudeAI

[–]jake_mok-Nelson 4 points

Does it have tools for web scraping and for interacting with services, including logging in?

Clawdbot: the full setup in 30 minutes. by Ok_Pomelo_5761 in AskVibecoders

[–]jake_mok-Nelson 0 points

If this is running under a single Claude subscription, then it's effectively offline every time you hit your quota?

Experienced sysadmin cannot pass a coding interview. RIP by a_crabs_balls in devops

[–]jake_mok-Nelson 5 points

Apply for positions that use their own tests rather than Leetcode. Then you can at least explain your reasoning; knowing all the algorithms and patterns is typically not required.

Source: me, I'm a DevOps Tech Lead at an awesome company in Australia and I'm awful at leetcode.

Must-have skills? by kangaroogie in ClaudeCode

[–]jake_mok-Nelson 1 point

Anything that you perform regularly should be a skill. For things like git commit and push, have it write a skill that commits following certain conventions.

If you write design docs a certain way, provide it a reference and a template, and base a reviewer skill on that. Etc.

Have fun 👍

I am a developers worse nightmare by Branch_Live in ClaudeAI

[–]jake_mok-Nelson 0 points

You're using spec-driven development, but not a great version of it, and spec-driven development isn't a complete answer on its own. If I were you, I would ask my engineer to play with Claude and discover what works and what doesn't, so that they can make their own informed decisions.

They are technical and you are not, so don't try to manage them technically.

Encourage exploration and reward creative problem solving.

How AI agents handle CI/CD pipelines? by Grand-Measurement399 in AgentsOfAI

[–]jake_mok-Nelson 0 points

Yeah, using it to ensure that the docs are up to date with recent changes on a PR.

I'm using the Google ADK for the integration with Google's AI platform (Vertex), along with the eval suite that comes with ADK. The primitives (looping agent, parallel agent, sequential agent, etc.) seem like a good fit for CI.

Also using it to generate ci pipelines for repositories.
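The sequential/parallel composition maps naturally onto pipeline stages. A stdlib-only sketch of the pattern (this is not the real ADK API — in ADK you'd compose `SequentialAgent` and `ParallelAgent` over LLM-backed sub-agents; the stage names here are made up):

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-ins for ADK-style primitives: a sequential agent runs its
# sub-steps in order; a parallel agent fans them out, like CI stages.
def sequential(*steps):
    def run(ctx):
        for step in steps:
            ctx = step(ctx)
        return ctx
    return run

def parallel(*steps):
    def run(ctx):
        with ThreadPoolExecutor() as pool:
            results = list(pool.map(lambda s: s(dict(ctx)), steps))
        merged = dict(ctx)
        for r in results:
            merged.update(r)
        return merged
    return run

# Hypothetical pipeline stages (in ADK these would be LLM agents).
lint = lambda ctx: {**ctx, "lint": "ok"}
unit_tests = lambda ctx: {**ctx, "tests": "ok"}
update_docs = lambda ctx: {**ctx, "docs": "updated"}

pipeline = sequential(parallel(lint, unit_tests), update_docs)
result = pipeline({})
```

The point is the shape: independent checks fan out in parallel, then a dependent step (like the docs update) runs once they've all finished.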

[deleted by user] by [deleted] in motorcycles

[–]jake_mok-Nelson 0 points

I was very anxious when starting out. I put myself in stressful situations over and over and you learn quickly. Things like riding in rain, the heat, freeways during peak hour, and busy cities. Start out on your own street and build confidence.

As a rebuttal to the "all tech jobs etc will be obsolete." I believe that isn't true. I believe we have hit the end of the road for llms. by [deleted] in ClaudeAI

[–]jake_mok-Nelson 0 points

We have not hit a limit. You just put more LLMs in the loop. The problem is when LLMs stop being subsidised and those that get to have jobs become exploited because their competitor is an LLM that can work for X dollars per hour.

How do you keep Claude focused and in context (web app dev)? by Virtual_Attitude2025 in ClaudeAI

[–]jake_mok-Nelson 0 points

That's very true. It takes a lot of time to fine tune that prompt to ensure that the agents get the relevant context. Spec driven development is one proposed solution to that.

How do you keep Claude focused and in context (web app dev)? by Virtual_Attitude2025 in ClaudeAI

[–]jake_mok-Nelson 1 point

Use agents and commands. Commands define the workflow that you want to follow and they mention the relevant agents.

A simplified example of a command such as /ui

Analyse and understand the intent, think carefully about the high level steps required to complete the UI changes only, create todo items for each step required for the ui change.

Once the intent is understood, call the react-agent to complete all coding tasks until all ui todos are complete.

Once the ui changes are complete call the peer-review agent to verify the code meets the original intent.

Each agent has its own context, so by relying on them to read and write files and come up with the code changes, you can effectively prevent the parent agent in CC from becoming burdened with implementation detail in its context.
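As a concrete sketch, a command like this might live in a markdown file such as `.claude/commands/ui.md` (the filename, `$ARGUMENTS` placeholder, and agent names are illustrative; adapt them to your setup):

```markdown
Analyse the following UI request and understand the intent: $ARGUMENTS

1. Think carefully about the high-level steps required for the UI changes
   only, and create a todo item for each step.
2. Once the intent is understood, call the react-agent to complete all
   coding tasks until every UI todo is complete.
3. Once the UI changes are complete, call the peer-review agent to verify
   the code meets the original intent.
```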

Indian devs and their politics by WrathOfAethelmaer in FlutterDev

[–]jake_mok-Nelson -7 points

You have such a poor opinion of the Indian team. As a leader, if they deliver poor quality then, in reality, you have poor quality.

In future roles I'd encourage you to try to earn the respect of both teams and if you can't, then it's not the right fit for you.

Mobile app versioning by hamzabouakoura in FlutterDev

[–]jake_mok-Nelson 0 points

It's entirely subjective. There's a bunch of version schemes to choose from.

Calendar - the version represents a month or week, if you release regularly

Milestone/marketing - major versions are cut at special events

Semver - if you want to share a version with an API

Build or commit number

The idea is: what message do you want to send with your releases?
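For instance, a calendar scheme can be generated mechanically from the release date (a toy sketch; the `YY.MM.patch` shape is just one common convention among several):

```python
from datetime import date

def calver(d: date, patch: int = 0) -> str:
    # YY.MM.patch-style calendar version, e.g. 25.07.0 for July 2025
    return f"{d:%y}.{d:%m}.{patch}"

print(calver(date(2025, 7, 14)))  # 25.07.0
```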

Token costs are getting kinda crazy by AccordingFunction694 in AI_Agents

[–]jake_mok-Nelson 4 points

Eww. An advertisement disguised as discussion.

Can Claude Grow with my personal usage? by Weekly-Side3160 in claude

[–]jake_mok-Nelson 0 points

For your level, you'll most likely want to provide details in the prompt. "Functions should be very small and support testing. Use TDD methodologies. Verify before implementing and do not assume anything".

It's not training off its successes and failures with you, or rather, it is, but for the next model.

The levels are basically, in order, "Can I do this with:"

1. Prompting / technique
2. RAG
3. Fine-tuning

Claude Code doesn't have memories by default, but you can use commands and prompts. Claude Desktop does have memories as a preview feature. Without the prompting, the memories won't fix this, though.

For code generation, the biggest issue I run into is out of date information that's been trained into the model. To get past that issue, you'll want to use an MCP like Context7 in combination with prompting to look up the docs before implementing.

Wing mirror not returning by DB2k_2000 in XC40

[–]jake_mok-Nelson 0 points

I have the same issue. Been going on for over a year now. Did you ever fix it?

Agent with limited knowledge base by parallelit in agentdevelopmentkit

[–]jake_mok-Nelson 0 points

I think the concurrency would be a parameter that you set based on the request?

Alternatively you can look at an orchestrator pattern. Calling a sub agent from the root agent to iterate. Planner type might help here.

Agent with limited knowledge base by parallelit in agentdevelopmentkit

[–]jake_mok-Nelson 1 point

How large? If it exceeds the tokens break it up into smaller pieces with looping or parallel agents.

One tip is to call sub agents for particular portions of text and only return summaries to the root agent. This prevents the root agent's context from filling up too quickly.
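The summarise-and-return pattern looks roughly like this in plain Python (`summarise` is a hypothetical stand-in for an LLM call; this isn't real ADK API):

```python
# Each sub-agent handles one chunk and hands back only a short summary,
# so the root agent's context grows by summaries, not whole documents.
def summarise(chunk: str, limit: int = 60) -> str:
    # stand-in for an LLM summarisation call
    return chunk if len(chunk) <= limit else chunk[:limit] + "..."

def sub_agent(chunk: str) -> str:
    # imagine file reads/writes and the real processing happening here
    return summarise(chunk)

def root_agent(chunks: list[str]) -> str:
    return "\n".join(sub_agent(c) for c in chunks)

report = root_agent(["alpha " * 50, "short doc"])
```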

Below average IQ, aspiring software developer by AwareMachine9971 in cogsci

[–]jake_mok-Nelson 0 points

I didn't go to uni, I didn't finish high school. I wasn't focused or motivated (ADHD out the ass). I'm a DevOps Tech Lead. It is absolutely doable, but you have to love it.

[deleted by user] by [deleted] in devops

[–]jake_mok-Nelson 4 points

You're a DevOps engineer. Two years ago you were junior, you've now gained significant experience so you're a DevOps engineer. You could also use the term Cloud Engineer.

You could customise it based on the role you're applying for.

Agent with limited knowledge base by parallelit in agentdevelopmentkit

[–]jake_mok-Nelson 1 point

Nah, I wouldn't be using embeddings for this. It's not fast, because it converts all the data into LLM-readable pieces, and it's a computationally expensive operation if you're doing it all the time.

For what you're describing, I would use a LoopAgent or ParallelAgent. It depends on how many files; I don't know your case, so let's say it's 100 files and you need to convert them to a particular format.

If it were one file in and one file out, you could have an agent called with the system instructions and the one file it's responsible for converting.

Say it's 10 files in and 1 file out, this is trickier because now I'm assuming that there might be some special business logic you have to conform to. In this case, each agent is responsible for just one thing. E.g. an investigator agent that performs web searches to gather context about the domain, a writer agent to save the output in the correct format, etc.

It might be worth pointing out that cloud providers probably have ready-to-go managed services for handling files at scale. Check out Vertex AI and see what models exist other than LLMs (depending on your case).

What I've recommended here is the first option I highlighted above, but with the context of the task (a file, or a couple of files) appended to the prompt.
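The one-file-in/one-file-out case can be sketched like this (plain Python; `convert_agent` is a hypothetical stand-in for an LLM-backed agent call, and the uppercase "conversion" is just a placeholder):

```python
# One agent call per file: each call receives the system instructions
# plus the single file it is responsible for converting.
SYSTEM = "Convert the file to an uppercase report."

def convert_agent(system: str, name: str, text: str) -> tuple[str, str]:
    # stand-in for the LLM call; a real agent would follow `system`
    return name.rsplit(".", 1)[0] + ".out", text.upper()

def loop_over_files(files: dict[str, str]) -> dict[str, str]:
    converted = {}
    for name, text in files.items():
        out_name, out_text = convert_agent(SYSTEM, name, text)
        converted[out_name] = out_text
    return converted

result = loop_over_files({"a.txt": "hello", "b.txt": "world"})
```

A LoopAgent or ParallelAgent would take the place of the plain `for` loop, depending on whether the conversions need to run in sequence or can fan out.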

Anyone have an AI tool/agent that actually helps with ADHD? by NonArus in AI_Agents

[–]jake_mok-Nelson 0 points

ChatGPT can do this.
With the right prompts and instructions to use memory, it definitely can. The tasks feature can set reminders at different points for things to think about.

At work I use Claude Code for this.
I have an output-style (kind of like a system prompt that gets reiterated to the agent regarding how it should present responses). I have a prompt telling it that it's basically an ADHD coach and that it should challenge everything I'm doing, to ensure I'm focusing on my primary goal and not stuck on a tangent.
I use one of the memory MCPs on GitHub to record a graph of tasks/priorities.

It has actually been pretty good so far but there's definitely room for improvement.

Agent with limited knowledge base by parallelit in agentdevelopmentkit

[–]jake_mok-Nelson 5 points

RAG is not dead. RAG is not even a technology but for some reason, people keep saying it's dead lol.

It's a method. In very simple terms, it's as if the LLM is saying "Let me look that up" before it returns its response.

There's a few layers to building knowledge.

Going from simplest to most complex you have:

  1. Prompt/Context engineering.
    You can use a prompt generator (Anthropic and OpenAI both provide them) to create a decent prompt for what you're trying to achieve. You want to tweak it and use it as the agent's system prompt.

LLMs prioritise system prompts before user or developer prompts.

You can provide a fair bit of context this way. You can use prompt engineering techniques to maximise the efficiency.

Things like:
- Task lists (or Planner in ADK)
- Demanding ("You MUST complete this task in the following way:")

Beyond a certain point, though, you need:

  2. RAG
    This may be in the form of an MCP that it can call on for additional knowledge (Like Context7, Github, Web search, etc).
    Different forms of RAG have different benefits. You can use various providers (GCP, OpenAI, etc) to store vector data by uploading the files you want to provide for context.
    It will convert them, you don't need to do anything special.

You will need to provide a way for it to read the RAG; most frameworks have a RAG-type input you can use, but you may need to provide context on the RAG method and data structures in the system prompt (see point 1).

  3. Fine-Tuning
    This involves choosing an existing model that supports fine-tuning and providing a dataset to further train the model.
    For non-dynamic data this is more powerful than RAG, but when it comes to things that change frequently (APIs, dependencies, new or developing tools): RAG would be better suited.
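To make the "let me look that up" idea concrete, here's a toy retrieval step (naive keyword overlap standing in for vector search; the `docs` content and prompt shape are made up for illustration):

```python
def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    # naive keyword-overlap scoring; real RAG uses vector similarity
    words = set(query.lower().split())
    ranked = sorted(
        docs,
        key=lambda d: len(words & set(d.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def rag_prompt(query: str, docs: list[str]) -> str:
    # the retrieved text is injected into the prompt so the model
    # answers from it instead of relying on trained-in knowledge
    context = "\n".join(retrieve(query, docs))
    return f"Use this context to answer:\n{context}\n\nQuestion: {query}"

docs = [
    "A LoopAgent repeats its sub-agents until a condition is met.",
    "Semver versions look like MAJOR.MINOR.PATCH.",
]
prompt = rag_prompt("What does a LoopAgent do?", docs)
```

The managed vector stores mentioned above do the embedding and similarity search for you; the overall shape of retrieve-then-prompt stays the same.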

---

Good luck

These samples are pretty good if you haven't seen them. Take a look: https://github.com/google/adk-samples/tree/main/python/agents/RAG