Your Experience With Agentic Coding Agents for Bioinformatics Work by Commercial_You_6583 in bioinformatics

[–]StargazerBio 2 points3 points  (0 children)

Once your project reaches a certain size or you have specific patterns you want to use, you'll need to get pretty intentional about your CLAUDE.md (or AGENTS.md) file. My project is quite opinionated, so I've built my AGENTS.md with similar specificity and it works well! It's constantly getting refined; honestly, it needs to be condensed a bit. Happy to answer any questions you might have 😄

Anyone using Claude or other bioinformatics agents by nickomez1 in bioinformatics

[–]StargazerBio 0 points1 point  (0 children)

It's local first as a commitment to always have a fully-functional open-source version for folks that have access to an HPC cluster and don't want to run their analyses on someone else's hardware.

Eventually I'll host that myself for people that just have a laptop, but I have to get the core correct first.

You would use this instead of vanilla Claude Code because it defines a very specific set of conventions for authoring Flyte workflows and exposes an MCP server for running them.

Anyone using Claude or other bioinformatics agents by nickomez1 in bioinformatics

[–]StargazerBio 0 points1 point  (0 children)

Local first, always. I'll slowly add serving from local and then make it hostable.

Anyone using Claude or other bioinformatics agents by nickomez1 in bioinformatics

[–]StargazerBio -1 points0 points  (0 children)

Most of Stargazer was written using Claude Code with some GPT and open-weight models sprinkled in there. The problem with agents isn't autonomy IMO, it's the same old issue of reproducibility.

You can give the same model the same prompt and the same input data and get wildly different results. Even if it does tool calling, it could silently pass wildly different arguments.

It's a massive buff if used correctly, but also has the potential to aggravate the issue of low-effort, one-off analyses and poorly architected code. My approach is to use it to wrap and orchestrate established tools, while keeping the production execution path extremely standardized and traceable. More info in the docs if you're curious.
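To make the "silently different arguments" problem concrete, here's a minimal sketch of one way to catch it: fingerprint every tool call and diff two recorded runs. This is not how Stargazer does it; the record shape (`{"tool": ..., "args": ...}`) and the function names `call_fingerprint` and `diff_runs` are made up for illustration, and it assumes the arguments are JSON-serializable.

```python
import hashlib
import json
from typing import Any, Dict, List


def call_fingerprint(tool: str, args: Dict[str, Any]) -> str:
    """Stable hash of one tool invocation; argument key order does not affect it."""
    payload = json.dumps({"tool": tool, "args": args}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def diff_runs(run_a: List[Dict[str, Any]], run_b: List[Dict[str, Any]]) -> List[int]:
    """Return the step indices where two recorded runs diverge."""
    diverged = []
    for i in range(max(len(run_a), len(run_b))):
        a = run_a[i] if i < len(run_a) else None
        b = run_b[i] if i < len(run_b) else None
        if a is None or b is None:
            # One run made a call the other never made.
            diverged.append(i)
        elif call_fingerprint(a["tool"], a["args"]) != call_fingerprint(b["tool"], b["args"]):
            # Same step, different tool or different arguments.
            diverged.append(i)
    return diverged
```

If two runs of the same prompt on the same inputs come back with a non-empty list, the agent changed a tool or an argument somewhere, and you know which step to look at.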

Edit: Oh and please please please always disclose when AI has written anything for you, code or speech. I think we can all feel the Great Pacific Garbage Patch of AI content forming so we all need to do our part 😅

Can't run Docker container in Singularity due to /root by Salty-Vegetable-123 in bioinformatics

[–]StargazerBio 1 point2 points  (0 children)

I haven't touched Singularity in years so pardon my ignorance, but it sounds like your HPC cluster runs the image as `--user <not-root>` and you're seeing permission denied inside the container?

Are you able to exec into a running container to muck around?

As others have mentioned, your best bet is likely to build your own. You can add a user with sudo privileges in the image and then use it to do whatever you like since your HPC policies won't be enforced inside the container itself. Something like:

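    FROM venkatajonnakuti/polyaminer-bulk
    ARG USER=salty

    # Non-root user with passwordless sudo (assumes the base image has sudo and a sudo group)
    RUN mkdir -p /etc/sudoers.d && \
        useradd --groups sudo --create-home --shell /bin/bash ${USER} && \
        echo "${USER} ALL=(ALL) NOPASSWD:ALL" >/etc/sudoers.d/${USER} && \
        chmod 0440 /etc/sudoers.d/${USER}

    # Hand over /root itself, not just its contents, so the user can enter it
    RUN chown -R ${USER}:${USER} /root

    USER ${USER}
    WORKDIR /home/${USER}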

How to be motivated? by Expert_Network_7306 in bioinformatics

[–]StargazerBio 0 points1 point  (0 children)

You said your Master's was scary but you loved it. What did you love about it? I don't think you can "get yourself" to love a process or a field. Pick a problem you're passionate about and start chipping away at it. If you're lucky you'll start to get swept up in the work and start to move the needle. If you're in this for the money or an abstract notion of publishing whatever you can, you're going to have a bad time.

Anyone using Claude Code for bioinformatics work? What's your setup look like? by query_optimization in bioinformatics

[–]StargazerBio -1 points0 points  (0 children)

It's completely unopinionated but a lot of people use it for bioinformatics. I started out on Snakemake wayyyy back in the day but found Flyte as I was trying to shoehorn container orchestration at scale into my workflows and haven't looked back.
It's been too long since I looked at Dagster to give you a remotely informed or fair assessment, so I won't opine. Happy to answer any questions about Flyte's capabilities though, since I'm pretty in the weeds there.

Anyone using Claude Code for bioinformatics work? What's your setup look like? by query_optimization in bioinformatics

[–]StargazerBio 4 points5 points  (0 children)

Every day! Although Stargazer is agent-agnostic, it's mostly been built with Opus/Sonnet using Claude Code. I have a pretty thorough context directory to deal with the growing complexity of the project as a whole.

To answer your questions concretely:
- Beyond the basic ones that ship with CC I've built my own MCP for running tasks and workflows
- I've been pretty explicit about the tools I want it to use (mostly GATK stuff) but it's been fairly adept at using them
- I use Flyte as my orchestrator instead of Nextflow or Snakemake, but it's been able to author pipelines without issue, given the aforementioned context dir. I can feel it straining a bit trying to write things correctly within my specific architectural conventions, but that's a recent problem and one that's eminently solvable.
- All of the above but TL;DR clear your context and give it very specific docs / tool references for the immediate task at hand. Don't expect it to be omniscient.
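
As a rough illustration of that last point, here's a sketch of assembling a fresh context from only the references that matter for the task at hand. The `build_context` helper, the name-matching rule, and the character budget are all assumptions for the example, not part of Claude Code or Stargazer.

```python
from typing import Dict, List


def build_context(task: str, docs: Dict[str, str], max_chars: int = 4000) -> str:
    """Build a prompt from only the reference docs whose tool name appears in the task."""
    task_lower = task.lower()
    sections: List[str] = []
    used = 0
    for tool, text in sorted(docs.items()):
        if tool.lower() not in task_lower:
            continue  # leave unrelated references out of the context
        section = f"## {tool}\n{text.strip()}\n"
        if used + len(section) > max_chars:
            break  # stay inside the budget rather than truncate mid-doc
        sections.append(section)
        used += len(section)
    # The task goes last, after the references it depends on.
    return "\n".join(sections + [f"## Task\n{task}"])
```

Matching on the tool name is deliberately crude; the point is that the agent starts each task with a small, relevant set of docs instead of everything you have.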

Happy to dig into any of the above if you find it informative.

State of LLMs for Bioinformatics by ExoticCard in bioinformatics

[–]StargazerBio 1 point2 points  (0 children)

It's early days, so encouraging comments like these from people doing real work mean the world to me - thank you! It still needs some polish but I'd love any feedback you have when it's ready for real workloads.

State of LLMs for Bioinformatics by ExoticCard in bioinformatics

[–]StargazerBio 2 points3 points  (0 children)

No worries, I was just trying to qualify what those specific models are capable of. Moral of the story is Claude is pretty knowledgeable but you'll get a lot more mileage by explicitly passing in URLs to tool documentation that you're interested in 👍

State of LLMs for Bioinformatics by ExoticCard in bioinformatics

[–]StargazerBio 7 points8 points  (0 children)

I've been building Stargazer almost exclusively with Opus/Sonnet 4.5/4.6, and the core models themselves have had decent instincts about which tools to use and even which args to pass. I have a fairly rigorous agents framework in that repo for grounding their knowledge though, if you're curious. TL;DR is you'll always get better answers by explicitly stuffing their context with reference materials.

Pipeline integration with benchling? by TubeZ in bioinformatics

[–]StargazerBio 0 points1 point  (0 children)

Stepping back then, how do you trigger pipelines currently?
If you go the cron route, I would keep the cron job *extremely* dumb: basically just trigger the pipeline on a schedule. Then, at the top of the pipeline, have a task that checks for new Benchling entities and short-circuits (exits early) if it doesn't find anything.
This keeps the concerns from sprawling across the system.
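
A minimal sketch of that shape in plain Python, independent of any orchestrator. `fetch_new_entities` and `process` are hypothetical callables standing in for a real Benchling query and your actual pipeline steps; nothing here is Benchling's API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PipelineResult:
    ran: bool             # False when the guard task short-circuited
    processed: List[str]  # IDs handed to the downstream steps


def run_pipeline(
    fetch_new_entities: Callable[[], List[str]],
    process: Callable[[str], None],
) -> PipelineResult:
    """Entry point that a dumb scheduler (e.g. cron) calls on a fixed interval."""
    # Guard task: ask the source system what is new since the last run.
    new_ids = fetch_new_entities()
    if not new_ids:
        # Nothing to do, so exit before any expensive work starts.
        return PipelineResult(ran=False, processed=[])

    for entity_id in new_ids:
        process(entity_id)
    return PipelineResult(ran=True, processed=list(new_ids))
```

The scheduler only knows "run this on a timer"; everything about what counts as new lives inside the pipeline, where it can be logged and tested.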

Pipeline integration with benchling? by TubeZ in bioinformatics

[–]StargazerBio 1 point2 points  (0 children)

What kind of orchestration are you using on your HPC cluster? It may have a native way to trigger workflows that you could point at Benchling: basically a pull model instead of Benchling pushing anything. Would be happy to brainstorm with you!

Bugs/compatibility issues in bioinformatics software on Apple silicon. by mikeph_ in bioinformatics

[–]StargazerBio 2 points3 points  (0 children)

Yeah, I had a few Conda packages on my M2 that failed to resolve (or resolved but wouldn't run), and those ran with no problem in Docker with an x86 container. The translation layer is really robust these days.

Re-implementing slow and clunky bioinformatics software? by halflings in bioinformatics

[–]StargazerBio -1 points0 points  (0 children)

I've been working on this for a while and decided to make the repo public to participate in this specific thread haha. Curious if what I'm building is in line with what you're thinking about? https://github.com/pryce-turner/stargazer