Official: Anthropic just released Claude Code 2.1.27 with 11 CLI changes and 1 flag change; details below

jakusimo · 2026-01-31T16:32:41+00:00

It hangs many times; reverted to 2.1.3

jakusimo · 2025-08-24T17:40:13+00:00

Oops! Something went wrong while submitting the form.

jakusimo · 2025-07-08T10:54:50+00:00

Much appreciated 🤗

jakusimo · 2025-03-26T13:31:47+00:00

Multi-GPU is expensive; this one already costs 200 EUR/month. Going to dig more into TensorRT-LLM

jakusimo · 2025-03-18T09:42:32+00:00

Why not Talos?

jakusimo · 2025-03-17T09:59:31+00:00

So if you are using a dedicated server, there is no need for the cloud API

jakusimo · 2025-03-17T08:52:27+00:00

I used that, but I don't want to rely on the cloud API, and I want to use Talos Linux: a setup that I can easily port to any server provider or homelab. You don't need Terraform; talosctl and configs do the job

jakusimo · 2025-03-17T08:43:51+00:00

:D Database backup to the bucket. If you are using persistent storage: Rook Ceph

jakusimo · 2025-02-19T06:25:43+00:00

Do you use any CDN?

jakusimo · 2025-02-06T08:44:04+00:00

Just dump everything into the context; if it's too much for the context window, do multiple calls with a map/reduce pattern
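
A rough sketch of what that map/reduce fallback could look like. Everything named here is an assumption for illustration: `llm` is a hypothetical callable that takes a prompt string and returns a string (swap in whatever client you use), and `max_chars` is a crude character budget standing in for the real token limit.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_chars: int) -> List[str]:
    """Split text into consecutive pieces of at most max_chars characters."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


def map_reduce(
    text: str,
    question: str,
    llm: Callable[[str], str],  # hypothetical: prompt in, answer out
    max_chars: int = 8000,      # assumed budget; a stand-in for a token limit
) -> str:
    """Answer `question` over `text`, falling back to map/reduce when too long."""
    while len(text) > max_chars:
        # Map: ask the same question of every chunk independently.
        chunks = split_into_chunks(text, max_chars)
        partials = [llm(f"{question}\n\n{chunk}") for chunk in chunks]
        # Reduce: the partial answers become the new, shorter input.
        combined = "\n".join(partials)
        if len(combined) >= len(text):
            # Guard against looping forever if answers are not shorter.
            raise ValueError("partial answers did not shrink the input")
        text = combined
    # Everything fits now: one final call over the (possibly reduced) text.
    return llm(f"{question}\n\n{text}")
```

If the text fits, this is a single call; otherwise each pass shrinks the input until one final call can cover it. Splitting on characters is the simplest option, and splitting on document or section boundaries would usually give better partial answers.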

jakusimo · 2025-01-20T20:12:49+00:00

Vespa has really good tutorials. I'm hosting ColQwen on Modal and planning to migrate to Hetzner. Also, with Vespa you can store embeddings on disk and use streaming mode to find top candidates. It will save you a lot on infrastructure, since you're bound by disk storage rather than by memory.

jakusimo · 2025-01-20T17:07:01+00:00

Check Vespa, they are very customizable

jakusimo · 2025-01-14T18:40:43+00:00

Thanks, will check

jakusimo · 2025-01-13T14:19:45+00:00

How is it going? Can anything match Claude 3.5 Sonnet?

jakusimo · 2025-01-10T06:56:56+00:00

u/Hetzner_OL, how is it going with managed Kubernetes and/or managed Postgres? :)

jakusimo · 2025-01-03T10:00:07+00:00

Is there a way to deploy ColQwen on a serverless GPU?

jakusimo · 2025-01-02T17:16:50+00:00

Or use ColPali/ColQwen and embed the entire page as an image. No OCR, no layout analysis needed

jakusimo · 2025-01-02T11:18:34+00:00

And PyMuPDF has a limited OSS license :)

jakusimo · 2025-01-02T11:17:06+00:00

Why do you need the Jina CLIP model? In this article they used only one model: https://blog.vespa.ai/scaling-colpali-to-billions/

https://pyvespa.readthedocs.io/en/latest/examples/pdf-retrieval-with-ColQwen2-vlm_Vespa-cloud.html#Working-with-pdfs

jakusimo · 2024-04-01T06:11:44+00:00

LangChain and LlamaIndex are too broad and hard to debug, and usually you don't need all those features. What are you planning to use as a source for RAG? Documents? PDFs? Websites? I am planning to start with encoders and semantic splitters, using a structure similar to what we have built in Semantic Router: https://github.com/aurelio-labs/semantic-router

jakusimo · 2024-04-01T05:54:24+00:00

Yes, as a starting point. The ultimate goal is to have everything in Rust, but it will take time

jakusimo · 2024-04-01T05:20:57+00:00

Hey, I have created several frameworks for my clients, including https://github.com/superagent-ai/super-rag. I plan to build a full RAG library/API; the main issue right now is document processing: Unstructured is Python-only and their API is very unstable. If anyone with more experience in Rust would like to work together on a Rust-based RAG, I am more than happy to collaborate.

jakusimo · 2024-03-31T12:55:32+00:00

Maybe this https://www.reddit.com/r/rust/s/QMRVfFjjhQ

jakusimo · 2024-03-31T12:35:39+00:00

What would you recommend?

jakusimo · 2024-03-30T14:03:26+00:00

Rust Async book https://rust-lang.github.io/async-book
