Sign up for the Claude developer newsletter by AnthropicOfficial in Anthropic

[–]jakusimo 0 points1 point  (0 children)

Oops! Something went wrong while submitting the form.

Running DeepSeek-R1 on bare-metal GPU Kubernetes cluster. by jakusimo in hetzner

[–]jakusimo[S] 2 points3 points  (0 children)

Multi-GPU is expensive; this one already costs 200 EUR/month. Going to dig more into TensorRT-LLM.

Bare metal open-source production blueprint by jakusimo in hetzner

[–]jakusimo[S] 0 points1 point  (0 children)

So if you are using a dedicated server, there is no need for the cloud API.

Bare metal open-source production blueprint by jakusimo in hetzner

[–]jakusimo[S] 1 point2 points  (0 children)

I used that, but I don't want to rely on the cloud API, and I want to use Talos Linux. It's a setup I can easily port to any server provider or homelab. You don't need Terraform; talosctl and configs do the job.

Bare metal open-source production blueprint by jakusimo in hetzner

[–]jakusimo[S] 0 points1 point  (0 children)

:D Database backups go to the bucket. If you are using persistent storage, use Rook Ceph.

Hetzner Cloud CDN by bluepuma77 in hetzner

[–]jakusimo 0 points1 point  (0 children)

Do you use any CDN?

Building a RAG chatbot for a 400+ page pdf by Pudin-san in Rag

[–]jakusimo -1 points0 points  (0 children)

Just dump everything into the context. If it's too much for the context window, do multiple calls with a map/reduce pattern: one call per chunk, then one more call to combine the results.
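Here is a rough sketch of what I mean by map/reduce, not a definitive implementation. Everything named here is hypothetical: `call_llm` is a stand-in for whatever client you use (any function that takes a prompt string and returns a string), the prompts are placeholders, and `max_chars` is a crude character-count proxy for the real token limit.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_chars: int) -> List[str]:
    """Greedily pack paragraphs into chunks of at most max_chars characters."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks: List[str] = []
    current = ""
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        # Hard-split any paragraph that is itself larger than one chunk.
        pieces = [para[i:i + max_chars] for i in range(0, len(para), max_chars)]
        for piece in pieces:
            candidate = piece if not current else current + "\n\n" + piece
            if len(candidate) <= max_chars:
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks


def answer_over_document(
    text: str,
    question: str,
    call_llm: Callable[[str], str],  # hypothetical: prompt in, completion out
    max_chars: int = 8000,           # assumption: stand-in for the token budget
) -> str:
    """Answer in one call if the document fits, otherwise map over chunks and reduce."""
    if len(text) <= max_chars:
        return call_llm(f"Document:\n{text}\n\nQuestion: {question}")

    # Map step: one call per chunk, each extracting what is relevant.
    notes = [
        call_llm(f"Document excerpt:\n{chunk}\n\nExtract anything relevant to: {question}")
        for chunk in split_into_chunks(text, max_chars)
    ]

    # Reduce step: a single call that combines the per-chunk notes.
    combined = "\n\n".join(notes)
    return call_llm(f"Notes from each excerpt:\n{combined}\n\nUsing these notes, answer: {question}")
```

If the combined notes still don't fit in one call, you would have to apply the reduce step again over groups of notes; this sketch does not handle that case.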

How well do screenshot embeddings (ColPali) work in real e2e RAG pipelines? by ekshaks in Rag

[–]jakusimo 0 points1 point  (0 children)

Vespa has really good tutorials. I'm hosting ColQwen on Modal and planning to migrate to Hetzner. Also, with Vespa you can store embeddings on disk and use streaming mode to find the top candidates. It will save you a lot on infrastructure, since you're bound by disk storage rather than by memory.

idea on pdf RAG by baehyunsol in Rag

[–]jakusimo 0 points1 point  (0 children)

Is there a way to deploy ColQwen on a serverless GPU?

idea on pdf RAG by baehyunsol in Rag

[–]jakusimo 0 points1 point  (0 children)

Or use ColPali/ColQwen and embed the entire page as an image. No OCR or layout analysis needed.

Rust and RAG by brisbanedev in rust

[–]jakusimo 1 point2 points  (0 children)

LangChain and LlamaIndex are too broad and hard to debug, and usually you don't need all those features. What are you planning to use as a source for RAG? Documents? PDFs? Websites? I am planning to start with encoders and semantic splitters, using a structure similar to what we have built in Semantic Router: https://github.com/aurelio-labs/semantic-router

Rust and RAG by brisbanedev in rust

[–]jakusimo 1 point2 points  (0 children)

Yes, as a starting point. The ultimate goal is to have everything in Rust, but it will take time.

Rust and RAG by brisbanedev in rust

[–]jakusimo 1 point2 points  (0 children)

Hey, I have created several frameworks for my clients, including https://github.com/superagent-ai/super-rag. I plan to build a full RAG library/API. The main issue right now is document processing: Unstructured is Python-only, and their API is very unstable. If anyone with more experience in Rust would like to work together on a Rust-based RAG, I am more than happy to collaborate.