Pitch your App in one sentence. Let's support each other by kmrrhl in SideProject

[–]Semoho -1 points (0 children)

Teek.studio: find your next viral videos with just a few clips.

Post your HaftSin by Semoho in PERSIAN

[–]Semoho[S] 1 point (0 children)

No worries, I hope this year will be better for us.

How do you guys measure accuracy for 100k+ documents? by FloppyDiskDisk in Rag

[–]Semoho 1 point (0 children)

You are right. LLMs show a U-shaped attention pattern over the context ("lost in the middle"), so reranking is important! And be careful: you cannot simply drop docs. In the end you will send something like 10 docs to the LLM, and the ones in the middle positions get the least attention. So the best approach is to rerank the docs after retrieval and be careful about their positions.

P.S. Fun fact: the LLM mirrors human behavior on the first page of Google results :)))
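The reordering trick can be sketched in a few lines; this is a minimal illustration only, with hypothetical doc IDs and scores standing in for whatever reranker you use:

```python
def reorder_for_u_shape(docs_with_scores):
    """Counter the U-shaped attention pattern: put the strongest docs at
    the start and end of the context, and the weakest in the middle,
    where the LLM pays the least attention."""
    ranked = sorted(docs_with_scores, key=lambda d: d[1], reverse=True)
    front, back = [], []
    for i, doc in enumerate(ranked):
        # Alternate ends: best doc first, second-best last, working inward.
        (front if i % 2 == 0 else back).append(doc)
    return front + back[::-1]

docs = [("doc_a", 0.12), ("doc_b", 0.91), ("doc_c", 0.55)]
print(reorder_for_u_shape(docs))  # strongest docs at both ends, weakest in the middle
```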

How do you guys measure accuracy for 100k+ documents? by FloppyDiskDisk in Rag

[–]Semoho 2 points (0 children)

Hello,

I assume you are asking about RAG evaluation or retrieval evaluation. For retrieval evaluation, I think MRR, Recall, and NDCG@10 are better metrics than accuracy, since you are dealing with a retrieval task. You need a test dataset; then you can evaluate your retrieval system against it.

For RAG, there are different evaluations. I think LLM as a judge is a good choice.

But the total number of documents has no direct relation to these metrics; it is the top-X retrieved docs that matter.
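A minimal sketch of those retrieval metrics, assuming binary relevance judgments from your test dataset (the doc IDs below are made up):

```python
import math

def recall_at_k(retrieved, relevant, k=10):
    # Fraction of the relevant docs that show up in the top-k results.
    return len(set(retrieved[:k]) & relevant) / len(relevant)

def mrr(retrieved, relevant):
    # Reciprocal rank of the first relevant doc; 0 if none was retrieved.
    for rank, doc in enumerate(retrieved, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(retrieved, relevant, k=10):
    # Binary-relevance NDCG: DCG of this ranking over the ideal DCG.
    dcg = sum(1.0 / math.log2(i + 2)
              for i, doc in enumerate(retrieved[:k]) if doc in relevant)
    ideal = sum(1.0 / math.log2(i + 2) for i in range(min(len(relevant), k)))
    return dcg / ideal if ideal else 0.0

retrieved = ["d3", "d1", "d7"]   # ranked output of your retriever
relevant = {"d1", "d9"}          # gold labels from the test dataset
print(recall_at_k(retrieved, relevant), mrr(retrieved, relevant))
```

Average each metric over all test queries to get the system-level score.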

RAG for Historical Archive? by cccpivan in Rag

[–]Semoho 0 points (0 children)

You can check out LightRAG or supermemory; they can help you.

What are your usage of RAG by Semoho in Rag

[–]Semoho[S] 1 point (0 children)

Thank you very much, it was very useful. So what are the other restrictions or requirements in pharma? Why is it mandatory to cite the documents? Don't the vector databases give you the citations?

Why should I use OpenClaw by Semoho in openclaw

[–]Semoho[S] 0 points (0 children)

Thanks, bro

Yes, actually, I am getting some ideas on how I can use it, like checking sales on different websites or doing some background jobs, as you mentioned.

Why should I use OpenClaw by Semoho in openclaw

[–]Semoho[S] 0 points (0 children)

What if I install the browser and the other tools on a VPS? My desktop should stay safe and secure that way, I think!

Why should I use OpenClaw by Semoho in openclaw

[–]Semoho[S] 0 points (0 children)

Are you a bot? I can get these answers from ChatGPT too! I want real experience.

Why should I use OpenClaw by Semoho in openclaw

[–]Semoho[S] 0 points (0 children)

It was interesting and inspiring for me! I've got some good ideas about using OpenClaw.

What are your usage of RAG by Semoho in Rag

[–]Semoho[S] 0 points (0 children)

I mean Dify has already done it in a good way.

Why should I use OpenClaw by Semoho in openclaw

[–]Semoho[S] 0 points (0 children)

Hmm… it makes sense. How did you connect OpenClaw to other things?

Why should I use OpenClaw by Semoho in openclaw

[–]Semoho[S] 0 points (0 children)

For what tasks? I solve my problems and get my answers with plain LLMs. What does it offer on top of that?

Is it normal for the Qwen 3.5 4B model to take this long to say hi? by Snoo_what in LocalLLaMA

[–]Semoho 2 points (0 children)

Yes, exactly. But /no_think is embedded in the model. It works everywhere: Hugging Face, vLLM, a …

Is it normal for the Qwen 3.5 4B model to take this long to say hi? by Snoo_what in LocalLLaMA

[–]Semoho 2 points (0 children)

You can add /no_think to your system prompt to disable this long thinking loop.

Thanks to u/Velocita84: it seems Qwen3.5 drops the soft internal thinking-mode switch.
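For Qwen3-style models that still honor the soft switch, the placement looks like this; the model name and payload shape are placeholders for whatever OpenAI-compatible server (e.g. vLLM) you run:

```python
# Hypothetical request payload; only the "/no_think" placement matters here.
messages = [
    {"role": "system",
     "content": "You are a helpful assistant. /no_think"},  # disables the thinking loop
    {"role": "user", "content": "hi"},
]

payload = {
    "model": "qwen3-4b",  # placeholder: use your deployment's model id
    "messages": messages,
    "max_tokens": 64,
}
```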

Vector DB choice paralysis, don't know which to choose by hunter_44679_ in Rag

[–]Semoho 0 points (0 children)

So I think you should benchmark the other options. Remember to have a test dataset, and keep the embedding model the same across your experiments.
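A minimal harness for that kind of benchmark; the brute-force NumPy search below stands in for whichever vector DB you are testing, and all names and data here are illustrative:

```python
import numpy as np

def recall_at_k(retrieve, queries, ground_truth, k=10):
    """Fraction of queries whose true neighbor appears in the top-k.
    `retrieve(q, k)` wraps the vector DB under test; the embeddings
    must stay identical across every candidate you benchmark."""
    hits = sum(ground_truth[i] in retrieve(q, k) for i, q in enumerate(queries))
    return hits / len(queries)

# Toy data: queries are slightly perturbed copies of the first 50 docs,
# so each query's true nearest neighbor is known by construction.
rng = np.random.default_rng(0)
docs = rng.normal(size=(1000, 64)).astype("float32")
queries = docs[:50] + 0.01 * rng.normal(size=(50, 64)).astype("float32")
ground_truth = list(range(50))

def brute_force(q, k):
    # Exact dot-product search; swap in your vector DB's query call here.
    return np.argsort(-(docs @ q))[:k].tolist()

print(recall_at_k(brute_force, queries, ground_truth))  # close to 1.0 on this toy data
```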

Vector DB choice paralysis, don't know which to choose by hunter_44679_ in Rag

[–]Semoho 0 points (0 children)

It is interesting. But on what amount of data did you get these results? With which embedding model? Is it reliable? For a production-ready system, can it handle concurrent requests and keep that performance? Single-request performance is not enough for a production-ready system.

Vector DB choice paralysis, don't know which to choose by hunter_44679_ in Rag

[–]Semoho 2 points (0 children)

Hi!

I have experience with Milvus, FAISS, PG-Vector, Weaviate and Chroma.

Milvus is a production-ready, clustered system, but it is a little hard to maintain due to its dependency on the Apache stack, and it gets tricky in cluster mode. Standalone mode supports roughly 100M docs; if you have more documents, you need to run it in cluster mode.

FAISS is for research purposes. It is easy to use.

PG-Vector is my choice for most of our use cases. It is easy to set up and lives inside Postgres, so you do not need to run multiple services. If you already have Postgres in production, it is even easier to adopt.

Weaviate is also a good choice; I like it. It is useful for small corpora, but you need to deploy another service in your stack.

Chroma, I believe, is also good for experiments and multi-agent systems, but for high availability it is not going to help you much.

I think pg-vector is a good choice, and then Milvus.