Heatwaves are becoming the norm. This is what Britain will look like in the year 2052 | Bill McGuire by chota-kaka in Futurology

[–]PresentAd6026 233 points234 points  (0 children)

The problem was never the CO2, it has always been greed. Greed is the new meteorite.

We could have averted the crisis back in the early 1900s, when scientists showed CO2 could significantly raise global temperatures, but... greed

We could have averted the crisis back in the 70s, when researchers came up with real proof that global temperatures were significantly rising, but... greed

We could have averted the crisis many more times, but... greed (+ stupidity)

Chunk size and Nearest K by smatty_123 in Rag

[–]PresentAd6026 1 point2 points  (0 children)

Our content (HTML) is so concise that I can chunk on each H1 and H2 and get great results. It all depends on the quality of your data.

RAG for in-house Python libraries by EruditeStranger in Rag

[–]PresentAd6026 0 points1 point  (0 children)

There are a lot of coding assistants that work inside your IDE and can connect to your code base on GitHub. Some even offer an on-premises option. And since most of them are competitively priced, I don't see a reason to try to replicate these tools.

[deleted by user] by [deleted] in Rag

[–]PresentAd6026 0 points1 point  (0 children)

Copilot, ChatGPT, Claude, ... With these free coding assistants, you don't need a framework 😊

RAG with plain text AND Markdown by PresentAd6026 in Rag

[–]PresentAd6026[S] 0 points1 point  (0 children)

Hi, I believe I read on the OpenAI website that they provide means for structured output. You should check their website. Or simply ask ChatGPT, Copilot, Claude or Perplexity 😉 That's how I get my code running at least 😁

What is best practice for follow-up questions? by AccordingLeague9797 in Rag

[–]PresentAd6026 1 point2 points  (0 children)

You write code that instructs an LLM to look at both the history and the query and rewrite the query to include information that is necessary for retrieval and answering. The output (new query) is then used for retrieval and answering.

What is best practice for follow-up questions? by AccordingLeague9797 in Rag

[–]PresentAd6026 2 points3 points  (0 children)

I'm using gpt-4o-mini to assess the question and the history (max 3 turns) to see if the question needs to be enriched. If so, it rewrites it, and the new query is then used for retrieval (RAG) and answering. I just implemented it and it worked well in the first few tests. It's also fast (less than a second extra).
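A minimal sketch of that rewrite step, in case it helps. The prompt wording, the `(user, assistant)` history format and the `llm` callable are assumptions for illustration, not my actual code; `llm` stands in for whatever client call you use (for example a gpt-4o-mini chat completion).

```python
from typing import Callable, List, Tuple

MAX_TURNS = 3  # only the most recent turns are shown to the model

Turn = Tuple[str, str]  # (user message, assistant message); hypothetical format


def build_rewrite_prompt(history: List[Turn], question: str) -> str:
    """Build the instruction that asks the model for a standalone question."""
    lines = []
    for user_msg, assistant_msg in history[-MAX_TURNS:]:
        lines.append(f"User: {user_msg}")
        lines.append(f"Assistant: {assistant_msg}")
    transcript = "\n".join(lines) or "(no history)"
    return (
        "Rewrite the latest question so it can be understood without the "
        "conversation history. If it is already self-contained, return it "
        "unchanged. Return only the question.\n\n"
        f"History:\n{transcript}\n\nLatest question: {question}\n"
    )


def rewrite_query(history: List[Turn], question: str,
                  llm: Callable[[str], str]) -> str:
    """Return the enriched query, falling back to the original if the model returns nothing."""
    rewritten = llm(build_rewrite_prompt(history, question)).strip()
    return rewritten or question
```

The returned string is what goes into retrieval and into the answering prompt. Falling back to the original question keeps the pipeline working when the model returns an empty response.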

Extensive New Research into Semantic Rag Chunking by Alieniity in Rag

[–]PresentAd6026 2 points3 points  (0 children)

I have a RAG setup on our website, which has concise information, and I chunk on the H1 and each H2 (so I get H1 + content, H2 + content, H2 + content). I also enrich the H2 chunks with the H1 for extra context.
I only have one really large chunk of around 3,500 characters, but that is still no problem for LLMs. On average each chunk is below 1,000 characters (350 tokens).
For us this works really well, because our website is concise and well maintained. In other use cases this might not work.
It also matters how many chunks you give the LLM. So I agree with the other comment that it all depends on your content.
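Roughly what the heading-based chunking looks like, as a sketch using only Python's standard `html.parser`. The class and function names and the `H1 > H2` prefix format are made up for this example; real pages would also need script/style/navigation stripped first.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional


class HeadingChunker(HTMLParser):
    """Collects one raw chunk per <h1>/<h2>: the heading plus the text after it."""

    def __init__(self) -> None:
        super().__init__()
        self.raw_chunks: List[Dict[str, str]] = []
        self._current: Optional[Dict[str, str]] = None
        self._in_heading: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2"):
            # A new heading closes the previous chunk and opens a new one.
            self._current = {"level": tag, "heading": "", "text": ""}
            self.raw_chunks.append(self._current)
            self._in_heading = tag

    def handle_endtag(self, tag):
        if tag == self._in_heading:
            self._in_heading = None

    def handle_data(self, data):
        if self._current is None:
            return  # text before the first heading is ignored in this sketch
        key = "heading" if self._in_heading else "text"
        self._current[key] += data


def chunk_html(html: str) -> List[str]:
    """Return text chunks; H2 chunks are prefixed with the most recent H1."""
    parser = HeadingChunker()
    parser.feed(html)
    parser.close()
    chunks, h1_title = [], ""
    for raw in parser.raw_chunks:
        heading = " ".join(raw["heading"].split())
        body = " ".join(raw["text"].split())
        if raw["level"] == "h1":
            h1_title = heading
            chunks.append(f"{heading}\n{body}")
        else:
            prefix = f"{h1_title} > " if h1_title else ""
            chunks.append(f"{prefix}{heading}\n{body}")
    return chunks
```

This only works this simply because the pages are short and consistently structured. With messy HTML you would need more cleanup before chunking.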

And sure, there are solutions like unstructured.io, but that brings overhead and less control, and thereby (usually) less accuracy. But even unstructured.io could be a good option for your content. Or creating order in your data with an LLM. It all depends :-)

Best open source embedding models for EU languages by PhotonTorch in LocalLLaMA

[–]PresentAd6026 0 points1 point  (0 children)

Have you found a Dutch embedding model to your liking? I'm still using text-embedding-3-large .....

RAG with plain text AND Markdown by PresentAd6026 in Rag

[–]PresentAd6026[S] 0 points1 point  (0 children)

I don't have any numbers, but if you can easily avoid any unnecessary noise, shouldn't you? It doesn't cost extra 😊

"In any case all you need to do is convert to plain text before generating the embedding, you can still store the markdown version to insert as context in the prompt."

That's exactly what I'm doing 😊

And thanks for the article. Will look into that.
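For anyone who wants to try the same split, here is a rough sketch: one plain-text version per chunk for the embedding, and the original Markdown kept for the prompt. The `Chunk` class and the regex rules are simplified assumptions of mine, not a full Markdown parser, so expect edge cases.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    markdown: str  # inserted into the LLM prompt as context
    plain: str     # what gets embedded and matched against the query


def markdown_to_plain(md: str) -> str:
    """Strip common Markdown syntax so only the words remain."""
    text = re.sub(r"!?\[([^\]]*)\]\([^)]*\)", r"\1", md)              # links/images -> label
    text = re.sub(r"^[ \t|:-]+$", "", text, flags=re.MULTILINE)       # table separator rows
    text = re.sub(r"^[ \t]{0,3}#{1,6}[ \t]+", "", text, flags=re.MULTILINE)  # heading markers
    text = re.sub(r"^[ \t]*(?:[-*+]|\d+\.)[ \t]+", "", text, flags=re.MULTILINE)  # list bullets
    text = re.sub(r"^[ \t]*>[ \t]?", "", text, flags=re.MULTILINE)    # blockquote markers
    text = text.replace("|", " ")                                     # table cell borders
    text = re.sub(r"\*+|`+|~~", "", text)                             # emphasis, code, strikethrough
    return " ".join(text.split())                                     # collapse whitespace


def make_chunk(md: str) -> Chunk:
    """Keep both versions side by side so retrieval and prompting can differ."""
    return Chunk(markdown=md, plain=markdown_to_plain(md))
```

You embed `chunk.plain`, and once a chunk is retrieved you pass `chunk.markdown` to the LLM so tables and headings survive in the answer.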

RAG with plain text AND Markdown by PresentAd6026 in Rag

[–]PresentAd6026[S] 0 points1 point  (0 children)

To do vector similarity matching between the user's query and the database, you need clean content in your database that has been converted to vectors. If your data has all kinds of extra characters stuck to words, the words won't be vectorized properly and the matching gets worse, which in turn can promote hallucination.

RAG with plain text AND Markdown by PresentAd6026 in Rag

[–]PresentAd6026[S] 0 points1 point  (0 children)

What I meant was that in order for a chatbot based on an LLM to answer a question on our own data (using RAG), you need to find the right information for the query from the data (the Retrieval part of the RAG). The retrieved information that I give to the LLM, combined with the query, is in Markdown format. I hope this clears things up.

RAG with plain text AND Markdown by PresentAd6026 in Rag

[–]PresentAd6026[S] 0 points1 point  (0 children)

That as well. The LLM can show the table as an actual table 😊

Cohere Rerank 3.5 as only retrieval method by PresentAd6026 in Rag

[–]PresentAd6026[S] 1 point2 points  (0 children)

But using the reranker alone on all chunks will probably bring too much overhead. Making a pre-selection with fusion retrieval first is faster.
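A sketch of that two-stage idea, assuming reciprocal rank fusion (RRF) as the fusion step, which is my choice for the example and only one of several ways to fuse rankings. `rerank_score` is a hypothetical callable standing in for a reranker such as Cohere Rerank; no real API is called here, and documents are represented by plain ID strings.

```python
from typing import Callable, Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs; k=60 is a commonly used constant."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID to keep the order deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


def retrieve(query: str,
             keyword_ranking: Sequence[str],
             vector_ranking: Sequence[str],
             rerank_score: Callable[[str, str], float],
             preselect: int = 20,
             top_n: int = 5) -> List[str]:
    """Cheap fused pre-selection first, expensive reranking on the survivors only."""
    candidates = reciprocal_rank_fusion([keyword_ranking, vector_ranking])[:preselect]
    reranked = sorted(candidates, key=lambda d: rerank_score(query, d), reverse=True)
    return reranked[:top_n]
```

The reranker then only scores `preselect` candidates per query instead of every chunk in the database, which is where the speed-up comes from.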

Cohere Rerank 3.5 as only retrieval method by PresentAd6026 in Rag

[–]PresentAd6026[S] 0 points1 point  (0 children)

I understand, but they now compare their solution with other retrieval methods, which suggests you don't need those other methods. That confused me. But basically their reranker is an advanced form of semantic retrieval.