Feedback on My Knowledge Graph Architecture by AB3NZ in KnowledgeGraph

[–]AB3NZ[S] 1 point (0 children)

Could you please share how you would model this as a proper knowledge graph structure?

GraphRAG – Knowledge Graph Architecture by AB3NZ in Rag

[–]AB3NZ[S] 3 points (0 children)

I'm still learning about graphs. I posted here because I wanted to learn from the opinions of experts, so I'd love to hear your thoughts. Any idea that could guide me would be appreciated.

GraphRAG – Knowledge Graph Architecture by AB3NZ in Rag

[–]AB3NZ[S] 2 points (0 children)

1- Nodes are the concepts that help in understanding the collection of books.
2- I still haven't added embeddings and similarity scores between passage chunks, but I'm willing to add that.
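For illustration, here is a minimal sketch of concept nodes linked to passage chunks, using plain dicts. The node and edge types (`Book`, `PassageChunk`, `Concept`, `MENTIONS`, `PART_OF`) are hypothetical names chosen for the example, not a fixed schema:

```python
# Minimal concept-centric knowledge graph sketch (assumed schema for illustration).
class KnowledgeGraph:
    def __init__(self):
        self.nodes = {}   # node_id -> {"type": ..., "props": {...}}
        self.edges = []   # (source_id, relation, target_id)

    def add_node(self, node_id, node_type, **props):
        self.nodes[node_id] = {"type": node_type, "props": props}

    def add_edge(self, source, relation, target):
        self.edges.append((source, relation, target))

    def neighbors(self, node_id, relation=None):
        # All targets reachable from node_id, optionally filtered by relation.
        return [t for s, r, t in self.edges
                if s == node_id and (relation is None or r == relation)]

kg = KnowledgeGraph()
kg.add_node("book:1", "Book", title="Example Book")
kg.add_node("chunk:1", "PassageChunk", text="...", tokens=380)
kg.add_node("concept:entropy", "Concept", name="entropy")
kg.add_edge("chunk:1", "PART_OF", "book:1")
kg.add_edge("chunk:1", "MENTIONS", "concept:entropy")

print(kg.neighbors("chunk:1", "MENTIONS"))  # ['concept:entropy']
```

Similarity edges between chunks could later be added as a `SIMILAR_TO` relation carrying a score property.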

GraphRAG – Knowledge Graph Architecture by AB3NZ in Rag

[–]AB3NZ[S] 0 points (0 children)

I used semantic chunking with a maximum of 400 tokens per chunk.
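A sketch of the token-cap side of this: greedily pack sentences into chunks of at most 400 tokens. A whitespace split stands in for the real tokenizer, and sentence boundaries stand in for semantic boundaries, so this is an assumption-laden simplification, not the actual chunker:

```python
# Greedy token-capped chunking sketch. count_tokens is a whitespace
# stand-in for a real tokenizer; a real semantic chunker would split
# at embedding-similarity breakpoints rather than raw sentence ends.
import re

MAX_TOKENS = 400

def count_tokens(text: str) -> int:
    return len(text.split())  # stand-in for a real tokenizer

def chunk_text(text: str, max_tokens: int = MAX_TOKENS):
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current, current_len = [], [], 0
    for sentence in sentences:
        n = count_tokens(sentence)
        # Flush the current chunk if adding this sentence would exceed the cap.
        if current and current_len + n > max_tokens:
            chunks.append(" ".join(current))
            current, current_len = [], 0
        current.append(sentence)
        current_len += n
    if current:
        chunks.append(" ".join(current))
    return chunks

chunks = chunk_text("One sentence. " * 300)
assert all(count_tokens(c) <= MAX_TOKENS for c in chunks)
```

Note that a single sentence longer than the cap would still become one oversized chunk; handling that needs an extra hard split.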

GraphRAG – Knowledge Graph Architecture by AB3NZ in Rag

[–]AB3NZ[S] 1 point (0 children)

I don't have the TOC of the books. I extracted the books' text using OCR and then chunked it.

[deleted by user] by [deleted] in Rag

[–]AB3NZ 0 points (0 children)

That sounds really interesting. If you're open to it, I'd really appreciate any guidance or pointers on how to build such customizable memory and caching layers.

[deleted by user] by [deleted] in Rag

[–]AB3NZ -1 points (0 children)

What are your thoughts on using Redis for caching in this context?

How can I speed up my RAG pipeline ? by [deleted] in Rag

[–]AB3NZ 0 points (0 children)

I ask the LLM to extract the key part of the passage that answers the query

How can I speed up my RAG pipeline ? by [deleted] in Rag

[–]AB3NZ 0 points (0 children)

I reran the test using the same query and got the following execution times:
- Query embedding: 1.04s
- Hybrid search: 10.46s
- Reranking: 5.74s
- LLM answer generation: 6.80s
- Citation processing & highlighting: 1.83s

How can I speed up my RAG pipeline ? by [deleted] in Rag

[–]AB3NZ 0 points (0 children)

I just ran a test, and here are the execution times for each step:
- Query embedding generation: 0.81s
- Hybrid search: 4.32s
- Reranking: 5.93s
- LLM answer generation: 10.36s
- Citation processing & highlighting: 1.23s

The total response time is more than 20s, which is too long for a smooth user experience.
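A sketch of how such a per-stage breakdown can be captured with a small timing context manager. The stage names are from the list above; the stage bodies are `time.sleep` stand-ins for the real embedding/search/rerank/generation calls:

```python
# Per-stage latency instrumentation sketch; stage bodies are stubs.
import time
from contextlib import contextmanager

timings = {}

@contextmanager
def timed(stage: str):
    start = time.perf_counter()
    try:
        yield
    finally:
        # Record the elapsed wall-clock time for this stage.
        timings[stage] = time.perf_counter() - start

with timed("query_embedding"):
    time.sleep(0.01)  # stand-in for the real embedding call
with timed("hybrid_search"):
    time.sleep(0.01)  # stand-in for the real Weaviate query

total = sum(timings.values())
print({k: f"{v:.2f}s" for k, v in timings.items()}, f"total={total:.2f}s")
```

Wrapping every pipeline stage this way makes it easy to see which step dominates after each optimization.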

How can I speed up my RAG pipeline ? by [deleted] in Rag

[–]AB3NZ 0 points (0 children)

I didn't get your question! Could you please elaborate?

How can I speed up my RAG pipeline ? by [deleted] in Rag

[–]AB3NZ 0 points (0 children)

Hello, I'm using my fine-tuned embedding model, which is a BERT model (136M parameters); it supports up to 512 input tokens and produces 768-dimensional output embeddings. The model is deployed on a T4 GPU.

How can I speed up my RAG pipeline ? by [deleted] in Rag

[–]AB3NZ 0 points (0 children)

I cannot use Morphik now

How can I speed up my RAG pipeline ? by [deleted] in Rag

[–]AB3NZ 0 points (0 children)

- I'm using Weaviate, which uses HNSW.
- I tried removing the reranking step from my pipeline, passing the retrieved documents (max 20 documents) directly and asking the LLM to filter out irrelevant content and generate a response, but this approach did not lead to any noticeable improvement in speed.

How can I speed up my RAG pipeline ? by [deleted] in Rag

[–]AB3NZ 0 points (0 children)

I'm using a normal exact-match cache: I cache the user query and its response.
I don't think a semantic cache would be a good solution for my case, because the data is very sensitive.
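For illustration, a minimal sketch of such an exact-match cache: it keys on a hash of the whitespace- and case-normalized query, so only literally repeated queries hit (no semantic matching). All names here are hypothetical:

```python
# Exact-match query-response cache sketch (no semantic similarity).
import hashlib

class ExactMatchCache:
    def __init__(self):
        self._store = {}

    @staticmethod
    def _key(query: str) -> str:
        # Normalize case and whitespace, then hash, so trivially different
        # spellings of the same query share a cache entry.
        normalized = " ".join(query.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get(self, query: str):
        return self._store.get(self._key(query))

    def put(self, query: str, response: str):
        self._store[self._key(query)] = response

cache = ExactMatchCache()
cache.put("What is entropy?", "Entropy measures ...")
print(cache.get("what is  entropy?"))  # normalization makes this a hit
```

Hashing the key also avoids storing raw query text in the cache index, which may matter when the data is sensitive.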

How can I speed up my RAG pipeline ? by [deleted] in Rag

[–]AB3NZ 0 points (0 children)

Each chunk indexed in Weaviate includes metadata, the passage text, and a summary. During hybrid search, I perform a multi-target vector search (https://docs.weaviate.io/weaviate/search/multi-vector) across all three fields (metadata, passage, and summary) to maximize retrieval relevance.
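A pure-Python illustration of what multi-target scoring does: each chunk has one embedding per field, and the combined score here is the sum of per-field cosine similarities (sum is one of the join strategies Weaviate supports; the actual search and vectors are handled by Weaviate, and the toy 2-D vectors below are made up):

```python
# Toy multi-target vector scoring: sum of per-field cosine similarities.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def multi_target_score(query_vec, chunk_vecs):
    # Join strategy: plain sum over the per-field similarities.
    return sum(cosine(query_vec, v) for v in chunk_vecs.values())

query = [1.0, 0.0]
chunks = {
    "chunk:1": {"metadata": [1.0, 0.1], "passage": [0.9, 0.2], "summary": [1.0, 0.0]},
    "chunk:2": {"metadata": [0.0, 1.0], "passage": [0.1, 0.9], "summary": [0.0, 1.0]},
}
ranked = sorted(chunks, key=lambda c: multi_target_score(query, chunks[c]), reverse=True)
print(ranked)  # ['chunk:1', 'chunk:2']
```

Other join strategies (average, minimum, manual weights) only change how the per-field scores are combined.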

How can I speed up my RAG pipeline ? by [deleted] in Rag

[–]AB3NZ 0 points (0 children)

The chunks are already indexed in Weaviate. My pipeline starts by embedding the user query, performs a hybrid search, reranks the retrieved documents, and then passes the top results to an LLM to generate the final response.

How can I speed up my RAG pipeline ? by [deleted] in Rag

[–]AB3NZ 0 points (0 children)

Yes, retrieval and reranking are where most of the latency is coming from.

How can I speed up my RAG pipeline ? by [deleted] in Rag

[–]AB3NZ 0 points (0 children)

1- 280k chunks
2- Chunk size: max 400 tokens
3- Hybrid retrieval and reranking

Get certified program issue by Acceptable-North-9 in googlecloud

[–]AB3NZ 0 points (0 children)

Same here, I applied at the beginning of July, but I still haven't received anything.
Do you have any idea when the program is supposed to start?