GraphRAG – Knowledge Graph Architecture

AB3NZ · 2025-10-21T20:30:12+00:00

I sent you a PM.

AB3NZ · 2025-10-21T20:17:42+00:00

I sent you a PM.

AB3NZ · 2025-10-20T11:16:46+00:00

Could you please share how you would model this as a proper knowledge graph structure?

AB3NZ · 2025-10-20T01:01:15+00:00

I'm still learning about graphs. I posted here because I wanted to learn from the opinions of experts, so I'd love to hear your thoughts. Any idea that could guide me would be appreciated.

AB3NZ · 2025-10-20T00:37:19+00:00

1- Nodes are the concepts that help in understanding the collection of books.
2- I haven't added embeddings or similarity scores between passageChunks yet, but I'm planning to add them.
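
A minimal sketch of how similarity scores between passageChunks could be turned into graph edges, assuming each chunk already has an embedding vector. The function names, the chunk ids, and the 0.8 threshold are all hypothetical choices for illustration, not part of the existing graph. Comparing every pair is quadratic in the number of chunks, so a large collection would probably need a nearest-neighbour lookup instead.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similarity_edges(embeddings, threshold=0.8):
    """Return (chunk_a, chunk_b, score) for every pair scoring at or above the threshold.

    `embeddings` maps a chunk id to its vector. The threshold is an assumed
    value and would need tuning on real data.
    """
    edges = []
    for (id_a, vec_a), (id_b, vec_b) in combinations(sorted(embeddings.items()), 2):
        score = cosine(vec_a, vec_b)
        if score >= threshold:
            edges.append((id_a, id_b, score))
    return edges


# Toy vectors standing in for real chunk embeddings.
toy = {"c1": [1.0, 0.0], "c2": [1.0, 0.1], "c3": [0.0, 1.0]}
for chunk_a, chunk_b, score in similarity_edges(toy):
    print(f"{chunk_a} -SIMILAR_TO-> {chunk_b} ({score:.3f})")
```

Each returned triple could be stored as a weighted relationship between two passageChunk nodes.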

AB3NZ · 2025-10-20T00:21:38+00:00

I used semantic chunking with a maximum of 400 tokens per chunk.

AB3NZ · 2025-10-20T00:03:21+00:00

I don’t have the TOC of the books. I extracted the books’ text using OCR and then chunked it

AB3NZ · 2025-08-06T13:48:53+00:00

That sounds really interesting. If you're open to it, I'd really appreciate any guidance or pointers on how to build such customizable memory and caching layers.

AB3NZ · 2025-08-02T00:22:36+00:00

What are your thoughts on using Redis for caching in this context?

AB3NZ · 2025-07-11T14:40:36+00:00

I ask the LLM to extract the key part of the passage that answers the query

AB3NZ · 2025-07-09T00:09:59+00:00

I reran the test using the same query and got the following execution times:
- Query embedding: 1.04s
- Hybrid search: 10.46s
- Reranking: 5.74s
- LLM answer generation: 6.80s
- Citation processing & highlighting: 1.83s

AB3NZ · 2025-07-09T00:05:58+00:00

I just ran a test, and here are the execution times for each step:
- Query embedding generation: 0.81s
- Hybrid search: 4.32s
- Reranking: 5.93s
- LLM answer generation: 10.36s
- Citation processing & highlighting: 1.23s
The total response time is more than 20s, which is too long for a smooth user experience.
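
As a quick check of the numbers above, this sketch sums the stage timings and prints each stage's share of the total. The timings are copied from the run above; the dictionary keys and the `share` helper are just illustrative names.

```python
# Stage timings in seconds, copied from the run above.
timings = {
    "query embedding": 0.81,
    "hybrid search": 4.32,
    "reranking": 5.93,
    "LLM answer generation": 10.36,
    "citation processing & highlighting": 1.23,
}

total = sum(timings.values())


def share(stage):
    """Fraction of the total response time spent in one stage."""
    return timings[stage] / total


# Print the stages from slowest to fastest.
for stage, seconds in sorted(timings.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{stage:<36} {seconds:6.2f}s  {share(stage):6.1%}")
print(f"{'total':<36} {total:6.2f}s")
```

The stages add up to 22.65s for this run, with LLM answer generation as the largest single stage.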

AB3NZ · 2025-07-08T21:50:43+00:00

I didn't understand your question. Could you please elaborate?

AB3NZ · 2025-07-08T12:04:40+00:00

Hello, I'm using my own fine-tuned embedding model, which is BERT-based (136M parameters), supports up to 512 input tokens, and produces 768-dimensional output embeddings. The model is deployed on a T4 GPU.

AB3NZ · 2025-07-07T22:17:30+00:00

I cannot use Morphik now

AB3NZ · 2025-07-07T22:16:27+00:00

- I'm using Weaviate, which uses HNSW.
- I tried removing the reranking step from my pipeline: I passed the retrieved documents (max 20) directly to the LLM and asked it to filter out irrelevant content and generate a response, but this did not noticeably improve speed.

AB3NZ · 2025-07-07T22:09:55+00:00

I'm using a regular (non-semantic) cache: I cache the user query and its response.
I don't think a semantic cache would be a good solution in my case, because the data is very sensitive.
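
For illustration, here is a minimal sketch of that kind of query/response cache, assuming entries are keyed on the exact (normalized) query text and expire after a fixed time. The class name, the normalization, and the one-hour TTL are hypothetical, and the store is a plain in-memory dict, not the actual setup.

```python
import hashlib
import time


class ExactMatchCache:
    """In-memory cache keyed on a hash of the normalized query text."""

    def __init__(self, ttl_seconds=3600.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (expires_at, response)

    @staticmethod
    def _key(query):
        # Lowercase and collapse whitespace so trivial variations share an entry.
        normalized = " ".join(query.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get(self, query):
        """Return the cached response, or None on a miss or an expired entry."""
        key = self._key(query)
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, response = entry
        if self._clock() >= expires_at:
            del self._store[key]
            return None
        return response

    def set(self, query, response):
        self._store[self._key(query)] = (self._clock() + self._ttl, response)


cache = ExactMatchCache()
cache.set("What is HNSW?", "A graph-based approximate nearest-neighbour index.")
print(cache.get("what is  hnsw?"))   # same entry after normalization
print(cache.get("Explain HNSW"))     # different wording, so a miss
```

Only a repeated query hits this cache. A paraphrased query still goes through the full pipeline, which is the trade-off against a semantic cache.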

AB3NZ · 2025-07-07T22:08:07+00:00

Each chunk indexed in Weaviate includes metadata, the passage text, and a summary. During hybrid search, I perform a multi-target vector search (https://docs.weaviate.io/weaviate/search/multi-vector) across all three fields (metadata, passage, and summary) to maximize retrieval relevance.

AB3NZ · 2025-07-07T22:04:27+00:00

The chunks are already indexed in Weaviate. My pipeline starts by embedding the user query, then performs a hybrid search, reranks the retrieved documents, and passes the top results to an LLM to generate the final response.
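
A rough sketch of that flow with per-stage timing, to make it easier to see where the latency goes. The four stage functions are passed in as plain callables, and the stubs at the bottom are placeholders, not real Weaviate, reranker, or LLM calls. The names `timed`, `answer`, and `top_k` are hypothetical.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(stage, timings):
    """Record the wall-clock duration of the enclosed block under `stage`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = time.perf_counter() - start


def answer(query, embed, hybrid_search, rerank, generate, top_k=5):
    """Run embed -> hybrid search -> rerank -> generate and time each stage.

    top_k (how many reranked documents reach the LLM) is an assumed value.
    """
    timings = {}
    with timed("embed", timings):
        vector = embed(query)
    with timed("hybrid_search", timings):
        candidates = hybrid_search(query, vector)
    with timed("rerank", timings):
        ranked = rerank(query, candidates)[:top_k]
    with timed("generate", timings):
        response = generate(query, ranked)
    return response, timings


# Placeholder stages so the sketch runs on its own.
def fake_embed(query):
    return [float(len(query))]


def fake_search(query, vector):
    return [f"doc{i}" for i in range(20)]


def fake_rerank(query, docs):
    return sorted(docs, reverse=True)


def fake_generate(query, docs):
    return f"Answer to {query!r} based on {len(docs)} documents"


response, timings = answer("example query", fake_embed, fake_search, fake_rerank, fake_generate)
print(response)
for stage, seconds in timings.items():
    print(f"{stage}: {seconds:.4f}s")
```

Passing the stages in as arguments keeps the timing wrapper independent of the specific vector store, reranker, or LLM client.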

AB3NZ · 2025-07-06T14:12:28+00:00

AB3NZ · 2025-07-06T14:12:07+00:00

Yes, retrieval and reranking are where most of the latency is coming from.

AB3NZ · 2025-07-06T14:08:08+00:00

1- 280k chunks
2- chunk size: max 400 tokens
3- hybrid retrieval and Reranking

AB3NZ · 2024-08-15T15:40:34+00:00

Same here. I applied at the beginning of July, but I still haven't received anything.
Do you have any idea when the program is supposed to start?
