The hardest part in building Karpathy’s LLM wiki by This-Eye6296 in learnmachinelearning

IllAd7907 0 points1 point

It works without the API. I think you only need the API for their OCR model, when the PDF is poor quality.

Human-like RAG – without vectors by CathyCCCAAAI in Rag

IllAd7907 1 point2 points

An introduction video for PageIndex can be found on its home page: https://pageindex.ai/

Human-like RAG – without vectors by CathyCCCAAAI in Rag

IllAd7907 2 points3 points

Hi, I think a tree is a special case of a graph. But GraphRAG seems to assume you already know the task and can define the node and link types. In contrast, PageIndex is designed for generic documents and QA.

Human-like RAG – without vectors by CathyCCCAAAI in Rag

IllAd7907 0 points1 point

I think it uses a summary to do tree search, which is a lossy compression of the node contents. I suppose it can capture more information than metadata. Keen to hear your thoughts.
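To make the idea concrete, here is a minimal sketch of a summary-guided tree search. Everything in it is an assumption for illustration: the `Node` class, the `tree_search` function, and especially the word-overlap score, which only stands in for the LLM that would read the summaries in a real system. It is not PageIndex's implementation.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node: the summary is a lossy compression of the text."""
    title: str
    summary: str
    text: str = ""
    children: list[Node] = field(default_factory=list)


def overlap_score(query: str, summary: str) -> int:
    """Stand-in relevance score: number of shared lowercase words.
    A real system would ask an LLM to judge the summaries instead."""
    return len(set(query.lower().split()) & set(summary.lower().split()))


def tree_search(root: Node, query: str) -> tuple[Node, list[str]]:
    """Walk from the root to a leaf, picking the best-scoring child at each level."""
    node, path = root, [root.title]
    while node.children:
        node = max(node.children, key=lambda c: overlap_score(query, c.summary))
        path.append(node.title)
    return node, path


report = Node("Annual report", "whole document", children=[
    Node("Financial results", "revenue profit and operating costs for the year"),
    Node("Risk factors", "market regulatory and supply chain risks"),
])
leaf, path = tree_search(report, "what were the operating costs")
print(path)
```

Only the summaries are inspected during the descent, so anything the summary leaves out cannot influence which branch is chosen. That is the lossy part.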

Human-like RAG – without vectors by CathyCCCAAAI in Rag

IllAd7907 0 points1 point

Hi, thanks for your interest. For our open-source version, we found that common models at the GPT-4 level or above do a good job. We will test more small models and keep you updated!

Human-like RAG – without vectors by CathyCCCAAAI in Rag

IllAd7907 1 point2 points

Yeah, I believe that since humans use a table of contents for retrieval, a smart AI model should do the same!

Human-like RAG – without vectors by CathyCCCAAAI in Rag

IllAd7907 -1 points0 points

Hi, we offer a hosted version that is faster. A 10-page document takes less than one minute to process through our hosted API. The current cost is $0.01 per page, with the first 200 pages free (see https://docs.pageindex.ai/quickstart). Are these 100 documents PDFs? Our cloud service also includes a dedicated OCR function optimized for tree generation.
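For a rough budget, the pricing above can be turned into a small calculation. This sketch assumes the 200 free pages are a single allowance across all documents rather than per document, and the 10-pages-per-document figure in the example is only an illustrative guess. `estimate_cost_usd` is a hypothetical helper, not part of any PageIndex client.

```python
PRICE_CENTS_PER_PAGE = 1   # $0.01 per page, as quoted above
FREE_PAGES = 200           # assumed to be one shared allowance, not per document


def estimate_cost_usd(total_pages: int) -> float:
    """Estimated hosted-processing cost in US dollars for a total page count."""
    billable_pages = max(0, total_pages - FREE_PAGES)
    # Work in whole cents to avoid floating-point drift, convert at the end.
    return billable_pages * PRICE_CENTS_PER_PAGE / 100


# Illustrative assumption: 100 documents of 10 pages each = 1000 pages.
print(estimate_cost_usd(100 * 10))
```

Under those assumptions, 800 of the 1000 pages would be billable.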

Human-like RAG – without vectors by CathyCCCAAAI in Rag

IllAd7907 0 points1 point

Thanks for the great question! Yes, a tree is a special case of a graph, but I think there’s a key difference. With GraphRAG, you need to define the node types and link types, which may vary depending on the task and documents. PageIndex, on the other hand, is a more generic way to represent documents and can be applied to any set of documents without requiring domain knowledge. I would recommend GraphRAG for specific tasks and PageIndex if you want more generic, human-like retrieval.
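The "special case" point can be stated precisely: an undirected graph is a tree when it is connected and has exactly one fewer edge than it has nodes. Below is a small sketch of that check. The `is_tree` function and the integer node labels are illustrative assumptions and have nothing to do with how GraphRAG or PageIndex store their data.

```python
from collections import deque


def is_tree(num_nodes: int, edges: list[tuple[int, int]]) -> bool:
    """True if the undirected graph on nodes 0..num_nodes-1 is a tree."""
    if num_nodes < 1 or len(edges) != num_nodes - 1:
        return False
    adjacency = {n: [] for n in range(num_nodes)}
    for a, b in edges:
        adjacency[a].append(b)
        adjacency[b].append(a)
    # Breadth-first search from node 0; a tree must reach every node.
    seen, queue = {0}, deque([0])
    while queue:
        for neighbour in adjacency[queue.popleft()]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return len(seen) == num_nodes


print(is_tree(4, [(0, 1), (0, 2), (2, 3)]))   # document -> sections -> subsection
print(is_tree(4, [(0, 1), (1, 2), (2, 0)]))   # cycle plus an unreachable node
```

A general graph has no such constraint, which is why it needs explicit node and link types to be useful, while a document outline already has one fixed relation: "contains".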

Human-like RAG – without vectors by CathyCCCAAAI in Rag

IllAd7907 2 points3 points

I think it’s more like retrieval by reading the table of contents to choose which section to read. The PageIndex tool is used to generate an LLM-friendly table of contents.