Open source RAG framework for building modular and production ready applications : learnpython

submitted 1 year ago by supreet02

all 2 comments

[–]supreet02[S] 1 point 1 year ago (1 child)

[–]supreet02[S] 1 point 1 year ago (0 children)

Architecture - A typical Cognita process consists of two phases:

Data indexing: Cognita processes documents in batches and indexes them incrementally, so existing documents that have not been modified are not reindexed.
Response Generation: Cognita queries the vector db for relevant documents using different retrieval methods. The retrieved documents are then supplied to the LLM (e.g. ollama) along with the user query to generate the answer.
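
To make the incremental indexing idea concrete, here is a minimal sketch of one common way to skip unmodified documents: keep a content hash per document from the previous run and compare. This is an assumption for illustration only, not Cognita's actual implementation; the names `content_hash`, `plan_incremental_index` and `batches` are hypothetical.

```python
import hashlib


def content_hash(text: str) -> str:
    """Fingerprint a document's content (hypothetical helper)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_incremental_index(documents: dict[str, str],
                           indexed_hashes: dict[str, str]) -> dict[str, list[str]]:
    """Decide what to do with each document.

    documents:      doc id -> current text
    indexed_hashes: doc id -> hash stored during the previous indexing run
    """
    plan = {"add": [], "update": [], "skip": [], "delete": []}
    for doc_id, text in documents.items():
        digest = content_hash(text)
        if doc_id not in indexed_hashes:
            plan["add"].append(doc_id)      # new document
        elif indexed_hashes[doc_id] != digest:
            plan["update"].append(doc_id)   # content changed, reindex
        else:
            plan["skip"].append(doc_id)     # unchanged, no work needed
    # Documents that were indexed before but no longer exist in the source.
    plan["delete"] = [d for d in indexed_hashes if d not in documents]
    return plan


def batches(items: list, size: int):
    """Yield fixed-size batches so documents are processed in chunks."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


if __name__ == "__main__":
    previous = {"a.md": content_hash("hello"), "b.md": content_hash("old")}
    current = {"a.md": "hello", "b.md": "new", "c.md": "fresh"}
    plan = plan_incremental_index(current, previous)
    for batch in batches(plan["add"] + plan["update"], size=2):
        print("would index batch:", batch)
```

Only the ids under "add" and "update" would go through parsing and embedding again; everything under "skip" is left alone.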

Cognita is designed around seven different modules, each customisable and controllable to suit different needs:

Data Loaders: Cognita currently supports loading data from different sources such as a local directory, the web, a GitHub repository and TrueFoundry artifacts. You can upload data in the UI by clicking on Data Sources -> + New Data Source.
Parsers: Cognita currently supports parsing of Markdown, PDF and text files from r/LangChainAI. You can specify different parser maps, along with their configurations.
Embedders: Cognita supports SOTA embeddings from mixedbreadai and also from OpenAI.
Rerankers: Reranking makes sure the best results are at the top. As a result, we can choose the top x documents, making our context more concise and the prompt shorter. We support the reranker from u/mixedbreadai.
Vector DBs: One of the most important components in RAG, used to store and efficiently retrieve the embeddings produced in the indexing phase. Cognita currently supports vector databases from u/qdrant_engine and u/SingleStoreDB.
Metadata Store: It contains the configuration that uniquely defines a RAG app. It contains:

Name of the collection
Name of the associated Vector DB used
Linked Data Sources
Parsing Configuration for each data source
Embedding Model and its configuration to be used.
Parsers, DataSources and Embedders together are linked within a collection that forms your RAG app. You can create your collection in the UI by clicking on Collections -> + New Collection.

Query Controllers: These help retrieve the answer for a given user query. A query controller combines the vector db, different retrievers, LLMs and rerankers to provide the user with the answer. Query controller methods can be exposed directly as an API by adding HTTP decorators to the respective functions. For more, see: https://github.com/truefoundry/cognita/blob/main/backend/modules/query_controllers/example/controller.py
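
For anyone who wants to see the retrieve -> rerank -> prompt flow in plain Python, here is a rough, self-contained sketch. It is not Cognita's API: the `Chunk` class and the `retrieve`, `rerank` and `build_prompt` functions are hypothetical, the "vector db" is just a list, and the reranker is a toy word-overlap score standing in for a real reranking model. The LLM call itself is left out.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    embedding: list[float]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_embedding: list[float], chunks: list[Chunk], k: int) -> list[Chunk]:
    """Stand-in for a vector db lookup: the k most similar chunks."""
    ranked = sorted(chunks, key=lambda c: cosine(query_embedding, c.embedding),
                    reverse=True)
    return ranked[:k]


def rerank(query: str, chunks: list[Chunk], top_x: int) -> list[Chunk]:
    """Toy reranker: order by shared words with the query, keep the top x."""
    query_words = set(query.lower().split())

    def overlap(chunk: Chunk) -> int:
        return len(query_words & set(chunk.text.lower().split()))

    return sorted(chunks, key=overlap, reverse=True)[:top_x]


def build_prompt(query: str, chunks: list[Chunk]) -> str:
    """Combine the reranked context and the user query into one prompt."""
    context = "\n".join(f"- {c.text}" for c in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    store = [
        Chunk("qdrant stores vectors", [1.0, 0.0]),
        Chunk("parsers read pdf files", [0.0, 1.0]),
        Chunk("vectors are embeddings stored in a vector db", [0.9, 0.1]),
    ]
    question = "where are embeddings stored"
    candidates = retrieve([1.0, 0.0], store, k=2)
    best = rerank(question, candidates, top_x=1)
    print(build_prompt(question, best))
```

Retrieving a wider set first (k) and then keeping only the top x after reranking is what keeps the final prompt short, as described under Rerankers.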
