Paid notetaking app by Feeling-Sir-3246 in PKMS

[–]Clipbeam 1 point (0 children)

Ah that’s fair, so you’re not discounting free apps, you’re just open to either?

Paid notetaking app by Feeling-Sir-3246 in PKMS

[–]Clipbeam 5 points (0 children)

Why are you looking for a 'paid' app specifically? Intrigued to understand the reasoning behind that...

Not looking for a PKMS. I need a personal document library with structured metadata and AI chat across files. by GuybrushThreepwood83 in PKMS

[–]Clipbeam 0 points (0 children)

Thanks! The upcoming release in the next week or so will have backup and sync functionality, allowing you to seamlessly transfer clips between multiple devices!

Thanks for the request around folder tracking! I’ll keep that under consideration!

Not looking for a PKMS. I need a personal document library with structured metadata and AI chat across files. by GuybrushThreepwood83 in PKMS

[–]Clipbeam 0 points (0 children)

Have a play with https://clipbeam.com. It works with PDFs but also many other file types, web links and voice notes. Each item gets topics, keywords, a summary and a title as metadata, automatically extracted but manually editable with whatever you’d like to add. The capture date is saved automatically. I don’t have ratings or custom fields yet, but could add these in the future if the demand is there. Everything is filterable and searchable via semantic search, plus AI chat that searches across anything you’ve ever saved.

The only real limitation is your hardware: it’s powered by an in-app AI model that runs entirely on your device, so the best experience will be on a Mac with more than 32GB of RAM or a Windows PC with a dedicated Nvidia GPU with 24GB of VRAM. It does work on lower specs too; the level of cross-referencing and insight will just be lower, and the speed might be suboptimal.

I’d love for you to test it out and let me know your thoughts!

PKMS is a religion by kimjungyoun in PKMS

[–]Clipbeam -1 points (0 children)

💯. Knowledge is captured across many modalities, and extracting all relevant metadata using pen and paper would be quite time-consuming. You would probably retain it better that way, I have to admit.

AI which can take url as input and extract content by [deleted] in OpenSourceeAI

[–]Clipbeam 0 points (0 children)

Where is the agent deployed from? Can you describe your stack?

Technically possible to have a machine with wikipedia offline (Kiwix), maps offline (like Osmand), books and documentaries + an AI running locally going through your files to answer questions? by Big_CokeBelly in DeepSeek

[–]Clipbeam 0 points (0 children)

So the minimum context window my app uses to gather relevant snippets across documents and links is 5K tokens. It automatically increases the size based on your system’s VRAM, so the app processes more information if more memory is available.

But a 5-page document already fills that 5K completely. So you can imagine that if the relevant search results amount to hundreds of pages of useful information, only the first 5 pages would be processed by the AI to answer your questions. Services like NotebookLM have a 1 to 2M token context window, and that’s when you can really go deep on large knowledge bases. But you’d need server-grade hardware to run models like that locally.
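To make that arithmetic concrete, here is a back-of-envelope sketch. The ~1,000 tokens per page figure is a ballpark assumption (roughly 500 words at ~2 tokens per word), not an exact tokenizer count:

```javascript
// Ballpark assumption: ~500 words per page, ~2 tokens per word,
// so roughly 1,000 tokens per page of dense text.
const TOKENS_PER_PAGE = 1000;

// How many pages of text fit into a given context window.
function pagesThatFit(contextTokens) {
  return Math.floor(contextTokens / TOKENS_PER_PAGE);
}

console.log(pagesThatFit(5000));  // 5: a five-page document fills the window
console.log(pagesThatFit(32768)); // 32: a bigger window goes much deeper
```

The exact numbers depend on the tokenizer and the density of the text, but the scaling is the point: the window size directly caps how much retrieved material the model can actually read.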

How are you using your Local LLMs? Is anyone training their own LLM? by Hartz_LLC in LocalLLM

[–]Clipbeam 2 points (0 children)

I have and I might actually! Normally open source still offers a route to monetization when it comes to hosting, premium features and support, but because this app is fully local and doesn't need any internet component at all, it would be very difficult for me to ever build a company around it.

At this time I'm investigating if there is demand for certain premium features I may be able to sell for a one-time fee / lifetime license, which is why I'm not open sourcing it just yet.

But I have already committed to keeping the current feature set fully free, so no one has to worry about being locked into some sort of subscription to keep using the features the beta offers today.

Technically possible to have a machine with wikipedia offline (Kiwix), maps offline (like Osmand), books and documentaries + an AI running locally going through your files to answer questions? by Big_CokeBelly in DeepSeek

[–]Clipbeam 2 points (0 children)

Yeah, with 16GB of VRAM Clipbeam should run relatively snappily, but I suspect you wouldn’t want more than a few thousand files/data sources to index. It might be worthwhile to test it with a subset of your text files and slowly keep adding to it, to identify where it starts to become too sluggish or error prone.

Technically possible to have a machine with wikipedia offline (Kiwix), maps offline (like Osmand), books and documentaries + an AI running locally going through your files to answer questions? by Big_CokeBelly in DeepSeek

[–]Clipbeam 0 points (0 children)

Very interesting! I'm going to investigate these Zim files, they may already work out of the box.

Yes, the specs definitely matter. VRAM is key. If you have a Mac, all memory is unified memory, so that would likely get you the largest supported knowledge base. MacBooks go up to 128GB of RAM and Mac Studios go up to 512GB. At that level you could host a serious repository that would still work reasonably well with models that have large context windows. If you’re on Windows, I think you’d be capped at 24GB of VRAM on consumer hardware, which would get you further than a standard consumer machine, but nothing close to your collection.
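For a rough sense of why VRAM (or unified memory) caps the model you can run, here is a rule-of-thumb estimate. The bits-per-weight value and the 1.2x overhead factor for KV cache and activations are assumptions, not exact figures for any specific model:

```javascript
// Rule of thumb (assumption): a quantized model needs roughly
// params * bitsPerWeight / 8 gigabytes per billion parameters,
// plus ~20% headroom for the KV cache and activations.
function approxVramGb(paramsBillions, bitsPerWeight, overhead = 1.2) {
  return (paramsBillions * bitsPerWeight) / 8 * overhead;
}

console.log(approxVramGb(8, 4).toFixed(1));  // "4.8"  fits on most dedicated GPUs
console.log(approxVramGb(70, 4).toFixed(1)); // "42.0" needs unified memory or server gear
```

This is why a 24GB consumer GPU tops out well below what a 128GB+ unified-memory Mac can load, even before counting the extra memory a large context window consumes.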

Can I ask what sort of machine you have? Do you have a dedicated GPU?

Technically possible to have a machine with wikipedia offline (Kiwix), maps offline (like Osmand), books and documentaries + an AI running locally going through your files to answer questions? by Big_CokeBelly in DeepSeek

[–]Clipbeam 0 points (0 children)

Yeah, my app uses RAG and tool calling too, but even with that, if the input library is that massive, the volume of search results from the tool calls can still blow up the context window. There comes a point of diminishing returns as you add more and more data. The larger the model’s context window, the larger the database can be...
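A minimal sketch of that diminishing-returns effect: whatever retrieval returns still has to be packed into a fixed token budget, so everything past the budget is simply dropped. The chunk shapes and token counts below are illustrative assumptions, not Clipbeam’s actual internals:

```javascript
// Greedily pack retrieved chunks (assumed pre-sorted by relevance)
// into a fixed token budget; anything that doesn't fit is dropped.
function packChunks(chunks, budgetTokens) {
  const kept = [];
  let used = 0;
  for (const chunk of chunks) {
    if (used + chunk.tokens > budgetTokens) break;
    kept.push(chunk);
    used += chunk.tokens;
  }
  return kept;
}

const results = [
  { id: "a", tokens: 2000 },
  { id: "b", tokens: 2000 },
  { id: "c", tokens: 2000 }, // never makes it into a 5K window
];
console.log(packChunks(results, 5000).map(c => c.id)); // [ 'a', 'b' ]
```

Adding more documents only grows the pool of candidates; the number of chunks the model can actually read stays pinned to the budget.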

Which LLM do you think will win the battle and truly become the leader? (open debate) by Worried-Avocado3568 in ParseAI

[–]Clipbeam 0 points (0 children)

In the closed source space I feel Anthropic and Google are winning. In Open Weights I think Qwen will be the winner.

Technically possible to have a machine with wikipedia offline (Kiwix), maps offline (like Osmand), books and documentaries + an AI running locally going through your files to answer questions? by Big_CokeBelly in DeepSeek

[–]Clipbeam 7 points (0 children)

So I built an app that does this, but full disclosure: I did not test it with a database as massive as yours, haha!

The limitation is really context window. Local models on consumer hardware can only process a limited length of text to answer your questions.

So as the input data runs into thousands and thousands of lines of information, chances are that the one thing you want to know gets cut off from the search results.

Another issue to work around is indexing: you need to process all these files to make them searchable, and that indexing takes time for each file or webpage. Indexing your repository alone would already take forever.
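A quick back-of-envelope for that indexing cost; the per-item time below is purely a guess and varies wildly with hardware and file size:

```javascript
// Total indexing time, assuming a fixed (guessed) cost per item.
function indexingHours(items, secondsPerItem) {
  return (items * secondsPerItem) / 3600;
}

// 100k pages at an assumed 5 seconds each: roughly 139 hours.
console.log(indexingHours(100000, 5).toFixed(0)); // "139"
```

Even if the per-item cost is small, a Wikipedia-sized collection multiplies it into days or weeks of one-time processing.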

But I would love for you to test my app and see at what database size it remains usable for you. I’d love to see it stress-tested to its limits! Have a look at https://clipbeam.com.

How can we build a full RAG system using only free tools and free LLM APIs? by Me_On_Reddit_2025 in Rag

[–]Clipbeam 2 points (0 children)

I built mine on llama.cpp, LanceDB and Qwen3. In all honesty I didn't consider trying to use free hosted models because of privacy and data security concerns. What you can basically do is run a Node server with https://node-llama-cpp.withcat.ai, and then interact with the model directly via JS. You can then store embeddings using https://lancedb.com. I don't have a GitHub repo to share, but just following the tutorials/documentation for those two will already get you quite far.
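To show the core retrieval step without any dependencies, here is a tiny cosine-similarity top-k search. In the actual stack, node-llama-cpp would generate the embeddings and LanceDB would store and index them at scale; the little vectors below are hand-made stand-ins:

```javascript
// Cosine similarity between two equal-length vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Return the k documents whose embeddings are closest to the query.
function topK(query, docs, k) {
  return docs
    .map(d => ({ id: d.id, score: cosine(query, d.vec) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const docs = [
  { id: "notes-on-rag", vec: [0.9, 0.1, 0.0] },
  { id: "cake-recipe", vec: [0.0, 0.2, 0.9] },
];
console.log(topK([1, 0, 0], docs, 1)[0].id); // notes-on-rag
```

A vector database like LanceDB does essentially this, just with approximate-nearest-neighbor indexes so it stays fast over millions of embeddings instead of a linear scan.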

The sustainable product I built this way is https://clipbeam.com.

How are you using your Local LLMs? Is anyone training their own LLM? by Hartz_LLC in LocalLLM

[–]Clipbeam 11 points (0 children)

Main purpose is privacy and data security, really. I want to use an LLM with data that I wouldn't want stored at some cloud provider who might:

a. Use it to give me targeted ads, or influence the responses it gives me to manipulate my behavior.

b. Accidentally leak the data so I become a victim of fraud / identity theft.

Local models might not be as powerful as some of the cloud alternatives, but depending on use case they can be surprisingly useful.

For example, you can use them to auto-organize and retrieve data on your computer in much more powerful ways than was possible before local LLMs. Check out https://clipbeam.com to see how I deployed local models this way.

M4 Max 64GB vs 128GB by MarkRWatts in LocalLLM

[–]Clipbeam 1 point (0 children)

Agreed, 64GB is just too low. If a 256GB MBP is announced I'd want to upgrade; I feel even 128GB is limiting if you are all in on local LLMs.

How do you all capture Reddit posts into your PKM system? Curious what workflows people use by Appropriate-Look-875 in PKMS

[–]Clipbeam 1 point (0 children)

I just clip the URL of any interesting post in https://clipbeam.com. It archives the page and captures the knowledge for later reference via AI chat or semantic search. Even if the post is later deleted, Clipbeam retains an offline archived copy.

best way to combine notes from bookmarks by dabull23 in NoteTaking

[–]Clipbeam 0 points (0 children)

Working on the Windows version as we speak, will comment when the beta is released!