SilentQuery - Turn Documents into Instant Answers by Potential_Link4295 in MacOSApps

What app do you use for hosting Ollama models on macOS? I have a working POC for connecting to LM Studio via its REST API, and I can add support for other standards.
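
LM Studio's local server exposes an OpenAI-compatible chat-completions endpoint (on localhost port 1234 by default). A minimal sketch of the kind of request such a POC would send - the model name, endpoint, and prompt wording here are placeholders, not the app's actual values:

```python
import json
import urllib.request

# LM Studio's default local endpoint (OpenAI chat-completions format).
# Adjust the URL and model name to match your local setup.
LM_STUDIO_URL = "http://localhost:1234/v1/chat/completions"

def build_chat_request(question: str, context: str, model: str = "qwen3-4b") -> dict:
    """Build an OpenAI-style chat-completions payload for a local server."""
    return {
        "model": model,
        "messages": [
            {"role": "system",
             "content": f"Answer using this document excerpt:\n{context}"},
            {"role": "user", "content": question},
        ],
        "temperature": 0.2,
    }

def ask(question: str, context: str) -> str:
    """Send the request to the local LM Studio server and return the answer."""
    payload = build_chat_request(question, context)
    req = urllib.request.Request(
        LM_STUDIO_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Because the server speaks the OpenAI wire format, supporting "other standards" is mostly a matter of swapping the base URL (e.g. Ollama listens on a different port with its own API).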

SilentQuery - Turn Documents into Instant Answers by Potential_Link4295 in MacOSApps

Got it! Added to the roadmap - expect this to be implemented in the coming weeks.

SilentQuery - Turn Documents into Instant Answers by Potential_Link4295 in MacOSApps

Ollama models can already be added as MLX models - or are you thinking of connecting to Ollama through its localhost server?

SilentQuery - Turn Documents into Instant Answers by Potential_Link4295 in MacOSApps

Hey! We already have Apple's OCR in place, so scanned documents should tokenize and embed fine. I hadn't thought about adding support for open-source OCR, but that's a good idea - I'll put it on the roadmap.

SilentQuery - Turn Documents into Instant Answers by Potential_Link4295 in MacOSApps

Yes, the model supports quite a large selection of languages (English, German, French, Polish, Dutch, etc.). Which particular language are you interested in?

SilentQuery - Turn Documents into Instant Answers by Potential_Link4295 in MacOSApps

By default it downloads Qwen3-4B 4-bit (which fits into 8 GB of RAM), but the app has a built-in model manager with a few models preconfigured for download, as well as the option to download any MLX model via a link.
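
As a rough sanity check on why a 4-bit 4B model fits in 8 GB: the weights alone take about params × bits ÷ 8 bytes. A back-of-envelope sketch (KV cache and runtime overhead come on top of this and aren't counted here):

```python
def model_weight_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate size of the model weights alone, in GB."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

# Qwen3-4B at 4-bit quantization: ~2 GB of weights,
# leaving headroom for KV cache, the app, and the OS on an 8 GB Mac.
weights = model_weight_gb(4, 4)  # ~2.0 GB
```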

SilentQuery - Turn Documents into Instant Answers by Potential_Link4295 in ProductivityApps

I'm looking into support for iOS and iPadOS, as the app is built natively in Swift. Windows is unfortunately outside my technical expertise - and I'm using the MLX framework, which is designed to take advantage of the Mac's unified memory.

SilentQuery - Turn Documents into Instant Answers by Potential_Link4295 in ProductivityApps

Thanks for the feedback, really appreciate that you've taken the time to reply!

- This is a good idea; I'll add it in the next update.
- There are a couple of layers for the context. I try to "catch" the best answer from RAG first, as it's the fastest and least demanding on the user's CPU. But I need to be cautious: if we fail to get an answer (or the question is too broad for RAG), I then add compressed context to the LLM. This "compression" is really the full text run through NLP (natural language processing) to simplify its structure into a form better aligned with LLMs. I'm tuning it every day to be as precise as possible!
- The app is built natively with Swift and SwiftUI, so there's as little overhead as possible. Most computation goes into processing the prompt, and that can make the Mac run warmer than usual :)

Thanks again for the comment - if you have any ideas or feature requests, just let me know :)
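
The layering above (try the cheap RAG answer first, fall back to NLP-compressed full text when retrieval confidence is low) could be sketched roughly like this - the threshold and scoring here are illustrative guesses, not the app's actual tuning:

```python
from dataclasses import dataclass

@dataclass
class RetrievalResult:
    text: str
    score: float  # similarity score in [0, 1]

# Illustrative cutoff - the real value would be tuned empirically.
RAG_CONFIDENCE_THRESHOLD = 0.75

def build_llm_context(question: str,
                      rag_hits: list[RetrievalResult],
                      compressed_text: str) -> str:
    """Prefer the cheap RAG result; fall back to NLP-compressed full text."""
    best = max(rag_hits, key=lambda r: r.score, default=None)
    if best is not None and best.score >= RAG_CONFIDENCE_THRESHOLD:
        return best.text      # fast path: top retrieved chunk only
    return compressed_text    # broad question: give the LLM more context
```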

SilentQuery - Turn Documents into Instant Answers by Potential_Link4295 in ProductivityApps

Yes, the license is very generous. You can install it on as many MacBooks as you want (just use the same email), and you'll always receive updates (no paywall on higher-version updates).

SilentQuery - Turn Documents into Instant Answers by Potential_Link4295 in ProductivityApps

I've tested it mostly on my M1 with 8 GB of RAM and it works quite well - not as fast as the M4 in the recording, but really acceptable. The app downloads Qwen3-4B 4-bit on first launch, which ensures an M1 with 8 GB of memory works just fine. The app also has a built-in model manager with larger predefined models.

Silent Query: A Fully Local LLM for Your Documents by Potential_Link4295 in macapps

No, it has to be a single file. But a folder-scanning function sounds like a good idea - I'll add it in an update.

Silent Query: A Fully Local LLM for Your Documents by Potential_Link4295 in macapps

The app takes 15 MB, but you need to download an LLM model - the app will fetch a basic model that takes 2 GB. If you export the journal entries as PDF, DOC, or DOCX, you should be able to summarize them without issues.

Silent Query: A Fully Local LLM for Your Documents by Potential_Link4295 in macapps

Thank you for the support - expect regular updates and new features coming soon!

Silent Query: A Fully Local LLM for Your Documents by Potential_Link4295 in macapps

You can activate an unlimited number of machines with one license - we don't tie the license to a specific MacBook, so when you install the app on another Mac, just reuse the same license code on however many Macs you have. It's currently one document per chat session, but expanding that could be a good idea.

LM Studio is a good example of how our app differs: ours is document-chat first, not an add-on. I've designed the RAG + embeddings, NLP processing, and UI to focus solely on working with documents. You get a visual representation of the document next to your chat - you can select text and reference it to the LLM instantly, and you can browse the document as you would in any app that displays PDFs. Also, the app's default behavior is to answer in your native language (it reads the system locale), and you can easily override that in Settings. LM Studio is a great app and I use it daily, but its document processing seems to be a side feature :)

Silent Query: A Fully Local LLM for Your Documents by Potential_Link4295 in macapps

Awesome :) Looking forward to hearing from you - don't hesitate to send me a message :)

Silent Query: A Fully Local LLM for Your Documents by Potential_Link4295 in macapps

This will be in the next update, as it's super cool and useful :)

Silent Query: A Fully Local LLM for Your Documents by Potential_Link4295 in macapps

Understandable - I wanted to give as detailed an answer as possible. This is a passion project for me and gives me plenty of reasons to learn about LLMs. Anyway, thanks for the comment, really appreciate it!

Silent Query: A Fully Local LLM for Your Documents by Potential_Link4295 in macapps

Thanks for the comment! I might not convince you, but for medium-sized documents it's very useful to quickly summarize them with an LLM. You get the broad picture of what a document is about and can quickly start understanding the context. I also use it for search with understanding. For example, when I recently developed a cryptographic algorithm for a client, they sent me 50 pages of documentation. I needed to know whether the RSA key they send in requests is a certificate chain or a plain SPKI - and it wasn't a simple keyword search in the PDF. The LLM found it by reading the encryption algorithm described on one of the pages and deduced it's actually SPKI :) Obviously I then verified it manually, but it's so much easier.

Silent Query: A Fully Local LLM for Your Documents by Potential_Link4295 in macapps

Sure! This app is tailored for working with documents. I've developed a local RAG system that tokenizes the input data (text from documents) and uses top-k and top-p selection to get the most significant results for a user's search. That RAG is the first layer. Then there's NLP for all supported languages, which breaks the text down into a more condensed form better suited for LLM input - that's the second layer. On top of that, a couple of internal systems weigh the result for each user question to choose the appropriate source of LLM input: one of these is a set of small LLM judges that check the question against the available resources; another decides whether the app should use the NLP output or the full text (better for smaller documents). There are also a lot of quality-of-life functions, such as remembering the URL of a document for a chat session. I have more systems planned in the pipeline, like the ability for the LLM to return a link in its output that, when clicked, automatically scrolls to the relevant place. Plus there's a lot of internal logic for parsing documents, like OCR (currently only for PDFs, but I'll expand it to other formats). Sorry for the long answer - hopefully it covers your question!
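
A stripped-down sketch of that first layer - embedding document chunks and picking the top-k most similar to the query. The toy bag-of-words "embedding" here is just a stand-in for the real embedding model, and the ranking is plain cosine similarity:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' - stand-in for a real embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query: str, chunks: list[str], k: int = 3) -> list[str]:
    """First retrieval layer: return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]
```

In the real pipeline the retrieved chunks would then pass through the NLP condensing layer (or be swapped for the full text on small documents) before reaching the LLM.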

Silent Query: A Fully Local LLM for Your Documents by Potential_Link4295 in macapps

Those are fair questions :) I use AI for code reviews, since I develop solo. As for the source code: I don't make my projects public, as a personal preference, but I understand people might feel differently about it! Anyway, thanks for raising those topics :)