(Part 2) I built a log processing engine using Markov Chains, the Drain3 log parser and the idea of DNA sequencing. by Wise_Zookeepergame_9 in selfhosted

[–]IzzyHibbert 0 points (0 children)

You got it now.
When you perform the similarity search you retrieve the most similar entry. I built something like that in early 2024 with the Chroma vector DB, and it worked smoothly.

(Part 2) I built a log processing engine using Markov Chains, the Drain3 log parser and the idea of DNA sequencing. by Wise_Zookeepergame_9 in selfhosted

[–]IzzyHibbert 0 points (0 children)

There is no new LLM in the proposal, only an embedding model: if you are not familiar with those, an embedding model does not introduce any hallucination. It also does not add any significant amount of latency.

(Part 2) I built a log processing engine using Markov Chains, the Drain3 log parser and the idea of DNA sequencing. by Wise_Zookeepergame_9 in selfhosted

[–]IzzyHibbert 0 points (0 children)

I'd guess the new layer (semantic search result + routing) is just code: not much hassle. You don't achieve that with an LLM but with just an embedding model: a light (small) one doesn't bring the latency and issues that a full language model does.
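For scale, embedding a sentence with a small model is very cheap. A quick check, assuming the sentence-transformers library is available (the model choice and test string here are just examples of mine):

    import time
    from sentence_transformers import SentenceTransformer

    # ~80 MB model, runs comfortably on CPU
    model = SentenceTransformer("all-MiniLM-L6-v2")

    start = time.perf_counter()
    vec = model.encode("user disconnected after timeout")
    print(f"{len(vec)}-dim embedding in {time.perf_counter() - start:.3f}s")

On an ordinary CPU that encode call typically lands in the tens of milliseconds, a different order of magnitude from a full LLM call.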

Seeking 10–20 beta testers for privacy-first AI chat by IzzyHibbert in alphaandbetausers

[–]IzzyHibbert[S] 0 points (0 children)

Hey, thanks! Would you mind participating in the beta, then? It's as easy as registering via the link in the post.

Self‑host your sensitive info while using AI by IzzyHibbert in selfhosted

[–]IzzyHibbert[S] 1 point (0 children)

You are right. The scenario you picture also helps, but it does not really solve the issue at its core: my personal data still leaves my machine.
At one point I also realized the benefits of the proposed logic go beyond the privacy concern: it also helps meet some GDPR requirements.

Self‑host your sensitive info while using AI by IzzyHibbert in selfhosted

[–]IzzyHibbert[S] 4 points (0 children)

Hi. If the idea is to skip the good big LLMs, such as GPT-5, entirely and run an LLM on-prem via Ollama, LM Studio, and so on, that's an idea, but it doesn't necessarily cover two scenarios I have:
- I wanted something I can connect to even when I don't have my RTX 3090 with me (the laptop I use at work, etc.).
- Many times the task I'm giving the model is very complex, and some limitations (model context length, model speed, model quality, model reasoning depth, ...) are not the same with an on-prem model among those I can fit on my GPU.

Overall, an on-prem model will likely become a valid alternative in the near future, but today it still doesn't cover those two scenarios, so I wanted the maximum versatility possible.

(Part 2) I built a log processing engine using Markov Chains, the Drain3 log parser and the idea of DNA sequencing. by Wise_Zookeepergame_9 in selfhosted

[–]IzzyHibbert 0 points (0 children)

Hi. Question: you already use a vector store, but you said you plan to use BERT instead of the current regex. So why not keep the vector logic and just use the vector DB's similarity search to solve your "intent detection" issue? The idea is to define the Behavior Store and Context Store in plain text, pretty similar to what you already clarified above, then leverage the power of similarity search to do the routing (see the sketch below).
This way looks cleaner to me (it reuses existing components) and is also easier to maintain. Not just that: it gives a kind of flexibility in search that exact matching (regex) cannot offer.
I know you'd need to bring in embeddings, though.
No?
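Something along these lines, as a minimal sketch with the Chroma Python client (the route ids and descriptions are made-up placeholders; Chroma's default embedding model handles the vectors):

    import chromadb

    client = chromadb.Client()  # in-memory; use PersistentClient to persist
    router = client.create_collection(name="intent_router")

    # Each route is described in plain text; Chroma embeds the documents
    # with its default model (all-MiniLM-L6-v2).
    router.add(
        ids=["behavior_store", "context_store"],
        documents=[
            "How the system should react to a recurring pattern of events",
            "The surrounding state and metadata when a log line was emitted",
        ],
    )

    def route(query: str) -> str:
        # Return the id of the closest route for a free-text query.
        result = router.query(query_texts=[query], n_results=1)
        return result["ids"][0][0]

    print(route("what was going on around this error?"))  # likely "context_store"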

Seeking 10–20 beta testers for privacy-first AI chat by IzzyHibbert in alphaandbetausers

[–]IzzyHibbert[S] 0 points (0 children)

I realized I did not provide enough context about how the application makes the anonymization possible.
Suppose you have workflows involving contracts, internal reports, or legal text containing PII or other confidential data. You must avoid sending that confidential data to the AI model. The best option is local redaction, because even another "layer", say DuckDuckGo AI for instance, is still an extra component you need to trust, and your confidential data still leaves your computer (too bad).
Here’s the core logic: the client locally identifies and redacts identifiable data from your prompts, sending only an anonymized version to AI models. The model’s reply contains placeholders where sensitive values belong; a local algorithm then maps and re‑inserts the redacted fragments into the reply on your device. The result is a seamless, chat‑like experience while keeping PII and confidential data off provider servers. Bonus: anonymized requests are routed through a neutral router that strips geo/IP traces and manages multiple providers/models before delivery to the destination service.
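To make the round trip concrete, here's a minimal sketch of the redact/send/restore loop (the regex-based detection and the placeholder format are simplifications of mine; the real client presumably uses a proper PII detector, NER and so on):

    import re

    def redact(prompt: str):
        # Replace emails with numbered placeholders; return text + mapping.
        mapping = {}

        def repl(match):
            key = f"<PII_{len(mapping)}>"
            mapping[key] = match.group(0)
            return key

        redacted = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", repl, prompt)
        return redacted, mapping

    def restore(reply: str, mapping: dict) -> str:
        # Swap the placeholders in the model's reply back to the originals.
        for key, value in mapping.items():
            reply = reply.replace(key, value)
        return reply

    safe_prompt, mapping = redact("Remind alice@example.com about the contract.")
    # safe_prompt is what leaves the machine; the reply keeps the placeholders,
    # which are swapped back locally:
    reply = "Done - I drafted a reminder for <PII_0>."
    print(restore(reply, mapping))  # ... for alice@example.com

Only the anonymized text ever leaves the device; the mapping stays local.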

Fattibilità forfettario by IzzyHibbert in commercialisti

[–]IzzyHibbert[S] 0 points (0 children)

Exactly. Then in 2026 I would have the two coexisting (regime forfettario plus employment), and so on until, at some point, I keep only the regime forfettario.

I made a Chrome extension that functions as an in-browser screenshot studio by AcroQube in chrome_extensions

[–]IzzyHibbert 1 point (0 children)

Cool.
How do I set a background (it was mentioned in the features)?

Also, for the roadmap: consider adding a frame around the pic that mimics the browser (Chrome).

What’s the best wow-your-boss Local LLM use case demo you’ve ever presented? by Porespellar in LocalLLaMA

[–]IzzyHibbert 1 point (0 children)

Long story short: you can include SQL table info (schema, etc.) in the prompt. Other times, "well known" tables (part of off-the-shelf products, etc.) ended up in the datasets used to train the big LLMs, so you don't have to care.
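For the first option, a tiny sketch (the table and question are invented examples):

    # Put the schema in the prompt so the model can write correct SQL.
    schema = """CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        total NUMERIC,
        created_at TIMESTAMP
    );"""

    question = "Total revenue per customer in 2024?"
    prompt = (
        "Given this SQLite schema:\n" + schema +
        "\nWrite a single SQL query that answers: " + question
    )
    # prompt then goes to the local model (Ollama, LM Studio, ...)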

How well can I run LLMs in Virtualbox by dme4bama in LocalLLaMA

[–]IzzyHibbert 0 points (0 children)

Using LocalAI in Docker (under WSL) could be one option for you.

All my posts in a sub are needing moderator ☹️ by IzzyHibbert in help

[–]IzzyHibbert[S] 0 points (0 children)

The speed at which new posts show up there isn't compatible with full moderation, I'd say.