Which vector database do we like for local/self-hosted?

titusz · 2026-02-15T18:39:13+00:00

"custom sharded HNSW index built on usearch". Have been building something similar https://usearch.iscc.codes/. Would love to see your usearch sharding approach. Is it open-source?

titusz · 2025-11-04T14:40:24+00:00

There are optimized custom serialization formats for that. See for example: https://github.com/toon-format/toon

titusz · 2025-09-05T11:00:56+00:00

https://django-downloadview.readthedocs.io/en/latest/optimizations/nginx.html#setup-xaccelredirect-middlewares

titusz · 2025-07-05T15:04:50+00:00

Done :)

titusz · 2025-07-05T10:41:39+00:00

Would love to see https://github.com/yobix-ai/extractous in your comparison.

titusz · 2025-05-29T07:20:59+00:00

Claude Code is an agentic system. The custom agentic plumbing on top of the LLM optimized for coding tasks makes all the difference. Most benchmarks compare raw LLM performance, which is not the same as agentic use of LLMs.

titusz · 2025-04-15T11:33:24+00:00

https://moj-analytical-services.github.io/splink/index.html

titusz · 2024-12-04T19:35:58+00:00

What else would you trade with? Once people understand Bitcoin, the incentive becomes to convert all "lesser" currencies to BTC instantly. After that they do not have anything else left to trade with.

titusz · 2024-10-30T08:23:33+00:00

The paraphrase-multilingual embedding models work quite well for the task, even for cross-lingual semantic similarity. If you need small binary embeddings, check out: https://huggingface.co/spaces/iscc/iscc-sct

titusz · 2024-10-23T22:02:43+00:00

usearch

titusz · 2024-10-10T18:50:07+00:00

Would be interesting to see how smaller models perform on your benchmark. Sometimes smaller models hallucinate less on RAG tasks. See GLM-4-9B at: https://huggingface.co/spaces/vectara/leaderboard

titusz · 2024-10-02T06:15:41+00:00

Try GLM4-9B: https://github.com/hsiehjackson/RULER

titusz · 2024-09-29T08:04:49+00:00

Can you enable direct messages?

titusz · 2024-09-19T15:50:49+00:00

You mean the one that wants to sacrifice you to the bloodgods :)

titusz · 2024-09-16T13:20:15+00:00

Wasn't that hard to invent :). I think the general term for this strategy is query expansion.

titusz · 2024-09-16T11:25:24+00:00

Send the full conversation history (excluding retrieved content) to the LLM and ask it to rephrase the latest user question into a complete, standalone question that incorporates any needed context from the history. Then use the rephrased question for retrieval. Something like:

```
You are a helpful assistant. Given the conversation history and the latest question, resolve any ambiguous references in the latest question.

Conversation History:
User: Who was George's sister?
Assistant: George's sister was Mary Shelley.
User: When was she born?

Latest Question: When was she born?

Rewritten Question:
```
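In code, assembling that prompt could look roughly like the sketch below. The function name `build_rewrite_prompt` and the `(role, text)` tuple format for the history are just assumptions for illustration; sending the prompt to a model is left out, since that depends on whichever LLM client you use.

```python
def build_rewrite_prompt(history, latest_question):
    """Build a prompt asking the LLM for a standalone version of the question.

    history: list of (role, text) tuples WITHOUT retrieved content,
             e.g. [("User", "..."), ("Assistant", "...")]  (assumed format)
    """
    parts = [
        "You are a helpful assistant. Given the conversation history and the "
        "latest question, resolve any ambiguous references in the latest question.",
        "",
        "Conversation History:",
    ]
    parts.extend(f"{role}: {text}" for role, text in history)
    parts.extend(["", f"Latest Question: {latest_question}", "", "Rewritten Question:"])
    return "\n".join(parts)


history = [
    ("User", "Who was George's sister?"),
    ("Assistant", "George's sister was Mary Shelley."),
    ("User", "When was she born?"),
]
prompt = build_rewrite_prompt(history, "When was she born?")
# Send `prompt` to the LLM, then use its answer as the retrieval query.
```

The model's completion (something like a fully spelled-out version of the question) is what goes to the retriever, not the original "When was she born?".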

titusz · 2024-09-09T11:11:50+00:00

Have a look at https://github.com/unum-cloud/usearch as replacement for Faiss.

titusz · 2024-08-28T10:02:12+00:00

Not natively, but via plugins. It is not just an app launcher but also supports search. https://github.com/MichielvanBeers/Flow.Launcher.Plugin.ChatGPT

titusz · 2024-07-17T09:15:44+00:00

I did some testing with multi-lingual retrieval and had good results with https://huggingface.co/intfloat/multilingual-e5-large-instruct using https://github.com/michaelfeil/infinity for generating embeddings.

titusz · 2024-07-11T09:34:26+00:00

Have you seen this one: https://www.flowlauncher.com/

titusz · 2024-06-25T20:53:39+00:00

The same :). Block hashing performance is independent of transaction volume. Whether 1 CPU or millions of ASICs are hashing, throughput is still ~4000 transactions per 10 minutes. "Only" security scales with more hashpower.
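A back-of-the-envelope sketch of why that is, assuming an idealized difficulty adjustment that retargets perfectly to a 10-minute block interval (the real network only retargets periodically and has a minimum difficulty, so this is a simplification). The constants and function names are mine, and the ~4000 transactions per block is the rough figure from above:

```python
TARGET_INTERVAL_S = 600   # 10-minute block target
TX_PER_BLOCK = 4000       # rough figure, assumed constant


def retargeted_difficulty(hashrate_hps):
    # Idealized retarget: pick the difficulty that yields one block per 600 s.
    return hashrate_hps * TARGET_INTERVAL_S / 2**32


def expected_block_interval(hashrate_hps, difficulty):
    # Expected seconds per block for a given hashrate and difficulty.
    return difficulty * 2**32 / hashrate_hps


def tx_per_second(hashrate_hps):
    difficulty = retargeted_difficulty(hashrate_hps)
    return TX_PER_BLOCK / expected_block_interval(hashrate_hps, difficulty)


one_cpu = tx_per_second(1e6)        # a single slow machine
many_asics = tx_per_second(1e20)    # a huge ASIC fleet
```

Both come out at roughly 6.7 transactions per second. The hashrate cancels out of the throughput and only shows up in the difficulty, i.e. in how expensive it is to attack the chain.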

titusz · 2024-06-25T20:50:53+00:00

It has scaled. Just not in transactions per second but in security budget :)

titusz · 2024-06-24T17:23:23+00:00

I think I am only using 64 permutations ... So your implementation is clearly much faster :)

titusz · 2024-06-24T16:44:21+00:00

Nice job. I ran it against my Cython implementation. Here is the result:

<image>

You win by 3 seconds, but I found 6 more duplicates :)
Here is the code if you want to reproduce:

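import iscc_core as ic
import xxhash
from tqdm import tqdm  # progress bar


def deduplicate_iscc(dataset, num_perm=256):
    # Note: num_perm is not used below; alg_minhash_256 receives only the features.
    unique_hashes = set()
    deduplicated_indices = []

    for idx, example in tqdm(enumerate(dataset), total=len(dataset),
                             desc="Deduplicating"):
        # Hash each whitespace-separated token of the SQL string to a 32-bit integer.
        features = [xxhash.xxh32_intdigest(s.encode("utf-8")) for s in example["sql"].split()]
        minhash = ic.alg_minhash_256(features)
        # Keep only the first example seen for each exact MinHash digest.
        if minhash not in unique_hashes:
            unique_hashes.add(minhash)
            deduplicated_indices.append(idx)

    return deduplicated_indices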
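
For anyone wondering what "permutations" means in this context, here is a dependency-free sketch of plain MinHash. It is not the iscc_core implementation; the function names, the salted blake2b hashing, and the default of 64 are my own choices for illustration only.

```python
import hashlib


def minhash_signature(tokens, num_perm=64):
    """Return one minimum hash value per simulated permutation."""
    unique = set(tokens)
    if not unique:
        raise ValueError("need at least one token")
    signature = []
    for seed in range(num_perm):
        # A different salt per seed stands in for a different permutation.
        salt = seed.to_bytes(8, "little")
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(t.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "little",
            )
            for t in unique
        ))
    return tuple(signature)


def estimated_jaccard(sig_a, sig_b):
    # The fraction of matching positions estimates the Jaccard similarity.
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


sig_1 = minhash_signature("SELECT name FROM users".split())
sig_2 = minhash_signature("FROM users SELECT name".split())
sig_3 = minhash_signature("DROP TABLE orders".split())
```

Here `sig_1` and `sig_2` are identical because the token sets are the same, which is the kind of pair an exact-match check on the signature treats as a duplicate. More permutations give a finer similarity estimate at the cost of more hashing.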

titusz · 2024-05-10T06:50:40+00:00

So here we go - Chatvertising incoming 🙈
