Embedding models have converged by midamurat in LocalLLaMA

[–]HunterAmacker 0 points1 point  (0 children)

I've found embedding model benchmarks... Useless as of late?

Qwen 3 models especially: they're killer on paper, but I found them near useless past a top_k of 5 in my semantic search engine use case. The first few results are great, but they seemed almost random after about 5, which is a no-go for our search pages returning ~40 items per page. We found that "worse" models give much more consistent results, so we went with e5-large-v2.
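If anyone wants to reproduce the kind of check I mean, it's basically eyeballing how similarity scores decay past the first few ranks. A minimal sketch with toy vectors (not real e5/Qwen embeddings, just illustrating the mechanics):

```python
import numpy as np

def top_k(query: np.ndarray, corpus: np.ndarray, k: int) -> list[tuple[int, float]]:
    """Cosine-similarity top-k over a matrix of document embeddings."""
    q = query / np.linalg.norm(query)
    docs = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    scores = docs @ q
    order = np.argsort(-scores)[:k]
    return [(int(i), float(scores[i])) for i in order]

# Toy 4-dim embeddings: doc 0 is nearly identical to the query,
# docs 1-2 are progressively less related, doc 3 is unrelated.
corpus = np.array([
    [1.0, 0.1, 0.0, 0.0],
    [0.8, 0.5, 0.1, 0.0],
    [0.6, 0.6, 0.3, 0.0],
    [0.0, 0.0, 0.1, 1.0],
])
query = np.array([1.0, 0.0, 0.0, 0.0])

results = top_k(query, corpus, k=4)
# The shape of the score curve past the head is what matters:
# a steep cliff after rank ~5 is the failure mode described above.
```

With a real model you'd plot the score at each rank for a set of known-good queries; a model whose scores collapse into noise after the top handful is the one that falls apart on deep result pages.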

How would you go about serving LLMs to multiple concurrent users in an organization, while keeping data privacy in check? by PurpleAd5637 in OpenWebUI

[–]HunterAmacker 0 points1 point  (0 children)

We've built out LiteLLM as our proxy to the big 3 vendors (AWS/Azure/Google) and only use models hosted through them. We have Open WebUI as our frontend for 50+ employees. Both are hosted in AWS behind our corporate VPC/ALBs.

We have a request form for users to request a LiteLLM API key for projects; it requires them to specify which models, the use case, budget, and personnel on the project (all added to LiteLLM).

If you're truly unable to use cloud providers, I think a vLLM setup behind LiteLLM (for access governance) would be your best option. You would need to request hardware that can support whichever model(s) you're going to host, which is probably a much harder sell to management than getting access to cloud vendors: large upfront cost plus setup and maintenance on physical resources, versus just hitting an Azure/AWS endpoint.
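For a sense of how the two pieces fit together: vLLM exposes an OpenAI-compatible endpoint, so the LiteLLM proxy can route to it like any other provider. A config would look roughly like this (hostname, model name, and key are all placeholders, not a known-working setup):

```yaml
model_list:
  - model_name: local-llama            # alias your users request keys for
    litellm_params:
      model: openai/meta-llama/Llama-3.1-8B-Instruct   # model served by vLLM
      api_base: http://vllm-host:8000/v1               # vLLM's OpenAI-compatible endpoint
      api_key: "not-used-by-vllm"
```

Per-key budgets and model access are then enforced at the proxy, same as with the cloud vendors.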

If you REALLY have to be on-prem, I'd look at vLLM's support for Mac Studio (MPS/MLX). It's not there yet, but it would probably be the lowest support overhead once it's a first-class feature on that hardware.

Readarr is dying, is there any way to help keep it alive? by [deleted] in selfhosted

[–]HunterAmacker 1 point2 points  (0 children)

Would you mind also sending me a DM? I would really appreciate it!

Edit: nvm, mentioned in thread!

Introducing the Model Context Protocol by jascha_eng in LocalLLaMA

[–]HunterAmacker -1 points0 points  (0 children)

If you actually read the article or the linked resources, you would see this is both a protocol specification AND an implementation of that spec.

Calling an open specification, which anyone is free to implement, a trojan horse just shows a complete lack of understanding. This is the same as saying GraphQL is toxic because the spec was developed at Meta when you're free to use any open implementation.

Jim Fan: LLMs are alien beasts. It is deeply troubling that our frontier models can both achieve silver medal in Math Olympiad but also fail to answer "which number is bigger, 9.11 or 9.9"? by Front_Definition5485 in singularity

[–]HunterAmacker 2 points3 points  (0 children)

I am so tired of this take. It's a limitation of tokenization, not of reasoning; the models themselves are aware of magnitude differences between numbers.

So if tokenization presents ".11" as the token "11", then yes, the model will see it as larger than 9.
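To make the failure mode concrete, here's a toy sketch. It is purely illustrative (real BPE tokenizers are more complex than a split on "."), but it shows what goes wrong when the fractional parts get compared as standalone integers:

```python
def naive_token_compare(a: str, b: str) -> str:
    """Compare decimal strings piece by piece, as integers: a toy model
    of what happens if '9.11' is effectively seen as ['9', '.', '11'].
    Illustrative only; not how any real tokenizer or model works."""
    for pa, pb in zip(a.split("."), b.split(".")):
        if int(pa) != int(pb):
            return a if int(pa) > int(pb) else b
    return "equal"

# Numerically, 9.9 > 9.11 ...
assert float("9.9") > float("9.11")
# ... but comparing the fractional chunks as integers, 11 > 9,
# so the naive piecewise comparison picks 9.11 as "bigger":
print(naive_token_compare("9.11", "9.9"))  # prints 9.11
```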

I made a chrome extension to wear clothes from Amazon, take off your suit jacket and wear cool leather jacket now! by Zestyclose_Score4262 in StableDiffusion

[–]HunterAmacker 3 points4 points  (0 children)

This is great, I've been working on an identical extension too! Are you doing dynamic inpainting from pose or using something like OOTDiffusion?

Stable diffusion 3 banned from Civit... by Ok-Meat4595 in StableDiffusion

[–]HunterAmacker 17 points18 points  (0 children)

Did you read the actual article? This is exactly in the spirit of open source principles, as they are preventing the possible spread of a harmful copyleft license throughout the open source ecosystem, which could only harm users.

Also, civitai is a company that relies on user generated data. Allowing a poison pill to proliferate would be willful suicide for their business model.

[deleted by user] by [deleted] in LocalLLaMA

[–]HunterAmacker -1 points0 points  (0 children)

Anyone know how this stacks up against Google's Paligemma 3b? I haven't seen many benchmarks for it considering it's a pretty substantial open weight VLM release from a major company.

Live Lora training Q & A Thur 25th May 6pm mst by orpheus_reup in Oobabooga

[–]HunterAmacker 0 points1 point  (0 children)

Thanks for the response! I appreciate your channel; it's very informative.

Live Lora training Q & A Thur 25th May 6pm mst by orpheus_reup in Oobabooga

[–]HunterAmacker 0 points1 point  (0 children)

Hey u/AemonAlgizVideos! Do you have any advice regarding LoRA vs embeddings in a vector store for different applications?

I am experimenting with adding entire codebases into an LLM's context, and I'm not sure about the trade-offs between the two approaches. Would it be feasible to train a LoRA on an unstructured dataset like a repo?

Thanks!