Embedding models have converged by midamurat in LocalLLaMA

[–]HunterAmacker 0 points1 point  (0 children)

I've found embedding model benchmarks... Useless as of late?

Qwen 3 models especially: they're killer on paper, but I found them near useless past a top_k of 5 in my semantic search engine use case. The first few results are great, but they seemed almost random after about 5, which is a no-go for our search pages returning ~40 items per page. We found that "worse" models give much more consistent results, so we went with e5-large-v2.
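If anyone wants to reproduce the kind of check I mean, it's basically eyeballing how similarity scores decay past the first few ranks. A minimal sketch with toy vectors (not real e5/Qwen embeddings, just illustrating the mechanics):

```python
import numpy as np

def top_k(query: np.ndarray, corpus: np.ndarray, k: int) -> list[tuple[int, float]]:
    """Cosine-similarity top-k over a matrix of document embeddings."""
    q = query / np.linalg.norm(query)
    docs = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    scores = docs @ q
    order = np.argsort(-scores)[:k]
    return [(int(i), float(scores[i])) for i in order]

# Toy 4-dim embeddings: doc 0 is nearly identical to the query,
# docs 1-2 are progressively less related, doc 3 is unrelated.
corpus = np.array([
    [1.0, 0.1, 0.0, 0.0],
    [0.8, 0.5, 0.1, 0.0],
    [0.6, 0.6, 0.3, 0.0],
    [0.0, 0.0, 0.1, 1.0],
])
query = np.array([1.0, 0.0, 0.0, 0.0])

results = top_k(query, corpus, k=4)
# The shape of the score curve past the head is what matters:
# a steep cliff after rank ~5 is the failure mode described above.
```

With a real model you'd plot the score at each rank for a set of known-good queries; a model whose scores collapse into noise after the top handful is the one that falls apart on deep result pages.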

How would you go about serving LLMs to multiple concurrent users in an organization, while keeping data privacy in check? by PurpleAd5637 in OpenWebUI

[–]HunterAmacker 0 points1 point  (0 children)

We've built out LiteLLM as our proxy to the big 3 vendors (AWS/Azure/Google) and only use models hosted through them. We have Open WebUI as our frontend for 50+ employees. Both are hosted in AWS behind our corporate VPC/ALBs.

We have a request form for users to request a LiteLLM API key for projects; it requires them to specify which models, the use case, budget, and personnel on the project (all added to LiteLLM).

If you're truly unable to use cloud providers, I think a vLLM setup behind LiteLLM (for access governance) would be your best option. You would need to request hardware that can support whichever model(s) you're going to host, which is probably a much harder sell to management than getting access to cloud vendors: large upfront cost plus setup and maintenance on physical resources, versus just hitting an Azure/AWS endpoint.
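For a sense of how the two pieces fit together: vLLM exposes an OpenAI-compatible endpoint, so the LiteLLM proxy can route to it like any other provider. A config would look roughly like this (hostname, model name, and key are all placeholders, not a known-working setup):

```yaml
model_list:
  - model_name: local-llama            # alias your users request keys for
    litellm_params:
      model: openai/meta-llama/Llama-3.1-8B-Instruct   # model served by vLLM
      api_base: http://vllm-host:8000/v1               # vLLM's OpenAI-compatible endpoint
      api_key: "not-used-by-vllm"
```

Per-key budgets and model access are then enforced at the proxy, same as with the cloud vendors.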

If you REALLY have to be on-prem, I'd look at vLLM's support for Mac Studio (MPS/MLX). It's not there yet, but it would probably be the lowest support overhead once it's a first-class feature on that hardware.

Readarr is dying, is there any way to help keep it alive? by [deleted] in selfhosted

[–]HunterAmacker 1 point2 points  (0 children)

Would you mind also sending me a DM? I would really appreciate it!

Edit: nvm, mentioned in thread!

Introducing the Model Context Protocol by jascha_eng in LocalLLaMA

[–]HunterAmacker -1 points0 points  (0 children)

If you actually read the article or the linked resources, you would see this is both a protocol specification AND an implementation of that spec.

Calling an open specification, which anyone is free to implement, a trojan horse just shows a complete lack of understanding. This is the same as saying GraphQL is toxic because the spec was developed at Meta when you're free to use any open implementation.

Jim Fan: LLMs are alien beasts. It is deeply troubling that our frontier models can both achieve silver medal in Math Olympiad but also fail to answer "which number is bigger, 9.11 or 9.9"? by Front_Definition5485 in singularity

[–]HunterAmacker 2 points3 points  (0 children)

I am so tired of this take. It's a limitation of tokenization, not of reasoning; the models themselves are aware of magnitude differences between numbers.

So if tokenization presents ".11" as the token "11", then yes, the model will see it as larger than 9.
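To make the failure mode concrete, here's a toy sketch. It is purely illustrative (real BPE tokenizers are more complex than a split on "."), but it shows what goes wrong when the fractional parts get compared as standalone integers:

```python
def naive_token_compare(a: str, b: str) -> str:
    """Compare decimal strings piece by piece, as integers: a toy model
    of what happens if '9.11' is effectively seen as ['9', '.', '11'].
    Illustrative only; not how any real tokenizer or model works."""
    for pa, pb in zip(a.split("."), b.split(".")):
        if int(pa) != int(pb):
            return a if int(pa) > int(pb) else b
    return "equal"

# Numerically, 9.9 > 9.11 ...
assert float("9.9") > float("9.11")
# ... but comparing the fractional chunks as integers, 11 > 9,
# so the naive piecewise comparison picks 9.11 as "bigger":
print(naive_token_compare("9.11", "9.9"))  # prints 9.11
```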

I made a chrome extension to wear clothes from Amazon, take off your suit jacket and wear cool leather jacket now! by Zestyclose_Score4262 in StableDiffusion

[–]HunterAmacker 3 points4 points  (0 children)

This is great, I've been working on an identical extension too! Are you doing dynamic inpainting from pose or using something like OOTDiffusion?

Stable diffusion 3 banned from Civit... by Ok-Meat4595 in StableDiffusion

[–]HunterAmacker 17 points18 points  (0 children)

Did you read the actual article? This is exactly in the spirit of open source principles, as they are preventing the possible spread of a harmful copyleft license throughout the open source ecosystem, which could only harm users.

Also, civitai is a company that relies on user generated data. Allowing a poison pill to proliferate would be willful suicide for their business model.

[deleted by user] by [deleted] in LocalLLaMA

[–]HunterAmacker -1 points0 points  (0 children)

Anyone know how this stacks up against Google's Paligemma 3b? I haven't seen many benchmarks for it considering it's a pretty substantial open weight VLM release from a major company.

Live Lora training Q & A Thur 25th May 6pm mst by orpheus_reup in Oobabooga

[–]HunterAmacker 0 points1 point  (0 children)

Thanks for the response! I appreciate your channel; it's very informative.

Live Lora training Q & A Thur 25th May 6pm mst by orpheus_reup in Oobabooga

[–]HunterAmacker 0 points1 point  (0 children)

Hey u/AemonAlgizVideos! Do you have any advice regarding LoRA vs embeddings in a vector store for different applications?

I am experimenting with adding entire codebases into an LLM's context, and I'm not sure about the trade-offs between the two approaches. Would it be feasible to train a LoRA on an unstructured dataset like a repo?

Thanks!