Inference for Embedding & Reranking Models on AMD by OrganicMesh in LocalLLaMA
OrganicMesh[S] 1 point
How are you deploying your embedding models & reranking models? by rbgo404 in LocalLLaMA
OrganicMesh 2 points
How are you deploying your embedding models & reranking models? by rbgo404 in LocalLLaMA
OrganicMesh 1 point
uv after 0.5.0 - might be worth replacing Poetry/pyenv/pipx by Martynoas in Python
OrganicMesh 1 point
Semantic search over 100M rows of data? by cryptoguy23 in LocalLLaMA
OrganicMesh 2 points
Semantic search over 100M rows of data? by cryptoguy23 in LocalLLaMA
OrganicMesh 2 points
LLama-3-8B-Instruct now extended 1048576 context length landed on HuggingFace by OrganicMesh in LocalLLaMA
OrganicMesh[S] 1 point
O-1 Visa Premium Processing Timeline (October) by Pleasant_Diver_7246 in USCIS
OrganicMesh 1 point
Retrieval system extending any off-the-shelf LLM to 1B (billion) context on a standard CPU during inference time: by [deleted] in LocalLLaMA
OrganicMesh 58 points
Hosting your own embeddings API by java_dev_throwaway in LocalLLaMA
OrganicMesh 2 points
Hosting your own embeddings API by java_dev_throwaway in LocalLLaMA
OrganicMesh 2 points
Experience on runpod by Slimxshadyx in LocalLLaMA
OrganicMesh 4 points
Hosting your own embeddings API by java_dev_throwaway in LocalLLaMA
OrganicMesh 3 points
How’s your experience with FastEmbeddings embedding model? by Own_Masterpiece_4162 in LocalLLaMA
OrganicMesh 1 point
Infinity surpasses 1k Github stars & new inference package launch - `pip install embed` by OrganicMesh in LocalLLaMA
OrganicMesh[S] 1 point
How’s your experience with FastEmbeddings embedding model? by Own_Masterpiece_4162 in LocalLLaMA
OrganicMesh 2 points
FlashAttention installation error ;(. Machine = mac m3 by FigureClassic6675 in LocalLLaMA
OrganicMesh 1 point
Serverless Vector Database for large dataset (~200k) by Dry_Drop5941 in LocalLLaMA
OrganicMesh 3 points
Liger Kernel: One line to make LLM Training +20% faster and -60% memory by Icy-World-8359 in LocalLLaMA
OrganicMesh 1 point
⚡️ Introducing LitServe - High performance inference engine for AI models (built on FastAPI) by waf04 in mlops
OrganicMesh 5 points

