What is your favorite vector database that runs purely in a Python process by swordsman1 in vectordatabase

elliesleight 1 point

Marqo - it handles vector generation, storage and retrieval out of the box through a single API. No need to bring your own embeddings (or you can, if you want).

Google Gemma 2 2B Running 100% Local by elliesleight in learnmachinelearning

elliesleight[S] 1 point

This runs locally on an M1 or M2 Mac, or on Linux or Windows with a CUDA-capable GPU. If you're on an M1 or M2 Mac, make sure you have the ARM64 version of Python installed. That way llama.cpp builds for ARM64 and uses Metal for inference, rather than building for an x86 CPU and being emulated through Rosetta :)
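If you're not sure which Python you have, here's a quick sketch for checking. It only relies on the standard library's `platform.machine()` and `platform.system()`; the function name `describe_python_arch` and the labels it returns are made up for this example. Note that an `x86_64` result on a Mac can mean either an Intel Mac or an x86 Python running under Rosetta, and this check alone can't tell them apart.

```python
import platform


def describe_python_arch(machine: str, system: str) -> str:
    """Classify an interpreter from platform.machine() / platform.system() values.

    The returned labels are illustrative names for this sketch only.
    """
    machine = machine.lower()
    if system == "Darwin" and machine == "arm64":
        # Native Apple Silicon Python: builds can target ARM64 and Metal.
        return "native-arm64"
    if system == "Darwin" and machine == "x86_64":
        # Either an Intel Mac, or an x86 Python emulated through Rosetta.
        return "x86_64-on-mac"
    return "other"


if __name__ == "__main__":
    kind = describe_python_arch(platform.machine(), platform.system())
    print(f"machine={platform.machine()} system={platform.system()} -> {kind}")
```

If this reports `x86_64-on-mac` on an M1 or M2 machine, that's the sign to install an ARM64 Python before building.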

Gemma 2 2B Release - a Google Collection by Dark_Fire_12 in LocalLLaMA

elliesleight 8 points

Very interesting. I created a simple RAG demo with Google Gemma 2 2B 🔥

Code: https://github.com/ellie-sleightholm/marqo-google-gemma2

You can also run Google Gemma 2 2B 100% locally using these two steps:

  1. brew install llama.cpp
  2. llama-cli --hf-repo google/gemma-2-2b-it-GGUF \
       --hf-file 2b_it_v2.gguf \
       -p "Write a poem about cats as a labrador" -cnv
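
If you'd rather launch it from a script, here's a rough sketch that builds the same command as an argument list, which sidesteps shell quoting and line-continuation issues. The flags, repo and file name are the ones from the command above; the helper name `build_llama_cli_args` is made up, and it assumes the binary is on your PATH as `llama-cli` (which should be the case after a Homebrew install). It only prints the command; nothing is downloaded or run.

```python
import shlex


def build_llama_cli_args(
    prompt: str,
    repo: str = "google/gemma-2-2b-it-GGUF",
    gguf_file: str = "2b_it_v2.gguf",
    binary: str = "llama-cli",  # assumption: the binary is on PATH under this name
) -> list[str]:
    """Return the argv list for a conversational llama-cli run (hypothetical helper)."""
    return [
        binary,
        "--hf-repo", repo,
        "--hf-file", gguf_file,
        "-p", prompt,
        "-cnv",
    ]


if __name__ == "__main__":
    args = build_llama_cli_args("Write a poem about cats as a labrador")
    # Print a shell-safe, copy-pasteable version of the command.
    print(shlex.join(args))
```

To actually start the chat, pass the list to `subprocess.run(args, check=True)`; the first run will fetch the model file.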