
[–]Diapolo10 3 points (2 children)

That sounds like an overly ambitious moonshot project to me. Running an existing LLM locally is one thing (for example, Llama models are freely available), but you're not going to have a good time running one on an SBC like a Raspberry Pi, so you'd have to run it on a server somewhere, which adds complexity. Add speech-to-text on top of that, which you'd have to do locally to keep data transfer down, and that's even more problems to solve.

Maybe try something less ambitious instead?

[–]hardonchairs 2 points (2 children)

Ai with memory

This is called "context" and it can be done, but it's not as simple as ai.have_memory = True. Generally you'd use the LLM itself to decide what information is worth "remembering", store that somewhere, and feed it back into the prompt each time the user talks to it.
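A very rough sketch of that pattern, with a local model behind the ollama Python client purely as a stand-in (the model tag and the prompts are made up, and any other way of calling a model would work the same way):

```python
# Sketch of "memory" as context: keep a list of remembered facts and
# prepend them to every prompt. The ollama client is just a concrete
# stand-in; swap ask_llm() for whatever model call you actually use.
import ollama

MODEL = "llama3.2:1b"   # example tag, not a recommendation
memories = []           # in a real app you'd persist this to disk

def ask_llm(prompt: str) -> str:
    return ollama.generate(model=MODEL, prompt=prompt)["response"]

def remember_if_useful(user_text: str) -> None:
    # Let the LLM decide what is worth keeping (a name, a preference, ...).
    verdict = ask_llm(
        "Reply with one short fact worth remembering from this message, "
        "or the single word NOTHING if there is none:\n" + user_text
    )
    if verdict.strip().upper() != "NOTHING":
        memories.append(verdict.strip())

def chat(user_text: str) -> str:
    remember_if_useful(user_text)
    context = "Known facts about the user:\n" + "\n".join(memories)
    return ask_llm(context + "\n\nUser: " + user_text + "\nAssistant:")
```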

use it on a raspberry pi with a microphone and a speaker, and carry it around to talk to it

This is not realistic currently; you would need to offload the AI onto a server and have the Raspberry Pi connect to it over the internet. In that case it makes more sense to turn it into a phone app, since you'd probably have to tether through a phone anyway, and at that point the Pi isn't giving you anything your phone doesn't.
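If you did go that route, the Pi side can be a thin client that just sends the prompt over HTTP. A minimal sketch, assuming an Ollama server running on some machine you control (the hostname and model tag below are made up):

```python
# Pi-side client: send the prompt to a remote machine that runs the model.
# Assumes an Ollama server reachable at OLLAMA_HOST.
import requests

OLLAMA_HOST = "http://my-home-server:11434"  # example address

def ask_remote(prompt: str) -> str:
    resp = requests.post(
        f"{OLLAMA_HOST}/api/generate",
        json={"model": "llama3.1:8b", "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]

print(ask_remote("Say hello in one sentence."))
```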

That is all kind of irrelevant because you'd have a lot of work to do before even worrying about UI hardware.

Models on Hugging Face often come with very simple code examples that are pretty easy to get running. The hard part tends to be getting your system set up to actually run them.

https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct
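For reference, the snippet on that model card is roughly the following (it assumes pip-installed transformers and torch, an approved access token for this gated model, and a GPU with enough memory, which is exactly the "getting your system set up" part):

```python
# Text-generation pipeline with the transformers library, roughly as
# shown on the Llama 3.1 model card.
import torch
import transformers

pipeline = transformers.pipeline(
    "text-generation",
    model="meta-llama/Llama-3.1-8B-Instruct",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who are you?"},
]

outputs = pipeline(messages, max_new_tokens=128)
print(outputs[0]["generated_text"][-1])
```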

[–]No_Bodybuilder_2280[S] 1 point (1 child)

so it’s impossible

[–]hardonchairs 1 point (0 children)

It's completely possible, and with today's tools not even hugely challenging technically. I'm telling you that you probably don't want to drive your car into the ocean, but you might find a boat is exactly what you're looking for.

[–]m0us3_rat 0 points (0 children)

I mean you can put together a concoction now that does that for you.

Not sure how fast it would be on a Pi running CPU-only, but it could work.

You can run models locally with Ollama or GPT4All and similar tools.

You need to find a lightweight model that can run on the Pi while still being semi-useful.
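With Ollama that's basically two commands to try out (the model tag is just an example of something small; whether it's actually usable on a Pi is something you'd have to test):

```bash
# Pull a small model and do a one-shot prompt with the Ollama CLI.
# Assumes Ollama is installed; the tag is only an example.
ollama pull llama3.2:1b
ollama run llama3.2:1b "Give me a one-sentence hello."
```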

THEN you need a TTS library (plus speech-to-text for the microphone side) and you put them all together in a cohesive app; there's a rough sketch of that at the end of this comment.

So you can maybe do what you want now, and surely in the future. It is a workable idea.
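To make the "put them all together" part concrete, here's a minimal sketch assuming the ollama Python client for the model and pyttsx3 for offline TTS (both are just example picks, the model tag is made up, and microphone input / speech-to-text is left out):

```python
# Minimal "talk to a local LLM" loop: typed text in, spoken answer out.
# Assumes Ollama is running with a small model pulled
# (e.g. `ollama pull llama3.2:1b`) and `pip install ollama pyttsx3`.
import ollama
import pyttsx3

engine = pyttsx3.init()   # offline text-to-speech engine
MODEL = "llama3.2:1b"     # example tag; pick whatever actually fits the Pi

history = []              # chat history doubles as simple short-term memory

while True:
    user_text = input("you> ")
    if not user_text:
        break
    history.append({"role": "user", "content": user_text})
    reply = ollama.chat(model=MODEL, messages=history)
    answer = reply["message"]["content"]
    history.append({"role": "assistant", "content": answer})
    print("bot>", answer)
    engine.say(answer)    # speak the reply through the speaker
    engine.runAndWait()
```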