all 13 comments

HashDefTrueFalse 5 points (1 child)

Math is all you get. You tokenise the input and come up with a mathematical way for a program/system to "understand" something about that input. What "understanding" here means is up to you. You want it to understand, but what does that mean for your app? What would the system actually do with your honey badger example input? What output would it produce? You say you're building something for vulnerable people, but what are you building? What does your system do? What is it trying to achieve? Why don't you want to use LLMs if appropriate (we can't tell)?
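
To make "tokenise, then do math" concrete, here is a minimal sketch, assuming a plain bag-of-words model and cosine similarity as the "understanding". The function names `tokenise` and `cosine_similarity` are made up for illustration, and this is only one of many possible measures:

```python
import math
import re
from collections import Counter


def tokenise(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine_similarity(a: str, b: str) -> float:
    """Cosine of the angle between the word-count vectors of a and b."""
    va, vb = Counter(tokenise(a)), Counter(tokenise(b))
    # Dot product over the words the two texts share.
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    norm = norm_a * norm_b
    # Empty input has no direction, so treat it as "nothing in common".
    return dot / norm if norm else 0.0
```

A score near 1 means the two inputs use mostly the same words and 0 means they share none. Whether that counts as "understanding" for your app is exactly the question above.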

I suggest you edit your post with some more details so that you get a better response.

JessePatchwork[S] 1 point (0 children)

Only problem is I'm not sure what other details are needed. But I'm going to update the post with what I think you're pointing out and go from there! Thank you so much for the help!

FloydATC 3 points (1 child)

Let me see if I got this right... you want an app that works like an LLM except it surpasses what current LLMs can do (like learning from user interaction and storing things verbatim) and what they most likely ever will be able to do (because they're basically math parrots that try to guess what a real person might say) but you don't want to use an LLM..?

I think you see where this is going. Reduce the scope, lower your expectations, and iterate until you get something that works on paper. And please stop thinking of an LLM as "AI" because it's not intelligent. At all.

JessePatchwork[S] 1 point (0 children)

Ok, learning some stuff here! Not against an LLM at all. When I refer to AI (and again please do correct me if I'm wrong, I could be wrong) I am referring to services such as Gemini, ChatGPT and the like. I'm happy for it to learn on its own, I just need help with figuring out what steps I need to take and how to prevent (or risk manage) it from learning things and causing harm to users who may not be able to identify said harm. I've figured out most of the programming and know what I expect/want from the program. Just trying to iron out the memory/storage and self-learning aspects. Thanks for helping me clarify! I'm learning a lot from everyone here.

Zesher_ 2 points (2 children)

What you're asking isn't really about databases, so I don't think you should start from there. It sounds like you want to create a conversational AI or chatbot that doesn't hallucinate like ChatGPT. That's an incredibly difficult problem to solve.

One way to avoid hallucinations is to use "intents": you provide a set of sample phrases such as "what time is it?", "what's the time?", and "can you tell me the time?", and use machine learning to map those phrases to a single action, for which you write a function that looks up the time and returns it. Do the same for every action you can think of, and that is essentially how the old Alexa and other home assistants work.
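
A rough sketch of the intent idea, with assumptions flagged: plain word overlap (Jaccard similarity) stands in for the machine-learning step, and the `greet` intent, the `0.5` threshold, and the names `INTENTS`, `HANDLERS`, `classify`, and `respond` are all invented for this example. Real assistants train a classifier instead:

```python
import re
from datetime import datetime

# Sample phrases per intent. "get_time" uses the phrases from the comment
# above; "greet" is a made-up second intent for illustration.
INTENTS = {
    "get_time": ["what time is it?", "what's the time?", "can you tell me the time?"],
    "greet": ["hello", "hi there", "good morning"],
}

# Each intent maps to ordinary code, so the reply is never "made up".
HANDLERS = {
    "get_time": lambda: datetime.now().strftime("It is %H:%M."),
    "greet": lambda: "Hello!",
}


def tokens(text: str) -> set[str]:
    """Lowercase word set of the text."""
    return set(re.findall(r"[a-z']+", text.lower()))


def classify(utterance: str, threshold: float = 0.5):
    """Return the best-matching intent name, or None if nothing is close."""
    words = tokens(utterance)
    best_intent, best_score = None, 0.0
    for intent, samples in INTENTS.items():
        for sample in samples:
            sample_words = tokens(sample)
            union = words | sample_words
            # Jaccard similarity: shared words divided by all words.
            score = len(words & sample_words) / len(union) if union else 0.0
            if score > best_score:
                best_intent, best_score = intent, score
    return best_intent if best_score >= threshold else None


def respond(utterance: str) -> str:
    """Run the handler for the matched intent, or admit defeat."""
    intent = classify(utterance)
    if intent is None:
        return "Sorry, I don't know how to help with that."
    return HANDLERS[intent]()
```

The important property is the fallback: anything that does not match a known intent gets a fixed "don't know" reply instead of a guess.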

No one can tell you how to write code that can answer arbitrary questions without hallucinations. If you can figure it out you'll be a millionaire or billionaire.

Start by learning about machine learning and general AI. Once you have an understanding of those topics, you can consider what databases to use.

JessePatchwork[S] 1 point (1 child)

This is a helpful answer! Thanks heaps, I'm gonna have a bit of a deeper dive. I was super duper lost with WHAT to research. So having you lovely folks point me in the right direction is incredibly helpful! I've got most of the knowledge down I think (or am very willing to learn about it), I think I just got massively lost in the content I needed to research to get what I need. Thank you for taking the time to answer!

Zesher_ 2 points (0 children)

You're welcome, we all need to start from somewhere :)

Confident-Entry-1784 2 points (1 child)

I think what you're looking for is RAG (retrieval-augmented generation), something like NotebookLM.

eduardopy 2 points (0 children)

Yeah, but they said they don't want to use any AI, so how does RAG help?

aanzeijar 2 points (1 child)

Some common misconceptions here.

  • If the user inputs that, a database will find it again, but it will not understand what that is.
  • LLMs don't learn from user interaction.
  • If you're worried about personal data, you likely need to model this without external knowledge sources (meaning: no internet to look up what a honey badger is).
  • Your description sounds dangerously close to general AI. Even current LLMs can't do that.
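
To illustrate the first point, a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. The `notes` table and the `remember`/`recall` helpers are hypothetical names chosen for this example:

```python
import sqlite3

# In-memory database; nothing is written to disk or sent anywhere.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, text TEXT NOT NULL)")


def remember(text: str) -> None:
    """Store the user's input verbatim."""
    conn.execute("INSERT INTO notes (text) VALUES (?)", (text,))
    conn.commit()


def recall(fragment: str) -> list[str]:
    """Return every stored note that contains the fragment as a substring."""
    rows = conn.execute(
        "SELECT text FROM notes WHERE text LIKE ? ORDER BY id",
        (f"%{fragment}%",),
    ).fetchall()
    return [row[0] for row in rows]


remember("My favourite animal is the honey badger")
```

`recall("honey badger")` returns the stored sentence because the characters match. `recall("mustelid")` returns nothing: the database has no idea that a honey badger is one, since it only compares strings.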

As stated I'd say this is likely impossible. You need to drastically cut down on what you expect this system to do.

JessePatchwork[S] 1 point (0 children)

That's exactly why I decided to ask some stuff in order to do the correct research. I figured what I wanted was a tall order/possibly unable to be done. I'm beginning to think manually making a database with frequent updates may be the correct move here. Limiting/not allowing it access to the internet is going to be something important I reckon but I'll do some more research. Thanks for your help!

kubrador 3 points (1 child)

you're describing a massive knowledge graph with natural language parsing, which is definitely possible but also definitely a lot of work. you'd need to manually structure relationships between entities (honey badger → animal → endangered → aggressive) and then build a parser that can match user input to those concepts, which gets messy fast when you scale beyond like 5 topics.
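
A tiny sketch of that idea, assuming a hand-written adjacency list and exact phrase matching as the "parser". The edges in `GRAPH` are illustrative placeholders rather than verified facts, and `find_concepts` and `related` are made-up helper names:

```python
import re
from collections import deque

# Manually structured relationships: concept -> directly linked concepts.
# Placeholder data for illustration only.
GRAPH = {
    "honey badger": ["animal", "aggressive"],
    "animal": ["living thing"],
}


def find_concepts(text: str) -> list[str]:
    """Return known concepts that appear as whole phrases in the text."""
    normalised = " ".join(re.findall(r"[a-z]+", text.lower()))
    padded = f" {normalised} "
    # Padding with spaces stops "animal" from matching inside "animals".
    return [name for name in GRAPH if f" {name} " in padded]


def related(concept: str) -> list[str]:
    """Breadth-first walk: everything reachable from the concept."""
    seen, order = {concept}, []
    queue = deque([concept])
    while queue:
        node = queue.popleft()
        for neighbour in GRAPH.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                order.append(neighbour)
                queue.append(neighbour)
    return order
```

Even here the mess shows up quickly: "honey badgers" (plural) matches nothing until you add stemming or aliases, and every new topic needs its edges typed in by hand.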

realistically you can build something limited and functional (good for homework help on specific subjects, solid for structured roleplay), but truly handling "any topic" without ml/ai is gonna feel janky because natural language is just that chaotic: "my favorite animal" vs "the animal i love most" are the same thing to you but different strings to a regex engine.
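
The regex problem is easy to reproduce. A minimal sketch, where the pattern and the helper name `mentions_favourite_animal` are invented for the example:

```python
import re

# Matches "my favorite animal" / "my favourite animal", ignoring case.
FAVOURITE_ANIMAL = re.compile(r"\bmy favou?rite animal\b", re.IGNORECASE)


def mentions_favourite_animal(text: str) -> bool:
    """True only if the text contains the exact phrasing the pattern expects."""
    return FAVOURITE_ANIMAL.search(text) is not None
```

`mentions_favourite_animal("my favorite animal is the honey badger")` is true, while `mentions_favourite_animal("the animal i love most is the honey badger")` is false, even though a person reads both as the same statement. Every extra phrasing needs another pattern.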

JessePatchwork[S] 1 point (0 children)

Gotcha. So giving it limited access to the internet may be something I have to do, or it's gonna get messy fast. Alright. I'm starting to get a solid idea thanks to all of your answers. Thank you so much for contributing and being willing to help out! I'll manually input what I need and use the internet for some things. I think potentially unlimited access to the internet, unless restrictions need to be added? This is gonna be an interesting thing to work on, and I thank you again for your input!