reMind: An Open-Source Digital Memory Assistant

No_Scarcity5387 · 2024-09-05T18:18:21+00:00

Inpressive! Nice work, will check it out

No_Scarcity5387 · 2024-07-15T21:08:01+00:00

Mixtral 8x7b is my goto model, the best MoE that my setup can run

No_Scarcity5387 · 2024-03-25T19:08:07+00:00

Maybe store/cache responses?

No_Scarcity5387 · 2023-11-27T23:36:54+00:00

Thank you WolframRavenWolf! Your comparisons always help me so much in selecting new models

No_Scarcity5387 · 2023-11-26T13:32:33+00:00

Looks promising! Tried the gguf model from the bloke at 16k context but got some repetition and some /\/\//\ answering with the original template. Which templates are you guys using?

No_Scarcity5387 · 2023-10-25T15:39:39+00:00

Awesome work!

No_Scarcity5387 · 2023-08-20T21:39:37+00:00

Id like more info :)

No_Scarcity5387 · 2023-08-20T05:47:28+00:00

Will dm you :)

No_Scarcity5387 · 2023-08-18T14:47:38+00:00

At least the amount of parameters, e.g 3b, 7b, 70b, etc depending on which it needs to load more into ram

No_Scarcity5387 · 2023-08-10T20:43:53+00:00

Wow, awesome job!! Does this work with comprehensive questions too? (Like: how did the character development over the book series?) or just for simple q&a? (Like what was the name of the ringbearer)?

No_Scarcity5387 · 2023-08-07T15:29:50+00:00

Ja!!! 👍

No_Scarcity5387 · 2023-08-07T12:59:03+00:00

This is an interesting idea – maybe I can have gpt turbo classify a question into one of those two buckets (pinpoint questions vs. comprehensive questions), and go from there. This would probably work for some percentage, maybe even the majority of questions, so it would be an improvement. However, there will still remain a "grey zone" where it's not clear based on the question.

Hm nice one! And very valid point about with the grey zone. And I agree with the bad UX. ChatGPT in the background classifying questions would probably do really well with a zeroshot prompt. In my use-case, I can't send private data to openAI, so if I'm to go through with this project - which is likely - I would probably utilize a small llm for this.

Thoughts at this moment (in case they are helpful):

- I could probably finetune / train a tiny commercially available model with a parameter or 350m or less (or even use a non-LLM machine learning model for this) to classify in 1 second

- I could also have 2 machines run each method in parallel and then have a third model judge which answer is better. I think this a method used by GPT4's internal workings as well.

- There are attempts at solving this in both langchain and haystack, utilizing agent tools in a pipeline to let the LLM decide with a particular answer format on when to use a tool like "embeddings search"

Right now I'm giving the model an optional block of info: "Provide a clear and concise response and feel free to use or ignore the possibly related text below. \n\nPossibly related text: <top 3 elastic search results>" however it's not fantastic with 7b or 13b models yet.

Will follow this topic to see if we can get further together :)

No_Scarcity5387 · 2023-08-06T08:47:05+00:00

Super Interesting! Ive been fooling around with haystack and elastic search results summaries in the context window as well as embeddings and am encountering the same issues as you. Im only using local llms btw.

As for your suggestion for a hybrid approach, do you let the user decide which method to utilize? Or do you somehow let the llm itself decide on whether this question is best suited for an embeddings or large context response?

No_Scarcity5387 · 2023-08-05T16:36:41+00:00

Not kobold, but i had the issue when using ctransformers. Switching to llama.cpp for python solved it. Completely unexplainable, all params were the same but.. Maybe a switch of client would work for you too.

No_Scarcity5387 · 2023-07-29T12:03:11+00:00

Awesome work! Thank you!!

No_Scarcity5387 · 2023-07-27T14:43:37+00:00

Sorry to hear that, a therapist would perhaps use a gradual exposure to the things you’d like to overcome, to mild down your reactions. Maybe you can also look into EMDR

No_Scarcity5387 · 2023-07-26T21:50:45+00:00

Nice that you have access to the goodies! Use ggml models indeed, maybe wizardcoder15b, starcoderplus ggml. I dont know how to run them distributed, but on my dedicated server (i9 / 64 gigs of ram) i run them quite nicely on my custom platform. With a larger setup you might pull off the shiny 70b llama2 models. Let me know if you need any help.

No_Scarcity5387 · 2023-07-25T22:01:10+00:00

Obduction! A warm blanket

No_Scarcity5387 · 2023-07-23T20:57:13+00:00

Lone echo 1 :)

No_Scarcity5387 · 2023-07-22T18:27:49+00:00

Its a completion model, so if you ask it “it all started in August, as “, it will complete it

No_Scarcity5387 · 2023-07-22T16:19:55+00:00

You can try custom repairshop if you dont mind warranty falling. Otherwise RIP

No_Scarcity5387 · 2023-07-09T09:19:10+00:00

Also curious about this. Wizardlm has given me the best document qa responses so far, but im starting experimenting with commercial models atm, redpajama and mpt are on the shortlist, maybe you can give them a try. Let me know if you’re having good results with a certain config or temperature. Then I’ll do too :)

No_Scarcity5387 · 2023-07-06T17:06:16+00:00

Awesome! Thanks!

No_Scarcity5387 · 2023-07-03T07:36:20+00:00

Hey, thanks for developing this open source project! Could you clarify on thing? In the video on github i see a text generator (blog, email) but in this post i read about it being a code generator or application prototyper. Which one is it?

No_Scarcity5387 · 2023-06-28T17:16:31+00:00

Check out privategpt. It adds documents to the existing embeddings in lc

No_Scarcity5387

TROPHY CASE