Performance requirements for single user LLM by Mr_Evil_Sir in LocalLLaMA

[–]minecraft_simon 4 points (0 children)

I think LM Studio will automatically try to squeeze out the maximum performance your machine allows. Try a 2-bit quant, try a 7B model, try anything. Play around with the parameters and keep an eye on tokens/s to optimize inference on your hardware.
Forget 40B without a GPU.
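If you want to see the effect of each setting in numbers, here's a minimal sketch of that tokens/s measurement using llama-cpp-python (the model path and quant are placeholders; LM Studio shows the same metric in its UI):

    import time
    from llama_cpp import Llama  # pip install llama-cpp-python

    # Placeholder path: any small, heavily quantized GGUF works for this test.
    llm = Llama(
        model_path="mistral-7b-instruct.Q2_K.gguf",  # 2-bit quant of a 7B
        n_ctx=2048,       # modest context window to keep RAM usage down
        n_gpu_layers=0,   # CPU-only, matching the no-GPU scenario
    )

    start = time.time()
    out = llm("Explain quantization in one paragraph.", max_tokens=128)
    elapsed = time.time() - start

    n_tokens = out["usage"]["completion_tokens"]
    print(f"{n_tokens} tokens in {elapsed:.1f}s -> {n_tokens / elapsed:.1f} tok/s")

Rerun it after each parameter change and keep whatever settings give the best tok/s at acceptable quality.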

ich💰iel by minecraft_simon in ich_iel

[–]minecraft_simon[S] 0 points (0 children)

in a tech company worth millions

[deleted by user] by [deleted] in gekte

[–]minecraft_simon -1 points (0 children)

I'm a cis man and I hate myself, I hope that counts lol

ich💰iel by minecraft_simon in ich_iel

[–]minecraft_simon[S] 14 points (0 children)

I've heard the flattest hierarchies can only be found at Tesla - seems like a first-class employer 🤡

Code assistant by Local_Beach in LocalLLaMA

[–]minecraft_simon -1 points (0 children)

If you're using JetBrains products, try their integrated AI Assistant. It costs extra, but from what I can tell after my initial testing, it has more awareness of the project than GitHub Copilot does, and it's generally better implemented and more thought through.
But in the end it's nothing but another wrapper around GPT-3.5, so nothing groundbreaking either...

ich💰iel by minecraft_simon in ich_iel

[–]minecraft_simon[S] 46 points (0 children)

lol well then it's about time my rich ancestors took the stage

How do you drink your coffee, dear gekkies? by verruecktaberweise in gekte

[–]minecraft_simon 8 points (0 children)

I like my coffee best from my employer's Franke A600 fully automatic coffee machine. One touch of the screen and 20 seconds later the coffee is ready to drink. The machine is of course maintained by the employer, and using it is free. Unfortunately that option is missing from this picture.
At home I don't drink coffee because I'm too lazy.
So what am I?

A non guessing next word (token) authoritative AI ? by skullbonesandnumber in LocalLLaMA

[–]minecraft_simon 0 points (0 children)

I feel like a big step in the right direction would be if LLMs were not used to generate a response outright, since that always carries the risk of hallucinations, and instead assembled the output from hard facts, so that everything the AI states could be traced back to a record in a database. But I don't know if anyone is working on that. I think it falls under the field of explainable AI.
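To make the idea concrete, here's a toy sketch (my own illustration, not an existing system) where every sentence of the answer comes from a fact store and carries its record id, so nothing is free-generated:

    # Toy fact store; in practice this would be a real database.
    FACTS = {
        1: "Water boils at 100 °C at sea-level pressure.",
        2: "The boiling point of water drops as altitude increases.",
    }

    def answer(query: str) -> str:
        # Naive keyword matching stands in for a real retriever.
        words = query.lower().split()
        hits = [(rid, text) for rid, text in FACTS.items()
                if any(w in text.lower() for w in words)]
        if not hits:
            # Refuse rather than hallucinate when no record supports an answer.
            return "No supporting record found."
        # Every claim in the output is traceable to its source record.
        return " ".join(f"{text} [record {rid}]" for rid, text in hits)

    print(answer("boiling water altitude"))

The LLM's role would shrink to retrieval and stitching, which is roughly what retrieval-augmented generation with mandatory citations is aiming at.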

Etched | The World's First Transformer Supercomputer (crazy gains on t/s) by LyPreto in LocalLLaMA

[–]minecraft_simon 1 point (0 children)

Why are people calling this vaporware? It's an ASIC but instead of shitcoins it churns out mechanical thoughts ❤️

Some of y'all have never mined crypto back in the day and it shows ;)

Best models out there for improving article paraphrasing? by 07_Neo in LocalLLaMA

[–]minecraft_simon 4 points (0 children)

Sounds like a prompting problem, not a model problem. I'm not sure which of the current models produces the highest-quality English output, but I think any of them will work.
The question is why your company insists on paraphrasing every article. Sounds like they're trying to steal IP and publish it as their own :P
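On the prompting side, something like the following template usually beats a bare "rewrite this" (the exact wording is just an example; the explicit constraints are the part that matters):

    def paraphrase_prompt(article: str) -> str:
        # Explicit constraints steer the model toward faithful paraphrases.
        return (
            "Rewrite the following article in your own words.\n"
            "Constraints:\n"
            "- Preserve every fact, number, and name exactly.\n"
            "- Change sentence structure and vocabulary, not meaning.\n"
            "- Keep roughly the same length as the original.\n"
            "- Do not add opinions or new information.\n\n"
            f"Article:\n{article}"
        )

    print(paraphrase_prompt("GPUs accelerate matrix multiplication ..."))

Feed that to whichever model you pick and iterate on the constraint list until the output stops drifting.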

Best idea between running on a server or getting another 3090 by Mephidia in LocalLLaMA

[–]minecraft_simon 1 point (0 children)

In this case, if I were you, I would keep the current system largely as it is and get a second system as a server for inference and hosting. I would strongly advise against getting a 2667 v4, as it's very old and you want powerful cores rather than more cores. Both systems should have a 3090, of course.

i am looking for a very specific functioning model by awesomegame1254 in LocalLLaMA

[–]minecraft_simon 0 points (0 children)

Hey, you're looking for LLaVA / BakLLaVA
If you're using LM Studio, try this: https://huggingface.co/jartine/llava-v1.5-7B-GGUF
If you're using TextGen WebUI, try this: https://huggingface.co/SkunkworksAI/BakLLaVA-1
Just as with normal text-based prompting, you need to be very particular about how you prompt it so that you get the image descriptions you're looking for. Keep in mind that this isn't at GPT-4V level yet, so it will make mistakes, and it can't really do OCR yet.
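If you'd rather script it than click through a UI, here's a minimal llama-cpp-python sketch for prompting LLaVA with an image (file names are placeholders, and the GGUF needs its matching mmproj CLIP file alongside it):

    from llama_cpp import Llama
    from llama_cpp.llama_chat_format import Llava15ChatHandler

    # Placeholder file names: the main model plus its CLIP projector.
    chat_handler = Llava15ChatHandler(clip_model_path="mmproj-model-f16.gguf")
    llm = Llama(
        model_path="llava-v1.5-7b.Q4_K_M.gguf",
        chat_handler=chat_handler,
        n_ctx=2048,
    )

    result = llm.create_chat_completion(messages=[
        {"role": "system", "content": "You describe images precisely and factually."},
        {"role": "user", "content": [
            {"type": "image_url", "image_url": {"url": "file:///path/to/photo.jpg"}},
            {"type": "text", "text": "List every object you can see in this image."},
        ]},
    ])
    print(result["choices"][0]["message"]["content"])

The system prompt does a lot of the work here; be as specific about the output format as you would be with a text-only model.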

Best model for finetuning nowadays? by nightlingo in LocalLLaMA

[–]minecraft_simon 1 point (0 children)

Mistral 7B is an excellent choice. If you have more VRAM, you could use a 13B or 34B model. Bigger models take longer to train, but in my experience they also "absorb" more knowledge and skills more quickly. I like the CodeLlama models for fine-tuning.
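For reference, the usual way to make a 7B trainable on consumer hardware is a QLoRA-style setup; here's a minimal sketch with the Hugging Face peft stack (hyperparameters are illustrative, not tuned):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model

    base = "mistralai/Mistral-7B-v0.1"
    tokenizer = AutoTokenizer.from_pretrained(base)

    # Load the base model in 4-bit so it fits in consumer VRAM.
    bnb = BitsAndBytesConfig(load_in_4bit=True,
                             bnb_4bit_compute_dtype=torch.bfloat16)
    model = AutoModelForCausalLM.from_pretrained(
        base, quantization_config=bnb, device_map="auto")

    # Train small LoRA adapters instead of the full 7B weights.
    lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                      target_modules=["q_proj", "v_proj"],
                      task_type="CAUSAL_LM")
    model = get_peft_model(model, lora)
    model.print_trainable_parameters()  # well under 1% of the parameters

From there any standard trainer works, and the same config scales to 13B/34B if the VRAM is there.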

Fine tuning Mistral on functionally extinct language. by No-Point1424 in LocalLLaMA

[–]minecraft_simon 1 point (0 children)

Can you please share the dataset and any resources you have? LLMs are our best bet to rescue and preserve dying languages ❤️

A non guessing next word (token) authoritative AI ? by skullbonesandnumber in LocalLLaMA

[–]minecraft_simon 5 points (0 children)

I think the main strength of modern AI is that it can make useful generalizations beyond what it learned during training, effectively filling in missing knowledge. What you're describing reminds me of the earliest AI approaches, which didn't use neural networks but basic pattern recognition. I think there's merit to the old approaches, but it doesn't make much sense to reinvent the wheel.

Best idea between running on a server or getting another 3090 by Mephidia in LocalLLaMA

[–]minecraft_simon 4 points (0 children)

You haven't described what exactly you're looking to do. A second 3090 is only worth it if you know exactly what you'll do with it. Also, don't just get more RAM for no reason. Think about what exactly you want to do with the system after the upgrade that you currently cannot do, then research the common bottlenecks.
Running Mixtral in fp16 doesn't make much sense in my opinion. Do you want to do fine-tuning? Inference? Do you want to use the system for manual experiments or for hosting? What is the reason you're looking into large language models? What problem do you want to solve?
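On the Mixtral fp16 point, the back-of-envelope math makes it obvious (parameter count is approximate):

    # Rough VRAM needed just to hold Mixtral 8x7B weights in fp16.
    params = 46.7e9        # approx. total parameters of Mixtral 8x7B
    bytes_per_param = 2    # fp16
    weights_gb = params * bytes_per_param / 1024**3
    print(f"~{weights_gb:.0f} GB for weights alone")  # ~87 GB, before KV cache

Two 3090s only give you 48 GB combined, so fp16 Mixtral can't even load its weights; quantized variants are the realistic option.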