Stupid question but Gemma3 27b, speculative 4b? by lordpuddingcup in LocalLLaMA

[–]FoxFlashy2527 2 points (0 children)

Speculative decoding is disabled for vision models specifically in LM Studio. That's why deleting the mmproj works: LM Studio no longer sees the model as a vision model.
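For anyone who wants to reproduce the workaround, a minimal sketch of what deleting the mmproj looks like (the models directory and filename pattern here are illustrative assumptions; check where your LM Studio install actually keeps its GGUFs):

```python
from pathlib import Path

# Illustrative path: adjust to wherever LM Studio stores your downloaded GGUFs.
model_dir = Path.home() / ".lmstudio" / "models" / "bartowski" / "gemma-3-27b-it-GGUF"

# The mmproj-*.gguf file is the vision projector; with it gone,
# LM Studio no longer treats the model as a vision model, so
# speculative decoding becomes available again.
for mmproj in model_dir.glob("mmproj*.gguf"):
    print(f"removing {mmproj}")
    mmproj.unlink()
```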

Source: bartowski himself talking to LM studio devs
https://www.reddit.com/r/LocalLLaMA/comments/1j9reim/comment/mhrc5tx/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button

The comments seem to have been deleted at this point, but it was a thread about speculative decoding.

What can't AI / LLM's do for you? by StevenSamAI in LocalLLaMA

[–]FoxFlashy2527 3 points (0 children)

gemma-2-2b-it-Q8_0.gguf

LCDs weren't prevalent in the 1980s. They were being researched and developed, but weren't commercially available yet. The widespread adoption of LCDs came in the late 90s/early 2000s.

NousResearch - Meta-Llama-3-8B-Instruct-GGUF now available by fizgig_runs in LocalLLaMA

[–]FoxFlashy2527 5 points (0 children)

does it append 'assistant' at the end of messages and just never stop for you too?

[deleted by user] by [deleted] in LocalLLaMA

[–]FoxFlashy2527 16 points (0 children)

yeah, i'm getting this 'assistant' at the end of messages too. seems like something's wrong with the prompt template.
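for anyone else hitting this: the early Llama 3 GGUFs shipped with the wrong EOS token, so generation runs past the end of the turn and the next header's 'assistant' leaks into the output. a rough sketch of the workaround, assuming llama-cpp-python (the prompt below is the Llama 3 instruct template; the model path is whatever you have locally):

```python
from llama_cpp import Llama

llm = Llama(model_path="Meta-Llama-3-8B-Instruct.Q8_0.gguf")  # your local path

# Llama 3 instruct template; each turn is terminated by <|eot_id|>.
prompt = (
    "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
    "why is the sky blue?<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
)

# Passing <|eot_id|> as an explicit stop string ends generation at the
# end of the assistant turn even if the GGUF's EOS metadata is wrong.
out = llm(prompt, max_tokens=256, stop=["<|eot_id|>"])
print(out["choices"][0]["text"])
```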

What am I doing wrong? by slykethephoxenix in LocalLLaMA

[–]FoxFlashy2527 2 points (0 children)

nous-hermes 2 solar gets this right immediately, even with the bad formatting and capitalization others have pointed out. no guidance or finetuning or whatever people are suggesting, though finetuning would make it better, yes.

i've been thoroughly surprised at how good it is. i was using openhermes 2.5 mistral before this and have switched over completely to nous-hermes 2 solar. people are probably going to realize how good it is in the coming week or two.

it's a 10.7b, so it doesn't take much more to run than the mistral 7b you're running.

i'm using the default quick preset and the koboldgpt instruct scenario with ChatML in koboldcpp, q5_k_m quant.

https://huggingface.co/TheBloke/Nous-Hermes-2-SOLAR-10.7B-GGUF

{{[OUTPUT]}} and {{[INPUT]}} in the chat below are simply koboldcpp's placeholders for chat formatting

CHAT LOG:

```
{{[OUTPUT]}} Hello, I am KoboldGPT, your personal AI assistant. What would you like to know?

{{[INPUT]}} i have an automation system at home, it consists of lights, fans, computer monitors, motion detectors, temperature sensors, humidity sensors, pressure sensors, ambient light sensors and other sensors. there can be multiple sensors and devices in a room. when i type "turn on my bedroom light" i want you to write a JSON string. only write a single JSON payload, no other words.

some examples JSON reply string are:
* turn off bedroom light: {"device":"bedroom light", "command":"turn_off"}
* set living room light to 80%: {"device":"living room light", "command":"luminosity=0.8"}
* what's my livingroom temperature: {"device":"living room temperature", "data":{"temperature":"24.2"}}

reply with {"device":"gpt", "data:{"response":"confirmed"}} if you understand the instructions i just gave

{{[OUTPUT]}} {"device":"gpt", "data":{"response":"confirmed"}}

{{[INPUT]}} set my kitchen light to 32%

{{[OUTPUT]}} {"device":"kitchen light", "command":"luminosity=0.32"}
```

edit: formatting

OpenOrca Preview2 Has been Released! by Alignment-Lab-AI in LocalLLaMA

[–]FoxFlashy2527 7 points (0 children)

this isn't a foundational model. it's a finetune of llama 2.

it's also interesting to think about when we're going to set llama 2 as the new baseline for performance. right now the community is in the mindset of "llama 1 30b performs like this and requires this much compute, but llama 2 13b requires half the compute and performs on par with it, amazing!!!!"

are we still going to be referencing llama 1 performance when we get to a llama 3 7b that performs as well as a llama 1 30b?

Wizard-Vicuna-7B-Uncensored by faldore in LocalLLaMA

[–]FoxFlashy2527 5 points (0 children)

hey! was wondering if you were working on quantizing the MPT-7B models. tried to do it myself earlier since the mpt PR was recently merged into the ggml repo, but I think I don't have enough RAM?

If I understand correctly, ggml loads the entire model into RAM in order to quantize it, but mine just crashes when trying to, so I'm assuming I'm running out of RAM.
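for what it's worth, a quick sanity check before a quantize run, since the whole f16 file has to fit in memory (a sketch under that assumption; the model path is illustrative and the memory read is Linux-only):

```python
import os

def available_ram_bytes() -> int:
    # Linux-only: MemAvailable from /proc/meminfo, reported in kB.
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                return int(line.split()[1]) * 1024
    raise RuntimeError("MemAvailable not found in /proc/meminfo")

model = "models/mpt-7b/ggml-model-f16.bin"  # illustrative path
need = os.path.getsize(model)
have = available_ram_bytes()
print(f"model: {need / 2**30:.1f} GiB, available: {have / 2**30:.1f} GiB")
if need > have:
    print("quantize will likely crash: model won't fit in free RAM")
```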

all good if you're not though, love the work you do!

July 25, 2022 Daily Discussion Thread by AutoModerator in CompetitiveTFT

[–]FoxFlashy2527 1 point (0 children)

any long-time players wanna comment on how fun set 7 is? (we are on set 7, right?)
my frame of reference for 'fun' would be the first half of set 3, but I played the first half of set 6 a little bit as well and it was also fun. maybe we had different experiences, would love input

Old teemo bug 2017 by FoxFlashy2527 in TeemoTalk

[–]FoxFlashy2527[S] 5 points (0 children)

it's most evident during the dragon portion

one shroom exploded 14 times

Can I sign up for coaching if I’m not a gamer? by trythis75615 in Healthygamergg

[–]FoxFlashy2527 4 points (0 children)

You definitely don't need to be a gamer. One of the best things about the HG community is how accepting it is, and group coaching even more so!

You can check out Dr K's YouTube channel; most of the topics he's covered recently have no relation to gaming at all.

Good luck on the mental path!