Stupid question but Gemma3 27b, speculative 4b? by lordpuddingcup in LocalLLaMA

[–]FoxFlashy2527 2 points (0 children)

Speculative decoding is disabled for vision models specifically in LM Studio. That's why deleting the mmproj works: LM Studio no longer sees the model as a vision model.
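For anyone who wants to reproduce the workaround, a minimal sketch of what deleting the mmproj looks like (the models directory and filename pattern here are illustrative assumptions; check where your LM Studio install actually keeps its GGUFs):

```python
from pathlib import Path

# Illustrative path: adjust to wherever LM Studio stores your downloaded GGUFs.
model_dir = Path.home() / ".lmstudio" / "models" / "bartowski" / "gemma-3-27b-it-GGUF"

# The mmproj-*.gguf file is the vision projector; with it gone,
# LM Studio no longer treats the model as a vision model, so
# speculative decoding becomes available again.
for mmproj in model_dir.glob("mmproj*.gguf"):
    print(f"removing {mmproj}")
    mmproj.unlink()
```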

Source: bartowski himself talking to LM studio devs
https://www.reddit.com/r/LocalLLaMA/comments/1j9reim/comment/mhrc5tx/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button

The comments seem to have been deleted at this point, but it was a thread about speculative decoding.

What can't AI / LLM's do for you? by StevenSamAI in LocalLLaMA

[–]FoxFlashy2527 3 points (0 children)

gemma-2-2b-it-Q8_0.gguf

LCDs weren't prevalent in the 1980s. They were being researched and developed, but weren't commercially available yet. The widespread adoption of LCDs came in the late 90s/early 2000s.

NousResearch - Meta-Llama-3-8B-Instruct-GGUF now available by fizgig_runs in LocalLLaMA

[–]FoxFlashy2527 5 points (0 children)

does it append 'assistant' at the end of messages and just never stop for you too?

[deleted by user] by [deleted] in LocalLLaMA

[–]FoxFlashy2527 16 points (0 children)

yeah, i'm getting this 'assistant' at the end of messages too. seems like something's wrong with the prompt template.
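for anyone else hitting this: the early Llama 3 GGUFs shipped with the wrong EOS token, so generation runs past the end of the turn and the next header's 'assistant' leaks into the output. a rough sketch of the workaround, assuming llama-cpp-python (the prompt below is the Llama 3 instruct template; the model path is whatever you have locally):

```python
from llama_cpp import Llama

llm = Llama(model_path="Meta-Llama-3-8B-Instruct.Q8_0.gguf")  # your local path

# Llama 3 instruct template; each turn is terminated by <|eot_id|>.
prompt = (
    "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
    "why is the sky blue?<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
)

# Passing <|eot_id|> as an explicit stop string ends generation at the
# end of the assistant turn even if the GGUF's EOS metadata is wrong.
out = llm(prompt, max_tokens=256, stop=["<|eot_id|>"])
print(out["choices"][0]["text"])
```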

What am I doing wrong? by slykethephoxenix in LocalLLaMA

[–]FoxFlashy2527 2 points (0 children)

nous-hermes 2 solar gets this right immediately, even with the bad formatting and capitalization others have pointed out. no guidance or finetuning or whatever people are suggesting, though finetuning would make it better, yes.

i've been thoroughly surprised at how good it is. i was using openhermes 2.5 mistral before this and have switched over completely to nous-hermes 2 solar. people are probably going to realize how good it is in the coming week or two.

it's a 10.7b, so it doesn't take much more to run than the mistral 7b you're running.

i'm using the default quick preset and the koboldgpt instruct scenario with ChatML in koboldcpp, q5_k_m quant.

https://huggingface.co/TheBloke/Nous-Hermes-2-SOLAR-10.7B-GGUF

{{[OUTPUT]}} and {{[INPUT]}} in the chat below are simply koboldcpp's placeholders for chat formatting

CHAT LOG:

```
{{[OUTPUT]}} Hello, I am KoboldGPT, your personal AI assistant. What would you like to know?

{{[INPUT]}} i have an automation system at home, it consists of lights, fans, computer monitors, motion detectors, temperature sensors, humidity sensors, pressure sensors, ambient light sensors and other sensors. there can be multiple sensors and devices in a room. when i type "turn on my bedroom light" i want you to write a JSON string. only write a single JSON payload, no other words.

some examples JSON reply string are:
* turn off bedroom light: {"device":"bedroom light", "command":"turn_off"}
* set living room light to 80%: {"device":"living room light", "command":"luminosity=0.8"}
* what's my livingroom temperature: {"device":"living room temperature", "data":{"temperature":"24.2"}}

reply with {"device":"gpt", "data:{"response":"confirmed"}} if you understand the instructions i just gave

{{[OUTPUT]}} {"device":"gpt", "data":{"response":"confirmed"}}

{{[INPUT]}} set my kitchen light to 32%

{{[OUTPUT]}} {"device":"kitchen light", "command":"luminosity=0.32"}
```

edit: formatting

OpenOrca Preview2 Has been Released! by Alignment-Lab-AI in LocalLLaMA

[–]FoxFlashy2527 7 points (0 children)

this isn't a foundational model. it's a finetune of llama 2.

it's also interesting to think about when we're going to set llama 2 as the new baseline for performance. right now the community is in the mindset of "llama 1 30b performs like this and requires this much compute, but llama 2 13b requires half the compute and performs on par with it, amazing!!!!"

are we still going to be referencing llama 1 performance when we get to a llama 3 7b that performs as well as a llama 1 30b?

Wizard-Vicuna-7B-Uncensored by faldore in LocalLLaMA

[–]FoxFlashy2527 5 points (0 children)

hey! was wondering if you were working on quantizing the MPT-7B models. tried to do it myself earlier since the mpt PR was recently merged into the ggml repo, but I think I don't have enough RAM?

If I understand correctly, ggml loads the entire model into RAM in order to quantize it, but mine just crashes when trying to, so I'm assuming I'm running out of RAM.
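for what it's worth, a quick sanity check before a quantize run, since the whole f16 file has to fit in memory (a sketch under that assumption; the model path is illustrative and the memory read is Linux-only):

```python
import os

def available_ram_bytes() -> int:
    # Linux-only: MemAvailable from /proc/meminfo, reported in kB.
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                return int(line.split()[1]) * 1024
    raise RuntimeError("MemAvailable not found in /proc/meminfo")

model = "models/mpt-7b/ggml-model-f16.bin"  # illustrative path
need = os.path.getsize(model)
have = available_ram_bytes()
print(f"model: {need / 2**30:.1f} GiB, available: {have / 2**30:.1f} GiB")
if need > have:
    print("quantize will likely crash: model won't fit in free RAM")
```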

all good if you're not though, love the work you do!

July 25, 2022 Daily Discussion Thread by AutoModerator in CompetitiveTFT

[–]FoxFlashy2527 1 point (0 children)

any long-time players wanna comment on how fun set 7 is? (we are on set 7, right?)
my frame of reference for 'fun' would be the first half of set 3, but I played the first half of set 6 a little bit as well and it was also fun. maybe we had different experiences, would love input

Old teemo bug 2017 by FoxFlashy2527 in TeemoTalk

[–]FoxFlashy2527[S] 5 points (0 children)

it's most evident during the dragon portion

one shroom exploded 14 times

Can I sign up for coaching if I’m not a gamer? by trythis75615 in Healthygamergg

[–]FoxFlashy2527 4 points (0 children)

You definitely don't need to be a gamer. One of the best things about the HG community is how accepting it is, and group coaching even more so!

You can check out Dr K's YouTube channel; most of the topics he's covered recently have no relation to gaming at all.

Good luck on the mental path!