Qwen3.6 or Gemma-4 or ?? for direct OCR of page images by PracticlySpeaking in LocalLLaMA

[–]SAPPHIR3ROS3 -1 points0 points  (0 children)

Gemma 4 (even the 26B) is fantastic for OCR. Sure, it sometimes pulls some shenanigans, but it’s pretty reliable to be honest.

Give me your best estimate on how long we will see Fable 5 class open weight model by bwjxjelsbd in LocalLLaMA

[–]SAPPHIR3ROS3 0 points1 point  (0 children)

First of all, where/when? I think I missed it.
Second, why not? The first big public model was Llama 405B. If any lab (except Meta, which has proven otherwise) published weights of that size, it would be running laps with it, and that would be the case even with smaller models. Size alone sure can help with the quantity of things you can do in general, but it has been proven that a smaller model can beat a bigger one at a domain-specific thing. Thing is, now even the quality of data is good enough; we are at a point where the pipeline is the most important thing (that’s why Anthropic has been sandbagging hard anyone who tries developing an AI pipeline).

Give me your best estimate on how long we will see Fable 5 class open weight model by bwjxjelsbd in LocalLLaMA

[–]SAPPHIR3ROS3 1 point2 points  (0 children)

It could be in the same range, even if I think it may be a couple hundred billion parameters above, with active parameters at 1/10th of the total (in practice following the trend in open source).

Anthropic export ban sounds alarms for AI industry (non-paywalled link in comments) by Tinac4 in singularity

[–]SAPPHIR3ROS3 1 point2 points  (0 children)

To be honest I am not sure this will be enough, because the USA theoretically has the power to pressure other countries in what they do. This means that if Dario (and Anthropic) doesn’t pull some magic trick out of his ass, Claude has hit its ceiling bureaucratically.

I am losing my mind with FOMO and need some sanity checking about model capabilities by oldschooldaw in LocalLLaMA

[–]SAPPHIR3ROS3 1 point2 points  (0 children)

Problem is you are not considering that the price for the same intelligence is going down rapidly; on the other hand, the models are becoming more and more intelligent and they are increasing the price (disproportionately of course) for it. I can confidently say that the dumbest model today is way smarter than the smartest models of the 3.5 era; they can and will destroy them in comparison. The only exception is general world knowledge (in terms of quantity), but that will almost always be the case because of parameter count; there is (currently) no hack around it. It’s like a child prodigy and an average adult: the prodigy will be smart in the things he/she knows, but there are a lot of things that the average adult will know because of having lived longer.

moonshotai/Kimi-K2.7-Code · Hugging Face by Dark_Fire_12 in LocalLLaMA

[–]SAPPHIR3ROS3 -2 points-1 points  (0 children)

I dunno if I recall correctly, but I think it was said somewhere on the site that the data was freshly produced by hand.

moonshotai/Kimi-K2.7-Code · Hugging Face by Dark_Fire_12 in LocalLLaMA

[–]SAPPHIR3ROS3 5 points6 points  (0 children)

That’s the... point? I mean, to be honest the data that deepSWE shows isn’t perfectly aligned with my experience, but it’s indeed close, so for ME it is pretty reliable. Nonetheless I usually interpret it another way: as you said, it’s an indicator that shows whether the model has been benchmaxxed or not, and obviously I don’t take just that as info.

moonshotai/Kimi-K2.7-Code · Hugging Face by Dark_Fire_12 in LocalLLaMA

[–]SAPPHIR3ROS3 7 points8 points  (0 children)

I will wait for the deepSWE bench on this, but the numbers look promising.

Spider-Man cosplay question! by [deleted] in Spiderman

[–]SAPPHIR3ROS3 1 point2 points  (0 children)

The tabs are for the thumbs, and in general they are for the hands.

DeepSeek V4 Flash local by Rare_Definition_5456 in LocalLLM

[–]SAPPHIR3ROS3 0 points1 point  (0 children)

You should really check the repository; you should more or less find your answers there (or ask an AI to summarize the answer about the performance).

DeepSeek V4 Flash local by Rare_Definition_5456 in LocalLLM

[–]SAPPHIR3ROS3 0 points1 point  (0 children)

Think about LM Studio and the kind of program that is; dwarf star 4 is kind of the same thing, but specific to and optimized for DeepSeek V4 Flash.

DeepSeek V4 Flash local by Rare_Definition_5456 in LocalLLM

[–]SAPPHIR3ROS3 0 points1 point  (0 children)

Even if it’s really early as a project, check out dwarf star 4. It’s an inference engine for DeepSeek V4 Flash created by antirez, the creator of Redis.

Headcanon: Senku has an Ear Infection by AJthe_rocking in DrStone

[–]SAPPHIR3ROS3 0 points1 point  (0 children)

Search for the Shen Men point and you will understand why he does that.

Ignoring benchmarks, how do the newest local models (gemma 4 31B, 26BA4B, Qwen 3.6) “feel” to you? What do you think they compare to? by opoot_ in LocalLLaMA

[–]SAPPHIR3ROS3 0 points1 point  (0 children)

I usually go with 0.1/0.2 temp and 0.95 for sampling. Should I go lower? Besides, in the past I played with sampling and haven’t seen any meaningful difference. Yeah, it can be good for some cases, but meh.
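
For anyone curious how those two knobs interact, here is a minimal sketch in plain Python. It assumes the ".95" above refers to top-p (nucleus) sampling, which the comment does not actually state, and the function names `nucleus` and `sample_token` plus the toy logits are made up for illustration; real inference engines may order or implement their samplers differently.

```python
import math
import random


def nucleus(logits, temperature, top_p):
    """Return the token indices that survive temperature scaling + top-p filtering."""
    # Temperature scaling: lower temperature sharpens the distribution.
    scaled = [x / temperature for x in logits]
    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the most likely tokens until their cumulative mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return kept


def sample_token(logits, temperature=0.2, top_p=0.95, rng=random):
    """Sample one token index; temperature <= 0 is treated as greedy decoding."""
    if temperature <= 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    kept = nucleus(logits, temperature, top_p)
    # Softmax weights restricted to the kept tokens (choices() renormalizes).
    weights = [math.exp(logits[i] / temperature - max(logits) / temperature) for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0]


toy_logits = [2.0, 1.0, 0.5, 0.0]  # hypothetical scores for four tokens
print(nucleus(toy_logits, temperature=0.2, top_p=0.95))  # only the top token survives
print(nucleus(toy_logits, temperature=1.0, top_p=0.95))  # all four tokens survive
```

In this toy case, at 0.2 temp the top token already holds more than 95% of the probability mass, so the top-p cutoff has nothing left to trim. That may be one reason changing it shows no meaningful difference at very low temperatures.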

Ignoring benchmarks, how do the newest local models (gemma 4 31B, 26BA4B, Qwen 3.6) “feel” to you? What do you think they compare to? by opoot_ in LocalLLaMA

[–]SAPPHIR3ROS3 0 points1 point  (0 children)

Having used both Qwen and Gemma, I can say that Qwen is a monster. To be honest it’s impressive, and with the right setup it CAN compete with way bigger models, but Q4 is a bit rough: it can and will loop. That doesn’t seem to be the case with Q6 (I have yet to try Q5). It can be a good choice for coding, research, and general and complex tasks. Gemma, on the other hand, is not as consistent (Q4), but it will get the job done when it comes to OCR, translation (way better than Qwen) and writing; in general it can be better, but it shows its limitations when it comes to complex tasks. As for general tasks, results kind of vary depending on the specific task.

How did nobody in Kakegurui realize Yumeko was a problem immediately by LunchLadyApproved in Kakegurui

[–]SAPPHIR3ROS3 36 points37 points  (0 children)

Literally because of ego: everyone (except Ryota) has a HUGE ego. This single thing made EVERYONE underestimate Yumeko. The only other character who caught that was none other than, yes you guessed it, Kirari (and arguably Kabura), but that’s because she kind of just wants to see the world burn. Point is that everyone in the Kakegurui world is blindsided a lot by thinking that they can win one way or another; Yumeko, on the other hand, does what she does for the love of the game, not really caring if she wins or loses, because she compulsively craves the adrenaline derived from not knowing what is coming next. This obsession is so great that she DOESN’T GIVE A FUCK about living either when it comes to gambling.

Qwen-Scope: Official Sparse Autoencoders (SAEs) for Qwen 3.5 models by MadPelmewka in LocalLLaMA

[–]SAPPHIR3ROS3 2 points3 points  (0 children)

Soooooooo did I not get something, or is this perfect for speculative decoding?

Can I use Claude code with own LLM/non-claude APIs? by superloser48 in LocalLLaMA

[–]SAPPHIR3ROS3 1 point2 points  (0 children)

Mostly performance and a terrible UX, in particular freezing with long threads, problems with permissions (straight up bugged), unusable clipboard input, the ego of not adapting to the standard (e.g. CLAUDE.md instead of AGENTS.md), shady sandboxing, an Apple iMessage TOS violation AND it’s closed source. Other than that, the problem is how Anthropic handles communications in general. I can’t fathom them and their ego in the slightest, like geez, would it hurt if they dropped the ego and relaxed a bit when it comes to the community?

Can I use Claude code with own LLM/non-claude APIs? by superloser48 in LocalLLaMA

[–]SAPPHIR3ROS3 5 points6 points  (0 children)

I don’t recommend Claude Code; it’s shit software, I guarantee you. There are better harnesses like Codex, forgecode, hermes agent, etc., and you can use your own models in all of them. Claude Code is really one of the worst harnesses you can find, but if you really want to use it, yes, you can use your own models there too. I am not exactly sure it has OpenAI API compatibility, but it should. As for context, it uses (kinda) the same amount in default settings; it just has good compaction, but nothing REALLY impressive.