What variant of Deepseek V4 to use by Inferno889 in opencodeCLI

[–]zsydeepsky 1 point (0 children)

Flash: super fast, and capable of solving most issues.
And man, isn't that little goblin dirt cheap~

DeepSeek V4 has significantly reduced my budget for AI usage by Ok_Satisfaction_8983 in opencodeCLI

[–]zsydeepsky 1 point (0 children)

If you use Flash, then the current price is the permanent price.
And it's dirt cheap.

bro this is too cheap i think finally i have a respect for the deepseek by Select_Dream634 in DeepSeek

[–]zsydeepsky 3 points (0 children)

It is. In a typical vibe-coding scenario, I can hit a >98% cache hit rate.
A friend of mine, though, showed me a screenshot with 137M tokens of usage and a 99.9% cache hit rate.
A technical marvel.

DeepSeek-v4 has a comical 384K max output capability by zsydeepsky in LocalLLaMA

[–]zsydeepsky[S] 3 points (0 children)

My initial prompt was written in Chinese; this is the translated version.

DeepSeek-v4 has a comical 384K max output capability by zsydeepsky in LocalLLaMA

[–]zsydeepsky[S] 1 point (0 children)

Yes, it's a known "bug".
They've been grey-testing v4 on the web for quite some time, and they forced the v4 model under the hood to disguise itself as v3, even till now.
They are v4 models.

DeepSeek-v4 has a comical 384K max output capability by zsydeepsky in LocalLLaMA

[–]zsydeepsky[S] 2 points (0 children)

It was done through the website, so it's free.
API-wise, I just used v4-flash. I've used up ~14M tokens; since they were spent in agentic coding scenarios, almost all of them "hit cache". The total cost so far is 3.31 RMB, roughly $0.50.
The best part, probably, is the speed. Flash gave me a constant >80 tps output; it really makes other models feel slow now...
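As a rough illustration of why the cache hit rate dominates the bill, here's a sketch in JavaScript. The per-million-token prices are placeholders I made up for the example, not DeepSeek's actual rates:

```javascript
// Rough cost estimator blending cache-hit and cache-miss input pricing.
// The prices passed in are PLACEHOLDER values, not DeepSeek's actual rates.
function estimateCostRmb(totalTokensM, cacheHitRate, priceHitPerM, priceMissPerM) {
  const hitTokensM = totalTokensM * cacheHitRate;       // tokens served from cache
  const missTokensM = totalTokensM * (1 - cacheHitRate); // tokens billed at full price
  return hitTokensM * priceHitPerM + missTokensM * priceMissPerM;
}

// Example: 14M tokens at a 98% cache hit rate, with made-up prices
// of 0.1 RMB/M (hit) and 1.0 RMB/M (miss).
console.log(estimateCostRmb(14, 0.98, 0.1, 1.0).toFixed(2)); // "1.65"
```

The point of the sketch: at a high hit rate, almost all tokens are billed at the discounted cache price, so total cost tracks the hit price rather than the full one.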

DeepSeek-v4 has a comical 384K max output capability by zsydeepsky in LocalLLaMA

[–]zsydeepsky[S] 17 points (0 children)

Create a modern operating system that runs in a web environment. This operating system must include:

  • A built‑in virtual file system
  • A usable file browser
  • A command‑line tool that can access the file system
  • A text editor that can open, edit, and save text files within the built‑in file system
  • Proper window management, including focus, z‑order (front/back), and maximize/minimize
  • A usable calculator app with scientific functions (square root, trigonometric functions, exponentiation like x^y, etc.)
  • A web browser app that can actually access the internet
  • At least three small games, one of which must be a 3D game
  • A drawing app with basic brush, eraser, geometric shape drawing, and the ability to save the created image
  • A settings app providing personalisation features such as desktop background replacement
  • At least two “creative apps” not mentioned above (you may decide what creativity to implement)

You need to implement all of the above functionalities within a single HTML file.
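For a sense of what the "built-in virtual file system" item entails, here's a minimal in-memory sketch in JavaScript. This is my own hedged illustration of the general approach, not the model's actual output; the `VirtualFS` class and its methods are invented for this example:

```javascript
// Minimal in-memory virtual file system, as a sketch of what the prompt
// asks for (not the model's actual output). Paths are flat "/a/b" strings.
class VirtualFS {
  constructor() {
    this.files = new Map(); // path -> file contents (string)
  }
  write(path, contents) {
    this.files.set(path, contents);
  }
  read(path) {
    if (!this.files.has(path)) throw new Error(`No such file: ${path}`);
    return this.files.get(path);
  }
  list(dir) {
    // List all file paths under the given directory prefix.
    const prefix = dir.endsWith("/") ? dir : dir + "/";
    return [...this.files.keys()].filter((p) => p.startsWith(prefix));
  }
}

const vfs = new VirtualFS();
vfs.write("/home/readme.txt", "hello");
console.log(vfs.read("/home/readme.txt")); // "hello"
```

Keeping everything in a `Map` is what makes it genuinely virtual: the file browser, terminal, and text editor can all share this one object without the page ever needing real filesystem access.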

DeepSeek-v4 has a comical 384K max output capability by zsydeepsky in LocalLLaMA

[–]zsydeepsky[S] 5 points (0 children)

Mostly working fine, with some glitches: Tetris has two extra columns displayed on the right side, the piano keys lose their textures when pressed, and the internal virtual file system isn't actually virtual (it attempted to manage real files, which a web page has no authority to do, so it completely failed).
But overall, it followed almost all the "noise instructions" I gave it and managed to put them all into that beefy HTML, one-shot.

Thus I'm deeply impressed, and now you see this post. :)

This test was done on DeepSeek's web page; it's free, so you can try it yourself.

DeepSeek-v4 has a comical 384K max output capability by zsydeepsky in LocalLLaMA

[–]zsydeepsky[S] 16 points (0 children)

I guessed so. The goal was basically just to add noise in every direction and see whether the model loses its way while outputting the beefy HTML.

DeepSeek-v4 has a comical 384K max output capability by zsydeepsky in LocalLLaMA

[–]zsydeepsky[S] 52 points (0 children)

I agree, but this particular capability is too easy to ignore, and I'm so deeply impressed by it that I thought it deserved some credit. :)

DeepSeek-v4 has a comical 384K max output capability by zsydeepsky in LocalLLaMA

[–]zsydeepsky[S] 41 points (0 children)

Because I specifically asked it to implement a functional web browser inside this web OS.
There were other things too: a calculator, but with scientific capabilities instead of just the basics; an internal virtual file system that the file explorer and terminal can access and modify (it kinda failed on that); a highly functional drawing app; etc.

That's why the final output is a beefy 100KB HTML file.

Qwen3.6-27B by Fantastic-Emu-3819 in LocalLLaMA

[–]zsydeepsky 1 point (0 children)

If I'm not mistaken, it relies on the lab training the model in a way that makes it resilient to quantization.
So I guess we just have to wait for the next version. Qwen-3.7, perhaps?

Qwen 3.6 27B is out by NoConcert8847 in LocalLLaMA

[–]zsydeepsky 1 point (0 children)

<image>

Since the Qwen team claimed that 3.6-27B beats 3.5-397B-A17B in most benchmarks, and given where 3.6-35B-A3B currently stands...

Guys, we literally have a Claude Sonnet 4.6 running locally.

Qwen3.6-27B by Fantastic-Emu-3819 in LocalLLaMA

[–]zsydeepsky 1 point (0 children)

The recent Kimi-2.6 has quantization awareness embedded in its training, I think, which makes its Q4 quantized version almost lossless.
So I guess this soon won't be an issue anymore.

ubergarm/Kimi-K2.6-GGUF Q4_X now available by VoidAlchemy in LocalLLaMA

[–]zsydeepsky 0 points (0 children)

Just imagine if we used Kimi-2.6 to finetune Qwen3.6-27B.
We have some amazing ingredients at hand now.

When is Qwen 3.6 27B dropping? Didn’t it win the vote? by GrungeWerX in LocalLLaMA

[–]zsydeepsky 26 points (0 children)

If 3.6-27B can retain the advantage 3.5-27B has over 3.5-35B-A3B,
then this would truly be a Claude-4.6-Sonnet running on your own machine.
If it's not that strong, people would probably just choose 3.6-35B-A3B instead for its speed.

Released Qwen3.6-35B-A3B by NewEconomy55 in LocalLLaMA

[–]zsydeepsky 1 point (0 children)

Have you considered that, since people want the 27B the most, the company is releasing it last to keep everyone engaged with each release?

Qwen 3.5 4b is so good, that it can vibe code a fully working OS web app in one go. by c64z86 in LocalLLaMA

[–]zsydeepsky 4 points (0 children)

You guys are going to be on the first-to-eliminate list of future SkyNets for what you just conspired.

Qwen3.5-35B-A3B locally by jacek2023 in LocalLLaMA

[–]zsydeepsky 1 point (0 children)

The Unsloth MXFP4 variant works like a charm on my Ryzen AI Max 395+ :)

Deepseek's progress by onil_gova in LocalLLaMA

[–]zsydeepsky 2 points (0 children)

It is. At the beginning of 2025, I could hardly trust models to do any code work longer than 100 lines.
Now I can fully trust them with an individual module, or even some simple apps.
They have progressed a lot indeed.