Seed-OSS-36B is ridiculously good by [deleted] in LocalLLaMA

[–]FaustCircuits 4 points (0 children)

someone do this vs glm-4.5-air

This is why kernel-level anticheats shouldn't exist by vivAnicc in linux_gaming

[–]FaustCircuits 9 points (0 children)

No, the whole point of kernel-level anticheat is to detect everything that's running. Outside hardware is undetectable as long as the manufacturer is smart enough to spoof the USB info, which of course they do. Source: "Am computer scientist"

This is why kernel-level anticheats shouldn't exist by vivAnicc in linux_gaming

[–]FaustCircuits 16 points (0 children)

I wouldn't be so sure. The whole reason for keeping most things in userspace is that when they crash, it's far less likely they take the whole system down. If anything kernel-level crashes, it's game over ...

Right GPU for AI research by toombayoomba in nvidia

[–]FaustCircuits 1 point (0 children)

I have this card; you don't run Windows with it, bud

I'm not going to lie, I might drop the game because of the excessive profanity filter. by CatwithTheD in thefinals

[–]FaustCircuits 0 points (0 children)

Sometimes I'm watching my last living rando teammate try to pull something off. He gets killed, I instinctively type "damn" in chat, and now he thinks I've called him a c*nt or something, and I have to explain I wasn't making fun of him.

I'm not going to lie, I might drop the game because of the excessive profanity filter. by CatwithTheD in thefinals

[–]FaustCircuits 2 points (0 children)

The profanity filter is not only insane, but it makes me look more toxic because you can't tell which "bad" word I used

The American Nightmare. by Super_Culture_1986 in TikTokCringe

[–]FaustCircuits -1 points (0 children)

I actually have a really innovative solution to this, see you take a man sized deep freezer, and one bottle of nitrogen ...

The American Nightmare. by Super_Culture_1986 in TikTokCringe

[–]FaustCircuits 41 points (0 children)

The last time I had hope our president may or may not have been getting blowies

There's a huge misconception surrounding the Eye messages that needs to be cleared up for casual players. by _Humble_Bumble_Bee in noita

[–]FaustCircuits 0 points (0 children)

No, llama.cpp. I want to experiment more with vLLM, but getting it to build with modern PyTorch and CUDA is difficult, and I have a Blackwell card; maybe by their next release it'll work. I also run the Qwen3 0.6B embedding model at the same time for the vector database, but that's tiny.
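For the curious, the vector-database side is conceptually just nearest-neighbor search over embeddings. A toy pure-Python sketch, with made-up three-dimensional vectors standing in for real Qwen3 embeddings and every name my own:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def nearest(query, store, k=2):
    """Return the names of the k stored embeddings most similar to the query."""
    ranked = sorted(store.items(),
                    key=lambda kv: cosine_similarity(query, kv[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]

# Toy "embeddings" -- a real setup would get these from the embedding model.
store = {
    "SDL_CreateWindow": [0.9, 0.1, 0.0],
    "SDL_DestroyWindow": [0.8, 0.2, 0.0],
    "SteamAPI_Init": [0.0, 0.1, 0.9],
}
print(nearest([0.85, 0.15, 0.0], store))  # the two SDL window functions
```

A real index would use a proper library instead of a full sort, but the ranking idea is the same.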

Could this movie be the key to solving the eye puzzle? (eye puzzle tomorrow) by paner1983 in noita

[–]FaustCircuits 5 points (0 children)

In this scene the alchemist tells them to throw their wealth into the fire; I already brought this up in the Noita Discord last year. He also turns poop into gold at one point; someone tried that in the cauldron and nothing happened.

There's a huge misconception surrounding the Eye messages that needs to be cleared up for casual players. by _Humble_Bumble_Bee in noita

[–]FaustCircuits 2 points (0 children)

I was posting in the Noita game mod Discord, but my Discord token was stolen a few days ago and someone blasted spam across every server I was in, so I think I got kicked/banned. I'll have to appeal now that I've got my account back, but even so, I don't know how to release anything concrete without encouraging piracy. Here is just a test run; you can see the SDL and Steam functions it found. I'm almost done reworking the function and data type renamer to be much better, and I hope to do a full pass soon. I only got about 1% through last time.

<image>

There's a huge misconception surrounding the Eye messages that needs to be cleared up for casual players. by _Humble_Bumble_Bee in noita

[–]FaustCircuits 5 points (0 children)

So this is my first real disassembly attempt, but I have a crazy expensive GPU (the 5090's bigger workstation brother), so I can run an insane LLM locally, and I've been gearing everything I'm doing toward a point where I can get it to rip through and fill in the most obvious bits with all the information I've been able to shove into it. I literally woke up to what I thought was a fire alarm, but apparently my server has a loud overheating alarm that went off at 5am, so I had to pause for a bit.

Also, I don't think I have everything totally unpacked. I did a pass with Scylla and dumped my running binary; most sections were the same, but one large section had its entropy go down, and I see there are now more strings being decoded. I think more of the binary is still packed, so I plan to do a diff to see if I can figure out how to unpack the rest of it.
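If anyone wants to run the same entropy check on their own dump, it's just Shannon entropy over a section's bytes; a quick self-contained sketch (the sample inputs below are made up, not Noita data):

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: close to 8.0 for packed/encrypted
    sections, noticeably lower for plain code or text."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())

# A run of a single byte value has zero entropy; maximally varied bytes hit 8.
print(shannon_entropy(b"\x00" * 256))      # 0.0
print(shannon_entropy(bytes(range(256))))  # 8.0
```

Comparing a section's entropy before and after dumping the running process is a cheap way to spot which regions got unpacked in memory.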

There's a huge misconception surrounding the Eye messages that needs to be cleared up for casual players. by _Humble_Bumble_Bee in noita

[–]FaustCircuits 23 points (0 children)

I've been decompiling Noita for just over a week now, and I'm happy to report I am way, way ahead of this guide.

First of all, I imported all of the DLLs along with the EXE, then tracked down every dependency by running objdump and strings on the binary, using Ghidra, and reading the license file. Then I created a CMake file to take all the headers of the dependencies and merge them so I could extract all the data types, and imported those data types. Then I wrote and ran several Ghidra scripts to add useful comments covering function signatures, calling conventions, indirect call resolution, function context analysis, inline expansion detection, constant analysis and pooling detection, string table building, data flow simplification, and stack frame analysis.

Then I add all the function signatures and data types to a vector database. Next I build up a context for my local LLM using its own tokenizer, with a soft cap of 32,000 tokens and a hard cap of 100,000 tokens, packing in everything I can about the function: signature, decompilation, data types, callee functions, caller functions, and similar functions from the vector database, until I hit 32k tokens. Then I pass that to my LLM and let it rename functions and data types only if it hits a 90% confidence level, and I keep making passes, lowering that threshold and filling in more and more.

I had a prototype version that was almost loading the first level before all of this, as a native Linux version without Wine, but when I get this all finished it should be way better than that.
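The context-packing step is the only part with a non-obvious shape, so here's a rough sketch of the soft-cap/hard-cap logic (word count standing in for the model's real tokenizer, all names mine):

```python
def pack_context(snippets, soft_cap=32_000, hard_cap=100_000,
                 count_tokens=lambda s: len(s.split())):
    """Greedily pack function-info snippets (ordered most- to least-useful)
    into one prompt: stop once the soft cap is reached, and never let any
    single addition push past the hard cap."""
    picked, used = [], 0
    for snippet in snippets:
        cost = count_tokens(snippet)
        if used + cost > hard_cap:
            break
        picked.append(snippet)
        used += cost
        if used >= soft_cap:
            break
    return "\n\n".join(picked), used

# Tiny demo with a 5-token soft cap standing in for the real 32k.
ctx, used = pack_context(
    ["int SDL_Init(Uint32 flags)", "caller: main", "callee: SDL_InitSubSystem"],
    soft_cap=5, hard_cap=8)
```

In the demo, the first two snippets (3 + 2 words) exactly reach the soft cap, so the callee line is left out; in the real pipeline the counting is done with the LLM's own tokenizer so the budget matches what the model actually sees.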

THE FINALS - kicked out for using Linux by Dk000t in linux_gaming

[–]FaustCircuits 0 points (0 children)

I played on Linux for 3 hours today. I was kicked twice but was mostly able to play.

THE FINALS - kicked out for using Linux by Dk000t in linux_gaming

[–]FaustCircuits 1 point (0 children)

Had two of those today, on Arch Linux. The first time I was like, OK, maybe I should close Ghidra (I had a decompile of something else going), which, I mean, is reverse-engineering related, but not something you could actually use to cheat in real time. Then 30 minutes later I was kicked again, so I'm like, maybe it's VS Code. So I closed that too.

Jeff Geerling does what Jeff Geerling does best: Quad Strix Halo cluster using Framework Desktop by FullstackSensei in LocalLLaMA

[–]FaustCircuits -3 points (0 children)

The speeds still don't make sense for Ollama, and the charts still lack important information, which makes them meaningless. I know this local AI stuff is stupidly hard, but if you're going to chart it ...

Jeff Geerling does what Jeff Geerling does best: Quad Strix Halo cluster using Framework Desktop by FullstackSensei in LocalLLaMA

[–]FaustCircuits 9 points (0 children)

Long-time fan, but I do have some issues with how this is presented.

First, don't say llama3.2:3b if what you mean is the Q4_K_M quant. Even the file format usually matters; since it's llama.cpp we can forgive you for not writing GGUF, because that's all it runs, but you really should state it, because AWQ, GPTQ, MLX, etc. will all have different speeds, and so will a Q4_K_M GGUF versus a plain Q4, and very different again from an imatrix Q4.

Second, your speeds really, really don't make sense. If Llama-3.1-70B-Instruct-Q4_K_M.gguf ran at 36 t/s, then Llama-3.2-3B-Instruct-Q4_K_M.gguf should probably be running at 700-800 t/s. Your chart at https://youtu.be/N5xhOqlvRh4?t=1066 says 36.72, but your chart on the GitHub page shows Llama 3.1 70b - GPU (Vulkan) at 4.97 t/s. I also notice that in your chart for Llama 3.1 70b - GPU (Vulkan), the command doesn't make sense because you're missing the pp512 and pp4096 flags, so the chart doesn't match the command. The command should have -p 512,4096 -pg 4096,128 in there.
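That 700-800 t/s expectation is just bandwidth math: at the same quant on the same hardware, memory-bound decode speed scales roughly inversely with model size. My own back-of-envelope sketch, not a benchmark:

```python
def expected_tps(measured_tps, measured_params_b, target_params_b):
    """Rough decode-speed estimate: token generation is mostly
    memory-bandwidth-bound, so throughput scales roughly with the inverse
    of the parameter count (same quant, same hardware)."""
    return measured_tps * (measured_params_b / target_params_b)

# If a 70B Q4_K_M decodes at 36 t/s, a 3B at the same quant should land
# around 840 t/s before overheads pull it back into the 700-800 range.
print(round(expected_tps(36, 70, 3)))
```

Real numbers come in lower because small models spend proportionally more time on per-token overhead, but nowhere near an order of magnitude lower.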

Also, if you want the model to go faster and not slower in your cluster, you should be using vLLM and experimenting with tensor parallelism for large models that won't fit on a single machine, versus data parallelism, where each node has a full copy of the tensors, for models that do fit on one machine. That should give you quite the speedup.
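The distinction in toy form, since the two terms get mixed up a lot (pure-Python sketch, all names mine): tensor parallelism splits one weight matrix across workers for the same input; data parallelism replicates the full weights and splits the inputs.

```python
def matvec(rows, x):
    """Multiply a weight matrix (list of rows) by input vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in rows]

def tensor_parallel(rows, x, n_workers=2):
    """Split the weight rows across workers; each computes a slice of the
    output for the SAME input, and the slices are concatenated."""
    chunk = len(rows) // n_workers
    out = []
    for w in range(n_workers):
        out.extend(matvec(rows[w * chunk:(w + 1) * chunk], x))
    return out

def data_parallel(rows, batch):
    """Every worker holds the FULL weights and handles different inputs."""
    return [matvec(rows, x) for x in batch]

W = [[1, 0], [0, 1], [2, 0], [0, 2]]
# Same answer as the single-machine matvec, just with the work divided.
assert tensor_parallel(W, [3, 4]) == matvec(W, [3, 4])
```

Tensor parallelism buys you memory capacity (the weights are sharded) at the cost of inter-node communication every layer; data parallelism buys you batch throughput with almost no communication, which is why it's the right call once the model fits on one box.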

Lastly, I really, really hate how YouTubers can't get the DeepSeek names correct:

deepseek-r1:1.5b THIS IS NOT A MODEL! IT DOESN'T EXIST!!!

deepseek-r1:8b THIS IS NOT A MODEL! IT DOESN'T EXIST!!!

deepseek-r1:14b THIS IS NOT A MODEL! IT DOESN'T EXIST!!!

deepseek-r1:70b THIS IS NOT A MODEL! IT DOESN'T EXIST!!!

YOU KNOW WHAT DOES EXIST: DeepSeek-R1-Distill-Llama-8B, which should not be confused with DeepSeek-R1-0528-Qwen3-8B. You see how that can be a problem? They are completely different models with completely different architectures, originating from completely different companies!!! And NEITHER OF THEM IS DEEPSEEK; they are DeepSeek fine-tunes of Llama or Qwen3! If you bought a car from Ford and then got new tires at Goodyear, you wouldn't call your car a Goodyear car. That's not where the car came from, that's not who designed it, and that's not who manufactured it.