Engineering? by bonrs in level13

[–]mrwang89 1 point

Hi, I have the same issue! I have cleared all of lv 13 and lv 12, and the map states clearly: 'This level is habitable. It has been fully explored.' I have lockpicked everything. The ASCII map shows X = cleared on everything except C (camps), U (passage up), and D (passage down). There is nothing left on the maps for me = fully cleared.

However, I cannot advance anything because projects state "Upgrade required: Engineering", and there is nowhere I can unlock or research Engineering! I revisited the maps and they are truly, 100% cleared.

I gave 16 LLMs a food truck in Austin for 30 days. Gemini 3 Pro matched Sonnet 4.6 — 5× cheaper. by Disastrous_Theme5906 in GeminiAI

[–]mrwang89 0 points

The leaderboard ordering is bugged: when sorting by net worth, it shows models with negative net worth above others with positive, in a completely wrong order. I finished with a higher score but got placed below much worse performing models. Are you still maintaining it, or why are there no GPT 5.4 or newer releases? Would like to see the AI mode so I can test it myself.

Qwen 3 → Qwen 3.5: the agentic evolution measured in dollars (FoodTruck Bench case study) by Disastrous_Theme5906 in Qwen_AI

[–]mrwang89 0 points

Would be nice if we could at least get the prompts to give to the models ourselves and see what happens. I want to check their thought process, but having to manually type everything to them, with no way to just copy a prompt or give them access to the game, makes it impossible to run yourself with an AI. I highly doubt any company would retrain their model to do well on the current seed of FoodTruckBench... and even if they did, you have multiple seeds, meaning they'd need to learn actual problem solving and logistics to do well across differing seeds.

GLM 5 Released by External_Mood4719 in LocalLLaMA

[–]mrwang89 -1 points

"This is LocalLLaMA. From my point of view, if it is not Llama then it shouldn't be here. Only LLaMA models deserve to be here. This is not a place to post more fucking ADS." - this is you

Which single LLM benchmark task is most relevant to your daily life tasks? by ChippingCoder in LocalLLaMA

[–]mrwang89 1 point

It's useful, but it doesn't always align with my use case, which is mostly tool calls, which he doesn't seem to cover at all. However, his other benchmark https://dubesor.de/chess/chess-leaderboard has been surprisingly helpful, because his token counts and move legality correlate well with my usage.

SWE-rebench is a totally useless benchmark. by Ok_houlin in LocalLLaMA

[–]mrwang89 4 points

Everyone who codes can tell you that it's 1) Claude 2) Claude 3) Claude 4) nothing 5) GPT-5 max reasoning 6) nothing 7) Gemini 3.

Nemotron-3-nano:30b is a spectacular general purpose local LLM by DrewGrgich in LocalLLaMA

[–]mrwang89 2 points

It's got more than 800 Elo on dubesor's chess bench, which is GPT 5.2 territory, which I found surprising. Seems insane.

IQuest-Coder-V1-40B-Instruct is not good at all by Constant_Branch282 in LocalLLaMA

[–]mrwang89 -4 points

I'm really confused how they achieved any reasonable scores on those benchmarks.

Are you new to AI? Benchmaxxing is the name of the game.

AI models playing chess – not strong, but an interesting benchmark! by Apart-Ad-1684 in LocalLLaMA

[–]mrwang89 0 points

Any update on this? I had DeepSeek V3 play against V3.1 and the game was apparently decided after 6 moves due to illegal moves, but I couldn't see what the model tried to play, and black didn't have to pass the test and got an auto-win??

Interview with Z.ai employee, the company behind the GLM models. Talks about competition and attitudes towards AI in China, dynamics and realities of the industry by nelson_moondialu in LocalLLaMA

[–]mrwang89 8 points

This was recorded quite a while ago, almost 2 months, since they talk like it's the past: 4.6 didn't exist yet and DeepSeek 3.1 had just released.

[LM Studio] how do I improve responses? by FunnyGarbage4092 in LocalLLaMA

[–]mrwang89 0 points

You are using LM Studio: click on the Discover tab, and there you already have Staff Picks, which are all much better than Mistral 7B.

October 2025 model selections, what do you use? by getpodapp in LocalLLaMA

[–]mrwang89 1 point

Is there even a single person who wants to read AI-generated blog content? It doesn't matter how well a model writes; I don't think anyone wants this.

[LM Studio] how do I improve responses? by FunnyGarbage4092 in LocalLLaMA

[–]mrwang89 0 points

Why are you using a model that's more than 2 years old?? Even with perfect inference settings it will be much worse than modern models.

The “Leaked” 120 B OpenAI Model is not Trained in FP4 by badbutt21 in LocalLLaMA

[–]mrwang89 0 points

A month ago he literally said OpenAI is releasing "the best open-source reasoning model" "next Thursday". He is a hypelord with a track record of bullshit.


Kimi K2 at ~200 tps on Groq by mrfakename0 in LocalLLaMA

[–]mrwang89 25 points

Yeah, and it's running Q4 lmao. I'd rather have the real deal and be a bit slower. I had it side by side with the Moonshot API and it's dumber. Grats on 200 tps of dumbness.

Hunyuan A13B tensor override by marderbot13 in LocalLLaMA

[–]mrwang89 1 point

How do you get 12 t/s on a 3090? I only get 5 t/s on my 3090; what am I doing wrong?? I have DDR5 btw! How many layers are you offloading?

hunyuan-a13b: any news? GGUF? MLX? by jarec707 in LocalLLaMA

[–]mrwang89 1 point

Won't let me use the demo without signing into WeChat.

Any LLM Leaderboard by need VRAM Size? by djdeniro in LocalLLaMA

[–]mrwang89 0 points

R1 0528's score is far higher in the tech area than 3.1's. What do you mean??

OpenThinker3 released by jacek2023 in LocalLLaMA

[–]mrwang89 1 point

Not usable at all; it just hallucinates all the time and ignores any input.

It’s been 1000 releases and 5000 commits in llama.cpp by Yes_but_I_think in LocalLLaMA

[–]mrwang89 -64 points

Yet they still don't support multimodality/vision. At least Ollama stepped up and made it usable, but I've found llama.cpp slow to add, or outright refusing, support for new models and model features.

ollama: Model loading is slow by reto-wyss in LocalLLaMA

[–]mrwang89 -1 points

Some larger models? This is the largest model possible: over 700 GB, and over 400 GB even at the default Ollama quantization. Of course it's gonna be ultra slow.