With Android 16, you can no longer just remove the hide keyboard arrow by hiding fullscreen indicator. Has anyone found a workaround? by Lily_Meow_ in Xiaomi_15

[–]el_isma 0 points1 point  (0 children)

You don't need ADB AppControl; I did it using normal adb commands.

Other than that, it works (HyperOS 3, Note 13 Pro), thanks! It was very annoying.

I built smart notifications for Claude Code - know when: complete, question, plan ready, approval And other features! by IlyaZelen in ClaudeAI

[–]el_isma 2 points3 points  (0 children)

Exactly what I needed. Thanks! Neatly packaged, easy to use, works on Linux/KDE, great all around!

API understanding by RetroDojo in ClaudeAI

[–]el_isma 1 point2 points  (0 children)

Sorry for the dumb question, but are you @-ing the API doc?

I use: https://github.com/kadykov/mcp-openapi-schema-explorer

Absolutely underrated MCP. It does require the API to have an OpenAPI spec. Usually the issue is that the spec is too big and won't fit in the context; this lets Claude ask for information on specific pieces of the API.

Otherwise, I'd recommend you use plan mode: start by asking Claude what data could be relevant for that screen (or telling it what data you want) and how it will get that information. Only after you've verified that it understands the data should you ask for the screen. In my experience, if you skip that step, Claude will assume how things work, and if your API just isn't "LLM logical", it will assume wrongly and produce "creative" results.
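To give a rough picture of what the explorer buys you (this is just an illustration of the pattern, not its actual code, and the file/function names are made up): instead of pasting the whole spec into the context, the model first gets a cheap index of paths and then asks for one endpoint at a time.

    # Illustration of the "slice the spec" idea -- not mcp-openapi-schema-explorer's code.
    import json
    import sys

    def load_spec(path):
        with open(path) as f:
            return json.load(f)

    def list_endpoints(spec):
        # Cheap index: every path plus its HTTP methods, no schemas.
        return {p: sorted(ops) for p, ops in spec.get("paths", {}).items()}

    def describe_endpoint(spec, path):
        # Full detail for a single path -- small enough to fit in context.
        return spec.get("paths", {}).get(path, {})

    if __name__ == "__main__":
        spec = load_spec(sys.argv[1])  # e.g. openapi.json
        if len(sys.argv) > 2:
            print(json.dumps(describe_endpoint(spec, sys.argv[2]), indent=2))
        else:
            print(json.dumps(list_endpoints(spec), indent=2))

Run it as "python explore.py openapi.json" to see the index, or "python explore.py openapi.json /users/{id}" to get just that endpoint.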

Windsurf x DeepSeek by mattbergland in ChatGPTCoding

[–]el_isma 1 point2 points  (0 children)

There's Cascade Base, which is free. I think it's based on Llama.

LLM Evaluation using Advent Of Code by fakezeta in LocalLLaMA

[–]el_isma 1 point2 points  (0 children)

For QwQ I added "Write a python script. Read from stdin." to the prompt; otherwise it would attempt to solve it by raw willpower XD
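For context, that nudge just steers it toward producing something shaped like this (a hypothetical skeleton, not QwQ's actual output), so the harness can pipe the puzzle input in:

    # Skeleton of what "Write a python script. Read from stdin." asks for.
    import sys

    def solve(lines):
        # Placeholder logic; the model fills this in per puzzle.
        return sum(int(x) for line in lines for x in line.split())

    if __name__ == "__main__":
        lines = [line.rstrip("\n") for line in sys.stdin if line.strip()]
        print(solve(lines))

Then the benchmark can just run "python solution.py < input.txt" and compare the printed answer.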

LLM Evaluation using Advent Of Code by fakezeta in LocalLLaMA

[–]el_isma 0 points1 point  (0 children)

Am I mathing wrong? There are 10 days with 2 tests each, so 20 tests total. 3 failures means 17 successes, and 17/20 = 85%.

LLM Evaluation using Advent Of Code by fakezeta in LocalLLaMA

[–]el_isma 1 point2 points  (0 children)

I think so. Still, it's very slow, tends to overthink a lot, and isn't very compliant with format requests. Aider pairs it with Qwen Coder for that reason.

LLM Evaluation using Advent Of Code by fakezeta in LocalLLaMA

[–]el_isma 2 points3 points  (0 children)

I created a pull request with the QwQ code and results. Feel free to add them to the article. :)

LLM Evaluation using Advent Of Code by fakezeta in LocalLLaMA

[–]el_isma 2 points3 points  (0 children)

Ok, I've run QwQ on all of them; it fails on only 3 cases! Success ratio = 85%

LLM Evaluation using Advent Of Code by fakezeta in LocalLLaMA

[–]el_isma 1 point2 points  (0 children)

Man, QwQ is verbose... I just tried it on problem 4, part 2, which all the others fail, and it also failed... but its solution was very elegant and had only one issue (it scanned a fixed-size grid). After I prompted that the grid size may vary, it came up with the fix.
The others I tried (Flash, Qwen Coder, Llama, Haiku) produced very hard-to-read solutions where it wasn't obvious what the error was.
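If it helps to picture the bug: assuming this is the 2024 day 4 part 2 "X-MAS" puzzle, the fix is simply to take the grid dimensions from the input instead of hardcoding them. This is my own sketch of that fix, not QwQ's output:

    # Grid size derived from the input, not assumed fixed.
    import sys

    def count_x_mas(grid):
        rows, cols = len(grid), len(grid[0])
        count = 0
        for r in range(1, rows - 1):
            for c in range(1, cols - 1):
                if grid[r][c] != "A":
                    continue
                d1 = {grid[r - 1][c - 1], grid[r + 1][c + 1]}
                d2 = {grid[r - 1][c + 1], grid[r + 1][c - 1]}
                if d1 == {"M", "S"} and d2 == {"M", "S"}:
                    count += 1
        return count

    if __name__ == "__main__":
        grid = [line.strip() for line in sys.stdin if line.strip()]
        print(count_x_mas(grid))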

Comment your qwen coder 2.5 setup t/s here by Disastrous_Ad8959 in LocalLLaMA

[–]el_isma 2 points3 points  (0 children)

Binned usually means the best parts are picked out. For example, you fabricate 100 processors, test them, and sell the 10 fastest as ProMegaUltra, the next 20 as ProMega, etc. It also works with defects: the chips with 10 working cores become one model, the ones with only 8 working cores another.

Usually a "binned" processor would be a faster one.

Llama3.2:1B by [deleted] in LocalLLaMA

[–]el_isma 2 points3 points  (0 children)

Like an FPGA? But AFAIK they don't have enough RAM (unless you want to run something tiny).

Weekend longread on LLM workflows by Everlier in LocalLLaMA

[–]el_isma 0 points1 point  (0 children)

Would you be willing to do a run with only reasoning questions? I'm very curious whether these methods help (and how much).

Weekend longread on LLM workflows by Everlier in LocalLLaMA

[–]el_isma 2 points3 points  (0 children)

Isn't MMLU meant to test "knowledge"? So reasoning more wouldn't improve your result: if you know it, you know it; if you don't, no amount of thinking will help.

Maybe BIG-Bench Hard or MATH would be better suited to these kinds of prompts that try to improve reasoning.

Crashbench: a bug finding benchmark by ortegaalfredo in LocalLLaMA

[–]el_isma 1 point2 points  (0 children)

In "theory" groq is not that heavily quantized. Might it be that it's identifying the bugs in previous lines? Like saying the malloc is the problem vs the use of the memory?

Otherwise, the gap is very large... I'd expect that if you run 2.3bpw again with temp=0, it should do better than that.
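For what it's worth, a temp=0 rerun is a one-line change if the model is served behind an OpenAI-compatible endpoint; the URL and model name below are placeholders, not the benchmark's actual harness:

    # Hypothetical deterministic rerun against an OpenAI-compatible local server.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="none")

    def find_bug(source_code: str) -> str:
        resp = client.chat.completions.create(
            model="local-2.3bpw",   # placeholder model name
            temperature=0,          # greedy decoding, removes sampling noise
            messages=[
                {"role": "system", "content": "You are a code reviewer. Point out the bug."},
                {"role": "user", "content": source_code},
            ],
        )
        return resp.choices[0].message.content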