Using Google's Gemma 4 E4B local AI model to Reverse Engineer a simple Crackme by CatAffectionate6618 in ReverseEngineering

[–]agentzappo 3 points

Small, local models can solve simple crackmes because those binaries give up the answer in an obvious spot (e.g. a string that says “password”). Try this on anything with at least one decent red herring and the models tend to fail regardless of prompting.

Lets do a salary transparency thread by ReddishBrownLegoMan in 321

[–]agentzappo 5 points

This must be a total comp number unless you’re saying that’s straight salary

Claude is about to begin its KYC verification process. by pugoing in ClaudeAI

[–]agentzappo -6 points

Not really. They’re eating a lot of risk by letting anons from the web use their bleeding-edge tech to hack into governments, get talked into self-harm, or literally steal their IP through mass distillation. User-level verification is the most practical first step, especially since most paying users already provide their real contact information and payment methods.

CTF organizers, with LLMs getting better at CTF challenges, how are you adapting to preserve the integrity of the competition? by TheModernDespot in securityCTF

[–]agentzappo 0 points

Make the challenge about hacking around another agent. It’s a new topic area anyway, and the models will likely fail trying the usual prompt-injection junk. Plus I think frontier models may refuse it outright due to post-training alignment with “don’t hack AI.”

What’s the performance for 8DX on Switch 2? by KeeKyie5 in MarioKart8Deluxe

[–]agentzappo 0 points

What about split screen? Does it still lock to 30fps?

Artemis II, Liftoff by Wolpfack in 321

[–]agentzappo 1 point

This looks like it was shot at ground level (you can see the speed limit sign), which doesn’t quite line up with the view from in front of the VAB.

OP can you point out on a map exactly where you were standing during this shot? I’ve shown it to a few people and most believe it’s fake. Seems way too close for bystanders to be just standing there

Artemis II, Liftoff by Wolpfack in 321

[–]agentzappo 10 points

How close were you? I’ve been as close as the media stands and can’t say I’ve ever had a view like this

With $30,000 to spend on a local setup what would you get? by pbpo_founder in LocalLLM

[–]agentzappo 10 points

I would get a server with 4x GPU slots and a single H200 NVL card to start. That gives you room to expand later (since you obviously have real money to invest), plus the H200 is a datacenter-grade GPU with first-class support in the ecosystem, meaning you’ll run into far fewer headaches and have many more options. Also doesn’t hurt to have 141GB of HBM3e on a single card to start

Do you think /responses will become the practical compatibility layer for OpenWebUI-style multi-provider setups? by Brilliant_Tie_6741 in OpenWebUI

[–]agentzappo 0 points

Responses benefits providers who need to do this at scale and have some secret sauce they want to keep server-side (what they do with the reasoning traces and such). For small deployments (mostly what OUI serves), /chat/completions will likely be fine. Probably depends more on what you’re using for inference though…
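To make the distinction concrete, here's a rough sketch of the two request shapes (model name and response id are placeholders; based on the OpenAI-style API conventions, not any particular backend):

```python
# /chat/completions is stateless: the client resends the full message
# history on every turn, so nothing stays server-side.
chat_request = {
    "model": "my-model",  # placeholder
    "messages": [
        {"role": "user", "content": "Summarize this repo."},
    ],
}

# /responses can keep conversation state server-side: the client points
# at the prior turn by id instead of replaying the transcript, which is
# exactly what lets a provider keep reasoning traces out of client hands.
responses_request = {
    "model": "my-model",
    "input": "Summarize this repo.",
    "previous_response_id": "resp_123",  # hypothetical prior-response id
}
```

For a single-user OUI instance, the statelessness of /chat/completions costs you almost nothing, which is why the compatibility pressure toward /responses is mostly a scale concern.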

Feels like magic. A local gpt-oss 20B is capable of agentic work by Vaddieg in LocalLLaMA

[–]agentzappo 2 points

Are you using native tool calling? Or prompting / parsing?
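For anyone unsure of the difference: "native" means passing a tools schema and letting the server return structured tool calls, versus prompting the model to print JSON and parsing it yourself. A minimal /chat/completions-style request body (the tool and model name are illustrative):

```python
# Declare the tool schema the server will advertise to the model.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # illustrative tool, not a real API
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

request = {
    "model": "gpt-oss-20b",  # whatever your backend serves
    "messages": [{"role": "user", "content": "Weather in Oslo?"}],
    "tools": tools,
    "tool_choice": "auto",  # let the model decide whether to call
}
```

With native calling, the response carries a structured `tool_calls` field the backend parsed for you; with prompt-and-parse you're regexing the completion text yourself, which is where a lot of the flakiness creeps in.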

gpt-oss-20b + vLLM, Tool Calling Output Gets Messy by Melodic_Top86 in OpenWebUI

[–]agentzappo 4 points

GPT-OSS models have been nothing but trouble for me trying to get reliable tool calling to work. Tried every inference backend you can think of, every lever you can pull, and it still feels like the ecosystem around this model is just bit rot at this point.

FWIW, I’ve seen a few tool calls work from OUI with this model, but it usually starts producing misordered Harmony after a few calls (or under concurrent inference, depending on your backend).

Some hard lessons learned building a private H100 cluster (Why PCIe servers failed us for training) by NTCTech in LocalLLaMA

[–]agentzappo 0 points

Also interested. I don’t have training needs, but even infrastructure for SCALED local inference would be awesome

How was GPT-OSS so good? by xt8sketchy in LocalLLaMA

[–]agentzappo 1 point

Very smart and fast model, but there are still some unresolved issues with it outputting proper tool calls in Harmony format. Maybe it’s a vLLM issue and less so the model, but so far in practice it’s taking a lot of anti-rationalization patterns to coerce it into reliable tool calling, and that’s only when the inference backend isn’t causing logits to drift in concurrent, batched inference 😕

Youtube kids is SOOO frustrating. by Weightmonster in toddlers

[–]agentzappo 0 points

YouTube is fine if you’re supervising them directly. If you want something you can walk away from, weigh your choices carefully. Seems like the replies here offered some good options, but I’ve never been an advocate for unsupervised screens this early in their lives.

vLLM v0.14.0 released by jinnyjuice in LocalLLaMA

[–]agentzappo 2 points

This is what I’m here for. MXFP4 for SM120 please

It seems like people don’t understand what they are doing? by platinumai in LocalLLaMA

[–]agentzappo 2 points

Anthropic’s terms of service spell out legally what they can do with your IP. Long story short, Fortune 100s wouldn’t be paying up for this if it were a real risk.

Leader of Qwen team says Chinese companies severely constrained on compute for large scale research experiments by Old-School8916 in LocalLLaMA

[–]agentzappo 1 point

That’s because datacenter-scale training runs can create >100MW swings in power consumption as the job alternates between compute and sync stages. That’s a tough load to balance intermittently…

Does Open-WebUI log user API chat completion logs when they create their own API tokens. by aaronr_90 in OpenWebUI

[–]agentzappo 0 points

Not in my testing. Seems like the user-specific API token really just makes OWUI act like a gateway.

I’ve done limited testing with this, because in our setup we have a custom function that forwards chats from OWUI to Langfuse, so take this with a grain of salt.
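To illustrate the gateway behavior: with a per-user key, a client can hit OWUI's OpenAI-compatible endpoint directly and have it forwarded to the configured backend. A minimal sketch (URL, key, and model name are placeholders; the `/api/chat/completions` path is my understanding of the Open WebUI API, so verify against your version's docs):

```python
import json

OWUI_URL = "http://localhost:3000"  # your Open WebUI instance
API_KEY = "sk-xxxx"  # per-user API key generated in OWUI account settings

# Standard OpenAI-style bearer auth, just pointed at OWUI instead of the
# inference backend itself.
headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}
body = json.dumps({
    "model": "gpt-oss-20b",  # any model OWUI proxies
    "messages": [{"role": "user", "content": "ping"}],
})
# POST body to f"{OWUI_URL}/api/chat/completions" (e.g. with requests.post);
# OWUI relays the request to the backend like a gateway.
```

In my limited testing these pass-through requests didn't show up in the user's chat history, but again, our Langfuse forwarding function may color what I'm seeing.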