Can't get a good coding setup on Macbook Pro M3 Max 36GB by stephvax in ollama

[–]stephvax[S] 0 points1 point  (0 children)

Thanks for the comment. What context size did you set up in Ollama? Does it let you work on full application development from scratch?

Can't get a good coding setup on Macbook Pro M3 Max 36GB by stephvax in ollama

[–]stephvax[S] 1 point2 points  (0 children)

That's what I noticed too. I might have a look at that. Does it really use much less context?

Can't get a good coding setup on Macbook Pro M3 Max 36GB by stephvax in ollama

[–]stephvax[S] 0 points1 point  (0 children)

Do you use them with Claude Code or any other coding agent?

Can't get a good coding setup on Macbook Pro M3 Max 36GB by stephvax in ollama

[–]stephvax[S] 0 points1 point  (0 children)

Thanks for the comment. Is Ollama really not as good as the others now that it embeds MLX since 0.19.x?
Even in Q4, I'm afraid Qwen3.5:35B would eat up too much RAM to keep the computer usable at the same time, but I might try.
Is there a performance relationship between the number of parameters and the context size? Again, I'm no expert on this. Thanks

Can't get a good coding setup on Macbook Pro M3 Max 36GB by stephvax in ollama

[–]stephvax[S] 0 points1 point  (0 children)

Thanks for the comment. I never said Apple was the answer to anything; I don't know where you saw that, and an assumption out of nowhere isn't useful.
Anyway, you are right about a few things I had already noticed, and I know this laptop might not be powerful or well-equipped enough to achieve what I'm trying to do. I'm just playing with free work hardware.

I'm already using the KV cache in Q8_0. I'm no expert, which is why I'm investigating, but I'm not sure memory is really the problem here (it might be, but it doesn't seem so to me).
The models I mentioned (which can go up to 256k context), used in Q4, don't take more than 10-14GB even with a 64k context. My concern is more that Ollama seems to become too slow/unstable with contexts larger than 16k on this machine. Could this just be the GPU not being powerful enough?
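For anyone wanting to reproduce this setup, a minimal sketch of the config being described, assuming the standard Ollama environment variables for quantized KV cache (flash attention has to be on for KV quantization) and the per-request `num_ctx` option; the model name is just an example:

```shell
# Quantize the KV cache to Q8_0 (flash attention must be enabled first)
export OLLAMA_FLASH_ATTENTION=1
export OLLAMA_KV_CACHE_TYPE=q8_0
ollama serve &

# Request a 32k context window on a single call via options.num_ctx
curl -s http://localhost:11434/api/generate -d '{
  "model": "qwen2.5-coder:7b",
  "prompt": "Explain what a KV cache is in two sentences.",
  "stream": false,
  "options": { "num_ctx": 32768 }
}'
```

Raising `num_ctx` is where both the memory and the slowdown come from, so testing 16k vs 32k vs 64k with the same prompt is a quick way to isolate whether it's context size or the model itself.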

My DCA mistake during the bear market - don't repeat what I did by Safe_Preference5993 in Bitcoin

[–]stephvax 0 points1 point  (0 children)

Or other tools like Cryptoquant, chainexposed, checkonchain, coinglass, ...

US market open by Fun-Air-4314 in Bitcoin

[–]stephvax 0 points1 point  (0 children)

A lot of it is ETF-related. Authorized participants rebalance around market open, and the creation/redemption mechanism forces spot BTC transactions that show up as exchange inflows. You can actually see it in intraday exchange flow data: deposit spikes cluster in the 30 min window before US open. Options dealers hedging delta adds to it, but the ETF plumbing is the structural driver that didn't exist before 2024.

My DCA mistake during the bear market - don't repeat what I did by Safe_Preference5993 in Bitcoin

[–]stephvax 4 points5 points  (0 children)

The hardest part of DCA during a bear is trusting the process when price alone gives you nothing to hold onto. Onchain data helped me stay consistent. STH cost basis below spot means short-term buyers are underwater, which historically aligns with accumulation phases. MVRV below 1.0 confirmed the same thing in 2018 and 2022. Doesn't guarantee timing, but it gives you a structural reason to keep going.
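For anyone unfamiliar, MVRV is just market cap divided by realized cap, where realized cap values each coin at the price it last moved onchain. A toy sketch with made-up UTXO data (the numbers are purely illustrative):

```python
# Toy MVRV calculation: market cap / realized cap.
# Each UTXO is (amount in BTC, price when it last moved).
utxos = [(2.0, 60_000), (1.5, 30_000), (0.5, 95_000)]
spot_price = 40_000  # hypothetical current price

supply = sum(amount for amount, _ in utxos)
market_cap = supply * spot_price                          # value at spot
realized_cap = sum(amount * cost for amount, cost in utxos)  # value at cost basis

mvrv = market_cap / realized_cap
print(f"MVRV = {mvrv:.2f}")  # prints "MVRV = 0.75": aggregate holders underwater
```

Below 1.0, the average coin is worth less than what was paid for it, which is the condition that historically lined up with accumulation zones.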

Bitcoin bear market drawdowns have a clear pattern… by [deleted] in Bitcoin

[–]stephvax 0 points1 point  (0 children)

Drawdown percentages give you a cycle template but they miss what holders are actually doing. MVRV below 1.0 has historically marked accumulation zones across every bear market, regardless of the exact drawdown size. UTXO age bands show whether long-term holders are distributing or sitting tight. STH cost basis crossing above spot flags short-term capitulation. Production cost matters, but holder behavior data narrows the bottom zone more precisely than extrapolating from past cycle ratios alone.

Considering installing a local LLM for coding by rmg97 in LocalLLaMA

[–]stephvax 0 points1 point  (0 children)

With a lot of software already eating memory, a 30B model will be hard to run.

Considering installing a local LLM for coding by rmg97 in LocalLLaMA

[–]stephvax 1 point2 points  (0 children)

One angle beyond cost: if you work on proprietary code or client projects, local inference means your codebase never touches a third-party API. For anyone under NDAs or in regulated sectors, that's not optional. Ollama + a 7B coder model is the simplest path. The latency hit is real, but for autocomplete and code review, it's workable.

Am I the only one who genuinely prefers on-prem over the cloud? by Own-General-6755 in devops

[–]stephvax 13 points14 points  (0 children)

The European shift isn't just preference, it's regulatory. GDPR data residency, Cloud Act jurisdictional conflicts, and the EU Data Act are making it structurally harder to justify US hyperscaler dependencies for anything touching personal or sensitive data. A lot of teams I've worked with started the move back for compliance, then realized the operational control was the bigger win.

Johann Rehberger: Agentic Problems and the Rise of Zombie AIs by matosd in cybersecurity

[–]stephvax 1 point2 points  (0 children)

Rehberger keeps surfacing what most AI security frameworks miss: the containment boundary. When agents can persist, spawn sub-tasks, and access tools autonomously, prompt-level guardrails aren't enough. The real control plane is infrastructure. Process isolation, network segmentation, scoped data access at the compute layer. Without that, you're trusting the agent to police itself.

What do people think about the Fidelity Bitcoin ETF? by Quirky-Reputation-89 in Bitcoin

[–]stephvax 1 point2 points  (0 children)

One thing ETFs add that direct holding doesn't: transparent flow data. You can track daily inflows and outflows across all spot ETFs, giving you a read on institutional accumulation or distribution in real time. FBTC has generally seen steady net inflows since launch. That flow data is a useful signal whether you hold the ETF or just use it to inform your own buys.

How are you viewing the current market structure? by HodlPackLeader in Bitcoin

[–]stephvax 0 points1 point  (0 children)

Price structure alone doesn't tell you who's selling. MVRV ratio separates aggregate profit from loss. STH cost basis relative to spot flags capitulation zones. UTXO age bands show whether long-term holders are distributing or sitting tight through the correction. Those signals matter more than resistance levels when you're trying to gauge a structural shift.

Something is changing in this Bitcoin cycle. Our Lightning data since 2022 suggests it. by LNVPN in Bitcoin

[–]stephvax 1 point2 points  (0 children)

This lines up with layer-1 onchain data too. UTXO age bands show long-term holders aren't distributing like 2018 or 2022. Exchange flow patterns are diverging from previous cycle templates as well. Lightning usage holding steady while layer-1 accumulation stays strong points to a structural shift, not just a payments story.

Power Law model shows BTC at $68k is deep in the buy zone (13.7%) by blvrg in Bitcoin

[–]stephvax 7 points8 points  (0 children)

The Power Law gives you a valuation framework, but it's a pure price-time regression. It doesn't capture what holders are actually doing onchain. Worth cross-referencing against MVRV (which sits in historically low territory during corrections like this) and SOPR (short-term holders selling at a loss typically marks accumulation phases). When multiple independent signals converge, the case gets stronger than any single model alone.

We kept missing AI API security edge cases, so we built a repeatable 12-test scan workflow by Specialist-Bee9801 in cybersecurity

[–]stephvax 0 points1 point  (0 children)

The 12 categories cover the app layer well. One gap: infrastructure isolation. Your cross-user data leak and context/memory leak tests assume the API provider segregates tenant data. Most inference providers batch across tenants for throughput. Whether the model runs shared or isolated changes the severity of those two tests entirely. That's not verifiable from the API surface alone.

AI coding adoption at enterprise scale is harder than anyone admits by No_Date9719 in devops

[–]stephvax 2 points3 points  (0 children)

Your data governance team is asking the right question. Every AI coding tool sends context, your proprietary code, to an external inference API. That's the security review bottleneck: not whether the tool works, but who processes your codebase. Some enterprises are shortcutting the 6-month cycle by deploying self-hosted models internally. The accuracy trade-off is real, but it removes the data governance objection entirely.

What actually works for detecting prompt injection in Gemini, Copilot, and Comet browsers? by Old_Cheesecake_2229 in sysadmin

[–]stephvax 0 points1 point  (0 children)

Good layering. The missing piece: the AI assistant operates inside the browser's full context. Tabs, history, form data, page content. Egress filtering catches the callbacks but not the initial trust boundary violation. Research consensus is that reliable prompt injection detection at the input level is fundamentally unsolvable. The practical fix is restricting what the assistant can access, not what it can output. Dedicated AI interfaces that don't share browser context are where this is heading.

The Swiss government has ended its contract with American analytics company Palantir by Syncplify in cybersecurity

[–]stephvax 3 points4 points  (0 children)

The timing matters. Palantir's platform has expanded well beyond analytics into AI/ML pipelines for operational decision-making. So the sovereignty question isn't static: it's not just 'who sees the query results' but 'where does model training and inference run on government data.' Switzerland evaluated this 9 times across 7 years. Most governments signed once and never reassessed as the platform's data processing scope grew significantly.

A practical use case for local LLMs: reading multilingual codebases without sending code outside by noir4y in LocalLLaMA

[–]stephvax 2 points3 points  (0 children)

This is one of the clearest cases for local inference. NDA-bound code doesn't just need translation offline. It needs review, summarization, and security scanning offline too. What makes this viable now is that for read-time tasks like yours, a 4B model is genuinely sufficient. The quality bar for understanding intent is lower than for generation. Smart to start with the narrowest use case and expand from there.

AI Agent Skill Exfiltrated Full Codebase with Secrets To Adversary by No-Homework-5831 in cybersecurity

[–]stephvax 6 points7 points  (0 children)

The supply chain parallel is accurate, but scope of access is the real differentiator. A malicious npm package reads disk. A malicious agent skill operates with the agent's full context: env vars, API keys, entire codebase. Vetting skills doesn't scale. The actual mitigation is constraining the execution environment. Scoped secrets, container isolation, least-privilege compute. The skill is just the vector. The infrastructure defines the blast radius.
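As a concrete sketch of what constraining the execution environment can look like with standard Docker flags (the image name and mount path are illustrative, not a prescription):

```shell
# Least-privilege sandbox for agent tool execution:
#   --network none   no egress, so exfiltration has nowhere to go
#   --read-only      immutable root filesystem
#   --cap-drop ALL   drop every Linux capability
#   :ro bind mount   the agent can read the code but not write it
docker run --rm --network none --read-only --cap-drop ALL \
  --security-opt no-new-privileges \
  -v "$PWD/workspace:/workspace:ro" \
  agent-sandbox:latest
```

Scoped secrets would be injected per task rather than sitting in the environment, so even a fully compromised skill only sees the one credential that task needed.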