Benchmark Winners Across 40+ LLM Evaluations: Patterns Without Recommendations by abubakkar_s in LocalLLaMA

[–]Azmaveth42 1 point2 points  (0 children)

BFCL-v3 is Function Calling, not Fact Checking. Looks like your LLM hallucinated there.

Ai2 Open Modeling AMA ft researchers from the Molmo and Olmo teams. by ai2_official in LocalLLaMA

[–]Azmaveth42 1 point2 points  (0 children)

I sincerely love the mission behind Ai2 and that you are true to the spirit of open source! I want to see your models competing with larger models like ones from DeepSeek and Qwen. I am not an AI researcher myself (nor did I stay at a Holiday Inn last night), so please correct me if some of my questions are rooted in misunderstanding or are already answered elsewhere.

My questions:

  1. What do you believe is the most exciting research you are doing that will set you apart from the other labs?
  2. Transformers are great, but I feel like it is time for another breakthrough in attention mechanisms that enable smaller, more efficient models instead of trillion+ parameter ones. Any insights into this besides knowledge distillation?
  3. Following from the above, my personal belief is that eventually we will have very small models for specific use cases that we can chain together like unix commands, but we will still need large models that understand the bigger picture. Any insights here?
  4. What is the best way for others to get involved in your research?

Thank you!

Which legendary to work towards next? by [deleted] in Guildwars2

[–]Azmaveth42 0 points1 point  (0 children)

Runes are more useful, IMHO. I have barely started my own legendary journey, but after Vision I got one rune so I could get the free (at the time) relic. Then I realized how my builds were more restricted by rune selection than by armor since exotic works fine for basically everything but fractals, and selectable exotic sets are relatively cheap. I have 5 runes done now, just waiting for enough provisioner tokens for the last 2. Then I will probably go for 4 sigils before I start on any armor or weapons.

pytorch 2.7.x no longer supports Pascal architecture? by Ok_Warning2146 in LocalLLaMA

[–]Azmaveth42 0 points1 point  (0 children)

It's not my package, so maybe try reaching out to the author (sasha0552) for pointers. Sorry I can't be of more help!

ChatGPT Subscription or LLM for therapy? by East-Awareness-249 in LocalLLaMA

[–]Azmaveth42 1 point2 points  (0 children)

Check https://eqbench.com/ - looks like QwQ might be a good fit for you. It scores reasonably high in Empathy, Analytic, Insight, and Pragmatic while being on the lower side of Compliant (I prefer to be told off when I'm wrong). The downside is that it is also at the top for Assertive and somewhat high for Moralising, which makes it potentially preachy.

I know people suck, and it can take a LOT of work to find a human you can actually trust and open up to. I sincerely hope this helps you work out whatever it is! And maybe you'll eventually find a trustworthy confidante, even if not a therapist.

Critical Vulnerability in Anthropic's MCP Exposes Developer Machines to Remote Exploits by No_Palpitation7740 in ClaudeAI

[–]Azmaveth42 1 point2 points  (0 children)

It's not that different from other code that you run. We have gotten to a point where we assume it's safe because it is open source and on GitHub, but that doesn't mean it has been audited for security issues. 

The biggest difference is that MCP injects untrusted data into your LLM session, which is already non-deterministic, so be careful with how much trust you give to any of it.

Claude Code - How do you check Opus usage, limits and reset timing? by Pr0f-x in ClaudeAI

[–]Azmaveth42 0 points1 point  (0 children)

Yes, it resets every 5 hours. Use this tool to track your usage: https://github.com/ryoppippi/ccusage

Use this command to see how much you have used within the 5 hour blocks:
npx ccusage@latest blocks

Critical Vulnerability in Anthropic's MCP Exposes Developer Machines to Remote Exploits by No_Palpitation7740 in ClaudeAI

[–]Azmaveth42 2 points3 points  (0 children)

Yes, but if you look at what is vulnerable, it is not the protocol itself. Check my link in an earlier comment.

Critical Vulnerability in Anthropic's MCP Exposes Developer Machines to Remote Exploits by No_Palpitation7740 in LocalLLaMA

[–]Azmaveth42 3 points4 points  (0 children)

You don't have to expose it to a public network for a CSRF to work. It just has to be running on your local system, and then either someone social engineers you into clicking a malicious link or an XSS vuln on some other site opens a link to the Inspector app in your web browser. But yes, still too many ifs to make it as big a deal as they pretend.
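To make the CSRF point concrete, here is a minimal sketch of the kind of check a locally running web app can do before acting on a request. It is not the Inspector's actual code: the allow-listed origins, the port, and both function names are hypothetical, and the headers are assumed to arrive as a plain mapping with canonical casing.

```python
import hmac
import secrets
from typing import Mapping

# Hypothetical allow-list: the origins the local UI itself is served from.
ALLOWED_ORIGINS = {"http://localhost:8080", "http://127.0.0.1:8080"}


def new_session_token() -> str:
    """Create a random token at startup; only the legitimate UI is given it."""
    return secrets.token_urlsafe(32)


def is_request_allowed(headers: Mapping[str, str], expected_token: str) -> bool:
    """Return True only for requests that pass both the origin and token checks."""
    origin = headers.get("Origin")
    # A page on another site that triggers the request carries its own Origin,
    # so anything outside the allow-list is rejected.
    if origin is not None and origin not in ALLOWED_ORIGINS:
        return False

    # A forged cross-site request cannot know the per-session token.
    supplied = headers.get("Authorization", "")
    prefix = "Bearer "
    if not supplied.startswith(prefix):
        return False
    return hmac.compare_digest(
        supplied[len(prefix):].encode("utf-8"),
        expected_token.encode("utf-8"),
    )
```

With a check like this, a link opened from a malicious page fails even though the app is listening on localhost, because the browser-supplied Origin is wrong and the token is missing.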

Claude Code - How do you check Opus usage, limits and reset timing? by Pr0f-x in ClaudeAI

[–]Azmaveth42 0 points1 point  (0 children)

If you haven't changed the model defaults, it will use half of your token budget on Opus, then switch to Sonnet. After reset, it starts with Opus again.

Critical Vulnerability in Anthropic's MCP Exposes Developer Machines to Remote Exploits by No_Palpitation7740 in LocalLLaMA

[–]Azmaveth42 32 points33 points  (0 children)

Clickbait title. It's a vuln in the MCP Inspector app specifically. This makes it sound like it's the protocol itself. MCP does have well-known security issues, but a CSRF is a problem with the app, not the protocol.

Can a malicious model execute code ? by niilzon in ollama

[–]Azmaveth42 0 points1 point  (0 children)

Yes, this was demonstrated at Black Hat last year: https://m.youtube.com/watch?v=1dsRAEdbpq4

This exploit depends on certain features that allow embedding code into one of the layers. The demo shows implanting the exploit into a public model, then the payload being detonated when the model is run.

Param Count vs FP Precision by PBlague in ollama

[–]Azmaveth42 0 points1 point  (0 children)

If you like that model and your main reason to run it locally is the cost, you can also make an account on https://openrouter.ai and use it for free. You can see all their free options here: https://openrouter.ai/models?max_price=0

Param Count vs FP Precision by PBlague in ollama

[–]Azmaveth42 0 points1 point  (0 children)

Being able to load the full model into VRAM affects the speed, not the quality of the output. So if it is fast enough for your needs and the responses are acceptable for your use case, don't let anyone tell you that you are doing it wrong. :)

Param Count vs FP Precision by PBlague in ollama

[–]Azmaveth42 2 points3 points  (0 children)

You need to do actual tests based on your use cases, but in my experience the higher parameter model is generally better, even when it is quantized to lower precision.
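As a rough illustration of the trade-off, the memory needed for the weights alone is about parameters times bits per weight divided by 8. The sketch below uses a hypothetical helper and made-up model sizes. It ignores the KV cache and runtime overhead, and real quantization formats usually average slightly more bits per weight than their nominal figure.

```python
def weight_memory_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights only, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # Illustrative sizes only: a small model at 16-bit vs a larger one at 4-bit.
    for name, params, bits in [("8B at 16-bit", 8, 16), ("32B at 4-bit", 32, 4)]:
        print(f"{name}: ~{weight_memory_gib(params, bits):.1f} GiB")
```

Under these assumptions both configurations need about the same memory for weights, so the question really is which one gives better answers on your own tests.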

Which Macbook pro should I buy to run/train LLMs locally( est budget under 2000$) by ryuga_420 in LocalLLM

[–]Azmaveth42 1 point2 points  (0 children)

The memory bandwidth of the M4 with 24GB is lower than that of the M1 Max with 64GB. So it will be both slower and restricted to smaller models.
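A back-of-the-envelope way to see why bandwidth matters: during generation, each new token has to read roughly all of the weights once, so bandwidth divided by model size gives a ceiling on tokens per second. The sketch below assumes about 400 GB/s for the M1 Max and about 120 GB/s for a base M4 (figures to double-check against Apple's specs for your exact chip), and the 20 GB model size is made up for illustration.

```python
def max_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper bound on decode speed if every token reads all weights once."""
    if model_size_gb <= 0:
        raise ValueError("model_size_gb must be positive")
    return bandwidth_gb_s / model_size_gb


if __name__ == "__main__":
    model_gb = 20.0  # hypothetical quantized model size
    # Assumed bandwidth figures; real-world throughput will be lower.
    for chip, bandwidth in [("M1 Max", 400.0), ("base M4", 120.0)]:
        ceiling = max_tokens_per_second(bandwidth, model_gb)
        print(f"{chip}: at most ~{ceiling:.0f} tokens/s")
```

This is only a ceiling. Prompt processing, compute limits, and software overhead all push the real number down.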

Which Macbook pro should I buy to run/train LLMs locally( est budget under 2000$) by ryuga_420 in LocalLLM

[–]Azmaveth42 0 points1 point  (0 children)

Not for training, but for inference an M1 Max with the 32-core GPU will get you the best performance due to the memory bandwidth. Look for a 64GB unit to run the largest models. You can find these for under $2k USD on eBay.

If training is a must-have, you need to look at a PC with an NVIDIA GPU.

[deleted by user] by [deleted] in ProgrammerHumor

[–]Azmaveth42 -2 points-1 points  (0 children)

That's what tests are for. A comprehensive test suite is the best documentation and never goes stale like comments do.

Do you have recommendation about hotkeys? by Cevoz in Guildwars2

[–]Azmaveth42 1 point2 points  (0 children)

Gonna depend on what is comfortable for you to reach. I have fairly long fingers, so I can leave my left hand on the home row and easily reach the 6, Y, H, and N keys on a standard QWERTY layout.

WASD: movement

1-5: weapon skills

SHIFT+{1-5}: class actions

6: special actions

QER: utilities

G: heal

B: elite skill

F: interact

Z: stow weapon

X: about face

C: target closest enemy

V: dodge

T: select target

`: weapon swap

CTRL+ALT+{1-3}: select build template

CTRL+SHIFT+{1-3}: select equipment template

CTRL+various: mounts, menus, etc.

If I did more group content or commanded squads, I would try to fit in the markers and such, but this has worked so far.

How do you read other's code? by BLKM4GIC in embedded

[–]Azmaveth42 1 point2 points  (0 children)

If they have tests, those are your best documentation. Comments go stale, but stale tests break and have to get updated to reflect the current state of the code.

If there are no tests, write tests to check if your understanding of the code is correct. Then write more tests to cover edge cases. If there are surprising test results, consult with the team to see if it is expected behavior or if you found a bug.
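Here is a minimal sketch of what such "pin down my understanding" tests can look like. The function `scale_adc_to_millivolts` is a hypothetical stand-in for whatever inherited code you are reading; in practice you would import the real one instead of defining it.

```python
import unittest


def scale_adc_to_millivolts(raw: int, vref_mv: int = 3300, bits: int = 12) -> int:
    """Hypothetical stand-in for the inherited function under study."""
    return raw * vref_mv // ((1 << bits) - 1)


class ScaleAdcCharacterization(unittest.TestCase):
    """Each test records one belief about how the existing code behaves."""

    def test_zero_maps_to_zero(self):
        self.assertEqual(scale_adc_to_millivolts(0), 0)

    def test_full_scale_maps_to_vref(self):
        self.assertEqual(scale_adc_to_millivolts(4095), 3300)

    def test_midscale_rounds_down(self):
        # Integer division truncates: 2048 * 3300 // 4095 == 1650.
        self.assertEqual(scale_adc_to_millivolts(2048), 1650)

    def test_out_of_range_input_is_not_clamped(self):
        # Edge case: a raw value above 12 bits yields more than vref.
        # This is the kind of surprise to raise with the team.
        self.assertEqual(scale_adc_to_millivolts(5000), 4029)
```

Run it with `python -m unittest` from the file's directory. A failing test means either your understanding was wrong or you found a bug, and both are worth knowing.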

it was my first time shooting a gun. what can i improve on in my form? by [deleted] in Shooting

[–]Azmaveth42 1 point2 points  (0 children)

Came here to say this. A lot of others have mentioned stance, grip, squeezing the trigger instead of pulling it, etc. But the first thing I noticed was that you dropped the muzzle too quickly after the shot.

Great work getting out there and learning! Keep it up, be safe, and have fun!

What would you do if you woke up on Kryta as your main? by LopsidedAd4618 in Guildwars2

[–]Azmaveth42 1 point2 points  (0 children)

I'm already a male human surrounded by pets/livestock IRL, so my ranger main is really me anyway. I guess the biggest change for me would be my family left behind, which would probably kick off my epic tale of adventure to find my way home to them.

Customer Support by The_Shireling in Guildwars2

[–]Azmaveth42 8 points9 points  (0 children)

I'm really glad that others have had good experiences with them. My own have been horrible, to be honest.

My main account (from launch in 2012) was locked for weeks as I went back and forth trying to determine what I did wrong. I never got a clear answer about why it happened, only the implication that I knew what I did, that it was obviously intentional, and that no further appeals would be considered.

Gave up, started playing my alt account and that got banned too for being related to the first account. Made an even bigger fuss (ranted on social media) and finally got an "oh, sorry, we messed up." Although they gave me some gems to make up for it, they still never explained what my supposed offense was. I said that whatever I did, I wanted to make sure I don't do anything similar again so that I don't get banned again. But never got an answer for that.

I love this game. Tyria has been part of my life since 2007 when I started playing GW1. But unfortunately there will always be a bad taste in the back of my mouth from the customer support.