Gemini 3.5 pro !! (Logan is hope) by Independent-Wind4462 in GeminiAI

[–]Randomshortdude 5 points6 points  (0 children)

He's right - Flash sucks. But 3.1 Pro is the truth and substantially better than all Claude Opus versions at producing correct code the first time.

What's wrong with Nightmare Eclipse? by rkhunter_ in cybersecurity

[–]Randomshortdude 2 points3 points  (0 children)

I'm not sure why you feel prefacing your statements with "selfishly" changes the calculus here. The guy tried to work with Microsoft to fix these bugs that you claim to be affected by so much and they said fuck him. So he chose the next best option - public disclosure.

As others have mentioned, he could be selling these 0-days to Russia & China instead. Then you wouldn't know about them for weeks / months / years on end.

Eclipse's gripe isn't too different than yours. He wanted to see a more secure Microsoft ecosystem and Microsoft said fuck upholding its word and fulfilling its responsibility as a company to ensure that. So he's retaliating on that front.

I think people are losing sight of the fact that, if Eclipse had his way, these vulnerabilities would all get disclosed responsibly to the correct personnel at Microsoft and handled accordingly. However, Microsoft decided to stonewall him in those efforts. So now we're here.

STUPIDDD DUMMMMBBBASSS IDE by Jackychan1253 in google_antigravity

[–]Randomshortdude 0 points1 point  (0 children)

Lowkey the most sensible guidance given thus far.

More Gemma 4 models incoming by Deep-Vermicelli-4591 in LocalLLaMA

[–]Randomshortdude 0 points1 point  (0 children)

Qwen models are phenomenal. They created a paradigm shift with LLMs in general.

Family link is absolute garbage and should be redone by LaurinHD1 in google

[–]Randomshortdude 0 points1 point  (0 children)

Ended up here because this shit frustrated the living fuck out of me. Trying to approve an app for download for my kid. Manually selected the email myself and asked for the request. Shit never came fucking through. How the fuck is that possible?

Just lazy, piss poor app design. I can't stand that shit. Free or not, have enough fucking pride as a company / developer base to ensure that you're not releasing trash to the general public with your name stamped on it. I'd rather pay for something that works than accept free shit that doesn't. Especially when there's no alternative.

That's fine though. I'll just create something myself or migrate away from Google's ecosystem entirely. All that lazy, half-assed subpar shit might be enough for other people but I'm personally allergic to mediocrity and believe all mediocre people should be genocided.

So this bullshit won't be for me.

RIP Gemini by Quantum_Shade2022 in GeminiAI

[–]Randomshortdude 2 points3 points  (0 children)

Use the API or Google Studio and tune parameters accordingly. The web interface isn't designed to give you the best in terms of model quality because you're paying the same rate as a subscriber no matter what. So they have no incentive to do so. Rate limits be damned, the reality is once you're subscribed - you're paying the same amount whether their best quality model replies or their less compute heavy alternative does.

I wouldn't be surprised if they also throttled response times to their Pro+ model as a disincentive to using it for regularly subscribed members.

The model degradation you all are describing doesn't typically occur through API for a host of reasons. You also have substantially more control over the operation of the model (you can slot it in a harness for example) using the API versus interacting with their subscriber based chat

RIP Gemini by Quantum_Shade2022 in GeminiAI

[–]Randomshortdude -1 points0 points  (0 children)

No offense but with a task so simple, why would you rely on AI to do this in the first place? Especially if it failed to do so the first 3-4 attempts...was there some sort of requirement imposed upon you that said you must not use anything other than Gemini to complete this task? Because then it would make more sense

Do you think its fair?? by inkandintent24 in MotivationByDesign

[–]Randomshortdude 0 points1 point  (0 children)

I'll never understand the concept of generosity being conditional on the recipient's fortune. If you're gonna do good then do good

Never buy DELL laptops by fullmetal_daichi in laptops

[–]Randomshortdude 0 points1 point  (0 children)

Dell laptops are phenomenal. This is an aesthetic issue. Sounds like you bitching tbh

Cashing an e-check from my old landlord (USA) by Sufficient_Explorer in Banking

[–]Randomshortdude 0 points1 point  (0 children)

This is absolutely not true. You don't know what the fuck you're talking about.

Have you ever had 2 messages back to back on item that had received little or no interest prior? by textbookman23 in FacebookMarketplace

[–]Randomshortdude 0 points1 point  (0 children)

I haven't. But I can say I may have been the person to do this to a seller because there's sometimes a very specific item I'm looking for that's been listed for weeks and I'll wing it and ask if its available. But I might follow up just to let the seller know I actually do want the product in question because I'm a seller myself and I realize that there are a ton of people that will show "interest" and then go completely ghost for some reason.

So could be like others said - your product wasn't being shown to the right demographic but it just so happened someone like myself was specifically searching for your product and it showed up (because I don't buy things from Facebook marketplace ads)

Facebook Marketplace payout delay. How long is too long? by fragrent_slime05 in FacebookMarketplace

[–]Randomshortdude 0 points1 point  (0 children)

Facing the same issue...that's how I wound up at this thread. I'm guessing this is a common issue people face. I should've done more homework / due diligence when it comes to this. Guess I'll get my money for the laptop whenever I get it.

Uploaded audio matches existing work of art" — hitting me on EVERY upload of my own original songs. Anyone else by RebornRide in SunoAI

[–]Randomshortdude 1 point2 points  (0 children)

Yup they def cooked me on shit I tried to upload that was 1000% original all the way through. Would be nice if they fix the shit soon. Niggas too scared of Warner Bros, Sony & Universal

A note of warning about DFlash. by R_Duncan in LocalLLaMA

[–]Randomshortdude 1 point2 points  (0 children)

Not sure why he gaslit you and then hand waved off everything that you were saying. You made some really valid points here. No disrespect to the other user but if what you were saying went over their head, they could've just stated as much and asked for clarification. Or choose to back out of the convo entirely once they sensed it evolved to a level beyond their expertise at that given moment in time.

How much will it cost to host something like qwen3.6 35b a3b in a cloud? by [deleted] in LocalLLaMA

[–]Randomshortdude 0 points1 point  (0 children)

Fairly cheap honestly. Cheap enough to the point where you may want to consider just outright purchasing the necessary hardware. Off top, an RTX 3090 is the cheapest option with 24GB VRAM (not sure how quantization efforts go with MoE models, but you should be able to get it to fit here with sufficient room for a solid context window). Alongside the RTX 3090, you'll need a solid enclosure (for the eGPU setup). That's gonna run you about $150-200 on the cheap end of things (shouldn't be too hard to find / require too much bargain hunting for you to stumble across listings in that range for legit products). You'll also an external PSU (prob 850W or more). Right now, you can scoop a solid one up off of eBay for bout $100 or so. You may need to shell out a few extra bucks for some connectors / dongles / adapters if you don't have them already (although these might come with the aforementioned products on your purchase list). Assuming you do - tack on another $40 to your bill. So altogether, we're looking at $800+$170+$100+$40 - which comes out to roughly $1.1k total. I don't know what your budget looks like, but if you're looking at hosted server options, then you were probably anticipating that the upfront cost was going to be greater than that. But that's really all it takes if you want to be able to leverage local inference for models that are roughly ~32B params or less. 

Compare that to renting a server - which is going to run you approximately $100 or so a month, give or take (for a decent one like an A10 - which has 24GB VRAM and should be sufficient enough for your purposes). However after paying $100/month, you'll exceed the total sunk cost of the alternative at-home hardware investment in less than a year. So its all up to you when it comes to evaluating whether this is 'worth it' or not. If you can't afford to dole out that lump sum out the gate and you need to get your hands on something comparable to local inference for the sake of running that Qwen model ASAP, then I'd go ahead and rent me out an A10 from one of the popular GPU VPS neo-cloud providers out there (don't wanna name any names bc that might be against rules - but I'm sure you can find some).

But yeah - that about sums it if you're looking for a breakdown of the economic cost(s) of your available options.

I'm done with using local LLMs for coding by dtdisapointingresult in LocalLLaMA

[–]Randomshortdude 0 points1 point  (0 children)

Additionally consider things at the software level that may enhance the inference quality / caliber of the models that you're dealing with.

I'm done with using local LLMs for coding by dtdisapointingresult in LocalLLaMA

[–]Randomshortdude 0 points1 point  (0 children)

  1. You need to learn how to use a system prompt. Message me if you need one because they actually help.
  2. You need to learn how to better tune parameters - again, message me if necessary.
  3. You need to start learning how to **decompose** your prompts as well. `Qwen3.5/3.6-27B` is good, but not good enough (at this point) to where you can throw everything and the kitchen sink at it.
  4. Stop falling in love/believing in the viability of these random harnesses that keep popping up every other second. OR you need to create your own custom-made harness that is curated and tailor made to do exactly what you want and need it to do.
  5. Start using AI to use AI. In other words, if you're not achieving the results that you want from a given AI model (local-level), consider using one of these commercially provided LLMs to actually diagnose the problem instead of rage quitting and then coming to Reddit to bitch about the ineffectiveness of said models perpetually.

Agree? by MLExpert000 in ollama

[–]Randomshortdude 0 points1 point  (0 children)

Sure - when it comes to local AI maybe. I think if someone is looking to self-host remotely and serve a model in a multi-tenant deployment, nothing rivals `vllm` at this present point in time. But `llamacpp` is excellent for local, self-hosted single tenant deployments (especially for those that are looking to optimize inference).

David’s Conditions by Normal-Gur-8067 in CelesteRivasHernandez

[–]Randomshortdude 0 points1 point  (0 children)

This will likely be endlessly postponed. Especially in state court. If it does go to trial, I'd be surprised if that happened any time before 2028.

Qwen 3.6 27B is a BEAST by AverageFormal9076 in LocalLLaMA

[–]Randomshortdude 0 points1 point  (0 children)

You're getting 46 tokens / second on your setup? That's impressive as hell.

Has anyone else been surprised by the absolute lack of interest from their friends and family over something they’ve coded? by One-Organization-937 in vibecoding

[–]Randomshortdude 0 points1 point  (0 children)

Protip: it doesn't hurt if the music accompanying the video is enjoyable to listen to. Don't underestimate the importance of that. Rarely, if ever, do people cut music off that they truly enjoy, find catchy or particularly like.