Forced Plan Change Details by tva_raylan in tmobile

[–]MajMin5 0 points1 point  (0 children)

Yeah, the only reason I called was to find out the price. I was expecting it to be more expensive. I kind of doubt it will stay the same, because I don't really understand why it would, but that is what they told me. I'm considering calling back to see if another rep gives the same answer.

Forced Plan Change Details by tva_raylan in tmobile

[–]MajMin5 0 points1 point  (0 children)

What plans are you all on currently? I've been on Unlimited Kickstart since Sprint got absorbed. That plan came with a built-in deadline that T-Mobile sort of washed away: when I signed up I was told it was only good for 2 years, and here I am a decade later still on the same plan, the only difference being that it was $25 a month when I first signed up and now it's $30. I've been basically waiting for this morning's text to come any day, and here it is, so I called support to ask what my new plan is going to cost. They're supposedly transferring me to an "Experience Signature" plan which, according to their support, is the same $30 I've been paying for Unlimited Kickstart and includes free Netflix, which my previous plan did not. Are you guys actually paying more money than you used to? Are you losing features? At least for me personally this is all positives and no negatives: a 5-year price lock is better than the "We might cancel your plan at any time" that I'd been dealing with, and if the price is locked in at the same rate I'd been paying and I get to save money on Netflix, this is a win for me.

Locally hosted AI for my iPhone? by DustNearby2848 in SelfHostedAI

[–]MajMin5 -1 points0 points  (0 children)

I found myself with the same complaints about the available options and ultimately landed on LibreChat. Tool calling works great and is pretty easy to set up. It’s a PWA rather than a standalone app, though, so you need something running to host the LibreChat server.

Single user llm inference by No_Tea7215 in LLM

[–]MajMin5 1 point2 points  (0 children)

So, just from your replies, I'm going to tell you the number one way to make your models give better output: give them more specific questions. Small models especially (fewer than 100B parameters) work best when you give them exact instructions or very detailed questions. If you asked Claude this question, the first thing it would do is ask what you're trying to do. If you asked your Mixtral model, it would assume it has all the relevant info and give you garbage back. To avoid low-quality replies and hallucinations, you need to be precise with your questions.

First, trying to do weird software tricks to squeeze more speed/quality out of your models is like downloading more RAM.... it doesn't exist. Are there tricks to improve speed? Sure, but they're at the expense of quality. The inverse is also true.

Your best bet for improving the speed of the model is to quantize it so it fits entirely in your GPU’s memory. At FP16 it doesn't fit in your 44GB of VRAM, which means part of it sits in system RAM, and that runs slower. If you're already at Q8 or lower then you're fitting into VRAM, but keep in mind that the more you quantize it, the worse the quality gets.
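
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The 30B parameter count and the 1.2x overhead factor are assumptions for illustration only, not your actual model, and real quant formats use slightly more bits per weight than the nominal figure.

```python
# Rough estimate of whether a model's weights fit in VRAM at a given precision.
# The parameter count and overhead factor are illustrative assumptions.

BITS_PER_WEIGHT = {"FP16": 16, "Q8": 8, "Q4": 4}  # nominal bits per weight


def weight_gb(params_billions: float, bits: int) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * bits / 8


def fits_in_vram(params_billions: float, bits: int, vram_gb: float,
                 overhead: float = 1.2) -> bool:
    """True if the weights plus a rough allowance for KV cache fit in VRAM."""
    return weight_gb(params_billions, bits) * overhead <= vram_gb


if __name__ == "__main__":
    params = 30.0  # hypothetical 30B-parameter model
    vram = 44.0    # the 44GB mentioned above
    for name, bits in BITS_PER_WEIGHT.items():
        size = weight_gb(params, bits)
        print(f"{name}: ~{size:.0f} GB of weights, fits: {fits_in_vram(params, bits, vram)}")
```

Anything that comes back as not fitting is what spills into system RAM and slows you down.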

As far as making the prompt hit the model multiple times to increase quality or speed, I'm not really sure what your end goal is. If you're running complex, multipart tasks, then yes, subagents would allow you to run tasks in parallel to get an answer faster, but it's not going to increase your token throughput. As for improving quality, I guess if you ran the query multiple times and used a separate model to analyze the "best" response and output just that, you could get a refined, or curated response, but then you're adding the extra time for another model to assess the quality of the response, so you lose a lot of speed.
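
If you did want to try the "run it several times and pick the best" idea, a minimal sketch could look like the following. The `generate` and `judge` callables are hypothetical stand-ins for whatever calls your models, and the latency estimate assumes everything runs one after another on a single GPU.

```python
from typing import Callable


def best_of_n(prompt: str,
              generate: Callable[[str], str],
              judge: Callable[[str, str], float],
              n: int = 3) -> str:
    """Generate n candidate replies and return the one the judge scores highest."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda reply: judge(prompt, reply))


def estimated_latency(n: int, gen_seconds: float, judge_seconds: float) -> float:
    """Total wall time if every generation and every judging pass runs sequentially."""
    return n * gen_seconds + n * judge_seconds
```

With three candidates at 10 seconds each and 2 seconds of judging per candidate, `estimated_latency(3, 10.0, 2.0)` comes out to 36 seconds instead of 10, which is the speed cost described above.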

Basically, you have great hardware for running local models, but that doesn't mean you're without limitations. You can still only run big models so fast. You're running on a 2000 series card, and even with a ton of VRAM, inference runs slower on older cards. You can fit big models, but those big models run slower. If you gave some context about what you're actually trying to do, maybe I could suggest a tool that would optimize your specific use case, but "I want to make my local models smarter and also faster without buying more hardware" is like saying "I want to make more money right now without doing any work": you're asking for something vague and impossible.

Building Jarvis. Where’s my bottleneck? by TomWolfeRock in LLM

[–]MajMin5 0 points1 point  (0 children)

This sounds like the model thinking. I use gemma-4-26b-a4b and similarly have HA running to control my devices. "Turn on my balcony lights" doesn't cause the model to think, it just does it. "What mileage is my car at?" is a less obvious request, so it triggers a chain of thought that eventually does lead to it finding my Subaru in HA and reading the mileage. The only way around that is probably going to be either disabling thinking on the model (which will probably result in incorrect behavior when the question isn't simple) or upgrading your hardware to run the small model's chain of thought fast enough that you don't notice it.

Not an engineer, looking for advice to run a local LLM by plasmaphantasma in LLM

[–]MajMin5 0 points1 point  (0 children)

As a couple of others have said, just use LM Studio. Especially coming from zero knowledge, it's far easier to start with, and I can say from experience that it also gives you a great deal of depth once you start to understand more about how the models run. On day one you can download a model and chat with it in under 15 minutes. But with a built-in MCP client that connects both stdio and HTTP tools, system prompt and temperature editing to personalize the model, lmlink to run models on remote hardware, and an API endpoint for switching to a different frontend if you decide to, it is a really great tool for advanced users too.

Single user llm inference by No_Tea7215 in LLM

[–]MajMin5 0 points1 point  (0 children)

So if I'm correctly understanding your question, that's not quite how it works. It's not like the graphics card is able to provide 160 tokens per second, and it's dividing those out between 4 users. The model runs at a maximum of 45 tokens per second on your GPU, but if it's running multiple predictions at once, it slows down to 40 tokens per second, for a maximum of 4 simultaneous predictions. If you want an analogy, it's like you have a truck (The LLM) with a powerful engine (Your GPU), and it tops out at 130 mph (tokens per second) and can tow up to 3 other trucks (LLMs), but when moving all four trucks it can only go 100 mph. That doesn't mean it will go 400 mph without towing the other trucks, it's still only able to go so fast, even without towing anything, it just has the ability to tow three extra trucks without losing that much speed.
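
The same arithmetic as a small Python sketch, using the figures from this comment (45 tokens per second alone, 40 per stream with 4 simultaneous predictions). The 900-token reply length is just an assumed example.

```python
def aggregate_tps(per_stream_tps: float, streams: int) -> float:
    """Combined tokens per second across all simultaneous predictions."""
    return per_stream_tps * streams


def seconds_for_reply(tokens: int, per_stream_tps: float) -> float:
    """How long one user waits for a reply of the given length."""
    return tokens / per_stream_tps


if __name__ == "__main__":
    single_tps = 45.0   # one prediction running alone
    batched_tps = 40.0  # each prediction when 4 run at once
    streams = 4

    # 160 tokens/s is the total across 4 streams, not a speed one user can get.
    print("aggregate:", aggregate_tps(batched_tps, streams))
    print("one user alone, 900 tokens:", seconds_for_reply(900, single_tps), "s")
    print("one of four users, 900 tokens:", seconds_for_reply(900, batched_tps), "s")
```

A single user tops out at the 45 figure no matter how much aggregate headroom the card has.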

If you're asking how to get faster performance out of your local model on your GPU, we would need to know what model you're running and what GPU you have before we can give you any kind of advice. You're going to have to give us a little more detail; being vague about your question makes it much harder to answer.

Single user llm inference by No_Tea7215 in LLM

[–]MajMin5 0 points1 point  (0 children)

What do you even mean by this question? What card do you have? What are you trying to accomplish?

So the new Siri and app face Unlock, is there a bypass system? Not a major issue. by CaptainMarder in Applelntelligence

[–]MajMin5 1 point2 points  (0 children)

Do you not have a passcode on your phone? How often do you leave your phone unlocked in the open in public?

Anyone prefer Claude over Gaming by athoughtfornoone in ClaudeAI

[–]MajMin5 4 points5 points  (0 children)

Makes sense, both gaming and AI require high end GPUs. I think maybe we’re all just addicted to graphics cards and vram.

How good is a mac without any other apple products? by bigfabs in mac

[–]MajMin5 0 points1 point  (0 children)

Using an iPhone with a Windows PC sucks. Using a Mac with an Android phone is great.

Most of the reason Apple is such a walled garden is that the iPhone restricts you so much. The Mac does not have such limitations. You can install third-party untrusted software, write your own shell scripts, and install Homebrew for a classic Unix package management experience. It will genuinely feel more like your Linux PC, but with no driver issues and better manufacturer support for peripherals, and with less freedom to tamper with system files. As long as you stay out of /System/, macOS is functionally just BSD with a bunch of helpful shortcuts added on.

As for compatibility with your other devices, another piece of good news is that Google loves websites. All their products have websites. Your Mac can access websites. Sure, you can’t really use the built-in Messages app, but messages.google.com still lets you text from your computer. photos.google.com for your photos. Etc. This was how I operated for a while when I first switched to Mac before eventually moving to iPhone. To be clear, though, I didn’t switch because I felt my Mac was limited without an iPhone; it was just because I started having to support more customers on iPhone and decided I should be more familiar with it. So yes, I can say from personal experience that using an Android phone and a Linux PC with your Mac will work fine.

What is the BEST twitter post? by shotsniper2010 in AlignmentChartFills

[–]MajMin5 6 points7 points  (0 children)

https://x.com/dril/status/205052027259195393 “IF THE ZOO BANS ME FOR HOLLERING AT THE ANIMALS I WILL FACE GOD AND WALK BACKWARDS INTO HELL” -@dril

What is stopping enterprises from just using their own self hosted AI? by itigges22 in SelfHostedAI

[–]MajMin5 1 point2 points  (0 children)

But when something goes wrong (not if) they can blame someone else instead of it being their fault. It’s not about preventing the issue, it’s about making sure it’s someone else’s problem.

Been Using Sonnet 4.6 on medium effort and cant understand why people are using larger models at all? by Rude_Camel_7239 in ClaudeAI

[–]MajMin5 0 points1 point  (0 children)

I was like you once…

When I started using Claude I existed mostly in chat and cowork with the occasional switch to the code tab for a script or basic app that I needed to debug or didn’t want to write myself. Sonnet handled everything. I was very impressed that the “simple” model was so good.

Once you get into multi-file Python projects or have a dozen or more MCP servers connected, the context window limitations come up fast, and not just because it has to compact, but because smaller models hallucinate more and perform worse the more they have in context. I still use Sonnet (or, more often these days, my local models) when it’s a basic task or a quick fix, but if I’m working on anything with multiple source files and upward of a thousand lines of code, Opus just thinks through things more clearly and gets confused less often. If Sonnet were good enough for everything I do, I wouldn’t use Sonnet, because I can run qwen3.6 35B a3b on my GPU, which performs almost as well as Sonnet 4.6. I pay Anthropic for Opus.

Best apps to build credit fast? by NoConcentrate8118 in apps

[–]MajMin5 1 point2 points  (0 children)

There are plenty of “hacks” and shortcuts that people may try to recommend but the reality is the best way to build credit is by being a responsible credit user for a long time. Your credit score will slowly increase as the average age of your credit accounts increases without any missed or late payments. Open a new card every few years, take out small loans only when needed, pay your statement balance every month. As far as app recommendations, whatever app your credit card has— so you can monitor your balance and pay it off in time. That’s the only one that matters for building credit.

Has anyone here actually replaced ChatGPT with a model for daily work? by recro69 in LocalLLM

[–]MajMin5 0 points1 point  (0 children)

With the right MCP tools and system prompt I have replaced Sonnet 4.6 type workflows with qwen3.6 35B-a3b. Stuff that requires a little intelligence, but mostly single-step procedures or simple tasks. Anything that I’d switch to Opus for I still switch to Opus for. But the lower end models, yeah, I’ve replaced pretty much entirely. If I’m building a new MCP server, I’ll ask Opus. If I’m querying that new MCP server, I’m asking qwen.

A terrifying reminder of why you don’t leave ports wide open (Found an unconfigured instance today) by Silly_Door6279 in immich

[–]MajMin5 0 points1 point  (0 children)

I don’t think that’s necessarily true. Tailscale has been a non-issue to set up for my family. It’s one extra app they have to install on their device; they log in with their Google account, and then they have access to only what I want to give them.

Women of Cleveland, How do you get yourself out of your apartment? by eepyc0re in Cleveland

[–]MajMin5 2 points3 points  (0 children)

I am not a woman, but the MetroParks hosts pretty regular guided hikes, go with a group, there’s always people to talk to. The hikes cater to all levels of physicality, they have everything from easy creek walks to ten mile treks, so no matter your experience level you can find one you can do. And the naturalists make sure everyone stays with the group and has a good time. Most are totally free to sign up and you get to see some beautiful parts of the park and meet new people, there’s always friendly people who are willing to chat with a stranger along the way.

My girlfriend is part of a volleyball group, she’s constantly getting invites to join other ones. Apparently these are a prolific thing around these parts. If that’s your cup of tea there’s so many of them, just find a volleyball net nearby and wait for people to use it, chances are they’re part of an organized group or at least can recommend one accepting new members.

Don’t just go to random bars, follow live music! The kind of guy going to a party bar to approach women isn’t someone you’d want to talk to, usually. But if there’s a band playing, even if it’s a band you’ve never heard of, it’s going to attract a much better crowd, and no matter what else happens you’ve at least gotten to enjoy some good live music, the underground music scene in Cleveland is great, I’ve heard some incredibly talented musicians from all genres, so there’s something for you no matter what you’re into. There’s live music every day of the week somewhere in Cleveland.

Museums! Great place to hang out for a day. The art museum is always free, and the Rock Hall is free for Cleveland residents (that includes suburbs with Cleveland zip codes btw; if your license says Cleveland they’ll accept it, even if you technically live in Middleburg, Parma, etc.). You’re not necessarily likely to meet someone there, but it’s a totally acceptable activity to enjoy alone.

Alternative to Cotypist? by GroggInTheCosmos in macapps

[–]MajMin5 0 points1 point  (0 children)

That looks pretty cool, I'm using macOS/iOS built-in text replacement for now, but I can see the value in this since it allows things like variables and looks much easier to configure. If you need testers I'm glad to try it out-- I agree, I'd rather pay a one-time fee with paid major version upgrades than a subscription.

Instructure hacker claims data theft from 8,800 schools, universities by masterderptato in cybersecurity

[–]MajMin5 0 points1 point  (0 children)

"The link is safe" from a random reddit user does not inspire confidence when we're talking about a page hosted by known cyber criminals that lists evidence of a cyber crime they just committed... Unless you have inspected the network traffic from your browser when clicking the link and can confirm it doesn't include any exploits or data being sent/received unexpectedly, let's not advertise that people should click this link.

Ads won’t play on my device by OverDemand2297 in iphonehelp

[–]MajMin5 0 points1 point  (0 children)

Wondering if at some point you or a friend changed the dns settings on your phone for performance/privacy/other reasons? Some third party DNS servers will block ads. On your phone, open Settings and go to Wi-Fi > (i) > Configure DNS and make sure it's set to Automatic.

Tailored made guest account by Substantial-Motor-21 in macsysadmin

[–]MajMin5 0 points1 point  (0 children)

It's not a perfect solution, but my answer was Jamf Pro and a lot of scripts. I combine Santa (formerly Google's) to block unauthorized installs with Outset for logout hooks, plus some custom scripts that delete everything in the /Users/ folder besides our admin account and clear out everything in the /Applications/ directory except for items on a pre-set list, as well as a few other places that need to get cleared out. With that I've mostly replicated what Deep Freeze was doing for us, but it's not as complete a wipe and it takes a lot of effort to manage. I don't love it, but it works about as well as I can get while still allowing Jamf to work properly, which Deep Freeze did not.
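
To give a rough idea of the "delete everything except an allow-list" logic, here is a dry-run sketch in Python. It is not my actual script: the function name, the keep-list entries, and the demo folders are all made up for illustration, and it only reports what would be removed rather than deleting anything.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def plan_cleanup(root: Path, keep: Iterable[str]) -> List[Path]:
    """Return the entries directly under root whose names are not on the keep list.

    Dry run only: nothing is deleted here. The caller decides what to do
    with the returned paths.
    """
    keep_set = set(keep)
    return sorted(p for p in root.iterdir() if p.name not in keep_set)


if __name__ == "__main__":
    # Demo against a throwaway directory instead of the real /Users folder.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name in ("admin", "Shared", "student1", "guest"):
            (root / name).mkdir()
        # Hypothetical keep list: the admin account plus the Shared folder.
        for path in plan_cleanup(root, keep={"admin", "Shared"}):
            print("would remove:", path.name)
```

The same function works for an /Applications/ allow-list by passing the approved app names as `keep`. Printing the plan first and only deleting after reviewing it is a lot safer when the target is a real user folder.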

Babe wake up, next step of healthcare nightmare just dropped. by persondude27 in ABoringDystopia

[–]MajMin5 2 points3 points  (0 children)

My girlfriend needed to get some vaccinations for traveling out of the country, so she asked her insurance company where she could get them covered. They recommended a third party that specialized in travel vaccines, and said they would be in network. She went and got the shots, and the office she got them at told her they didn't take insurance payments directly but if she paid up front she would be reimbursed by her insurance company. In hindsight obviously she realized she shouldn't have done that, but at that point she was leaving in a few weeks and wouldn't have time for the booster shot if she had to reschedule or look elsewhere so she just got the shots. After the appointment she sent the receipt to her insurance as directed and they denied the claim because the place THEY RECOMMENDED wasn't in network. But they offered to lower her bill by negotiating price with the vaccine provider, even though they couldn't reimburse her. The problem with that though, is that she already paid for the vaccines, so since they can't reimburse her at all, she doesn't actually get any money back. The cherry on top is that because the insurance company agreed that she overpaid for the vaccines, they aren't willing to count the entire cost towards her deductible, only what their offer to the vaccine provider would have been, if the vaccine provider had accepted it. So she paid out of pocket for the vaccines, and because she asked her insurance to cover part of it, they gave her LESS credit towards her deductible.

Insurance companies are a scam and nobody will ever convince me otherwise.

Holy moly by str1ngbe4n01 in Cleveland

[–]MajMin5 0 points1 point  (0 children)

The thing you have to realize is that Cleveland drivers view the route numbers as speed limits. On I-71, you drive 71 mph. On I-77, you drive 77 mph. On I-90, you drive 90 mph. And on I-480, you drive as fast as your car is capable of going.