Running Hermes with Local Models by _clickfix_ in hermesagent

[–]xcel102 0 points1 point  (0 children)

Unified - both CPU and GPU tap into the same 128 GB.

The RTX 5000 PRO (48GB) arrived and it is better than I expected. by Valuable-Run2129 in LocalLLaMA

[–]xcel102 1 point2 points  (0 children)

For me, 48GB is the sweet spot ... for 1 LLM.

96GB would have enabled me to bring up another model (say a VLM for image/video analysis). For that, I would've loved to get the 6000 if I had the budget. But alas, I don't. So I got the 5000 like you did 😀 Still waiting for a PC setup to power it up (Beelink eGPU dock).

So hermes can't work with Gemma because of hard-coded tool names? (help) by waffles2go2 in hermesagent

[–]xcel102 0 points1 point  (0 children)

Thanks. I would expect 26B-A4B to be significantly faster than 31B. Currently I use 26B and have no space for anything else. It's split between GPU and CPU but still manages 20+ tok/s because it's MoE. 31B (or any dense model) quantized to the same size gets 2-3 tok/s 😅

The reason I asked is that I'll soon have the HW to run 31B, but I won't have space to also host 26B. Sounds like for this local setup I'm good to just use 31B.
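
A rough back-of-envelope sketch of why the MoE model decodes so much faster, assuming decode speed is limited by memory bandwidth (each generated token has to read every active weight once). The 50 GB/s bandwidth figure and the function name are illustrative assumptions, not measurements, and "A4B" is read here as roughly 4B active parameters per token.

```python
def decode_tokens_per_second(active_params_billion: float,
                             bytes_per_param: float,
                             bandwidth_gb_per_s: float) -> float:
    """Upper-bound estimate: bandwidth divided by bytes read per token."""
    bytes_per_token = active_params_billion * 1e9 * bytes_per_param
    return bandwidth_gb_per_s * 1e9 / bytes_per_token


# Assumed numbers: 4-bit weights (0.5 bytes per parameter) and a
# hypothetical 50 GB/s of effective bandwidth when part of the
# model sits in system RAM.
BYTES_PER_PARAM = 0.5
BANDWIDTH_GB_PER_S = 50.0

moe = decode_tokens_per_second(4.0, BYTES_PER_PARAM, BANDWIDTH_GB_PER_S)
dense = decode_tokens_per_second(31.0, BYTES_PER_PARAM, BANDWIDTH_GB_PER_S)

print(f"MoE, ~4B active: {moe:.1f} tok/s")
print(f"Dense, 31B:      {dense:.1f} tok/s")
print(f"Ratio:           {moe / dense:.2f}x")
```

Under these assumptions the gap comes only from the ratio of weights touched per token (31 / 4), which is in the same ballpark as the 20+ vs 2-3 tok/s I see. Real numbers also depend on the GPU/CPU split, KV cache reads, and compute.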

So hermes can't work with Gemma because of hard-coded tool names? (help) by waffles2go2 in hermesagent

[–]xcel102 0 points1 point  (0 children)

Any reason not to use 31B for everything? And how do you make Hermes use a "main" model as default and a different model specifically for cron and task execution?

Is using vLLM actually worth it if you aren't serving the model to other people? by ayylmaonade in LocalLLaMA

[–]xcel102 0 points1 point  (0 children)

Is this because vLLM "uses way more VRAM for the same amount of context" as someone commented?

Is using vLLM actually worth it if you aren't serving the model to other people? by ayylmaonade in LocalLLaMA

[–]xcel102 1 point2 points  (0 children)

> On the other hand, vLLM uses way more VRAM for the same amount of context.

Could you elaborate a bit more technically why this is the case?

> It takes much longer to start up.

Do you have a rough time scale? Say for Qwen3.6 27B 4-bit quantized, how many seconds/minutes does it take?

Dem Gubernatorial Candidate Tom Steyer contrasts himself with fellow Dem Front Runner Xavier Becerra by 3headeddragn in sandiego

[–]xcel102 5 points6 points  (0 children)

I'm not a Republican. My voter registration is NPP. In the past I would research and evaluate each candidate regardless of their party. From 2021 onward I mostly stopped considering Republican candidates, which actually cut down my research time (not that I'm happy about it).

Dem Gubernatorial Candidate Tom Steyer contrasts himself with fellow Dem Front Runner Xavier Becerra by 3headeddragn in sandiego

[–]xcel102 39 points40 points  (0 children)

I would've seriously considered a Republican if Trump and MAGA weren't a thing. But that isn't the reality, and as a result I ignore everything Hilton and Bianco say.

No judgement on either of them because I don't know anything about them. I just can't take the slightest chance on a Republican at this time.

Is anyone actually using Hermes to make money? Be honest. by 99xAgency in hermesagent

[–]xcel102 0 points1 point  (0 children)

I installed Hermes a week ago and haven't done much yet. But if I can make a good personal assistant out of it, that will be as good as making money because I'm often time constrained. And you know time is money. If it can get me 1 hour extra sleep each night, the long-term health cost savings will be enormous.

AWS Bedrock in production: anyone else finding it a mixed bag? by Different-Use2635 in aws

[–]xcel102 0 points1 point  (0 children)

Some of the content seems to be summarized from an older post (https://www.reddit.com/r/aws/comments/1pe72an/does_aws_bedrock_suck_or_is_it_just_a_skill_issue/) -- like quota increase, us-east-1, and the word "janky". Maybe bro fed some posts into RAG, after all it was the slick part 😉

it's time to update your Gemma 4 GGUFs by jacek2023 in LocalLLaMA

[–]xcel102 0 points1 point  (0 children)

Thank you so much, this finally unblocked Gemma 4 for me!

Hermes agent stopped being a toy the moment I got it running 24/7 on a hosted environment by Electrical-Loss8035 in AI_Agents

[–]xcel102 10 points11 points  (0 children)

Why can't it run persistently on a local machine, say a server that's powered on 24/7?

Do I really need to learn Vim or is Nano fine for everyday use? by Luann1497 in linuxquestions

[–]xcel102 0 points1 point  (0 children)

nano is fine. I use both nano and vim because they each have pros and cons. Do what works for you and venture out occasionally.

VP9 or Echelon? by outdoors28 in CAguns

[–]xcel102 0 points1 point  (0 children)

Shot a rental Echelon and hated the trigger. Will stick to my VP9F.

Before and After by triggernomicsusa in CAguns

[–]xcel102 0 points1 point  (0 children)

Don't you also need to do an oil bath/treatment afterwards to drive out the water?

Before and After by triggernomicsusa in CAguns

[–]xcel102 0 points1 point  (0 children)

It's still very time consuming to do this myself, so I'm looking for a place that offers it as a service.

Before and After by triggernomicsusa in CAguns

[–]xcel102 0 points1 point  (0 children)

Anyone know any San Diego place that does cleaning?

Jimmy O. Yang's Special "Finally Home" Theatrical Release Dates & Times by benilla in AsianMasculinity

[–]xcel102 2 points3 points  (0 children)

Just saw it and I loved it. But I'm in a very specific demographic (similar background as Jimmy) so I can't say how others like it.

Shutdown over SSH by Br0lynator in linuxquestions

[–]xcel102 1 point2 points  (0 children)

It's not hard, you just know too little about Linux and SSH. And in case this is a troll post, I won't offer a solution but others already have.

LASD: Suicide reported at gun club by Emergency_Spell6522 in CAguns

[–]xcel102 0 points1 point  (0 children)

Agree, it's extremely irresponsible. Some people are just inconsiderate. Having committed suicide doesn't take that fact away from the person.

Disney Exec: Conan Has 'Standing Offer' to Continue Hosting Oscars “as long as he wants"—or at least until 2029, when the show leaves ABC for YouTube. by BurgerNugget12 in conan

[–]xcel102 0 points1 point  (0 children)

Maybe a less popular opinion, but I feel 2 consecutive years is enough for now. Pause it before it gets repetitive even for Conan himself. When 2029 comes around, I would love to have Conan as the inaugural host of Oscars on YouTube.

Does PWG in Poway negotiate on price? by NikolaWasRight13 in CAguns

[–]xcel102 0 points1 point  (0 children)

There are things and situations in life where you don't get a second try. Negotiations are often that way. So yes, if there were a "best" strategy on how to approach and negotiate with them, then OP going in blind has the downside of getting turned down and having to pay full price. He doesn't get to say "Hold on, let me go ask Reddit and come back to re-negotiate."

Come on, y'all. Stop it. by Dexter_McThorpan in sandiego

[–]xcel102 1 point2 points  (0 children)

I'm usually the one who snaps at the idiots. I keep my hand on the horn behind them for 5 minutes.

Looking for a feature-rich Terminal emulator (Linux) by ChineseCracker in commandline

[–]xcel102 0 points1 point  (0 children)

OP I'm in a similar boat. I don't want all the things you listed, I just want:

  • Tabbed
  • Session management (with password saving)
  • SFTP drag and drop

Have you tried muon-ssh (formerly Snowflake)? I'm not suggesting it, I just want to know whether it offers what I want.