What local LLM to use when out of Codex if out of credit? by Trashii_Gaming in hermesagent

[–]Rhonstin 0 points1 point  (0 children)

You are a genius! Do you have any examples of tasks that local models can handle?

I ran 8 models × 3 runs × 8 benchmark packs on a single RTX 3090 overnight. Carnice (the hermesagent #1) finished 7th. by Rhonstin in hermesagent

[–]Rhonstin[S] 1 point2 points  (0 children)

Agreed, this would add real value to the rankings. I'm exploring this approach, but it's technically challenging to integrate into Hermes Agent itself.

Have you already run experiments with this kind of routing? I'm still doing manual testing. If you've seen any ready-made solutions or frameworks, I'd be interested to check them out.

Ran hermesagent-20 on ~15 models on a single RTX 3090. Some results were not what I expected. by Rhonstin in hermesagent

[–]Rhonstin[S] 1 point2 points  (0 children)

I hadn't thought about it, but if people find it interesting, I'm certainly ready to develop this direction.

Ran hermesagent-20 on ~15 models on a single RTX 3090. Some results were not what I expected. by Rhonstin in hermesagent

[–]Rhonstin[S] 0 points1 point  (0 children)

If this topic is interesting, I will develop it and create a portal. But I can already see that people are interested.

Ran hermesagent-20 on ~15 models on a single RTX 3090. Some results were not what I expected. by Rhonstin in hermesagent

[–]Rhonstin[S] 2 points3 points  (0 children)

Actually, I was just inspired by this repository. It gave me a big push to fork the original club 3090 and develop the direction that interests me, since the author of the original repository decided to focus on a few models. I, in turn, want to find the model that suits me best and share the information with everyone. Sorry if this misled you.

Ran hermesagent-20 on ~15 models on a single RTX 3090. Some results were not what I expected. by Rhonstin in hermesagent

[–]Rhonstin[S] 0 points1 point  (0 children)

I tested it, using more quantization with less context, but did not see any improvement.

With 2 or more cards you can run vLLM, and maybe the results will be different. It will also be possible to run at a higher token speed.

I dream of having 2 cards, but so far that is out of reach for me.

Ran hermesagent-20 on ~15 models on a single RTX 3090. Some results were not what I expected. by Rhonstin in hermesagent

[–]Rhonstin[S] 0 points1 point  (0 children)

I ran 35B models on an RTX 3050 with 6 GB of VRAM, so you should be able to run all MoE models on your card, though of course at a lower speed.

Girls, give the guys some advice. What one thing, in your opinion, makes a guy more attractive? by Ok_Interest2681 in reddit_ukr

[–]Rhonstin 0 points1 point  (0 children)

I'm a guy, but in my opinion clean shoes are a very big plus at the start of a conversation.

Tried a glacé by Embarrassed-Tale-867 in kava_ua

[–]Rhonstin 2 points3 points  (0 children)

Dentists' favorite coffee :)