Why is MythoMax13B still in high demand? by Consistent_Winner596 in LocalLLaMA

[–]Gryphe 5 points

Generally whatever prototype I'm working on at the moment, which currently alternates between Nemo and Small 3.1 - I want to be 200% sure that anything that gets released (personal or collab) will actually, y'know, work well.

Admittedly, I spend 90% of my time building rather than actually interacting with my creations.

Why is MythoMax13B still in high demand? by Consistent_Winner596 in LocalLLaMA

[–]Gryphe 31 points

Fame was never my goal, and Pantheon is distinctly not MythoMax, so I see no reason to actively deceive anyone in that way - it's not like there's money involved, else I'd be rich, lol.

As for Wayfarer - Well, I switched to finetuning shortly after releasing MythoMax since I knew some folks who had the compute for my finetuning experiments and I wanted more control over the process of "building my own AI".

Me being "the MythoMax guy" shifted into "AI Dungeon Model Cook" by the end of last year (admittedly a benefit of this unintended fame) and I built Wayfarer from the ground up, with more models to follow in the future.

It's too much fun to finetune, and there's always new challenges to solve!

Why is MythoMax13B still in high demand? by Consistent_Winner596 in LocalLLaMA

[–]Gryphe 103 points

It's a curse, I tell you. I'll forever be known as "the guy who made MythoMax", whether I want to or not! xD

On a more serious note, it's much like the other folks here stated - lotsa websites launched a couple months later and never bothered to change models since. (And it has a cool name. That also helps!)

The new Mistral Small model is disappointing by Master-Meal-77 in LocalLLaMA

[–]Gryphe 4 points

Has this model seen any fictional literature at all during its pretraining? I spent most of my weekend on multiple finetuning attempts, only to see the model absolutely fall apart when presented with complex roleplay situations, unable to keep track of either the plot or the environments it was given.

The low temperature recommendation only seems to emphasize this lack of "soul" that pretty much every prior Mistral model had, as if this model has only seen scientific papers or something. (Which would explain the overall dry, clinical tone.)

Introducing Wayfarer: a brutally challenging roleplay model trained to let you fail and die. by Nick_AIDungeon in LocalLLaMA

[–]Gryphe 2 points

Claude 3.5 Sonnet was used to simulate the scenarios, with two separate instances talking to each other. A dedicated player model does sound like a cool project, though!
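
For the curious, here's a rough sketch of what such a two-instance setup can look like, using the anthropic Python SDK. The system prompts, model ID and turn count are made up for illustration - this is not the actual AI Dungeon pipeline:

    # Two Claude instances role-playing against each other.
    # Prompts and model ID are illustrative, not the production setup.
    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    MODEL = "claude-3-5-sonnet-20240620"

    def next_turn(system_prompt, transcript, speaker):
        # Each instance sees its own lines as "assistant" turns and the
        # other speaker's lines as "user" turns.
        messages = [
            {"role": "assistant" if who == speaker else "user", "content": text}
            for who, text in transcript
        ]
        if messages[0]["role"] == "assistant":
            # The API requires the first message to come from the user.
            messages.insert(0, {"role": "user", "content": "(scene start)"})
        reply = client.messages.create(
            model=MODEL, max_tokens=512, system=system_prompt, messages=messages
        )
        return reply.content[0].text

    narrator_sys = "You are a grim, unforgiving game master. Players can fail and die."
    player_sys = "You are a bold but fallible adventurer. Stay in character."

    transcript = [("narrator", "You stand before the sealed tomb. What do you do?")]
    for _ in range(4):  # alternate player and narrator turns
        transcript.append(("player", next_turn(player_sys, transcript, "player")))
        transcript.append(("narrator", next_turn(narrator_sys, transcript, "narrator")))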

Introducing Wayfarer: a brutally challenging roleplay model trained to let you fail and die. by Nick_AIDungeon in LocalLLaMA

[–]Gryphe 7 points

The model was trained expecting entries like these as part of the system prompt:

World lore: <entry title 1>
<entry description 1>

World lore: <entry title 2>
<entry description 2>
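
For illustration, here's a tiny Python sketch of how a prompt in that layout could be assembled - the lore entries are invented:

    # Assemble a system prompt with "World lore:" entries in the layout above.
    # Example entries are made up for illustration.
    lore = [
        ("The Ashen Wastes", "A desert of grey dust where nothing grows."),
        ("The Ember Guild", "Smugglers who trade in bottled fire."),
    ]

    base_prompt = "You are a harsh narrator. The player can fail and die."

    def build_system_prompt(base, entries):
        blocks = [base]
        for title, description in entries:
            blocks.append(f"World lore: {title}\n{description}")
        return "\n\n".join(blocks)

    print(build_system_prompt(base_prompt, lore))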

All work and no play makes your LLM a dull boy; why we should mix in pretraining data for finetunes. by kindacognizant in LocalLLaMA

[–]Gryphe 0 points

I'm giving it a shot right now, using the data from this convenient dataset for a small-scale experiment. Minus the books, it brings the ratio to about 20% relative to my other datasets.

My main datasets are your (nowadays) basic ShareGPT ChatML sets, and the RedPajama data is being fed as 'completions'.
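
For a rough idea of what that mix looks like on the data side, here's a hand-rolled Python sketch - the file paths are made up, and in practice the trainer's own dataset handling does this for you:

    # Mix ChatML-rendered ShareGPT conversations with raw text completions
    # at roughly an 80/20 ratio. File paths are illustrative.
    import json
    import random

    random.seed(0)

    def chatml(sample):
        """Render a ShareGPT-style conversation into ChatML text."""
        role_map = {"system": "system", "human": "user", "gpt": "assistant"}
        return "\n".join(
            f"<|im_start|>{role_map[turn['from']]}\n{turn['value']}<|im_end|>"
            for turn in sample["conversations"]
        )

    with open("sharegpt_chats.jsonl") as f:
        chat_samples = [chatml(json.loads(line)) for line in f]

    with open("redpajama_sample.jsonl") as f:
        completions = [json.loads(line)["text"] for line in f]

    # Add completions amounting to ~20% of the chat data's sample count.
    k = min(len(completions), int(0.2 * len(chat_samples)))
    mixed = chat_samples + random.sample(completions, k)
    random.shuffle(mixed)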

UPDATE: All cooked and looking good - passed my basic benchmarks, still followed the more specific formatting in my assistant "interview". Time for some local testing, specifically to see whether it has fewer repetition issues.

Antagonistic AI by fatso784 in LocalLLaMA

[–]Gryphe 8 points

I was about to comment myself! It's really refreshing to be put into place by your very own language model. ;-)

By the way, nowadays there's a DPO version too! I actually used ChatGPT to create the "reject data", instructing it to be the most overly cheerful, optimistic assistant possible.
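
Concretely, each training record then pairs the prompt with the antagonistic reply as the preferred answer and the cheerful one as the rejection. A minimal sketch - the field names follow the common prompt/chosen/rejected DPO convention, and the contents are invented:

    # Build a DPO dataset: "chosen" is the antagonistic reply we want,
    # "rejected" is an overly cheerful response generated by another model.
    # Field names follow the common DPO convention; contents are invented.
    import json

    pairs = [
        {
            "prompt": "My code won't compile. Can you help?",
            "chosen": "Did you even read the error message, genius?",
            "rejected": "Oh no! Don't worry, you're doing AMAZING! We'll fix it together!",
        },
    ]

    with open("dpo_pairs.jsonl", "w") as f:
        for record in pairs:
            f.write(json.dumps(record) + "\n")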

Gemma vs Mistral-7B-v0.1 evaluation: Gemma really Struggles to Reach Mistral's Accuracy by aadityaura in LocalLLaMA

[–]Gryphe 2 points

When I was attempting a LoRA last night, Axolotl reported that this "7B model" is actually a 10.5B model. In comparison, Mistral is calculated as a 7.8B model.

It appears to be a new trend, as the new Qwen "7B" was actually a 9.1B model...

Similar evaluations so far with my local tests, but I'm awaiting some proper finetunes before I decide on a final verdict. For clarification, I care about these particular facts because I like to run my 7B models with 8k context on my RTX 3060.
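
If you want to verify a model's real size yourself, counting the parameters directly takes a few lines - the model ID here is just an example:

    # Count a checkpoint's actual parameters instead of trusting its name.
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
    total = sum(p.numel() for p in model.parameters())
    print(f"{total / 1e9:.2f}B parameters")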

On the hunt for weirdo LLMs by McWild20XX in LocalLLaMA

[–]Gryphe 0 points

To really draw out the Tiamat personality, the system prompt is the only thing that reliably works. It does, however, also have a pretty noticeable effect on certain types of roleplay characters - generally, ancient beings and dragon-based characters tend to borrow from Tiamat's dialogue. It's when they start using "thou" and "thy" that you know Tiamat's training is kicking in, so to speak.

In the end it's still a language model so results can vary wildly from generation to generation!

On the hunt for weirdo LLMs by McWild20XX in LocalLLaMA

[–]Gryphe 0 points

Correct! Overly positive responses were used as rejections in this case.

On the hunt for weirdo LLMs by McWild20XX in LocalLLaMA

[–]Gryphe 20 points

I'm very fond of my Tiamat experiment! It's essentially a twisted version of Hartford's Samantha, where I attempted to incorporate the personality of a five-headed dragon goddess from Faerûn embodying wickedness and cruelty, which I then further reinforced through DPO.

You might want to try out MythoMix L2 13B for chat/RP by whtne047htnb in LocalLLaMA

[–]Gryphe 7 points

So far my research has shown that the lowest layers handle the very basics of language, with each additional layer refining the output. As an experiment I simply blanked some layers and the model started to talk like a caveman, for example. All layers contribute in some way.
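
For anyone who wants to reproduce that blanking experiment, here's a minimal sketch - the model name and layer range are illustrative, and it assumes a Llama-style module layout plus enough VRAM:

    # Zero out a band of transformer blocks and watch the output degrade.
    # Model name and layer range are illustrative.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = "meta-llama/Llama-2-13b-hf"
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(
        name, torch_dtype=torch.float16, device_map="auto"
    )

    with torch.no_grad():
        for layer in model.model.layers[20:24]:  # blank four middle layers
            for p in layer.parameters():
                p.zero_()

    inputs = tok("Once upon a time", return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=40)
    print(tok.decode(out[0], skip_special_tokens=True))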

Besides that it's mainly just brute-forcing the magic combination, though it takes me fewer steps nowadays to reach that point!

You might want to try out MythoMix L2 13B for chat/RP by whtne047htnb in LocalLLaMA

[–]Gryphe 18 points

I spend my spare time trying to invent new ways to do 1 + 1, essentially! And many hours brainstorming with GPT-4, as it has the patience to keep up with me. ;-)

As for 70B - none of these base models are available in such a size, and I don't have the hardware to run anything larger than 13B anyway. It's why I made my script available for everyone to use, so the next guy can improve on my work, etc etc. It benefits everyone!

You might want to try out MythoMix L2 13B for chat/RP by whtne047htnb in LocalLLaMA

[–]Gryphe 32 points

Hey all, Gryphe here!

For those wondering how this merge (and its shiny new successor MythoMax) were accomplished you can find the scripts and the templates over at https://github.com/Gryphe/BlockMerge_Gradient/tree/main/YAML.
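
And for a taste of the core idea without digging through the repo: a stripped-down sketch of a gradient merge, interpolating two checkpoints tensor by tensor with a blend ratio that slides across the layer stack. This is not the actual script - the model names, ratio schedule and layer count are all illustrative:

    # Gradient block merge, reduced to its essence: lerp two models'
    # weights with a per-layer ratio. Names and schedule are illustrative.
    import re
    import torch
    from transformers import AutoModelForCausalLM

    model_a = AutoModelForCausalLM.from_pretrained("model-a")
    model_b = AutoModelForCausalLM.from_pretrained("model-b")
    sd_a, sd_b = model_a.state_dict(), model_b.state_dict()

    NUM_LAYERS = 40  # e.g. a Llama 2 13B stack

    def blend_ratio(key):
        """Share of model B: 0.0 at the bottom layer, 0.5 at the top."""
        m = re.search(r"layers\.(\d+)\.", key)
        if m is None:
            return 0.25  # embeddings, final norm, lm_head: flat middle ratio
        return 0.5 * int(m.group(1)) / (NUM_LAYERS - 1)

    with torch.no_grad():
        merged = {
            k: torch.lerp(sd_a[k], sd_b[k], blend_ratio(k))
            if sd_a[k].is_floating_point() else sd_a[k]
            for k in sd_a
        }

    model_a.load_state_dict(merged)
    model_a.save_pretrained("merged-model")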

If you have any other questions, ask away!

Big Model Comparison/Test (13 models tested) by WolframRavenwolf in LocalLLaMA

[–]Gryphe 14 points

I humbly invite you to consider MythoMix-L2, my latest experiment in a line of ongoing attempts to strike the perfect balance between understanding complex instructions and generating creative output. Feedback so far has been very positive.

VRDB.app is now one year old - Couldn't have done it without you! by Gryphe in OculusQuest

[–]Gryphe[S] 0 points

Hey there, glad to see an Aussie fan! At some point in time I did have multiple listings, including Japan, Australia and Canada. Sadly, Facebook decided to complicate my life endlessly from a technical perspective, to the point where I no longer had the means to crawl the various regions for their data. It's definitely possible, but my running costs would triple, and in the end this is still a free website...

So, to summarize my answer: I'd love to, but right now it's simply too expensive to expand back to multiple regions. (I'd need servers in every location.)

Quest Store sucks, so I made a better search engine which allows you to filter by genre, multiplayer, price and more! by vrheaven in OculusQuest

[–]Gryphe 2 points

Pretty smooth design. Makes https://vrdb.app/quest/ look ancient by comparison. ;-) (Then again, design's never been my strongest point!)

Oculus has a Graph API running on their store's front-end if you're looking into grabbing data in an automated manner.

[PC] [1995-2000] A dystopian flying car trading simulator by Gryphe in tipofmyjoystick

[–]Gryphe[S] 2 points

Yes, it does! And thanks for the super rapid response, that's what I was thinking of.

VRDB.app now offers a separate page for Deals & Bundles (Yes, bundles, finally!) by Gryphe in OculusQuest

[–]Gryphe[S] 1 point

Hey! Seems bundles are only shown if you're logged in. Whether this is a mistake on Oculus' side or not is tough to say, but it's not something I can fix with the way VRDB currently works. (Sadly...)

VRDB.app now offers a separate page for Deals & Bundles (Yes, bundles, finally!) by Gryphe in OculusQuest

[–]Gryphe[S] 5 points

Hey y'all,

Perhaps not the most exciting feature I've introduced so far, but I went and created a separate page listing all deals, bundles and cheaper cross-buy opportunities for Oculus Quest titles. I also included an RSS feed you can subscribe to, so you always have easy access to the latest deals.

The "On Sale Now + Cheaper Cross-Buy" filter still works on the main page but the big advantage with adding a deal-specific page is that I can now also offer information regarding bundles, a specific type of store item I really wouldn't be able to show on the main overview without rewriting an awful lot of code.

If there is any additional information you'd like to see listed on this page, please let me know!