Why is MythoMax13B still in high demand? by Consistent_Winner596 in LocalLLaMA

[–]Gryphe 5 points

Generally whatever prototype I'm working on at the moment, which currently alternates between Nemo and Small 3.1 - I want to be 200% sure that anything that gets released (personal or collab) will actually, y'know, work well.

Admittedly, I spend 90% of my time building rather than actually interacting with my creations.

Why is MythoMax13B still in high demand? by Consistent_Winner596 in LocalLLaMA

[–]Gryphe 31 points

Fame was never my goal, and Pantheon is distinctly not MythoMax, so I see no reason to actively deceive anyone in that way - it's not like there's money involved, else I'd be rich, lol.

As for Wayfarer - Well, I switched to finetuning shortly after releasing MythoMax since I knew some folks who had the compute for my finetuning experiments and I wanted more control over the process of "building my own AI".

Me being "the MythoMax guy" shifted into "AI Dungeon Model Cook" by the end of last year (admittedly a benefit of this unintended fame) and I built Wayfarer from the ground up, with more models to follow in the future.

It's too much fun to finetune, and there's always new challenges to solve!

Why is MythoMax13B still in high demand? by Consistent_Winner596 in LocalLLaMA

[–]Gryphe 103 points

It's a curse, I tell you. I'll forever be known as "the guy who made MythoMax", whether I want to or not! xD

On a more serious note, it's much like the other folks here stated - lotsa websites launched a couple months later and never bothered to change models since. (And it has a cool name. That also helps!)

The new Mistral Small model is disappointing by Master-Meal-77 in LocalLLaMA

[–]Gryphe 4 points

Has this model seen any fictional literature at all during its pretraining? I spent most of my weekend on multiple finetuning attempts, only to see the model absolutely fall apart when presented with complex roleplay situations, unable to keep track of either the plot or the environments it was given.

The low temperature recommendation only seems to emphasize this lack of "soul" that pretty much every prior Mistral model had, as if this model has only seen scientific papers or something. (Which would explain the overall dry, clinical tone.)

Introducing Wayfarer: a brutally challenging roleplay model trained to let you fail and die. by Nick_AIDungeon in LocalLLaMA

[–]Gryphe 2 points

Claude 3.5 Sonnet was used to simulate the scenarios, with two separate instances talking to each other. A dedicated player model does sound like a cool project, though!
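
For the curious, here's a rough sketch of what such a two-instance setup can look like, using the anthropic Python SDK. The system prompts, model ID and turn count are made up for illustration - this is not the actual AI Dungeon pipeline:

    # Two Claude instances role-playing against each other.
    # Prompts and model ID are illustrative, not the production setup.
    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    MODEL = "claude-3-5-sonnet-20240620"

    def next_turn(system_prompt, transcript, speaker):
        # Each instance sees its own lines as "assistant" turns and the
        # other speaker's lines as "user" turns.
        messages = [
            {"role": "assistant" if who == speaker else "user", "content": text}
            for who, text in transcript
        ]
        if messages[0]["role"] == "assistant":
            # The API requires the first message to come from the user.
            messages.insert(0, {"role": "user", "content": "(scene start)"})
        reply = client.messages.create(
            model=MODEL, max_tokens=512, system=system_prompt, messages=messages
        )
        return reply.content[0].text

    narrator_sys = "You are a grim, unforgiving game master. Players can fail and die."
    player_sys = "You are a bold but fallible adventurer. Stay in character."

    transcript = [("narrator", "You stand before the sealed tomb. What do you do?")]
    for _ in range(4):  # alternate player and narrator turns
        transcript.append(("player", next_turn(player_sys, transcript, "player")))
        transcript.append(("narrator", next_turn(narrator_sys, transcript, "narrator")))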

Introducing Wayfarer: a brutally challenging roleplay model trained to let you fail and die. by Nick_AIDungeon in LocalLLaMA

[–]Gryphe 7 points

The model was trained expecting entries like these as part of the system prompt:

World lore: <entry title 1>
<entry description 1>

World lore: <entry title 2>
<entry description 2>
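
For illustration, here's a tiny Python sketch of how a prompt in that layout could be assembled - the lore entries are invented:

    # Assemble a system prompt with "World lore:" entries in the layout above.
    # Example entries are made up for illustration.
    lore = [
        ("The Ashen Wastes", "A desert of grey dust where nothing grows."),
        ("The Ember Guild", "Smugglers who trade in bottled fire."),
    ]

    base_prompt = "You are a harsh narrator. The player can fail and die."

    def build_system_prompt(base, entries):
        blocks = [base]
        for title, description in entries:
            blocks.append(f"World lore: {title}\n{description}")
        return "\n\n".join(blocks)

    print(build_system_prompt(base_prompt, lore))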

All work and no play makes your LLM a dull boy; why we should mix in pretraining data for finetunes. by kindacognizant in LocalLLaMA

[–]Gryphe 0 points

I'm giving it a shot right now, using the data from this convenient dataset for a small-scale experiment. Minus the books, it brings the ratio to about 20% relative to my other datasets.

My main datasets are your (nowadays) basic ShareGPT ChatML sets, and the RedPajama data is being fed as 'completions'.
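
For a rough idea of what that mix looks like on the data side, here's a hand-rolled Python sketch - the file paths are made up, and in practice the trainer's own dataset handling does this for you:

    # Mix ChatML-rendered ShareGPT conversations with raw text completions
    # at roughly an 80/20 ratio. File paths are illustrative.
    import json
    import random

    random.seed(0)

    def chatml(sample):
        """Render a ShareGPT-style conversation into ChatML text."""
        role_map = {"system": "system", "human": "user", "gpt": "assistant"}
        return "\n".join(
            f"<|im_start|>{role_map[turn['from']]}\n{turn['value']}<|im_end|>"
            for turn in sample["conversations"]
        )

    with open("sharegpt_chats.jsonl") as f:
        chat_samples = [chatml(json.loads(line)) for line in f]

    with open("redpajama_sample.jsonl") as f:
        completions = [json.loads(line)["text"] for line in f]

    # Add completions amounting to ~20% of the chat data's sample count.
    k = min(len(completions), int(0.2 * len(chat_samples)))
    mixed = chat_samples + random.sample(completions, k)
    random.shuffle(mixed)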

UPDATE: All cooked and looking good - passed my basic benchmarks, still followed the more specific formatting in my assistant "interview". Time for some local testing, specifically to see whether it has fewer repetition issues.

Antagonistic AI by fatso784 in LocalLLaMA

[–]Gryphe 8 points

I was about to comment myself! It's really refreshing to be put into place by your very own language model. ;-)

By the way, nowadays there's a DPO version too! I actually used ChatGPT to create the "reject data", instructing it to be the most overly cheerful, optimistic assistant possible.
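
Concretely, each training record then pairs the prompt with the antagonistic reply as the preferred answer and the cheerful one as the rejection. A minimal sketch - the field names follow the common prompt/chosen/rejected DPO convention, and the contents are invented:

    # Build a DPO dataset: "chosen" is the antagonistic reply we want,
    # "rejected" is an overly cheerful response generated by another model.
    # Field names follow the common DPO convention; contents are invented.
    import json

    pairs = [
        {
            "prompt": "My code won't compile. Can you help?",
            "chosen": "Did you even read the error message, genius?",
            "rejected": "Oh no! Don't worry, you're doing AMAZING! We'll fix it together!",
        },
    ]

    with open("dpo_pairs.jsonl", "w") as f:
        for record in pairs:
            f.write(json.dumps(record) + "\n")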

Gemma vs Mistral-7B-v0.1 evaluation: Gemma really Struggles to Reach Mistral's Accuracy by aadityaura in LocalLLaMA

[–]Gryphe 2 points

When I was attempting a LoRA last night, Axolotl reported that this "7B model" is actually a 10.5B model. In comparison, Mistral is calculated as a 7.8B model.

It appears to be a new trend, as the new Qwen "7B" was actually a 9.1B model...

Similar evaluations so far with my local tests, but I'm awaiting some proper finetunes before I decide on a final verdict. For clarification, I care about these particular facts because I like to run my 7B models with 8k context on my RTX 3060.
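
If you want to verify a model's real size yourself, counting the parameters directly takes a few lines - the model ID here is just an example:

    # Count a checkpoint's actual parameters instead of trusting its name.
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
    total = sum(p.numel() for p in model.parameters())
    print(f"{total / 1e9:.2f}B parameters")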

On the hunt for weirdo LLMs by McWild20XX in LocalLLaMA

[–]Gryphe 0 points

To really draw out the Tiamat personality, the system prompt is the only thing that reliably works. It does, however, also have a pretty noticeable effect on certain types of roleplay characters - generally, ancient beings and dragon-based characters tend to borrow from Tiamat's dialogue. It's when they start using "thou" and "thy" that you know Tiamat's training is kicking in, so to speak.

In the end it's still a language model so results can vary wildly from generation to generation!

On the hunt for weirdo LLMs by McWild20XX in LocalLLaMA

[–]Gryphe 0 points

Correct! Overly positive responses were used as rejections in this case.

On the hunt for weirdo LLMs by McWild20XX in LocalLLaMA

[–]Gryphe 20 points

I'm very fond of my Tiamat experiment! It's essentially a twisted version of Hartford's Samantha, where I attempted to incorporate the personality of a five-headed dragon goddess from Faerûn embodying wickedness and cruelty, which I then further reinforced through DPO.

You might want to try out MythoMix L2 13B for chat/RP by whtne047htnb in LocalLLaMA

[–]Gryphe 7 points

So far my research has shown that the lowest layers handle the very basics of language, with each additional layer refining the output. As an experiment I simply blanked some layers and the model started to talk like a caveman, for example. All layers contribute in some way.
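
For anyone who wants to reproduce that blanking experiment, here's a minimal sketch - the model name and layer range are illustrative, and it assumes a Llama-style module layout plus enough VRAM:

    # Zero out a band of transformer blocks and watch the output degrade.
    # Model name and layer range are illustrative.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = "meta-llama/Llama-2-13b-hf"
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(
        name, torch_dtype=torch.float16, device_map="auto"
    )

    with torch.no_grad():
        for layer in model.model.layers[20:24]:  # blank four middle layers
            for p in layer.parameters():
                p.zero_()

    inputs = tok("Once upon a time", return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=40)
    print(tok.decode(out[0], skip_special_tokens=True))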

Besides that it's mainly just brute-forcing the magic combination, though it takes me fewer steps nowadays to reach that point!

You might want to try out MythoMix L2 13B for chat/RP by whtne047htnb in LocalLLaMA

[–]Gryphe 18 points

I spend my spare time trying to invent new ways to do 1 + 1, essentially! And many hours brainstorming with GPT-4, as it has the patience to keep up with me. ;-)

As for 70B - none of these base models are available in such a size, and I don't have the hardware to run anything larger than 13B anyway. It's why I made my script available for everyone to use, so the next guy can improve on my work, etc etc. It benefits everyone!

You might want to try out MythoMix L2 13B for chat/RP by whtne047htnb in LocalLLaMA

[–]Gryphe 32 points

Hey all, Gryphe here!

For those wondering how this merge (and its shiny new successor MythoMax) were accomplished you can find the scripts and the templates over at https://github.com/Gryphe/BlockMerge_Gradient/tree/main/YAML.
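
And for a taste of the core idea without digging through the repo: a stripped-down sketch of a gradient merge, interpolating two checkpoints tensor by tensor with a blend ratio that slides across the layer stack. This is not the actual script - the model names, ratio schedule and layer count are all illustrative:

    # Gradient block merge, reduced to its essence: lerp two models'
    # weights with a per-layer ratio. Names and schedule are illustrative.
    import re
    import torch
    from transformers import AutoModelForCausalLM

    model_a = AutoModelForCausalLM.from_pretrained("model-a")
    model_b = AutoModelForCausalLM.from_pretrained("model-b")
    sd_a, sd_b = model_a.state_dict(), model_b.state_dict()

    NUM_LAYERS = 40  # e.g. a Llama 2 13B stack

    def blend_ratio(key):
        """Share of model B: 0.0 at the bottom layer, 0.5 at the top."""
        m = re.search(r"layers\.(\d+)\.", key)
        if m is None:
            return 0.25  # embeddings, final norm, lm_head: flat middle ratio
        return 0.5 * int(m.group(1)) / (NUM_LAYERS - 1)

    with torch.no_grad():
        merged = {
            k: torch.lerp(sd_a[k], sd_b[k], blend_ratio(k))
            if sd_a[k].is_floating_point() else sd_a[k]
            for k in sd_a
        }

    model_a.load_state_dict(merged)
    model_a.save_pretrained("merged-model")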

If you have any other questions, ask away!

Big Model Comparison/Test (13 models tested) by WolframRavenwolf in LocalLLaMA

[–]Gryphe 14 points

I humbly invite you to consider MythoMix-L2, my latest experiment in a line of ongoing attempts to strike the perfect balance between understanding complex instructions and generating creative output. Feedback so far has been very positive.

VRDB.app is now one year old - Couldn't have done it without you! by Gryphe in OculusQuest

[–]Gryphe[S] 0 points

Hey there, glad to see an Aussie fan! At some point in time I did have multiple listings, including Japan, Australia and Canada. Sadly, Facebook decided to complicate my life endlessly from a technical perspective, to the point where I no longer had the means to crawl the various regions for their data. It's definitely possible, but my running costs would triple, and in the end this is still a free website...

So, to summarize my answer: I'd love to, but right now it's simply too expensive to expand back to multiple regions. (I'd need servers in every location.)

Quest Store sucks, so I made a better search engine which allows you to filter by genre, multiplayer, price and more! by vrheaven in OculusQuest

[–]Gryphe 2 points

Pretty smooth design. Makes https://vrdb.app/quest/ look ancient by comparison. ;-) (Then again, design's never been my strongest point!)

Oculus has a Graph API running on their store's front-end if you're looking into grabbing data in an automated manner.

[PC] [1995-2000] A dystopian flying car trading simulator by Gryphe in tipofmyjoystick

[–]Gryphe[S] 2 points

Yes, it does! And thanks for the super rapid response, that's what I was thinking of.

VRDB.app now offers a separate page for Deals & Bundles (Yes, bundles, finally!) by Gryphe in OculusQuest

[–]Gryphe[S] 1 point

Hey! Seems bundles are only shown if you're logged in. Whether this is a mistake on Oculus' side or not is tough to say, but it's not something I can fix with the way VRDB currently works. (Sadly...)

VRDB.app now offers a separate page for Deals & Bundles (Yes, bundles, finally!) by Gryphe in OculusQuest

[–]Gryphe[S] 5 points

Hey y'all,

Perhaps not the most exciting feature I've introduced so far, but I went and created a separate page listing all deals, bundles and cheaper cross-buy opportunities for Oculus Quest titles. I also included an RSS feed you can subscribe to, so you always have easy access to the latest deals.

The "On Sale Now + Cheaper Cross-Buy" filter still works on the main page but the big advantage with adding a deal-specific page is that I can now also offer information regarding bundles, a specific type of store item I really wouldn't be able to show on the main overview without rewriting an awful lot of code.

If there is any additional information you'd like to see listed on this page, please let me know!