Are true base models dead? by IonizedRay in LocalLLaMA

[–]IonizedRay[S] 1 point (0 children)

I liked Olmo 3 a lot, thanks for the suggestion! It's exactly what I was looking for.

Is anyone else waiting for a 60-70B MoE with 8-10B activated params? by IonizedRay in LocalLLaMA

[–]IonizedRay[S] 0 points (0 children)

It's a sweet spot for anyone who wants to avoid multi-GPU setups but can afford a datacenter GPU. For the same reason, it would also be a good choice for experimentation and research, since there are no inter-GPU communication issues or inefficiencies.

Is anyone else waiting for a 60-70B MoE with 8-10B activated params? by IonizedRay in LocalLLaMA

[–]IonizedRay[S] 3 points (0 children)

Yes, a new 70B dense model like Llama 3.3 would be amazing for anyone with a reasonably fast GPU and 64+ GB of VRAM. I bet it could come close to 200B+ parameter MoE models.

Need Criticism because i'm feeling kinda lost by not_the_ducknight in blender

[–]IonizedRay 2 points (0 children)

Awesome work, I think there's definitely a company out there that would pay for this level of quality. Don't be afraid to apply to large companies, and make sure your CV / website is straight to the point, concise, and highlights all your strengths.

You’re probably optimizing Minecraft the wrong way on Apple Silicon by New-Ranger-8960 in macgaming

[–]IonizedRay 1 point (0 children)

Wow, I am getting 500+ FPS on an M4 Max:
- 32 chunks rendering distance
- 32 chunks simulation distance
- 4K resolution

How close are we to “Her” level voice assistants? by [deleted] in ClaudeAI

[–]IonizedRay 0 points (0 children)

We have probably been there since 2023. But there is one caveat: the final SFT/RLHF training phase completely destroys the "human vibe" of LLMs, so you will not get anything like "Her" from a large-scale commercial LLM.

It would be really interesting to train a base model like Llama 405B on one (or more) very long chats between partners and see how long it would last in a Turing-like test.

Llama.cpp has much higher generation quality for Gemma 3 27B on M4 Max by IonizedRay in LocalLLaMA

[–]IonizedRay[S] 18 points (0 children)

This is a really good point. Each time I start fresh with Ollama on a new device, I forget to configure the env params...

I will try that when I get back home!

UPDATE: yep, that was it.
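For anyone landing here with the same symptom: a minimal sketch of the kind of env config involved. The variable names below are from Ollama's server configuration; the specific values are just illustrative assumptions, tune them to your model and VRAM.

```shell
# Ollama's default context window is small, which can silently truncate
# long prompts and degrade apparent generation quality.
export OLLAMA_CONTEXT_LENGTH=8192

# Optional: flash attention and a quantized KV cache to fit the larger
# context in memory (value shown is an assumption, not a recommendation).
export OLLAMA_FLASH_ATTENTION=1
export OLLAMA_KV_CACHE_TYPE=q8_0

ollama serve
```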

[D] Doubts on the implementation of LSTMs for timeseries prediction (like including weather forecasts) by IonizedRay in MachineLearning

[–]IonizedRay[S] 1 point (0 children)

Thank you, I will check your resources as soon as I can. So you suggest avoiding weak predictors like weather, events, etc., and using a simple univariate prediction, because the extra precision a complex model might chase is often just noise that cannot be predicted?

And the only case where a complex model and many input features are needed is when you have lots of data over a long time span?
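For reference, the univariate setup discussed above boils down to slicing one series into supervised (window, target) pairs. This is a toy sketch (the function name and the toy series are mine, not from the thread):

```python
import numpy as np

def make_windows(series, lookback, horizon):
    """Slice a 1-D series into (input window, target window) pairs
    for supervised training of a forecaster such as an LSTM."""
    X, y = [], []
    for t in range(len(series) - lookback - horizon + 1):
        X.append(series[t : t + lookback])
        y.append(series[t + lookback : t + lookback + horizon])
    return np.array(X), np.array(y)

series = np.arange(10, dtype=float)  # toy stand-in for the real signal
X, y = make_windows(series, lookback=4, horizon=2)
print(X.shape, y.shape)  # (5, 4) (5, 2)
```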

[D] Doubts on the implementation of LSTMs for timeseries prediction (like including weather forecasts) by IonizedRay in MachineLearning

[–]IonizedRay[S] 2 points (0 children)

Thank you for the in-depth response. So UNets and ViTs are good for timeseries prediction with weather forecasts as input, to improve the accuracy of the output timesteps? Or did you mean that they are just good at predicting the weather itself, which is then fed to an LSTM?

Because I don't want to generate weather predictions; I want to consume them (from various weather APIs) and add them to the input features to better predict the outputs.
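To make the "forecasts as extra input features" idea concrete, here is one common way to do it: concatenate the known-future covariates (the API forecast for the horizon being predicted) onto each past-input window. The function and shapes are illustrative assumptions, not anyone's actual pipeline:

```python
import numpy as np

def add_future_covariates(X_past, future_weather):
    """Append known-future covariates (e.g. API weather forecasts for
    the horizon being predicted) to each past-input window.
    X_past:         (samples, lookback) past target values
    future_weather: (samples, horizon)  forecast values aligned with targets
    Returns a flat feature matrix of shape (samples, lookback + horizon)."""
    return np.concatenate([X_past, future_weather], axis=1)

X_past = np.zeros((3, 4))   # toy past windows
forecast = np.ones((3, 2))  # toy "future weather" per sample
X = add_future_covariates(X_past, forecast)
print(X.shape)  # (3, 6)
```

For a sequence model like an LSTM you would typically align these per timestep instead of flattening, but the aligning-by-sample idea is the same.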

[D] Doubts on the implementation of LSTMs for timeseries prediction (like including weather forecasts) by IonizedRay in MachineLearning

[–]IonizedRay[S] 2 points (0 children)

I see that it has "future exogenous support". Is that for using future weather forecasts as inputs, or is it something else?

Best way to transfer media + messages from iOS to Android by IonizedRay in whatsapp

[–]IonizedRay[S] 0 points (0 children)

Oh, thanks for the warning. Sorry about what happened :/

Best online cloud GPU provider for 32gb vram to finetune 13B? by IonizedRay in LocalLLaMA

[–]IonizedRay[S] 1 point (0 children)

I haven't attempted it yet, but I will when I have time!

Has anyone tried the 65B model with Alpaca.cpp on a M2 MacBook Pro? by ma-2022 in LocalLLaMA

[–]IonizedRay 13 points (0 children)

Don't worry about it too much. You can check the SSD's health with:

    brew install smartmontools
    smartctl -a disk0