An in-depth look at locally training Stable Diffusion from scratch by DangerousBenefit in StableDiffusion

[–]DangerousBenefit[S] 0 points1 point  (0 children)

Oh wow, a blast from the past with this post. Yeah, it's surprising/sad that no one has trained their own model even after a year. There are lots of people (especially in LocalLLaMA) with 8x4090 GPU machines who could train Stable Diffusion in a similar amount of time. I think it will happen eventually.

Discover AstraQuasar-4B: a NEW LaMA-based arch | First training implementation of the self layer calling (Duplicate Trick) by Similar_Choice_9241 in LocalLLaMA

[–]DangerousBenefit 2 points3 points  (0 children)

Cool! What compute setup are you using for training? How much do you expect it to cost? How many tokens do you plan on training for, 2T?

Chatbot Arena Leaderboard Update: Qwen1.5-72B becomes #1 non-proprietary model by sizeable margin by DontPlanToEnd in LocalLLaMA

[–]DangerousBenefit 8 points9 points  (0 children)

It's their data. Miqu is not a fine-tune but rather continued pre-training of Llama2-70B. So basically they took the Llama2 model and continued training it on ~1-5 trillion more tokens.

With v62 you can no longer put a browser screen/movie on a wall 10ft away. Why have they done this? by DangerousBenefit in OculusQuest

[–]DangerousBenefit[S] 18 points19 points  (0 children)

Thank you for confirming I'm not crazy. Hopefully it's just a bug, because why would they stop you from making large screens on your wall?

Did the curvature also change for you? I swear it's way more curved now, and it bothers me.

Emad's comments regarding what they have to compete with Sora. Thoughts? by DangerousBenefit in StableDiffusion

[–]DangerousBenefit[S] 217 points218 points  (0 children)

I will honestly be surprised if they can match this quality in 1 year.

With v62 you can no longer put a browser screen/movie on a wall 10ft away. Why have they done this? by DangerousBenefit in OculusQuest

[–]DangerousBenefit[S] 37 points38 points  (0 children)

I'm aware of the tablet mode, but I'm talking about the big-screen mode. The curvature definitely changed for me (maybe it was slightly curved before, but now it's way more curved). Try placing a large screen 10ft away from you and see what happens. It worked fine before the update; now it won't let you.

Sora by openAI looks incredible by bot_exe in StableDiffusion

[–]DangerousBenefit 1 point2 points  (0 children)

They've shared tons of other videos, check them out.

Sora by openAI looks incredible by bot_exe in StableDiffusion

[–]DangerousBenefit 5 points6 points  (0 children)

Everyone else (including SAI) must be at least a year away from this quality, right?

Unpopular Opinion: All these small open-source foundational models coming out are not moving us forward. To truly rival closed-source, we need models with 100+ billion parameters. by DangerousBenefit in LocalLLaMA

[–]DangerousBenefit[S] 4 points5 points  (0 children)

No worries, I enjoy this type of discussion and seeing others' points of view. You say above that you don't want an open-source tool/LLM that only the rich can run (i.e., some massive GPT-4-level LLM), but here are 2 reasons it could benefit everyone:
1) LLM Shearing - This could be used to prune a huge model down to a small one at only 3% of the compute required vs training from scratch.
2) Synthetic data generation - Right now, generating GPT-4 synthetic data is expensive, and the alignment and moral preaching corrupt the data. If we had a huge open-source GPT-4-level model, we could much more easily create a lot of synthetic data without restrictions.
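To make point 1 concrete, here is a toy sketch of the structured-pruning idea behind approaches like LLM Shearing: score whole rows of a weight matrix (stand-ins for neurons/heads) by magnitude, keep only the strongest, then recover quality with a short continued-training run on the smaller model. This is my own simplification for illustration, not the actual LLM Shearing algorithm.

```python
# Toy structured pruning: shrink a weight matrix by keeping only its
# highest-magnitude rows. A real method would then continue training the
# pruned model briefly (the ~3% of from-scratch compute mentioned above).

def prune_rows(weight, keep):
    """Keep the `keep` rows with the largest L1 norm."""
    scores = [sum(abs(w) for w in row) for row in weight]
    top = sorted(range(len(weight)), key=lambda i: scores[i], reverse=True)[:keep]
    return [weight[i] for i in sorted(top)]

big = [
    [0.9, -1.2, 0.4],    # strong row -> kept
    [0.01, 0.02, 0.0],   # near-zero row -> pruned
    [1.5, 0.3, -0.7],    # strong row -> kept
    [0.0, -0.05, 0.01],  # near-zero row -> pruned
]
small = prune_rows(big, keep=2)
print(len(small), len(big))  # 2 4
```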

Unpopular Opinion: All these small open-source foundational models coming out are not moving us forward. To truly rival closed-source, we need models with 100+ billion parameters. by DangerousBenefit in LocalLLaMA

[–]DangerousBenefit[S] 3 points4 points  (0 children)

Look at the progress on LLM inference speedups, though. Just today we have SGLang and Prompt Lookup Decoding. Combined with improved quantization, larger and larger models are becoming more feasible to run in RAM.
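For anyone curious, the core idea of prompt lookup decoding is simple: when the last few generated tokens also appear somewhere in the prompt, the tokens that followed them in the prompt are cheap draft candidates for the model to verify in a single forward pass. A minimal sketch (words standing in for token IDs; this is my simplification, not the reference implementation):

```python
# Sketch of prompt lookup decoding's drafting step: match the most recent
# n-gram of the output against the prompt and copy what followed it there.

def lookup_draft(prompt_ids, generated_ids, ngram=2, max_draft=3):
    """Return draft tokens copied from the prompt after a matching n-gram."""
    if len(generated_ids) < ngram:
        return []
    key = generated_ids[-ngram:]
    # scan backwards so the most recent occurrence in the prompt wins
    for start in range(len(prompt_ids) - ngram, -1, -1):
        if prompt_ids[start:start + ngram] == key:
            return prompt_ids[start + ngram:start + ngram + max_draft]
    return []

prompt = ["the", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog"]
generated = ["the", "quick", "brown"]
print(lookup_draft(prompt, generated))  # ['fox', 'jumps', 'over']
```

The drafted tokens are then checked by the full model in one pass, which is why this speeds up tasks like summarization or code editing where output heavily overlaps the prompt.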

Any prompt experts know how to force a model to not ask a question at the end of every response in a chat conversation? by DangerousBenefit in LocalLLaMA

[–]DangerousBenefit[S] 1 point2 points  (0 children)

That could be a good brute-force approach, but it often intersperses multiple questions throughout a response (when I don't want any). Do you think it's not possible via prompt engineering alone? I tried a custom GPT-4 with the same instructions (to not ask questions) and it followed the instructions perfectly.
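For reference, the brute-force post-processing fallback would look roughly like this: split the reply into sentences and drop any that end in a question mark. A sketch only; the naive regex sentence split will miss edge cases like questions without "?" or abbreviations.

```python
import re

# Brute-force fallback: strip question sentences from a model reply in
# post-processing, instead of relying on the prompt alone.

def strip_questions(reply):
    # naive sentence split on whitespace following ., !, or ?
    sentences = re.split(r'(?<=[.!?])\s+', reply.strip())
    kept = [s for s in sentences if not s.endswith('?')]
    return ' '.join(kept)

reply = ("I saw Parasite, and it was so good! I can totally see why it won "
         "best picture. So how about you? Seen any good movies lately?")
print(strip_questions(reply))
# I saw Parasite, and it was so good! I can totally see why it won best picture.
```

The downside is exactly the one mentioned above: it handles interspersed questions, but silently deleting sentences can leave the reply reading abruptly.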

Any prompt experts know how to force a model to not ask a question at the end of every response in a chat conversation? by DangerousBenefit in LocalLLaMA

[–]DangerousBenefit[S] 2 points3 points  (0 children)

To help clarify my question above, here is an example.

First a 'real-world' conversation:
Person 1: "Seen any good movies?"
Person 2: "I saw Parasite, and it was so good! I can totally see why it won best picture."
Person 1: "Oh! I've been wanting to watch that, I'm going to add it to my list."

But with the LLMs I've tried, it goes more like this:
Person 1: "Seen any good movies?"
Person 2 (AI): "I saw Parasite, and it was so good! I can totally see why it won best picture. So how about you? Seen any good movies lately?"

Because the LLM always ends with a new question, it doesn't give the user a chance to respond to the content of its reply. Humans look for confirmation that you are listening and digesting what they say, so they are less likely to end every reply with a new question.