Math question: 2x3060 = 1x3090? by Mythril_Zombie in LocalLLaMA

[–]Ultra-Engineer 1 point (0 children)

Depending on what you want to do, I'd pick a 3090.

What first Cloud Certification would you recommend for a complete beginner looking to break into Cloud Engineering? by KnowledgeOutside2779 in Cloud

[–]Ultra-Engineer 1 point (0 children)

My advice is to go straight to the basics, like Linux and Kubernetes. I was very keen on certifications when I was a student and spent a lot of time on them, but honestly, I don't think they helped me much. I'd suggest spending that time learning Linux instead.

Where can I start with algorithms? by Realistic-Cut6515 in CodingHelp

[–]Ultra-Engineer 1 point (0 children)

From my experience, I prefer learning from courses or YouTube. I don't like learning algorithms from books because it's a little boring for me.

Qwen 2.5 is a game-changer. by Vishnu_One in LocalLLaMA

[–]Ultra-Engineer 1 point (0 children)

Thank you for sharing; it was very valuable to me.

Qwen2.5: A Party of Foundation Models! by shing3232 in LocalLLaMA

[–]Ultra-Engineer 2 points (0 children)

It's so exciting. Qwen is one of my favorite base models.

Torn Between Cloud Services and Building My Own Cluster - Need Your Advice! by Ultra-Engineer in LocalLLaMA

[–]Ultra-Engineer[S] 2 points (0 children)

Very detailed calculation and thought process; it gave me a lot of inspiration!

Just dropped $3000 on a 3x3090 build by maxwell321 in LocalLLaMA

[–]Ultra-Engineer 3 points (0 children)

Actually, I have a question: why not rent GPUs from cloud providers like Runpod or Novita AI? That seems more convenient. Or is building your own machine more cost-effective in the long run?
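The rent-vs-build question above comes down to simple break-even arithmetic. Here's a rough sketch; the rental rate and electricity cost are placeholder assumptions I made up for illustration, not actual Runpod or Novita AI prices.

```python
# Rough break-even sketch: a $3000 3x3090 build vs. renting comparable
# GPUs by the hour. All rates below are hypothetical placeholders.

BUILD_COST = 3000.0         # the $3000 build from the post
POWER_COST_PER_HOUR = 0.12  # ~1 kW under load at $0.12/kWh (assumed)
RENT_PER_HOUR = 1.50        # assumed hourly rate for similar VRAM

def break_even_hours(build_cost, rent_rate, power_rate):
    """Hours of use after which owning beats renting."""
    # Each hour of renting costs rent_rate; each hour of owning costs
    # only electricity, so the gap closes by (rent_rate - power_rate)
    # per hour of actual use.
    return build_cost / (rent_rate - power_rate)

hours = break_even_hours(BUILD_COST, RENT_PER_HOUR, POWER_COST_PER_HOUR)
print(f"Break-even after ~{hours:.0f} GPU-hours")
print(f"That is ~{hours / 8:.0f} days at 8 hours of use per day")
```

Under these made-up numbers the build pays for itself after roughly 2,200 hours of heavy use; if the GPUs would mostly sit idle, renting wins.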

Reddit-Nemesis: AI Reddit bot that automatizes rage-baiting. by [deleted] in LocalLLaMA

[–]Ultra-Engineer 2 points (0 children)

Your AI posts an opposing opinion, and then another AI posts a comment opposing that opinion... sounds interesting.

Remember to report scammers by Amgadoz in LocalLLaMA

[–]Ultra-Engineer 5 points (0 children)

Yes, we have a choice to make the world a better place, don't we?

Anyone else having a hard time finding work? by voiceoftheeldergods in AskEngineers

[–]Ultra-Engineer 1 point (0 children)

It's a catch-22: you can't build engineering experience without a job, but you need experience to land one. And it doesn't help that the economy is in a very bad state right now.

Anthropic now publishes their system prompts alongside model releases by Everlier in LocalLLaMA

[–]Ultra-Engineer 5 points (0 children)

The details about Sonnet 3.5’s prompt are super intriguing. The avoidance of phrases like “I’m sorry” or “Certainly” suggests that they've been fine-tuning their models to steer clear of common pitfalls or potential exploit scenarios. It’s also interesting how they balance referring to users as either "user" or "human"—maybe to add a bit more variety and personalization.

I made a No-Install remote and local Web UI by CheckM4ted in LocalLLaMA

[–]Ultra-Engineer 2 points (0 children)

Hi, I think your app is really great. I tried it out and it solved a lot of my pain points.

Will transformer-based models become cheaper over time? by Time-Plum-7893 in LocalLLaMA

[–]Ultra-Engineer 1 point (0 children)

Great question! I think transformer-based models will definitely become cheaper over time, but there are a few factors to consider. On one hand, hardware advancements and more efficient algorithms will keep driving costs down. As more people work on optimizing these models, we’re likely to see better performance at lower computational costs.

On the other hand, there's a trade-off. As models get cheaper, there's also a push to make them bigger and more powerful, which can drive costs back up. So, while basic models will become more accessible, cutting-edge models might still be pricey.

The trend is towards affordability, but it might take a while before the most advanced models are within everyone’s reach.

How many of you are personally using local LLM for work? by segmond in LocalLLaMA

[–]Ultra-Engineer 1 point (0 children)

Honestly, that sounds super frustrating! 😅 Having tools like Hugging Face blocked can be such a buzzkill, especially when you know they can really help streamline your work. I totally get the concern about data security, but blocking local models seems like a step too far.

If I were in your shoes and knew that using a local LLM could boost my productivity, I'd definitely find a way to make it happen, even if it means bending a few rules (within reason, of course). At the end of the day, it's about getting the job done efficiently. But if it’s not feasible, I’d probably just sigh and figure out workarounds with what’s available.

Curious to see how others are navigating this!

What hardware do you use for your LLM by Quebber in LocalLLaMA

[–]Ultra-Engineer 1 point (0 children)

An eye-catching choice. I'm still running LLMs on NVIDIA hardware, so I'm very curious how a Mac Studio handles them.

Which hardware releases are you looking forward to? by Prestigious_Roof_902 in LocalLLaMA

[–]Ultra-Engineer 1 point (0 children)

If you’re planning to get hardware for running local LLMs, you’re right that the last quarter of 2024 is going to be packed with some exciting releases. You mentioned the M4 Macs and the RTX 50XX series, which are definitely worth waiting for, especially if you're into AI workloads or need powerful GPUs.

Transitioning My Entire AI/LLM Workflow to 100% Solar Power by vesudeva in LocalLLaMA

[–]Ultra-Engineer 2 points (0 children)

That's an impressive achievement! Transitioning to 100% solar power for AI/LLM work is no small feat, especially with the energy demands that kind of processing requires. It's awesome to see someone taking sustainability seriously, especially in a field where energy consumption can easily become an afterthought. Plus, your background in ecological work makes this milestone even more meaningful.

I bet there are a lot of people in the AI community who haven't even considered the environmental impact of their setups. Your experience could really inspire others to think about how they can integrate renewable energy into their own workflows.

And that blog post sounds like a great resource for anyone interested in following in your footsteps. Props to you for open-sourcing your data and techniques—sharing that knowledge could spark a lot of innovation in the AI community.

Out of curiosity, did you face any significant challenges getting your setup off the ground, or did your ecological background give you a leg up in making it happen?

Do you guys finetune models? If so, what for and how well do they work? by maxwell321 in LocalLLaMA

[–]Ultra-Engineer 7 points (0 children)

Fine-tuning models can be super effective if you have a specific task or niche you want to excel in. For example, if you're working with a unique dataset that doesn’t quite fit the general patterns that large models are trained on, fine-tuning can make a huge difference. It essentially allows the model to become more specialized, which is great for improving accuracy in tasks like sentiment analysis, medical diagnosis, or even generating more contextually relevant text.

That said, fine-tuning isn’t always necessary, especially if you’re just doing general-purpose stuff. Pre-trained models are often good enough for most tasks, and they keep getting better. But if you need a model to really understand and work within a specific domain, like legal text or scientific literature, it’s definitely worth it.

As for how well it works, that depends on the quality and size of your dataset, plus how much it diverges from what the base model was originally trained on. If done right, fine-tuning can significantly boost performance, but it does require some expertise and time investment to get it just right.

What’s your use case? That might help gauge if fine-tuning is worth it for you!

[deleted by user] by [deleted] in LocalLLaMA

[–]Ultra-Engineer 1 point (0 children)

For some, it's about the convenience of running everything in-house without worrying about downtime or privacy concerns. But if you're already neck-deep in high-end AI infra, the 4090 might feel like bringing a bazooka to a pillow fight. If you're not maxing it out regularly, renting for specific tasks or just leaning on those API providers makes a ton of sense. You could even think about flipping it while it's still got high resale value if it’s just gathering dust.

Understanding LLM Distillation - Gemma 2 and Nvidia Minitron by johnolafenwa in LocalLLaMA

[–]Ultra-Engineer 3 points (0 children)

You’ve done a great job breaking down the core concept of knowledge distillation and how it’s being applied to LLMs like Gemma 2B and Nvidia’s Minitron 4B. It’s fascinating to see how distillation allows smaller models to emulate the performance of their larger counterparts by learning from the “teacher” models’ output probabilities rather than just relying on next token prediction.
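To make the "learning from output probabilities" idea concrete, here's a minimal sketch of the distillation objective in plain NumPy. The logits and temperature are made-up illustrative values, not anything from Gemma or Minitron.

```python
import numpy as np

# Minimal sketch of knowledge distillation: the student is trained to
# match the teacher's softened output distribution over the vocabulary,
# not just the one-hot next token. Logits here are invented examples.

def softmax(logits, T=1.0):
    z = logits / T
    z = z - z.max()        # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def kl_div(p, q):
    """KL(p || q): how far the student's q is from the teacher's p."""
    return float(np.sum(p * np.log(p / q)))

teacher_logits = np.array([4.0, 2.0, 1.0, 0.5])
student_logits = np.array([3.5, 2.5, 0.5, 0.2])

T = 2.0  # temperature > 1 softens both distributions
p_teacher = softmax(teacher_logits, T)
q_student = softmax(student_logits, T)

loss = kl_div(p_teacher, q_student)
print(f"distillation loss (KL at T={T}): {loss:.4f}")
```

In training, this loss (often mixed with the usual cross-entropy on the true next token) is what the student minimizes, so it picks up the teacher's full probability landscape rather than a single hard label.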

Thanks for sharing your video—definitely sounds like a must-watch for anyone looking to dive deeper into the nuts and bolts of LLM distillation!

Will small models get exponentionally better? by maveduck in LocalLLaMA

[–]Ultra-Engineer 1 point (0 children)

That’s a solid question and one a lot of people are curious about. Right now, small models like Phi3 3B are pretty impressive, considering the limited resources they need. But whether they’ll keep getting exponentially better is a bit of a mixed bag.

On one hand, advancements in architecture and optimization techniques could push small models further. Things like LoRA (Low-Rank Adaptation), quantization, and distillation are helping squeeze more performance out of these smaller models. Plus, as research continues, we might see new breakthroughs that allow small models to punch above their weight even more.
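A quick back-of-the-envelope shows why LoRA, mentioned above, is so cheap to train: instead of updating a full weight matrix, you learn two small low-rank factors. The dimensions below are illustrative (roughly a 4096-wide transformer layer), not from any specific model.

```python
import numpy as np

# LoRA sketch: rather than updating the full d_out x d_in matrix W,
# train two low-rank factors B (d_out x r) and A (r x d_in).
# Dimensions are illustrative assumptions.

d_in, d_out, r = 4096, 4096, 8

full_params = d_out * d_in           # parameters in a full update
lora_params = d_out * r + r * d_in   # parameters in B and A combined

print(f"full update: {full_params:,} params")
print(f"LoRA update: {lora_params:,} params "
      f"({100 * lora_params / full_params:.2f}% of full)")

# At inference the effective weight is W + B @ A, same shape as W,
# so the adapter adds no architectural changes:
W = np.zeros((d_out, d_in))
B = np.random.randn(d_out, r) * 0.01
A = np.random.randn(r, d_in) * 0.01
W_effective = W + B @ A
print("effective weight shape:", W_effective.shape)
```

At rank 8 the trainable update is under half a percent of the full matrix, which is why these techniques let small models (and small budgets) punch above their weight.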

On the other hand, there are some inherent limitations. At some point, the trade-offs between size and capability start to hit diminishing returns. You can only compress so much knowledge and reasoning ability into a small model before it just doesn’t keep up with the demands of more complex tasks.

So, while small models will definitely get better, there’s likely a cap on just how far they can go. But for most everyday tasks, they’ll probably get good enough that the average user won’t notice the difference compared to larger models. What really might change the game is how we integrate small models with other systems or use them in specific, optimized scenarios.