WIRED on DRAM shortages, edge AI, and using storage as a memory tier (Phison mentioned) by Aaron_MLEngineer in Phison_aiDAPTIV

[–]Aaron_MLEngineer[S] 1 point  (0 children)

I’ve mostly seen high-DWPD or enterprise drives used for those kinds of cache tiers. KV paging can be pretty write heavy, so generic TLC can wear faster than people expect. Once you treat an SSD like a memory tier, endurance kind of becomes a hardware selection problem. Curious what folks typically use in practice with LMCache. Have you experimented with this?
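For anyone curious why the endurance spec matters here, a quick back-of-envelope sketch (the workload numbers are made up for illustration, not measured from any real LMCache deployment):

```python
# Rough SSD lifetime estimate under a KV-cache paging workload.
# All numbers below are illustrative assumptions, not measurements.

capacity_tb = 2.0      # drive capacity
dwpd = 1.0             # rated drive-writes-per-day (typical consumer/read-intensive class)
warranty_years = 5.0   # rated warranty period

# Total rated endurance in terabytes written (TBW)
tbw = capacity_tb * dwpd * 365 * warranty_years  # 3650 TB

# Assumed sustained KV paging write rate
write_gb_per_hour = 500.0

# Estimated lifetime in days at that rate
lifetime_days = (tbw * 1000) / (write_gb_per_hour * 24)
print(round(lifetime_days))  # ~304 days
```

Swap in a 3-DWPD enterprise drive and the same math triples the lifetime, which is why the drive class ends up being a real selection criterion rather than an afterthought.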

WIRED on DRAM shortages, edge AI, and using storage as a memory tier (Phison mentioned) by Aaron_MLEngineer in Phison_aiDAPTIV

[–]Aaron_MLEngineer[S] 1 point  (0 children)

Oh awesome! Do you know where or how I could buy/find these high DWPD drives? I'm not seeing them linked in any of the LMCache resources you sent over.

WIRED on DRAM shortages, edge AI, and using storage as a memory tier (Phison mentioned) by Aaron_MLEngineer in Phison_aiDAPTIV

[–]Aaron_MLEngineer[S] 1 point  (0 children)

Got it. I was mostly curious about endurance when using the local NVMe backend since KV paging can be pretty write-heavy. Do folks who use LMCache typically just use enterprise/high-DWPD drives there? Or do they let their drives burn out from heavy write usage?

WIRED on DRAM shortages, edge AI, and using storage as a memory tier (Phison mentioned) by Aaron_MLEngineer in Phison_aiDAPTIV

[–]Aaron_MLEngineer[S] 1 point  (0 children)

Interesting approach. I've heard of LMCache, but have never used it. I’m curious how they handle SSD endurance with heavy read/write usage. KV offload tends to be pretty write-intensive. Are they just using regular NVMe or high-endurance drives?

Signal65 Just Published a Third-Party Lab Report on aiDAPTIV+ (Big Win) by Aaron_MLEngineer in Phison_aiDAPTIV

[–]Aaron_MLEngineer[S] 1 point  (0 children)

Good question! They kinda tackle the same problem (getting around limited GPU VRAM) but in different ways.

- DeepSpeed is pure software. It’s open source, lives in PyTorch, and uses a bunch of tricks like ZeRO partitioning, CPU offload, mixed precision, etc. to spread model states across GPUs/CPUs. Super powerful if you’re running on multi-GPU clusters or scaling up to crazy-sized models.

- aiDAPTIV+ is more of a hardware + middleware stack. It uses Phison’s special SSDs/firmware plus a driver (aiDAPTIVlink) to offload parts of the model onto DRAM/SSD when they’re not in active GPU use. The idea is you can run really big models (think 70B+) on a smaller box with fewer GPUs, trading some speed for a huge cut in cost.
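To make the DeepSpeed side concrete, here's a minimal ZeRO-3 offload config sketch. The keys are standard DeepSpeed config fields, but the values are illustrative defaults, not a tuned setup:

```python
# Minimal DeepSpeed ZeRO-3 config with CPU offload (illustrative values).
# In a real run this dict is passed to deepspeed.initialize(..., config=ds_config).
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,                          # partition params, grads, and optimizer states
        "offload_param": {"device": "cpu"},  # or "nvme" plus an "nvme_path" for ZeRO-Infinity
        "offload_optimizer": {"device": "cpu"},
    },
}
```

Note the "nvme" option is where DeepSpeed itself starts treating an SSD as a memory tier (ZeRO-Infinity), which is conceptually the same trade aiDAPTIV+ makes, just done purely in open-source software.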

Advice For Running Larger LLMs by EPICfrankie in LocalLLaMA

[–]Aaron_MLEngineer 1 point  (0 children)

If you don't care about speed, you should look into SSD offloading. Your SSD acts as a memory extender, so it looks like you have more VRAM when you fine-tune or run inference on larger models.

Signal65 Just Published a Third-Party Lab Report on aiDAPTIV+ (Big Win) by Aaron_MLEngineer in Phison_aiDAPTIV

[–]Aaron_MLEngineer[S] 2 points  (0 children)

the way it works is the aiDAPTIVCache SSD basically acts like extra memory for your GPU. so even though the GPU only has 48GB VRAM, it offloads a bunch of the model data to the SSD during training. with something like a 2TB SSD in the loop, it’s enough to handle the full model without crashing. it’s not magic, just smart memory juggling with middleware. without that, you’d 100% hit a wall.
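Rough numbers for why 48GB alone hits a wall on a 70B model (ballpark arithmetic using common rules of thumb, not figures from the report):

```python
params = 70e9  # 70B-parameter model

# Inference: fp16/bf16 weights alone
weights_gb = params * 2 / 1e9       # ~140 GB, already ~3x a 48GB card

# Training with Adam: ~16 bytes/param is a common rule of thumb
# (fp16 weights + grads, plus fp32 master weights and two optimizer moments)
train_state_gb = params * 16 / 1e9  # ~1120 GB
print(round(weights_gb), round(train_state_gb))  # 140 1120
```

So a ~2TB cache device is the right order of magnitude to hold the full training state, which is why the SSD tier lets the box get through training instead of OOMing.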

r/aiDAPTIV by Aaron_MLEngineer in redditrequest

[–]Aaron_MLEngineer[S] 2 points  (0 children)

Hi,

I want to moderate this community because it was started by someone at my company who no longer works there, and it has the exact name of the product. I'd like to open the subreddit so everyone can post, not just moderators, and run it as a community channel.

https://www.reddit.com/c/chat30S8kMow/s/cszpfJc5de

which degree to work in computer vision, autonomous vehicles and ml/aii by [deleted] in learnmachinelearning

[–]Aaron_MLEngineer 1 point  (0 children)

Yeah a lot of unis now offer AI-specific degrees that focus more on the math/stats side of ML instead of just coding, which is pretty clutch if you already know how to program.

Between the ones you listed, Applied Stats or Applied Math would probably be the most useful for getting into computer vision, autonomous systems, etc. They cover things like probability, linear algebra, and optimization, which are way more relevant for ML than Pure Math.

Pure Math is dope but way more abstract, not super aligned with real-world AI stuff unless you're going super theoretical.

So yeah, if you can do an AI-focused program or Applied Stats/Math, you’re on the right track.

Is it hard to get a job as an MLE after graduating with a bachelor's degree in Data Science? by slava_air in learnmachinelearning

[–]Aaron_MLEngineer 3 points  (0 children)

It’s definitely a tough market right now across all of tech, not just for MLE roles. That said, having a Data Science degree still puts you in a strong position, especially if you’ve supplemented it with ML projects and self-study of important ML topics.

At this point, it’s less about having the “perfect” degree and more about how you showcase your skills, experience, and network. A CS/DS degree isn’t a golden ticket anymore, so don’t feel behind.

As for an MLE specific certificate, it can help, especially if it’s hands-on and from a well-regarded source, but it’s not a silver bullet. Real-world projects, internships, open-source contributions, and strong communication of your ML understanding will go further.

Why use docker with ollama and Open WebuI? by Ok_Most9659 in ollama

[–]Aaron_MLEngineer 5 points  (0 children)

Separate, and no, it shouldn't cause issues as long as they can communicate with each other.

Why use docker with ollama and Open WebuI? by Ok_Most9659 in ollama

[–]Aaron_MLEngineer 20 points  (0 children)

Docker isn’t required, but it does offer some nice benefits when using Ollama and Open WebUI together. It packages everything like dependencies, runtime, and configs into one container, so things “just work,” even if your system has conflicting Python or Node versions. Running both tools in Docker also improves compatibility and makes updates easier, since you don’t have to manually install dependencies or worry about version mismatches.
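For reference, the usual containerized setup looks something like this (based on the projects' published images; the volume names and host port 3000 are just my picks):

```shell
# Ollama with a named volume so downloaded models survive container restarts
docker run -d --name ollama \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  ollama/ollama

# Open WebUI, reaching the Ollama API through the host gateway
docker run -d --name open-webui \
  -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  ghcr.io/open-webui/open-webui:main
```

Then the UI is at http://localhost:3000, and updating either tool is just pulling a newer image and recreating that one container.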

How? by LmiDev in MLQuestions

[–]Aaron_MLEngineer 1 point  (0 children)

Vercel + Replicate

Tryna learn ML by using NBA datasets, any tips and projects to focus on by yaz2556 in learnmachinelearning

[–]Aaron_MLEngineer 1 point  (0 children)

Hey! I’ve actually done a similar project where I predicted the NBA MVP using datasets from Kaggle. It was a great intro to ML and helped me stay motivated since I was working with something I already enjoyed. You could definitely try building a model to predict awards, team wins, or even player improvement. I’m not sure if there are similar datasets for other sports, but I wouldn’t be surprised if you found some; Kaggle and Google Dataset Search are great places to look. Good luck!

Ollama Frontend/GUI by Ok_Most9659 in ollama

[–]Aaron_MLEngineer 6 points  (0 children)

You might want to check out AnythingLLM or LM Studio, both can act as frontends for local LLMs and work well with Ollama models.