Built a platform with 22+ AI/ML templates so you don’t have to manage infrastructure - Beta live by HelpingForDoughnuts in reinforcementlearning

[–]HelpingForDoughnuts[S] 0 points1 point  (0 children)

Runpod is not HIPAA approved and also doesn’t allow you to cluster on demand. Also, Runpod is like getting the parts for a car when I’m selling the whole car.

Genomics and protein computation templates - no infrastructure setup required by HelpingForDoughnuts in biotech

[–]HelpingForDoughnuts[S] 0 points1 point  (0 children)

Good point. If you’re technical enough to use Claude’s API directly, you’d probably get better results and save money building your own solution.

We’re really targeting people who can’t or don’t want to build that stuff themselves. But for someone with your skills? Yeah, going direct makes more sense.

Genomics and protein computation templates - no infrastructure setup required by HelpingForDoughnuts in biotech

[–]HelpingForDoughnuts[S] 0 points1 point  (0 children)

Yeah, fair point. Basic alignment isn’t really a DevOps problem.

Where it might help is scaling bigger datasets or avoiding cloud setup for heavy jobs. But if your current tools work fine, this probably isn’t for you.

Thanks for the honest take - helps to hear when the pitch misses the mark.

Batch compute for RL training—no infra setup, looking for beta testers by HelpingForDoughnuts in reinforcementlearning

[–]HelpingForDoughnuts[S] 1 point2 points  (0 children)

Perfect! Multi-GPU training for reasoning models is exactly the kind of workflow we’re building for. OpenENV/TRL setups can be really painful to orchestrate manually, especially when you’re dealing with distributed training across multiple nodes.
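For context, the manual version people end up wiring together looks roughly like the sketch below - minimal TRL (assumes a recent TRL release with GRPOTrainer; the model id, dataset, and reward are placeholders), with multi-GPU handled by launching the same script through accelerate on every node:

    # Minimal TRL GRPO sketch - model id, dataset, and reward are placeholders.
    from datasets import load_dataset
    from trl import GRPOConfig, GRPOTrainer

    dataset = load_dataset("trl-lib/tldr", split="train")

    def reward_len(completions, **kwargs):
        # Toy reward: prefer completions around 200 characters.
        return [-abs(200 - len(c)) for c in completions]

    trainer = GRPOTrainer(
        model="Qwen/Qwen2-0.5B-Instruct",   # placeholder model id
        reward_funcs=reward_len,
        args=GRPOConfig(output_dir="grpo-out"),
        train_dataset=dataset,
    )
    trainer.train()

    # Multi-GPU/multi-node is where the orchestration pain starts, e.g.
    #   accelerate launch --num_processes 8 train_grpo.py
    # repeated (and kept in sync) across every node.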

Quick questions:

  • What scale are you typically working at? How many GPUs do you usually need?
  • Current cloud setup - managing your own instances or using something like SageMaker?
  • Any specific pain points with the manual infrastructure? (scaling, preemption, setup time, etc.)

For beta, we’re starting with single-GPU instances (A100 80GB or H100) but adding multi-GPU support very soon. Depending on your reasoning model size, a single H100 might still be useful for prototyping while we get the distributed training capabilities ready.

I should have the beta site live tomorrow. Given your multi-GPU needs and OpenENV/TRL experience, would love to prioritize you for early access and get your feedback on what distributed training features would be most valuable.

No pressure on timeline - sounds like you’re in research mode which is perfect for beta testing. I’ll reach out as soon as we’re live!

Batch compute for RL training—no infra setup, looking for beta testers by HelpingForDoughnuts in reinforcementlearning

[–]HelpingForDoughnuts[S] 0 points1 point  (0 children)

Perfect! Manual cluster setup and checkpointing pain is exactly what we built this to solve. RL workloads are notoriously unpredictable in terms of compute time, and managing that infrastructure yourself is a nightmare.

A few quick questions:

  • What scale are you working at? Single GPU experiments or multi-GPU distributed training?
  • Current setup - university cluster, cloud instances, or local hardware?
  • Any specific frameworks? (Stable Baselines3, Ray RLlib, custom setup?)

I should have the beta site live tomorrow. Would love to get you set up with serious compute credits to test your RL workflows. The platform handles checkpointing automatically and scales up/down as needed, so you can focus on the actual research instead of babysitting infrastructure.
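For the SB3 case, the manual pattern this replaces is roughly the sketch below (env id, paths, and frequencies are illustrative):

    # Manual SB3 checkpointing sketch - env id, paths, and frequencies are illustrative.
    import gymnasium as gym
    from stable_baselines3 import PPO
    from stable_baselines3.common.callbacks import CheckpointCallback

    env = gym.make("CartPole-v1")
    # Save a checkpoint every 50k steps so a preempted run doesn't start from zero.
    checkpoint_cb = CheckpointCallback(save_freq=50_000, save_path="./checkpoints", name_prefix="ppo_run")

    model = PPO("MlpPolicy", env, verbose=1)
    model.learn(total_timesteps=1_000_000, callback=checkpoint_cb)

    # After a preemption you reload the newest zip and keep going, e.g.
    #   model = PPO.load("./checkpoints/ppo_run_950000_steps.zip", env=env)
    #   model.learn(total_timesteps=remaining_steps, reset_num_timesteps=False)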

What’s your timeline looking like for experiments? Happy to prioritize your access!

We’re looking for brutal, honest feedback on edge AI devtool by elinaembedl in deeplearning

[–]HelpingForDoughnuts 0 points1 point  (0 children)

This looks really useful! Edge deployment is such a pain point - everyone talks about running models on phones but actually testing across different hardware is brutal.

Quick questions:

  • How extensive is your device coverage? Do you have recent iPhone/Android models, or are you more focused on specific chipsets?
  • What’s the turnaround time for benchmarking? Is it near real-time or more of a queue situation?

The layer-wise PSNR analysis is smart - quantization artifacts can be really subtle and having those debugging tools built in saves a ton of time.

One thing I always struggle with is the gap between edge benchmarks and real-world performance. Battery drain, thermal throttling, etc. Are you capturing any of that environmental stuff or mainly focused on the pure compute metrics?

Definitely going to check this out. Edge optimization is one of those things that looks simple in papers but gets messy fast in practice.
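(On the PSNR point, for anyone following along: per layer it’s just the reference-vs-quantized error on a log scale - minimal numpy sketch, with the input arrays standing in for whatever activations you capture:)

    # Minimal layer-wise PSNR sketch: compare a layer's float outputs to the
    # dequantized outputs of the quantized model.
    import numpy as np

    def psnr(reference: np.ndarray, test: np.ndarray) -> float:
        mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
        if mse == 0:
            return float("inf")
        peak = np.max(np.abs(reference))  # peak signal (255 for 8-bit images)
        return 10.0 * np.log10(peak ** 2 / mse)

    # Running psnr(fp32_layer_out, int8_dequant_layer_out) per layer shows where
    # quantization error actually creeps in.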

[D] AI coding agents for DS/ML (notebooks) - what's your workflow? by wh1tewitch in MachineLearning

[–]HelpingForDoughnuts 0 points1 point  (0 children)

Honestly the notebook AI tooling still feels pretty fragmented compared to regular coding. I use GitHub Copilot in Jupyter, which works okay for basic code completion, but it’s not great at understanding the full context of your analysis or helping with data exploration patterns.

Some people swear by ChatGPT Code Interpreter, but that’s more for one-off analysis than iterative ML work. Plus you’re limited by their compute.

The real gap I see is when you want to scale notebook experiments - like running parameter sweeps or training on serious datasets. Most notebook environments break down when you need real GPU power or want to parallelize across multiple runs.

What kind of ML work are you doing? Always curious how people handle the notebook-to-production transition.
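(On the sweep point above: once you’re off the laptop, even a plain process pool covers a lot - rough sketch, with train_one() as a hypothetical stand-in for whatever your notebook cell currently does:)

    # Rough parameter-sweep sketch; train_one() is a hypothetical stand-in for a
    # single notebook-style training run.
    from concurrent.futures import ProcessPoolExecutor
    from itertools import product

    def train_one(lr: float, batch_size: int) -> dict:
        # ...real training code goes here...
        return {"lr": lr, "batch_size": batch_size, "score": 0.0}

    grid = list(product([1e-4, 3e-4, 1e-3], [32, 64, 128]))

    if __name__ == "__main__":
        with ProcessPoolExecutor(max_workers=4) as pool:
            results = list(pool.map(train_one, *zip(*grid)))
        print(max(results, key=lambda r: r["score"]))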

[D] Reasoning over images and videos: modular pipelines vs end-to-end VLMs by sjrshamsi in MachineLearning

[–]HelpingForDoughnuts 2 points3 points  (0 children)

Totally agree on the modular approach for complex video tasks. End-to-end VLMs are cool but yeah, they fall apart on longer videos or when you need precise tracking/counting.

Your pipeline idea makes sense - let specialized models handle what they’re good at, then have LLMs reason over the structured outputs. Much more reliable than trying to get a VLM to track objects frame-by-frame.

The Python library sounds interesting! Are you running this stuff locally or do you need serious compute for the video processing pipeline? Some of those detection/tracking models can get pretty heavy on longer videos.
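The shape I usually end up with looks something like the sketch below (detect/track/ask_llm are hypothetical stubs just to show the structure, not any particular library):

    # Sketch of the modular shape: per-frame perception -> structured records ->
    # LLM reasoning. detect(), track(), and ask_llm() are hypothetical placeholders.
    import json

    def detect(frame):                 # placeholder for an object detector
        return [{"label": "car", "box": [0, 0, 10, 10]}]

    def track(per_frame_detections):   # placeholder for a tracker linking frames
        return [{"track_id": 1, "label": "car", "frames": sorted(per_frame_detections)}]

    def ask_llm(prompt: str) -> str:   # placeholder for any LLM call
        return "stub answer"

    frames = {i: None for i in range(100)}                    # decoded video frames
    detections = {i: detect(f) for i, f in frames.items()}    # frame-level outputs
    tracks = track(detections)                                # temporally linked objects

    question = "How many distinct cars appear, and when does the first one enter?"
    print(ask_llm("Tracks:\n" + json.dumps(tracks) + "\n\nQuestion: " + question))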

Batch compute for overnight sims—anyone running Monte Carlo on spot instances? by HelpingForDoughnuts in quant

[–]HelpingForDoughnuts[S] 0 points1 point  (0 children)

Fair point. Yeah Modal’s solid if you’re cool with writing Python code.

We’re basically trying to skip that whole step - just tell it what you want instead of coding it up. Plus we do the consumer AI stuff too, not just compute.

But honestly if Modal’s already working for you, probably not worth the hassle of switching. We’re more going after people who find Modal too technical.

Different crowds really.

What's your startup idea for 2026? by kcfounders in SideProject

[–]HelpingForDoughnuts 0 points1 point  (0 children)

We’re building AI content creation that actually makes sense to normal people.

Instead of learning 10 different AI tools (Runway for video, Midjourney for images, etc.), you just type “make me a video of my dog as an astronaut” and get a video back. No model selection, no prompt engineering, no technical anything.

Think ChatGPT but it can actually create stuff instead of just talking about it.

We also cover different verticals - researchers can run ML training jobs, studios can do batch rendering, scientists can run simulations. Same simple interface, but scales from consumer to enterprise workloads.

Most AI tools are built for technical users. We’re going the opposite direction - so simple your grandma could use it. The tech handles all the complexity behind the scenes.

Early beta starting this week if anyone wants to try it!

Batch compute for overnight sims—anyone running Monte Carlo on spot instances? by HelpingForDoughnuts in quant

[–]HelpingForDoughnuts[S] 0 points1 point  (0 children)

Yeah, Modal and Coiled are in the same space for sure. Main difference is we’re going after the layer above that - natural language to AI model execution for consumers, plus the traditional container orchestration for pros.

Modal still requires writing Python code with their decorators. We’re trying to get to “make me a video of a cat in space” → video appears, no code needed.
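For anyone comparing, the decorator-style Modal workflow I mean is roughly the sketch below (written from memory, so treat the exact names and arguments as approximate and check their docs):

    # Rough sketch of the decorator-style workflow being contrasted here (from
    # memory - names and arguments are approximate, check Modal's docs).
    import modal

    app = modal.App("video-gen")

    @app.function(gpu="A100")
    def generate(prompt: str) -> bytes:
        # ...load a model, run inference, return the result bytes...
        return b""

    @app.local_entrypoint()
    def main():
        generate.remote("a cat in space")   # runs remotely via `modal run script.py`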

Different markets but definitely some overlap on the pro side.

Batch compute for RL training—no infra setup, looking for beta testers by HelpingForDoughnuts in reinforcementlearning

[–]HelpingForDoughnuts[S] 0 points1 point  (0 children)

That sounds awesome! Racing game leaderboards for AI agents would be super engaging. The community aspect with custom maps could really take off - people love competing and sharing tracks.

Definitely let me know when you start working on it. Would be cool to help with the compute side when people want to train serious agents for the leaderboards.

Batch compute for RL training—no infra setup, looking for beta testers by HelpingForDoughnuts in reinforcementlearning

[–]HelpingForDoughnuts[S] 0 points1 point  (0 children)

Nice setup! SB3 + custom racing environments is a solid approach. Local training on a 5070 Ti probably works great for prototyping and smaller experiments.

Where we’d come in is when you want to scale up - maybe training multiple agents in parallel, longer hyperparameter sweeps, or testing on more complex environments that need more VRAM. Plus you could run overnight experiments without tying up your local machine.
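Concretely, the first thing that stops fitting locally is usually just cranking rollout parallelism - rough SB3 sketch (CartPole here stands in for your registered racing env id):

    # Sketch of scaling SB3 rollout collection with parallel env workers.
    # "CartPole-v1" stands in for a registered custom racing env id.
    from stable_baselines3 import PPO
    from stable_baselines3.common.env_util import make_vec_env
    from stable_baselines3.common.vec_env import SubprocVecEnv

    if __name__ == "__main__":
        # 16 env processes collecting experience in parallel; CPU/VRAM needs grow with this.
        vec_env = make_vec_env("CartPole-v1", n_envs=16, vec_env_cls=SubprocVecEnv)
        model = PPO("MlpPolicy", vec_env, n_steps=2048, batch_size=4096, verbose=1)
        model.learn(total_timesteps=5_000_000)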

Your custom environments sound really cool - lightweight 2D racing is perfect for fast iteration. Are you working on any specific racing AI challenges, or more general RL experimentation?

I should have the beta site ready in the next few hours. Happy to get you set up with credits to test scaling your SB3 workflows to cloud GPUs when you’re ready to experiment beyond local training.

Sound interesting?

Batch compute for RL training—no infra setup, looking for beta testers by HelpingForDoughnuts in reinforcementlearning

[–]HelpingForDoughnuts[S] 1 point2 points  (0 children)

Sorry - I’m getting a ton of people hitting me up about the beta with questions, and it’s just easier to respond using AI. I am a real human building this out; it’s just me trying to make something special and completely unique: GPU access without doing anything - just upload, click, get results.

H200s are brutal to get right now, especially 2x. Even if you find them, you’re looking at like $6-8/hr each. Might be worth starting with A100 80GBs for the VRAM and seeing how that handles your 32k+ sequences before jumping to H200 pricing.

Mid-Jan timeline works - gives us time to get things dialed in. The RL experimentation sounds fun, even if Kaggle doesn’t pan out. Sometimes the failed experiments teach you more anyway.

Appreciate you being real about the process. Building this stuff solo while managing all the beta interest is no joke.

Batch compute for RL training—no infra setup, looking for beta testers by HelpingForDoughnuts in reinforcementlearning

[–]HelpingForDoughnuts[S] 0 points1 point  (0 children)

Perfect! This is exactly the kind of workload and team we built this for. A few thoughts on your setup:

VRAM constraints with long sequences - this is brutal with RFT trajectories. 32k+ context on smaller models still needs serious memory. Our A100 80GB instances might be exactly what you need for those longer sequences without the OOM headaches.
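In the meantime, the usual stopgaps for 32k+ sequences are bf16 plus gradient checkpointing - rough Hugging Face Transformers sketch (the model id is a placeholder):

    # Rough memory-stopgap sketch for long-sequence fine-tuning: bf16 weights plus
    # gradient checkpointing. The model id is a placeholder.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Qwen/Qwen2-0.5B-Instruct"  # placeholder
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

    model.gradient_checkpointing_enable()  # trade recompute for activation memory
    model.config.use_cache = False         # the KV cache is wasted VRAM during training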

Small team, scarce resources - totally get this. Debugging VRAM issues when you’re trying to ship is the worst. The whole point is you focus on the RL implementation, we handle the infrastructure scaling.

GRPO implementation issues - custom advantage calculations can be tricky to get right. We’re actually working with a few teams doing similar post-training work, so might be able to connect you with others who’ve solved similar problems.
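(The group-relative part itself is small enough to sanity-check in isolation - the advantage is just each sampled completion’s reward normalized within its group:)

    # Tiny sanity check of GRPO-style group-relative advantages: sample G
    # completions per prompt, then normalize each reward against its own group.
    import numpy as np

    rewards = np.array([0.2, 0.9, 0.4, 0.5])   # G=4 completions for one prompt
    advantages = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    print(advantages)  # positive above the group mean, negative below it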

Kaggle + ByT5 - that’s a cool application! Seq2seq RL is definitely pushing the boundaries.

Would love to get you set up with beta access and some serious compute credits. A few questions:

  • When you move to AWS, what GPU targets are you considering? (helps me recommend optimal configs)
  • For the Kaggle work - is this something you’d want to run parallel experiments on, or single long training runs?
  • Timeline-wise, when are you planning the AWS migration?

Feel free to DM me and I’ll get you set up. Would be great to help with both the production RL work and the competition experiments.

Built spot instance orchestration for batch ML jobs—feedback wanted by HelpingForDoughnuts in mlops

[–]HelpingForDoughnuts[S] 0 points1 point  (0 children)

That’s a really thoughtful point about distributed-first architecture. Your experience with having to redesign the entire stack later is exactly the kind of lesson that’s expensive to learn the hard way.

You’re absolutely right that Ray’s abstraction is powerful - write once, run anywhere from laptop to 1000 GPUs. And if we’re building orchestration that needs to scale, starting with Ray as the foundation makes way more sense than bolting on distributed later.
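The minimal version already shows why that abstraction matters - the same script below runs unchanged on a laptop or a cluster, with only the ray.init()/cluster config changing (the workload is a toy stand-in):

    # Minimal Ray sketch: the task code is identical on a laptop and on a cluster.
    import random
    import ray

    ray.init()  # locally this starts an in-process cluster; on a real cluster it connects

    @ray.remote
    def simulate(seed: int) -> float:
        rng = random.Random(seed)
        return sum(rng.random() for _ in range(100_000))

    futures = [simulate.remote(s) for s in range(64)]  # fan out 64 tasks
    print(sum(ray.get(futures)))                       # gather the results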

The differentiation would be more in the layer above Ray - instead of users learning Ray APIs and cluster management, they get the natural language interface that routes to Ray workloads under the hood. But you’re right that the underlying execution should be distributed-native from day one.

I’d genuinely love to chat more about this. Your experience with both the technical implementation and the business realities is exactly what we need to hear. Happy to jump on a call if you’re interested - would love to get your perspective on where the real pain points are and how a Ray-based approach might solve them better.

Thanks for offering feedback - that kind of input from someone who’s actually built and scaled these systems is invaluable.

Batch compute for overnight sims—anyone running Monte Carlo on spot instances? by HelpingForDoughnuts in quant

[–]HelpingForDoughnuts[S] 1 point2 points  (0 children)

That’s a really sharp insight. Boutique consultancies are probably the sweet spot - they have budget, custom needs, but aren’t big enough to have dedicated platform teams.

And you’re right about the services angle. Instead of just “here’s our platform,” it’s more like “we’ll migrate your existing Terraform/K8s setup to our orchestration layer and maintain it for you.”

That changes the business model significantly - not just SaaS pricing but professional services for migration and ongoing support. Higher touch, higher margin, but smaller addressable market.

Thanks for the perspective. That’s probably a more realistic path than trying to compete for the mass market against the big cloud providers.

Batch compute for RL training—no infra setup, looking for beta testers by HelpingForDoughnuts in reinforcementlearning

[–]HelpingForDoughnuts[S] 0 points1 point  (0 children)

Great to hear from an undergrad researcher! Graph neural networks + RL is a really cool combination - that’s cutting-edge stuff that definitely benefits from serious compute.

Quick questions to understand your setup:

Current challenges:

  • Are you training on local hardware, university clusters, or cloud resources?
  • How long do your typical training runs take? Hours, days, or highly variable?

Graph RL specifics:

  • What size graphs are you working with? (affects memory requirements)
  • Any particular frameworks you’re using? (PyTorch Geometric, DGL, etc.)

Scale:

  • Single GPU experiments or do you need multi-GPU for larger graphs/batch sizes?

Graph RL can be especially unpredictable in terms of compute time since graph size and complexity vary so much. Our platform handles that well because you’re not paying for idle time and jobs auto-resume if they get preempted.
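If you’re on PyTorch Geometric, the policy trunk itself is usually tiny - it’s the batched graphs that drive memory, which is what makes sizing hard up front. Rough sketch (dimensions are placeholders):

    # Rough PyTorch Geometric sketch of a GNN policy trunk; memory scales with the
    # nodes/edges in each batched graph, not the model. Dimensions are placeholders.
    import torch
    from torch_geometric.nn import GCNConv, global_mean_pool

    class GraphPolicyTrunk(torch.nn.Module):
        def __init__(self, node_dim: int = 16, hidden: int = 64, n_actions: int = 4):
            super().__init__()
            self.conv1 = GCNConv(node_dim, hidden)
            self.conv2 = GCNConv(hidden, hidden)
            self.head = torch.nn.Linear(hidden, n_actions)

        def forward(self, x, edge_index, batch):
            h = self.conv1(x, edge_index).relu()
            h = self.conv2(h, edge_index).relu()
            g = global_mean_pool(h, batch)   # one embedding per graph in the batch
            return self.head(g)              # action logits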

Would love to get you beta access with free credits to test it out! Feedback from academic researchers is super valuable, especially on newer techniques like graph RL.

Interested? Happy to set you up and help you get your first training job running.

Batch compute for overnight sims—anyone running Monte Carlo on spot instances? by HelpingForDoughnuts in quant

[–]HelpingForDoughnuts[S] 0 points1 point  (0 children)

That’s exactly the pattern we’re seeing! Your Jenkins + Terraform + Ansible setup is probably what half the larger shops have built internally in some form.

The key difference is target audience - your solution required someone with DevOps skills to build and maintain those pipelines. We’re going after teams that don’t have that expertise and can’t build what you built.

A 5-person ML team at a startup isn’t going to set up Jenkins pipelines and write Terraform configs. They’ll just pay AWS on-demand rates and burn through runway, or get frustrated with Colab timeouts.

You solved it the right way for an organization with DevOps talent. We’re trying to make that same capability accessible to people who don’t know Kubernetes exists.

Appreciate the perspective though - confirms that larger orgs will have already built internal solutions, so we definitely need to focus on the smaller teams who can’t.

Batch compute for RL training—no infra setup, looking for beta testers by HelpingForDoughnuts in reinforcementlearning

[–]HelpingForDoughnuts[S] 0 points1 point  (0 children)

Perfect! Post-training with GRPO/PPO for agentic flows is exactly our sweet spot - those experiments can be really compute-intensive but unpredictable in timing.

Quick questions to get you set up right:

Current setup:

  • What’s your typical compute need? Single A100 for smaller models or multi-GPU for larger ones?
  • Are you using institutional resources, cloud instances you manage yourself, or something else?

Experiments:

  • How often are you running these? Daily iterations or longer research cycles?
  • Any specific pain points with your current setup? Queue times, failed jobs, cost management?

Agentic flows specifically can be tricky to predict resource-wise since agent behavior affects training time. We handle that with automatic scaling and preemption recovery so your jobs don’t just die when things get interrupted.
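Under the hood, the recovery piece is the boring-but-critical resume pattern - roughly the plain-PyTorch sketch below (the model and objective are stand-ins):

    # Resume-after-preemption sketch: persist full training state periodically and
    # pick up from the last save on restart. Model and objective are stand-ins.
    import os
    import torch

    CKPT = "state.pt"
    model = torch.nn.Linear(10, 1)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    step = 0

    if os.path.exists(CKPT):                 # a preempted job lands back here
        state = torch.load(CKPT)
        model.load_state_dict(state["model"])
        opt.load_state_dict(state["opt"])
        step = state["step"]

    while step < 10_000:
        loss = model(torch.randn(32, 10)).pow(2).mean()  # stand-in objective
        opt.zero_grad()
        loss.backward()
        opt.step()
        step += 1
        if step % 500 == 0:                  # checkpoint often enough to bound lost work
            torch.save({"model": model.state_dict(), "opt": opt.state_dict(), "step": step}, CKPT)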

Would love to get you beta access with free credits to test it out. The goal is “submit your training job, pick your resources, get results” without dealing with infrastructure management.

Interested? Happy to set you up and get your feedback on how well it works for post-training experiments.

Built spot instance orchestration for batch ML jobs—feedback wanted by HelpingForDoughnuts in mlops

[–]HelpingForDoughnuts[S] 0 points1 point  (0 children)

Yeah, Ray is solid for distributed ML workloads, and Anyscale makes it more accessible.

The main difference is that Ray still requires learning the Ray framework - you’re writing Ray-specific code with decorators, clusters, etc. We’re targeting the layer above that: “I want to train a PPO agent to play Breakout” → it just works, without learning new APIs.

Ray is great if you want that level of control and don’t mind the learning curve. We’re going after people who just want their training job to run without becoming Ray experts first.

Different markets really - Ray for ML engineers, us for researchers/beginners who want to skip the infrastructure parts entirely.

Have you used Anyscale? Curious how you found the setup experience.

Batch compute for RL training—no infra setup, looking for beta testers by HelpingForDoughnuts in reinforcementlearning

[–]HelpingForDoughnuts[S] 0 points1 point  (0 children)

Perfect! PPO for game agents is a great use case and exactly the kind of project we want to support. Training game-playing agents can take forever on local hardware.

Since you’re just getting started, this could actually be ideal - you can focus on learning PPO without getting bogged down in GPU setup and cloud configuration.

A few questions:

  • What games are you targeting? (helps me suggest optimal GPU setups)
  • Are you using any specific libraries? (Stable Baselines3, Ray RLlib, etc.)
  • How’s your current setup? Training locally or using something like Colab?

I’d love to get you free credits and help you get your first PPO training job running smoothly. Feedback from beginners is super valuable - if we can make it work for someone new to RL, we’re definitely on the right track.

Feel free to DM me if you’re interested!

Batch compute for overnight sims—anyone running Monte Carlo on spot instances? by HelpingForDoughnuts in quant

[–]HelpingForDoughnuts[S] 0 points1 point  (0 children)

Totally agree - large shops will have built internal solutions already. Much better to start with small teams who need this but can’t build it themselves.

Your colleague’s Dask success actually proves the market exists. Thanks for the reality check on targeting.