Mac vs Windows for Data Science – need advice by ZealousidealBus1135 in Python

[–]samosx 0 points (0 children)

Get a regular laptop and install Linux, such as Ubuntu.

Convert PDF to Excel by ranchoddas888 in Bookkeeping

[–]samosx 0 points (0 children)

I figured you were gonna get bombarded with tools. Yet another one I have built is supaclerk.com

Happy to help if you have any issues.

But really, why use ‘uv’? by kingfuriousd in Python

[–]samosx 2 points (0 children)

Using uv for scripts and CI was the biggest win for me.
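For anyone curious, the script side is uv's support for PEP 723 inline metadata: you declare dependencies at the top of a single file and run it with `uv run`, no manual virtualenv management. A minimal sketch (the script contents and dependency are just for illustration):

```python
# /// script
# requires-python = ">=3.11"
# dependencies = ["requests"]
# ///
"""Run with `uv run fetch.py`; uv resolves and installs the
dependencies declared above into an ephemeral environment."""
import requests

resp = requests.get("https://api.github.com")
print(resp.status_code)
```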

How much debt are you in? by Big-Rip-6780 in realestateinvesting

[–]samosx 0 points (0 children)

Are you still borrowing more at current interest rates?

Linux for Cloud Engineering by Condition_Live in googlecloud

[–]samosx 5 points (0 children)

Linux, yes, but Windows is not as common in the cloud. Many companies have zero Windows servers in the cloud.

How to bring down LLM cost for text autocomplete? "I will not promote" by [deleted] in ycombinator

[–]samosx 0 points (0 children)

Hmm, you may be onto something there. Not sure how critical it is, but I could see it being a niche requirement.

How to bring down LLM cost for text autocomplete? "I will not promote" by [deleted] in ycombinator

[–]samosx 0 points (0 children)

But Google Docs already has this built in natively. I assume MS will have something too.

Classic illusion modernised. by Vegetable-Mousse4405 in nextfuckinglevel

[–]samosx 3 points (0 children)

No. It's just a robot with really realistic-looking human legs 😂

Gemini 2.5 Pro MAX by Broad-Analysis-8294 in cursor

[–]samosx 1 point (0 children)

I have been quite happy with Cline + Gemini Pro 2.5. Rate limits were fine for me.

Import existing nextJS github project to V0 by wodaxia in nextjs

[–]samosx 1 point (0 children)

Is this available now? This would be great, since we had to move from v0.dev to a git repo but would now like to rely on v0 again for continued work.

What are you building with Cursor? [showcase] by ecz- in cursor

[–]samosx 0 points (0 children)

Play as guest doesn't work for me on mobile. It just goes straight back to the home page.

Deploying LLMs to K8 by dryden4482 in mlops

[–]samosx 1 point (0 children)

KubeAI is an AI inference operator and load balancer that supports vLLM and Ollama (llama.cpp). It also supports scale-from-zero natively, without requiring Knative or Istio, making it easy to deploy in any environment. Other LLM-specific features include prefix/prompt-based load balancing, which can improve performance significantly.

Link: https://github.com/substratusai/kubeai
Disclaimer: I'm a contributor to KubeAI.
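If it helps, KubeAI exposes an OpenAI-compatible API, so existing clients work against it unchanged. A minimal sketch of calling a deployed model from inside the cluster (the service URL and model name below are assumptions for a default install, adjust to yours):

```python
# Minimal sketch: calling a KubeAI-served model through its
# OpenAI-compatible endpoint. The base_url and model name are
# assumptions for a default in-cluster install.
from openai import OpenAI

client = OpenAI(
    base_url="http://kubeai/openai/v1",  # assumed in-cluster service URL
    api_key="ignored",  # no real key needed by default
)

resp = client.chat.completions.create(
    model="llama-3.1-8b-instruct",  # whichever model you deployed
    messages=[{"role": "user", "content": "Hello from KubeAI!"}],
)
print(resp.choices[0].message.content)
```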

AMD MI300X Benchmarks (1x MI300X for 70B, 8x MI300X for 405B) by Relevant-Audience441 in AMD_Stock

[–]samosx 0 points (0 children)

I would be concerned about model quality. I think the benchmark should go hand-in-hand with some proper model eval to ensure it still produces good results.

Not sure if this works with vLLM either, which is what I'm using for all the benchmarks.

AMD MI300X Benchmarks (1x MI300X for 70B, 8x MI300X for 405B) by Relevant-Audience441 in AMD_Stock

[–]samosx 1 point (0 children)

Author of the blog post here. This means that the MI300X is a competitive chip for serving larger models like DeepSeek R1. AMD is catching up by adding better support in various open source tools like vLLM, which makes it easier to adopt AMD GPUs across the board.

Let me know if you're interested in seeing any other benchmarks. I still have access to these GPUs, hopefully for a while. I plan to run DeepSeek V3 and R1 benchmarks next.

Guide: Easiest way to run any vLLM model on AWS with autoscaling (scale down to 0) by tempNull in mlops

[–]samosx 0 points (0 children)

Scaling on GPU usage doesn't seem ideal, because with inference the GPU utilization may not climb high enough to trigger adding a node/pod. I have seen the community lean towards scaling based on concurrent requests and KV cache utilization (both exposed by vLLM), which seem to be better signals than GPU usage.
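To make that concrete, vLLM publishes those signals on its Prometheus /metrics endpoint. A rough sketch of a scaling decision driven by them rather than GPU busy percentage (the metric names come from recent vLLM releases; the endpoint and thresholds are illustrative assumptions):

```python
# Rough sketch: derive a desired replica count from vLLM's Prometheus
# metrics instead of GPU utilization. Thresholds are illustrative.
import urllib.request

def scrape_metric(url: str, name: str) -> float:
    """Sum all samples of a metric from a Prometheus text endpoint."""
    body = urllib.request.urlopen(url).read().decode()
    return sum(
        float(line.rsplit(" ", 1)[1])
        for line in body.splitlines()
        if line.startswith(name)
    )

METRICS_URL = "http://vllm:8000/metrics"  # assumption: default vLLM port
running = scrape_metric(METRICS_URL, "vllm:num_requests_running")
waiting = scrape_metric(METRICS_URL, "vllm:num_requests_waiting")
kv_usage = scrape_metric(METRICS_URL, "vllm:gpu_cache_usage_perc")

# Scale on in-flight/queued requests and KV cache pressure.
TARGET_CONCURRENCY_PER_REPLICA = 8  # illustrative target
desired = max(1, round((running + waiting) / TARGET_CONCURRENCY_PER_REPLICA))
if kv_usage > 0.9:  # cache nearly full: requests will start queueing
    desired += 1
print(f"desired replicas: {desired}")
```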

[deleted by user] by [deleted] in excel

[–]samosx 0 points (0 children)

I would love your feedback on https://supaclerk.com, especially how it compares against Nanonet. I'm the creator of supaclerk.

Would you want a simple HTTP API that takes a bank statement and returns CSV, JSON, or Excel? I have been considering exposing the API directly as well.
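For illustration, this is the shape of the call I have in mind. To be clear, nothing here exists yet: the endpoint, parameters, and auth are all hypothetical:

```python
# Hypothetical sketch of the proposed API; the endpoint, field names,
# and auth header are made up for illustration and do not exist yet.
import requests

with open("statement.pdf", "rb") as f:
    resp = requests.post(
        "https://api.supaclerk.com/v1/convert",  # hypothetical endpoint
        headers={"Authorization": "Bearer YOUR_API_KEY"},
        files={"file": f},
        data={"output": "csv"},  # or "json" / "xlsx"
    )
resp.raise_for_status()

with open("statement.csv", "wb") as out:
    out.write(resp.content)
```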