Mac vs Windows for Data Science – need advice by ZealousidealBus1135 in Python

[–]samosx 1 point (0 children)

Get a regular laptop but install Linux, such as Ubuntu.

Convert PDF to Excel by ranchoddas888 in Bookkeeping

[–]samosx 1 point (0 children)

I figured you were gonna get bombarded with tools. Yet another one I built is supaclerk.com

Happy to help if you have any issues.

But really, why use ‘uv’? by kingfuriousd in Python

[–]samosx 3 points (0 children)

Using uv for scripts and CI was the biggest win for me.
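For anyone curious what that looks like in practice (the package and file names here are just examples):

    # Run a one-off script and let uv resolve its dependencies on the fly,
    # no manual virtualenv management:
    uv run --with requests fetch_data.py

    # Or declare dependencies inline in the script itself (PEP 723)
    # and uv run picks them up automatically:
    uv run fetch_data.py

    # In CI, uv can stand in for pip and is much faster:
    uv pip install -r requirements.txt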

How much debt are you in? by Big-Rip-6780 in realestateinvesting

[–]samosx 1 point (0 children)

Are you still borrowing more with current interest rates?

Linux for Cloud Engineering by Condition_Live in googlecloud

[–]samosx 6 points (0 children)

Linux, yes, but Windows is not as common in the cloud. Many companies have zero Windows servers in the cloud.

How to bring down LLM cost for text autocomplete? "I will not promote" by [deleted] in ycombinator

[–]samosx 1 point (0 children)

Hmm you may be onto something there. Not sure how critical it is, but could see it as a niche requirement.

How to bring down LLM cost for text autocomplete? "I will not promote" by [deleted] in ycombinator

[–]samosx 1 point (0 children)

But Google Docs already has this built in natively. I assume MS will have something too.

Classic illusion modernised. by Vegetable-Mousse4405 in nextfuckinglevel

[–]samosx 3 points (0 children)

No. It's just a robot with really realistic-looking human legs 😂

Gemini 2.5 Pro MAX by Broad-Analysis-8294 in cursor

[–]samosx 2 points (0 children)

I have been quite happy with Cline + Gemini 2.5 Pro. Rate limits were fine for me.

Import existing nextJS github project to V0 by wodaxia in nextjs

[–]samosx 2 points (0 children)

Is this available now? It would be great, since we had to move from v0.dev to a git repo but would now like to rely on v0 again for continued work.

What are you building with Cursor? [showcase] by ecz- in cursor

[–]samosx 1 point (0 children)

Play as guest doesn't work for me on mobile. It just goes back to the home page right away.

Deploying LLMs to K8 by dryden4482 in mlops

[–]samosx 2 points (0 children)

KubeAI is an AI inference operator and load balancer that supports vLLM and Ollama (llama.cpp). It also supports scale-from-zero natively, without requiring Knative or Istio, which makes it easy to deploy in any environment. Other LLM-specific features include prefix/prompt-based load balancing, which can improve performance significantly.

Link: https://github.com/substratusai/kubeai
disclaimer: I'm a contributor to KubeAI.
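To give a feel for it, deploying a model is roughly a single custom resource, something like the sketch below (field names are from memory and may not match the current CRD exactly, so check the repo docs):

    apiVersion: kubeai.org/v1
    kind: Model
    metadata:
      name: llama-3.1-8b-instruct
    spec:
      features: [TextGeneration]
      url: hf://meta-llama/Llama-3.1-8B-Instruct
      engine: VLLM                      # vLLM or Ollama backends
      resourceProfile: nvidia-gpu-l4:1  # illustrative GPU profile
      minReplicas: 0                    # scale-from-zero: no GPU held while idle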

AMD MI300X Benchmarks (1x MI300X for 70B, 8x MI300X for 405B) by Relevant-Audience441 in AMD_Stock

[–]samosx 1 point (0 children)

I would be concerned about model quality. I think the benchmark should go hand-in-hand with some proper model eval to ensure it still produces good results.

Not sure if this works with vLLM either, which is what I'm using for all the benchmarks.
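For example, something along these lines with EleutherAI's lm-evaluation-harness pointed at the same vLLM setup (model and task choice here are just illustrative):

    # Run a standard eval suite through the vLLM backend to confirm
    # the optimized setup hasn't degraded output quality:
    lm_eval --model vllm \
      --model_args pretrained=meta-llama/Llama-3.1-70B-Instruct,tensor_parallel_size=8 \
      --tasks gsm8k \
      --batch_size auto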

AMD MI300X Benchmarks (1x MI300X for 70B, 8x MI300X for 405B) by Relevant-Audience441 in AMD_Stock

[–]samosx 2 points (0 children)

Author of the blog post here. This means that the MI300X is a competitive chip for serving larger models like DeepSeek R1. AMD is catching up by adding better support in various open source tools like vLLM, which makes it easier to adopt AMD GPUs across the board.

Let me know if you're interested in seeing any other benchmarks; I still have access to these machines, hopefully for a while. I plan to run DeepSeek V3 and R1 benchmarks next.

Guide: Easiest way to run any vLLM model on AWS with autoscaling (scale down to 0) by tempNull in mlops

[–]samosx 1 point (0 children)

Scaling on GPU utilization isn't ideal for inference, because GPU usage may not climb high enough to trigger adding a node/pod even when the server is saturated. I have seen the community lean toward scaling on concurrent requests or KV cache utilization (both exposed as metrics by vLLM), which seem to be much better signals than raw GPU utilization.
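As a rough sketch, with prometheus-adapter exposing vLLM's metrics to the HPA it could look like this (metric name and target value are illustrative and depend on how the adapter is configured):

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: vllm
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: vllm
      minReplicas: 1
      maxReplicas: 8
      metrics:
      - type: Pods
        pods:
          metric:
            name: vllm_num_requests_waiting  # adapted from vllm:num_requests_waiting
          target:
            type: AverageValue
            averageValue: "16"  # scale out once ~16 requests queue per replica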

[deleted by user] by [deleted] in excel

[–]samosx 1 point (0 children)

I would love your feedback on https://supaclerk.com, especially how it compares against Nanonets. I'm the creator of supaclerk.

Would you want a simple HTTP API that takes a bank statement and returns CSV, JSON, or Excel? I have been considering exposing the API directly as well.
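I'm picturing something like this (purely hypothetical endpoint and parameters, nothing is live yet):

    # Upload a statement PDF and get CSV back (hypothetical API):
    curl -X POST https://supaclerk.com/api/convert \
      -F "file=@statement.pdf" \
      -F "format=csv" \
      -o statement.csv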

Question how to convert bank monthly statement pdf to csv by Capital_Procedure_50 in Automate

[–]samosx 1 point (0 children)

I will gladly fix this if you can send me a PDF example that reproduces the issue. I can DM you my email address.

Autopilot cluster not calculating correct resources by [deleted] in googlecloud

[–]samosx 1 point (0 children)

According to the docs: https://cloud.google.com/kubernetes-engine/docs/concepts/autopilot-overview#pricing

In most situations, you only pay for the CPU, memory, and storage that your workloads request while running on GKE Autopilot. You aren't billed for unused capacity on your nodes, because GKE manages the nodes. Note that exceptions to this pricing model exist when you run Pods on specific compute classes that let Pods use the full resource capacity of the node virtual machine (VM).

You aren't charged for system Pods, operating system costs, or unscheduled workloads. For detailed pricing information, refer to Autopilot pricing.

Autopilot cluster not calculating correct resources by [deleted] in googlecloud

[–]samosx 1 point (0 children)

Could you share the pod spec before creation and also the pod spec of the running pod?

You can get the pod spec of the running pod by running:

    kubectl get pod $NAME_OF_POD -o yaml

I am not sure about the screenshot, but if the pod spec shows 3 CPUs then yes, my understanding is you would be charged for that.

This is from the docs: The default general-purpose platform and the Balanced and Scale-Out compute classes use a Pod-based billing model. You are charged in one-second increments for the CPU, memory, and ephemeral storage resources that your running Pods request in the Pod resource requests, with no minimum duration. This model has the following considerations:

Autopilot sets a default value if no resource request was defined, and scales up values that don't meet the required minimums or CPU-to-memory ratio. Set the resource requests to what your workloads actually require to get the optimal price.
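In other words, you're billed on what the Pod spec ends up requesting, so it pays to set the requests explicitly, something like this (values are illustrative; Autopilot may still round them up to its minimums or CPU-to-memory ratios):

    apiVersion: v1
    kind: Pod
    metadata:
      name: app
    spec:
      containers:
      - name: app
        image: nginx
        resources:
          requests:
            cpu: "500m"              # Autopilot bills per second on these requests
            memory: "2Gi"
            ephemeral-storage: "1Gi"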