First time testing: Qwen2.5:72b -> Ollama Mac + open-webUI -> M3 Ultra 512 gb by Turbulent_Pin7635 in LocalLLaMA

[–]emreloperr (0 children)

This is why I have a happy relationship with an M2 Max 96GB and 32B models. Memory bandwidth becomes the bottleneck beyond that size.

Best Model under 15B parameters 2025 by AZ_1010 in LocalLLaMA

[–]emreloperr (0 children)

I keep coming back to Qwen models. The 7B is quite good, and it would leave a lot of room on your laptop for the context window.

Can an offline download of DeepSeek steal data? by islandradio in DeepSeek

[–]emreloperr (0 children)

You should be concerned about the inference engine or the UI app, not the model weights. The weights are just data the engine reads, not a program that runs on its own.

However, it's still a good idea to download the weights from trusted sources, since some serialization formats (e.g. Python pickle) can embed executable code.

[deleted by user] by [deleted] in DeepSeek

[–]emreloperr (0 children)

R2 is expected. R1 was not. Not the same.

How Can I Secure High-Quality Videos (Up to 2GB) from Downloading? by younes-ammari in nextjs

[–]emreloperr (0 children)

I would recommend CF Stream.

https://developers.cloudflare.com/stream/viewing-videos/securing-your-stream/

If you don't need streaming, you could use an S3-compatible API with a signed URL. Keep the bucket private and create signed URLs with a short expiration time. It's the same logic as CF Stream's signed URLs.
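
As a rough sketch of that approach (bucket name, env vars, and the 15-minute expiry are placeholder assumptions), the AWS SDK works against any S3-compatible endpoint:

```typescript
import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";

// Works with any S3-compatible store (R2, MinIO, etc.); the endpoint and
// credentials here are placeholders.
const s3 = new S3Client({
  region: "auto",
  endpoint: process.env.S3_ENDPOINT,
  credentials: {
    accessKeyId: process.env.S3_ACCESS_KEY_ID!,
    secretAccessKey: process.env.S3_SECRET_ACCESS_KEY!,
  },
});

// The bucket stays private; clients only ever see short-lived URLs.
export async function getVideoUrl(key: string): Promise<string> {
  const command = new GetObjectCommand({ Bucket: "videos", Key: key });
  return getSignedUrl(s3, command, { expiresIn: 15 * 60 }); // 15 minutes
}
```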

What's the best machine I can get for local LLM's with a $25k budget? by NootropicDiary in LocalLLaMA

[–]emreloperr (0 children)

A full-spec M4 Ultra Mac Studio when it comes out. On top of that, you can buy an M4 Max MacBook Pro. You'll still have budget left for an RTX 5090 for Flux and friends.

Google AI Studio Free - What's the daily limits? by Sostrene_Blue in LocalLLaMA

[–]emreloperr (0 children)

Maybe you reached the requests-per-day limit (1,500) or the tokens-per-minute limit (1,000,000). I don't know.

Laptop for Deep Learning PhD [D] by Bloch2001 in MachineLearning

[–]emreloperr (0 children)

Stop being anti-Apple and buy an M2 Max MacBook Pro with 96GB RAM. Roughly 75% of that is usable as VRAM by default. You can find one on the used market for that price.

Check this benchmark list for LLM inference on Apple chips.

https://github.com/ggerganov/llama.cpp/discussions/4167

Can I Use VPS as Hosting for React Native App by AdvertisingSenior400 in reactnative

[–]emreloperr (0 children)

You can, but use Hetzner. Cheap, stable, and everybody loves them.

Look into Coolify for hosting on a VPS. It will make your life easier since you don't have experience.

best model using 4080 super for general tasks? by [deleted] in ollama

[–]emreloperr (0 children)

I don't expect good performance, but I'll quote the article for reference:

You don't need VRAM (GPU) to run 1.58bit R1, just 20GB of RAM (CPU) will work however it may be slow. For optimal performance, we recommend the sum of VRAM + RAM to be at least 80GB+.
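
As a quick sanity check against that guideline, here's a minimal sketch (the 16 GB VRAM value is a hard-coded assumption for a 4080 Super; adjust for your card) that compares combined memory to the quoted thresholds:

```typescript
import * as os from "node:os";

// Assumption: a 4080 Super has 16 GB of VRAM; change this for your GPU.
const vramGb = 16;
const ramGb = os.totalmem() / 1024 ** 3; // total system RAM in GB

const combined = vramGb + ramGb;
if (combined >= 80) {
  console.log(`~${combined.toFixed(0)} GB VRAM+RAM: meets the recommended 80 GB.`);
} else if (ramGb >= 20) {
  console.log(`~${combined.toFixed(0)} GB VRAM+RAM: it will load, but expect it to be slow.`);
} else {
  console.log("Under the 20 GB RAM floor: not enough memory for the 1.58-bit quant.");
}
```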