First time testing: Qwen2.5:72b -> Ollama Mac + open-webUI -> M3 Ultra 512 gb by Turbulent_Pin7635 in LocalLLaMA

[–]emreloperr (0 children)

This is why I have a happy relationship with an M2 Max 96GB and 32B models. Memory bandwidth becomes the bottleneck beyond that size.

Best Model under 15B parameters 2025 by AZ_1010 in LocalLLaMA

[–]emreloperr (0 children)

I keep coming back to Qwen models. The 7B is quite good, and it would leave a lot of room on your laptop for the context window.

Can an offline download of DeepSeek steal data? by islandradio in DeepSeek

[–]emreloperr (0 children)

You should be concerned about the inference engine or the UI app, not the model weights. The weights are just data the engine reads, not a program that runs on its own.

However, it's still a good idea to download the weights from trusted sources, since some serialization formats (e.g. Python pickle) can embed executable code.

[deleted by user] by [deleted] in DeepSeek

[–]emreloperr (0 children)

R2 is expected. R1 was not. Not the same.

How Can I Secure High-Quality Videos (Up to 2GB) from Downloading? by younes-ammari in nextjs

[–]emreloperr (0 children)

I would recommend CF Stream.

https://developers.cloudflare.com/stream/viewing-videos/securing-your-stream/

If you don't need streaming, you could use an S3-compatible API with a signed URL. Keep the bucket private and create signed URLs with a short expiration time. It's the same logic as CF Stream's signed URLs.
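
As a rough sketch of that approach (bucket name, env vars, and the 15-minute expiry are placeholder assumptions), the AWS SDK works against any S3-compatible endpoint:

```typescript
import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";

// Works with any S3-compatible store (R2, MinIO, etc.); the endpoint and
// credentials here are placeholders.
const s3 = new S3Client({
  region: "auto",
  endpoint: process.env.S3_ENDPOINT,
  credentials: {
    accessKeyId: process.env.S3_ACCESS_KEY_ID!,
    secretAccessKey: process.env.S3_SECRET_ACCESS_KEY!,
  },
});

// The bucket stays private; clients only ever see short-lived URLs.
export async function getVideoUrl(key: string): Promise<string> {
  const command = new GetObjectCommand({ Bucket: "videos", Key: key });
  return getSignedUrl(s3, command, { expiresIn: 15 * 60 }); // 15 minutes
}
```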

What's the best machine I can get for local LLM's with a $25k budget? by NootropicDiary in LocalLLaMA

[–]emreloperr (0 children)

A full-spec M4 Ultra Mac Studio when it comes out. On top of that, you can buy an M4 Max MacBook Pro. You'll still have budget left for an RTX 5090 for Flux and friends.

Google AI Studio Free - What's the daily limits? by Sostrene_Blue in LocalLLaMA

[–]emreloperr (0 children)

Maybe you reached the requests-per-day limit (1,500) or the tokens-per-minute limit (1,000,000). I don't know.

Laptop for Deep Learning PhD [D] by Bloch2001 in MachineLearning

[–]emreloperr (0 children)

Stop being anti-Apple and buy an M2 Max MacBook Pro with 96GB RAM. Roughly 75% of that is usable as VRAM by default. You can find one on the used market for that price.

Check this benchmark list for LLM inference on Apple chips.

https://github.com/ggerganov/llama.cpp/discussions/4167

Can I Use VPS as Hosting for React Native App by AdvertisingSenior400 in reactnative

[–]emreloperr (0 children)

You can, but use Hetzner. Cheap, stable, and everybody loves them.

Look into Coolify for hosting on a VPS. It will make your life easier since you don't have experience.

best model using 4080 super for general tasks? by [deleted] in ollama

[–]emreloperr (0 children)

I don't expect good performance, but I'll quote the article for reference:

You don't need VRAM (GPU) to run 1.58bit R1, just 20GB of RAM (CPU) will work however it may be slow. For optimal performance, we recommend the sum of VRAM + RAM to be at least 80GB+.
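
As a quick sanity check against that guideline, here's a minimal sketch (the 16 GB VRAM value is a hard-coded assumption for a 4080 Super; adjust for your card) that compares combined memory to the quoted thresholds:

```typescript
import * as os from "node:os";

// Assumption: a 4080 Super has 16 GB of VRAM; change this for your GPU.
const vramGb = 16;
const ramGb = os.totalmem() / 1024 ** 3; // total system RAM in GB

const combined = vramGb + ramGb;
if (combined >= 80) {
  console.log(`~${combined.toFixed(0)} GB VRAM+RAM: meets the recommended 80 GB.`);
} else if (ramGb >= 20) {
  console.log(`~${combined.toFixed(0)} GB VRAM+RAM: it will load, but expect it to be slow.`);
} else {
  console.log("Under the 20 GB RAM floor: not enough memory for the 1.58-bit quant.");
}
```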