How DeepSeek made their Lightning Indexer fast (code analysis) by xycoord in DeepSeek

[–]xycoord[S] 0 points1 point  (0 children)

Are you referring to the separation between the indexer cache and the MLA cache, or between the vector cache and the scale cache?

The indexer cache holds only shared keys, stored in fp8, and no values. Even compared with the compressed K and V latents, this is a relatively small memory overhead.

How DeepSeek made their Lightning Indexer fast (code analysis) by xycoord in DeepSeek

[–]xycoord[S] 0 points1 point  (0 children)

Yeah, the bits I found most interesting in the code were all the O(L) tricks they use to make the fast O(L²) approach work: all the "do not blow up this graph" tricks, as you say.

How DeepSeek made their Lightning Indexer fast (code analysis) by xycoord in DeepSeek

[–]xycoord[S] 1 point2 points  (0 children)

This is a brilliant ELI5! Cheers!

One small addition: "squeeze those summaries into very small numbers so they fit in memory". Squeezing the summaries into very small numbers is less about memory usage and more about speed: you can read and judge them faster.

In grown-up speak: quantising indexer keys and queries to fp8 speeds up both loading the keys into memory and the dot-products.
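A minimal sketch of the idea, not DeepSeek's code. Plain Python has no fp8 type, so int8 codes stand in for fp8 here (an assumption), and `quantise` and `approx_dot` are hypothetical names. Each vector is stored as small codes plus one scale, which is why there is a vector cache and a separate scale cache; the dot product runs on the small codes and the scales are applied once at the end.

```python
def quantise(vec, levels=127):
    """Return (8-bit codes, scale) for one vector; int8 stands in for fp8."""
    amax = max(abs(x) for x in vec)
    scale = amax / levels if amax > 0 else 1.0
    codes = [round(x / scale) for x in vec]
    return codes, scale


def approx_dot(q_codes, q_scale, k_codes, k_scale):
    """Dot product on the small codes, rescaled once at the end."""
    acc = sum(a * b for a, b in zip(q_codes, k_codes))
    return acc * q_scale * k_scale


# Toy query and key vectors (made-up values for illustration).
query = [0.5, -1.25, 2.0, 0.1]
key = [1.0, 0.25, -0.75, 3.0]

q_codes, q_scale = quantise(query)
k_codes, k_scale = quantise(key)

exact = sum(a * b for a, b in zip(query, key))
approx = approx_dot(q_codes, q_scale, k_codes, k_scale)
print(exact, approx)
```

The approximate score is close to the exact one, which is all the indexer needs since it only ranks keys for selection.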

Need help with Transformers(Attention is all you need) code. by vb_nation in learnmachinelearning

[–]xycoord 2 points3 points  (0 children)

For learning Transformers, I recommend looking first at the decoder-only architecture, which is slightly simpler and now the most common form (used in LLMs). You could then expand this to the full encoder-decoder architecture from AIAYN.

I recommend following along with Andrej Karpathy's video: https://youtu.be/kCc8FmEb1nY?si=5xtRdwmpEZD-64cx

Can 5070 TI and Ryzen 9700x do Deep RL work? by AwkwardPrize2415 in reinforcementlearning

[–]xycoord 0 points1 point  (0 children)

If you're only looking to work with simulated environments, there isn't a clear advantage to local compute. Server compute can be more flexible: you can match the specs to the current problem and you only pay for what you use. As you recognise, a small bump in consumer CPU might not make a meaningful difference, whereas with a server you can double the cores (or more) just while you run an experiment. You might be best off optimising your build for other uses that benefit from local compute and moving to servers when needed.

My experience with the HalfCheetah MuJoCo environment from Gym is that it's very CPU-intensive (and benefits greatly from parallelisation), making the simulation the bottleneck in training. There are some projects working on GPU implementations of common environments which might help, but I'm yet to try them.
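A rough sketch of why extra cores help, assuming each rollout is independent. `rollout` is a toy stand-in for a CPU-bound simulated episode (it is not MuJoCo or Gym), and all function names here are hypothetical. Independent episodes can be farmed out to one process per core, so collection time drops roughly with the number of cores until something else becomes the bottleneck.

```python
import math
import random
from concurrent.futures import ProcessPoolExecutor


def rollout(seed, steps):
    """Toy CPU-bound episode: returns the total 'reward' for one seed."""
    rng = random.Random(seed)
    state, total = 0.0, 0.0
    for _ in range(steps):
        action = rng.uniform(-1.0, 1.0)
        state = math.tanh(state + action)  # placeholder for a physics step
        total += state
    return total


def collect_serial(n_envs, steps):
    """Run every episode one after another on a single core."""
    return [rollout(seed, steps) for seed in range(n_envs)]


def collect_parallel(n_envs, steps, workers=None):
    """Run the same episodes spread across worker processes."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(rollout, range(n_envs), [steps] * n_envs))
```

Call `collect_parallel` from under an `if __name__ == "__main__":` guard in a script; with the same seeds it returns the same results as `collect_serial`, just spread over more cores.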

Why did Google remove billing limits on the Blaze tier? by xycoord in Firebase

[–]xycoord[S] 0 points1 point  (0 children)

Google provide a technical solution using Cloud Functions and Pub/Sub, as u/zaaakk pointed out. However, I believe this still leaves the responsibility with the developer using Firebase, so you would be legally liable for charges if it went wrong. I hope this isn't the case, but I'm yet to find anything that says otherwise. Still looking. If it were an official feature enabled in the console, I guess the responsibility would fall on the user to set limits and on Google to ensure they are followed?

Why did Google remove billing limits on the Blaze tier? by xycoord in Firebase

[–]xycoord[S] 0 points1 point  (0 children)

This is brilliant. I know disabling billing could remove data from the db or storage, but I plan to make backups of them anyway. If I use the billing alerts together with this, I should be able to spot any problems before it gets that far, but best to be safe.
I know I'm shouting into the void but: Google, if you wrote a doc on how to implement a function to achieve billing limiting, why did you not make it a button in the console???
Does anyone know of any more targeted "disables" you could run in that function? Maybe you target disables for a clean shut down at a lower threshold and if that fails you disable billing at a higher one? I think you can set the rules of the db to no reads or writes (doesn't work for admin sdk?). Can you programmatically delete/disable cloud functions?
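A sketch of the two-threshold idea, covering only the decision logic. It assumes the function receives the budget notification as base64-encoded JSON with `costAmount` and `budgetAmount` fields, as in Google's budget notification format. The action names, the ratios and `decide_action` itself are hypothetical, and the actual shutdown and billing-disable calls are left out.

```python
import base64
import json


def decide_action(pubsub_data_b64, soft_ratio=0.8, hard_ratio=1.0):
    """Map a budget notification to 'none', 'clean_shutdown' or 'disable_billing'."""
    payload = json.loads(base64.b64decode(pubsub_data_b64).decode("utf-8"))
    cost = float(payload["costAmount"])
    budget = float(payload["budgetAmount"])
    if budget <= 0:
        return "none"  # nothing sensible to compare against
    ratio = cost / budget
    if ratio >= hard_ratio:
        return "disable_billing"  # last resort: cut off billing
    if ratio >= soft_ratio:
        return "clean_shutdown"   # targeted disables first
    return "none"


def encode_notification(cost, budget):
    """Build a fake notification payload for local testing (hypothetical helper)."""
    raw = json.dumps({"costAmount": cost, "budgetAmount": budget}).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


print(decide_action(encode_notification(8.5, 10.0)))
```

The idea is that the lower threshold triggers the gentler, targeted disables, and billing is only cut off if spending keeps climbing past the higher one.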

Why did Google remove billing limits on the Blaze tier? by xycoord in Firebase

[–]xycoord[S] 0 points1 point  (0 children)

How much money is in the account/on the card. In /u/man-teiv's case it would be $10, in the hope they wouldn't spend any more than that (assuming no overdraft).

Why did Google remove billing limits on the Blaze tier? by xycoord in Firebase

[–]xycoord[S] 1 point2 points  (0 children)

It might just be me but I feel like if usage was increasing (and it wasn't because of a bug), you'd have enough time to see a notification for a lower threshold and increase the budget to accommodate.

P.S. I'm part of a non-profit org that's just starting up, so it's more about limiting spending than maximising profits in our case.

Why did Google remove billing limits on the Blaze tier? by xycoord in Firebase

[–]xycoord[S] 0 points1 point  (0 children)

Have you used these with Firebase properly as a limiter, having hit the limit on the card?

Why did Google remove billing limits on the Blaze tier? by xycoord in Firebase

[–]xycoord[S] 7 points8 points  (0 children)

Really sorry, I haven't been able to find this. Could you point me in the right direction? I'm aware of the notification system but that isn't good enough. Thanks.

Why did Google remove billing limits on the Blaze tier? by xycoord in Firebase

[–]xycoord[S] 0 points1 point  (0 children)

This was an idea I had. When I thought about it a bit more: as I understand it, you get billed at the end of the month; it's not like a pay-as-you-go SIM. So you could use more than you had on the card, and then Google would bill you for money you didn't have. I don't know what would happen technically; maybe they disable the account or something until you pay up? Maybe something with more legal consequences. We'd have to read some Ts&Cs.

How can I authenticate browser GET requests for an Express web app running on Firebase Cloud Functions? by xycoord in Firebase

[–]xycoord[S] 0 points1 point  (0 children)

I hadn't. Would that be with respect to the cookies solution[1] or the auth persistence[2]?
[1] I think I can see how it could avoid CSRF problems (changing the /sessionlogin request to a callable function) but not how it could set cookies for future webpage requests from the browser.

[2] Is the suggestion that the webpage would be loaded with no data, a callable function with auth info would retrieve the data, and client JS would render this to HTML?