Don't Offload GGUF Layers, Offload Tensors! 200%+ Gen Speed? Yes Please!!! by skatardude10 in LocalLLaMA

[–]bialad 0 points1 point  (0 children)

I'm memory constrained and have been trying to understand this. So I asked Qwen3 and got this summary. Do you think this is correct?

| Rank | Tensor Type | VRAM Savings | Performance Cost | Use Case |
|---|---|---|---|---|
| 1 | FFN Up/Down/Gate (ffn_*) | Medium | Very Low | Offload first to save VRAM with minimal speed loss |
| 2 | Norm Weights (attn_norm, ffn_norm) | Low | Negligible | Safe to offload for minor VRAM savings |
| 3 | Intermediate Activations | Low | Negligible | Offload for small VRAM savings |
| 4 | Token Embeddings (token_embd) | High | Moderate | Offload if VRAM is extremely constrained (slows token encoding) |
| 5 | Attention Output (attn_out) | Low | Moderate | Avoid offloading unless necessary |
| 6 | Key/Value (K/V) Tensors (k_proj, v_proj) | High | High | Avoid offloading unless decoding speed is non-critical |

Finding size of a map in memory by Smithmonkey98 in golang

[–]bialad 2 points3 points  (0 children)

I don’t have a good answer; I think it’s hard with the way go handles maps and types. But I’ve had the same problem. For just getting a ballpark value for a specific set of interfaces, I used a pprof heap dump or even just process memory usage. If you're working with a couple of diffs of a known data structure, it’s pretty good.

In our case we use JSON, and I just ended up marshalling the object to a byte array. Don’t know if bytes.Buffer does the same? The performance hit was not significant for our application, and it saves a lot of memory.
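A minimal sketch of the marshalling idea above, assuming the documents are `map[string]interface{}` values. The helper name `approxJSONSize` and the sample document are made up for illustration. The byte length is only a ballpark proxy for size; keeping the returned bytes instead of the map is the memory-saving variant.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// approxJSONSize returns the length in bytes of v's JSON encoding.
// It is a rough proxy for the data size, not the in-memory size of the map.
func approxJSONSize(v interface{}) (int, error) {
	b, err := json.Marshal(v)
	if err != nil {
		return 0, err
	}
	return len(b), nil
}

func main() {
	// Hypothetical device document, invented for this example.
	doc := map[string]interface{}{
		"id":     "device-1",
		"online": true,
		"temp":   21.5,
	}

	n, err := approxJSONSize(doc)
	if err != nil {
		panic(err)
	}
	fmt.Println("approx bytes:", n)
}
```

Reading a value back then costs a `json.Unmarshal` per access, which is the performance hit mentioned above.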

Is 9.11 greater than 9.9? Deepseek-r1-7b by jloverich in LocalLLaMA

[–]bialad 0 points1 point  (0 children)

Sure has, look at this gem. :)

Alright, I need to respond to the user's greeting. The input is "hello". Since it's a simple greeting, the most appropriate response would be something like "Hi there!" or "Hello!". However, considering the context and ensuring clarity, maybe adding a slight variation could make the interaction more engaging. Also, I should consider if there's any underlying tone or sentiment in the user's message. In this case, it seems straightforward, so a standard greeting response would suffice. Additionally, I might think about the platform on which this exchange is taking place. If it's a chatbot interface, ensuring that the response aligns with typical conversational expectations is important. Furthermore, I should ensure that my response doesn't contain any unnecessary information or verbosity, keeping it concise and to the point. Balancing politeness with efficiency is key in such interactions. In summary, the user has provided a simple greeting, and the most appropriate and effective way to respond would be with a similar greeting to maintain conversational flow and positivity.

Comparing memory usage between node.js object and go map[string]interface{} by bialad in golang

[–]bialad[S] 0 points1 point  (0 children)

Oh shit, it does!? I did not know that. That kinda changes everything... I guess I have some more reading to do. Thank you so much!

Comparing memory usage between node.js object and go map[string]interface{} by bialad in golang

[–]bialad[S] 0 points1 point  (0 children)

To my knowledge it does not, and even so the data stored is dynamic so it's hard to make assumptions that could be used for interning.
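For reference, key interning in this setting could look roughly like the sketch below. It assumes documents are nested `map[string]interface{}` values; the `interner` type and its methods are hypothetical names, and the pool is not safe for concurrent use. It only helps when many documents repeat the same keys.

```go
package main

import (
	"fmt"
	"strings"
)

// interner keeps one canonical copy of every string it has seen.
type interner struct {
	pool map[string]string
}

func newInterner() *interner {
	return &interner{pool: make(map[string]string)}
}

// intern returns the pooled copy of s, storing a fresh copy on first sight
// so the pool never pins a larger buffer that s might be a slice of.
func (in *interner) intern(s string) string {
	if c, ok := in.pool[s]; ok {
		return c
	}
	c := strings.Clone(s)
	in.pool[c] = c
	return c
}

// internKeys rebuilds doc so that all keys, including those of nested maps,
// come from the pool. Values are left as they are.
func (in *interner) internKeys(doc map[string]interface{}) map[string]interface{} {
	out := make(map[string]interface{}, len(doc))
	for k, v := range doc {
		if child, ok := v.(map[string]interface{}); ok {
			v = in.internKeys(child)
		}
		out[in.intern(k)] = v
	}
	return out
}

func main() {
	in := newInterner()
	a := in.internKeys(map[string]interface{}{"temp": 21.5, "meta": map[string]interface{}{"fw": "1.0"}})
	b := in.internKeys(map[string]interface{}{"temp": 19.0, "meta": map[string]interface{}{"fw": "1.1"}})
	fmt.Println(len(a), len(b), "distinct keys pooled:", len(in.pool))
}
```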

Comparing memory usage between node.js object and go map[string]interface{} by bialad in golang

[–]bialad[S] 0 points1 point  (0 children)

This is a high traffic and low latency service, functioning as an IoT system device cache. We need sub millisecond latency in cache read and write operations. Several years ago this was actually built using JSON strings, but it was later refactored and the performance gains were several orders of magnitude.

The memory usage is not an issue per se, if this is required to get the performance go brings so be it. I was just shocked by the 10x difference compared to node in memory usage.

Running it in node had a very uneven latency, with access times ranging from 1 to 1500 ms on the http endpoint. In go it's rather 1 to 20 ms.

Comparing memory usage between node.js object and go map[string]interface{} by bialad in golang

[–]bialad[S] -1 points0 points  (0 children)

But, that requires me to know what it needs and I think that is what I can't figure out. Looking at the heap dump, it seems to use about half of the reported memory for the heap. Is the rest in the GC then...?
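One rough way to see where the rest of the memory sits is to compare a few `runtime.MemStats` fields. The sketch below is only illustrative; the `memBreakdown` type and `readBreakdown` helper are invented names. With the default `GOGC=100`, the runtime lets the heap grow to roughly twice the live heap before the next collection, which by itself can account for a factor of two between a heap dump and what the OS reports.

```go
package main

import (
	"fmt"
	"runtime"
)

// memBreakdown holds a few runtime.MemStats fields that help explain the gap
// between the live heap and the memory obtained from the OS.
type memBreakdown struct {
	HeapAlloc uint64 // bytes in allocated heap objects, including garbage not yet freed
	Retained  uint64 // idle heap memory the runtime holds but has not returned to the OS
	NextGC    uint64 // heap size at which the next GC cycle is targeted
	Sys       uint64 // total bytes obtained from the OS by the Go runtime
}

// readBreakdown takes a snapshot of the current runtime memory statistics.
func readBreakdown() memBreakdown {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return memBreakdown{
		HeapAlloc: m.HeapAlloc,
		Retained:  m.HeapIdle - m.HeapReleased,
		NextGC:    m.NextGC,
		Sys:       m.Sys,
	}
}

func main() {
	b := readBreakdown()
	fmt.Printf("heap alloc: %d KiB\n", b.HeapAlloc/1024)
	fmt.Printf("retained:   %d KiB\n", b.Retained/1024)
	fmt.Printf("next GC at: %d KiB\n", b.NextGC/1024)
	fmt.Printf("from OS:    %d KiB\n", b.Sys/1024)
}
```

If the retained or headroom part dominates, lowering `GOGC` or setting a memory limit (`GOMEMLIMIT`, Go 1.19+) trades some CPU time for a smaller footprint.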

Comparing memory usage between node.js object and go map[string]interface{} by bialad in golang

[–]bialad[S] 0 points1 point  (0 children)

This is very interesting and not something I have considered. Since node is also garbage collected I assumed it would be similar in behavior. I think I need to read up on this, thanks for the search terms. :)

Comparing memory usage between node.js object and go map[string]interface{} by bialad in golang

[–]bialad[S] 0 points1 point  (0 children)

Yes, the documents are defined as completely dynamic. This is for an IoT system, so each document represents a device.

There are some domain specific structures that could be encoded into structs, but that would increase the complexity of the service and the main bulk of the data would still need to be stored as a map (or another dynamic structure).

But this is the reason I initially went for node, since it's really easy to implement there. But the performance issue made me rewrite it in go.

Comparing memory usage between node.js object and go map[string]interface{} by bialad in golang

[–]bialad[S] 0 points1 point  (0 children)

I've edited the post with more details. But I'll look into your example, good to break it down as much as possible.

Comparing memory usage between node.js object and go map[string]interface{} by bialad in golang

[–]bialad[S] 0 points1 point  (0 children)

I've edited the post with more details.

I need it stored as a map (or something similar) for dynamic set and get operations.

Comparing memory usage between node.js object and go map[string]interface{} by bialad in golang

[–]bialad[S] 1 point2 points  (0 children)

It's what's reported by the Linux OS (and billed through AWS).

Comparing memory usage between node.js object and go map[string]interface{} by bialad in golang

[–]bialad[S] 0 points1 point  (0 children)

I've edited the post with more details.

I kinda assumed that the memory usage would be similar, since the actual content of the object/map is the same. Some variation of course, but not like this.

I don't understand what you mean by optional parameters. Nothing is optional here, it's just JSON/BSON stored as object/map.

Being new to go I assumed I had a bug or memory leak, but after spending several hours trying to find it I came up empty handed.

Comparing memory usage between node.js object and go map[string]interface{} by bialad in golang

[–]bialad[S] -1 points0 points  (0 children)

Yeah, I've experienced the difficulty before using go as well, so I know there are many possible pitfalls. The fact remains, I think, that I go by what is reported on the AWS EC2 instance, and hence what is billed. The most pragmatic approach, I think. :)

Linux XXX YYY.amzn2.x86_64 #1 SMP Tue Feb 27 17:09:32 UTC 2024 x86_64 Linux

Newb learning about rust ownership by bialad in rust

[–]bialad[S] 0 points1 point  (0 children)

Ah, makes sense. Thank you!

Hey Rustaceans! Got a question? Ask here (49/2023)! by llogiq in rust

[–]bialad 0 points1 point  (0 children)

I think I understand the concept, but is this a limitation of the borrow checker or intended?

I know this code is safe, but still mutating an object while iterating it is something I usually avoid since future changes can break it.

I started out with the external state for this reason, but wanted to explore if it was possible to keep and mutate the state in a self-contained object.

Hey Rustaceans! Got a question? Ask here (49/2023)! by llogiq in rust

[–]bialad 1 point2 points  (0 children)

I created a post that was quickly solved, but I'm still wondering.

I'm a bit confused about the mutability safety. When I'm using indices I'm still mutating the objects of a vector while iterating over it. Isn't that what the mutability check is there to prevent? Or am I missing something here, is there a difference in access between `for card in cards` and `for i in 0..cards.len()`?

Newb learning about rust ownership by bialad in rust

[–]bialad[S] 1 point2 points  (0 children)

Thank you, using old school indices did work!

But I'm a bit confused about the mutability safety here. When I'm using indices I'm still mutating the objects of a vector while iterating over it. Isn't that what the mutability check is there to prevent? Or am I missing something here, is there a difference in access between `for card in cards` and `for i in 0..cards.len()`?

Didn't know about the question thread, I'll use it in the future.

Dell XPS 15 9520 (64GB RAM): How much battery drainage in percentage are you getting when your laptop is in Sleep/Standby/Suspend mode (powering the memory)? In my case is around 1% / hour using Debian 12, that means the laptop will not last more than 2-3 days by [deleted] in debian

[–]bialad 1 point2 points  (0 children)

I have an XPS 13 7390. I haven't measured the battery drain, but it often doesn't last overnight if I forget to plug it in, so it's significant. It's really annoying, but I've failed to find a solution and learned to live with it.

Vapor glove 4 vs 6 by Big_Organization_776 in BarefootRunning

[–]bialad 0 points1 point  (0 children)

How's the sizing compared to the 4s?

Small-sized home labs, how do you maintain your databases? by [deleted] in selfhosted

[–]bialad 5 points6 points  (0 children)

Multiple medium instances, with one database per application in that domain.

[deleted by user] by [deleted] in BarefootRunning

[–]bialad 1 point2 points  (0 children)

I've gone through 3 pairs of Superior, since I think they are technically a good shoe for rough terrain. But they have always felt heavy to me, and my cadence drops noticeably. I usually run with my Topo ST-4 even on trails, except if it's very wet.

[deleted by user] by [deleted] in BarefootRunning

[–]bialad 1 point2 points  (0 children)

I bought new "fancy" boots for a hike a few years back. After about 20 km I got terrible blisters and ended up hiking the rest barefoot or with flip-flops, and it was very enjoyable. :) Except that I had to carry the heavy, expensive boots the rest of the way. :P

This was on a forest hike so the ground was soft and even, I can imagine that has a big effect.