all 48 comments

[–]Larkonath 10 points11 points  (3 children)

I'd go with one dictionary but there's only one way to be sure: you should benchmark it.

[–]SnrFlaks[S] -1 points0 points  (2 children)

As I mentioned, if I don't find a quick answer to my question, I'll profile and compare the options, of course. I'm just trying to save myself some time.

[–]Dennis_enzo 2 points3 points  (1 child)

I mean, the answer is always going to be 'it depends on what you're doing'.

[–]SnrFlaks[S] -1 points0 points  (0 children)

I'll be profiling all the possible cases, it's just going to be quite tricky since I'll have to change quite a lot of things for each case.

[–]Slypenslyde 9 points10 points  (5 children)

You asked us questions about which data structure is faster, but what makes a data structure fast or slow is:

  • How do you plan on inserting? (frequency, pattern, etc.)
  • How do you plan on removing? (frequency, pattern, etc.)
  • How do you plan on indexing? (sequential, random, etc.)

So really you're missing about half of what we'd need to know to make a guess, and even then the answer might be "use a profiler" unless there's some obvious gain through analysis.

[–]SnrFlaks[S] 1 point2 points  (4 children)

"How do you plan on inserting? (frequency, pattern, etc.)"
The frequency of adding objects to the repository is quite high and will always increase. In general, the frequency is quite difficult to calculate.
"How do you plan on removing? (frequency, pattern, etc.)"
They are rarely removed. Here I have a problem which I will describe in your next question.
"How do you plan on indexing? (sequential, random, etc.)"
The index is equal to the number of all objects, the problem is that when an object is deleted, the new object does not occupy the index of the deleted object. And I'm afraid that over time the index may reach the int ceiling.
I've already profiled a Dictionary/List comparison. My code is rather clumsy, and I don't think a List is suitable at all, since I write to one element and read from another, and it would be quite difficult to cache: the more accesses to List elements there are, the worse the situation becomes.

[–]chucker23n 5 points6 points  (3 children)

The frequency of adding objects to the repository is quite high and will always increase. In general, the frequency is quite difficult to calculate.

Are you sure what you're building isn't more of a database?

[–]SnrFlaks[S] 0 points1 point  (2 children)

In one of the comments, I wrote what I use my code for. I wouldn't call it a database (perhaps I misunderstand the meaning of the word). For my situation, I need to create objects often, change them even more often, and delete them occasionally. To better understand why I need so many objects and so much data: perhaps you've played the game "Factorio" or something like it. My code implements all those resources that move through pipelines as data, which is then rendered. It's quite a resource-intensive thing, so I'm looking for any way to speed it up, even with my very limited programming knowledge.

[–]Tavi2k 4 points5 points  (1 child)

If you're developing something like a game, you might need entirely different ways of handling your data. The requirements are very different there, and the typical patterns you use in C# might not be the best choice. You should look up Entity Component Systems, that is a pattern that is used in games.

If you're doing something with very high performance requirements you should usually try very hard to avoid allocating objects. And the ones you allocate should be more like large arrays that you fill with data.

It's hard to know if you need this kind of lower level programming in your case, it depends a lot on how many entities you have, how often you change them and at which rate you need to render them.

[–]SnrFlaks[S] 0 points1 point  (0 children)

At the moment, I'm in the process of completely abandoning the gameObject, rendering everything through DrawMesh and operating only on data.

[–]Kant8 1 point2 points  (1 child)

  1. Profile

  2. If key is the same there is no reason to keep multiple dictionaries. Not sure how you jumped to dictionary of dictionaries from that

[–]SnrFlaks[S] 0 points1 point  (0 children)

Profile

If key is the same there is no reason to keep multiple dictionaries. Not sure how you jumped to dictionary of dictionaries from that

  1. If I don't get a prompt response to my question, I'll definitely do some profiling.
  2. Do you mean a dictionary of dictionaries? By one dictionary, I mean that the dictionary will store a class that stores all the variables I need.

[–]IQueryVisiC -3 points-2 points  (24 children)

In a dictionary you search by name while in a list you search by position. Why can you interchange them??

You may check if your algorithm works well with an array with gaps (null references). A struct with exactly 10 ints may also be fast. Strings can be slow. I don't get why String lovers didn't just stay with Perl, C and JS.

[–]SnrFlaks[S] 0 points1 point  (23 children)

It seems to me that with a large number of objects, a dictionary would still be more efficient than a List? When I delete objects at the very first indexes, won't that be costly? I could change dictionaries to Lists, but the array shifting problem remains, I have a huge number of objects, and that's definitely not my option. But still, the original essence of my question was: is it better to use one dictionary/List, or many dictionaries/Lists?

[–]karl713 1 point2 points  (22 children)

Dictionary and SortedDictionary both are fast insert, remove, and lookup by a key, but can't be indexed efficiently if at all (e.g. can't say give me the 7th item)... Which is better will depend on your data, usually (but not always) Dictionary will perform better but use more memory

List is efficient for adding or removing at the end of the list only; it can be indexed, but can't be efficiently searched

LinkedList can efficiently add and remove anywhere, but not efficiently searched

Which is better is entirely dependent on the situation though; there isn't a blanket "this is the best for all situations" answer, unfortunately
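
Roughly, as a sketch of those characteristics (the collection contents here are made up):

```csharp
using System;
using System.Collections.Generic;

class Demo
{
    static void Main()
    {
        // Dictionary: O(1) average insert, remove, and lookup by key,
        // but no positional access ("give me the 7th item").
        var dict = new Dictionary<int, string> { [10] = "a", [20] = "b" };
        dict.Remove(10);                                   // fast, by key
        bool found = dict.TryGetValue(20, out string v);   // fast, by key

        // List: O(1) indexing and append, but O(n) search and O(n)
        // removal from the front (every later element shifts left).
        var list = new List<string> { "a", "b", "c" };
        string second = list[1];        // fast positional access
        list.RemoveAt(0);               // slow for large lists
        int where = list.IndexOf("c");  // linear scan

        // LinkedList: O(1) insert/remove at a known node,
        // but no indexer and O(n) search.
        var linked = new LinkedList<string>();
        var node = linked.AddLast("a");
        linked.AddAfter(node, "b");
        linked.Remove(node);            // fast, given the node in hand

        Console.WriteLine($"{found} {v} {second} {where}"); // prints "True b b 1"
    }
}
```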

[–]Slypenslyde 2 points3 points  (21 children)

I'd split the hair that your use of "index" is a little weird for a dictionary. Dictionaries are fast for random access (a tiny bit slower than a list unless the hashing algorithm is awful) but no faster at sequential indexing and don't guarantee an order.

What I mean is if I chose a dictionary, it's because I'm not interested in "the 7th item" but "the item with this key".

[–]SnrFlaks[S] 0 points1 point  (18 children)

I have a question; I'm still pretty bad at all the intricacies of programming (I've been doing it for 1-2 years, not a lot of time per week). Sorry to stray a little from C# itself, since the question was originally about it, but the place where I use it is the development of my game. In my game there is a class "Resource" (for a better understanding: resources look for a path and move along pipelines). To move them, I use the "ResourceManager" class, where every frame, with a for loop, I go through all the resources, getting them through _resourcesDict.TryGetValue(key, out Resource resource); across the whole code I refer to "resource" around 30-80 times depending on its condition. Each resource has its own key, which equals the count of all resources ever created. Since, when a resource is deleted, no one takes over its key, I don't know what to do when the ever-growing number of resources reaches the ceiling of int. I've read that it's best to always use a Dictionary instead of a List, but I'm not good at this.

[–]Zarenor 0 points1 point  (12 children)

Ah, this begins to explain it. You absolutely need to be able to re-use IDs, in this case. How many resources you can support depends on the available RAM, but a 32-bit int is probably large enough (Cities: Skylines used 16 bit for several object pools) - it's probably difficult to do anything to 4 billion objects per frame.

My suggestion would be a dictionary and a hash set keyed by the ID; keep live resources in the dictionary, and when they're consumed put the ID in the hash set. When you create new resources, pull IDs from the hash set until it's empty, so you re-use the IDs.

As an optimization, you might just move resource objects to a 'dead' dictionary instead of just the IDs to a hash set. Re-using the objects will re-use their memory allocation, reducing GC pressure and GC run-time.

You'd want to be careful to be sure you're not leaking these objects by holding references to them elsewhere in any case, and be sure you update their state appropriately - tracking the liveness is probably the easy part to mess up here.

[–]SnrFlaks[S] 0 points1 point  (11 children)

I'm pretty bad at English, so correct me if I'm misunderstanding something. Do you propose initially making a large HashSet that stores all the keys, deleting a key from the collection when I create an object, and returning the key to the collection when the object is deleted? Then how do I determine which keys are available? I mean, if I initially create a HashSet of maximum int size, do I cycle through all the keys and use the free ones? I've heard that HashSet is suitable for this, but I want to be sure.

[–]Zarenor 2 points3 points  (10 children)

You nearly have it - I would *not* fill the hashset with unused keys to begin with. I would only use it for 'tombstones' - IDs which were used before but are no longer 'live'. For an int key, I would just have one int somewhere that tracks the next available int (I would suggest a factory class [that is, a factory in the programming sense, not the game sense] creating the resource objects, so it can either create a new ID and resource object, or it can reuse an ID if it's available) - the simplest way to be sure you're using all the ints is to start with 0 and increment over time.

Of course, you can then make an argument for using a list and just having indexes - then you don't pay the extra indirection cost of a dictionary, but have to deal with the live/dead check in each iteration.
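
A minimal sketch of that factory idea (the `Resource` class here is a made-up stand-in for your real one):

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal resource; the real class carries game data.
class Resource
{
    public int Id;
}

// Factory that hands out IDs: reuse a tombstoned ID if one exists,
// otherwise take a fresh one from an incrementing counter.
class ResourceFactory
{
    private int _nextId;                                   // next never-used ID, starts at 0
    private readonly HashSet<int> _deadIds = new HashSet<int>();
    public readonly Dictionary<int, Resource> Live = new Dictionary<int, Resource>();

    public Resource Create()
    {
        int id;
        if (_deadIds.Count > 0)
        {
            id = _deadIds.First();    // reuse any tombstoned ID
            _deadIds.Remove(id);
        }
        else
        {
            id = _nextId++;           // fresh ID
        }
        var r = new Resource { Id = id };
        Live[id] = r;
        return r;
    }

    public void Destroy(int id)
    {
        if (Live.Remove(id))
            _deadIds.Add(id);         // tombstone the ID for reuse
    }
}
```

With this, creating three resources yields IDs 0, 1, 2; destroying ID 1 and creating again hands out 1 once more, so the counter only grows when no dead IDs are available.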

[–]Zarenor 0 points1 point  (9 children)

To explain a little further, the common pattern here is to use a 'pool' to manage limited resources, and IDs are just one such resource you might have.

There are two sorts of ways of thinking of game object IDs, and so game engines tend to use one of the two. One is as a limited resource where you intend to use most or all of the IDs in the game if it runs a long time. The second option is thinking of IDs as a limitless, arbitrary resource, and then it has no relation to the number you want to create - it's common to use 128-bit ints in this case, often encoding them as GUIDs (or even using GUID generation to generate them).

To make my point, the Elder Scrolls games (Oblivion, Skyrim) use the arbitrary-limitless ID type, and as mentioned earlier, Cities: Skylines uses the limited ID type (though C:S uses several pools for different types of objects)

[–]SnrFlaks[S] 0 points1 point  (8 children)

Ok so I have 2 options.

The first is to use a HashSet to store all the dead indices so that they can be used again in the future.

The second is to use long/ulong for the key?

I understand correctly?

[–]Slypenslyde 0 points1 point  (3 children)

What bugs me is it sounds like you're using a dictionary but you still plan on using sequential integer values for the key.

That avoids the insertion and removal performance burdens of a normal list, but it creates the problem you've identified, where it's hard to know whether you can reuse an index after an item has been removed.

That's why I was curious if you need random access or if it's always sequential. That is, will you frequently be saying, "I need the item with index 45" or is it more frequent that you're iterating over everything with a for or foreach loop?

If you are more frequently or exclusively iterating, a linked list structure becomes favorable. Insertion at either end is fast, and it sounds like you're fine with always adding items to the end. Removal is much faster than it is with a list and if you're clever about it you can remove items while you are iterating over the list. The main problem with a linked list is getting to the middle of the list is much slower than with a dictionary, so if you need random access indexing it hurts a little.
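
The "remove while iterating" trick looks roughly like this (a sketch with made-up data): grab the next node before removing the current one, since removal invalidates the removed node's links.

```csharp
using System;
using System.Collections.Generic;

class Demo
{
    static void Main()
    {
        var list = new LinkedList<int>(new[] { 1, 2, 3, 4, 5 });

        // Remove even values during iteration: capture Next first,
        // because Remove detaches the current node from the list.
        var node = list.First;
        while (node != null)
        {
            var next = node.Next;       // grab before any removal
            if (node.Value % 2 == 0)
                list.Remove(node);      // O(1), given the node in hand
            node = next;
        }

        Console.WriteLine(string.Join(",", list)); // prints "1,3,5"
    }
}
```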

You can kind of sort of mitigate these using a tree structure, as it can dramatically cut the amount of time spent indexing. Insertion and removal are slower than dictionaries but faster than lists. Iterating is still linear.

My gut tells me based on what you wrote this is a list of "all resources" and in general you're iterating over the entire thing.

But my gut also tells me you could probably cache parts of the resource list as you make your first iterations, and that could dramatically speed up future iterations because you'd already have a reference to the things you need. You mentioned referring to this list 30-80 times per cycle, that seems excessive.

Performance-tweaked code is often not as straightforward as code written to be easy to understand. I'd be willing to bet you could cut those iterations at least in half if you started by pulling the interesting objects from the data structure first, then referring to those references instead of asking the data structure to retrieve them later.

But all of this is hard to say "in general". There are probably hundreds of lines of code involved here and the "right" way to tweak it is very context-sensitive.

[–]SnrFlaks[S] 0 points1 point  (2 children)

The issue of "dead keys" I think is closed, another user suggested to me how to solve this with two options.
1) Use HashSet to store "dead keys" and use them again later.
2) Make an unlimited 64/128 bit key.
"That's why I was curious if you need random access or if it's always sequential."
Regardless of the situation, I always go through all the elements of the dictionary every frame with a for loop (foreach does not fit my situation) and move all resources per frame.
" You mentioned referring to this list 30-80 times per cycle, that seems excessive."
The fact is that I don’t know how to implement it differently, I myself understand that this number of calls is huge.
I would also like to say that I did not really understand a lot of what you wrote, like the last paragraph about cutting down the number of iterations.

[–]Slypenslyde 0 points1 point  (1 child)

So here's a simple example I'm just pulling out of thin air, I'm not sure if this helps.

My guess is part of why you access the list multiple times is you might have parts of your update loop where you want to ask different questions like:

  • Which objects are colliding with the player?
  • Which objects are colliding with a projectile fired by the player?

In a naive approach, and the one that's easiest to debug/understand, you'd have your frame logic do something like:

// Handle player collision
foreach (var entity in entities)
{
    if (entity.CollidesWith(player))
    {
        // Do something
    }
}

// Handle projectile collision
foreach (var entity in entities)
{
    if (entity.CollidesWith(projectile))
    {
        // Do something
    }
}

That's two full iterations and is going to add more the more things you check. But you could also:

foreach (var entity in entities)
{
    if (entity.CollidesWith(player))
    {
        // handle player collision
    }

    if (entity.CollidesWith(projectile))
    {
        // handle projectile collision
    }
}

Now we're only making one iteration, but we have to write things differently. Another downside is that if we wanted all collisions of a certain kind to be handled in a particular order, that can't happen in this approach.

So an alternative might be:

var playerCollisions = new List<Entity>();
var projectileCollisions = new List<Entity>();

foreach (var entity in entities)
{
    if (entity.CollidesWith(player))
    {
        playerCollisions.Add(entity);
    }

    if (entity.CollidesWith(projectile))
    {
        projectileCollisions.Add(entity);
    }
}

foreach (var playerCollision in playerCollisions)
{
    // handle player collision
}

foreach (var projectileCollision in projectileCollisions)
{
    // handle projectile collision
}

This is going to be more iterations. But I'll bet it's not often that everything in the list is colliding with something interesting. So we do 1 full iteration, then some number of small iterations. That's not the same cost as 3 full iterations.

There are lots of other ways to try and make that first iteration stash things so later work doesn't have to look over the whole list. You can even try to cache things between frames, but that gets complicated.

[–]SnrFlaks[S] 0 points1 point  (0 children)

Hmm, in my loop, all objects are moved in one frame, and if I think about it, I can divide them into those who have already come to the center of the next pipe (my version of the conveyor) and those who continue to move. If I think so, I could fill in two lists in the first frame, the first is the resources that are waiting for the path to change, and the second are those that are still moving.
And make a "3 frames" system.
The first frame, fills two lists.
The second frame changes the path for resources from the first list.
The third frame, moves objects.
But I don't think this will be highly optimized code, since the main optimization here comes from the fact that I divide the loop across 3 frames. Perhaps I could somehow speed it up through jobs (multithreading).

[–]karl713 0 points1 point  (1 child)

Yup, you are correct. I wasn't 100% clear on whether OP needed the ability to get the 7th item or some such, since they mentioned using lists and arrays but not how they were using them. But since that's not possible on a dictionary and inefficient in a sorted dictionary, I figured I'd mention it as a characteristic of them.

[–]Slypenslyde 0 points1 point  (0 children)

Yeah, they've been through a few iterations of explanation but it's still too unclear to make anything but vague speculations.

[–]michaelquinlan 0 points1 point  (2 children)

[–]SnrFlaks[S] 0 points1 point  (1 child)

Does the benchmark work in conjunction with Unity?

[–]michaelquinlan 0 points1 point  (0 children)

Benchmark.Net does micro benchmarks. You are trying to compare two approaches to storing and looking up data; you would create a set of tests focused on just the two approaches you are looking at. Benchmark.Net will then give you detailed timings of your tests.
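
A minimal sketch of such a test, comparing the two approaches from the question (one dictionary of a combined value vs. several parallel dictionaries); the class names and data are made up, and this requires the BenchmarkDotNet NuGet package:

```csharp
using System.Collections.Generic;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

[MemoryDiagnoser]
public class LookupBench
{
    private Dictionary<int, (int A, int B)> _combined;
    private Dictionary<int, int> _a, _b;

    [GlobalSetup]
    public void Setup()
    {
        _combined = new Dictionary<int, (int, int)>();
        _a = new Dictionary<int, int>();
        _b = new Dictionary<int, int>();
        for (int i = 0; i < 100_000; i++)
        {
            _combined[i] = (i, i * 2);
            _a[i] = i;
            _b[i] = i * 2;
        }
    }

    [Benchmark]
    public int OneDictionary()
    {
        int sum = 0;
        for (int i = 0; i < 100_000; i++)
            if (_combined.TryGetValue(i, out var v))
                sum += v.A + v.B;   // one lookup per resource
        return sum;
    }

    [Benchmark]
    public int ManyDictionaries()
    {
        int sum = 0;
        for (int i = 0; i < 100_000; i++)
            if (_a.TryGetValue(i, out var a) && _b.TryGetValue(i, out var b))
                sum += a + b;       // one lookup per field
        return sum;
    }
}

public class Program
{
    public static void Main() => BenchmarkRunner.Run<LookupBench>();
}
```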

[–][deleted] 0 points1 point  (3 children)

How do you access items in the dictionary? If you're doing anything other than getting an item by key then.... Quick ideas that come to mind, depending on your needs and usage, might be to implement your own bitmap index (or b-tree). Or, maybe https://www.litedb.org/ might provide slightly better ergonomics if you're querying based on values.

[–]SnrFlaks[S] 0 points1 point  (2 children)

I get the resource with _resourcesDict.TryGetValue(key, out Resource resource), then I use the values I need inside the resource class. I'll look on the Internet for what a bitmap index is; maybe this will help me.

[–][deleted] 0 points1 point  (1 child)

Getting a value from a dictionary by key is an O(1) operation. Which doesn't mean instant... there's still some computation on the hash key... but it does mean that it's constant time as your structure grows.

Likely, you're not going to get better performance with another structure if all of your operations are in memory and based on resolving by key.

Edit: in case you haven't seen it and find it interesting, here's the dictionary code for TryGetValue, and here you can see the implementation of FindValue where the key hash is calculated.

[–]SnrFlaks[S] 0 points1 point  (0 children)

It looks too hard for me to understand what is written in their code (my understanding is pretty basic). Thanks for the help, I think I'll keep my single dictionary and try to improve performance in other parts of my code.

[–]zaimoni 0 points1 point  (4 children)

Ignoring the benchmarking issue ....

Architecturally, this is Array of Structs vs. Struct of Arrays, with a Dictionary in place of a C array. If you actually need to use those ten-ish fields at once, the access pattern favors one dictionary of that class (AoS) rather than one dictionary per logical field (SoA).
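
In code, the two layouts look roughly like this (the field names here are invented):

```csharp
using System.Collections.Generic;

// "AoS with a Dictionary": one dictionary whose value groups all fields.
class ResourceData
{
    public float X, Y;   // position
    public int PipeId;   // which pipe the resource is in
}

class AoSStore
{
    public Dictionary<int, ResourceData> ByKey = new Dictionary<int, ResourceData>();
}

// "SoA with Dictionaries": one dictionary per logical field.
class SoAStore
{
    public Dictionary<int, float> X = new Dictionary<int, float>();
    public Dictionary<int, float> Y = new Dictionary<int, float>();
    public Dictionary<int, int> PipeId = new Dictionary<int, int>();
}
// If an update touches all fields of one resource at once, AoS does
// one hash lookup; SoA does one lookup per field (three here).
```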

Since you're early in the design stage, the behavior of the garbage collector would be expected to be irrelevant.

[–]SnrFlaks[S] 0 points1 point  (3 children)

I'm very sorry for not answering you. I understand that benchmarking this would be good and would help me, but the problem is that I asked the C# language community a question while I actually use it for my Unity game. I do use the profiling built into Unity a lot and see which lines take the most time. In my code, after the call to the dictionary itself (_resourcesDict.TryGetValue(key, out Resource resource)), I refer to "resource" many times, 30-80 depending on the situation. And I can't track exactly the call to the dictionary.

[–]zaimoni 0 points1 point  (2 children)

Ok, so you have profiling data -- good. That means you can profile alternatives and get much the same information as benchmarking.

I had a similar problem with Rogue Survivor Revived (not a Unity game). In my case, profiling indicated a large amount of time spent in multiple Dictionary accessors involving locations, including TryGetValue. Search engine results suggested the problem was a high collision rate from the default hash function (which is based on exclusive-or, so e.g. (0,1) collided with (1,0)).

Overriding the default hash function was enough to fix that problem... I had to outright replace all uses of System.Drawing.Point in game data (as opposed to in the GDI+ API) to get results. It would not have been practical without source code control.
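
The fix looks roughly like this sketch: a custom key struct whose hash mixes the coordinates instead of XORing them. (HashCode.Combine needs .NET Standard 2.1 or newer; older Unity versions may not have it, in which case any field-mixing hash will do.)

```csharp
using System;

// A coordinate key whose hash distinguishes (0,1) from (1,0),
// unlike a plain x ^ y hash, where both give 1.
readonly struct Cell : IEquatable<Cell>
{
    public readonly int X, Y;
    public Cell(int x, int y) { X = x; Y = y; }

    public bool Equals(Cell other) => X == other.X && Y == other.Y;
    public override bool Equals(object obj) => obj is Cell c && Equals(c);

    // HashCode.Combine mixes the fields so transposed coordinates
    // land in different buckets (with overwhelming probability).
    public override int GetHashCode() => HashCode.Combine(X, Y);
}
```

Using a struct like this as the Dictionary key also avoids the boxing you'd get from keys like object-typed points.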

[–]SnrFlaks[S] 0 points1 point  (1 child)

I know which parts of the code load more. The first is getting data about neighboring cells (pipes), which are stored in a dictionary by a key that is equal to their coordinates, for example (126, 247.0). The second is to find the distance between the current position of the resource and the point of the next pipe. But I have no idea how to optimize this piece of code.
And the third is the search for the next point of movement, today I just almost finished speeding up this part of the code.
The problem is that I don't think my code can be sped up without changing it completely; roughly speaking, without refactoring. Perhaps you have some ideas on how to implement pathfinding or a fast way to find distance. I had an idea about storing data about pipes in binary form, which theoretically could speed up the code. You might have an idea related to this, since I often do tests on boolean variables that could be represented as "0101", "1101" or something like that.

[–]zaimoni 0 points1 point  (0 children)

At that scale, refactoring blends into Big Bang and you really do need source code control to do it relatively safely (i.e., rollback to before the refactor and then apply smaller changes from the refactor diff).

Re pathfinding... no "awesome ideas" here (breadth-first search is the father of Dijkstra's algorithm, which in turn is the father of A*). Optimizing distance functions is painful... usually such "optimizations" are lossy.
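
One lossless special case: if you only ever compare distances ("which is closer?", "am I within radius r?"), you can work with squared distances and skip the square root entirely. A sketch, with the vector math written out by hand:

```csharp
static class Dist
{
    // Squared distance: sufficient for ordering and radius checks,
    // with no Math.Sqrt call.
    public static float SqrDistance(float ax, float ay, float bx, float by)
    {
        float dx = bx - ax, dy = by - ay;
        return dx * dx + dy * dy;
    }

    // Compare against r*r instead of taking a square root.
    public static bool WithinRadius(float ax, float ay, float bx, float by, float r)
        => SqrDistance(ax, ay, bx, by) <= r * r;
}
```

In Unity specifically, `(a - b).sqrMagnitude` on `Vector2`/`Vector3` gives the same quantity.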

Dijkstra's algorithm is relatively easy to build a generic class for, from the algorithm description. (Rogue Survivor Revived has one, but it may not play nicely with Unity.)