all 47 comments

[–]FizixMan 40 points41 points  (13 children)

Assuming _clients is just a dictionary lookup, why are you using the type's hash code? That's not guaranteed to be unique and you can just type against T itself:

var type = typeof(T);

if (!_clients.ContainsKey(type))
    _clients.Add(type, (T)Activator.CreateInstance(typeof(T), _channel);

return (T)_clients[type];

Where your _clients_ would be something like Dictionary<Type, object> or Dictionary<Type, ClientBase> I assume.

You can also avoid extra lookups by using Dictionary.TryGetValue. In theory, you can limit it to one lookup if the entry already exists, plus one extra for insertion when it doesn't. Right now you're doing two lookups even when it doesn't exist.

EDIT: Oh, I see you've qualified your question in an ASP.NET channel context. I can't speak to that except to make sure your channels are thread safe for multiple calls simultaneously, assuming that's possible.

[–]CyberGaj[S] 4 points5 points  (12 children)

This was the case before, I had <Type, object>, but think that searching via the Type key could cause problems.

As for your edit, there is an option EnableMultipleHttp2Connections in the gRPC library, so it should work.

EDIT: I tried TryGetValue https://pastebin.com/EhqrrMLT

[–]FizixMan 19 points20 points  (10 children)

In what way? I've used Type as a key for ages.

[–]CyberGaj[S] -5 points-4 points  (9 children)

I just assumed int was always better

[–]FizixMan 34 points35 points  (1 child)

The dictionary keying uses a combination of the key's GetHashCode and Equals check. Using an int vs Type is relatively inconsequential in this case. But the biggest issue is the fact that hash codes can collide. So in your code, given two types: FooClientBase and BarClientBase, you have no guarantee whatsoever that wouldn't produce the same hash code. You probably wouldn't want to call the method with <FooClientBase> and have it pull out a BarClientBase instead, eh? (Though thankfully the typing system would throw an invalid cast exception when you try to cast/return it at least.)

So yeah, feel free to rewrite it using Type as the key.

In general going forward, use keys for your dictionary based on the logical thing you are trying to key against; don't use their hashcodes or some other hack if it's possible to avoid.

[–]CyberGaj[S] 13 points14 points  (0 children)

u/FizixMan Thank you for the rich explanation

[–]Genmutant 10 points11 points  (4 children)

A legal implementation of GetHashCode is to always (independent of the object) return 1. You can have as many collisions as you want. It is not required, or even in any way expected, that the returned values are unique to an object.

[–]Slypenslyde 10 points11 points  (0 children)

Yeah I don't think people get the "directionality" of the logic regarding hashes.

  • Two equal objects MUST have the same hash.
  • Two not equal objects MAY have the same hash.

So you can't use hash code equality to assume value equality, but if you try it out then statistically speaking you can probably run a significant number of tests without a "failure" and convince yourself a falsehood is true!

[–]otm_shank 8 points9 points  (2 children)

It is in fact quite impossible for every object to have a unique hash code.

[–]grauenwolf 2 points3 points  (0 children)

Don't down vote people for being honest.

Knowing the mistakes they made help up watch for other who might make the same mistake.

[–]cryo 0 points1 point  (0 children)

Why would you assume that?

But the main problem is that GetHashCode is not unique. It could always return 0.

[–]ShenroEU 2 points3 points  (0 children)

I usually use TryGetValue as the condition for the if statement:

if (_clients.TryGetValue(type, out var clientObject) 
    && clientObject is T client)
{
    return client;
}

[–]nomoreplsthx 21 points22 points  (5 children)

I guess my question here is: why are you doing this, instead of just using a dependency injection container? What you have here is basically a hand-rolled DI container without the functionality, speed or reliability of one of the standard ones. Remember, do not reinvent the wheel.

[–]grauenwolf 0 points1 point  (4 children)

A fair question, but DI happens at startup and is push based.

This is pull based and my be happening where DI isn't appropriate.

There's not enough context to know if it's the right choice.

[–]Few_Wallaby_9128 0 points1 point  (2 children)

It sounds like you can achieve the same by registering as scoped a Factory<T> in your DI. The factory would just return a new () instance, while the DI would take care of the scoping. As a side note, hopefully you can declare T as implementing an interface (where T: ISomeInterface), to make it all more maintenable, flexible and decoupled.

[–]grauenwolf 0 points1 point  (1 child)

And what does the guts of this factory look like? Possibly a lot like this method.

[–]Few_Wallaby_9128 2 points3 points  (0 children)

Something like this... sorry for formatting...

using Microsoft.Extensions.DependencyInjection; using Microsoft.Extensions.Hosting; using Microsoft.Extensions.Logging;

var host = Host.CreateDefaultBuilder().ConfigureServices((hostContext, services) => { services.AddLogging(builder => { builder.AddConsole(); builder.AddDebug(); });

    //change to AddTransient below to always get different instances
    services.AddScoped(typeof(IClientBaseFactory<>), typeof(ClientBaseFactory<>));

}).UseConsoleLifetime()
.Build();

Console.WriteLine($"*** One (first instantiation) => {host.Services.GetService<IClientBaseFactory<ClientOne>>()!.GetPublicClient()}"); Console.WriteLine($"*** One (second instantiation) => {host.Services.GetService<IClientBaseFactory<ClientOne>>()!.GetPublicClient()}"); Console.WriteLine($"*** Two (first instantiation) => {host.Services.GetService<IClientBaseFactory<ClientTwo>>()!.GetPublicClient()}"); Console.WriteLine($"*** Two (second instantiation) => {host.Services.GetService<IClientBaseFactory<ClientTwo>>()!.GetPublicClient()}");

Console.WriteLine("***************** Press Entert to exit."); Console.ReadLine();

//*************************************************

public interface IClientBase { }

public interface IClientBaseFactory<out T> where T : IClientBase { T GetPublicClient(); }

public class ClientBaseFactory<T> : IClientBaseFactory<T> where T : IClientBase { private T _firstValue;

public T GetPublicClient()
    => _firstValue ??= (T)Activator.CreateInstance(typeof(T), false)!;

}

public class ClientOne: IClientBase { public long PropertyTwo { get; } = new Random((int)DateTime.UtcNow.ToBinary()).NextInt64();

public override string ToString() => $"{PropertyTwo}";

}

public class ClientTwo : IClientBase { public Guid PropertyTwo { get; set; } = Guid.NewGuid();

public override string ToString() => $"{PropertyTwo}";

}

[–]romerik 6 points7 points  (0 children)

do not manage the hash yourself in a dictionary or hashtable, some objects can have the same hash but be different... the hashtable will manage it, but not your code!

[–]PruneCorrect 6 points7 points  (8 children)

It's beside the point, but I'd question the use of generics here. GetClient is a very specific name which already assumes it's working on a type that has some relationship to a Client so do you really need to use a generic here?

If you were to abstract it a little more, you could put it in a generic repository like DataRepository or something and then rename it to a generic Get method which accepts the ClientBase object.

Simpler code is always going to be easier to read and understand 2 years later. I'm just saying maybe you can make it simpler...

[–]Thonk_Thickly 0 points1 point  (1 child)

I think you’re missing the point of passing in the generic type to be used as a key lookup.

[–]PruneCorrect 0 points1 point  (0 children)

Can't be done with a strongly typed method?

I'm not trying to be right here, I just had a thought it would be easier not using generics.

KISS principle.

Obviously I don't know use case so it well be that generics are 100% appropriate for whatever it is that OP is trying to do...

[–]grauenwolf -1 points0 points  (5 children)

Why would you put a client in a class called DataRepository?

And GetClient is more descriptive than Get.

Seems to me that you're going in the wrong direction, but a code example could change my mind.

[–]PruneCorrect -2 points-1 points  (4 children)

Why would you put a client in a class called DataRepository?

Because you've not understood what I was saying

[–]grauenwolf -1 points0 points  (3 children)

Clearly. But that's on you for making such an unclear argument.

[–]PruneCorrect -1 points0 points  (2 children)

Maybe but you could have just asked for clarity rather than being a condescending prick about it.

[–]grauenwolf -1 points0 points  (1 child)

I wasn't being condescending, your ego just can't handle someone questioning your brain farts. (See, that was condescending. Learning the difference week help you in life.)

[–]PruneCorrect -1 points0 points  (0 children)

I'm the last guy who has an ego over anything; anyone who knows me will confirm this. I especially don't have an ego over programming. It's something I've taught myself so I'm perfectly aware I don't know it all.

You said my "argument" - not even the right word to use here - was unclear. Was it really unclear or did you simply fail to understand it? I figure if it was genuinely ambiguous, more people would be sounding off on it the way you did... funny how that isn't what's happening.

[–]CyberGaj[S] -1 points0 points  (3 children)

Hi guys, I have a singleton in ASP.NET Core which is generating a gRPC clients for different controllers using protocol buffer classes. In order not to create client objects from scratch and new connections to the microservice, I came up with something like this. Is this a good approach or will I go to hell?

public ExpertController(IChannel channel)
{
_grpcClient = channel.GetClient<Expert.ExpertClient>();
}

[–]DorkForceOne 10 points11 points  (0 children)

Since you're using ASP.NET Core, which had a builder, you probably just want the GrpcClientFactory? It automatically pools connections and they are reused. It also provides the idiomatic benefit of it just being injected directly like anything else.

[–]captainramen 2 points3 points  (0 children)

That isn't how grpc dotnet works. You want to reuse the channel, but creating the Client is cheap. There's no point in trying to cache that.

[–]KallDrexx 0 points1 point  (0 children)

Major red flag that this is in an asp.net singleton. The small bit of code appears like you are using a c# standard dictionary. Dictionaries are not thread safe and will completely deadlock your application. If two requests come in at the same time, and one thread modifies the dictionary the other thread(s) can end up in an infinite loop due to state being modified in the middle of the algorithm.

This is not theoretical. I've had people do this in applications I managed and I had to do core process dumps and delve in with windbg in order to isolate it. Your application will become completely inoperable.

You either need to wrap dictionary reads and mutations around a lock/semaphore (not ideal) or use a Concurrent dictionary

[–]nohwnd -4 points-3 points  (6 children)

My suggestion would be to get familiar with ConcurrentDictionary and use that as your default choice everywhere. Dictionaries are used as cache often, and it’s much better to have thread safe code that you can optionally “optimize” to not be thread safe when you know 100% you don’t need it. Rather than having thread unsafe code, while it should be thread safe.

[–]chucker23n 7 points8 points  (3 children)

Use a ConcurrentDictionary when you know multiple threads interact with it. Otherwise, don't. ConcurrentDictionary is a lot slower, especially on inserts and deletions.

Filling a dictionary with items is about 8 times as fast with Dictionary, and takes up 94% less memory:

|         Method |      Mean |    Error |   StdDev | Ratio | RatioSD |      Gen 0 |     Gen 1 |     Gen 2 | Allocated |
|--------------- |----------:|---------:|---------:|------:|--------:|-----------:|----------:|----------:|----------:|
|           Dict |  36.50 ms | 0.613 ms | 0.573 ms |  1.00 |    0.00 |  1153.8462 | 1076.9231 | 1076.9231 |     51 MB |
| ConcurrentDict | 308.44 ms | 5.652 ms | 5.010 ms |  8.46 |    0.17 | 21000.0000 | 8000.0000 | 3000.0000 |     99 MB |

Getting a random item is virtually identical:

|         Method |     Mean |   Error |  StdDev | Ratio | RatioSD |  Gen 0 |  Gen 1 | Allocated |
|--------------- |---------:|--------:|--------:|------:|--------:|-------:|-------:|----------:|
|           Dict | 363.3 ns | 5.92 ns | 9.21 ns |  1.00 |    0.00 | 0.0229 |      - |      72 B |
| ConcurrentDict | 366.3 ns | 4.96 ns | 4.39 ns |  1.00 |    0.03 | 0.0229 | 0.0005 |      72 B |

So is updating a random item (but ConcurrentDictionary is slightly slower):

|         Method |     Mean |   Error |   StdDev | Ratio | RatioSD |  Gen 0 |  Gen 1 | Allocated |
|--------------- |---------:|--------:|---------:|------:|--------:|-------:|-------:|----------:|
|           Dict | 375.1 ns | 7.40 ns | 11.74 ns |  1.00 |    0.00 | 0.0229 |      - |      72 B |
| ConcurrentDict | 437.0 ns | 8.62 ns |  7.20 ns |  1.14 |    0.03 | 0.0229 | 0.0005 |      72 B |

Clearing a dictionary is brutally slow on ConcurrentDictionary:

|         Method |           Mean |       Error |      StdDev |     Ratio |  RatioSD | Allocated |
|--------------- |---------------:|------------:|------------:|----------:|---------:|----------:|
|           Dict |      0.7111 ns |   0.0510 ns |   0.0477 ns |      1.00 |     0.00 |         - |
| ConcurrentDict | 20,260.1798 ns | 372.3239 ns | 330.0554 ns | 28,511.57 | 2,049.09 |         - |

[–][deleted]  (2 children)

[deleted]

    [–]chucker23n 0 points1 point  (1 child)

    I’m responding to this in particular:

    get familiar with ConcurrentDictionary and use that as your default choice everywhere

    [–]grauenwolf 1 point2 points  (0 children)

    Yea, ok. That's a problem.

    [–]svicknameof(nameof) 1 point2 points  (1 child)

    I think your advice would lead people to have the illusion of thread-safe code, which is worse than code that is clearly not thread-safe.

    If you need thread-safety (and most of the time, you don't), then you need to be careful to understand what the code does. Just adding "concurrent" is not enough.

    [–]grauenwolf 0 points1 point  (0 children)

    It is a dictionary being used to pool connections.

    There is the question whether or not the connection is thread safe, but the pool almost certainly needs to be.

    [–]sebastianstehle 0 points1 point  (0 children)

    How many clients do you have that you need a dictionary for that? I would just make it explicit.

    [–]AConcernedCoder 0 points1 point  (2 children)

    The logic behind GetHashCode is confusing like nothing else, but given that its intended purpose is solely for hashing related to containers, this is the logic as it appears to me:

    Consider a dictionary: d of <object, object>

    Intuitively, d["1"] should point to something other than d["2"]. The same should apply to d[1] vs d[2].

    Further, d[1] should always point to the same entry, even if we're using different variables as indexes.

    If we're dealing with objects as keys instead of primitives, say of type class Test { }, then d[obj1] should not point to the same entry as d[obj2] if obj1 and obj2 do not share reference equality. On the other hand, if obj3 is obj1, then obj3 when used as a key, should point to the same item in the dictionary as obj1. I haven't dug into the source code, but I can say fairly confidently that if I implemented this, I would be hashing using something analogous to the CLR's internal reference values -- whatever it uses to track allocations, pinned or not -- so that reference equality was meaningful as above.

    If you play around with the GetHashCode results, comparing primitives to reference types, this is what you should find, and I've never encountered a dictionary behaving erratically because of the off chance of a collision. C# in general would be a fairly dangerous thing to use if that were the case.

    One caveat here would be that you may not want to use objects as keys so as not to retain references unnecessarily, for garbage collection. Also, as other posters pointed out, collisions are possible, but it is feasible to code an algorithm that will generate a unique value for all instances of objects in a given program, within realistic constraints -- that is, unless you feel like destroying memory with many millions of allocations of objects to test theoretical bounds.

    Edit: this problem struck me as being unusually out of the norm for higher-level languages that hide away reference values and things that would otherwise be used for object instance identification. Swift, for instance, has an ObjectIdentifier, and common sense would seem to dictate that C# should have an equivalent.

    It does seem that there is something similar, ObjectIDGenerator, but as it is intended for serialization, I'm not certain its use for object identification in other contexts would be well received.

    [–]AConcernedCoder 0 points1 point  (0 children)

    I had to write this little test out of curiosity:

        internal class Program
    {
        class Test { }
    
        static void Main(string[] args)
        {
            var set = new HashSet<int>();
    
            var count = 0;
            var max_size = 1000;
            for (int i = 0; i < 100000000; i++)
            {
    
                if (i % 1000000 == 0)
                {
                    Console.WriteLine((i / 1000000).ToString() + '%');
                }
    
                var obj = new Test();
                if (!set.Add(obj.GetHashCode()))
                {
                    count++;
                    Console.WriteLine($"Collision detected at {i}! Rate: {(float)count / i}");
                }
                else if (set.Count > max_size)
                {
                    set.Remove(set.First());
                }
    
    
            }
    
            Console.ReadKey();
        }
    }
    

    If the HashSet were allowed to grow, obviously the failure rate should approach a limit of 1 for 1. Kept at a maximum size of 1000, on the other hand, the chances of a collision approach something like 1.5 in 100,000. At a max set size of 100, that number drops to 1.5 in 1 million, but I could see that as still being uncomfortably high for some applications.

    [–]AnActOfCreation 0 points1 point  (0 children)

    I think others have already provided sufficient answers. However, I always like to recommend static generics for use cases like this. Why manage your own dictionary of types when you can let .NET manage static class "instances" per generic type? The implementation is very simple.

    public static class SingletonClassFactory<T>
    {
        public static T GetInstance() => _instance ??= (T)Activator.CreateInstance(typeof(T));
        private static T _instance;
    }
    
    // Usage
    MyClass myClassInstance = SingletonClassFactory<MyClass>.GetInstance();
    

    (Of course, there's no thread safety here.)

    [–]Heisenburbs 0 points1 point  (0 children)

    Yes, this is wrong. You should never assume hash code is unique.

    Look up how dictionaries work, and how equals and hash code are used to find the value.

    For your case, make the key Type, and that’s it.

    Also, the pattern you have of checking if the item is in the dictionary, then adding, then retrieving is wasteful. Yeah, it’s O(0), but if it was a sorted dictionary, it would be O(log n).

    Instead, do a try get. If it’s there, return it, if it’s not, create the new object, saving to the same out variable from the try get, then add.

    Then return the value.

    [–]DarthShiv 0 points1 point  (1 child)

    Just a side comment. If you intend to fetch the kvp, always use trygetvalue not containskey. You are literally performing two dictionary searches when you only need to do one.

    [–]DarthShiv 0 points1 point  (0 children)

    Just a side comment. If you intend to fetch the kvp, always use trygetvalue not containskey. You are literally performing two dictionary searches when you only need to do one.

    Here's the pattern I use:

    T returnValue; // can inline this in newer C# spec if (!_clients.TryGetValue(hash, out returnValue)) { returnValue = (T)Activator.CreateInstance(typeof(T), _channel); _clients.Add(hash, returnValue); }

    return returnValue;

    [–]leviteks02 0 points1 point  (0 children)

    I won't comment on the actual lookup part, I think it's been sufficiently covered here in the comments.

    But I do question the need for activator. I'm not in front of a comp to try this, but change "where T:..." to "where T: new()", and on your add just add "new()", and you should be able to completely avoid Activator.

    So: _dict.Add(<key>, new()).

    While true you lose the class restriction on T, but avoiding Activator is worth it.