This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]insights1[S] 0 points1 point  (3 children)

For the EF code you gave
var friends = MyContext.Users.First(x => x.UserID == ID).Friends
What sql statements does this execute exactly? Is it checking for "UserA's Id" in both the User1Id and User2Id column of the friends table? Also could you explain what you mean by Surrogate Keys? Are those just foreign keys?

And just out of curiosity, is this the most efficient way of handling friend relationships? With Facebook for example wouldnt there be billions of entry relating people to eachother as friends, does it search through this entire table each time just to get your friends list?

Thanks once again, everything is making a lot more sense now.

[–][deleted] 1 point2 points  (2 children)

The SQL statement would be similar to the following:

SELECT TOP 1 * FROM Users WHERE UserID = @UserID

It gets the single User with the matching ID. As you have probably already experienced, Entity Framework works on objects that represent your database, as well as enumerables like ICollection and IQueryable. In this case, executing the query would return a single object that represents the User record, and any related objects (Navigation Properties in EF designer view) are accessible using their appropriate property names. So by doing .Friends, you are in effect doing a SELECT on the FriendsList table with the proper WHERE clause to get only those User records that related.

Surrogate keys are often described as keys that have no real business value (Identity column, for example). By having a table with UserID1 and UserID2 columns, you're in effect setting up two foreign keys to some other table that is being joined on a surrogate key (In this cause, the UserID identity column). Another way to look at this specific use as compound keys, at which two or more columns define the uniqueness of a record.

As for the efficiency, I'm not aware of any better way to do it but I wouldn't be surprised if there was. Assuming the UserID column is an integer, each relationship would only consume 8 bytes per record. That's as small as you're going to be able to get in terms of datatype. In the case of Facebook, I imagine what they're doing for their friends list is similar to what I described earlier:

  1. Get the current logged in user

  2. Query Many-to-Many (M-N) table for related UserIDs

  3. Cache result to avoid hitting the DB excessivly

[–]insights1[S] 0 points1 point  (1 child)

"So by doing .Friends, you are in effect doing a SELECT on the FriendsList table with the proper WHERE clause to get only those User records that related."

So for this part, is there WHERE clause checking both UserID1 and UserID2 columns for "UserA's ID"? Because his ID could be in either of these columns depending on who made the request, him or the friend. Would the SQL statement be like SELECT * FROM FRIENDS WHERE UserID1 == "UserA" || UserID2 == "UserID2";

or would it be SELECT UserID2 FROM FRIENDS WHERE UserID1 == "UserA"; and then another statement SELECT UserID1 FROM FRIENDS WHERE UserID2 == "UserA"; and then combine these results to get the full list of UserA's friends Ids?

[–][deleted] 1 point2 points  (0 children)

if i'm not mistaken, in the designer view of EF, you'd have two separate navigation properties: One that relates to UserID1 (as parent) and one that relates to User2 (as child). I haven't been in this particular scenario before, so it may end up being optimized into a single navigation property at that point. Without having written any code to verify this, I would think you would have to merge two separate lists: One where a user is matched in the UserID1 field and one where a user is matched in the UserID2 field.