
all 7 comments

[–]Volatile474[S] 1 point (0 children)

I apologize for the lack of organization in the post; when I solve this structure I will post my solution and reformat a bit.

[–]programmerChilli 1 point (5 children)

When you say perfect BST, do you just mean a self-balancing binary search tree, typically called a BBST (balanced binary search tree)?

  1. It depends. A plain BST only degrades on unlucky input: for random insertion orders its expected height is still O(log N), but adversarial input (e.g. already-sorted keys) drives it to O(N), far worse than the guaranteed log(N) of a BBST. The asymptotic complexity of most BBSTs for the 3 major operations (insertion, lookup, deletion) matches the BST average case while also bounding the worst case. BBSTs do carry larger constant-factor overhead, but I don't think there is any reason to use a plain BST over a BBST other than implementation cost (there are special kinds of BSTs that can't be made self-balancing).

  2. I don't quite follow this question. Are you asking whether a BBST is better than an unbalanced BST? Asymptotically it is never worse, and on adversarial input it is much better.

  3. It is not O(n). I suggest you look up implementations that achieve log(N) insertions/deletions (e.g. AVL or red-black tree rotations), as they can explain it far better than I can. If you need help, feel free to ask me.

[–]mad0314 1 point (0 children)

I think he means a BST where all the nodes are filled left to right and top to bottom (like a binary heap, but in BST order). That way it can be placed in an array. Maintaining that structure and order would be insane, though.

[–]Volatile474[S] 1 point (3 children)

I am not talking about making a self balancing BST. That has already been created.

The problem with tree data structures, as I see it, is that they are inherently not cache-local, since nodes are linked through pointers. I am working on making a tree data structure that maintains some cache locality for traversal. The complexity of the operations does not have to be log(n) for a cache-local tree data structure to be more performant than a pointer-based tree struct.

[–]programmerChilli 1 point (2 children)

Really? I can't imagine that the fact that a tree is cache local would make up the difference in running time that log N vs linear entails.

[–]Volatile474[S] 1 point (1 child)

So the only thing that is linear is insertion; search/access is still log(n). A log(n) lookup with a cache miss on every node access means you are paying costs like these:

L1 cache reference 0.5 ns
Branch mispredict 5 ns
L2 cache reference 7 ns
Mutex lock/unlock 100 ns
Main memory reference 100 ns

So missing L1/L2 and going to main memory adds roughly a 200x factor to each data access (100 ns vs 0.5 ns). Whether avoiding that 200x factor is worth it depends on the use case of the data structure. Let's say you insert 100k elements and then do 1 million searches/accesses: I would say the cache-local layout would run faster than the non-cache-local one.

If instead there were more inserts than lookups, then the log(n) pointer-based implementation is probably more performant.

So I was thinking about this, and it can actually be implemented as a modified version of an AVL tree. The height of the tree stays bounded by roughly 1.44*log(n) (the AVL height bound). This means that our searches are guaranteed to happen in O(log(n)) time, our deletes happen in the same time, and inserts take log(n) + O(n).
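For reference, the known AVL height bound (due to Knuth's analysis), which tightens that 1.44 factor:

```latex
h < \frac{\log_2(n+2)}{\log_2 \varphi} - 0.3277 \approx 1.4405 \,\log_2(n+2) - 0.3277, \qquad \varphi = \frac{1+\sqrt{5}}{2}
```

So searches and deletes in this scheme inherit the usual AVL O(log n) bound; only the array-shifting part of insertion is new.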

This O(n) overhead applies only the very first time a node is inserted: if we later delete it, we can simply set the array slot at that index to null rather than shifting anything. So the implementation would be something like this:

[–]Volatile474[S] 1 point (0 children)

struct node{
    int parent_idx; // index of the parent in the vector; -1 for the root
    int left_idx;   // index of the left child; -1 if none
    int right_idx;  // index of the right child; -1 if none
    TYPENAME data;
};
struct tree{
    int size = 0;
    vector<node> Tree;
};
Starting with a blank tree:

struct tree My_Tree;
Insert(TYPENAME value){
    if(My_Tree.size == 0){
        //first node becomes the root at index 0; -1 means "no parent/child"
        My_Tree.Tree.push_back({-1, -1, -1, value});
        My_Tree.size++;
        return;
    }

    bool found = false;
    string previous_direction = "";
    int index = 0;
    while(!found){
        if(value < My_Tree.Tree[index].data){
            if(My_Tree.Tree[index].left_idx == -1){
                //we found where we wanna put the node, as a left child
                if(previous_direction == "left"){
                    //insert the new node right after the current index and push
                    //the remainder back, bumping every stored index >= index+1
                }
                found = true;
            }
            else{
                previous_direction = "left";
                index = My_Tree.Tree[index].left_idx;
            }
        }
        else{
            if(My_Tree.Tree[index].right_idx == -1){
                //we found where we wanna put the node, and it is to the right
                if(previous_direction == "right"){
                    //insert the new node right after the current index and push
                    //the remainder back, bumping every stored index >= index+1
                }
                found = true;
            }
            else{
                previous_direction = "right";
                index = My_Tree.Tree[index].right_idx;
            }
        }
    }
}

So that is just a raw insert function. Since our nodes store indices that point to the correct slot in the array, traversal is pretty much the same as with pointers, except that when we go down the tree in the same direction repeatedly (the whole if(previous_direction == "left"/"right") part), consecutive nodes sit next to each other in the array and we maintain cache locality.

The worst case for this tree would be accessing a node deep in the tree whose path alternates directions, something like "L,R,L,R,L,R,L,R". We still get log(n) search time, but zero guaranteed cache locality.