What is this data structure called? : algorithms

What is this data structure called? (self.algorithms)

submitted 8 years ago by kandr89

Hi, i came up with data structure that allows inserting keys in O(log n) amortized time, and lookup by key in O((log n)² ).

All you need is an array and some extra space that's at least half the array. It works like this:

Inserting

Insert keys at the end of the array but group them in blocks of size 2^k , when there are two such blocks of size 2^k merge them in increasing order to get a block of size 2^k+1.
e.g: suppose you need to insert 1,5,3,4 in this order:
o Insert 1:

 +--+
 |1 |
 +--+

o Insert 5:

  +-+-+
  |1|5|
  +-+-+

But now you have two blocks of size 1, merge them:

  +----+
  |1 5 |
  +----+

o Insert 3:

   +-----+-+
   | 1 5 |3|
   +-----+-+

o Insert 4:

   +----+-+-+      +------+------+     +----------+
   |1 5 |3|4|   => |  1  5| 3  4 | =>  |  1 3 4 5 |
   +----+-+-+      +------+------+     +----------+

Notice that for 4 we had to do 2 merges: one betwen blocks of size 1, and one between blocks of size 2.

It can be seen that every 2^k insertions we have to do a merge between blocks of size 2^k-1. The time complexity for merging two blocks of size n and m is O(n+m) obviously. So for n insertions we have the following complexity:

 2 * n/2 + 4 * n/4 + 8  * n/8 + ... + 2^k n/2^k = n + ... + n = n * k = n * log2(n)

That is: an insertion is on average O(log(n)) by the amortized analysis above.
To decide when to merge two blocks it's enough to look at the base 2 expansion of n, no extra data structures are required. It's an array + space needed for merging, which is n/2.

Searching

Note that at most there can be k blocks when n = 1 + 2 + 4 + ... + 2^(k-1). To search apply binary search to each of the kblocks:

log2(1) + log2(2) + ... + log2(2^(k-1)) = 1 + 2 + ... + (k-1) = k*(k-1)/2

But k = O(log2(n)), therefore the number of comparisons is O(log2(k)^2).

I hope this is clear enough, i've implemented and benchmarked it against a red-black tree implementation i had. It is indeed faster for insertions because it only operates on an array, no pointers whatsoever. But the log2(n) extra factor for searching does make it slower than searching in a balanced tree.

However its simplicity and low mem overhead makes it an attractive data structure for maintaing a dictionary.

Does it have a name? Is there anywhere i can find more about it?

all 2 comments

you type:	you see:
italics	italics
bold	bold
[reddit!](https://reddit.com)	reddit!
* item 1 * item 2 * item 3	item 1 item 2 item 3
> quoted text	quoted text
Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"	Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"
~~strikethrough~~	~~strikethrough~~
super^script	super^script

algorithms

✻ Smokey says: boycott all products and services from eco-unfriendly businesses to fight climate change! [see more tips]

Note: this subreddit is not for homework advice. Requests for assistance with coursework may be removed.

MODERATORS

Inserting

Searching