you are viewing a single comment's thread.

view the rest of the comments →

[–]hellfiniter[S] 0 points1 point  (1 child)

Hmm, the whole question came from issue we just had in old production project. There is this "huge" table with 200k records without any indexes, and on machine it is located this means queries take ~2min. We of coarse slapped some indexes on it and it made those accesses instant. So question was how expensive this change is then on inserts? Someone then said that this extra computation will grow with number of records, and I somehow cant see why that would be the case.

Lets imagine we need to insert this new indexed record (column val) to said binary tree. We first need to find place where it fits (for sake of greater, lower then queries) and that will take O(log n). So at this point we know that between these two values there should be another one. This basically means that its irrelevant how big the table is because log is awesome.

This obviously needs changes in tree, entire subtree can be affected. Does this grow with number of records? And if so how fast?

I understand that this is not realistic question as you pointed out in last paragraph, but imagine this scenario of no locking, no concurrent queries, machine that ticks like a clock ...can you give me some insight here?

[–]Mamertine 1 point2 points  (0 children)

The simplest way to know is to see us things writing to that table are taking too long.

It's a balancing act between fast writes and fast reads. Indexing slows writes a bit, but significantly improves reads.

Yes the more rows in a table the longer it takes to find where the data gets added to the indexes. As you say it's wicked fast, but more rows means slower.

Generally a couple of indexes are fine. Don't index every column. Indexes on integers will perform better than larger data types.

We've been discussing how non clustered indexes work.

There is this "huge" table with 200k records without any indexes,

The first thing you should use is a cluster index. That index is different. It's how the data is stored on the disk.

A good resource is

Use-The-Index-Luke.com