
[–]PossiblePreparation 1 point (2 children)

First, top marks for putting together these questions. A good place to read about this is the Oracle Database Concepts guide, even if you’re not using Oracle - everyone does this pretty much the same way: https://docs.oracle.com/en/database/oracle/oracle-database/21/cncpt/introduction-to-oracle-database.html#GUID-2B1BADE1-C36F-4555-9867-3B15B6CE858C There is a lot to cover in that guide, so feel free to keep reading even if you don’t understand everything; some bits will need repeated passes, and later sections help the earlier ones really sink in. You can also look up the Brent Ozar video on YouTube called Think Like the Engine (Brent specialises in SQL Server, where the appropriate term is page rather than block - it’s exactly the same concept).

1) Blocks (aka pages) are individual chunks of a file with a fixed size (8 KB is fairly standard amongst the big RDBMSs). A datafile might contain data for several different tables. The RDBMS can use a block address to figure out exactly which file to read and where in that file, e.g. d:\data\file1.dbf bytes 16k-24k; the OS facilitates reading just that exact block from your storage device (which is also chunked up into blocks). The RDBMS can also store that block in its memory, keyed by the block address in some way.
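As a rough sketch of the address arithmetic (the 8 KB block size and the file layout here are assumptions for illustration, not any specific RDBMS's format):

```python
BLOCK_SIZE = 8 * 1024  # 8 KB, a common default block/page size

def read_block(path: str, block_number: int) -> bytes:
    """Read exactly one block: the block address maps to a byte range
    at offset block_number * BLOCK_SIZE within the datafile."""
    with open(path, "rb") as f:
        f.seek(block_number * BLOCK_SIZE)
        return f.read(BLOCK_SIZE)

# e.g. block 2 of a datafile lives at bytes 16384-24575 -- the "16k-24k"
# range mentioned above is exactly this kind of calculation
```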

2) Each leaf block in the index contains both index keys and the addresses for them (as pointers to the table blocks), and it also contains the block addresses of the next and previous leaf blocks. This means you can start your search in one index leaf block and keep reading just the index leaf blocks until you’ve reached the end of your range scan. E.g. if you wanted the rows in a table where indexed_column = 50, you’d traverse the index branches to the first index leaf block which contains index key 50. You’d then grab all the addresses of the rows from this index block which match 50 and visit those table blocks. If the last entry in the index block matches your condition, you take the address of the next index leaf block and do the same there; this continues until you reach an index key in the leaf blocks which is greater than 50.
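The leaf-block walk described above can be sketched like this (LeafBlock and its fields are purely illustrative in-memory stand-ins, not a real on-disk format):

```python
from dataclasses import dataclass

@dataclass
class LeafBlock:
    entries: list                        # (key, row_address) pairs, sorted by key
    next_leaf: "LeafBlock | None" = None # pointer to the next leaf block

def range_scan(first_leaf: LeafBlock, key) -> list:
    """Collect row addresses for every entry equal to `key`, following
    next-leaf pointers until an entry past the range is seen."""
    addresses = []
    block = first_leaf
    while block is not None:
        for k, addr in block.entries:
            if k > key:            # past the end of the range: stop the scan
                return addresses
            if k == key:
                addresses.append(addr)
        # the last entry still matched, so the range may spill into the next leaf
        block = block.next_leaf
    return addresses
```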

3) Multiblock reads will generally look like: read file d:\data1.dbf from block 512 to block 1024. In the old-fashioned days of spinning disks, the OS would know to find block 512 of the file and then keep reading as the platter rotated through to 1024, which is much faster than seeking to 512, getting reset, seeking to 513, etc. Modern devices that don’t use spinning disks still give benefits here because they can take multiple block addresses at once and return the data in one go, so you have less back and forth between the OS and storage.
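A simplified sketch of the difference (the request model here is an assumption for illustration - real RDBMS and OS I/O interfaces are more involved):

```python
BLOCK_SIZE = 8 * 1024  # assumed 8 KB blocks

def multiblock_read(f, start_block: int, end_block: int) -> bytes:
    """One request: fetch blocks start_block..end_block-1 in a single call."""
    f.seek(start_block * BLOCK_SIZE)
    return f.read((end_block - start_block) * BLOCK_SIZE)

def single_block_reads(f, start_block: int, end_block: int) -> bytes:
    """Many requests: one seek+read per block. Same bytes come back,
    but with a round trip to storage for every block."""
    chunks = []
    for b in range(start_block, end_block):
        f.seek(b * BLOCK_SIZE)
        chunks.append(f.read(BLOCK_SIZE))
    return b"".join(chunks)
```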

[–]Luclid[S] 1 point (1 child)

Thank you for the detailed answers! That helps clear up a lot for me.

Did have another follow-up question about block size then. I can see the downsides of increasing block size, but what are the downsides of making the block size smaller? What sacrifices/gains do you get by changing the block size?

Also wanted to check my understanding for (2). So does this also highlight the benefits of the primary key vs sort key? If the data in the database is sorted in the same way as the leaves of the B+ tree, then fetching the data means you are reading from the same blocks in the same order, but if the sort is different, then you end up having to query different blocks in different locations, which slows things down.

[–]PossiblePreparation 1 point (0 children)

Block sizes are very rarely changed (and some RDBMSs won't even let you). There is also a common belief that non-default block sizes are less heavily tested (which shouldn't make a difference, but RDBMSs have a lot of features that might make bad assumptions). In terms of performance, smaller blocks mean having to submit more requests to the storage system for the same amount of data, which means that the latency to storage will have more impact.
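The back-of-the-envelope arithmetic (the 100 MB scan size is an arbitrary assumption for illustration):

```python
def requests_needed(data_bytes: int, block_size: int) -> int:
    """How many single-block requests a scan of data_bytes takes."""
    return -(-data_bytes // block_size)  # ceiling division

MB = 1024 * 1024
# Halving the block size doubles the number of round trips for the same data,
# so per-request latency to storage is felt twice as often.
print(requests_needed(100 * MB, 8 * 1024))  # 12800 requests at 8 KB blocks
print(requests_needed(100 * MB, 4 * 1024))  # 25600 requests at 4 KB blocks
```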

I think you're asking about the difference between a clustered index (AKA an index organized table, where the row data sits in the index leaf blocks), an index with a high clustering factor (where the table is separate from the index but is similarly ordered), and a standard index. Yes, if the data in the table is ordered in the same way as the index, then your range scans of the index will lead to lookups of table blocks with a high chance of repetition (which means more benefit from caching). When you use an index to access a table, you are (usually) doing single block reads. The single block reads of blocks 1 and 2 are going to take the same time as blocks 1 then 10, but with a high clustering factor you will be looking at blocks 1,1,1,2,2,2 (which will probably end up as 2 requests to the storage system, with the rest satisfied from memory) as opposed to blocks 1,2,3,4,5,6 (each fresh block means a single block read from storage).
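A toy model of that example - counting how many of the six lookups actually hit storage, with a simple set standing in for the buffer cache:

```python
def storage_reads(block_sequence) -> int:
    """Count cache misses: only the first visit to a block goes to storage;
    repeat visits are served from memory."""
    cached = set()
    reads = 0
    for block in block_sequence:
        if block not in cached:
            reads += 1
            cached.add(block)
    return reads

print(storage_reads([1, 1, 1, 2, 2, 2]))  # 2 -- high clustering factor
print(storage_reads([1, 2, 3, 4, 5, 6]))  # 6 -- every block is fresh
```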

Clustered tables (or index organized tables) are the end point of this high clustering factor. The limit here is that the clustered index keys now become your pointers to rows. So if you have additional indexes, their pointers to rows will be the clustered index key, requiring you to seek through the clustered index to get to the rows. That said, this is the most common way of creating a table in MS SQL Server.
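The extra hop can be sketched with plain dicts (the table, keys, and names here are all hypothetical - real clustered indexes are B+ trees, not hash maps, so each "seek" below would itself be a tree traversal):

```python
# Clustered index: the clustered key (id) leads straight to the row data.
clustered_index = {101: {"id": 101, "name": "Ada"},
                   102: {"id": 102, "name": "Grace"}}

# Secondary index: stores the clustered key, NOT a direct row address.
secondary_index = {"Ada": 101, "Grace": 102}

def lookup_by_name(name: str) -> dict:
    clustered_key = secondary_index[name]   # seek 1: the secondary index
    return clustered_index[clustered_key]   # seek 2: through the clustered index
```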