1. Database Internals#
1.1. How are Tables stored and queried on Disk#
For Instance :
Table with only one integer col of 32 bits (4 Bytes)
Insert one row->4 bytes into disk
Insert another row-> another 4 bytes into disk
1.2. But how this querying works internally?#
This is how a disk looks like from inside
Disk is divided into Tracks,sectors,blocks
Eg, Track no 7,Sector no 5,Block number 2=>Gives block of disk
Each Block is a fixed size memory block, 512 Bytes lets say
So,while inserting data into memory:
Insert row 1-> 4 bytes->Stored in that Given block
Block storage left=(512-4) Bytes
Insert row 2-> 4 bytes->Stored in that same block
That block will get filled until complete
1.3. How will database remeber which row is where?#
Database stores Metadata,like Row one is at Track no 7,Sector no 5,Block number 2=>Gives block of disk
So,when the IO req will come for this row,disk will start rotating till that particular block is found by the spin head and then db will pull that entire block of 513bytes you wanted rows but you will receive all the rows
1.4. What in case of SSDs?#
There are page numbers instead of Block numbers
1.4.1. In Depth:#
Database needs an identifier to identify each row
For that,it uses Primary key,if its present else it create a new column,row_id in the backend for its reference
Now these rows are stored in memory as heap
So Heap memory contains something called Pages(fixed size memory locations) and rows goes into these pages
First,rows will go into page one till this page is full then other rows will go to page 2 and so on
All these pages are in Heap.
So, Hierarchy becomes-> Rows->Pages(Part of Heap)
By default,It will scan all the pages,all the rows,all the data
But with indexing on one or more column,Lets say like here we did indexing on just on one column,employee id,it will store diff pages with just that column and the row identifier and the page identifier.
So now,It will allready know to which specific page he has to search for a particular column
1.5. Are the blocks or Pages stored simultaneausly in disk??#
Like if block one has last row no. 128 , Will the row no. 129 in next block?
Maybe or maybe not
It depends on How the data is fragmented