Saturday, October 12, 2013

INDEX IN SQL SERVER

Indexes are used to make the data retrieval fast index in relational database can be compared to index available at the end of a book using which you can find the required topic in the book fast.

Sql server provides a system tables called sysindexes that contains details of index available on tables in the database
Every table in the database will have minimum one row in this table even if it has no index.

Internally the data will be stored in the database in pages where size of each page is 8 kb.

 A continuous 8 pages is called as an Extent. When a table is created, than one extent will be allocated for it, and when that extent is completely filled than another extent will be allocated that may or may not be continuous to the first Extent.

In database indexes are created on particular column.

The column on which the index is created is called “Index Key”.


In sql server searching for a particular record is of two types.
1 table scan
2 index scan

Table scan
Each and every record as to be compared than sql server scan the entire file.
It consumes more time.

Explanation:

When  you write a select statement on the table with a condition in “where condition” than first sql server will refer to “IndId” column of sysindexes table to verify wheather the column on which you write the condition has an index in the table.

When this column contains zero, than it indicates that the table has “No Index” and in this case it will refer to the column “firstIAM” of sysindexes table, to get the address of the first extent of the table and check the given condition with each and every row available in every page of every extent of the table and displays only the rows that satisfy the given condition. This process of checking the given condition with each and every row of the table is called as table scan.

Drawbacks of table scan

As no of rows increase in the table, the time taken to retrieve the data will increase and will effect the performance of the database.


Index scan
In index scan, an average sql server scans only half of the file which is must faster than table scan.

Sql server uses index in the following cases:
If where condition or order by clause is based on the column on which the index is created.
Note:

Sql server does not follow index in the following case.
1 If where clause or order by clause is based on the column on which the index is not available.

2 creating no of index on particular table degrades performance of DML operations.

To maintain the balance, so create indexes on columns frequently accessed in where clause and order by clause.



Index in sql server are classified into
·          Clustered index
·          Non clustered index

Clustered index:
Is the index that will arrange the rows physically in the memory in sorted order

Advantage of clustered index is searching for a arrange of values will be fast. You can create only one clustered index per table.

If order of the values in table and if order of values in the index is same this index is called as clustered index.

Explanation:
Clustered index is internally maintained with the help of B-Tree Data Structure and leaf node of B-Tree of clustered index contains table Data. When you write select statement on the table with condition in “where clause” then first sql server will refer to “INDID” column of sys index table to verify whether or not the column on which you write the condition has index.

When INID column contains the value 1 , than it indicates that the table has a clustered Index and in this case it will refer to a column called “Root “ to get the address of the Root Node of B-Tree of clustered index and searches in the B-Tree to find the leaf node  that contains the first row that satisfy the given condition and retrieves all the rows that satisfy the given condition , which will be continuous in the memory.
A Drawback for clustered Index is insert, update will be slow because the root must be inserted or updated in sorted order.

To make insert , update fast we use fillfactor option while creating the clustered index.
If you set the fillfactor to 70, than in every page of the table 70% will  be filled with Data and remaining 30% will be left free.

As “freespace” is available on every page inserted and update will be little bit fast.

Sql server by default creates clustered index on primary key column.

Example:
Create table xyz(x int, y int, z int)

Create clustered index i10 on xyz(z)

Insert into xyz values (1,2,10)
Insert into xyz values (4,3,12)

Non clustered Index

Is the index that will not arrange the rows physically in memory in sorted order.
Advantages of Non Clustered Index
Searching for the values that are not in the arrange will be fast.
You can create a maximum of 999 Non Clustered Index per Table.
Non Clustered Index is also maintained with the help of B-Tree data structure, But leaf Node of B-Tree of Non Clustered Index contains pointer to the pages containing data and Not Table Data.

Explanation:
When you write a select statement with a condition in “where clause” then first sql server will refer to “InDId” column of  sysindexes table to verify whether  or not the column on which you write the condition has an Index.

When “IndId” col contains value in the range of 2 to 1000 then it indicates that the column on which you write the condition has a
“Non Clustered Index” and in this case  it will refer to the column Root to get address of the root Node of B-Tree of Non Clustered Index and then searches in the B-Tree to find the leaf Node corresponding to the value you are searching for and retrieves all rows containing that value based on pointer available in the leaf Node.


With non clustered index there will be no effect on Insert and Update.


Clustered index
Non Clustered Index
It will arrange the rows physically in memory in sorted order
It will not arrange the rows physically in the memory in sorted order
The leaf node of B-Tree of clustered index contains the table data.
The leaf node of B-Tree of Non Clustered Index contain pointer to the pages containing data.
It will effect insert and update
It will not effect insert and update
It will be fast in searching for a arrange of values
It will be fast in searching for values that are not in a range.
You can create  only one clustered index per table
You can create a maximum of 999 Non clustered Index per table.



Based on No of columns on which index was created, Index are classified into
1 simple
2 composite

Simple:
When an index is created on single column than that index is called as simple index.

Composite:
When an index is created on combination of multiple columns than that index is called composite Index.
Create index i2 on EMP(deptno, job)

Select * from emp order by deptno, job

Select * from EMP where deptno=10 and job=’clerk’

Sql server does not perform composite index on OR operator.

Unique Index:
When an index is created using the keyword unique than that index is called as unique index.
When a unique index is created on a column than along with index unique constraint will be created on that column, which will not allow duplicate values. When you try to create a unique index on a column that contains duplicate values than you will get error.

Primary key- clustered index
Unique- non clustered index
Create a index
To create a index use the create index command that has the following syntax

Create [unique] [clustered| Non clustered Index] index <indexname>  on <tablename>(<columnname> [asc|desc]
[include(<collist>)]
[with fillfactor=<n>]

The following example creates a non-clustered index on department no col of emp table.

Create index dnojobidx on emp(deptno,job)

Altering an Index
To alter an index use Alter Index Command that has the following syntax

Alter Index <idxname> on <tablename>
Rebuild| Reorganise|Disable

Rebuild:
It willrecreate the entire Index.
ReOrganize:
Will rearrange only leaf nodes of B-Tree of the Index
Disable:
Will disable the Index.

When you want to enable the disable Index, than you must alter the Index using Rebuild option.
Example
The following example rebuild the Index dnoldx available on deptno col of emp table

Alter index dnoldx on emp Rebuild

Getting List Of Index
To get the list of index available on  a table used the stored procedure that has the following syntax
Example
The following example gets the list of index available on emp table
Sp_helpindex ‘emp’

Deleting the Index
To delete an Index use the Drop Index command that has the following syntax
Drop Index <idxname> on <tablename>

The following example deletes the Index DnoIdx available on Dept No col of emp table

Drop Index DnoIdx on Emp

 Using Index
It is not possible to explicitly specify to use an index while retrieving the data, and sql server will automatically determine whether or not to use an Index.

Sql server will use an Index to retrieve the data only in the following cases

When the index is simple index, than the column on which the index is created must be in the condition in “where clause”

When an index is a “composite Index” then all the columns on which combination the index was created must be in the condition in where clause.

Important Points
The column on which index was created are called key column for that index.
Maximum no of key columns per an index is 16 and total size  of key column must not exceeds 900 bytes
Key columns   16               900 bytes
Deptno                int                4
Job                      varchar(20)  20

                                           24 bytes

When you have to create an index on combination of more than 16 columns or combinations of columns whose size exceed 900  bytes than specify the additional columns as include columns for the index.
Maximum no of include columns per an Index is 1023(less than 1 of Total no of columns to create table)

Include columns will be included in the Index but there size will not be counted towards the size of the Index.

A column can be either a key column or include column but not both.

Observing the difference between data with and without index.
To observer the difference in retrieving data without index and with index follow the following steps

1 create a copy of the table emp with the name emp1 as follow
Select * into emp1 from emp

2 write the following insert statement on table emp1 to take the rows from table emp and insert them into emp1 again and again repeatedly execute this statement until you get 1000 of rows on the table emp1
 Insert emp1 select * from emp

Write the select statement on table emp1 to find the employees working in dept no 20

Select that statement and use the shortcut (Ctrl+l) to get estimated execution plan , place the mouse pointer over the table scan icon and note down the estimated c.p.u cost.



Select * from Emp where Deptno=20 



No comments:

Post a Comment

Thank you for visiting my blog

Kubernetes

Prerequisites We assume anyone who wants to understand Kubernetes should have an understating of how the Docker works, how the Docker images...