Sometimes we expect the queries we run on the database to return results faster. Of course, in addition to this expectation, we also do not want to strain the database while running these queries. Is this possible?
I want to start my article with the sentence “Correct Index usage speeds up”.
In this article, I will discuss the INDEX structure and usage in the database under 4 main headings,
These are;
1. What is an Index
2. Index Types
3. How to Use an Index
4. Functions Used in an Index
So what is this Index?
Indexes are structures that aim to access data much faster by reading less data when the data in the tables of a database is queried and return the result of the operation faster.
While indexes speed up the performance of data retrieval queries (SELECT statements), they can slow down these operations a bit because they require an additional step in data writing queries (INSERT and UPDATE statements). However, considering that read operations occur more frequently than write operations in most database applications, the performance increase provided by indexes usually goes far beyond this small loss. However, indexes do not have any effect on the data content.
You can think of Index as the table of contents of a book, the path you follow to find the book you are looking for in a library, or the guide on your mobile phone.
Let's say you registered for an IT event and wanted to find your place on the day of the event, but the document containing the names was prepared in a mixed manner, you would have to scan the entire document to find yourself.
If the document had been prepared in the order of registration, you could search by your registration number (Clustered Index).
If there was a second document sorted by name and surname, you could find your place earlier (Non-Clustered Index).
So when should we use Index?
If your queries return answers much later than desired.
If there is a column that is constantly used in the condition in your queries.
If there are many different values in the queried column and it does not contain many NULL values.
If one or more columns are frequently used with a WHERE statement and a join operation.
When is it not right to use Index?
The common set of items in the list is that they apply to “columns that are used very frequently in the query” or “data that is intended to be accessed very quickly”.
• Indexes should not be used on small tables.
• They should not be used on tables that frequently receive large bulk updates or inserts.
• Indexes should not be used on columns that contain a large number of NULL values.
• Columns that are manipulated frequently should not be indexed.
For more information: SQL Server and Azure SQL index architecture and design guide
For example; when a SELECT statement is executed on a table with 1414273 rows of data;
Figure 1. Customer Table
When no index definition is made to the table, the data is brought by performing a table scan.
The number of pages read with the table scan is 95943 (Figure 2), and the execution plan time is 0.285 s (Figure 3).
Figure 2. Number of Readings with Table Scan
Figure 3. Execution Plan with Table Scan
When the Clustered Index is created, it brings the data from the clustered index.
The number of pages read with Clustered Index is 88530 (Figure 4), and the execution plan time is 0.261 s (Figure 5).
Figure 4. Clustered Index Read Count
Figure 5. Clustered Index Execution Plan
When a non-clustered index is defined, it uses both clustered and non-clustered indexes to fetch the data.
The number of pages read with the non-clustered index is 36 (Figure 6), and the execution plan time is 0.000 s (Figure 7).
Figure 6. Non – Clustered Index Read Count
Figure 7. Non-Clustered Index Execution Plan
Index Types
1. Clustered Index
a. Physically sorts the table's data.
b. There can be one clustered index on a table.
Figure 8. Clustered Index Modeling
2. Non-Clustered Index
a. There can be “n” non-clustered indexes in a table.
b. Non-clustered indexes are tree structures created with the columns specified in the index and contain the row reference of the main table. The indexes found at the end of books are the best example of these. They indicate which page the relevant data is on and allow us to easily locate it.
c. In the index, sorting is not done with the fields added to the include. Include (lookup) fields are used to avoid going back to the table when the index is accessed.
Figure 9. Non-Clustered Index Modeling
3. Unique Index
a. If the data in the column selected as a table indexing field is not desired to be repeated, the UNIQUE word should be used in the CREATE INDEX command when indexing.
b. If it is done on the screen, the UNIQUE feature should be selected.
4. Filtered Index
a. Filtered index is created by adding a filter to a non-clustered index.
b. There are 1 billion records, 9999 of which are null. If you want the null records to be listed as none, you can add a WHERE filter to the index you created.
5. Composite Index
a. When more than one column is added to the index while creating the index, it is called a composite index.
b. If there is more than one parameter, this becomes a composite index.
6. Covered Index
a. Covered index is used to locate values in an index without needing the original table.
7. Column Store Index
a. It is used in data warehouse reporting / compression and column-based sorting / analytical sorting of data.
8. Full-Text Index
a. When searching for a word in a 10,000-word text, the structures that list these texts in words are called full-text indexes.
b. Instead of searching with Like, functions such as CONTAINS, which are special functions belonging to the full-text index, can be used.
Creating an Index in MSSQL
Index adding, deleting and updating operations can be done in 2 ways:
By query;
A new index can be created using the CREATE function.
CREATE INDEX index_name
ON table_name (column1, column2, …);
USE [example]
GO
SET ANSI_PADDING ON
GO
CREATE NONCLUSTERED INDEX [IX_customer_accountnumber] ON [Customer]
(
[Accountnumber] ASC,
)
WITH
(PAD_INDEX = OFF,
STATISTICS_NORECOMPUTE = OFF,
SORT_IN_TEMPDB = OFF,
DROP_EXISTING = OFF,
ONLINE = OFF,
ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON,
FILLFACTOR = 95) ON [db]
GO
DROP INDEX index_name
ON table_name (column1, column2, …);
USE [BOA]
GO
DROP INDEX [IX_customer_accountnumber] ON [customer]
GO
Via SQL
Right click on Indexes from the table where the index will be created and select the index type to be created with the new index option.
Figure 10. Creating an Index
After selecting the index, the information on the New Index page will automatically be filled. The Index Name can be changed, and if you want the index to be UNIQUE, it can be selected from this screen.
Figure 11. Creating an Index: General Settings
You can add the table fields you want to use in the Index with the Add button.
With the Options option, you can select the features you created with the WITH clause in the CREATE command.
Figure 12. Creating an Index: Column Selection
Figure 13. Creating an Index: Options
Functions Used in Index
Pad_Index
• Used to apply the percentage of free space specified by FillFactor to the index intermediate pages during index creation.
• Takes ON/OFF values.
Fillfactor
• Used to set the percentage of free space that the SQL Server engine will leave at the leaf level of each index page during index creation.
• FillFactor must be an integer value between 0 and 100.
Sort_In_Tempdb
• Specifies whether the intermediate sort results created during index creation will be stored in tempdb.
• Takes ON/OFF values.
Ignore_Dup_Key
• Specifies whether an error message will be displayed when duplicate key values are added to the unique index.
Statistics_Norecompute
• Specifies whether old index statistics will be automatically recalculated.
Updating statistics can be resource intensive. If you have a very large table and changes are made frequently, automatically updating statistics can cause performance problems.
Drop_Existing
• Specifies that the current named index will be dropped and recreated.
• Takes ON/OFF values.
Online
• Specifies whether the underlying tables can be accessed for queries and data changes during the index creation process.
• When creating an index, it should be created/altered with the WITH(ONLINE=ON) parameter so that other DML operations are not affected.
Allow_Row_Locks
• Specifies whether row locks are allowed to access the index data.
• Takes ON/OFF values.
Allow_Page_Locks
• Specifies whether page locks are allowed to access the index data.
• Takes ON/OFF values.
Maxdop
• Used to limit the maximum number of processors used in parallel execution of the index creation process.
Data_Compression
• Specifies the data compression level with NONE, ROW and PAGE values for the specified index, partition number or partition range.
In this article, we have discussed in detail what indexes are, their types, how they are used and the functions used in them in order to increase database performance. We have discussed how indexes affect the speed of queries, which types are suitable according to data types and needs, and how you can optimize database performance. We have also explained how you can make the use of indexes more effective by providing information about the various functions used in indexes. Optimizing database design and performance requires a balanced and conscious approach. If you want to get more information or do an in-depth review on this subject, you can check out the links in the bibliography section.
I would like to thank Ahmet Fatih Aktaş for his contributions and support.
References and Useful Links:
- https://learn.microsoft.com/en-us/sql/relational-databases/indexes/indexes?view=sql-server-ver16 (Erişim Tarihi: 01.12.2023)
- https://www.youtube.com/watch?v=wWf1fUQ9wSc&themeRefresh=1 (Erişim Tarihi: 01.12.2023)
- https://blog.sqlauthority.com/2023/09/08/sql-server-the-comprehensive-guide-to-statistics_norecompute/#:~:text=The%20STATISTICS_NORECOMPUTE%20option%2C%20used%20within,the%20statistics%20for%20that%20index. (Erişim Tarihi: 01.12.2023)
- https://sqlperformance.com/2019/04/sql-performance/ignore_dup_key-slower-clustered-indexes (Erişim Tarihi: 01.12.2023)
- https://learn.microsoft.com/en-us/sql/relational-databases/sql-server-index-design-guide?view=sql-server-ver16(Erişim Tarihi: 01.12.2023)