MySQL Clustered Indexes

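A clustered index in MySQL is an index whose leaf pages hold the table's rows themselves, so the rows are stored in key order. In InnoDB, the default storage engine, each table has exactly one clustered index, normally the primary key. Looking up a row by its primary key is fast because finding the index entry also finds the row data, with no further lookup.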

Usage

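Clustered indexes speed up primary key lookups and range scans over the primary key. InnoDB creates the clustered index automatically when a primary key is defined on a table. If no primary key exists, InnoDB uses the first UNIQUE index whose key columns are all NOT NULL. If there is no such index either, it generates a hidden clustered index named GEN_CLUST_INDEX on an internal row ID. Engines such as MyISAM do not use clustered indexes; they store rows separately from all indexes.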

CREATE TABLE my_table (
    id INT PRIMARY KEY,
    name VARCHAR(100)
);

In this example, the id column serves as the primary key and clustered index, organizing the data in the table based on the id values.
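
The fallback for a table without a primary key can be sketched as follows. This is a minimal illustration that assumes InnoDB; the products table, its columns, and the index name uq_sku are hypothetical.

```sql
-- Hypothetical table with no PRIMARY KEY.
-- Assuming InnoDB: the first UNIQUE index whose columns are all NOT NULL
-- (here uq_sku) is used as the clustered index.
CREATE TABLE products (
    sku VARCHAR(20) NOT NULL,
    title VARCHAR(100),
    UNIQUE KEY uq_sku (sku)
) ENGINE=InnoDB;
```

If sku were nullable, the index would not qualify and InnoDB would fall back to a hidden internal row ID instead.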

Examples

1. Creating a Table with a Clustered Index

CREATE TABLE employees (
    emp_id INT PRIMARY KEY,
    emp_name VARCHAR(50)
);

This example creates an employees table where emp_id is the primary key and acts as the clustered index.

2. Adding a Clustered Index to an Existing Table

ALTER TABLE orders
ADD PRIMARY KEY (order_id);

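Here, order_id becomes the primary key, and therefore the clustered index, of the orders table. The existing order_id values must be unique and must not contain NULL. Because InnoDB has to reorganize the rows around the new key, this statement rebuilds the table and can take a long time on large tables.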

3. Viewing Clustered Index Information

SHOW INDEX FROM employees WHERE Key_name = 'PRIMARY';

This example retrieves information about the clustered index (primary key) of the employees table.
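
The same information is also available through information_schema, which is convenient when the result needs to be filtered or joined. A small sketch, assuming the employees table lives in the current default database:

```sql
-- List the primary key columns of employees in key order.
-- Assumes the table is in the current default database.
SELECT COLUMN_NAME, SEQ_IN_INDEX
FROM information_schema.STATISTICS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'employees'
  AND INDEX_NAME = 'PRIMARY'
ORDER BY SEQ_IN_INDEX;
```

For a composite primary key, this returns one row per key column.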

Performance Implications

Clustered indexes influence various SQL operations as follows:

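Insertion: Rows are placed in primary key order. Inserts with increasing key values (for example from AUTO_INCREMENT) go to the end of the index and are cheap. Inserts in random key order can land in full pages and force page splits, which slows writes and fragments the table.
Update: Updating non-key columns usually changes the row where it is. Changing the primary key value moves the row to a new position in the index, and every secondary index entry for that row must be updated as well, because secondary indexes store the primary key value.
Deletion: Deleted rows leave free space in their pages. InnoDB reuses that space for later inserts and can merge sparsely filled pages, but heavy deletion can still leave the table fragmented.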
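
One common way to keep inserts in key order is an AUTO_INCREMENT primary key. The following is a sketch only; the page_views table and its columns are hypothetical.

```sql
-- Hypothetical table: AUTO_INCREMENT produces increasing key values,
-- so new rows are added at the end of the clustered index.
CREATE TABLE page_views (
    view_id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    url VARCHAR(255) NOT NULL,
    viewed_at DATETIME NOT NULL,
    PRIMARY KEY (view_id)
) ENGINE=InnoDB;

-- view_id is omitted, so MySQL assigns the next value.
INSERT INTO page_views (url, viewed_at)
VALUES ('/home', NOW()), ('/pricing', NOW());
```

A randomly generated key, such as a UUID stored as text, would scatter inserts across the index instead.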

Clustered indexes can also speed up range scans and sorting on the primary key: rows are read in key order, so a query with ORDER BY on the primary key can often skip an explicit sort.
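
To check how MySQL handles such a query, EXPLAIN can be run against the employees table from the examples. The key range below is arbitrary, and the plan the optimizer chooses depends on the data and the MySQL version.

```sql
-- Range scan on the primary key with matching sort order.
EXPLAIN
SELECT emp_id, emp_name
FROM employees
WHERE emp_id BETWEEN 100 AND 200
ORDER BY emp_id;
```

On a populated table, the plan will typically list PRIMARY in the key column and show no "Using filesort" in the Extra column.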

Comparison with Non-Clustered Indexes

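Clustered Index: Stores the table rows themselves, in key order. Each table can have only one.
Non-Clustered Index: Called a secondary index in MySQL. It is stored separately from the rows and does not affect their order. In InnoDB, each entry holds the indexed columns plus the primary key value, so a lookup through a secondary index usually needs a second lookup in the clustered index to fetch the full row. A table can have multiple secondary indexes.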
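
In InnoDB, a non-clustered (secondary) index stores the primary key value alongside the indexed columns. A minimal sketch using the employees table from the examples; the index name idx_emp_name and the search value are hypothetical.

```sql
-- Secondary index on emp_name; in InnoDB each entry also carries emp_id.
CREATE INDEX idx_emp_name ON employees (emp_name);

-- Both emp_name and emp_id are available in idx_emp_name, so this
-- query can be answered without visiting the clustered index.
EXPLAIN
SELECT emp_id
FROM employees
WHERE emp_name = 'Alice';
```

When a query is served from the index alone, the Extra column of the plan typically shows "Using index".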

Tips and Best Practices

Choose wisely. Select a primary key whose values rarely or never change, since changing a key value moves the row within the clustered index.
Optimize for queries. Design the clustered index to support the most common and performance-critical queries.
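Limit size. Keep the primary key narrow. InnoDB copies the primary key columns into every secondary index entry, so a wide key enlarges every index on the table.
Monitor fragmentation. Check tables with heavy random inserts or deletes for fragmentation, and rebuild them (for example with OPTIMIZE TABLE) when wasted space becomes significant.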
Understand limitations. Remember that a table can only have one clustered index; plan your table design accordingly.
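
As a rough sketch of the fragmentation check, the following assumes an InnoDB table named employees in the current default database. DATA_FREE is only an approximate indicator of unused space.

```sql
-- Estimate allocated but unused space for the table.
SELECT TABLE_NAME, DATA_LENGTH, INDEX_LENGTH, DATA_FREE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'employees';

-- Rebuild the table, including its clustered index.
OPTIMIZE TABLE employees;
```

For InnoDB tables, OPTIMIZE TABLE reports that it is doing "recreate + analyze instead"; this is expected and means the table was rebuilt. The rebuild can be slow on large tables.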
