MySQL DISTINCT ON Clauses
MySQL does not support the `DISTINCT ON` clause found in some other SQL databases, such as PostgreSQL. MySQL's `DISTINCT` keyword returns unique rows by removing duplicates across the selected columns, and window functions or subqueries can reproduce the one-row-per-group behavior of `DISTINCT ON`.
Usage
In MySQL, `DISTINCT` eliminates duplicate rows from the result set, so each combination of selected column values appears only once. `DISTINCT` applies to the whole select list, not to individual columns: two rows count as duplicates only if they match in every selected column.
SELECT DISTINCT column1, column2, ...
FROM table_name;
This syntax returns only unique sets of values for the specified columns from the table.
Examples
1. Basic Distinct Usage
SELECT DISTINCT city
FROM customers;
This query retrieves a list of unique city names from the `customers` table, removing any duplicates.
2. Multiple Columns
SELECT DISTINCT city, country
FROM customers;
Here, the query returns distinct combinations of `city` and `country` from the `customers` table, ensuring no duplicates for the pair.
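The difference between one-column and multi-column `DISTINCT` is easier to see with a few rows of data. The sketch below uses a hypothetical, minimal `customers` table (the column types and sample rows are assumptions made for illustration, not part of the original examples):

```sql
-- Hypothetical sample data, for illustration only.
CREATE TABLE customers (
    id      INT PRIMARY KEY,
    city    VARCHAR(50),
    country VARCHAR(50)
);

INSERT INTO customers (id, city, country) VALUES
    (1, 'Paris',  'France'),
    (2, 'Paris',  'France'),
    (3, 'Paris',  'USA'),
    (4, 'Austin', 'USA');

-- One column: the three 'Paris' rows collapse into one.
SELECT DISTINCT city
FROM customers
ORDER BY city;
-- Austin
-- Paris

-- Two columns: only the exact duplicate (Paris, France) is removed.
SELECT DISTINCT city, country
FROM customers
ORDER BY city, country;
-- Austin | USA
-- Paris  | France
-- Paris  | USA
```

Adding `country` to the select list increases the number of rows returned, because `Paris, France` and `Paris, USA` are different combinations.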
3. Simulating DISTINCT ON with Subquery
SELECT *
FROM (
    SELECT id, name, category,
           ROW_NUMBER() OVER (PARTITION BY category ORDER BY id) AS rn
    FROM products
) subquery
WHERE rn = 1;
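This example simulates `DISTINCT ON` with a subquery and `ROW_NUMBER()`. The window function numbers the rows within each category by `id`, and the outer query keeps only the first row (the lowest `id`) of each category from the `products` table. Window functions require MySQL 8.0 or later.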
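For comparison, the PostgreSQL statement that the `ROW_NUMBER()` query above roughly corresponds to might look like the following. It is included only to show what is being simulated, and it assumes the same `products` table with `id`, `name`, and `category` columns:

```sql
-- PostgreSQL only: this statement is a syntax error in MySQL.
-- Keeps the first row of each category, where "first" is defined by ORDER BY.
SELECT DISTINCT ON (category) id, name, category
FROM products
ORDER BY category, id;
```

In PostgreSQL, the `DISTINCT ON` expressions must match the leftmost `ORDER BY` expressions. In the MySQL version, `PARTITION BY category` plays the role of `DISTINCT ON (category)`, and the `ORDER BY id` inside `OVER (...)` decides which row counts as first.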
4. Highest-Priced Product per Category
SELECT *
FROM (
    SELECT id, name, price, category,
           ROW_NUMBER() OVER (PARTITION BY category ORDER BY price DESC) AS rn
    FROM products
) subquery
WHERE rn = 1;
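This query retrieves the product with the highest price within each category, showing that the `ORDER BY` inside the window controls which row is kept. If several products tie for the highest price, `ROW_NUMBER()` still keeps only one of them, and which one is not guaranteed.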
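When two products in a category share the highest price, the result of the query above depends on how the tie is broken. The sketch below shows two possible ways to make that choice explicit. It assumes MySQL 8.0 or later and a `products` table with `id`, `name`, `price`, and `category` columns; the aliases `ranked`, `rn`, and `rnk` are arbitrary names chosen for illustration:

```sql
-- Option 1: exactly one row per category; ties on price go to the lowest id.
SELECT id, name, price, category
FROM (
    SELECT id, name, price, category,
           ROW_NUMBER() OVER (PARTITION BY category ORDER BY price DESC, id) AS rn
    FROM products
) ranked
WHERE rn = 1;

-- Option 2: keep every product tied for the highest price in its category.
SELECT id, name, price, category
FROM (
    SELECT id, name, price, category,
           RANK() OVER (PARTITION BY category ORDER BY price DESC) AS rnk
    FROM products
) ranked
WHERE rnk = 1;
```

Option 1 matches the one-row-per-group behavior of `DISTINCT ON`. Option 2 can return more than one row per category, which may be preferable when tied rows are equally valid answers.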
Tips and Best Practices
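- Limit the columns used with `DISTINCT`. Each added column creates more possible combinations, so fewer rows are treated as duplicates and the result set grows.
- Combine with `ORDER BY` for predictable results. `DISTINCT` does not guarantee the order of the returned rows, so add `ORDER BY` when order matters. In window-function queries, the `ORDER BY` inside `OVER (...)` determines which row is kept. For example, to sort unique city and country pairs: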
SELECT DISTINCT city, country FROM customers ORDER BY country, city;
- Consider performance implications. `DISTINCT` can be resource-intensive on large datasets because MySQL has to compare rows to find duplicates, which may require a temporary table or a sort when no suitable index is available. On a table with millions of rows, `SELECT DISTINCT` can slow a query noticeably, so use it only when duplicates are actually possible.
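- Simulate `DISTINCT ON` with window functions or subqueries. Window functions such as `ROW_NUMBER()` (available from MySQL 8.0) mimic `DISTINCT ON` in MySQL, as shown in the examples above. On earlier versions, which lack window functions, a join against a grouped subquery can serve the same purpose.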
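On MySQL versions without window functions, one possible subquery-based approach is to join the table against a grouped subquery. The sketch below returns the row with the lowest `id` in each category. It assumes `id` is unique in `products` (for example, a primary key); the names `firsts` and `first_id` are hypothetical aliases:

```sql
-- Works without window functions. Assumes id is unique within products.
SELECT p.id, p.name, p.category
FROM products AS p
JOIN (
    -- One row per category, carrying the lowest id found in that category.
    SELECT category, MIN(id) AS first_id
    FROM products
    GROUP BY category
) AS firsts
  ON p.id = firsts.first_id;
```

This returns the same rows as the `ROW_NUMBER() OVER (PARTITION BY category ORDER BY id)` pattern. It is less flexible, though: picking the row by a non-unique column such as `price` would need an extra tiebreaker to avoid returning several rows per category.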