The SELECT DISTINCT statement in SQL is used to return only unique values in the result set of a query. It eliminates duplicate rows from the result set. The
DISTINCT keyword is used in the
SELECT statement to accomplish this.
Here is the basic syntax of the
SELECT DISTINCT statement:
SELECT DISTINCT column1, column2 FROM table_name;
If you have a table called
employees with columns
age, you could use the following query to select all the unique first names in the table:
SELECT DISTINCT first_name FROM employees;
This will return a list of all the distinct first names in the "employees" table, without any duplicates.
You could also use multiple columns in the
SELECT DISTINCT statement. For example, the following query will return a list of all the unique first name and last name combinations in the
SELECT DISTINCT first_name, last_name FROM employees;
You can use DISTINCT in combination with other clauses such as
ORDER BY, for example:
SELECT DISTINCT first_name, last_name FROM employees WHERE age > 25 ORDER BY last_name;
It would select distinct
last_name combinations from the table
employees, filtering only the rows where
age is greater than 25, ordering the results by
There are a few best practices to keep in mind when using the SELECT DISTINCT statement in SQL:
DISTINCTonly when necessary: The
DISTINCTkeyword is used to eliminate duplicate rows from the result set, but it can also make a query slower by increasing the amount of work that the database server needs to do. So, you should use it only when it is really needed.
- Use the right column or columns: If you are only interested in unique values of a single column, you should use that column in the
DISTINCTclause. If you want to see unique combinations of multiple columns, you should use those columns in the DISTINCT clause.
- Use indexing: If you are using a
DISTINCTclause on a column that is not indexed, it can make the query slow. Therefore, it is a good idea to make sure that the column or columns you are using in the
DISTINCTclause are indexed.
- Avoid using
DISTINCTwith aggregate functions: Aggregate functions such as
MINalready return a single value for each group of rows. Therefore, it is not necessary to use
- Be aware of sorting: When using the DISTINCT clause, the order of the rows in the result set is undefined. If the order is important, you should use the
GROUP BYclause instead of the
- Keep an eye on performance:
DISTINCTstatements can be heavy in terms of performance, since it is required to look at all rows and compare the data to discard duplicates. If it is an option, is suggested to use
LIMITand/or to combine it with index-backed queries to achieve better performance.