In a SQL database query, a correlated subquery (also known as a synchronized subquery) is a subquery (a query nested inside another query) that uses values from the outer query. Because the subquery may be evaluated once for each row processed by the outer query, it can be slow.

Here is a typical example of a correlated subquery. The objective is to find all employees whose salary is above the average for their department.

    SELECT employee_number, name
      FROM employees emp
      WHERE salary > (
        SELECT AVG(salary)
          FROM employees
          WHERE department = emp.department);

In the above query the outer query is

    SELECT employee_number, name
      FROM employees emp
      WHERE salary > ...

and the inner query (the correlated subquery) is

    SELECT AVG(salary)
      FROM employees
      WHERE department = emp.department

In the above nested query the inner query has to be re-executed for each employee. (A sufficiently smart implementation may cache the inner query's result on a department-by-department basis, but even in the best case the inner query must be executed once per department.)

Correlated subqueries may also appear outside the WHERE clause. For example, the following query uses a correlated subquery in the SELECT clause to print the entire list of employees alongside the average salary for each employee's department. Again, because the subquery is correlated with a column of the outer query, it may have to be re-executed for each row of the result.

    SELECT employee_number, name,
        (SELECT AVG(salary)
           FROM employees
           WHERE department = emp.department) AS department_average
      FROM employees emp
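The two queries above can be tried out on a small sample table. The following is a minimal sketch, assuming Python's standard `sqlite3` module and an in-memory SQLite database; the table contents (names, departments, salaries) are made up for illustration, and the `ORDER BY` clauses are an addition that only makes the output order deterministic.

```python
import sqlite3

# Hypothetical sample data; only the column names come from the queries above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    " employee_number INTEGER PRIMARY KEY,"
    " name TEXT, department TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [
        (1, "Ann", "Sales", 50000),
        (2, "Bob", "Sales", 70000),
        (3, "Cid", "Engineering", 90000),
        (4, "Dee", "Engineering", 110000),
        (5, "Eve", "Engineering", 100000),
    ],
)

# Correlated subquery in the WHERE clause: employees paid above
# the average of their own department.
above_average = conn.execute("""
    SELECT employee_number, name
      FROM employees emp
      WHERE salary > (
        SELECT AVG(salary)
          FROM employees
          WHERE department = emp.department)
      ORDER BY employee_number
""").fetchall()

# Correlated subquery in the SELECT clause: every employee together
# with the average salary of that employee's department.
with_average = conn.execute("""
    SELECT employee_number, name,
        (SELECT AVG(salary)
           FROM employees
           WHERE department = emp.department) AS department_average
      FROM employees emp
      ORDER BY employee_number
""").fetchall()

print(above_average)
for row in with_average:
    print(row)
```

With this sample data the Sales average is 60000 and the Engineering average is 100000, so only Bob and Dee are returned by the first query; Eve earns exactly the average and is excluded by the strict `>` comparison.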


Correlated subqueries in the FROM clause

It is generally meaningless to have a correlated subquery in the FROM clause because the table in the FROM clause is needed to evaluate the outer query, but the correlated subquery in the FROM clause can't be evaluated before the outer query is evaluated, causing a chicken-and-egg problem. Specifically, MariaDB lists this as a limitation in its documentation. However, some database systems allow a correlated subquery in a join within the FROM clause: a specified keyword lets the subquery reference the tables listed before the join, so that the subquery produces a number of rows for each row of the table on its left and those rows are joined to it. For example, in PostgreSQL, adding the keyword LATERAL before the right-hand subquery, or in Microsoft SQL Server, using the keyword CROSS APPLY or OUTER APPLY instead of JOIN, achieves the effect.
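The row-by-row behaviour of such a join can be illustrated with a small emulation. This is only a sketch under stated assumptions: it runs on SQLite through Python's `sqlite3` module rather than on PostgreSQL or SQL Server, so the per-row evaluation is written as an explicit Python loop. The `departments` table, the sample rows, the "top two salaries" task, and the helper `lateral_join` are all hypothetical. The PostgreSQL statements are kept as strings for reference and are not executed.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (name TEXT PRIMARY KEY);
    CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER);
    INSERT INTO departments VALUES ('Engineering'), ('Legal'), ('Sales');
    INSERT INTO employees VALUES
        ('Ann', 'Sales', 50000), ('Bob', 'Sales', 70000),
        ('Gil', 'Sales', 60000), ('Cid', 'Engineering', 90000),
        ('Dee', 'Engineering', 110000);
""")

# PostgreSQL forms being emulated (reference only, not run here).
# In SQL Server, CROSS APPLY plays the role of the first form and
# OUTER APPLY the role of the second.
POSTGRES_CROSS = """
SELECT d.name, top.name, top.salary
  FROM departments d
  CROSS JOIN LATERAL (
    SELECT e.name, e.salary FROM employees e
      WHERE e.department = d.name
      ORDER BY e.salary DESC LIMIT 2) AS top
"""
POSTGRES_OUTER = POSTGRES_CROSS.replace(
    "CROSS JOIN LATERAL", "LEFT JOIN LATERAL") + "  ON true\n"


def lateral_join(conn, keep_unmatched=False):
    """Evaluate the right-hand subquery once per left-hand row."""
    rows = []
    left = conn.execute("SELECT name FROM departments ORDER BY name").fetchall()
    for (dept,) in left:
        # The subquery sees the current left-hand row through the parameter.
        right = conn.execute(
            "SELECT name, salary FROM employees"
            " WHERE department = ? ORDER BY salary DESC LIMIT 2",
            (dept,),
        ).fetchall()
        if right:
            rows.extend((dept, name, salary) for name, salary in right)
        elif keep_unmatched:
            # Outer variant: keep the left row, padded with NULLs.
            rows.append((dept, None, None))
    return rows


cross_rows = lateral_join(conn)                       # like CROSS APPLY
outer_rows = lateral_join(conn, keep_unmatched=True)  # like OUTER APPLY
print(cross_rows)
print(outer_rows)
```

In this sample data the Legal department has no employees, so it disappears from the cross variant and appears once, padded with `None`, in the outer variant.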


Computation of correlated subqueries

A commonly used computational method for a correlated subquery is to unnest it into an equivalent flat query. Algorithms developed in this direction have the advantage of low complexity. Because this is a customized approach, however, existing database systems cannot unnest arbitrary correlated subqueries by following a fixed set of general rules. In addition, this approach requires considerable engineering effort to implement unnesting algorithms in a database engine.

A general computational approach is to execute the nested loop directly, iterating over all tuples of the correlated columns from the outer query block and executing the subquery as many times as there are outer-loop tuples. This simple approach has the advantage of being general-purpose, because it is not affected by the type of correlated operators or subquery structures. However, it has high computational complexity. GPU acceleration has been used to significantly improve the performance of the nested method despite its high algorithmic complexity, by exploiting massive parallelism and device memory locality on the GPU. This achieves both a general-purpose software design and implementation and high performance in subquery processing.
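The two strategies can be compared on the above-average-salary query. The following is a minimal sketch, assuming Python's `sqlite3` module and made-up sample data. The flat query shown is one hand-written unnesting (a join against a grouped derived table), not the output of any particular optimizer, and `nested_loop` is a hypothetical helper that spells out the naive per-tuple execution.

```python
import sqlite3

# Hypothetical sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    " employee_number INTEGER PRIMARY KEY,"
    " name TEXT, department TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [
        (1, "Ann", "Sales", 50000),
        (2, "Bob", "Sales", 70000),
        (3, "Cid", "Engineering", 90000),
        (4, "Dee", "Engineering", 110000),
        (5, "Eve", "Engineering", 100000),
    ],
)

CORRELATED = """
SELECT employee_number
  FROM employees emp
  WHERE salary > (SELECT AVG(salary) FROM employees
                    WHERE department = emp.department)
  ORDER BY employee_number
"""

# Unnested form: compute every department average once, then join.
UNNESTED = """
SELECT emp.employee_number
  FROM employees emp
  JOIN (SELECT department, AVG(salary) AS avg_salary
          FROM employees
          GROUP BY department) dept_avg
    ON dept_avg.department = emp.department
  WHERE emp.salary > dept_avg.avg_salary
  ORDER BY emp.employee_number
"""


def nested_loop(conn):
    """Run the inner query once per outer tuple; count the executions."""
    result, inner_runs = [], 0
    outer = conn.execute(
        "SELECT employee_number, department, salary"
        " FROM employees ORDER BY employee_number"
    ).fetchall()
    for number, department, salary in outer:
        (average,) = conn.execute(
            "SELECT AVG(salary) FROM employees WHERE department = ?",
            (department,),
        ).fetchone()
        inner_runs += 1
        if salary > average:
            result.append(number)
    return result, inner_runs


correlated = [r[0] for r in conn.execute(CORRELATED)]
unnested = [r[0] for r in conn.execute(UNNESTED)]
looped, inner_runs = nested_loop(conn)
print(correlated, unnested, looped, inner_runs)
```

All three evaluations select the same employees. The nested loop issues one inner query per employee row (five here), while the unnested form computes the averages in a single grouped pass, which is the cost difference the text describes.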


