Correlated and NonCorrelated Subqueries (best practices)


Do we know what is correlated and noncorrelated subqueries?

As the names signify : the inner subquery is dependent on outer query or not.Lets look at 2 queries below:

1)  SELECT Continent, Name, Population FROM Country c WHERE Population =
(SELECT MAX(Population) FROM Country c2 WHERE c.Continent=c2.Continent
AND Population > 0 );

2) SELECT company FROM stock WHERE listed_on_exchange = (SELECT ric FROM
market WHERE country=’Japan’);

If we look into subquery part of both example in 1st example subquery can’t be executed alone (reference of outer query used in inner query “c.Continent=c2.Continent”) but in 2nd example subquery can. So 1st one is correlated and 2nd one is noncorrelated.

Some Factors-
If we use “EXPLAIN” in a noncorrelated subquery in most of the cases it will show
“DEPENDENT SUBQUERY”. Which signifies MySQL (especially before 5.6) treats nearly
all subqueries as correlated, So if you use a NONcorrelated subquery, you may get bad
performance as the entire subquery
is evaluated every time it’s visited.

If you have to use a NONcorrelated subquery (ie, if you are using the results in a group-by or limit in a join, etc), you want to use a form like:

select <stuff>
FROM table1 t,
(select group by ) subq

WHERE subq.colval = t.colval;

MySQL can handle a FROM-list noncorrelated subquery reasonably well, but beware of
other forms of noncorrelated subqueries, particularly this one (which is now in the “MySQL
Worst Practices” office wiki):

select <whatever you want>
FROM table AS t1
WHERE t1.colval IN (SELECT t2.col1 FROM table AS t2 GROUP BY );

Its almost impossible to “convince” the MySQL optimizer to behave differently.
The FROM-list form will evaluate the subquery once, while the WHERE form will evaluate the subquery for every row in table1.

According to mysql official site “Rewriting the query as a join might improve performance” if possible.

Hope it will help to smile 🙂

Advertisements