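To put it differently: given a (complex) query with JOINs, SUBSELECTs and UNIONs, is it possible (or not) to reduce it to a simpler, equivalent SQL statement that produces the same result by applying some transformation rules?
That is exactly what optimizers do for a living (not that I'm saying they always do it well).
Since SQL is a set-based language, there is usually more than one way to transform one query into another.
Like this query: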
SELECT *
FROM mytable
WHERE col1 > @value1 OR col2 < @value2
can be transformed into this:
SELECT *
FROM mytable
WHERE col1 > @value1
UNION
SELECT *
FROM mytable
WHERE col2 < @value2
or this:
SELECT mo.*
FROM (
SELECT id
FROM mytable
WHERE col1 > @value1
UNION
SELECT id
FROM mytable
WHERE col2 < @value2
) mi
JOIN mytable mo
ON mo.id = mi.id
The last form looks uglier, but it can yield a better execution plan.
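A quick way to convince yourself that the three forms agree is to run them against a toy table. The sketch below uses Python's sqlite3 module rather than any of the servers discussed here; the table contents and the values bound to :value1 and :value2 (standing in for @value1 and @value2) are made up for illustration. Note that UNION removes duplicate rows, so the equivalence relies on the rows being distinct, which the primary key id guarantees here.

```python
import sqlite3

# Hypothetical toy data; only the three queries come from the text above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER PRIMARY KEY, col1 INTEGER, col2 INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable (id, col1, col2) VALUES (?, ?, ?)",
    [(1, 10, 50), (2, 60, 5), (3, 70, 80), (4, 1, 99), (5, 90, 2)],
)
params = {"value1": 50, "value2": 10}

# Form 1: a single scan with OR.
or_rows = conn.execute(
    "SELECT * FROM mytable WHERE col1 > :value1 OR col2 < :value2", params
).fetchall()

# Form 2: one query per condition, merged with UNION.
union_rows = conn.execute(
    """
    SELECT * FROM mytable WHERE col1 > :value1
    UNION
    SELECT * FROM mytable WHERE col2 < :value2
    """,
    params,
).fetchall()

# Form 3: UNION over the ids only, then join back for the full rows.
join_rows = conn.execute(
    """
    SELECT mo.*
    FROM (
        SELECT id FROM mytable WHERE col1 > :value1
        UNION
        SELECT id FROM mytable WHERE col2 < :value2
    ) mi
    JOIN mytable mo ON mo.id = mi.id
    """,
    params,
).fetchall()

# Row order is not guaranteed, so compare sorted results.
same = sorted(or_rows) == sorted(union_rows) == sorted(join_rows)
print(same)
```

Which form is fastest depends on the engine and the available indexes, so this only checks equivalence, not performance.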
One of the most common transformations is replacing this query:
SELECT *
FROM mytable
WHERE col IN
(
SELECT othercol
FROM othertable
)
with this one:
SELECT *
FROM mytable mo
WHERE EXISTS
(
SELECT NULL
FROM othertable o
WHERE o.othercol = mo.col
)
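As a sanity check, here is a minimal sketch (Python's sqlite3, with made-up rows) showing that the IN and EXISTS forms return the same rows even when othertable contains duplicates and a NULL. This holds for IN; NOT IN and NOT EXISTS behave differently once NULLs are involved, so the same swap is not safe there.

```python
import sqlite3

# Hypothetical toy data, including a duplicate and a NULL in othertable.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE mytable (id INTEGER PRIMARY KEY, col TEXT);
    CREATE TABLE othertable (othercol TEXT);
    INSERT INTO mytable VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, NULL);
    INSERT INTO othertable VALUES ('a'), ('c'), ('c'), (NULL);
    """
)

in_rows = conn.execute(
    """
    SELECT *
    FROM mytable
    WHERE col IN (SELECT othercol FROM othertable)
    ORDER BY id
    """
).fetchall()

exists_rows = conn.execute(
    """
    SELECT *
    FROM mytable mo
    WHERE EXISTS (
        SELECT NULL
        FROM othertable o
        WHERE o.othercol = mo.col
    )
    ORDER BY mo.id
    """
).fetchall()

print(in_rows == exists_rows)
```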
In some RDBMSs (like PostgreSQL), DISTINCT and GROUP BY use different execution plans, so sometimes it is better to replace one with the other:
SELECT mo.grouper,
(
SELECT SUM(col)
FROM mytable mi
WHERE mi.grouper = mo.grouper
)
FROM (
SELECT DISTINCT grouper
FROM mytable
) mo
vs.
SELECT mo.grouper, SUM(col)
FROM mytable mo
GROUP BY
mo.grouper
In PostgreSQL, DISTINCT sorts and GROUP BY hashes.
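The two formulations can be checked against each other on a small table. This is only a sketch using Python's sqlite3 with invented rows; it says nothing about which plan PostgreSQL picks, only that the results match. The GROUP BY form declares the alias mo on mytable so that mo.grouper resolves.

```python
import sqlite3

# Hypothetical toy data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (grouper TEXT, col INTEGER)")
conn.executemany(
    "INSERT INTO mytable (grouper, col) VALUES (?, ?)",
    [("a", 1), ("a", 2), ("b", 5), ("c", 7), ("c", 3)],
)

# DISTINCT on the grouping column plus a correlated SUM per group.
distinct_rows = conn.execute(
    """
    SELECT mo.grouper,
           (
               SELECT SUM(col)
               FROM mytable mi
               WHERE mi.grouper = mo.grouper
           )
    FROM (
        SELECT DISTINCT grouper
        FROM mytable
    ) mo
    ORDER BY mo.grouper
    """
).fetchall()

# Plain GROUP BY.
group_rows = conn.execute(
    """
    SELECT mo.grouper, SUM(col)
    FROM mytable mo
    GROUP BY mo.grouper
    ORDER BY mo.grouper
    """
).fetchall()

print(distinct_rows == group_rows)
```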
MySQL lacks FULL OUTER JOIN, so it can be rewritten as follows:
SELECT t1.col1, t2.col2
FROM table1 t1
FULL OUTER JOIN
table2 t2
ON t1.id = t2.id
vs.
SELECT t1.col1, t2.col2
FROM table1 t1
LEFT JOIN
table2 t2
ON t1.id = t2.id
UNION ALL
SELECT NULL, t2.col2
FROM table1 t1
RIGHT JOIN
table2 t2
ON t1.id = t2.id
WHERE t1.id IS NULL
However, see the article in my blog on how to do this more efficiently in MySQL.
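To illustrate the emulation, here is a small sketch in Python's sqlite3 with invented rows. One deliberate deviation: the second branch is written as a LEFT JOIN with the tables swapped instead of a RIGHT JOIN, because older SQLite builds do not support RIGHT JOIN; the two are equivalent. The expected result is written out by hand as what a full outer join would return for this data.

```python
import sqlite3

# Hypothetical toy data: ids 2 and 3 match, 1 and 4 are unmatched.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, col1 TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, col2 TEXT);
    INSERT INTO table1 VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
    INSERT INTO table2 VALUES (2, 'b2'), (3, 'b3'), (4, 'b4');
    """
)

emulated = conn.execute(
    """
    -- All rows of table1, with matches from table2 where they exist.
    SELECT t1.col1, t2.col2
    FROM table1 t1
    LEFT JOIN table2 t2
      ON t1.id = t2.id
    UNION ALL
    -- Rows of table2 with no match in table1 (the anti-join half).
    SELECT NULL, t2.col2
    FROM table2 t2
    LEFT JOIN table1 t1
      ON t1.id = t2.id
    WHERE t1.id IS NULL
    """
).fetchall()

# What a FULL OUTER JOIN on id would produce for this data.
expected = {("a1", None), ("a2", "b2"), ("a3", "b3"), (None, "b4")}
print(set(emulated) == expected)
```

UNION ALL is safe here because the WHERE t1.id IS NULL filter keeps the second branch from repeating any matched row.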
This hierarchical query in Oracle:
SELECT DISTINCT(animal_id) AS animal_id
FROM animal
START WITH
animal_id = :id
CONNECT BY
PRIOR animal_id IN (father, mother)
ORDER BY
animal_id
can be transformed into this:
SELECT DISTINCT(animal_id) AS animal_id
FROM (
SELECT 0 AS gender, animal_id, father AS parent
FROM animal
UNION ALL
SELECT 1, animal_id, mother
FROM animal
)
START WITH
animal_id = :id
CONNECT BY
parent = PRIOR animal_id
ORDER BY
animal_id
The latter performs better.
To find all ranges that overlap a given range, you can use the following query:
SELECT *
FROM ranges
WHERE end_date >= @start
AND start_date <= @end
but in SQL Server this more complex query yields the same results faster:
SELECT *
FROM ranges
WHERE (start_date > @start AND start_date <= @end)
OR (@start BETWEEN start_date AND end_date)
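The two predicates are equivalent as long as every row has start_date <= end_date and the search window itself is well formed (@start <= @end). A minimal sketch in Python's sqlite3 checks this on invented data, including the boundary cases where a range merely touches the window. Dates are stored as ISO strings so that plain comparison works, and :range_start and :range_end are stand-in names for @start and @end.

```python
import sqlite3

# Hypothetical toy data; ISO date strings compare correctly as text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ranges (id INTEGER PRIMARY KEY, start_date TEXT, end_date TEXT)"
)
conn.executemany(
    "INSERT INTO ranges (id, start_date, end_date) VALUES (?, ?, ?)",
    [
        (1, "2024-01-01", "2024-01-10"),  # entirely before the window
        (2, "2024-01-05", "2024-01-15"),  # ends exactly at the window start
        (3, "2024-01-16", "2024-01-18"),  # inside the window
        (4, "2024-01-10", "2024-01-30"),  # covers the window
        (5, "2024-01-20", "2024-01-25"),  # starts exactly at the window end
        (6, "2024-01-21", "2024-01-22"),  # entirely after the window
    ],
)
window = {"range_start": "2024-01-15", "range_end": "2024-01-20"}

simple_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT id
        FROM ranges
        WHERE end_date >= :range_start
          AND start_date <= :range_end
        ORDER BY id
        """,
        window,
    )
]

rewritten_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT id
        FROM ranges
        WHERE (start_date > :range_start AND start_date <= :range_end)
           OR (:range_start BETWEEN start_date AND end_date)
        ORDER BY id
        """,
        window,
    )
]

print(simple_ids == rewritten_ids)
```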
SQL Server also lacks an efficient way to do cumulative aggregates, so this query:
SELECT mi.id, SUM(mo.value) AS running_sum
FROM mytable mi
JOIN mytable mo
ON mo.id <= mi.id
GROUP BY
mi.id
can be rewritten more efficiently using, Lord help me, cursors (you heard me right: cursors, more efficiently and SQL Server in one sentence).
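The reason a cursor can win is that the self-join compares every row with all earlier rows, while a cursor walks the rows once in id order and carries the total along. The sketch below is not T-SQL: it uses Python's sqlite3 with invented data, and a plain ordered loop (itertools.accumulate) stands in for the cursor, just to show that both approaches produce the same running sums.

```python
import sqlite3
from itertools import accumulate

# Hypothetical toy data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, value INTEGER)")
conn.executemany(
    "INSERT INTO mytable (id, value) VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 5), (4, 7)],
)

# Self-join version: each row is paired with every row up to it.
join_rows = conn.execute(
    """
    SELECT mi.id, SUM(mo.value) AS running_sum
    FROM mytable mi
    JOIN mytable mo
      ON mo.id <= mi.id
    GROUP BY mi.id
    ORDER BY mi.id
    """
).fetchall()

# Cursor-style version: one ordered pass, carrying the total forward.
ordered = conn.execute("SELECT id, value FROM mytable ORDER BY id").fetchall()
ids = [row[0] for row in ordered]
totals = accumulate(row[1] for row in ordered)
cursor_rows = list(zip(ids, totals))

print(join_rows == cursor_rows)
```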
There is a kind of query commonly met in financial applications that looks up the effective rate for a currency, like this one in Oracle:
SELECT TO_CHAR(SUM(xac_amount * rte_rate), 'FM999G999G999G999G999G999D999999')
FROM t_transaction x
JOIN t_rate r
ON (rte_currency, rte_date) IN
(
SELECT xac_currency, MAX(rte_date)
FROM t_rate
WHERE rte_currency = xac_currency
AND rte_date <= xac_date
)
This query can be heavily rewritten to use an equality condition, which allows a HASH JOIN instead of NESTED LOOPS:
WITH v_rate AS
(
SELECT cur_id AS eff_currency, dte_date AS eff_date, rte_rate AS eff_rate
FROM (
SELECT cur_id, dte_date,
(
SELECT MAX(rte_date)
FROM t_rate ri
WHERE rte_currency = cur_id
AND rte_date <= dte_date
) AS rte_effdate
FROM (
SELECT (
SELECT MAX(rte_date)
FROM t_rate
) - level + 1 AS dte_date
FROM dual
CONNECT BY
level <=
(
SELECT MAX(rte_date) - MIN(rte_date)
FROM t_rate
)
) v_date,
(
SELECT 1 AS cur_id
FROM dual
UNION ALL
SELECT 2 AS cur_id
FROM dual
) v_currency
) v_eff
LEFT JOIN
t_rate
ON rte_currency = cur_id
AND rte_date = rte_effdate
)
SELECT TO_CHAR(SUM(xac_amount * eff_rate), 'FM999G999G999G999G999G999D999999')
FROM (
SELECT xac_currency, TRUNC(xac_date) AS xac_date, SUM(xac_amount) AS xac_amount, COUNT(*) AS cnt
FROM t_transaction x
GROUP BY
xac_currency, TRUNC(xac_date)
)
JOIN v_rate
ON eff_currency = xac_currency
AND eff_date = xac_date
Despite being bulky as hell, the latter query is 6 times faster.
The main idea here is replacing <= with =, which requires building an in-memory calendar table to JOIN with.
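The same idea can be shown at a much smaller scale. The sketch below is a simplified stand-in for the Oracle queries, not a port of them: it runs on Python's sqlite3, uses integer day numbers instead of real dates, builds the calendar with a recursive CTE instead of CONNECT BY, and picks the effective rate with ORDER BY ... LIMIT 1. All rows are invented. The point is only that once every (currency, day) pair has its effective rate precomputed, the final join needs nothing but equality.

```python
import sqlite3

# Hypothetical toy data; dates are integer day numbers for simplicity.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t_rate (rte_currency INTEGER, rte_date INTEGER, rte_rate INTEGER);
    CREATE TABLE t_transaction (xac_currency INTEGER, xac_date INTEGER, xac_amount INTEGER);
    INSERT INTO t_rate VALUES (1, 1, 2), (1, 5, 3), (2, 2, 10), (2, 6, 20);
    INSERT INTO t_transaction VALUES
        (1, 1, 100), (1, 4, 50), (1, 5, 10), (1, 7, 1),
        (2, 3, 5), (2, 6, 2), (2, 6, 3);
    """
)

# Direct version: each transaction searches for its latest rate with <=.
direct_total = conn.execute(
    """
    SELECT SUM(x.xac_amount * r.rte_rate)
    FROM t_transaction x
    JOIN t_rate r
      ON r.rte_currency = x.xac_currency
     AND r.rte_date = (
            SELECT MAX(ri.rte_date)
            FROM t_rate ri
            WHERE ri.rte_currency = x.xac_currency
              AND ri.rte_date <= x.xac_date
         )
    """
).fetchone()[0]

# Bounds of the calendar: first known rate day to last transaction day.
first_day, last_day = conn.execute(
    """
    SELECT (SELECT MIN(rte_date) FROM t_rate),
           (SELECT MAX(xac_date) FROM t_transaction)
    """
).fetchone()

# Calendar version: build one row per (currency, day) with its effective
# rate, pre-aggregate transactions per day, then join on equality only.
calendar_total = conn.execute(
    """
    WITH RECURSIVE v_date(dte_date) AS (
        SELECT :first_day
        UNION ALL
        SELECT dte_date + 1 FROM v_date WHERE dte_date < :last_day
    ),
    v_rate AS (
        SELECT c.cur_id AS eff_currency,
               d.dte_date AS eff_date,
               (
                   SELECT r.rte_rate
                   FROM t_rate r
                   WHERE r.rte_currency = c.cur_id
                     AND r.rte_date <= d.dte_date
                   ORDER BY r.rte_date DESC
                   LIMIT 1
               ) AS eff_rate
        FROM v_date d
        CROSS JOIN (SELECT DISTINCT rte_currency AS cur_id FROM t_rate) c
    )
    SELECT SUM(x.xac_amount * v.eff_rate)
    FROM (
        SELECT xac_currency, xac_date, SUM(xac_amount) AS xac_amount
        FROM t_transaction
        GROUP BY xac_currency, xac_date
    ) x
    JOIN v_rate v
      ON v.eff_currency = x.xac_currency
     AND v.eff_date = x.xac_date
    """,
    {"first_day": first_day, "last_day": last_day},
).fetchone()[0]

print(direct_total == calendar_total)
```

The range lookup has not disappeared; it has moved into the calendar, where it runs once per (currency, day) instead of once per transaction. That trade pays off when transactions far outnumber calendar days.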