MySQL Query Optimization Tips

Common SQL Optimization

Select only need columns

Convert a complex SQL statement to multiple simple SQL statements

Convert queries to use indexes. Using indexes operators: =, >, <, >=, <=, BETWEEN, IN. Not using indexes operators: LIKE, <>, NOT IN. Using MySQL built-in functions on column make query cannot use index.

NOT IN

Using LEFT JOIN or NOT EXISTS to replace NOT IN

A NOT IN query example

select * 
from table_a
where table_a.id is not in (
select aid from table_b where table_b.status = 1)
-- or
select *
from table_a
where table_a.id is not in (1,2,3, ...)

Use NOT EXISTS

select * 
from table_a
where not exists (
select * from table_b where table_a.id = table_b.aid and table_b.status = 1
)

Use LEFT JOIN

select *
from table_a
left join table_b on table_a.id=table_b.aid and table_b.status = 1
where table_b.id is null;

LIKE

Using Full-text search MATCH ... AGAINST... or search engine replace LIKE

OR

Using IN, or UNION, or AND NOT to replace OR

ORs are notoriously bad performers, because it splinters the execution path. The UNION alleviates that splintering, and the combines the two results sets. That said, IN is preferable to ORs because though being logically the same, the execution of IN is generally more optimized.

The OR query example 1:

select *
from {table_name}
where name = "Tom" or name = "Jack"

Using IN

select *
from {table_name}
where name in ("Tom", "Jack")

The OR query example 2:

select *
from {table_name}
where name like "T%" or name = "J%"

Using UNION

select *
from {table_name}
where name like "T%"
union
select *
from {table_name}
where name like "J%"

The OR query example 3:

select *
from {table_name}
where {col1} = "a" or {col2} = "b"

Using AND NOT

select *
from {table_name}
where {col1} = "a" and {col2} != "b";
select *
from {table_name}
where {col2} = "b";

Built-in Functions

Avoid using functions in predicates

Use BETWEEN to replace DATE(), DATE_FORMAT()

A DATE_FORMAT example:

DATE_FORMAT(`date`, "%Y-%m") = "2022-07"
YEAR(`date`) = "2022" and MONTH(`date`) = "7"

Use BETWEEN

`date` BETWEEN '2022-07-01' and '2022-07-31'

Convert function to use on right side

WHERE f(g(h(column))) > value
WHERE column > H(G(F(value)))

Page Query

Basic Optimization

  • Query only the necessary columns instead of select *
  • Query part of the TEXT column by SUBSTR(column, start, length) if it could be.

High Offset Page Query Improvement

Late row lookups: Inner join ids instead of select *

  • Inner join is first to find offset + size rows id, then find all rows data by id set
  • select * is just to find offset + size rows data.

select *

select *
from {table_name} as a
where ...
limit {x}, 10

Inner join ids

select *
from {table_name} as a
inner join (
select id
from {table_name}
where ...
order by id
limit {x}, 10
) as b on a.id=b.id;

In most cases, the inner join has a 5x performance improvement, but in some cases there is no difference in performance.

The performance gap is more evident for large row-width tables and queries with large offset.

Next Page Query Improvement

Specify start row id for next page query

first page query

select *
from {table_name} as a
inner join (
select id
from {table_name}
where ...
order by id
limit x, 10
) as b on a.id=b.id;

next page query

select *
from {table_name} as a
inner join (
select id
from {table_name}
where id > {last_page_max_id} and ...
order by id
limit 10
) as b on a.id=b.id;

Cache Page Query

  • Caching total row number for the same condition queries.
  • Caching every page start row id value and end row id value for the same condition queries.

References

General

Optimizing SELECT Statements - MySQL 8.0 Reference Manual

High Offset

MySQL ORDER BY / LIMIT performance: late row lookups

OR

Optimize MySQL COUNT (*) query