MySQL Optimization in Practice: Index What You Query, Not What You Store (Hands-On)

Most MySQL “performance problems” aren’t solved by throwing more CPU at the database—they’re solved by making queries predictable and index-friendly. This guide walks through a practical workflow junior and mid-level developers can apply today: measure, inspect with EXPLAIN, add the right indexes, and verify improvements. All examples work on MySQL 8+ with InnoDB.

1) Start With a Realistic Table and a Slow Query

Let’s imagine an e-commerce-ish table of orders. This structure is common, and it’s easy to accidentally query it in expensive ways.

CREATE TABLE orders (
  id BIGINT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
  user_id BIGINT UNSIGNED NOT NULL,
  status VARCHAR(32) NOT NULL,
  total_cents INT UNSIGNED NOT NULL,
  created_at DATETIME NOT NULL,
  updated_at DATETIME NOT NULL,
  KEY idx_user_id (user_id) -- we start with a basic index
) ENGINE=InnoDB;

A typical endpoint: “Show a user’s most recent paid orders, newest first, paginated.”

SELECT id, total_cents, created_at
FROM orders
WHERE user_id = 42 AND status = 'paid'
ORDER BY created_at DESC
LIMIT 20 OFFSET 0;

This looks harmless, but if your dataset grows, MySQL may scan too many rows unless you help it.

2) Use EXPLAIN to See What MySQL Plans to Do

EXPLAIN is your first debugging tool. Run it before you “optimize” anything.

EXPLAIN SELECT id, total_cents, created_at
FROM orders
WHERE user_id = 42 AND status = 'paid'
ORDER BY created_at DESC
LIMIT 20;

You’ll often see signs like:

  • type = ref or range (good-ish), but rows is huge (bad)
  • Using where; Using filesort (sorting can become expensive)
  • key chosen is not what you expect

Using filesort doesn’t always mean “slow,” but it often means “the index can’t serve the ORDER BY,” so MySQL sorts a big intermediate set.
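If you review plans often, you can lint EXPLAIN output in code instead of eyeballing it. A minimal sketch of that idea—`explain_red_flags` and its 100k-row threshold are my own invention for illustration, not a MySQL API:

```python
# Flag the warning signs listed above in a single EXPLAIN row.
# The row is a dict keyed by EXPLAIN's column names; the 100k-row
# threshold is an arbitrary assumption, not a MySQL rule.
def explain_red_flags(row, rows_threshold=100_000):
    flags = []
    if row.get("rows", 0) >= rows_threshold:
        flags.append(f"high rows estimate ({row['rows']})")
    extra = row.get("Extra") or ""
    if "Using filesort" in extra:
        flags.append("Using filesort (index can't serve ORDER BY)")
    if "Using temporary" in extra:
        flags.append("Using temporary (intermediate materialization)")
    return flags

# Example with a plan row shaped like what you'd fetch from EXPLAIN:
plan = {"type": "ref", "key": "idx_user_id", "rows": 250_000,
        "Extra": "Using where; Using filesort"}
for flag in explain_red_flags(plan):
    print("-", flag)
```

Feed it rows fetched from `EXPLAIN ...` and it turns a manual plan review into a repeatable check.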

3) Add a Composite Index That Matches Your WHERE + ORDER BY

Indexes work best when they match how you filter and sort. For our query:

  • Filter: user_id, status
  • Sort: created_at DESC

Create an index in that order:

ALTER TABLE orders ADD KEY idx_user_status_created (user_id, status, created_at);

Now re-run EXPLAIN. The goal is:

  • MySQL uses idx_user_status_created
  • rows estimate drops sharply
  • Using filesort disappears (often) because the index can deliver rows in created_at order for the filtered prefix

Why this helps: MySQL can traverse the index entries for (user_id=42, status='paid') already ordered by created_at, then stop after 20 rows.
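To see why column order matters, here is a toy model of the B-tree: index entries stored as sorted tuples (dates are plain strings purely for brevity). All `(42, 'paid')` entries sit in one contiguous run, already ordered by `created_at`, so a backward scan can stop as soon as the LIMIT is satisfied:

```python
from bisect import bisect_left, bisect_right

# Model idx_user_status_created as a sorted list of
# (user_id, status, created_at) keys, like a B-tree would store them.
entries = sorted([
    (42, "paid", "2026-03-01"), (42, "paid", "2026-03-05"),
    (42, "refunded", "2026-03-02"), (7, "paid", "2026-03-04"),
    (42, "paid", "2026-03-03"),
])

# Locate the contiguous run of (42, 'paid') entries.
lo = bisect_left(entries, (42, "paid", ""))
hi = bisect_right(entries, (42, "paid", "\uffff"))  # sentinel past any date

# The run is already in created_at order; reverse it for DESC and stop early.
newest_first = entries[lo:hi][::-1]
print([e[2] for e in newest_first][:2])  # ['2026-03-05', '2026-03-03']
```

With `(status, created_at, user_id)` instead, the `user_id = 42` rows would be scattered across the tree and no such early stop would be possible for this query.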

4) Make the Index “Cover” the Query When It Makes Sense

Covering indexes can avoid extra table lookups. If the index contains all columns needed by the query, MySQL can read only the index.

Our query selects id, total_cents, created_at. In InnoDB, the primary key (id) is implicitly part of secondary index entries. If we also include total_cents, the index can cover the query:

ALTER TABLE orders ADD KEY idx_user_status_created_cover (user_id, status, created_at, total_cents);

Now check EXPLAIN for Extra showing Using index (meaning it’s covered). Don’t blindly “cover everything” though—bigger indexes cost RAM and slow writes. Cover hot queries only.

5) Fix Pagination Before It Fixes You

LIMIT ... OFFSET ... gets slower as OFFSET grows because MySQL still needs to walk past skipped rows. For “infinite scroll” or page 200, switch to keyset pagination.

Instead of OFFSET, use the last seen created_at (and a tie-breaker like id):

-- First page
SELECT id, total_cents, created_at
FROM orders
WHERE user_id = 42 AND status = 'paid'
ORDER BY created_at DESC, id DESC
LIMIT 20;

-- Next page: pass last_created_at and last_id from the previous page
SELECT id, total_cents, created_at
FROM orders
WHERE user_id = 42
  AND status = 'paid'
  AND (created_at, id) < ('2026-04-01 12:00:00', 987654)
ORDER BY created_at DESC, id DESC
LIMIT 20;

Make sure your index matches the ordering:

ALTER TABLE orders ADD KEY idx_user_status_created_id (user_id, status, created_at, id);

This keeps pagination fast no matter how deep you go.
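The mechanics are easy to see in pure Python, because tuple comparison behaves like SQL’s row-value comparison `(created_at, id) < (last_created_at, last_id)`. A sketch (the `next_page` helper is illustrative, not a library function):

```python
# Rows sorted like ORDER BY created_at DESC, id DESC.
rows = sorted(
    [("2026-04-01 12:00:00", 987654), ("2026-04-01 12:00:00", 987653),
     ("2026-03-30 09:00:00", 555), ("2026-03-28 17:30:00", 321)],
    reverse=True,
)

def next_page(rows, cursor, limit=2):
    """Return up to `limit` rows strictly after `cursor`; None = first page."""
    if cursor is not None:
        # Tuple comparison mirrors SQL's (created_at, id) < (:ts, :id).
        rows = [r for r in rows if r < cursor]
    return rows[:limit]

page1 = next_page(rows, None)
page2 = next_page(rows, cursor=page1[-1])  # last row of page 1 is the cursor
print(page1)
print(page2)
```

Note the tie-breaker: two rows share the same `created_at`, and only the `(created_at, id)` pair keeps the pages from skipping or repeating rows.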

6) Avoid Common “Index Killers”

These patterns prevent MySQL from using indexes effectively (or at all):

  • Functions on indexed columns: WHERE DATE(created_at) = '2026-04-01'
  • Leading wildcard LIKE: WHERE email LIKE '%@gmail.com'
  • Implicit type conversion: comparing user_id (BIGINT) to a string like '42' can hurt
  • OR across different columns: WHERE status='paid' OR total_cents > 10000 often degrades plans

Rewrite these to be index-friendly:

-- Bad: function on column
WHERE DATE(created_at) = '2026-04-01'

-- Good: range query (indexable)
WHERE created_at >= '2026-04-01 00:00:00'
  AND created_at < '2026-04-02 00:00:00'
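In application code, the range bounds are easy to compute instead of hard-coding timestamps. A small helper for the half-open [start, next-day-start) pattern—`day_bounds` is my own name, not a library function:

```python
from datetime import date, datetime, timedelta

# Build half-open [start, end) bounds for "all rows on a given day",
# keeping created_at bare in SQL so the index stays usable.
def day_bounds(d: date) -> tuple[datetime, datetime]:
    start = datetime(d.year, d.month, d.day)
    return start, start + timedelta(days=1)

start, end = day_bounds(date(2026, 4, 1))
# Pass as placeholders: WHERE created_at >= %s AND created_at < %s
print(start, end)  # 2026-04-01 00:00:00 2026-04-02 00:00:00
```

The half-open interval also handles month and year boundaries correctly, which `BETWEEN ... AND '23:59:59'` style bounds get subtly wrong for sub-second timestamps.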

7) Optimize JOINs With the Right Indexes (Both Sides)

Slow pages often come from JOINs. Example: list paid orders with user email.

CREATE TABLE users (
  id BIGINT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
  email VARCHAR(255) NOT NULL,
  created_at DATETIME NOT NULL,
  UNIQUE KEY uk_email (email)
) ENGINE=InnoDB;

SELECT o.id, o.total_cents, u.email
FROM orders o
JOIN users u ON u.id = o.user_id
WHERE o.status = 'paid'
ORDER BY o.created_at DESC
LIMIT 50;

Checklist:

  • users.id is already indexed (primary key).
  • Make sure orders.user_id is indexed (it is).
  • But we also filter by status and sort by created_at. So index that path too:
ALTER TABLE orders ADD KEY idx_status_created_user (status, created_at, user_id);

Yes, you might have multiple indexes that start with different columns. That’s normal: each index exists to accelerate a specific query pattern. Just keep them intentional.

8) Measure Properly With a Mini Benchmark Script

You don’t need a full load test to validate an index. Here’s a small Python script that runs a query multiple times and prints average duration. (Use a staging dataset if possible.)

import os
import statistics
import time

import mysql.connector

DB = {
    "host": os.getenv("MYSQL_HOST", "127.0.0.1"),
    "user": os.getenv("MYSQL_USER", "root"),
    "password": os.getenv("MYSQL_PASSWORD", ""),
    "database": os.getenv("MYSQL_DB", "app"),
}

QUERY = """
    SELECT id, total_cents, created_at
    FROM orders
    WHERE user_id = %s AND status = %s
    ORDER BY created_at DESC, id DESC
    LIMIT 20
"""

def main():
    cnx = mysql.connector.connect(**DB)
    cnx.autocommit = True
    cur = cnx.cursor()
    timings = []
    for _ in range(30):
        t0 = time.perf_counter()
        cur.execute(QUERY, (42, "paid"))
        cur.fetchall()
        timings.append((time.perf_counter() - t0) * 1000)
    print(f"avg_ms={statistics.mean(timings):.2f} "
          f"p95_ms={statistics.quantiles(timings, n=20)[18]:.2f}")
    cur.close()
    cnx.close()

if __name__ == "__main__":
    main()

Run this before and after adding an index. If timings barely move, either:

  • the query wasn’t actually the bottleneck, or
  • the index doesn’t match the query path, or
  • the dataset is too small to show differences.

9) Practical Rules of Thumb You Can Apply Immediately

  • Index your filters first: columns in WHERE (most selective first in many cases).
  • Then index your sort: columns in ORDER BY right after the filter prefix.
  • Keep indexes minimal: every extra index slows INSERT/UPDATE.
  • Use keyset pagination for deep lists.
  • Verify with EXPLAIN and a repeatable timing script—not vibes.
  • Prefer range queries over wrapping indexed columns in functions.

10) A Quick “Before You Ship” Checklist

  • Did you run EXPLAIN on the slow query and confirm which index is used?
  • Does the index match WHERE columns and ORDER BY order?
  • Did you avoid OFFSET for large pagination?
  • Are you adding indexes only for real production queries (not hypothetical ones)?
  • Did you benchmark before/after with a consistent script?

MySQL optimization gets much easier once you treat it like a loop: observe → explain → index → verify. Start with your slowest endpoint, apply the patterns above, and you’ll usually get a noticeable speedup without touching infrastructure.

