
When a database maintains materialized views, aggregation caches, and
join-precomputation tables, a single DELETE can trigger hundreds of
cascading operations. We measured a single-row DELETE taking 1.6 seconds
on a modest cross join (100 × 200 rows), entirely due to trigger cascades.
This post describes how we batched trigger side effects and flushed them
after lock release, cutting a test suite that previously took 58 seconds
down to 12 seconds.
## The Problem: Death by a Thousand Scans
Consider a query like:
```sql
SELECT a.grp, SUM(a.val)
FROM customers a, regions b
GROUP BY a.grp
```
The database materializes the cross-join into a precomputed table (20,000 rows)
and maintains an aggregation cache (the GROUP BY result) on top of it. Both
are kept in sync via triggers:
- INSERT/DELETE on `customers` → incremental update of the precomputed join
- INSERT/DELETE on the join table → incremental update of the aggregation cache
- DELETE on the join table → cleanup check: if no rows remain for a group key,
  delete the aggregation cache entry
When we delete one row from customers, the cascade looks like:
- Trigger scans the join table, deletes 200 matching rows (one per region)
- Each of those 200 deletions fires:
  - An incremental decrement on the aggregation cache (`-1` per row)
  - A cleanup scan: "does this group key still have rows?"
- Total: 600+ scans for a single source DELETE
Each cleanup scan costs ~10ms (it must verify row existence via index lookup).
200 cleanup scans × 10ms = 2 seconds of pure overhead.
## The Fix: Batched Side Effects
The key insight: all 200 join-table deletions happen within a single scan.
Instead of firing triggers per row during the scan (while holding locks),
we collect the side effects and flush them after the scan completes and
locks are released.
Three types of side effects are batched:
### 1. Incremental Updates (Aggregation)
Before: 200 individual increment(-1) calls on the aggregation cache,
each requiring a lock acquisition and value update.
After: Deltas are aggregated per target row during the scan. At flush time,
there is one increment call per unique group key, carrying the summed delta
(a single increment(-200) when all 200 deleted rows share one key). For
3 group keys, that's 3 calls instead of 200.
### 2. Invalidation Coalescing
Before: 200 individual “mark as stale” calls on computed columns, each
doing a binary search to find the affected sort position.
After: A single “invalidate all” call per computed column. One O(1) bitmap
reset instead of 200 O(log n) binary searches.
### 3. Trigger Cascade Deferral
Before: Each of the 200 deletions immediately fires cleanup triggers while
the source scan holds a write lock. The cleanup triggers try to scan other
tables → deadlock or severe lock contention.
After: Deleted rows are collected in a batch. The batch is flushed after the
write lock is released. Cleanup triggers run lock-free, eliminating contention.
## The Deadlock Problem
The most subtle issue was defer ordering. In Go, deferred calls execute in
LIFO (last-in-first-out) order. If the trigger flush is deferred after the
deferred lock release has been registered, the flush runs first — before the
lock release, while the write lock is still held. Trigger handlers that scan
other tables then deadlock.
The fix: explicit lock release before the flush, with a guard flag to prevent
double-release in the deferred cleanup:
```
// During scan: side effects are collected, not executed.
// After the scan completes:
//   1. Release the write lock (explicitly, setting a guard flag)
//   2. Flush batched side effects (triggers run lock-free)
//   3. Deferred cleanup skips the lock release (guard flag is set)
```
## Results
Test suite: incremental invalidation with cross-join GROUP BY,
concurrent INSERT/DELETE during scans, 10k-row window functions.
| Metric | Before | After | Speedup |
|---|---|---|---|
| Test suite duration | 58s | 12s | 4.8x |
| Single DELETE | 1.6s | ~0.3s | 5x |
| Scans per DELETE | 600+ | ~10 | 60x |
The improvement comes entirely from reducing redundant work:
- 200 individual increments → 3 aggregated increments
- 200 individual invalidations → 1 bulk invalidation
- 200 lock-contended trigger calls → 1 lock-free batch flush
## Architecture Takeaway
The general principle: never execute triggers inside a scan loop.
Collect side effects during the scan, flush after the scan. This enables:
- Aggregation: 200 × increment(-1) → 1 × increment(-200)
- Deduplication: 200 cleanup checks on 3 unique keys → 3 checks
- Lock freedom: trigger handlers run without holding source locks
- Vectorization potential: batched triggers can be further optimized
with IN-filter scans instead of per-row equality scans
The infrastructure supports automatic trigger vectorization: trigger bodies
are analyzed at registration time, and if they match known patterns (e.g.,
scan-with-equality-filter-on-OLD-value + $update), a batch-aware version
is generated that replaces N individual scans with one IN-filtered scan.
This is currently implemented for the join-table DELETE pattern and will
be extended to cleanup and aggregation patterns.