fix: remove default case in waitWithContext to prevent busy-spin deadlock #1503

Open
nodece wants to merge 1 commit into apache:master from nodece:fix/memory-limit-controller-busy-spin

nodece (Member) commented May 20, 2026

Fixes #1455

Motivation

When the memory limit is exhausted, ReserveMemory() enters a loop that calls waitWithContext() to block until memory is released. However, waitWithContext() contained a default branch in its select statement that caused it to return immediately without ever blocking:

select {
case <-n:
    return true
case <-ctx.Done():
    return false
default:
    return true  // runs whenever neither channel is ready, so the call never waits
}

This leads to the following deadlock sequence:

  1. Consumer messages fill the shared memory limit via ForceReserveMemory
  2. App goroutines processing those messages call producer.Send() (blocking mode)
  3. ReserveMemory() spins at 100% CPU, never sleeping
  4. Since all processing goroutines are stuck spinning, no messages are ever acked — consumer memory is never released
  5. TryReserveMemory keeps failing → the spin loop never exits

This matches the reported symptoms exactly: apparent deadlock with very high CPU usage.

Modifications

  • pulsar/internal/channel_cond.go: Remove the default branch from waitWithContext(). The function now properly blocks on the channel until either ReleaseMemory() triggers a broadcast (memory freed) or the context is cancelled. This converts the busy-spin into a correct condition-variable wait — goroutines sleep while memory is exhausted and wake up only when memory becomes available.

  • pulsar/internal/channel_cond_test.go: Add TestChCondWithContextBlocks to explicitly verify that waitWithContext actually blocks and does not return via a default case, preventing regression.

Development

Successfully merging this pull request may close these issues.

Avoid potential deadlock: in client struct split mem limit controller to two, in client options separate mem limit for consumer and producer