Skip to content

fix(fslock): delete lock file on Release#375

Merged
zeroshade merged 4 commits into
mainfrom
fslock-delete-on-release
May 5, 2026
Merged

fix(fslock): delete lock file on Release#375
zeroshade merged 4 commits into
mainfrom
fslock-delete-on-release

Conversation

@zeroshade
Copy link
Copy Markdown
Member

Summary

  • Release now deletes the lock file instead of leaving it on disk.
  • Closes the TOCTOU that previously motivated keeping the file: a naive delete-on-release races with a concurrent Acquire that has opened the path but not yet taken the lock, leaving two processes with locks on different inodes at the same path.
  • Unix: unlink-before-close so new openers can't see our inode, plus an inode recheck after flock — if the path no longer points at our fd's inode, we release, close, and reopen via a retry loop.
  • Windows: Close then os.Remove. Go opens files without FILE_SHARE_DELETE, so Remove only succeeds once no other process holds a handle — we swallow the error if cleanup can't happen this round.

Test plan

  • go test ./internal/fslock/ -race -count=1 -v — all pass, including two new tests:
    • TestReleaseRemovesFile verifies the file is gone after Release (would have failed on old code).
    • TestConcurrentAcquireIsExclusive spawns 8 goroutines that acquire/hold/release and asserts the max concurrent holder count is 1, exercising the delete/reopen path.
  • GOOS=windows go build ./internal/fslock/ compiles cleanly.
  • Full go test ./... -short passes.

Release previously left the lock file on disk to avoid a TOCTOU where
one process unlinks the file while another has just opened it and is
about to flock — the two would end up holding locks on different
inodes at the same path.

Delete safely by: (unix) unlinking before close so new openers can't
see our inode, plus an inode recheck after flock that reopens if the
path no longer points at our fd; (windows) Close then os.Remove, which
Go opens without FILE_SHARE_DELETE so Remove fails harmlessly while
any handle is open.
@zeroshade zeroshade requested a review from amoeba May 5, 2026 16:00
…ows Release error swallow

- Unix: stale-inode retry loop now checks the deadline before continuing
  and returns a proper timeout error when the budget is exhausted, so
  Acquire cannot exceed its requested timeout.
- Windows: Release only swallows os.Remove failures that indicate another
  handle is still open (ERROR_SHARING_VIOLATION / ERROR_ACCESS_DENIED) or
  the file is already gone. Other filesystem errors now propagate.

Addresses roborev review 1845 (M+L).
ERROR_ACCESS_DENIED is not a reliable signal that another handle is
keeping the file open — Windows also returns it for read-only attributes,
ACL changes, or the path being replaced with a directory. Only
ERROR_SHARING_VIOLATION unambiguously indicates the deferred-cleanup
case, so narrow isBenignRemoveErr to just that and ErrNotExist. Other
filesystem errors now propagate from Release.

Addresses roborev review 1846.
@zeroshade zeroshade deployed to snapshot May 5, 2026 16:07 — with GitHub Actions Active
…omment

- Update Release doc comment: only ERROR_SHARING_VIOLATION (and ErrNotExist)
  are swallowed; ERROR_ACCESS_DENIED now propagates.
- Add fslock_windows_test.go covering sharing-violation (benign),
  not-exist (benign), access-denied (propagated), and generic
  (propagated) cases to lock in the distinction going forward.

Addresses roborev review 1847.
Copy link
Copy Markdown
Member

@amoeba amoeba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested manually and it looks good.

@zeroshade zeroshade merged commit 816b1c9 into main May 5, 2026
11 checks passed
@zeroshade zeroshade deleted the fslock-delete-on-release branch May 5, 2026 16:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants