Commit 5516ed3

Update 2PC plan: dedicated blocking thread for MutTxId
Replace the open problem section with the concrete solution: a dedicated blocking thread per participant transaction holds the MutTxId for its entire lifetime. Async HTTP handlers communicate via channels. The MutTxId never crosses a thread boundary. Includes the TxCommand enum design, session management, and ASCII diagram of the HTTP handler / blocking thread interaction.
1 parent f9fdcf9 commit 5516ed3

1 file changed

Lines changed: 51 additions & 16 deletions

crates/core/2PC-IMPLEMENTATION-PLAN.md

@@ -4,7 +4,7 @@
44

55
The TPC-C benchmark on branch `origin/phoebe/tpcc/reducer-return-value` (public submodule) uses non-atomic HTTP calls for cross-database operations. We need 2PC so that a distributed transaction commits either on both databases or on neither. Pipelined 2PC is chosen because it avoids blocking on persistence while locks are held, and the codebase already separates in-memory commit from durability.
66

7-
## Protocol (Corrected)
7+
## Protocol
88

99
### Participant happy path:
1010

@@ -37,14 +37,59 @@ The TPC-C benchmark on branch `origin/phoebe/tpcc/reducer-return-value` (public
3737
10. Send COMMIT to durability worker
3838
11. **Barrier down** -- flush buffered requests
3939

40-
### Key correctness properties:
40+
## Key correctness properties
4141

4242
- **Serializable isolation**: Participant holds write lock from CALL through END_CALLS. Multiple CALLs from the same coordinator transaction execute within the same MutTxId on the participant. The second call sees the first call's writes.
4343
- **Persistence barrier**: After PREPARE is sent to durability (step 7/8 on participant, step 5/6 on coordinator), no speculative transactions can reach the durability worker until COMMIT or ABORT. Anything sent to the durability worker can eventually become persistent, so the barrier is required.
4444
- **Two responses from participant**: The immediate result (step 3) and the later PREPARED notification (step 10). The coordinator collects both: results during reducer execution, PREPARED notifications before deciding COMMIT.
4545
- **Pipelining benefit**: Locks are held only during reducer execution (steps 1-6), not during persistence (steps 7-14). The persistence and 2PC handshake happen after locks are released on both sides.
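
The persistence barrier property can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `Barrier`, `DurabilityMsg`, and the `sent` vector (standing in for the durability worker's queue) are hypothetical and are not the `PersistenceBarrier` in `relational_db.rs`. Discarding buffered requests on ABORT follows the coordinator abort path described in this plan.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a request bound for the durability worker.
#[derive(Debug, Clone, PartialEq)]
pub enum DurabilityMsg {
    Prepare(u64),
    Commit(u64),
    Abort(u64),
    /// Payload of an ordinary (speculative) transaction.
    Tx(u64),
}

/// Illustrative barrier: while armed, speculative requests are buffered
/// instead of being forwarded to the durability worker.
#[derive(Default)]
pub struct Barrier {
    armed: bool,
    buffered: VecDeque<DurabilityMsg>,
    /// Stand-in for the durability worker's input queue.
    pub sent: Vec<DurabilityMsg>,
}

impl Barrier {
    /// Barrier up: forward PREPARE, then start buffering.
    pub fn prepare(&mut self, tx: u64) {
        self.sent.push(DurabilityMsg::Prepare(tx));
        self.armed = true;
    }

    /// Forward a speculative request, or buffer it while the barrier is up.
    pub fn send_or_buffer(&mut self, msg: DurabilityMsg) {
        if self.armed {
            self.buffered.push_back(msg);
        } else {
            self.sent.push(msg);
        }
    }

    /// COMMIT: forward the decision, lower the barrier, flush in order.
    pub fn commit(&mut self, tx: u64) {
        self.sent.push(DurabilityMsg::Commit(tx));
        self.armed = false;
        self.sent.extend(self.buffered.drain(..));
    }

    /// ABORT: forward the decision and discard everything buffered.
    pub fn abort(&mut self, tx: u64) {
        self.sent.push(DurabilityMsg::Abort(tx));
        self.armed = false;
        self.buffered.clear();
    }
}
```

In this sketch, nothing queued between `prepare` and `commit` reaches `sent` until the decision has been forwarded, which is the ordering the barrier is meant to guarantee.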
4646

47-
### Abort paths:
47+
## Holding MutTxId: dedicated blocking thread
48+
49+
`MutTxId` is `!Send` because it holds a `SharedWriteGuard`. For serializable isolation, the participant must hold it across multiple CALL requests from the coordinator. The solution is a **dedicated blocking thread per participant transaction** that holds the `MutTxId` for its entire lifetime. Async HTTP handlers communicate with this thread via channels, so the `MutTxId` never crosses a thread boundary or touches an async context.
50+
51+
```
52+
HTTP handler (async) Blocking thread (sync, holds MutTxId)
53+
--------------------- -------------------------------------
54+
CALL request arrives ---> receive on channel, execute reducer
55+
<--- send result back on channel
56+
return HTTP response
57+
58+
CALL request arrives ---> execute next reducer (same MutTxId)
59+
<--- send result
60+
return HTTP response
61+
62+
END_CALLS arrives ---> commit in-memory, release write lock
63+
send PREPARE to durability, barrier up
64+
wait for durability...
65+
<--- send PREPARED
66+
return HTTP response
67+
68+
COMMIT arrives ---> send COMMIT to durability, barrier down
69+
thread exits
70+
```
71+
72+
On first CALL for a new 2PC transaction:
73+
1. Spawn a blocking thread (`std::thread::spawn` or `tokio::task::spawn_blocking`)
74+
2. Thread creates `MutTxId` (acquires write lock)
75+
3. Thread blocks on a command channel (`mpsc::Receiver<TxCommand>`)
76+
4. Store the command sender (`mpsc::Sender<TxCommand>`) in a session map keyed by `session_id`
76+
5. Return `session_id` to the coordinator along with the first CALL's result
78+
79+
Subsequent CALLs and END_CALLS look up the `session_id` and send commands on the channel. The blocking thread processes them sequentially on the same `MutTxId`.
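
The session routing above might look roughly like the following sketch. All names (`Sessions`, `Call`, the counter-based id) are hypothetical, and std channels are used so the example needs no external crates. The worker thread only counts calls, as a stand-in for state that would live in the `MutTxId`.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{self, Sender};
use std::thread;

/// Hypothetical minimal command: a reducer name plus a reply channel.
pub struct Call {
    pub reducer: String,
    pub reply: Sender<String>,
}

/// Session map: session id -> command sender for that transaction's thread.
#[derive(Default)]
pub struct Sessions {
    next_id: u64,
    senders: HashMap<u64, Sender<Call>>,
}

impl Sessions {
    /// Route a CALL. `None` means "first CALL": spawn a thread, allocate an id.
    pub fn call(&mut self, session: Option<u64>, reducer: &str) -> Result<(u64, String), String> {
        let id = match session {
            Some(id) => id,
            None => {
                let id = self.next_id;
                self.next_id += 1;
                let (tx, rx) = mpsc::channel::<Call>();
                thread::spawn(move || {
                    // A real thread would create the MutTxId here.
                    let mut calls = 0u32;
                    for cmd in rx {
                        calls += 1;
                        let _ = cmd.reply.send(format!("{}#{}", cmd.reducer, calls));
                    }
                    // Channel closed: the session was removed, so the thread exits.
                });
                self.senders.insert(id, tx);
                id
            }
        };
        let sender = self
            .senders
            .get(&id)
            .ok_or_else(|| format!("unknown session {id}"))?;
        let (reply, result) = mpsc::channel();
        sender
            .send(Call { reducer: reducer.to_string(), reply })
            .map_err(|_| "transaction thread exited".to_string())?;
        let out = result.recv().map_err(|_| "no reply".to_string())?;
        Ok((id, out))
    }

    /// Drop the sender so the worker's receive loop ends and the thread exits.
    pub fn end(&mut self, id: u64) -> bool {
        self.senders.remove(&id).is_some()
    }
}
```

In an async server the map would presumably sit behind a lock and the handler would await the reply rather than block on `recv`. Both are left out here to keep the sketch small.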
80+
81+
The blocking thread also needs access to a WASM module instance to execute reducers. The instance must be taken from the pool on thread creation and returned on thread exit (after COMMIT or ABORT).
82+
83+
```rust
84+
enum TxCommand {
85+
Call { reducer: String, args: Bytes, reply: oneshot::Sender<CallResult> },
86+
EndCalls { reply: oneshot::Sender<()> },
87+
Commit { reply: oneshot::Sender<()> },
88+
Abort { reply: oneshot::Sender<()> },
89+
}
90+
```
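
A command loop for such a thread could be sketched as follows. This is an illustration under stated assumptions, not the planned implementation. A `std::sync::RwLockWriteGuard` (which is also `!Send`) stands in for `MutTxId`, `Vec<u8>` for `Bytes`, `Result<usize, String>` for `CallResult`, and a std `mpsc::Sender` for `oneshot::Sender`. `Db` and `spawn_tx_thread` are hypothetical names.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::sync::{Arc, RwLock};
use std::thread;

/// Stand-in for the plan's `TxCommand`, using std channels for replies.
pub enum TxCommand {
    Call { reducer: String, args: Vec<u8>, reply: Sender<Result<usize, String>> },
    EndCalls { reply: Sender<()> },
    Commit { reply: Sender<()> },
    Abort { reply: Sender<()> },
}

/// Hypothetical database state: a log of applied reducer calls.
pub type Db = Arc<RwLock<Vec<String>>>;

pub fn spawn_tx_thread(db: Db) -> Sender<TxCommand> {
    let (tx, rx): (Sender<TxCommand>, Receiver<TxCommand>) = mpsc::channel();
    thread::spawn(move || {
        // The write guard is created on this thread and never leaves it.
        let mut guard = Some(db.write().unwrap());
        let mut written = 0usize; // entries appended by this transaction
        for cmd in rx {
            match cmd {
                TxCommand::Call { reducer, args, reply } => {
                    let res = match guard.as_mut() {
                        Some(g) => {
                            // "Execute" the reducer under the held write lock.
                            g.push(format!("{reducer}:{}", args.len()));
                            written += 1;
                            Ok(g.len())
                        }
                        None => Err("CALL after END_CALLS".to_string()),
                    };
                    let _ = reply.send(res);
                }
                TxCommand::EndCalls { reply } => {
                    // In-memory commit: release the write lock, keep the thread.
                    guard = None;
                    let _ = reply.send(());
                }
                TxCommand::Commit { reply } => {
                    let _ = reply.send(());
                    break;
                }
                TxCommand::Abort { reply } => {
                    // Only the pre-END_CALLS abort is modeled: undo our writes.
                    if let Some(g) = guard.as_mut() {
                        let keep = g.len() - written;
                        g.truncate(keep);
                    }
                    let _ = reply.send(());
                    break;
                }
            }
        }
    });
    tx
}
```

Because the guard is acquired inside the spawned closure, only `Send` values (the `Arc` and the receiver) cross the thread boundary. The sketch omits durability, the persistence barrier, and undoing state after END_CALLS.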
91+
92+
## Abort paths
4893

4994
**Coordinator's reducer fails (step 2):**
5095
- Send ABORT to all participants (they still hold write locks)
@@ -64,17 +109,7 @@ The TPC-C benchmark on branch `origin/phoebe/tpcc/reducer-return-value` (public
64109
- Coordinator inverts its own in-memory state, discards buffered durability requests
65110

66111
**Crash during protocol:**
67-
- See proposal §8 for recovery rules
68-
69-
### Open problem: MutTxId is !Send
70-
71-
The participant holds MutTxId across multiple HTTP requests (CALL, more CALLs, END_CALLS). MutTxId is !Send (holds SharedWriteGuard). Options:
72-
73-
1. **Dedicated blocking thread per participant transaction**: spawn_blocking holds the MutTxId, communicates via channels. HTTP handlers send messages, blocking thread processes them.
74-
2. **Session-based protocol**: Participant creates a session on first CALL, routes subsequent CALLs and END_CALLS to the same thread/task that holds the MutTxId.
75-
3. **Batch all calls**: Coordinator sends all reducer calls + args in a single request. Participant executes them all, returns all results, then commits. Single HTTP round-trip, no cross-request MutTxId holding.
76-
77-
Option 3 is simplest but limits the coordinator to not making decisions between calls. Option 1 is most general. TBD.
112+
- See proposal in `proposals/00XX-inter-database-communication.md` section 8 for recovery rules
78113

79114
## Commitlog format
80115

@@ -94,12 +129,12 @@ On replay, when encountering a PREPARE:
94129

95130
## Key files
96131

97-
- `crates/core/src/db/relational_db.rs` -- PersistenceBarrier, arm/deactivate, send_or_buffer_durability
132+
- `crates/core/src/db/relational_db.rs` -- PersistenceBarrier, send_or_buffer_durability, finalize_prepare_commit/abort
98133
- `crates/core/src/host/prepared_tx.rs` -- PreparedTxInfo, PreparedTransactions registry
99134
- `crates/core/src/host/module_host.rs` -- prepare_reducer, commit_prepared, abort_prepared
100135
- `crates/core/src/host/wasm_common/module_host_actor.rs` -- coordinator post-commit coordination
101136
- `crates/core/src/host/instance_env.rs` -- call_reducer_on_db_2pc, prepared_participants tracking
102137
- `crates/core/src/host/wasmtime/wasm_instance_env.rs` -- WASM host function
103-
- `crates/client-api/src/routes/database.rs` -- HTTP endpoints
138+
- `crates/client-api/src/routes/database.rs` -- HTTP endpoints (CALL, END_CALLS, COMMIT, ABORT, PREPARED notification)
104139
- `crates/bindings-sys/src/lib.rs` -- FFI
105140
- `crates/bindings/src/remote_reducer.rs` -- safe wrapper
