fix: clean up benchmark MCP sessions by PatrickSys · Pull Request #94 · PatrickSys/codebase-context

PatrickSys · 2026-04-10T12:59:55Z

Summary

add a managed stdio MCP session helper that guarantees child cleanup on timeout and failure paths
route benchmark comparator execution through the managed session wrapper
add regression fixtures/tests for connect-timeout and post-connect failure cleanup

Verification

pnpm run type-check
pnpm test -- tests/benchmark-comparators.test.ts tests/zombie-guard.test.ts
full pre-push test suite passed during git push

greptile-apps · 2026-04-10T13:04:17Z

Greptile Summary

This PR introduces withManagedStdioClientSession — a lifecycle wrapper that guarantees child-process cleanup on connect timeout and callback failure — and routes the benchmark comparator through it, replacing ad-hoc client.close() calls. Two issues in the new helper are worth addressing before relying on it heavily: connectPromise is awaited after safeClose with no deadline, so a misbehaving MCP SDK or an unkillable child could cause the finally block to hang indefinitely; and the spawnNotification polling loop exits as soon as settling = true without a final pid check, creating a narrow race that can leave onSpawn uncalled — making the connect-timeout test flaky under load.

Confidence Score: 4/5

Safe to merge for most cases, but the unbounded connectPromise await in the finally block and the onSpawn race are worth hardening before widespread use.

All findings are P2 (no confirmed runtime breakage in normal conditions), but two of them describe reliability gaps in the resource-cleanup guarantee that is the primary purpose of this PR — a hang with no timeout and a race that can make tests flaky. That warrants a 4 rather than 5 to flag them for attention before merge.

scripts/lib/managed-mcp-session.mjs — the finally block cleanup sequence and spawnNotification loop both have edge cases that could undermine the zombie-guard guarantee.

Important Files Changed

Filename	Overview
scripts/lib/managed-mcp-session.mjs	New managed session wrapper; two concerns: `connectPromise` awaited without a timeout after cleanup (potential hang), and a `settling` race that can silently drop the `onSpawn` callback
scripts/benchmark-comparators.mjs	Routes MCP execution through the new managed wrapper; dead-code guard `startsWith('Failed to connect to')` in catch handler is harmless but misleading
tests/benchmark-comparators.test.ts	New regression tests for timeout and post-connect failure cleanup; timeout test could be flaky on slow systems if `onSpawn` race fires before pid is captured
tests/fixtures/mcp/echo-server.mjs	Minimal MCP echo server fixture for post-connect failure test; straightforward and correct
tests/fixtures/mcp/hanging-server.mjs	Single-line fixture that keeps the process alive indefinitely for the connect-timeout test

Sequence Diagram

sequenceDiagram
    participant Caller
    participant withManagedStdioClientSession
    participant MCP Client
    participant StdioClientTransport
    participant ChildProcess

    Caller->>withManagedStdioClientSession: call(options, callback)
    withManagedStdioClientSession->>StdioClientTransport: new StdioClientTransport(...)
    withManagedStdioClientSession->>MCP Client: new Client(clientInfo)
    withManagedStdioClientSession->>MCP Client: connect(transport) → connectPromise
    MCP Client->>StdioClientTransport: spawn child
    StdioClientTransport->>ChildProcess: spawn
    Note over withManagedStdioClientSession: spawnNotification polls transport.pid every 10ms

    alt Connect succeeds within timeout
        withManagedStdioClientSession->>Caller: callback({ client, transport })
        Caller-->>withManagedStdioClientSession: result or throw
    else Connect timeout
        withManagedStdioClientSession->>withManagedStdioClientSession: withTimeout rejects
    end

    Note over withManagedStdioClientSession: finally block runs
    withManagedStdioClientSession->>withManagedStdioClientSession: settling = true (stops poll loop)
    withManagedStdioClientSession->>MCP Client: client.close() (if connected)
    withManagedStdioClientSession->>StdioClientTransport: transport.close()
    StdioClientTransport->>ChildProcess: SIGTERM / kill
    ChildProcess-->>StdioClientTransport: exit
    withManagedStdioClientSession->>withManagedStdioClientSession: await spawnNotification
    withManagedStdioClientSession->>withManagedStdioClientSession: await connectPromise (no timeout ⚠️)
    withManagedStdioClientSession-->>Caller: return result or rethrow

_{Reviews (1): Last reviewed commit: "fix: clean up benchmark MCP sessions" | Re-trigger Greptile}

greptile-apps · 2026-04-10T13:04:20Z

scripts/lib/managed-mcp-session.mjs

+  } finally {
+    settling = true;
+    await safeClose(client, transport, connected);
+    await spawnNotification.catch(() => undefined);
+    await connectPromise.catch(() => undefined);


connectPromise awaited without a timeout after cleanup

After safeClose terminates the child process, await connectPromise.catch(() => undefined) is called with no deadline. If the MCP SDK's connect promise doesn't reject when the transport is force-closed (e.g., due to an SDK buffering bug or a process that ignores SIGTERM), this line will hang indefinitely and withManagedStdioClientSession will never return — defeating the resource-cleanup guarantee. A defensive timeout here would bound the worst-case hang:

} finally { settling = true; await safeClose(client, transport, connected); await spawnNotification.catch(() => undefined); await Promise.race([ connectPromise, new Promise((resolve) => setTimeout(resolve, 5_000)) ]).catch(() => undefined); }

greptile-apps · 2026-04-10T13:04:21Z

scripts/lib/managed-mcp-session.mjs

+  let connected = false;
+  let settling = false;
+  const connectPromise = client.connect(transport);
+  const spawnNotification = (async () => {
+    if (typeof onSpawn !== 'function') {
+      return;
+    }
+
+    while (!settling) {
+      if (transport.pid !== null) {
+        onSpawn(transport.pid);
+        return;
+      }
+      await new Promise((resolve) => setTimeout(resolve, 10));
+    }
+  })();


onSpawn can be silently dropped due to settling race

The polling loop exits the moment settling = true is set in the finally block. If the child process spawns during the 10 ms sleep that happens to span the settling = true assignment, the loop re-evaluates while (!settling), sees true, and exits without ever calling onSpawn — even though transport.pid is now populated. In the connect-timeout test this leaves pid === null, causing expect(pid).toBeTypeOf('number') to fail.

In practice, with a 200 ms timeout and a local process that spawns in < 50 ms, the window is very narrow. But under CI load or on slow machines the test can become flaky. A robust fix is to perform one final pid check after the loop exits:

const spawnNotification = (async () => { if (typeof onSpawn !== 'function') return; while (!settling) { if (transport.pid !== null) { onSpawn(transport.pid); return; } await new Promise((resolve) => setTimeout(resolve, 10)); } // One final check: process may have spawned while we were asleep if (transport.pid !== null) { onSpawn(transport.pid); } })();

greptile-apps · 2026-04-10T13:04:22Z

scripts/benchmark-comparators.mjs

+  ).catch((err) => {
+    if (err.message.startsWith('Failed to connect to')) {
+      throw err;
    }
-
-    const elapsedMs = Date.now() - startMs;
-    const payloadBytes = countUtf8Bytes(payload);
-    const estimatedTokens = estimateTokens(payloadBytes);
-    const { usefulnessScore, matchedSignals, missingSignals } = matchSignals(
-      payload,
-      task.expectedSignals,
-      task.forbiddenSignals
+    throw new Error(
+      err.message.includes('timed out')
+        ? `Failed to connect to ${adapter.name} MCP server: ${err.message}`
+        : err.message
    );
-
-    taskResults.push({
-      taskId: task.id,
-      job: task.job,
-      surface: task.surface,
-      usefulnessScore,
-      matchedSignals,
-      missingSignals,
-      payloadBytes,
-      estimatedTokens,
-      toolCallCount,
-      elapsedMs
-    });
-  }
-
-  await client.close();
-  return taskResults;
+  });


Dead-code branch in catch handler

err.message.startsWith('Failed to connect to') can never be true here. withManagedStdioClientSession either throws "MCP client connect timed out after Xs" (timeout path) or re-throws errors from the callback, none of which start with "Failed to connect to". The guard is harmless but misleading; removing it makes the intent clearer:

Suggested change

).catch((err) => {

if (err.message.startsWith('Failed to connect to')) {

throw err;

}

const elapsedMs = Date.now() - startMs;

const payloadBytes = countUtf8Bytes(payload);

const estimatedTokens = estimateTokens(payloadBytes);

const { usefulnessScore, matchedSignals, missingSignals } = matchSignals(

payload,

task.expectedSignals,

task.forbiddenSignals

throw new Error(

err.message.includes('timed out')

? `Failed to connect to ${adapter.name} MCP server: ${err.message}`

: err.message

);

taskResults.push({

taskId: task.id,

job: task.job,

surface: task.surface,

usefulnessScore,

matchedSignals,

missingSignals,

payloadBytes,

estimatedTokens,

toolCallCount,

elapsedMs

});

}

await client.close();

return taskResults;

});

}).catch((err) => {

throw new Error(

err.message.includes('timed out')

? `Failed to connect to ${adapter.name} MCP server: ${err.message}`

: err.message

);

});

fix: clean up benchmark MCP sessions

408c248

greptile-apps bot reviewed Apr 10, 2026

View reviewed changes

fix: harden managed MCP session cleanup

fcce4b6

PatrickSys merged commit ad5db8b into master Apr 10, 2026
3 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: clean up benchmark MCP sessions#94

fix: clean up benchmark MCP sessions#94
PatrickSys merged 2 commits intomasterfrom
fix/benchmark-lifecycle-cleanup

PatrickSys commented Apr 10, 2026

Uh oh!

greptile-apps bot commented Apr 10, 2026

Important Files Changed

Uh oh!

greptile-apps bot Apr 10, 2026

Uh oh!

greptile-apps bot Apr 10, 2026

Uh oh!

greptile-apps bot Apr 10, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

PatrickSys commented Apr 10, 2026

Summary

Verification

Uh oh!

greptile-apps bot commented Apr 10, 2026

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps bot Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant