feat: add MCP test runner support for run_tests and get_test_results
Adds MCP tools to control the Phoenix test runner remotely: run test
suites by category/spec and poll structured results. Includes WS
protocol handlers, test-runner-side MCP script, and updated CLAUDE.md
with accurate suite naming guidance.
CLAUDE.md (45 additions, 0 deletions)
@@ -24,3 +24,48 @@ Use `exec_js` to run JS in the Phoenix browser runtime. jQuery `$()` is global.
**Click AI chat buttons:** `$('.ai-edit-restore-btn:contains("Undo")').click();`

**Check logs:** `get_browser_console_logs` with `filter` regex (e.g. `"AI UI"`, `"error"`) and `tail`; the output includes both browser console and Node.js (PhNode) logs. Use `get_terminal_logs` for Electron process output (only available if Phoenix was launched via `start_phoenix`).

## Running Tests via MCP

The test runner must be open as a separate Phoenix instance (it shows up as `phoenix-test-runner-*` in `get_phoenix_status`). Use `run_tests` to trigger test runs and `get_test_results` to poll for results. `take_screenshot` also works on the test runner.
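
The trigger-then-poll flow described above can be sketched as follows. This is a minimal illustration, not the actual client code: `callTool` is a hypothetical stand-in for whatever function your MCP client uses to invoke a tool, and the `done` field on the result is an assumed shape, since this diff does not show what `get_test_results` returns.

```javascript
// Hypothetical helper: `callTool(name, args)` stands in for your MCP client's
// tool-invocation function. The `done` flag on the result is an assumption.
async function runAndPoll(callTool, args, { intervalMs = 1000, maxPolls = 60 } = {}) {
    // Start the run, then poll until the (assumed) `done` flag is set.
    await callTool("run_tests", args);
    for (let i = 0; i < maxPolls; i++) {
        const results = await callTool("get_test_results", {});
        if (results && results.done) {
            return results;
        }
        // Wait before the next poll so the test runner is not flooded with requests.
        await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
    throw new Error(`Tests did not finish after ${maxPolls} polls`);
}
```

Bounding the number of polls matters here because a stuck spec can keep the run from ever finishing; failing with an explicit error is easier to act on than an endless loop.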

### Test categories
- **unit**: Fast, no UI. Safe to run all at once (`run_tests(category="unit")`).
- **integration**: Spawns a Phoenix iframe inside the test runner. Some specs require window focus and will hang if the test runner window isn't focused.
- **LegacyInteg**: Like integration, but uses the legacy test harness. Also spawns an embedded Phoenix instance.
- **Do NOT use:** `all`, `performance`, `extension`, `individualrun` (not actively supported).
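
A caller could guard against the unsupported categories before invoking `run_tests`. The sketch below is illustrative only: `checkCategory` is a hypothetical helper, not part of the MCP tools, and it assumes category names are case-sensitive exactly as written in the list above.

```javascript
// Category names taken from the list above; treated as case-sensitive (assumption).
const SUPPORTED_CATEGORIES = new Set(["unit", "integration", "LegacyInteg"]);
const UNSUPPORTED_CATEGORIES = new Set(["all", "performance", "extension", "individualrun"]);

// Hypothetical guard: returns the category unchanged if it is safe to pass on,
// and throws a descriptive error otherwise.
function checkCategory(category) {
    if (SUPPORTED_CATEGORIES.has(category)) {
        return category;
    }
    if (UNSUPPORTED_CATEGORIES.has(category)) {
        throw new Error(`Category "${category}" is not actively supported`);
    }
    throw new Error(`Unknown category "${category}"`);
}
```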

### Hierarchy: Category → Suite → Test
- **Category**: top-level grouping (`unit`, `integration`, `LegacyInteg`, etc.). Safe to run an entire category.
- **Suite**: a group of related tests within a category (e.g. `integration: FileFilters` has ~20 tests). This is the `spec` parameter value.
- **Test**: a single test within a suite.

### Running all tests in a category
```
run_tests(category="unit")
```

### Running a single suite
Pass the exact suite name as the `spec` parameter. **Suite names do NOT always have a category prefix.** Many suites are registered under their plain name (e.g. `"CSS Parsing"`, `"Editor"`, `"JSUtils"`), while others include a prefix (e.g. `"unit:Phoenix Platform Tests"`, `"integration: FileFilters"`, `"LegacyInteg:ExtensionLoader"`). If the suite name is wrong, the test runner will show a blank page with 0 specs and appear stuck.

**To discover the exact suite name**, run this in `exec_js` on the test runner instance:
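
A minimal sketch of what such a discovery snippet could look like, assuming the test runner is Jasmine-based and the global `jasmine` object is reachable from `exec_js` (neither is confirmed by this diff). `listSuiteNames` is a hypothetical helper; it relies only on the `children` and `description` properties that Jasmine suites expose.

```javascript
// Collect the names of the top-level suites from a Jasmine-style suite tree.
// Suites carry a `children` array; individual specs do not, so they are skipped.
function listSuiteNames(topSuite) {
    return (topSuite.children || [])
        .filter((child) => Array.isArray(child.children))
        .map((suite) => suite.description);
}

// Possible use inside exec_js on the test runner (assumes a global `jasmine`):
//   listSuiteNames(jasmine.getEnv().topSuite());
```

The returned strings are the registered names, with or without a category prefix, so they can be copied into `spec` as they are.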

You can pass a specific test's full name as `spec` to run just that one test. If that test fails, re-run the full suite to confirm: suites sometimes execute tests in order with shared state, so a test may fail in isolation but pass within its suite. If it passes within the suite, treat the test as valid.
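
The confirm-by-suite rule above can be sketched as a small decision helper. Everything here is hypothetical: `runSpec(spec)` stands in for a function that runs the given `spec` and resolves once results are in, and the `failures` count on its result is an assumed shape.

```javascript
// Hypothetical helper: `runSpec(spec)` runs a test or suite by name and resolves
// to an object with an assumed `failures` count.
async function confirmFailure(runSpec, testName, suiteName) {
    const single = await runSpec(testName);
    if (single.failures === 0) {
        // The test passed on its own; no suite run is needed.
        return { failed: false, confirmedBy: "test" };
    }
    // The test failed in isolation; the full suite run is the deciding result.
    const suite = await runSpec(suiteName);
    return { failed: suite.failures > 0, confirmedBy: "suite" };
}
```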

### Gotchas
- **Instance name changes on reload:** The test runner gets a new random instance name each time the page reloads. Always check `get_phoenix_status` after a `run_tests` call to get the current instance name.
- **Integration tests may hang:** Specs labeled "needs window focus" will hang indefinitely if the test runner doesn't have OS-level window focus. If `get_test_results` starts timing out, the event loop is likely blocked by a stuck spec; use `force_reload_phoenix` to recover.
- **LegacyInteg/integration tests spawn an iframe:** These tests open an embedded Phoenix instance inside the test runner, so they are slower and more resource-intensive than unit tests.
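
Because the instance name changes on reload, it helps to look it up again by prefix rather than cache it. The sketch below assumes you have already extracted a plain array of instance names from the `get_phoenix_status` output (the actual response shape is not shown in this diff); `findTestRunnerInstances` is a hypothetical helper.

```javascript
// Test runner instances show up as `phoenix-test-runner-*` in get_phoenix_status.
const TEST_RUNNER_PREFIX = "phoenix-test-runner-";

// Hypothetical helper: takes an array of instance names (assumed input shape)
// and returns only those that belong to a test runner.
function findTestRunnerInstances(instanceNames) {
    return instanceNames.filter((name) => name.startsWith(TEST_RUNNER_PREFIX));
}
```

Calling this after every `run_tests` keeps follow-up calls such as `get_test_results` pointed at the current instance rather than a stale name.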