Skip to content

Playwright poc#9

Draft
andrewsanchez wants to merge 25 commits into
editor-3830from
playwright-poc
Draft

Playwright poc#9
andrewsanchez wants to merge 25 commits into
editor-3830from
playwright-poc

Conversation

@andrewsanchez
Copy link
Copy Markdown

@andrewsanchez andrewsanchez commented May 13, 2026

Plan for getting this upstream

  • examine this PR
  • decompose into small issues
    • one issue for python coverage
    • one issue for vitest and coverage
    • one issue for playwright
    • update justfile
    • update ci.yml
      • checks fail if coverage drops
    • add coverage badge to readme
    • update docs
      • mention pre-commit and how to install
  • identify most important areas to add test coverage for and create issues
  • use codecov.io? or smokeshow to make viewing HTML coverage reports convenient
    directly from the PR
  • increase coverage thresholds over time
  • consider a pre-push hook (installed with pre-commit) to run tests with
    coverage before pushing

@andrewsanchez andrewsanchez changed the base branch from editor-3830 to main May 13, 2026 17:29
@andrewsanchez andrewsanchez changed the base branch from main to editor-3830 May 13, 2026 17:31
Adds @playwright/test ^1.49.0 (resolved to 1.59.1) as a devDependency,
plus two package.json scripts (test:e2e, test:e2e:ui) that proxy to the
Playwright CLI.

This enables a parallel test track to the existing pytest + raw-CDP
harness in qt/tests/test_webview_ipc.py. Playwright drives the new
SvelteKit editor by launching its own Chromium and navigating directly
to mediasrv's editor URL — not by attaching to QtWebEngine over CDP.
Targets PR ankitects#4029 (Shift editor control to TypeScript).

yarn.lock is updated correspondingly via 'yarn install' under Corepack
Yarn 4.11.0.
Mirrors test-integration. Wraps 'yarn playwright test' so the Playwright
suite is discoverable alongside the other test recipes via 'just --list'.
The recipe relies on playwright.config.ts at the repo root, which uses
Playwright's webServer config to launch and tear down a temporary Anki
instance per run.
Spawns a throwaway Anki instance for e2e tests, invoked by Playwright's
webServer config. Differs from the pytest-side anki_process fixture in
qt/tests/conftest.py in that it runs as its own process (not as a pytest
fixture) and exposes mediasrv over a pinned port for external HTTP
clients.

Key environment configuration:

- ANKI_API_PORT=40000 pins the mediasrv port so the Playwright config
  can hardcode the baseURL. Without this, mediasrv chooses a random port
  and tests cannot connect deterministically.

- ANKI_API_HOST=0.0.0.0 is the documented testing escape in
  qt/aqt/mediasrv.py:_have_api_access. Normally the API requires an
  Authorization: Bearer <_APIKEY> header that only QtWebEngine injects;
  setting this var bypasses that check so external clients (Playwright's
  own Chromium) can hit /_anki/* endpoints. Side effect: mediasrv binds
  to all interfaces — acceptable for local testing, would not be safe in
  a shared environment.

- QTWEBENGINE_REMOTE_DEBUGGING is still set so the existing pytest
  CDP-based tests can co-exist with this launcher (different ports).

Profile/db seeding mirrors qt/tests/conftest.py._seed_prefs but adds
"check_for_updates": False to suppress the version-update prompt at
startup. The pre-existing "suppressUpdate": True is insufficient — that
key is compared to the new version string in qt/aqt/update.py:60, not a
boolean, so a True value never matches and the prompt fires anyway. The
real switch is meta["check_for_updates"] read by ProfileManager in
qt/aqt/profiles.py:675 and gated on in main.py:setup_auto_update.

The seed code is intentionally duplicated rather than shared with
conftest.py so the two harnesses can evolve independently; keep both
copies in sync if the schema changes.
playwright.config.ts: pins workers to 1 (the shared Anki collection is
mutable state — concurrent workers would race), sets baseURL to the
mediasrv port, and uses webServer to spawn launch_anki_for_e2e.py.
webServer.url polls the editor URL itself (not just the port) so
mediasrv must be actually serving the SvelteKit shell before tests run.

ts/tests/e2e/fixtures.ts exposes two layered fixtures:

  editorPage: page.goto'd to /editor/?mode={editorMode} with
  window.bridgeCommand and window.pycmd stubbed via addInitScript. The
  stub records calls into window.__bridgeCalls so tests can assert that
  the editor told the host the right things (e.g. 'editorReady',
  'saved'). The bridge stub is the key piece that lets the SvelteKit
  page boot outside QtWebEngine — without it, the editor's
  bridgeCommand calls throw ReferenceError on page mount.

  editor: builds on editorPage by invoking
  loadNote({initial: true, ...}) via page.evaluate, mirroring what
  Python does after receiving the editorReady bridge command in the
  real Qt path. This fires the full bootstrap RPC sequence
  (defaultsForAdding, newNote, getNotetype, getFieldNames,
  noteFieldsCheck) and waits until at least one editor field has
  rendered. Most tests want this fixture.

Notes on choices that were tried and dropped:

- Earlier drafts of the harness used chromium.connectOverCDP() to
  attach to QtWebEngine's CDP port. This fails: Playwright issues
  Browser.setDownloadBehavior during connect, which QtWebEngine rejects
  with "Browser context management is not supported". The new editor
  is served as plain HTTP by mediasrv, so we don't need QtWebEngine at
  all — we just navigate Playwright's own Chromium to the URL.

- Playwright has no 'pierce/' selector prefix; chained .locator()
  calls cross shadow-DOM boundaries on their own.

EditorMode is a test.use() option, defaulting to "add".
Three foundational tests that validate every assumption downstream
suites depend on. Each is independent so a failure is diagnostic:

  1. 'editor page is served and SvelteKit hydrates' — confirms the
     .note-editor mount and that the bridge stub captured an
     'editorReady' bridgeCommand call.

  2. 'loadNote() drives the full bootstrap RPC sequence' — confirms
     that invoking loadNote({initial: true}) from page.evaluate fires
     defaultsForAdding, newNote, getNotetype, getFieldNames, and
     noteFieldsCheck. If this breaks, the editor's bootstrap wiring
     has changed and every other suite will need to update.

  3. 'page.route() intercepts /_anki/* fetches' — confirms the
     network-interception capability that Suites C/D/F/G all rely on.
     This was the original go/no-go gate for the whole approach.

The networkidle waits are wrapped in .catch(() => {}) because
QtWebEngine occasionally emits heartbeat requests that prevent the
idle event from firing; the actual settle signal is the field-render
selector.
Types into two fields, clicks Add, and asserts the addNote RPC payload
matches what the user typed. This is the primary contract test for the
add-card flow under PR ankitects#4029.

Assertions:
- /_anki/addNote request body decodes to AddNoteRequest with
  fields[0]='Hello World', fields[1]='Goodbye World', deckId != 0
- /_anki/addNote response status < 400
- /_anki/noteFieldsCheck fires during typing (debounce)
- /_anki/updateNotes does NOT fire (add mode contract)
- /_anki/newNote fires after Add (form reset signal)
- first field clears after Add
- window.__bridgeCalls contains 'saved'

Two Playwright wrinkles were addressed:

- The Add button is exact-matched via getByRole("button",
  { name: "Add", exact: true }) because there's also an "Add tag"
  button in the tag editor that would otherwise match.

- Request body capture goes through waitForRequest's returned Request
  object, not via route.request().postDataBuffer() inside a route
  handler. The latter returns null intermittently in Playwright when
  forwarding via route.continue() — the waitForRequest path is the
  reliable pattern.

The toast assertion is deliberately omitted. showToast in
NoteEditor.svelte uses a 500ms auto-dismiss, which is too short to be
reliable under Playwright's polling interval. The follow-up newNote
RPC + cleared field are more durable signals of "add succeeded".

This test mutates collection state — a note is persisted on each run.
Validates that PR ankitects#4029's TypeScript html-filter actually runs on
paste and produces the same output the Python BeautifulSoup filter
did. Specifically, ts/lib/html-filter/element.ts:34 adds a convertToDiv
rule that rewrites <p> tags to <div>; this test confirms the rule
fires end-to-end.

Steps:
- Dispatch a synthetic paste ClipboardEvent on the first field's
  anki-editable, with text/html set to <p>Paragraph One</p><p>Paragraph
  Two</p>
- Assert the contenteditable's innerHTML contains
  <div>Paragraph One</div><div>Paragraph Two</div> and zero <p> tags
- Click Add; intercept /_anki/addNote and assert the persisted field
  value also contains <div> and not <p>

Notes:

- The paste event must be constructed inside page.evaluate(). The
  Node-side dispatchEvent does not populate clipboardData, and the
  editor's handlePaste relies on event.clipboardData.getData("text/html"),
  so the event has to be built where the DataTransfer constructor is a
  real browser API.

- Shadow-DOM piercing happens automatically via chained .locator()
  calls; there is no 'pierce/' selector syntax in Playwright. The
  contenteditable lives at
  .rich-text-editable >> shadow >> anki-editable[contenteditable=true].

- tagsAllowedExtended spreads tagsAllowedBasic, so P → DIV holds
  regardless of the PASTE_STRIPS_FORMATTING config value.
Clicks the sticky badge on field 0 and asserts that the new TypeScript
path (getNotetype → updateNotetype) runs, replacing the legacy
bridgeCommand('toggleSticky') path. This is the primary contract test
for the sticky-state migration under PR ankitects#4029.

Assertions:
- /_anki/getNotetype request fires first
- /_anki/updateNotetype fires next; decoded Notetype body has
  fields[0].config.sticky flipped to true
- window.__bridgeCalls does NOT contain 'toggleSticky:0'
- the badge gains the 'highlighted' CSS class
- toggling again flips back to false and the highlighted class is
  removed

Selector notes:

- The sticky badge is in .field-container (parent wrapper), NOT in
  .editor-field (which is just the input body). An earlier attempt
  scoped to .editor-field and timed out because the StickyBadge slot
  is rendered as a sibling of the editor-field div, inside the
  field-container.

- The badge's outer element is <span role="button"> per
  StickyBadge.svelte:55. The Badge child component carries the
  title="Toggle sticky (F9)" attribute. We target via the title and
  walk up to the role=button ancestor with an xpath axis.

- The badge requires hover to be visible (CSS opacity:0 transition).
  Hovering the field-container first ensures the click connects.

This test mutates the notetype configuration. It toggles twice so the
final state matches the initial state — but if a subsequent test
depends on a specific sticky state, set it explicitly.
Validates the end-to-end TS path for duplicate detection: 600ms field
debounce → /_anki/noteFieldsCheck → DUPLICATE response → DOM marks the
field and renders a Show Duplicates link.

Test flow (self-contained, no cross-test ordering):
- Generate a unique probe ('dupe-probe-<timestamp>')
- Type probe, add the note, wait for form reset
- Retype the same probe in the now-empty form
- Assert at least one noteFieldsCheck response has state=DUPLICATE
- Assert the .editor-field at index 0 gains the 'dupe' class
- Assert the .duplicate-link element with "Show Duplicates" text is
  visible
- Clear the field; assert the class and link disappear

The response-capture strategy is the noteworthy part. An earlier
waitForResponse(...).body() approach saw zero bytes despite content-
length being non-zero on the wire — Playwright does not reliably
expose response bodies for non-intercepted fetch() responses. The fix
is to intercept the RPC with page.route, call route.fetch() to forward
and capture the response, decode each response body, record observed
states, and assert via expect.poll that at least one response had
state=DUPLICATE.

A naive waitForResponse on noteFieldsCheck also doesn't work because
the post-reset form fires its own noteFieldsCheck on the empty field
(state=NORMAL), which races with the typed-probe response. The
predicate-based / observe-all approach decouples the test from this
ordering.

The proto enum NoteFieldsCheckResponse_State has DUPLICATE=2 (and
NORMAL=0, EMPTY=1, MISSING_CLOZE=3, ...). The agent's first pass had
these values wrong; they were verified against
out/ts/lib/generated/anki/notes_pb.d.ts.

This test adds one note to the collection per run.
Validates the new Rust-backed addMediaFromUrl path (rslib/src/editor.rs
::retrieve_url, wired through rslib/src/backend/media.rs) plus the TS
data-transfer side. Pasting an image URL into a field should:

  1. Fire /_anki/addMediaFromUrl with the URL in the request payload
  2. Take the returned filename and build a local <img src=...>
  3. NOT touch the network for the external host

Strategy:
- Register page.route('**/_anki/addMediaFromUrl', ...) and fulfill with
  a synthetic AddMediaFromUrlResponse{filename: 'pasted-image.jpg'}
  encoded via proto-es .toBinary(). The real backend is bypassed; the
  test asserts only on the request/response contract.

- Dispatch a paste ClipboardEvent with text/uri-list set to
  https://example.com/image.jpg. text/html is deliberately NOT set —
  data-transfer.ts:processDataTransferEvent short-circuits to raw-HTML
  mode if html is present and never calls addMediaFromUrl.

- Decode the intercepted request body as AddMediaFromUrlRequest and
  assert .url matches.

- Assert the field's innerHTML contains
  <img[^>]+src=[^>]*pasted-image.jpg/ — filenameToLink in
  data-transfer.ts uses encodeURI on the filename, which is a no-op
  for an ascii filename.

- Negative check: install a page.on('request') listener that fails the
  test if any request to a non-127.0.0.1 host fires, proving the
  paste-image flow never accidentally hits the actual remote URL.

This test does not mutate the collection.
Two tests covering both branches of shouldPromptBeforeClosing() in
NoteEditor.svelte. When the user clicks Close on the new editor, the
front-end is responsible for telling Python whether to prompt the
Discard dialog (true) or close silently (false). The decision is
encoded as the val boolean on the /_anki/closeAddCards request body
(a generic.Bool message).

Phase 1 — close with unsaved content: type into a field, click Close,
intercept closeAddCards, decode the body as Bool, assert val === true.

Phase 2 — close with empty fields: click Close without typing, decode
as before, assert val === false (proto3 zero-bytes encoding for
val=false is handled — Bool.fromBinary on a zero-length buffer yields
{val: false}).

The Qt-side Discard MessageBox is avoided entirely by route.fulfill'ing
the intercepted request with an empty 200, so the request never reaches
the Python backend's close_add_cards handler. The native dialog would
be outside the WebEngine and therefore unreachable from Playwright
anyway; in this architecture it never appears in the first place.

The Bool message is in proto/anki/generic.proto and imported as
@generated/anki/generic_pb.Bool. The Close button is exact-matched via
getByRole("button", { name: "Close", exact: true }).
specs/pr-4029-editor-ts-poc.plan.md captures the final architecture,
per-suite contract under test, and the empirical lessons that took
several iterations to surface. It is an as-built record rather than
an aspirational design — earlier drafts (since superseded) assumed a
CDP-attach harness against QtWebEngine, which was the wrong shape.

Documented in the plan:

- Architecture diagram: Playwright launches its own Chromium and
  navigates to mediasrv's editor URL. QtWebEngine is irrelevant to
  the tests.

- The four environment knobs that make the harness work
  (ANKI_API_PORT, ANKI_API_HOST=0.0.0.0,
  meta["check_for_updates"]=False, bridge stub via addInitScript) and
  why each is needed.

- Playwright behaviors that are easy to get wrong:
    * waitForResponse(...).body() can return zero bytes for non-
      intercepted fetch responses → use page.route + route.fetch +
      route.fulfill when a response body is needed
    * waitForRequest(...).postDataBuffer() is reliable for request
      bodies
    * shadow-DOM piercing happens automatically with chained
      .locator() calls; no 'pierce/' prefix exists
    * toast assertions are fragile (500 ms auto-dismiss)
    * getByRole + exact: true avoids name collisions like "Add" vs
      "Add tag"

- Per-suite scope: harness sanity, note-add roundtrip, sticky toggle,
  duplicate detection, paste HTML filter, media from URL, close
  prompt. Each suite maps to one .spec.ts file.

- Suites dropped from the original plan, with rationale: Mode Toggle
  (Shift-key) is a Qt menu interaction outside the WebEngine and
  unreachable by Playwright in this architecture.

- Scope boundaries: what this POC does and does not cover. The
  QtWebEngine integration itself, native Qt dialogs, and add-on hook
  compatibility need separate harnesses.

- Reusable patterns for future Playwright work on other mediasrv-
  served SvelteKit pages (deck-options, import-csv, image-occlusion).

specs/README.md is a one-liner explaining the directory's purpose.
…esting

The previous draft was framed as a per-PR test plan tied to ankitects#4029. The
practical value of this document is wider: it's the reference for
anyone writing Playwright tests against any mediasrv-served SvelteKit
page in this codebase. Re-scoped accordingly and renamed.

What changed:

- Removed all references to the specific PR, issue, and "POC" framing.
- Reframed each section as guidance for future test authors rather
  than as a status report.
- Added a "Why Playwright (and not just pytest+CDP)" section that
  spells out when to use which harness — the existing raw-CDP pytest
  harness is still right for IPC-level tests; Playwright is for the
  web UI.
- Replaced the per-suite plan with two sections:
    * "What to test (and what not to)" — the categories of contracts
      this approach is well-suited for, and the categories it isn't.
    * "Writing a new test" — minimum viable spec + conventions.
- Kept the existing suite as worked-example references, with a
  one-liner each pointing at the pattern the spec demonstrates.
- Kept the Playwright behaviors section, which is the most reusable
  part of the document — each item caused a real failure during
  development and is worth knowing before authoring new tests.
- Added "Extending to other mediasrv-served pages" with the minimal
  recipe for adapting fixtures.ts to deck-options, import-csv,
  image-occlusion, etc.

Renamed from specs/pr-4029-editor-ts-poc.plan.md via git mv to
preserve history.
Comment on lines +1 to +7
---
name: playwright-test-generator
description: 'Use this agent when you need to create automated browser tests using Playwright Examples: <example>Context: User wants to generate a test for the test plan item. <test-suite><!-- Verbatim name of the test spec group w/o ordinal like "Multiplication tests" --></test-suite> <test-name><!-- Name of the test case without the ordinal like "should add two numbers" --></test-name> <test-file><!-- Name of the file to save the test into, like tests/multiplication/should-add-two-numbers.spec.ts --></test-file> <seed-file><!-- Seed file path from test plan --></seed-file> <body><!-- Test case content including steps and expectations --></body></example>'
tools: Glob, Grep, Read, LS, mcp__playwright-test__browser_click, mcp__playwright-test__browser_drag, mcp__playwright-test__browser_evaluate, mcp__playwright-test__browser_file_upload, mcp__playwright-test__browser_handle_dialog, mcp__playwright-test__browser_hover, mcp__playwright-test__browser_navigate, mcp__playwright-test__browser_press_key, mcp__playwright-test__browser_select_option, mcp__playwright-test__browser_snapshot, mcp__playwright-test__browser_type, mcp__playwright-test__browser_verify_element_visible, mcp__playwright-test__browser_verify_list_visible, mcp__playwright-test__browser_verify_text_visible, mcp__playwright-test__browser_verify_value, mcp__playwright-test__browser_wait_for, mcp__playwright-test__generator_read_log, mcp__playwright-test__generator_setup_page, mcp__playwright-test__generator_write_test
model: sonnet
color: blue
---
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if we should include these in the repo. we need to test to ensure these work how we need them to work. and also may need to modify these instructions slightly to work with anki?

andrewsanchez and others added 7 commits May 13, 2026 14:40
Documents testable scope (congrats, deck-options, graphs pages and deck
RPC contracts) and explains why the legacy DeckBrowser Qt page is
out-of-scope for the Playwright harness.
uv run exits with code 250 (UV_INTERNAL_ERROR_CODE) on fresh CI
environments when the project-sync/lock-file re-resolution fails.
The launcher script only uses Python stdlib, so there is no reason
to go through uv at all.

Switch the webServer command to call out/pyenv/bin/python directly,
and add pyenv to the ninja targets in test-e2e so it is guaranteed
to be built even when the recipe is run in isolation.
The e2e harness only needs mediasrv's HTTP stack — Playwright's own
Chromium connects to it directly. The Anki Qt window never needs to
be visible. Setting QT_QPA_PLATFORM=offscreen lets Qt initialise
without a display server, eliminating the exit-250 crash that happened
when Qt couldn't connect to the (virtual) framebuffer.

Also add PYTHONUNBUFFERED=1 so subprocess output reaches Playwright's
[WebServer] log immediately, and drop xvfb-run from the CI step since
it is no longer required.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants