Commit 4f0fa49

DavertMik and claude authored
feat(codeceptq): CLI to query HTML with CodeceptJS locators (#5550)
* update docs

* updated docs, added browser plugin

* feat(codeceptq): CLI to query HTML with CodeceptJS locators

Adds `codeceptq`, a standalone CLI that takes an HTML stream (stdin or --file) plus a CodeceptJS locator (CSS / XPath / fuzzy / semantic) and prints matched elements with line numbers and outerHTML snippets.

Designed to give AI agents a fast feedback loop against `aiTrace`'s per-step HTML snapshots: "would this locator match at step N?" without re-running the test or spawning a browser.

- Reuses Locator class for CSS→XPath conversion + semantic builders (--field, --click, --checkable, --select).
- Optional context arg scopes matches: `codeceptq 'Save' '.modal' --click`.
- Stable output flags: --limit, --snippet (default 500), --full, --json.
- Exit codes: 0 match, 1 no match, 2 invalid input/XPath.
- formatHtml now uses `inline: []` so every element gets its own line in trace HTML; line numbers map 1:1 to elements for codeceptq output.
- 45 runner tests against test/data/checkout.html, github.html, gitlab.html, drag_drop.html assert exact line + snippet for every locator strategy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(mcp): surface aiTrace dir in run_test / pause payloads

run_test, run_step_by_step, and pausedPayload now include aiTraceDir (the per-test output/trace_<title>_<hash>/ folder) so agents can point codeceptq directly at the saved *_page.html snapshots without globbing or recomputing the hash. Per-test entries in reporterJson.tests[] also carry the dir.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(codeceptq): bump describe timeout to 30s for CI

The 'Sign up' --click case on github.html (2k-line fixture, 12-branch semantic union XPath) takes ~8s locally and exceeds the default 10s mocha timeout on slower CI runners. Suite-level timeout matches what the local runs already use.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* perf(codeceptq): pre-resolve //*[@id]/@id subqueries

Locator.clickable.wide and field.labelContains emit predicates of form

    [@aria-labelledby = //*[@id][normalize-space(string(.)) = 'X']/@id ]

xpath@0.0.34 re-runs the inner //* scan once per outer element match, which is O(N²) on non-trivial docs. The 2k-line github fixture spent 8.5s in that single branch out of 12.

Pre-resolve the inner subquery once, splice the resulting id (or a sentinel for no-match) back as a literal so the engine sees a flat attribute compare.

Github 'Sign up' --click: 9026ms → 276ms (~33×). Full runner suite: 14s → 6s.

Reverts the 30s describe-level timeout from the previous commit since the underlying perf issue is now fixed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(codeceptq): build per-strategy XPath fragments

Replaces the post-hoc regex pre-resolver with strategy-level construction. Each semantic locator (--click/--field/--checkable) is built as a list of XPath branches; doc-wide subqueries (label[@for] resolution, ids by visible text) are evaluated once and inlined as literal predicates instead of sitting nested inside outer per-element predicates that the engine re-executes on every match.

Eval loop runs each branch separately and sorts results by source offset to preserve the document-order contract of XPath unions.

Github 'Sign up' --click: 9000ms → 264ms (independent of XPath engine: fontoxpath benched the same as xpath@0.0.34 on the original union). All 45 runner tests pass with identical line/snippet output.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* perf(locator): guard aria-labelledby branch with attr-existence predicate

The wide clickable / labelContains field XPath includes:

    .//*[@aria-labelledby = //*[@id][normalize-space(string(.)) = X]/@id]

That predicate forces every element to evaluate the inner //*[@id] subquery, which is O(N²) on any non-trivial document for pure-JS XPath engines (xpath npm: 7641ms on a 2k-line page; fontoxpath: 7057ms on the same branch). Browser engines optimize via join-pushdown.

Adding [@aria-labelledby] as a left-to-right filter predicate first cuts the slow comparison to only elements that actually have the attribute:

    .//*[@aria-labelledby][@aria-labelledby = //*[@id][...]/@id]

7641ms → 52ms (147×). Semantics identical: in XPath, [A][B] and [A and B] produce the same result-set, but predicates are evaluated left-to-right, so the cheap attr-existence check filters out the bulk first.

This is a single-character XPath change; codeceptq goes from 9000ms → 325ms on test/data/github.html with no special-case code. Reverted the per-strategy reimplementation in lib/command/query.js (back to using Locator.clickable.wide / Locator.field.byText directly).

Added two unit tests for the aria-labelledby branch in Locator.clickable.wide (positive + negative).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: DavertMik <davert@testomat.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
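The effect of the `[@aria-labelledby]` guard described in the last commit can be illustrated without an XPath engine. The sketch below is a toy model, not how `xpath` or fontoxpath work internally: it assumes a naive evaluator that re-runs the document-wide inner scan for every candidate element, and it counts those scans with and without the guard. `makeDoc`, `countInnerScans`, and the element shape are hypothetical names made up for the illustration.

```javascript
// Toy model of a document: each element is { id, text, labelledby? }.
function makeDoc(total, labelled) {
  const doc = []
  for (let i = 0; i < total; i++) {
    const el = { id: `el${i}`, text: `Text ${i}` }
    // Only the first `labelled` elements carry an aria-labelledby reference.
    if (i < labelled) el.labelledby = `el${total - 1 - i}`
    doc.push(el)
  }
  return doc
}

// Counts how often a naive engine would run the inner document-wide scan.
function countInnerScans(doc, text, guarded) {
  let scans = 0
  let matches = 0
  for (const el of doc) {
    // Guard: drop elements without the attribute before the costly comparison.
    if (guarded && el.labelledby === undefined) continue
    scans++
    // Inner subquery: ids of all elements whose text equals `text`.
    const ids = doc.filter(d => d.text === text).map(d => d.id)
    if (el.labelledby !== undefined && ids.includes(el.labelledby)) matches++
  }
  return { scans, matches }
}

const doc = makeDoc(2000, 5)
const unguarded = countInnerScans(doc, 'Text 1999', false)
const guarded = countInnerScans(doc, 'Text 1999', true)
console.log(unguarded, guarded)
```

Both variants report the same match count; only the number of inner scans differs, which is the point the commit message makes about `[A][B]` being equivalent to `[A and B]` while filtering cheaply first.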
1 parent c769274 commit 4f0fa49

8 files changed

Lines changed: 758 additions & 5 deletions

bin/codeceptq.js

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
+#!/usr/bin/env node
+import { Command } from 'commander'
+import query from '../lib/command/query.js'
+
+const program = new Command()
+
+program
+  .name('codeceptq')
+  .description('Query HTML with CodeceptJS locators (CSS, XPath, fuzzy text, semantic).\n\nReads HTML from stdin or --file and prints matching elements with line numbers.')
+  .argument('<locator>', 'locator string (CSS, XPath, or text for semantic match)')
+  .argument('[context]', 'scope locator — restrict matches to descendants of context')
+  .option('--field', 'treat locator as form field (input/textarea/select)')
+  .option('--click', 'treat locator as clickable element (link, button, role=button, ...)')
+  .option('--clickable', 'alias for --click')
+  .option('--checkable', 'treat locator as checkbox/radio')
+  .option('--select', 'treat locator as <option> visible text')
+  .option('--xpath', 'force XPath interpretation')
+  .option('--css', 'force CSS interpretation')
+  .option('--file <path>', 'read HTML from file instead of stdin')
+  .option('--limit <n>', 'cap matches printed', '20')
+  .option('--snippet <chars>', 'truncate outerHTML per match to N characters', '500')
+  .option('--full', 'print full outerHTML (no truncation)')
+  .option('--json', 'output JSON')
+  .addHelpText(
+    'after',
+    `
+Examples:
+  cat trace/0001_page.html | codeceptq './/input'
+  cat trace/0001_page.html | codeceptq 'Username' --field
+  cat trace/0001_page.html | codeceptq 'Username' '.form' --field
+  codeceptq './/button' --file trace/0001_page.html
+  codeceptq 'Login' --click --file page.html
+
+Exit codes:
+  0 matches found
+  1 no matches
+  2 invalid input or XPath
+`,
+  )
+  .action(async (locator, context, options) => {
+    try {
+      await query(locator, context, options)
+    } catch (err) {
+      console.error(`codeceptq: ${err.message}`)
+      process.exitCode = 2
+    }
+  })
+
+program.parseAsync(process.argv)
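The three exit codes in the help text let a calling script tell "the locator matched nothing" apart from "the query itself was malformed". A minimal sketch of how a caller might act on them follows; `interpretExit` and its return shape are hypothetical and not part of this commit.

```javascript
// Exit-code meanings as listed in the codeceptq help text above.
const EXIT_MEANING = { 0: 'match', 1: 'no-match', 2: 'invalid' }

// Hypothetical helper: turn an exit status into a decision for the caller.
function interpretExit(status) {
  const outcome = EXIT_MEANING[status] ?? 'unknown'
  return {
    outcome,
    locatorUsable: outcome === 'match',
    // "no-match" is a valid answer; only code 2 (or anything unexpected)
    // suggests the invocation or the locator syntax needs fixing.
    shouldFixInvocation: outcome === 'invalid' || outcome === 'unknown',
  }
}

console.log(interpretExit(0))
console.log(interpretExit(1))
console.log(interpretExit(2))
```

In a Node script the status would typically come from the `status` field of the object returned by `child_process.spawnSync`.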

bin/mcp-server.js

Lines changed: 3 additions & 0 deletions
@@ -465,6 +465,7 @@ function collectRunCompletion(errorMessage) {
   }
   return {
     status: error ? 'failed' : 'completed',
+    aiTraceDir: currentAiTraceDir,
     reporterJson: { stats, tests: results },
     error,
     aiTraceHint: aiTraceHint(),
@@ -475,11 +476,13 @@ function pausedPayload() {
   return {
     status: 'paused',
     file: pendingTestFile,
+    aiTraceDir: currentAiTraceDir,
     pausedAfter: pendingStepInfo,
     suggestions: [
       'Call snapshot to capture URL/HTML/ARIA/screenshot/console/storage at this point',
       'Call run_code to inspect or manipulate state (e.g. return await I.grabText("h1"))',
       'Call continue to release the pause and let the test run the next step (or finish)',
+      'Query a saved step snapshot offline: codeceptq <locator> --file <aiTraceDir>/<NNNN>_<step>_page.html',
     ],
   }
 }
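With `aiTraceDir` in the payload, an agent can assemble a `codeceptq` invocation from a directory listing instead of globbing. The sketch below assumes snapshot files start with a numeric prefix and end in `page.html`, as the suggestion string and the help examples imply; the exact naming may differ. `pageSnapshots`, `codeceptqArgs`, and the sample directory name are hypothetical.

```javascript
// Hypothetical helper: keep only page snapshots from a directory listing.
// Assumes names like 0001_page.html or 0002_<step>_page.html.
function pageSnapshots(fileNames) {
  return fileNames.filter(name => /^\d+_(?:.*_)?page\.html$/.test(name)).sort()
}

// Hypothetical helper: build the argv for codeceptq from an MCP payload.
function codeceptqArgs(payload, fileName, locator, flags = []) {
  if (!payload.aiTraceDir) throw new Error('payload has no aiTraceDir')
  const dir = payload.aiTraceDir.replace(/\/+$/, '') // drop trailing slashes
  return [locator, ...flags, '--file', `${dir}/${fileName}`]
}

// Made-up payload and listing, shaped after pausedPayload() in the diff.
const payload = { status: 'paused', aiTraceDir: 'output/trace_login_ab12cd/' }
const listing = ['0002_click_page.html', '0001_page.html', '0002_click_screenshot.png']

const snapshots = pageSnapshots(listing)
const args = codeceptqArgs(payload, snapshots[snapshots.length - 1], 'Save', ['--click'])
console.log(args.join(' '))
```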

lib/command/query.js

Lines changed: 218 additions & 0 deletions
@@ -0,0 +1,218 @@
+import fs from 'fs'
+import * as parse5 from 'parse5'
+import { DOMImplementation, XMLSerializer } from '@xmldom/xmldom'
+import xpath from 'xpath'
+import Locator from '../locator.js'
+import { xpathLocator } from '../utils.js'
+
+export default async function query(locator, context, options = {}) {
+  const html = options.file ? fs.readFileSync(options.file, 'utf8') : await readStdin()
+
+  if (!html || !html.trim()) {
+    console.error('codeceptq: no HTML input. Pipe HTML via stdin or use --file <path>.')
+    process.exitCode = 2
+    return
+  }
+
+  let xpathExpr
+  let contextExpr = null
+  try {
+    xpathExpr = buildXPath(locator, options)
+    if (context) contextExpr = buildXPath(context, {})
+  } catch (err) {
+    console.error(`codeceptq: cannot build XPath: ${err.message}`)
+    process.exitCode = 2
+    return
+  }
+
+  const { doc, source } = htmlToDoc(html)
+
+  let nodes
+  try {
+    if (contextExpr) {
+      const ctxNodes = toArray(xpath.select(contextExpr, doc))
+      const seen = new Set()
+      nodes = []
+      for (const ctx of ctxNodes) {
+        for (const m of toArray(xpath.select(xpathExpr, ctx))) {
+          if (!seen.has(m)) {
+            seen.add(m)
+            nodes.push(m)
+          }
+        }
+      }
+    } else {
+      nodes = toArray(xpath.select(xpathExpr, doc))
+    }
+  } catch (err) {
+    console.error(`codeceptq: XPath evaluation failed for "${xpathExpr}": ${err.message}`)
+    process.exitCode = 2
+    return
+  }
+
+  const limit = parseInt(options.limit, 10) || 20
+  const snippetLen = parseInt(options.snippet, 10) || 500
+  const truncated = nodes.slice(0, limit)
+  const where = options.file || 'stdin'
+
+  if (options.json) {
+    process.stdout.write(
+      JSON.stringify(
+        {
+          locator,
+          context: context || null,
+          xpath: xpathExpr,
+          contextXPath: contextExpr,
+          source: where,
+          total: nodes.length,
+          shown: truncated.length,
+          matches: truncated.map(n => ({
+            line: n.__line ?? null,
+            snippet: renderSnippet(n, source, snippetLen, options.full),
+          })),
+        },
+        null,
+        2,
+      ) + '\n',
+    )
+  } else {
+    if (nodes.length === 0) {
+      console.log(`No matches for ${quote(locator)}${context ? ` within ${quote(context)}` : ''} in ${where}`)
+      console.log(`(xpath: ${xpathExpr})`)
+    } else {
+      const noun = nodes.length === 1 ? 'match' : 'matches'
+      const more = nodes.length > truncated.length ? ` (showing first ${truncated.length})` : ''
+      console.log(`${nodes.length} ${noun} for ${quote(locator)}${context ? ` within ${quote(context)}` : ''} in ${where}${more}`)
+      console.log()
+      truncated.forEach((node, i) => {
+        const line = node.__line ?? '?'
+        console.log(`${i + 1}. Line ${line}`)
+        const snippet = renderSnippet(node, source, snippetLen, options.full)
+        snippet.split('\n').forEach(l => console.log('   ' + l))
+        console.log()
+      })
+    }
+  }
+
+  if (nodes.length === 0) process.exitCode = 1
+}
+
+function buildXPath(input, options) {
+  const literal = xpathLocator.literal(input)
+  if (options.field) return Locator.field.byText(literal)
+  if (options.click || options.clickable) return Locator.clickable.wide(literal)
+  if (options.checkable) return Locator.checkable.byText(literal)
+  if (options.select) {
+    return Locator.select.byVisibleText(literal).replace(/\.\/(option|optgroup)/g, './/$1')
+  }
+
+  if (options.xpath) return new Locator({ xpath: input }).toXPath()
+  if (options.css) return new Locator({ css: input }).toXPath()
+
+  const loc = new Locator(input)
+  if (loc.type === 'fuzzy') {
+    return xpathLocator.combine([Locator.clickable.wide(literal), Locator.field.byText(literal)])
+  }
+  return loc.toXPath()
+}
+
+function htmlToDoc(html) {
+  const p5doc = parse5.parse(html, { sourceCodeLocationInfo: true })
+  const impl = new DOMImplementation()
+  const doc = impl.createDocument(null, null, null)
+  walkParse5(p5doc, doc, doc)
+  return { doc, source: html }
+}
+
+function walkParse5(p5node, xmlParent, xmlDoc) {
+  for (const child of p5node.childNodes || []) {
+    const name = child.nodeName
+    if (name === '#text') {
+      if (child.value != null) {
+        const t = xmlDoc.createTextNode(child.value)
+        if (child.sourceCodeLocation) t.__line = child.sourceCodeLocation.startLine
+        xmlParent.appendChild(t)
+      }
+    } else if (name === '#comment') {
+      try {
+        xmlParent.appendChild(xmlDoc.createComment(child.data || ''))
+      } catch {
+        // ignore comments xmldom rejects
+      }
+    } else if (name === '#documentType') {
+      // skip doctype
+    } else {
+      const tagName = child.tagName || name
+      let el
+      try {
+        el = xmlDoc.createElement(tagName)
+      } catch {
+        continue
+      }
+      for (const attr of child.attrs || []) {
+        try {
+          el.setAttribute(attr.name, attr.value)
+        } catch {
+          // ignore attrs xmldom rejects (namespaces, invalid names)
+        }
+      }
+      const loc = child.sourceCodeLocation
+      if (loc) {
+        el.__line = loc.startLine
+        el.__startOffset = loc.startOffset
+        el.__endOffset = loc.endOffset
+        el.__startTagEndOffset = loc.startTag ? loc.startTag.endOffset : loc.endOffset
+      }
+      xmlParent.appendChild(el)
+      walkParse5(child, el, xmlDoc)
+    }
+  }
+}
+
+function renderSnippet(node, source, snippetLen, full) {
+  if (typeof node.__startOffset !== 'number') {
+    try {
+      return new XMLSerializer().serializeToString(node)
+    } catch {
+      return `<${node.nodeName || '?'}>`
+    }
+  }
+  const start = node.__startOffset
+  const end = node.__endOffset ?? start
+  if (full) return source.slice(start, end)
+
+  const tagEnd = node.__startTagEndOffset ?? end
+  const openingTag = source.slice(start, tagEnd)
+  if (end <= tagEnd) return openingTag
+
+  const totalLen = end - start
+  if (totalLen <= snippetLen) return source.slice(start, end)
+
+  const remaining = Math.max(0, snippetLen - openingTag.length)
+  if (remaining < 20) return openingTag + ' …'
+  return openingTag + source.slice(tagEnd, tagEnd + remaining) + ' …'
+}
+
+function readStdin() {
+  return new Promise((resolve, reject) => {
+    if (process.stdin.isTTY) {
+      resolve('')
+      return
+    }
+    let data = ''
+    process.stdin.setEncoding('utf8')
+    process.stdin.on('data', chunk => (data += chunk))
+    process.stdin.on('end', () => resolve(data))
+    process.stdin.on('error', reject)
+  })
+}
+
+function toArray(v) {
+  if (Array.isArray(v)) return v
+  if (v == null || v === '' || typeof v === 'boolean' || typeof v === 'number') return []
+  return [v]
+}
+
+function quote(s) {
+  return `'${String(s).replace(/'/g, "\\'")}'`
+}
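The `--json` branch above writes one object with `total`, `shown`, and a `matches` array of `{ line, snippet }` entries. A consumer might reduce that to the few facts it needs, as in this sketch. The field names are taken from the diff; `summarize` is a hypothetical helper and the sample values are made up.

```javascript
// Hypothetical helper: reduce codeceptq --json output to a short summary.
function summarize(jsonText) {
  const report = JSON.parse(jsonText)
  // `line` is null when a node carried no source location.
  const lines = report.matches.map(m => m.line).filter(line => line !== null)
  return {
    found: report.total > 0,
    truncated: report.total > report.shown, // --limit cut the list short
    firstLine: lines.length ? lines[0] : null,
    lines,
  }
}

// Made-up sample following the shape written by query.js.
const sample = JSON.stringify({
  locator: './/button',
  context: null,
  xpath: './/button',
  contextXPath: null,
  source: 'page.html',
  total: 3,
  shown: 2,
  matches: [
    { line: 12, snippet: '<button type="submit">Save</button>' },
    { line: 40, snippet: '<button>Cancel</button>' },
  ],
})

const summary = summarize(sample)
console.log(summary)
```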

lib/html.js

Lines changed: 3 additions & 0 deletions
@@ -323,6 +323,9 @@ async function formatHtml(html) {
       wrap_line_length: 0,
       preserve_newlines: false,
       end_with_newline: false,
+      // Force every element onto its own line so line numbers in trace HTML
+      // map 1:1 to elements (consumed by codeceptq for AI/agent debugging).
+      inline: [],
     })
   } catch (e) {
     return processed
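Why one element per line matters can be shown with plain string handling. The sketch below derives a 1-based line number from a character offset and compares two layouts of the same markup; `lineOf` and `linesOfTags` are hypothetical helpers, and the tag search is a deliberately crude prefix match rather than real HTML parsing.

```javascript
// 1-based line number of a character offset: count newlines before it.
function lineOf(source, offset) {
  let line = 1
  for (let i = 0; i < offset && i < source.length; i++) {
    if (source[i] === '\n') line++
  }
  return line
}

// Crude scan for opening tags; it would also match tags that merely
// start with the same letters, which is acceptable for this toy input.
function linesOfTags(source, tag) {
  const lines = []
  const needle = `<${tag}`
  let at = source.indexOf(needle)
  while (at !== -1) {
    lines.push(lineOf(source, at))
    at = source.indexOf(needle, at + 1)
  }
  return lines
}

// Inline elements kept on one line versus one element per line.
const inlineHtml = '<p>\n<a href="/a">A</a> <a href="/b">B</a>\n</p>'
const blockHtml = '<p>\n<a href="/a">A</a>\n<a href="/b">B</a>\n</p>'

const inlineLines = linesOfTags(inlineHtml, 'a')
const blockLines = linesOfTags(blockHtml, 'a')
console.log(inlineLines, blockLines)
```

In the first layout both links report the same line, so a line number alone cannot identify the match; in the second each link has its own line.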

lib/locator.js

Lines changed: 2 additions & 2 deletions
@@ -589,7 +589,7 @@ Locator.clickable = {
       `.//button[./@name = ${literal}]`,
       `.//*[@aria-label = ${literal}]`,
       `.//*[@title = ${literal}]`,
-      `.//*[@aria-labelledby = //*[@id][normalize-space(string(.)) = ${literal}]/@id ]`,
+      `.//*[@aria-labelledby][@aria-labelledby = //*[@id][normalize-space(string(.)) = ${literal}]/@id]`,
       `.//*[@role='button'][normalize-space(.)=${literal}]`,
       `.//*[@role='tab' or @role='link' or @role='menuitem' or @role='menuitemcheckbox' or @role='menuitemradio' or @role='option' or @role='treeitem'][contains(normalize-space(string(.)), ${literal})]`,
     ]),
@@ -632,7 +632,7 @@ Locator.field = {
       `.//label[contains(normalize-space(string(.)), ${literal})]//.//*[self::input | self::textarea | self::select][not(./@type = 'submit' or ./@type = 'image' or ./@type = 'hidden')]`,
       `.//*[@aria-label = ${literal}]`,
       `.//*[@title = ${literal}]`,
-      `.//*[@aria-labelledby = //*[@id][normalize-space(string(.)) = ${literal}]/@id ]`,
+      `.//*[@aria-labelledby][@aria-labelledby = //*[@id][normalize-space(string(.)) = ${literal}]/@id]`,
     ]),
 
   /**
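Each branch in these hunks interpolates an already-quoted `${literal}` into the XPath string. The sketch below shows one common way to quote arbitrary text for XPath 1.0, which has no escape character inside string literals, and then builds the guarded branch from the diff. `toXPathLiteral` is a hypothetical stand-in written for illustration, not the actual `xpathLocator.literal` from `lib/utils.js`, and `labelledByBranch` is likewise made up.

```javascript
// Hypothetical stand-in for a literal-quoting helper.
function toXPathLiteral(text) {
  if (!text.includes("'")) return `'${text}'`
  if (!text.includes('"')) return `"${text}"`
  // Both quote kinds present: split on ' and stitch the pieces with concat().
  const parts = text.split("'").map(part => `'${part}'`)
  return `concat(${parts.join(`,"'",`)})`
}

// The guarded branch as it appears on the added lines of the diff.
function labelledByBranch(literal) {
  return `.//*[@aria-labelledby][@aria-labelledby = //*[@id][normalize-space(string(.)) = ${literal}]/@id]`
}

const simple = labelledByBranch(toXPathLiteral('Sign up'))
const mixed = toXPathLiteral(`It's "ok"`)
console.log(simple)
console.log(mixed)
```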

package.json

Lines changed: 4 additions & 3 deletions
@@ -47,7 +47,8 @@
   },
   "bin": {
     "codeceptjs": "./bin/codecept.js",
-    "codeceptjs-mcp": "./bin/mcp-server.js"
+    "codeceptjs-mcp": "./bin/mcp-server.js",
+    "codeceptq": "./bin/codeceptq.js"
   },
   "repository": "codeceptjs/CodeceptJS",
   "scripts": {
@@ -132,6 +133,7 @@
     "resq": "1.11.0",
     "sprintf-js": "1.1.3",
     "uuid": "11.1.0",
+    "xpath": "0.0.34",
     "zod": "^4.1.11"
   },
   "optionalDependencies": {
@@ -193,8 +195,7 @@
     "typescript": "5.9.3",
     "wdio-docker-service": "3.2.1",
     "webdriverio": "9.23.0",
-    "xml2js": "0.6.2",
-    "xpath": "0.0.34"
+    "xml2js": "0.6.2"
   },
   "peerDependencies": {
     "tsx": "^4.0.0"
