Commit a5ab2e4

First step in implementing the DSL checker
1 parent 6c94666 commit a5ab2e4

4 files changed

Lines changed: 549 additions & 17 deletions

Lines changed: 294 additions & 0 deletions
@@ -0,0 +1,294 @@
1+
# Phase 0 / T1: miniexpr DSL syntax inventory (source-anchored)
2+
3+
This is the **ground-truth inventory** derived from the current `../miniexpr` sources and tests.
4+
It is intended to drive `P0-T2` (`../miniexpr/doc/dsl-syntax.md`) and later Python-side validation.
5+
6+
## 1. Top-level program shape
7+
8+
- Exactly one top-level function definition is expected.
9+
- Leading blank/comment lines are allowed.
10+
- Anything after the function body is rejected.
11+
12+
References:
13+
- `../miniexpr/src/dsl_parser.c:1531`
14+
- `../miniexpr/src/dsl_parser.c:1548`
15+
- `../miniexpr/src/dsl_parser.c:1553`
16+
17+
## 2. Pragmas
18+
19+
Supported header pragmas:
20+
- `# me:fp=<strict|contract|fast>`
21+
- `# me:compiler=<tcc|cc>`
22+
23+
Behavior:
24+
- duplicates are rejected (`duplicate me:fp pragma`, `duplicate me:compiler pragma`)
25+
- unknown `me:*` pragma is rejected
26+
- malformed assignments/trailing content are rejected
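
A Python-side preflight for these pragma rules could look roughly like the sketch below. It is only an illustration: `scan_pragmas`, `ALLOWED_PRAGMAS`, and the regular expression are hypothetical names, and the exact whitespace handling of the C parser is assumed rather than reproduced.

```python
import re

# Allowed pragma keys and values, taken from the list above.
ALLOWED_PRAGMAS = {
    "fp": {"strict", "contract", "fast"},
    "compiler": {"tcc", "cc"},
}

# Hypothetical matcher; the C parser's exact whitespace rules are an assumption.
_PRAGMA_RE = re.compile(r"^#\s*me:(\w+)(.*)$")


def scan_pragmas(source: str) -> dict:
    """Collect me:* pragmas from the leading comment lines of a DSL source."""
    seen = {}
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # leading blank lines are allowed
        if not stripped.startswith("#"):
            break  # the first non-comment line ends the header
        m = _PRAGMA_RE.match(stripped)
        if m is None:
            continue  # an ordinary comment, not a pragma
        key, rest = m.group(1), m.group(2)
        if key not in ALLOWED_PRAGMAS:
            raise ValueError(f"unknown me:{key} pragma")
        if key in seen:
            raise ValueError(f"duplicate me:{key} pragma")
        # Anything other than "=<allowed value>" counts as malformed,
        # which also covers trailing content after the value.
        if not rest.startswith("=") or rest[1:] not in ALLOWED_PRAGMAS[key]:
            raise ValueError(f"malformed me:{key} pragma")
        seen[key] = rest[1:]
    return seen
```

Raising `ValueError` with a message that mirrors the C-side wording keeps Python diagnostics close to what the parser itself would report.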
27+
28+
References:
29+
- `../miniexpr/src/dsl_parser.c:247`
30+
- `../miniexpr/src/dsl_parser.c:312`
31+
- `../miniexpr/src/dsl_parser.c:373`
32+
- `../miniexpr/src/dsl_parser.c:411`
33+
- `../miniexpr/src/dsl_parser.c:423`
34+
- `../miniexpr/src/dsl_parser.c:435`
35+
- `../miniexpr/tests/test_dsl_syntax.c:1203`
36+
- `../miniexpr/tests/test_dsl_syntax.c:1388`
37+
38+
## 3. Statement kinds (parser enum)
39+
40+
Recognized statement kinds:
41+
- assignment
42+
- expression statement
43+
- return
44+
- print
45+
- if/elif/else
46+
- while
47+
- for
48+
- break
49+
- continue
50+
51+
References:
52+
- `../miniexpr/src/dsl_parser.h:16`
53+
- `../miniexpr/src/dsl_parser.c:1269`
54+
55+
## 4. Function header and parameters
56+
57+
Function header rules:
58+
- must start with `def`
59+
- requires function name
60+
- requires `(...)` parameter list
61+
- requires trailing `:`
62+
- duplicate parameter names are rejected
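
The header rules above can be approximated on the Python side with a single pattern plus a duplicate check. This is a minimal sketch under assumptions: `check_header` and `_HEADER_RE` are hypothetical, parameters are assumed to be plain identifiers, and an empty parameter list is assumed to be syntactically acceptable.

```python
import re

# Hypothetical header pattern; the real parser is hand-written C.
_HEADER_RE = re.compile(r"^def\s+([A-Za-z_]\w*)\s*\((.*)\)\s*:\s*$")
_IDENT_RE = re.compile(r"[A-Za-z_]\w*")


def check_header(line: str) -> tuple:
    """Return (name, params) for a `def name(a, b):` line or raise ValueError."""
    m = _HEADER_RE.match(line.strip())
    if m is None:
        raise ValueError("expected 'def <name>(<params>):'")
    name = m.group(1)
    raw = m.group(2).strip()
    # Assumption: parameters are bare identifiers separated by commas.
    params = [p.strip() for p in raw.split(",")] if raw else []
    for p in params:
        if _IDENT_RE.fullmatch(p) is None:
            raise ValueError(f"invalid parameter name: {p!r}")
    if len(set(params)) != len(params):
        raise ValueError("duplicate parameter name")
    return name, params
```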
63+
64+
References:
65+
- `../miniexpr/src/dsl_parser.c:1461`
66+
- `../miniexpr/src/dsl_parser.c:1491`
67+
- `../miniexpr/src/dsl_parser.c:1510`
68+
- `../miniexpr/src/dsl_parser.c:1520`
69+
- `../miniexpr/src/dsl_parser.c:1429`
70+
- `../miniexpr/src/dsl_parser.c:1445`
71+
72+
## 5. Blocks and indentation
73+
74+
Python-like indentation is enforced:
75+
- block must be indented after `:`
76+
- dedent ends block
77+
- blank/comment-only lines are allowed in blocks
78+
79+
References:
80+
- `../miniexpr/src/dsl_parser.c:808`
81+
- `../miniexpr/src/dsl_parser.c:813`
82+
- `../miniexpr/src/dsl_parser.c:1360`
83+
- `../miniexpr/src/dsl_parser.c:1401`
84+
85+
## 6. Control flow forms
86+
87+
### 6.1 if/elif/else
88+
- supports `if`, chained `elif`, optional `else`
89+
- `elif` after `else` is rejected
90+
- duplicate `else` is rejected
91+
- a stray `elif`/`else` (one with no preceding `if`) is rejected
92+
93+
References:
94+
- `../miniexpr/src/dsl_parser.c:852`
95+
- `../miniexpr/src/dsl_parser.c:914`
96+
- `../miniexpr/src/dsl_parser.c:942`
97+
- `../miniexpr/src/dsl_parser.c:1297`
98+
- `../miniexpr/src/dsl_parser.c:1301`
99+
100+
### 6.2 while
101+
- `while <expr>:` supported
102+
- an indented body is required
103+
- runtime loop-iteration cap exists (`ME_DSL_WHILE_MAX_ITERS`)
104+
105+
References:
106+
- `../miniexpr/src/dsl_parser.c:974`
107+
- `../miniexpr/src/miniexpr.c:2745`
108+
- `../miniexpr/src/miniexpr.c:8621`
109+
110+
### 6.3 for
111+
Only this form is accepted:
112+
- `for <var> in range(...):`
113+
114+
`range` arity is checked at compile time:
115+
- 1 arg: `range(stop)`
116+
- 2 args: `range(start, stop)`
117+
- 3 args: `range(start, stop, step)`
118+
- other arities rejected
119+
120+
Runtime:
121+
- `step == 0` is a runtime evaluation error
122+
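
As an illustration of the `for` rules above, the following sketch checks the header shape and the `range` arity before anything reaches miniexpr. The names `check_for_header` and `split_top_level_args` are hypothetical, and string literals inside `range(...)` arguments are not handled here.

```python
import re

# Hypothetical pattern for the only accepted loop form.
_FOR_RE = re.compile(r"^for\s+([A-Za-z_]\w*)\s+in\s+range\((.*)\)\s*:\s*$")


def split_top_level_args(text: str) -> list:
    """Split on commas that are not nested inside parentheses."""
    args, depth, current = [], 0, []
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            args.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail or args:
        args.append(tail)
    return args


def check_for_header(line: str) -> tuple:
    """Return (loop_var, range_args) or raise ValueError."""
    m = _FOR_RE.match(line.strip())
    if m is None:
        raise ValueError("only 'for <var> in range(...):' is accepted")
    args = split_top_level_args(m.group(2))
    if not 1 <= len(args) <= 3 or any(a == "" for a in args):
        raise ValueError("range() expects 1 to 3 arguments")
    return m.group(1), args
```

A zero step cannot be ruled out by this kind of check in general, because `step` may be an arbitrary expression; that case stays a runtime error.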
123+
References:
124+
- `../miniexpr/src/dsl_parser.c:1005`
125+
- `../miniexpr/src/dsl_parser.c:1027`
126+
- `../miniexpr/src/dsl_parser.c:1044`
127+
- `../miniexpr/src/miniexpr.c:3638`
128+
- `../miniexpr/src/miniexpr.c:3652`
129+
- `../miniexpr/src/miniexpr.c:8747`
130+
- `../miniexpr/tests/test_dsl_syntax.c:180`
131+
132+
### 6.4 break/continue
133+
- only valid inside loops
134+
- the deprecated `break if ...` / `continue if ...` forms are explicitly rejected
135+
136+
References:
137+
- `../miniexpr/src/dsl_parser.c:717`
138+
- `../miniexpr/src/dsl_parser.c:726`
139+
- `../miniexpr/src/dsl_parser.c:733`
140+
- `../miniexpr/src/miniexpr.c:7153`
141+
- `../miniexpr/tests/test_dsl_syntax.c:472`
142+
143+
## 7. Assignments
144+
145+
Supported syntactic forms:
146+
- `x = expr`
147+
- `x += expr`
148+
- `x -= expr`
149+
- `x *= expr`
150+
- `x /= expr`
151+
- `x //= expr`
152+
153+
Desugaring:
154+
- `//=` becomes `floor(lhs / (rhs))`
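
To make the `//=` desugaring concrete, here is a small sketch of the text rewrite. The helper name is hypothetical, and the `lhs = ...` wrapper around the `floor(...)` call is an assumption about how the rewritten statement is formed.

```python
def desugar_floor_div_assign(lhs: str, rhs: str) -> str:
    """Rewrite `lhs //= rhs` using the floor(lhs / (rhs)) form listed above."""
    # The parentheses around rhs keep `x //= a + b` from turning into
    # floor(x / a + b), which would divide by `a` only.
    return f"{lhs} = floor({lhs} / ({rhs}))"


print(desugar_floor_div_assign("x", "a + b"))  # x = floor(x / (a + b))
```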
155+
156+
References:
157+
- `../miniexpr/src/dsl_parser.c:1100`
158+
- `../miniexpr/src/dsl_parser.c:1138`
159+
- `../miniexpr/src/dsl_parser.c:1152`
160+
- `../miniexpr/src/dsl_parser.c:1164`
161+
- `../miniexpr/src/dsl_parser.c:214`
162+
- `../miniexpr/tests/test_dsl_syntax.c:1604`
163+
164+
## 8. print statement
165+
166+
The parser recognizes `print(...)` as a dedicated statement.
167+
Compiler rules:
168+
- at least one argument
169+
- optional first string-format argument
170+
- placeholder count must match supplied value args
171+
- print args must be uniform expressions
172+
173+
References:
174+
- `../miniexpr/src/dsl_parser.c:1245`
175+
- `../miniexpr/src/dsl_parser.c:1305`
176+
- `../miniexpr/src/miniexpr.c:6878`
177+
- `../miniexpr/src/miniexpr.c:6932`
178+
- `../miniexpr/src/miniexpr.c:6979`
179+
- `../miniexpr/src/miniexpr.c:7026`
180+
- `../miniexpr/tests/test_dsl_syntax.c:1451`
181+
182+
## 9. Expressions: parser vs compiler responsibilities
183+
184+
Parser-side expression handling is intentionally shallow:
185+
- captures text until end-of-statement with balanced parentheses and string checks
186+
- does **not** parse the Python expression grammar in depth at the DSL-parser level
187+
188+
Compilation/evaluation is delegated to the miniexpr expression compiler (`private_compile_ex`) plus DSL semantic checks.
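
The shallow capture described above amounts to little more than a balance check. A rough Python equivalent, assuming backslash escapes inside string literals (an assumption, not something verified against the C source), might be:

```python
def expression_text_is_balanced(text: str) -> bool:
    """Check parenthesis balance and string termination in expression text.

    Parentheses inside string literals are ignored.
    """
    depth = 0
    quote = None  # current string delimiter, or None when outside a string
    i = 0
    while i < len(text):
        ch = text[i]
        if quote is not None:
            if ch == "\\":
                i += 2  # skip the escaped character (assumed escape rule)
                continue
            if ch == quote:
                quote = None
        elif ch in ("'", '"'):
            quote = ch
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False  # closing parenthesis with no opener
        i += 1
    return depth == 0 and quote is None
```

Note what this check does not do: `a if c else b` passes it, which is why deeper validation has to happen elsewhere.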
189+
190+
References:
191+
- `../miniexpr/src/dsl_parser.c:575`
192+
- `../miniexpr/src/dsl_parser.c:621`
193+
- `../miniexpr/src/dsl_parser.c:633`
194+
- `../miniexpr/src/miniexpr.c:3358`
195+
- `../miniexpr/src/miniexpr.c:3429`
196+
197+
## 10. Reserved identifiers and ND symbols
198+
199+
Reserved names, rejected for user variables and functions:
200+
- `print`, `int`, `float`, `bool`, `def`, `return`, `_ndim`, `_i<d>`, `_n<d>`
201+
202+
ND reserved symbol handling:
203+
- `_i0.._iN`, `_n0.._nN`, and `_ndim` are scanned for and injected as synthetic variables when used.
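
A validator could mirror the reserved-name list and the ND symbol scan with something like the sketch below. The function names are hypothetical, and the regular expressions assume `<d>` is a non-negative decimal index.

```python
import re

# Fixed reserved names from the list above.
RESERVED_NAMES = {"print", "int", "float", "bool", "def", "return", "_ndim"}
# Assumption: <d> in `_i<d>` / `_n<d>` is one or more decimal digits.
_ND_SYMBOL_RE = re.compile(r"_[in]\d+")


def is_reserved(name: str) -> bool:
    """True if `name` may not be used for a user variable or function."""
    return name in RESERVED_NAMES or _ND_SYMBOL_RE.fullmatch(name) is not None


def used_nd_symbols(expr_text: str) -> list:
    """Return the sorted, de-duplicated ND symbols used as whole identifiers."""
    found = re.findall(r"\b(_ndim|_[in]\d+)\b", expr_text)
    return sorted(set(found))
```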
204+
205+
References:
206+
- `../miniexpr/src/miniexpr.c:546`
207+
- `../miniexpr/src/miniexpr.c:602`
208+
- `../miniexpr/src/miniexpr.c:7422`
209+
- `../miniexpr/src/miniexpr.c:7431`
210+
- `../miniexpr/src/miniexpr.c:7462`
211+
- `../miniexpr/tests/test_dsl_syntax.c:855`
212+
213+
## 11. Cast intrinsics (current explicit support)
214+
215+
Supported intrinsics:
216+
- `int(expr)`
217+
- `float(expr)`
218+
- `bool(expr)`
219+
220+
Validation:
221+
- must be used in call form (e.g. `int(x)`, not a bare `int`)
222+
- exactly one argument
223+
- bad arity rejected
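
One way to preflight these cast rules from Python is to parse the expression with the standard `ast` module. This is a sketch under a notable assumption: it only works for expression text that also happens to be valid Python, and `check_casts` is a hypothetical helper.

```python
import ast

CAST_NAMES = {"int", "float", "bool"}


def check_casts(expr_text: str) -> None:
    """Raise ValueError if a cast intrinsic is not called with exactly one argument."""
    tree = ast.parse(expr_text, mode="eval")
    called = set()  # ids of Name nodes that appear as the callee of a call
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in CAST_NAMES
        ):
            called.add(id(node.func))
            if len(node.args) != 1 or node.keywords:
                raise ValueError(f"{node.func.id}() expects exactly one argument")
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and node.id in CAST_NAMES and id(node) not in called:
            raise ValueError(f"{node.id} must be used in call form")
```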
224+
225+
References:
226+
- `../miniexpr/src/miniexpr.c:568`
227+
- `../miniexpr/src/miniexpr.c:654`
228+
- `../miniexpr/src/miniexpr.c:660`
229+
- `../miniexpr/src/miniexpr.c:764`
230+
- `../miniexpr/src/miniexpr.c:3377`
231+
- `../miniexpr/tests/test_dsl_syntax.c:1485`
232+
- `../miniexpr/tests/test_nd.c:116`
233+
234+
## 12. Signature and variable binding constraints
235+
236+
Compile-time constraints:
237+
- DSL function parameters must match provided variable entries by name (set equality; order can differ)
238+
- a parameter count mismatch is rejected
239+
- duplicate/conflicting variable/function names are rejected
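
The name-matching rule above is a set comparison, not a positional one. A minimal sketch, with `check_signature` as a hypothetical helper and both inputs assumed to be plain lists of names:

```python
def check_signature(params: list, variables: list) -> None:
    """Require DSL parameters and provided variables to match by name, in any order."""
    if len(set(variables)) != len(variables):
        raise ValueError("duplicate variable name")
    if len(params) != len(variables):
        raise ValueError(
            f"parameter count mismatch: {len(params)} params, {len(variables)} variables"
        )
    missing = set(params) - set(variables)
    if missing:
        raise ValueError(f"no variable provided for: {sorted(missing)}")
```

For example, `check_signature(["a", "b"], ["b", "a"])` passes, while `check_signature(["a", "b"], ["a", "c"])` raises.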
240+
241+
References:
242+
- `../miniexpr/src/miniexpr.c:7247`
243+
- `../miniexpr/src/miniexpr.c:7268`
244+
- `../miniexpr/src/miniexpr.c:7386`
245+
- `../miniexpr/src/miniexpr.c:7395`
246+
- `../miniexpr/tests/test_dsl_syntax.c:503`
247+
248+
## 13. Return semantics and dtype consistency
249+
250+
Compile-time:
251+
- at least one return expression must be compilable
252+
- all return paths that do return must share dtype
253+
254+
Runtime:
255+
- programs that do not return on every path can still compile, but reaching the end of the function without a return at runtime yields an evaluation error
256+
257+
References:
258+
- `../miniexpr/src/miniexpr.c:6857`
259+
- `../miniexpr/src/miniexpr.c:6869`
260+
- `../miniexpr/src/miniexpr.c:7497`
261+
- `../miniexpr/tests/test_dsl_syntax.c:494`
262+
- `../miniexpr/tests/test_dsl_syntax.c:552`
263+
264+
## 14. DSL detection and compile error mapping
265+
266+
- DSL candidate detection is heuristic (`dsl_is_candidate`)
267+
- if the input is treated as DSL and compilation fails, the compile API returns a parse error with an offset
268+
269+
References:
270+
- `../miniexpr/src/miniexpr.c:2790`
271+
- `../miniexpr/src/miniexpr.c:7534`
272+
- `../miniexpr/src/miniexpr.c:7573`
273+
- `../miniexpr/src/miniexpr.h:234`
274+
275+
## 15. Known unsupported / risky constructs (current behavior)
276+
277+
These matter for a Python-side validator because the DSL parser accepts expression text broadly:
278+
279+
- Python expression forms not representable in miniexpr grammar (example: ternary `a if c else b`) are not blocked at DSL-parser level and rely on downstream expression compile behavior.
280+
- Unsupported/unknown function calls in expressions are also largely deferred to expression compilation.
281+
- Current user-facing diagnostics in Python can be poor if failures happen late; this motivates preflight syntax checks in `dsl_kernel.py`.
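
A preflight check for the ternary case could lean on Python's own `ast` module, as in the sketch below. The helper name and the `UNSUPPORTED_NODES` tuple are hypothetical; only the ternary (`ast.IfExp`) is listed because it is the only form named above, and text that is not valid Python is deliberately left for miniexpr to judge.

```python
import ast

# Only the ternary form is named in this inventory; extend as more cases are confirmed.
UNSUPPORTED_NODES = (ast.IfExp,)


def find_unsupported_forms(expr_text: str) -> list:
    """Return the AST node names of known-unsupported Python forms in expr_text."""
    try:
        tree = ast.parse(expr_text, mode="eval")
    except SyntaxError:
        # Not valid Python; this sketch makes no claim and defers to miniexpr.
        return []
    return sorted(
        {type(n).__name__ for n in ast.walk(tree) if isinstance(n, UNSUPPORTED_NODES)}
    )
```

Reporting the node name (here `IfExp`) gives the Python layer enough to produce an early, specific diagnostic instead of a late compile failure.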
282+
283+
Evidence:
284+
- parser stores expression text opaquely: `../miniexpr/src/dsl_parser.c:575`
285+
- compile delegation: `../miniexpr/src/miniexpr.c:3429`
286+
- DSL parse/compile failure path in compile API: `../miniexpr/src/miniexpr.c:7573`
287+
288+
## 16. Notes for P0-T2 doc authoring
289+
290+
When moving this into `../miniexpr/doc/dsl-syntax.md`, keep two explicit tables:
291+
- **Syntax rejection (parse/compile time)**
292+
- **Runtime semantic errors** (e.g., zero `range` step, missing return path)
293+
294+
This distinction is already visible in tests and should remain explicit.
