| 1 | +# Phase 0 / T1: miniexpr DSL syntax inventory (source-anchored) |
| 2 | + |
| 3 | +This is the **ground-truth inventory** from current `../miniexpr` sources/tests. |
| 4 | +It is intended to drive `P0-T2` (`../miniexpr/doc/dsl-syntax.md`) and later Python-side validation. |
| 5 | + |
| 6 | +## 1. Top-level program shape |
| 7 | + |
| 8 | +- Exactly one top-level function definition is expected. |
| 9 | +- Leading blank/comment lines are allowed. |
| 10 | +- Anything after the function body is rejected. |
| 11 | + |
| 12 | +References: |
| 13 | +- `../miniexpr/src/dsl_parser.c:1531` |
| 14 | +- `../miniexpr/src/dsl_parser.c:1548` |
| 15 | +- `../miniexpr/src/dsl_parser.c:1553` |
| 16 | + |
| 17 | +## 2. Pragmas |
| 18 | + |
| 19 | +Supported header pragmas: |
| 20 | +- `# me:fp=<strict|contract|fast>` |
| 21 | +- `# me:compiler=<tcc|cc>` |
| 22 | + |
| 23 | +Behavior: |
| 24 | +- duplicates are rejected (`duplicate me:fp pragma`, `duplicate me:compiler pragma`) |
| 25 | +- unknown `me:*` pragma is rejected |
| 26 | +- malformed assignments/trailing content are rejected |
| 27 | + |
| 28 | +References: |
| 29 | +- `../miniexpr/src/dsl_parser.c:247` |
| 30 | +- `../miniexpr/src/dsl_parser.c:312` |
| 31 | +- `../miniexpr/src/dsl_parser.c:373` |
| 32 | +- `../miniexpr/src/dsl_parser.c:411` |
| 33 | +- `../miniexpr/src/dsl_parser.c:423` |
| 34 | +- `../miniexpr/src/dsl_parser.c:435` |
| 35 | +- `../miniexpr/tests/test_dsl_syntax.c:1203` |
| 36 | +- `../miniexpr/tests/test_dsl_syntax.c:1388` |
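
A Python-side preflight for the pragma rules above could look roughly like the following. It is a minimal sketch, not a port of `dsl_parser.c`: the function name `check_pragmas`, the error wording, and the strict no-whitespace policy around `=` are assumptions made for illustration.

```python
import re

# Pragma names and values come from the inventory above. The function name,
# error messages, and whitespace policy are illustrative assumptions.
ALLOWED = {
    "fp": {"strict", "contract", "fast"},
    "compiler": {"tcc", "cc"},
}

_PRAGMA_RE = re.compile(r"^#\s*me:(\w+)(.*)$")


def check_pragmas(source: str) -> dict:
    """Return {pragma_name: value} or raise ValueError on a bad header."""
    seen = {}
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines are allowed before the function
        if not stripped.startswith("#"):
            break  # first non-comment line ends the header
        match = _PRAGMA_RE.match(stripped)
        if match is None:
            continue  # ordinary comment, not a me:* pragma
        name, rest = match.group(1), match.group(2)
        if name not in ALLOWED:
            raise ValueError(f"unknown me:{name} pragma")
        if name in seen:
            raise ValueError(f"duplicate me:{name} pragma")
        # Anything other than "=<allowed value>" counts as malformed,
        # which also covers trailing content after the value.
        if not rest.startswith("=") or rest[1:] not in ALLOWED[name]:
            raise ValueError(f"malformed me:{name} pragma")
        seen[name] = rest[1:]
    return seen
```

Whether the C parser tolerates spaces around `=` is not established by this inventory, so the sketch rejects them; relax that once the behavior is confirmed against the referenced tests.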
| 37 | + |
| 38 | +## 3. Statement kinds (parser enum) |
| 39 | + |
| 40 | +Recognized statement kinds: |
| 41 | +- assignment |
| 42 | +- expression statement |
| 43 | +- return |
| 44 | +- print |
| 45 | +- if/elif/else |
| 46 | +- while |
| 47 | +- for |
| 48 | +- break |
| 49 | +- continue |
| 50 | + |
| 51 | +References: |
| 52 | +- `../miniexpr/src/dsl_parser.h:16` |
| 53 | +- `../miniexpr/src/dsl_parser.c:1269` |
| 54 | + |
| 55 | +## 4. Function header and parameters |
| 56 | + |
| 57 | +Function header rules: |
| 58 | +- must start with `def` |
| 59 | +- requires function name |
| 60 | +- requires `(...)` parameter list |
| 61 | +- requires trailing `:` |
| 62 | +- duplicate parameter names rejected |
| 63 | + |
| 64 | +References: |
| 65 | +- `../miniexpr/src/dsl_parser.c:1461` |
| 66 | +- `../miniexpr/src/dsl_parser.c:1491` |
| 67 | +- `../miniexpr/src/dsl_parser.c:1510` |
| 68 | +- `../miniexpr/src/dsl_parser.c:1520` |
| 69 | +- `../miniexpr/src/dsl_parser.c:1429` |
| 70 | +- `../miniexpr/src/dsl_parser.c:1445` |
| 71 | + |
| 72 | +## 5. Blocks and indentation |
| 73 | + |
| 74 | +Python-like indentation is enforced: |
| 75 | +- block must be indented after `:` |
| 76 | +- dedent ends block |
| 77 | +- blank/comment-only lines are allowed in blocks |
| 78 | + |
| 79 | +References: |
| 80 | +- `../miniexpr/src/dsl_parser.c:808` |
| 81 | +- `../miniexpr/src/dsl_parser.c:813` |
| 82 | +- `../miniexpr/src/dsl_parser.c:1360` |
| 83 | +- `../miniexpr/src/dsl_parser.c:1401` |
| 84 | + |
| 85 | +## 6. Control flow forms |
| 86 | + |
| 87 | +### 6.1 if/elif/else |
| 88 | +- supports `if`, chained `elif`, optional `else` |
| 89 | +- `elif` after `else` is rejected |
| 90 | +- duplicate `else` is rejected |
| 91 | +- stray `elif`/`else` rejected |
| 92 | + |
| 93 | +References: |
| 94 | +- `../miniexpr/src/dsl_parser.c:852` |
| 95 | +- `../miniexpr/src/dsl_parser.c:914` |
| 96 | +- `../miniexpr/src/dsl_parser.c:942` |
| 97 | +- `../miniexpr/src/dsl_parser.c:1297` |
| 98 | +- `../miniexpr/src/dsl_parser.c:1301` |
| 99 | + |
| 100 | +### 6.2 while |
| 101 | +- `while <expr>:` supported |
| 102 | +- body required/indented |
| 103 | +- runtime loop-iteration cap exists (`ME_DSL_WHILE_MAX_ITERS`) |
| 104 | + |
| 105 | +References: |
| 106 | +- `../miniexpr/src/dsl_parser.c:974` |
| 107 | +- `../miniexpr/src/miniexpr.c:2745` |
| 108 | +- `../miniexpr/src/miniexpr.c:8621` |
| 109 | + |
| 110 | +### 6.3 for |
| 111 | +Only this form is accepted: |
| 112 | +- `for <var> in range(...):` |
| 113 | + |
| 114 | +`range` arity at compile-time: |
| 115 | +- 1 arg: `range(stop)` |
| 116 | +- 2 args: `range(start, stop)` |
| 117 | +- 3 args: `range(start, stop, step)` |
| 118 | +- other arities rejected |
| 119 | + |
| 120 | +Runtime: |
| 121 | +- `step == 0` is a runtime eval error |
| 122 | + |
| 123 | +References: |
| 124 | +- `../miniexpr/src/dsl_parser.c:1005` |
| 125 | +- `../miniexpr/src/dsl_parser.c:1027` |
| 126 | +- `../miniexpr/src/dsl_parser.c:1044` |
| 127 | +- `../miniexpr/src/miniexpr.c:3638` |
| 128 | +- `../miniexpr/src/miniexpr.c:3652` |
| 129 | +- `../miniexpr/src/miniexpr.c:8747` |
| 130 | +- `../miniexpr/tests/test_dsl_syntax.c:180` |
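
The accepted `for` header and the `range` arity rule can be checked before the source reaches miniexpr. The sketch below assumes a single-line header and uses hypothetical helper names (`split_top_level_args`, `check_for_header`); it does not handle commas inside string literals.

```python
import re

# Matches only the one accepted form: for <var> in range(...):
_FOR_RE = re.compile(r"^for\s+([A-Za-z_]\w*)\s+in\s+range\((.*)\)\s*:\s*$")


def split_top_level_args(text: str) -> list:
    """Split on commas that are not nested inside parentheses."""
    args, depth, current = [], 0, []
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            args.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail or args:
        args.append(tail)
    return args


def check_for_header(line: str) -> tuple:
    """Return (loop_var, range_args) or raise ValueError."""
    match = _FOR_RE.match(line.strip())
    if match is None:
        raise ValueError("only 'for <var> in range(...):' is accepted")
    args = split_top_level_args(match.group(2))
    # Arity 1..3 only; an empty slot such as range(a,) is also rejected here.
    if not 1 <= len(args) <= 3 or any(not a for a in args):
        raise ValueError("range() takes 1 to 3 arguments")
    return match.group(1), args
```

The zero-step case is left out on purpose: per the list above it is a runtime error, so a static check could only catch a literal `0` step.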
| 131 | + |
| 132 | +### 6.4 break/continue |
| 133 | +- only valid inside loops |
| 134 | +- deprecated `break if ...` / `continue if ...` explicitly rejected |
| 135 | + |
| 136 | +References: |
| 137 | +- `../miniexpr/src/dsl_parser.c:717` |
| 138 | +- `../miniexpr/src/dsl_parser.c:726` |
| 139 | +- `../miniexpr/src/dsl_parser.c:733` |
| 140 | +- `../miniexpr/src/miniexpr.c:7153` |
| 141 | +- `../miniexpr/tests/test_dsl_syntax.c:472` |
| 142 | + |
| 143 | +## 7. Assignments |
| 144 | + |
| 145 | +Supported syntactic forms: |
| 146 | +- `x = expr` |
| 147 | +- `x += expr` |
| 148 | +- `x -= expr` |
| 149 | +- `x *= expr` |
| 150 | +- `x /= expr` |
| 151 | +- `x //= expr` |
| 152 | + |
| 153 | +Desugaring: |
| 154 | +- `//=` becomes `floor(lhs / (rhs))` |
| 155 | + |
| 156 | +References: |
| 157 | +- `../miniexpr/src/dsl_parser.c:1100` |
| 158 | +- `../miniexpr/src/dsl_parser.c:1138` |
| 159 | +- `../miniexpr/src/dsl_parser.c:1152` |
| 160 | +- `../miniexpr/src/dsl_parser.c:1164` |
| 161 | +- `../miniexpr/src/dsl_parser.c:214` |
| 162 | +- `../miniexpr/tests/test_dsl_syntax.c:1604` |
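
To make the `//=` desugaring concrete, here is a small Python sketch of the rewrite. The helper name `desugar_floordiv_assign` is hypothetical, and the sketch assumes the desugared statement is a plain assignment back to the same name; only the `floor(lhs / (rhs))` shape is taken from the inventory.

```python
import re

# Captures indentation, the target name, and the right-hand side of 'x //= rhs'.
_FLOORDIV_RE = re.compile(r"^(\s*)([A-Za-z_]\w*)\s*//=\s*(.+?)\s*$")


def desugar_floordiv_assign(stmt: str) -> str:
    """Rewrite 'x //= rhs' as 'x = floor(x / (rhs))'; return others unchanged."""
    match = _FLOORDIV_RE.match(stmt)
    if match is None:
        return stmt
    indent, lhs, rhs = match.groups()
    # The parentheses around rhs keep 'x //= a + 1' from turning into
    # 'floor(x / a + 1)', which would change the meaning.
    return f"{indent}{lhs} = floor({lhs} / ({rhs}))"
```

For example, `desugar_floordiv_assign("x //= a + 1")` yields `x = floor(x / (a + 1))`, while `y = a // b` passes through untouched.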
| 163 | + |
| 164 | +## 8. print statement |
| 165 | + |
| 166 | +The parser recognizes `print(...)` as a dedicated statement. |
| 167 | +Compiler rules: |
| 168 | +- at least one argument |
| 169 | +- optional first string-format argument |
| 170 | +- placeholder count must match supplied value args |
| 171 | +- print args must be uniform expressions |
| 172 | + |
| 173 | +References: |
| 174 | +- `../miniexpr/src/dsl_parser.c:1245` |
| 175 | +- `../miniexpr/src/dsl_parser.c:1305` |
| 176 | +- `../miniexpr/src/miniexpr.c:6878` |
| 177 | +- `../miniexpr/src/miniexpr.c:6932` |
| 178 | +- `../miniexpr/src/miniexpr.c:6979` |
| 179 | +- `../miniexpr/src/miniexpr.c:7026` |
| 180 | +- `../miniexpr/tests/test_dsl_syntax.c:1451` |
| 181 | + |
| 182 | +## 9. Expressions: parser vs compiler responsibilities |
| 183 | + |
| 184 | +Parser-side expression handling is intentionally shallow: |
| 185 | +- captures text until end-of-statement with balanced parentheses and string checks |
| 186 | +- does **not** parse Python expression grammar deeply at DSL-parser level |
| 187 | + |
| 188 | +Compilation/evaluation is delegated to the miniexpr expression compiler (`private_compile_ex`) plus DSL semantic checks. |
| 189 | + |
| 190 | +References: |
| 191 | +- `../miniexpr/src/dsl_parser.c:575` |
| 192 | +- `../miniexpr/src/dsl_parser.c:621` |
| 193 | +- `../miniexpr/src/dsl_parser.c:633` |
| 194 | +- `../miniexpr/src/miniexpr.c:3358` |
| 195 | +- `../miniexpr/src/miniexpr.c:3429` |
| 196 | + |
| 197 | +## 10. Reserved identifiers and ND symbols |
| 198 | + |
| 199 | +Reserved names rejected for user vars/functions: |
| 200 | +- `print`, `int`, `float`, `bool`, `def`, `return`, `_ndim`, `_i<d>`, `_n<d>` |
| 201 | + |
| 202 | +ND reserved symbol handling: |
| 203 | +- `_i0.._iN`, `_n0.._nN`, `_ndim` scanned and injected as synthetic vars when used. |
| 204 | + |
| 205 | +References: |
| 206 | +- `../miniexpr/src/miniexpr.c:546` |
| 207 | +- `../miniexpr/src/miniexpr.c:602` |
| 208 | +- `../miniexpr/src/miniexpr.c:7422` |
| 209 | +- `../miniexpr/src/miniexpr.c:7431` |
| 210 | +- `../miniexpr/src/miniexpr.c:7462` |
| 211 | +- `../miniexpr/tests/test_dsl_syntax.c:855` |
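
A validator can mirror the reserved-name list with a set plus one pattern. This sketch assumes `<d>` in `_i<d>` / `_n<d>` means one or more decimal digits, which the inventory does not spell out; `is_reserved` is a hypothetical helper name.

```python
import re

# Fixed reserved words from the list above.
_RESERVED_WORDS = {"print", "int", "float", "bool", "def", "return", "_ndim"}

# Assumption: <d> is a non-empty run of decimal digits (e.g. _i0, _n12).
_ND_SYMBOL_RE = re.compile(r"^_[in]\d+$")


def is_reserved(name: str) -> bool:
    """True if name should be rejected as a user variable or function name."""
    return name in _RESERVED_WORDS or _ND_SYMBOL_RE.match(name) is not None
```

Under that assumption `_i0` and `_n3` are reserved, while `_idx` and `_n` are ordinary identifiers.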
| 212 | + |
| 213 | +## 11. Cast intrinsics (current explicit support) |
| 214 | + |
| 215 | +Supported intrinsics: |
| 216 | +- `int(expr)` |
| 217 | +- `float(expr)` |
| 218 | +- `bool(expr)` |
| 219 | + |
| 220 | +Validation: |
| 221 | +- must appear in call form |
| 222 | +- exactly one argument |
| 223 | +- bad arity rejected |
| 224 | + |
| 225 | +References: |
| 226 | +- `../miniexpr/src/miniexpr.c:568` |
| 227 | +- `../miniexpr/src/miniexpr.c:654` |
| 228 | +- `../miniexpr/src/miniexpr.c:660` |
| 229 | +- `../miniexpr/src/miniexpr.c:764` |
| 230 | +- `../miniexpr/src/miniexpr.c:3377` |
| 231 | +- `../miniexpr/tests/test_dsl_syntax.c:1485` |
| 232 | +- `../miniexpr/tests/test_nd.c:116` |
| 233 | + |
| 234 | +## 12. Signature and variable binding constraints |
| 235 | + |
| 236 | +Compile-time constraints: |
| 237 | +- DSL function parameters must match provided variable entries by name (set equality; order can differ) |
| 238 | +- param count mismatch rejected |
| 239 | +- duplicate/conflicting variable/function names rejected |
| 240 | + |
| 241 | +References: |
| 242 | +- `../miniexpr/src/miniexpr.c:7247` |
| 243 | +- `../miniexpr/src/miniexpr.c:7268` |
| 244 | +- `../miniexpr/src/miniexpr.c:7386` |
| 245 | +- `../miniexpr/src/miniexpr.c:7395` |
| 246 | +- `../miniexpr/tests/test_dsl_syntax.c:503` |
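
The name-matching rule (set equality, order ignored) is easy to state in Python. The sketch below is illustrative only: `check_signature` and its messages are hypothetical and do not reproduce the diagnostics in `miniexpr.c`.

```python
def check_signature(params: list, variables: list) -> None:
    """Raise ValueError unless params and variables name the same set."""
    for label, names in (("parameter", params), ("variable", variables)):
        if len(set(names)) != len(names):
            raise ValueError(f"duplicate {label} name")
    if len(params) != len(variables):
        raise ValueError(
            f"expected {len(params)} variables, got {len(variables)}"
        )
    # With equal lengths and no duplicates, a missing name is the only
    # remaining way for the two sets to differ.
    missing = sorted(set(params) - set(variables))
    if missing:
        raise ValueError(f"no variable provided for: {', '.join(missing)}")
```

`check_signature(["x", "y"], ["y", "x"])` passes because order does not matter; `check_signature(["x", "y"], ["x", "z"])` fails on `y`.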
| 247 | + |
| 248 | +## 13. Return semantics and dtype consistency |
| 249 | + |
| 250 | +Compile-time: |
| 251 | +- at least one return expression must be compilable |
| 252 | +- all return paths that do return must share dtype |
| 253 | + |
| 254 | +Runtime: |
| 255 | +- programs without a guaranteed return can compile, but a missing return at runtime yields an eval error |
| 256 | + |
| 257 | +References: |
| 258 | +- `../miniexpr/src/miniexpr.c:6857` |
| 259 | +- `../miniexpr/src/miniexpr.c:6869` |
| 260 | +- `../miniexpr/src/miniexpr.c:7497` |
| 261 | +- `../miniexpr/tests/test_dsl_syntax.c:494` |
| 262 | +- `../miniexpr/tests/test_dsl_syntax.c:552` |
| 263 | + |
| 264 | +## 14. DSL detection and compile error mapping |
| 265 | + |
| 266 | +- DSL candidate detection is heuristic (`dsl_is_candidate`) |
| 267 | +- If the input is treated as DSL and compilation fails, the compile API returns a parse error with an offset |
| 268 | + |
| 269 | +References: |
| 270 | +- `../miniexpr/src/miniexpr.c:2790` |
| 271 | +- `../miniexpr/src/miniexpr.c:7534` |
| 272 | +- `../miniexpr/src/miniexpr.c:7573` |
| 273 | +- `../miniexpr/src/miniexpr.h:234` |
| 274 | + |
| 275 | +## 15. Known unsupported / risky constructs (current behavior) |
| 276 | + |
| 277 | +These are important for a Python-side validator because the DSL parser accepts expression text broadly: |
| 278 | + |
| 279 | +- Python expression forms not representable in miniexpr grammar (example: ternary `a if c else b`) are not blocked at DSL-parser level and rely on downstream expression compile behavior. |
| 280 | +- Unsupported/unknown function calls in expressions are also largely deferred to expression compilation. |
| 281 | +- Current user-facing diagnostics in Python can be poor when failures happen late; this motivates preflight syntax checks in `dsl_kernel.py`. |
| 282 | + |
| 283 | +Evidence: |
| 284 | +- parser stores expression text opaquely: `../miniexpr/src/dsl_parser.c:575` |
| 285 | +- compile delegation: `../miniexpr/src/miniexpr.c:3429` |
| 286 | +- DSL parse/compile failure path in compile API: `../miniexpr/src/miniexpr.c:7573` |
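
Since the DSL source is Python-like, one possible preflight is to walk each expression with Python's standard `ast` module and flag node types known to be unsupported. This is a sketch under assumptions: only the ternary (`ast.IfExp`) is named in this inventory, the helper name `preflight_expression` is hypothetical, and an expression that miniexpr accepts but Python does not would be misreported as a syntax error.

```python
import ast

# Only the ternary form is named in the inventory; extend this tuple once
# other rejected forms are confirmed against the miniexpr expression compiler.
UNSUPPORTED_NODES = (ast.IfExp,)


def preflight_expression(expr: str) -> list:
    """Return human-readable problems found in one expression string."""
    try:
        tree = ast.parse(expr, mode="eval")
    except SyntaxError as exc:
        return [f"not a valid Python expression: {exc.msg}"]
    problems = []
    for node in ast.walk(tree):
        if isinstance(node, UNSUPPORTED_NODES):
            problems.append(
                f"{type(node).__name__} at column {node.col_offset} "
                "is not expected to compile in miniexpr"
            )
    return problems
```

An empty list means the sketch found nothing to report, not that the expression is guaranteed to compile; the expression compiler remains the source of truth.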
| 287 | + |
| 288 | +## 16. Notes for P0-T2 doc authoring |
| 289 | + |
| 290 | +When moving this into `../miniexpr/doc/dsl-syntax.md`, keep two explicit tables: |
| 291 | +- **Syntax rejection (parse/compile time)** |
| 292 | +- **Runtime semantic errors** (e.g., zero `range` step, missing return path) |
| 293 | + |
| 294 | +This distinction is already visible in tests and should remain explicit. |