Commit 5d7e936

Merge pull request #3 from jgarzik/updates
Updates
2 parents a64ad0f + ddd73c8 commit 5d7e936

47 files changed

Lines changed: 2888 additions & 3723 deletions

CLAUDE.md

Lines changed: 10 additions & 6 deletions
@@ -35,16 +35,20 @@ Callee-saved registers hold global interpreter state:
 |----------|------|
 | `rbx` | Bytecode IP (into co_code[]) |
 | `r12` | Current frame (PyFrame*) |
-| `r13` | Value stack top |
+| `r13` | Value stack top (payload array, u64[]) |
 | `r14` | co_consts data ptr (&tuple.ob_item[0]) |
-| `r15` | co_names data ptr (&tuple.ob_item[0]) |
+| `r15` | Tag stack top (sidecar tag array, u8[]) |
 | `ecx` | Opcode arg on handler entry |

+co_names is accessed via `LOAD_CO_NAMES reg` / `LOAD_CO_NAMES_TAGS reg` macros (reads from `eval_co_names` / `eval_co_names_tags` globals), not a dedicated register.
+
 **Critical rule:** Never hold live values in caller-saved regs (rax, rcx, rdx, rsi, rdi, r8-r11) across `call` or `DECREF`/`DECREF_REG`. Use push/pop or callee-saved regs instead. `DECREF_REG` calls `obj_dealloc` which clobbers all caller-saved regs.

-## 128-bit Fat Values
+## Value64 Representation
+
+Values are split into 64-bit payloads stored in `u64[]` arrays and 8-bit tags stored in separate `u8[]` sidecar arrays. The value stack uses `r13` (payload top) and `r15` (tag top). Containers (list, tuple, dict) store `ob_item` (u64[]) and `ob_item_tags` (u8[]) separately. Frame locals use `localsplus` (u64[]) and `locals_tag_base` (u8[]).

-All values are 128-bit (payload, tag) pairs in 16-byte slots. Tags: `TAG_NULL=0`, `TAG_SMALLINT=1`, `TAG_FLOAT=2`, `TAG_NONE=3`, `TAG_BOOL=4`, `TAG_PTR=0x105`. SmallInts store raw signed i64 in payload (full 64-bit range), zero heap alloc/refcount. `INCREF_VAL`/`DECREF_VAL` check `TAG_RC_BIT` (bit 8) to decide refcounting. Functions return `(rax=payload, edx=tag)`.
+Tags (u8): `TAG_NULL=0`, `TAG_SMALLINT=1`, `TAG_FLOAT=2`, `TAG_NONE=3`, `TAG_BOOL=4`, `TAG_PTR=0x85`. Bit 7 (`TAG_RC_BIT=0x80`) means payload is a refcounted heap pointer. SmallInts store raw signed i64 in payload (full 64-bit range), zero heap alloc/refcount. `INCREF_VAL`/`DECREF_VAL` check `TAG_RC_BIT` to decide refcounting. Functions return `(rax=payload, edx=tag)`.

 ## Source Layout
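
Illustration (not part of the commit): the Value64 description in the hunk above can be modeled in C roughly as follows. This is a minimal sketch, not the project's implementation. The tag constants are copied from the new CLAUDE.md text; the `ValueStack` struct, the helper names, and the upward stack growth are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Tag values as listed in the updated CLAUDE.md text. */
enum {
    TAG_NULL     = 0x00,
    TAG_SMALLINT = 0x01,
    TAG_FLOAT    = 0x02,
    TAG_NONE     = 0x03,
    TAG_BOOL     = 0x04,
    TAG_RC_BIT   = 0x80, /* bit 7: payload is a refcounted heap pointer */
    TAG_PTR      = 0x85
};

/* Hypothetical model of the split value stack. In the interpreter the two
 * "top" pointers live in r13 (payloads) and r15 (tags). Upward growth is an
 * assumption of this sketch. */
typedef struct {
    uint64_t *pay_base, *pay_top; /* u64[] payload array */
    uint8_t  *tag_base, *tag_top; /* u8[] sidecar tag array */
} ValueStack;

/* The check that INCREF_VAL/DECREF_VAL are described as making. */
static inline int tag_is_refcounted(uint8_t tag) {
    return (tag & TAG_RC_BIT) != 0;
}

/* Rough analogue of `VPUSH_VAL pay, tag`: one store per array. */
static inline void vpush_val(ValueStack *s, uint64_t pay, uint8_t tag) {
    *s->pay_top++ = pay;
    *s->tag_top++ = tag;
}

/* Rough analogue of `VPOP_VAL pay, tag`. */
static inline void vpop_val(ValueStack *s, uint64_t *pay, uint8_t *tag) {
    assert(s->pay_top > s->pay_base && s->tag_top > s->tag_base);
    *pay = *--s->pay_top;
    *tag = *--s->tag_top;
}
```

With this layout each value occupies 8 + 1 = 9 bytes across the two arrays instead of a 16-byte slot, and `TAG_PTR` (0x85) is the only listed tag with `TAG_RC_BIT` set.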

@@ -64,7 +68,7 @@ All values are 128-bit (payload, tag) pairs in 16-byte slots. Tags: `TAG_NULL=0`
 Defined in `include/*.inc`. All objects start with `PyObject` (ob_refcnt +0, ob_type +8).

 - **PyTypeObject** (types.inc, 192 bytes): tp_call +64, tp_getattr +72, tp_setattr +80, tp_as_number +128, tp_as_sequence +136, tp_as_mapping +144
-- **PyFrame** (frame.inc): code +8, globals +16, locals +32, localsplus +72 (variable-size)
+- **PyFrame** (frame.inc): code +8, globals +16, locals +32, stack_tag_ptr +64, locals_tag_base +96, localsplus +104 (variable-size u64[])
 - **PyCodeObject** (object.inc): co_consts, co_names, co_code starts at +112

 ## Opcode Handler Pattern
@@ -77,7 +81,7 @@ op_example:
 DISPATCH ; jmp eval_dispatch
 ```

-Stack macros: `VPUSH reg`, `VPOP reg`, `VPEEK reg`, `VPEEK_AT reg, offset`
+Stack macros: `VPUSH_PTR reg`, `VPUSH_INT reg`, `VPUSH_FLOAT reg`, `VPUSH_NONE`, `VPUSH_BOOL reg`, `VPUSH_VAL pay, tag`, `VPOP reg` (payload only), `VPOP_VAL pay, tag`, `VPEEK reg`

 ## Named Frame-Layout Constants

STYLE.md

Lines changed: 13 additions & 12 deletions
@@ -45,10 +45,11 @@ Repeat the register convention comment block at the top of every
 ; Register convention (callee-saved, preserved across handlers):
 ; rbx = bytecode instruction pointer (current position in co_code[])
 ; r12 = current frame pointer (PyFrame*)
-; r13 = value stack top pointer
+; r13 = value stack top pointer (payload array, u64[])
 ; r14 = co_consts tuple data pointer (&tuple.ob_item[0])
-; r15 = co_names tuple data pointer (&tuple.ob_item[0])
+; r15 = tag stack top pointer (sidecar tag array, u8[])
 ;
+; co_names accessed via LOAD_CO_NAMES / LOAD_CO_NAMES_TAGS macros (globals).
 ; ecx = opcode argument on entry (set by eval_dispatch)
 ; rbx has already been advanced past the 2-byte instruction word.
 ```
@@ -230,16 +231,16 @@ with raw arithmetic unless implementing a new stack macro.
 Prefer typed pushes (`VPUSH_PTR`, `VPUSH_INT`) over `VPUSH` when the
 type is statically known — they avoid branching.

-## Fat Value Return/Push Macros
+## Value Return/Push Macros

-Always use these macros for fat value return patterns. Never inline the
+Always use these macros for value return patterns. Never inline the
 equivalent instructions — inlining is a source of bugs.

 | Macro | Expansion | Use when |
 |-------|-----------|----------|
 | `RET_NULL` | `xor eax, eax` / `xor edx, edx` | Error return: (0, TAG_NULL) |
 | `RET_TAG_SMALLINT` | `mov edx, TAG_SMALLINT` | Return SmallInt (caller sets rax) |
-| `SPUSH_PTR reg` | `sub rsp, 16` / `mov [rsp], reg` / `mov qword [rsp+8], TAG_PTR` | Build 16-byte fat arg on stack for tp_call |
+| `SPUSH_PTR reg` | `sub rsp, 16` / `mov [rsp], reg` / `mov qword [rsp+8], TAG_PTR` | Build fat arg on stack for tp_call |

 ## Refcounting Macros

@@ -249,21 +250,21 @@ equivalent instructions — inlining is a source of bugs.
 | `DECREF reg` | Known heap pointer (saves/restores rdi) |
 | `DECREF_REG reg` | Known heap pointer (does NOT save rdi) |
 | `XDECREF reg` | Possibly NULL heap pointer |
-| `INCREF_VAL pay, tag` | 128-bit fat value |
-| `DECREF_VAL pay, tag` | 128-bit fat value (clobbers rdi + caller-saved) |
-| `XDECREF_VAL pay, tag` | 128-bit fat value, NULL-safe |
+| `INCREF_VAL pay, tag` | Value64 (payload + u8 tag) |
+| `DECREF_VAL pay, tag` | Value64 (clobbers rdi + caller-saved) |
+| `XDECREF_VAL pay, tag` | Value64, NULL-safe |

 `DECREF_REG` and `DECREF_VAL` contain `call obj_dealloc` which **clobbers
 all caller-saved registers** when the refcount reaches zero.

 ## Addressing Idioms

-**Localsplus indexing** (16 bytes/slot = ×8 × ×2 via LEA):
+**Localsplus indexing** (8 bytes/payload slot + separate u8 tag array):

 ```nasm
-lea rdx, [rcx*8] ; slot * 8
-mov rdi, [r12 + rdx*2 + PyFrame.localsplus] ; payload
-mov r9, [r12 + rdx*2 + PyFrame.localsplus + 8] ; tag
+mov rdi, [r12 + rcx*8 + PyFrame.localsplus] ; payload from u64[]
+mov rdx, [r12 + PyFrame.locals_tag_base] ; tag array base (u8[])
+movzx esi, byte [rdx + rcx] ; tag from u8[]
 ```

 **Forward bytecode jumps** (instruction words → bytes = ×2):
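
Illustration (not part of the commit): the new localsplus indexing idiom in this hunk, written out in C. This is a sketch under stated assumptions. `FrameModel` is a hypothetical, trimmed-down stand-in for `PyFrame` that keeps only the two fields the idiom reads, and its fixed array size is chosen for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, trimmed-down frame with only the fields the idiom touches.
 * The real PyFrame has more fields and a variable-size localsplus. */
typedef struct {
    uint8_t *locals_tag_base; /* u8[] tag array, one byte per slot */
    uint64_t localsplus[4];   /* u64[] payload array, fixed size here */
} FrameModel;

/* Load local `slot` as a (payload, tag) pair. */
static inline void load_local(const FrameModel *f, uint32_t slot,
                              uint64_t *pay, uint8_t *tag) {
    assert(slot < 4);
    *pay = f->localsplus[slot];               /* [r12 + rcx*8 + PyFrame.localsplus] */
    const uint8_t *tags = f->locals_tag_base; /* [r12 + PyFrame.locals_tag_base]    */
    *tag = tags[slot];                        /* movzx esi, byte [rdx + rcx]        */
}
```

The payload load scales the slot index by 8, while the tag load uses the unscaled index into a separately addressed byte array. That is why the new assembly needs one extra load for the tag base but no longer needs the `lea` that computed the 16-byte stride.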

include/frame.inc

Lines changed: 9 additions & 6 deletions
@@ -11,12 +11,15 @@ struc PyFrame
 .builtins: resq 1 ; +24: ptr to builtins dict
 .locals: resq 1 ; +32: ptr to locals dict (or NULL for fast locals)
 .instr_ptr: resq 1 ; +40: saved bytecode pointer
-.stack_ptr: resq 1 ; +48: saved value stack pointer
-.stack_base: resq 1 ; +56: ptr to bottom of value stack
-.return_offset: resd 1 ; +64: return offset
-.nlocalsplus: resd 1 ; +68: number of locals + cells + frees
-.func_obj: resq 1 ; +72: ptr to function object (for closures)
-.localsplus: ; +80: PyObject*[] array (variable size)
+.stack_ptr: resq 1 ; +48: saved payload stack pointer
+.stack_base: resq 1 ; +56: ptr to bottom of payload stack
+.stack_tag_ptr: resq 1 ; +64: saved tag stack pointer
+.stack_tag_base: resq 1 ; +72: ptr to bottom of tag stack
+.return_offset: resd 1 ; +80: return offset
+.nlocalsplus: resd 1 ; +84: number of locals + cells + frees
+.func_obj: resq 1 ; +88: ptr to function object (for closures)
+.locals_tag_base: resq 1 ; +96: ptr to locals tag array
+.localsplus: ; +104: Value64 payload array (variable size)
 endstruc

 FRAME_HEADER_SIZE equ PyFrame.localsplus
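
Illustration (not part of the commit): the offsets in the new `PyFrame` layout can be cross-checked with a C mirror of the struc. This is a sketch only. The field at +0 lies outside the diff context, so `field0` is a placeholder; `code` and `globals` take their offsets from the CLAUDE.md text. Pointer fields are modeled as `uint64_t` so the layout does not depend on the host pointer size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C mirror of `struc PyFrame` after this commit. */
typedef struct {
    uint64_t field0;          /* +0: not shown in this diff (placeholder) */
    uint64_t code;            /* +8 (per CLAUDE.md) */
    uint64_t globals;         /* +16 (per CLAUDE.md) */
    uint64_t builtins;        /* +24 */
    uint64_t locals;          /* +32 */
    uint64_t instr_ptr;       /* +40 */
    uint64_t stack_ptr;       /* +48: saved payload stack pointer */
    uint64_t stack_base;      /* +56 */
    uint64_t stack_tag_ptr;   /* +64: saved tag stack pointer */
    uint64_t stack_tag_base;  /* +72 */
    uint32_t return_offset;   /* +80 (resd) */
    uint32_t nlocalsplus;     /* +84 (resd) */
    uint64_t func_obj;        /* +88 */
    uint64_t locals_tag_base; /* +96 */
    uint64_t localsplus[];    /* +104: payload array, variable size */
} PyFrameModel;

/* Same definition as `FRAME_HEADER_SIZE equ PyFrame.localsplus`. */
enum { FRAME_HEADER_SIZE = offsetof(PyFrameModel, localsplus) };

/* Runtime check that the mirror agrees with the offsets in the comments. */
static inline void check_frame_layout(void) {
    assert(offsetof(PyFrameModel, builtins) == 24);
    assert(offsetof(PyFrameModel, stack_ptr) == 48);
    assert(offsetof(PyFrameModel, stack_tag_ptr) == 64);
    assert(offsetof(PyFrameModel, return_offset) == 80);
    assert(offsetof(PyFrameModel, nlocalsplus) == 84);
    assert(offsetof(PyFrameModel, func_obj) == 88);
    assert(offsetof(PyFrameModel, locals_tag_base) == 96);
    assert(FRAME_HEADER_SIZE == 104);
}
```

The three new `resq` fields add 24 bytes, which accounts for the header growing from +80 to +104.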

include/gc.inc

Lines changed: 1 addition & 3 deletions
@@ -31,12 +31,10 @@ GC_PREV_MASK equ ~3 ; mask to extract prev pointer (clear low 2 bi
 lea %1, [%2 + GC_HEAD_SIZE]
 %endmacro

-; VISIT_FAT — call visit callback on a fat value slot if it's a heap pointer
+; VISIT_FAT — call visit callback on a value if it's a heap pointer
 ; r14 must be loaded with the visit callback function pointer before use
 ; rdi is set to the payload (object pointer) for the callback
 %macro VISIT_FAT 2 ; %1 = payload_reg, %2 = tag_reg (64-bit)
-bt %2, 63
-jc %%skip ; SmallStr — skip
 test %2, TAG_RC_BIT
 jz %%skip ; no RC bit — not a heap pointer
 test %1, %1
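
Illustration (not part of the commit): the control flow of `VISIT_FAT` after this change, sketched in C. The diff context ends at `test %1, %1`, so the rest of the macro is not visible here. The null-payload skip and the callback invocation below are assumptions inferred from that instruction and from the macro's header comment. `visit_fn`, `visit_value`, and `count_visit` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_RC_BIT 0x80u /* value taken from the CLAUDE.md text in this commit */

/* Hypothetical callback type; the real signature is not shown in the diff. */
typedef void (*visit_fn)(void *obj);

/* Visit a (payload, tag) pair only if it refers to a heap object. */
static inline void visit_value(uint64_t payload, uint64_t tag, visit_fn visit) {
    if ((tag & TAG_RC_BIT) == 0)
        return; /* test %2, TAG_RC_BIT / jz %%skip: not a heap pointer */
    if (payload == 0)
        return; /* assumed outcome of `test %1, %1`: skip NULL payloads */
    visit((void *)(uintptr_t)payload); /* payload is passed as the object pointer */
}

/* Demo callback that counts how many objects were visited. */
static int visit_count;
static void count_visit(void *obj) {
    (void)obj;
    visit_count++;
}
```

The removed `bt %2, 63` / `jc` pair was a separate SmallStr check on the tag register. In this model the `TAG_RC_BIT` test is the only tag check left before the payload is examined.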
