| 1 | +--- |
| 2 | +title: Contributing |
| 3 | +weight: 1 |
| 4 | +bookToc: true |
| 5 | +--- |
| 6 | + |
| 7 | +# Contributing to Zerfoo |
| 8 | + |
| 9 | +Thank you for your interest in contributing to Zerfoo, the Go-native ML inference and training framework. This guide covers the full Zerfoo ecosystem and applies to all six repositories. |
| 10 | + |
| 11 | +## Code of Conduct |
| 12 | + |
| 13 | +All participants in the Zerfoo community are expected to treat each other with respect and professionalism. We are committed to providing a welcoming and inclusive environment for everyone. |
| 14 | + |
| 15 | +## Repository Structure |
| 16 | + |
| 17 | +Zerfoo is an ecosystem of six independent repositories (each with its own `go.mod`, CI, and releases): |
| 18 | + |
| 19 | +| Repository | Module | Purpose | |
| 20 | +|-----------|--------|---------| |
| 21 | +| [zerfoo](https://github.com/zerfoo/zerfoo) | `github.com/zerfoo/zerfoo` | Core ML framework: inference, training, serving | |
| 22 | +| [ztensor](https://github.com/zerfoo/ztensor) | `github.com/zerfoo/ztensor` | GPU-accelerated tensor, compute engine, computation graph | |
| 23 | +| [ztoken](https://github.com/zerfoo/ztoken) | `github.com/zerfoo/ztoken` | BPE tokenizer with HuggingFace compatibility | |
| 24 | +| [zonnx](https://github.com/zerfoo/zonnx) | `github.com/zerfoo/zonnx` | ONNX-to-GGUF converter CLI | |
| 25 | +| [float16](https://github.com/zerfoo/float16) | `github.com/zerfoo/float16` | IEEE 754 half-precision (Float16/BFloat16) arithmetic | |
| 26 | +| [float8](https://github.com/zerfoo/float8) | `github.com/zerfoo/float8` | FP8 E4M3FN arithmetic for quantized inference | |
| 27 | + |
| 28 | +**Dependency graph:** |
| 29 | + |
| 30 | +``` |
| 31 | +float16 --+ |
| 32 | +float8 --+--> ztensor --> zerfoo |
| 33 | +ztoken --+ |
| 34 | + |
| 35 | +zonnx (standalone) |
| 36 | +``` |
| 37 | + |
| 38 | +Each repo is versioned and released independently. Do not treat this as a monorepo -- submit PRs to the repository where the change belongs. |
| 39 | + |
| 40 | +## Development Setup |
| 41 | + |
| 42 | +### Prerequisites |
| 43 | + |
| 44 | +- **Go 1.25+** (generics with `tensor.Numeric` constraint) |
| 45 | +- **Git** |
| 46 | +- **CUDA Toolkit** (optional, for GPU-accelerated tests and development) |
| 47 | + |
| 48 | +### Clone and Build |
| 49 | + |
| 50 | +Each repository builds independently: |
| 51 | + |
| 52 | +```bash |
| 53 | +# Clone whichever repo you want to work on |
| 54 | +git clone https://github.com/zerfoo/<repo>.git |
| 55 | +cd <repo> |
| 56 | +go mod tidy |
| 57 | +go test ./... |
| 58 | +``` |
| 59 | + |
| 60 | +No CGo is required for CPU-only builds. GPU support is loaded dynamically at runtime via purego/dlopen, so `go build ./...` works on any platform without a C compiler. |
| 61 | + |
| 62 | +## Running Tests |
| 63 | + |
| 64 | +```bash |
| 65 | +go test ./... # All CPU tests (no GPU required) |
| 66 | +go test -race ./... # Tests with race detector (required before submitting) |
| 67 | +go test -tags cuda ./... # GPU tests (requires CUDA toolkit and a GPU) |
| 68 | +go test -coverprofile=coverage.out ./... # Coverage report |
| 69 | +go tool cover -html=coverage.out -o coverage.html |
| 70 | +``` |
| 71 | + |
| 72 | +### Testing Requirements |
| 73 | + |
| 74 | +- All new code must have tests |
| 75 | +- Use **table-driven tests** with `t.Run` subtests |
| 76 | +- Always run with the **`-race` flag** before submitting |
| 77 | +- CI enforces a **75% coverage gate** on new packages |
| 78 | + |
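As a minimal sketch of the table-driven style with `t.Run` subtests described above: `Clamp` is a hypothetical function invented for this illustration, not part of any Zerfoo package. In a real package the test would live in a `_test.go` file; everything is shown in one file here only to keep the sketch self-contained.

```go
package main

import (
	"fmt"
	"testing"
)

// Clamp is a hypothetical function under test (not a Zerfoo API).
// It limits v to the closed interval [lo, hi].
func Clamp(v, lo, hi float32) float32 {
	if v < lo {
		return lo
	}
	if v > hi {
		return hi
	}
	return v
}

// TestClamp lists its cases in a table and runs each one as a named subtest.
func TestClamp(t *testing.T) {
	tests := []struct {
		name      string
		v, lo, hi float32
		want      float32
	}{
		{"below range", -1, 0, 1, 0},
		{"inside range", 0.5, 0, 1, 0.5},
		{"above range", 2, 0, 1, 1},
	}
	for _, tc := range tests {
		t.Run(tc.name, func(t *testing.T) {
			if got := Clamp(tc.v, tc.lo, tc.hi); got != tc.want {
				t.Errorf("Clamp(%v, %v, %v) = %v, want %v", tc.v, tc.lo, tc.hi, got, tc.want)
			}
		})
	}
}

// main exists only so the sketch compiles and runs on its own.
func main() {
	fmt.Println(Clamp(2, 0, 1))
}
```

Named subtests can be run individually, for example `go test -race -run 'TestClamp/above_range' ./...`.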
| 79 | +## Code Style |
| 80 | + |
| 81 | +### Formatting and Linting |
| 82 | + |
| 83 | +- **`gofmt`** -- all code must be formatted with `gofmt` |
| 84 | +- **`goimports`** -- imports must be organized (stdlib, external, internal) |
| 85 | +- **`golangci-lint`** -- run `golangci-lint run` before submitting |
| 86 | + |
| 87 | +### Go Conventions |
| 88 | + |
| 89 | +- Prefer the **Go standard library** over third-party dependencies |
| 90 | +- Follow standard Go naming: PascalCase for exported, camelCase for unexported |
| 91 | +- Write documentation comments for all exported functions, types, and methods |
| 92 | +- Use generics with `[T tensor.Numeric]` constraints -- avoid type-specific code where generics work |
| 93 | +- All tensor arithmetic must flow through `compute.Engine[T]` (see [Key Conventions](#key-conventions)) |
| 94 | + |
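The last two conventions can be sketched together. This is an illustration under loud assumptions: `Numeric` and `Engine` below are simplified local stand-ins, not the real `tensor.Numeric` or `compute.Engine[T]` definitions, and plain slices stand in for tensors. The point is only the shape of the code: one generic function, with the arithmetic delegated to an engine value rather than written inline.

```go
package main

import "fmt"

// Numeric is a local stand-in for tensor.Numeric (assumption: the real
// constraint may include a different set of types).
type Numeric interface {
	~float32 | ~float64 | ~int32 | ~int64
}

// Engine is a hypothetical, heavily simplified stand-in for compute.Engine[T].
// The real interface is not reproduced here.
type Engine[T Numeric] interface {
	Add(a, b []T) ([]T, error)
}

// cpuEngine is a toy CPU implementation of the stand-in Engine.
type cpuEngine[T Numeric] struct{}

// Add returns the element-wise sum of a and b.
func (cpuEngine[T]) Add(a, b []T) ([]T, error) {
	if len(a) != len(b) {
		return nil, fmt.Errorf("length mismatch: %d vs %d", len(a), len(b))
	}
	out := make([]T, len(a))
	for i := range a {
		out[i] = a[i] + b[i]
	}
	return out, nil
}

// Residual adds a skip connection. It is written once for every Numeric type
// and performs no arithmetic itself: the work goes through the engine.
func Residual[T Numeric](e Engine[T], x, skip []T) ([]T, error) {
	return e.Add(x, skip)
}

func main() {
	out, err := Residual[float32](cpuEngine[float32]{}, []float32{1, 2}, []float32{0.5, 0.5})
	fmt.Println(out, err) // prints "[1.5 2.5] <nil>"
}
```

Because `Residual` depends only on the interface, swapping in a different engine implementation would not require changing the caller.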
| 95 | +## Commit Conventions |
| 96 | + |
| 97 | +We use [Conventional Commits](https://www.conventionalcommits.org/) for automated versioning with release-please. |
| 98 | + |
| 99 | +``` |
| 100 | +<type>(<scope>): <description> |
| 101 | +``` |
| 102 | + |
| 103 | +| Type | Description | |
| 104 | +|------|-------------| |
| 105 | +| `feat` | A new feature | |
| 106 | +| `fix` | A bug fix | |
| 107 | +| `perf` | A performance improvement | |
| 108 | +| `docs` | Documentation only changes | |
| 109 | +| `test` | Adding or correcting tests | |
| 110 | +| `chore` | Maintenance tasks, CI, dependencies | |
| 111 | +| `refactor` | Code change that neither fixes a bug nor adds a feature | |
| 112 | + |
| 113 | +Examples: |
| 114 | + |
| 115 | +``` |
| 116 | +feat(inference): add Qwen 2.5 architecture support |
| 117 | +fix(generate): correct KV cache eviction for sliding window attention |
| 118 | +perf(layers): fuse SiLU and gate projection into single kernel |
| 119 | +``` |
| 120 | + |
| 121 | +## Pull Request Process |
| 122 | + |
| 123 | +1. **Branch from `main`** and keep your branch up to date with rebase |
| 124 | +2. **One logical change per PR** -- keep PRs focused and reviewable |
| 125 | +3. **All CI checks must pass** -- tests, linting, formatting |
| 126 | +4. **Rebase and merge** -- we do not use squash merges or merge commits |
| 127 | +5. **Reference related issues** -- use `Fixes #123` or `Closes #123` in the PR description |
| 128 | + |
| 129 | +### Before Submitting |
| 130 | + |
| 131 | +```bash |
| 132 | +go test -race ./... |
| 133 | +go vet ./... |
| 134 | +golangci-lint run |
| 135 | +``` |
| 136 | + |
| 137 | +### Review Process |
| 138 | + |
| 139 | +- All PRs require at least one maintainer approval |
| 140 | +- Maintainers may request changes -- address feedback and force-push to update your branch |
| 141 | +- Once approved and CI is green, a maintainer will rebase-merge your PR |
| 142 | + |
| 143 | +## GPU Development |
| 144 | + |
| 145 | +### purego Bindings |
| 146 | + |
| 147 | +GPU libraries are loaded at runtime via purego/dlopen -- not linked at compile time. This means: |
| 148 | + |
| 149 | +- `go build` never requires a C compiler or GPU SDK |
| 150 | +- GPU availability is detected at runtime |
| 151 | +- The same binary runs on CPU-only machines and falls back to the CPU gracefully |
| 152 | + |
| 153 | +When writing GPU code, use the `compute.Engine[T]` interface. Do not call CUDA/ROCm/OpenCL APIs directly outside of `internal/gpuapi/`. |
| 154 | + |
| 155 | +## Release Process |
| 156 | + |
| 157 | +All six repositories use [release-please](https://github.com/googleapis/release-please) for automated releases: |
| 158 | + |
| 159 | +1. Conventional Commit messages drive version bumps (`feat` = minor, `fix` = patch) |
| 160 | +2. release-please opens a release PR automatically when changes land on `main` |
| 161 | +3. Merging the release PR creates a GitHub release and Git tag |
| 162 | +4. Semantic versioning (`vMAJOR.MINOR.PATCH`) is enforced across all repos |
| 163 | + |
| 164 | +Breaking changes require a `BREAKING CHANGE:` footer in the commit message (or a `!` after the type/scope, as in `feat(api)!: ...`), either of which triggers a major version bump. |
| 165 | + |
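The bump rules above can be sketched as a small function. This is an illustration only, not release-please's actual parser: `bumpFor` is a hypothetical helper, and treating every type other than `feat` and `fix` as "no bump" is an assumption of the sketch. The `!` marker comes from the Conventional Commits specification.

```go
package main

import (
	"fmt"
	"strings"
)

// bumpFor is a hypothetical helper that maps a Conventional Commit message
// to "major", "minor", "patch", or "none". It is a simplification and does
// not reproduce release-please's real logic.
func bumpFor(msg string) string {
	// A BREAKING CHANGE footer at the start of a later line means major.
	if strings.Contains(msg, "\nBREAKING CHANGE:") {
		return "major"
	}
	header, _, _ := strings.Cut(msg, "\n")
	prefix, _, ok := strings.Cut(header, ":")
	if !ok {
		return "none" // not a Conventional Commit header
	}
	// "feat(api)!: ..." marks a breaking change in the header itself.
	if strings.HasSuffix(prefix, "!") {
		return "major"
	}
	// Drop the optional "(scope)" to get the bare type.
	typ, _, _ := strings.Cut(prefix, "(")
	switch typ {
	case "feat":
		return "minor"
	case "fix":
		return "patch"
	}
	return "none" // assumption: other types do not bump in this sketch
}

func main() {
	fmt.Println(bumpFor("feat(inference): add Qwen 2.5 architecture support"))
	fmt.Println(bumpFor("fix(generate): correct KV cache eviction for sliding window attention"))
	fmt.Println(bumpFor("refactor(api): rename option\n\nBREAKING CHANGE: option renamed"))
}
```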
| 166 | +## Issue Reporting |
| 167 | + |
| 168 | +### Bug Reports |
| 169 | + |
| 170 | +Include: clear description, steps to reproduce, expected vs actual behavior, environment (Go version, OS, architecture, GPU), and model details if applicable. |
| 171 | + |
| 172 | +### Feature Requests |
| 173 | + |
| 174 | +Include: problem statement, proposed solution, alternatives considered, and use case. |
| 175 | + |
| 176 | +## Good First Issues |
| 177 | + |
| 178 | +Looking for a place to start? Here are some starter issues across the ecosystem, grouped by difficulty. |
| 179 | + |
| 180 | +### Beginner |
| 181 | + |
| 182 | +| # | Issue | Repo | Effort | |
| 183 | +|---|-------|------|--------| |
| 184 | +| 1 | Fix `Exp10` returning a constant instead of computing 10^f | float16 | 30 min | |
| 185 | +| 2 | Remove doc comment erroneously pasted into `Config.EnableFastMath` field | float16 | 15 min | |
| 186 | +| 3 | Add `String()` method to `FloatClass` enum type | float16 | 30 min | |
| 187 | +| 4 | Add missing doc comments to GGUF writer `AddMetadata*` methods | zonnx | 20 min | |
| 188 | +| 5 | Add `String()` methods to `ConversionMode` and `ArithmeticMode` enums | float8 | 30 min | |
| 189 | +| 6 | Add table-driven tests for `BFloat16` comparison functions | float16 | 45 min | |
| 190 | + |
| 191 | +### Intermediate |
| 192 | + |
| 193 | +| # | Issue | Repo | Effort | |
| 194 | +|---|-------|------|--------| |
| 195 | +| 7 | Fix `Mod(f, Inf)` returning NaN instead of `f` | float16 | 30 min | |
| 196 | +| 8 | Add NaN checks to `addAlgorithmic` and `subAlgorithmic` in float8 | float8 | 30 min | |
| 197 | +| 9 | Add `SetNormalizer` public method to `BPETokenizer` | ztoken | 30 min | |
| 198 | +| 10 | Convert `downloadFile` to use `defer` for resource cleanup | zonnx | 45 min | |
| 199 | +| 11 | Add unit tests for `Div`, `Sqrt`, and `Neg` layers | zerfoo | 1 hr | |
| 200 | +| 12 | Add unit tests for `Softmax` activation layer | zerfoo | 45 min | |
| 201 | +| 13 | Optimize `RecordRequest` to avoid per-token counter increment loop | zerfoo | 45 min | |
| 202 | + |
| 203 | +### Advanced |
| 204 | + |
| 205 | +| # | Issue | Repo | Effort | |
| 206 | +|---|-------|------|--------| |
| 207 | +| 14 | Add test coverage for the `Gelu` activation's `Backward` pass | zerfoo | 1.5 hr | |
| 208 | +| 15 | Add JSON Schema `$ref` resolution to grammar-constrained decoding converter | zerfoo | 2 hr | |
| 209 | +| 16 | Add a fine-tuning example application | zerfoo | 2 hr | |
| 210 | +| 17 | Implement `Backward` for `Div` and `Sqrt` core layers | zerfoo | 2 hr | |
| 211 | +| 18 | Add `String()` method to `device.Type` enum | ztensor | 20 min | |
| 212 | +| 19 | Add `R2Score` metric to the metrics package | ztensor | 45 min | |
| 213 | +| 20 | Add table-driven tests for tensor shape validation | ztensor | 45 min | |
| 214 | + |
| 215 | +Browse issues labeled [`good first issue`](https://github.com/zerfoo/zerfoo/labels/good%20first%20issue) on GitHub for the full list with detailed acceptance criteria. |
| 216 | + |
| 217 | +**How to claim an issue:** |
| 218 | + |
| 219 | +1. Comment on the issue to let maintainers know you're working on it |
| 220 | +2. Fork the repo and create a feature branch |
| 221 | +3. Submit a PR referencing the issue |
| 222 | + |
| 223 | +## Key Conventions |
| 224 | + |
| 225 | +### Engine[T] is law |
| 226 | + |
| 227 | +All tensor arithmetic must flow through `compute.Engine[T]`. Never operate on raw slices outside the engine -- this enables transparent CPU/GPU switching and CUDA graph capture. |
| 228 | + |
| 229 | +### No CGo by default |
| 230 | + |
| 231 | +GPU bindings use purego/dlopen. A plain `go build ./...` must compile on any platform without a C compiler. |
| 232 | + |
| 233 | +### GGUF is the sole model format |
| 234 | + |
| 235 | +Do not add support for other formats (ONNX, SafeTensors, etc.) in this repo. Use [`zonnx`](https://github.com/zerfoo/zonnx) to convert ONNX models to GGUF. |
| 236 | + |
| 237 | +### Fuse, don't fragment |
| 238 | + |
| 239 | +Prefer fused operations (`FusedAddRMSNorm`, `FusedSiluGate`, etc.) over sequences of primitive ops. Every eliminated kernel launch matters for tok/s. |
| 240 | + |
| 241 | +## Getting Help |
| 242 | + |
| 243 | +- **GitHub Discussions** -- ask questions and share ideas on each repo's Discussions tab |
| 244 | +- **GitHub Issues** -- report bugs or request features |