feat: add --sample <n> flag for quick data preview with schema #131

Merged
vmvarela merged 4 commits into master from issue-89/add-sample-flag
May 7, 2026
@vmvarela vmvarela commented May 7, 2026

Summary

  • Adds a --sample [<n>] flag that prints a #-prefixed schema block to stderr and the first n CSV rows (default 10) to stdout. This combines --columns --verbose and SELECT * FROM t LIMIT n in a single invocation
  • Schema block lists each column name and its inferred SQLite type (INTEGER, REAL, TEXT), aligned for readability
  • Implies --header; compatible with --delimiter / --tsv; mutually exclusive with --json, --columns, --validate, and a query argument
  • Type inference buffers max(100, n) rows before emitting any output; the command then exits after printing n rows, without reading the rest of the input
  • Documented in --help, README.md, and docs/sql-pipe.1.scd
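
The buffering and inference behaviour described in the bullets above can be sketched roughly as follows. This is an illustrative Python sketch, not the project's actual implementation: the function names (`infer_type`, `sample`), the treatment of empty cells, and the exact numeric parsing rules are assumptions made for the example.

```python
import csv
import io
from itertools import islice


def infer_type(values):
    """Guess INTEGER, REAL, or TEXT for one column (assumed rules)."""
    non_empty = [v for v in values if v != ""]  # assumption: empty cells are ignored
    if not non_empty:
        return "TEXT"
    # Try the narrowest type first, then widen.
    for sqlite_type, cast in (("INTEGER", int), ("REAL", float)):
        try:
            for v in non_empty:
                cast(v)
            return sqlite_type
        except ValueError:
            continue
    return "TEXT"


def sample(stream, n=10, infer=True):
    """Return (header, types, first n rows) while reading at most max(100, n) data rows."""
    reader = csv.reader(stream)
    header = next(reader)
    # islice stops pulling rows once the buffer is full, so the rest of
    # the input is never read.
    buffered = list(islice(reader, max(100, n)))
    if infer:
        types = [
            infer_type([row[i] for row in buffered if i < len(row)])
            for i in range(len(header))
        ]
    else:
        # Mirrors the described --no-type-inference behaviour: everything is TEXT.
        types = ["TEXT"] * len(header)
    return header, types, buffered[:n]


if __name__ == "__main__":
    data = "id,region,amount\n1,North,1250.00\n2,South,875.50\n3,East,2100.75\n"
    print(sample(io.StringIO(data), n=2))
```

Buffering more rows than are printed (at least 100) lets the inferred types reflect a larger slice of the data than the preview itself, while still bounding how much input is consumed.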

Example

$ cat sales.csv | sql-pipe --sample 3
# Schema (3 columns):
#   id      INTEGER
#   region  TEXT
#   amount  REAL
id,region,amount
1,North,1250.00
2,South,875.50
3,East,2100.75
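
The aligned schema block in this example could be produced along these lines. This is a minimal Python sketch under assumptions: `format_schema` is a hypothetical helper, and the spacing (three spaces after `#`, two spaces between the padded name and the type) is inferred from the sample output above rather than taken from the source code.

```python
import sys


def format_schema(names, types):
    """Build the '#'-prefixed schema block with type names aligned in one column."""
    # Pad every name to the width of the longest one.
    width = max((len(name) for name in names), default=0)
    lines = [f"# Schema ({len(names)} columns):"]
    for name, sqlite_type in zip(names, types):
        lines.append(f"#   {name.ljust(width)}  {sqlite_type}")
    return "\n".join(lines)


if __name__ == "__main__":
    # The schema goes to stderr so that stdout carries only CSV rows.
    print(format_schema(["id", "region", "amount"],
                        ["INTEGER", "TEXT", "REAL"]), file=sys.stderr)
```

Keeping the schema on stderr means the CSV on stdout can still be piped into another tool unchanged.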

Closes #89

@vmvarela vmvarela added this to the Sprint 7 milestone May 7, 2026
@vmvarela vmvarela added the type:feature (New functionality), priority:medium (Should be done soon), and size:s (Small, 1 to 4 hours) labels May 7, 2026
@github-actions github-actions Bot removed the type:feature New functionality label May 7, 2026
@github-actions github-actions Bot added the type:feature New functionality label May 7, 2026
- Add SampleWithOutput error: --sample --output now exits 1 with error
  instead of silently discarding the output file path
- Fix stdout write errors in runSample to call fatal instead of
  std.log.err, preventing silent truncation with exit code 0
- Add type_inference field to SampleArgs so --no-type-inference is
  respected in --sample mode (shows all columns as TEXT)
- Use @max for max_col_width computation (style)
- Clarify help text ambiguity around mutual exclusions
- Add 10 integration tests (85-94) covering all --sample acceptance
  criteria: row count, schema on stderr, type inference, TSV input,
  error cases and edge cases
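
The mutual-exclusion checks described in this commit, including the new --sample --output error, might look something like the sketch below. It is written in Python for illustration only; the option keys, the `sample_conflict` helper, and the error wording are all hypothetical and do not reflect the project's actual code or messages.

```python
import sys

# Options the PR describes as incompatible with --sample (keys are hypothetical).
SAMPLE_CONFLICTS = {
    "json": "--json",
    "columns": "--columns",
    "validate": "--validate",
    "output": "--output",
    "query": "a query argument",
}


def sample_conflict(opts):
    """Return a message naming the first option that conflicts with --sample, or None."""
    if opts.get("sample") is None:
        return None
    for key, label in SAMPLE_CONFLICTS.items():
        if opts.get(key):
            return f"--sample cannot be combined with {label}"
    return None


def main(opts):
    """Return the process exit code: 1 on a conflict, 0 otherwise."""
    message = sample_conflict(opts)
    if message is not None:
        # Fail loudly rather than silently ignoring the conflicting option.
        print(f"error: {message}", file=sys.stderr)
        return 1
    return 0
```

Rejecting the combination up front avoids the earlier behaviour the commit mentions, where the output path was silently discarded.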
@vmvarela vmvarela merged commit d8451c2 into master May 7, 2026
@vmvarela vmvarela deleted the issue-89/add-sample-flag branch May 7, 2026 11:45