Merged
23 changes: 19 additions & 4 deletions README.md
@@ -194,6 +194,19 @@ $ printf 'name,age\nAlice,30\nBob,25' | sql-pipe --json 'SELECT * FROM t'

`--json` is mutually exclusive with `-H`/`--header`. It can be combined with `-d`/`--delimiter` and `--tsv` to read non-comma-separated input.

For XML input and output, use `-I xml` / `-O xml`. By default the root element is `<results>` and each row is `<row>`. Override with `--xml-root` and `--xml-row`:

```sh
$ printf 'name,age\nAlice,30\nBob,25' | sql-pipe -O xml 'SELECT * FROM t'
<?xml version="1.0" encoding="UTF-8"?>
<results>
<row><name>Alice</name><age>30</age></row>
<row><name>Bob</name><age>25</age></row>
</results>

$ cat data.xml | sql-pipe -I xml 'SELECT name FROM t WHERE age > 25'
```

Chain queries by piping back in — useful for two-pass aggregations. Pass `-H` to the first call so the second one sees column names:

```sh
@@ -208,15 +221,17 @@ $ cat events.csv \
|------|-------------|
| `-d`, `--delimiter <char>` | Input field delimiter (single character, default `,`) |
| `--tsv` | Alias for `--delimiter '\t'` |
| `-I`, `--input-format <fmt>` | Input format: `csv` (default), `tsv`, `json`, `ndjson` |
| `-O`, `--output-format <fmt>` | Output format: `csv` (default), `tsv`, `json`, `ndjson` |
| `-I`, `--input-format <fmt>` | Input format: `csv` (default), `tsv`, `json`, `ndjson`, `xml` |
| `-O`, `--output-format <fmt>` | Output format: `csv` (default), `tsv`, `json`, `ndjson`, `xml` |
| `--no-type-inference` | Treat all columns as TEXT (skip auto-detection) |
| `-H`, `--header` | Print column names as the first output row |
| `--json` | Alias for `--output-format json` (mutually exclusive with `-H`) |
| `--max-rows <n>` | Stop if more than `n` data rows are read (exit 1) |
| `--validate` | Parse the entire input and print a summary (`OK: <n> rows, <m> columns (col TYPE, ...)`) to stdout. Exit 0 on success, exit 2 on parse error. No query required. Compatible with `--delimiter`, `--tsv`, `--no-type-inference`, `-I`/`--input-format` (csv, tsv, json, ndjson). JSON/NDJSON columns are reported as TEXT. |
| `--columns` | Read the CSV header row, print each column name on its own line, and exit 0. With `-v`/`--verbose`, also shows the inferred type per column (`name INTEGER`). Respects `--delimiter` and `--tsv`. Mutually exclusive with a query argument. |
| `--validate` | Parse the entire input and print a summary (`OK: <n> rows, <m> columns (col TYPE, ...)`) to stdout. Exit 0 on success, exit 2 on parse error. No query required. Compatible with `--delimiter`, `--tsv`, `--no-type-inference`, `-I`/`--input-format` (csv, tsv, json, ndjson, xml). JSON/NDJSON/XML columns are reported as TEXT. |
| `--columns` | Read the input header, print each column name on its own line, and exit 0. Supports CSV, TSV, JSON, NDJSON, and XML input. With `-v`/`--verbose`, also shows the inferred type per column (`name INTEGER`). Respects `--delimiter` and `--tsv`. Mutually exclusive with a query argument. |
| `--sample [<n>]` | Print a schema comment block to stderr and the first `<n>` data rows to stdout as CSV (default: `n=10`). The schema block lists each column name and its inferred type, prefixed with `#`. Implies `--header`. Compatible with `--delimiter` and `--tsv`. Mutually exclusive with `--json` and a query argument. No query required. |
| `--xml-root <name>` | Root element name for XML I/O (default: `results`) |
| `--xml-row <name>` | Row element name for XML I/O (default: `row`) |
| `--output <file>` | Write results to the given file instead of stdout. Creates or overwrites the file. Exits 1 if the file cannot be created. |
| `-v`, `--verbose` | Print `Loaded <n> rows in <t>s` to stderr after loading (always on TTY; forced with flag) |
| `-s`, `--silent` | Suppress `Loaded <n> rows in <t>s` and the progress counter from stderr unconditionally. Cannot be combined with `-v`/`--verbose` |
178 changes: 176 additions & 2 deletions build.zig
@@ -593,7 +593,7 @@ pub fn build(b: *std.Build) void {
// Integration test 57: unknown input format → error exit 1
const test_bad_input_format = b.addSystemCommand(&.{
"bash", "-c",
\\msg=$(printf '' | ./zig-out/bin/sql-pipe --input-format xml 'SELECT 1' 2>&1 >/dev/null; echo "EXIT:$?")
\\msg=$(printf '' | ./zig-out/bin/sql-pipe --input-format parquet 'SELECT 1' 2>&1 >/dev/null; echo "EXIT:$?")
\\echo "$msg" | grep -q 'unknown input format' && echo "$msg" | grep -q 'EXIT:1'
});
test_bad_input_format.step.dependOn(b.getInstallStep());
@@ -602,7 +602,7 @@ pub fn build(b: *std.Build) void {
// Integration test 58: unknown output format → error exit 1
const test_bad_output_format = b.addSystemCommand(&.{
"bash", "-c",
\\msg=$(printf 'a\n1\n' | ./zig-out/bin/sql-pipe --output-format xml 'SELECT * FROM t' 2>&1 >/dev/null; echo "EXIT:$?")
\\msg=$(printf 'a\n1\n' | ./zig-out/bin/sql-pipe --output-format parquet 'SELECT * FROM t' 2>&1 >/dev/null; echo "EXIT:$?")
\\echo "$msg" | grep -q 'unknown output format' && echo "$msg" | grep -q 'EXIT:1'
});
test_bad_output_format.step.dependOn(b.getInstallStep());
@@ -1011,6 +1011,157 @@ pub fn build(b: *std.Build) void {
test_delimiter_too_long_error.step.dependOn(b.getInstallStep());
test_step.dependOn(&test_delimiter_too_long_error.step);

// ─── XML input/output integration tests ─────────────────────────────────

// Integration test 99: XML output format emits correct structure
const test_xml_output = b.addSystemCommand(&.{
"bash", "-c",
\\result=$(printf 'name,age\nAlice,30\nBob,25\n' \
\\ | ./zig-out/bin/sql-pipe --output-format xml 'SELECT * FROM t ORDER BY name')
\\expected=$(printf '<?xml version="1.0" encoding="UTF-8"?>\n<results>\n<row><name>Alice</name><age>30</age></row>\n<row><name>Bob</name><age>25</age></row>\n</results>')
\\[ "$result" = "$expected" ]
});
test_xml_output.step.dependOn(b.getInstallStep());
test_step.dependOn(&test_xml_output.step);

// Integration test 100: XML input can be queried
const test_xml_input = b.addSystemCommand(&.{
"bash", "-c",
\\result=$(printf '<?xml version="1.0"?>\n<results>\n<row><name>Alice</name><age>30</age></row>\n<row><name>Bob</name><age>25</age></row>\n</results>\n' \
\\ | ./zig-out/bin/sql-pipe --input-format xml 'SELECT name FROM t ORDER BY name')
\\expected=$(printf 'Alice\nBob')
\\[ "$result" = "$expected" ]
});
test_xml_input.step.dependOn(b.getInstallStep());
test_step.dependOn(&test_xml_input.step);

// Integration test 101: XML roundtrip (xml in → xml out)
const test_xml_roundtrip = b.addSystemCommand(&.{
"bash", "-c",
\\result=$(printf '<?xml version="1.0"?>\n<results>\n<row><name>Alice</name><age>30</age></row>\n</results>\n' \
\\ | ./zig-out/bin/sql-pipe -I xml -O xml 'SELECT * FROM t')
\\echo "$result" | grep -q '<name>Alice</name>' && echo "$result" | grep -q '<age>30</age>'
});
test_xml_roundtrip.step.dependOn(b.getInstallStep());
test_step.dependOn(&test_xml_roundtrip.step);

// Integration test 102: --columns with XML input lists column names
const test_xml_columns = b.addSystemCommand(&.{
"bash", "-c",
\\result=$(printf '<?xml version="1.0"?>\n<results>\n<row><name>Alice</name><age>30</age></row>\n</results>\n' \
\\ | ./zig-out/bin/sql-pipe -I xml --columns)
\\expected=$(printf 'name\nage')
\\[ "$result" = "$expected" ]
});
test_xml_columns.step.dependOn(b.getInstallStep());
test_step.dependOn(&test_xml_columns.step);

// Integration test 103: --validate with XML input prints summary
const test_xml_validate = b.addSystemCommand(&.{
"bash", "-c",
\\result=$(printf '<?xml version="1.0"?>\n<results>\n<row><name>Alice</name><age>30</age></row>\n<row><name>Bob</name><age>25</age></row>\n</results>\n' \
\\ | ./zig-out/bin/sql-pipe -I xml --validate)
\\echo "$result" | grep -q 'OK: 2 rows'
});
test_xml_validate.step.dependOn(b.getInstallStep());
test_step.dependOn(&test_xml_validate.step);

// Integration test 104: --xml-root and --xml-row customize element names
const test_xml_custom_elements = b.addSystemCommand(&.{
"bash", "-c",
\\result=$(printf 'name,age\nAlice,30\n' \
\\ | ./zig-out/bin/sql-pipe -O xml --xml-root data --xml-row record 'SELECT * FROM t')
\\echo "$result" | grep -q '<data>' && echo "$result" | grep -q '<record>' && echo "$result" | grep -q '</data>'
});
test_xml_custom_elements.step.dependOn(b.getInstallStep());
test_step.dependOn(&test_xml_custom_elements.step);

// Integration test 105: XML entities in input → correct round trip
const test_xml_entities_input = b.addSystemCommand(&.{
"bash", "-c",
\\result=$(printf '<?xml version="1.0"?>\n<results>\n<row><name>Alice &amp; Bob</name></row>\n</results>\n' \
\\ | ./zig-out/bin/sql-pipe -I xml 'SELECT name FROM t')
\\[ "$result" = "Alice & Bob" ]
});
test_xml_entities_input.step.dependOn(b.getInstallStep());
test_step.dependOn(&test_xml_entities_input.step);

// Integration test 106: NULL in XML output → empty element, not "NULL"
const test_xml_null_output = b.addSystemCommand(&.{
"bash", "-c",
\\result=$(printf 'name\nAlice\n' \
\\ | ./zig-out/bin/sql-pipe -O xml 'SELECT name, NULL as age FROM t')
\\echo "$result" | grep -q '<age></age>'
});
test_xml_null_output.step.dependOn(b.getInstallStep());
test_step.dependOn(&test_xml_null_output.step);

// Integration test 107: Empty XML document → error with "empty input"
const test_xml_empty_input = b.addSystemCommand(&.{
"bash", "-c",
\\msg=$(printf '' | ./zig-out/bin/sql-pipe -I xml 'SELECT 1' 2>&1; echo "EXIT:$?")
\\echo "$msg" | grep -q 'empty input' && ! echo "$msg" | grep -q 'EXIT:0'
});
test_xml_empty_input.step.dependOn(b.getInstallStep());
test_step.dependOn(&test_xml_empty_input.step);

// Integration test 108: Root without rows → error with "no row elements"
const test_xml_no_rows = b.addSystemCommand(&.{
"bash", "-c",
\\msg=$(printf '<root></root>' | ./zig-out/bin/sql-pipe -I xml 'SELECT 1' 2>&1; echo "EXIT:$?")
\\echo "$msg" | grep -q 'no row elements' && ! echo "$msg" | grep -q 'EXIT:0'
});
test_xml_no_rows.step.dependOn(b.getInstallStep());
test_step.dependOn(&test_xml_no_rows.step);

// Integration test 109: --sample rejected with XML → non-zero exit with a clear message
const test_xml_sample_rejected = b.addSystemCommand(&.{
"bash", "-c",
\\msg=$(printf '<r><row><a>1</a></row></r>' | ./zig-out/bin/sql-pipe -I xml --sample 2>&1; echo "EXIT:$?")
\\echo "$msg" | grep -q 'sample' && ! echo "$msg" | grep -q 'EXIT:0'
});
test_xml_sample_rejected.step.dependOn(b.getInstallStep());
test_step.dependOn(&test_xml_sample_rejected.step);

// Integration test 110: Self-closing column → NULL in SQLite (SELECT returns empty)
const test_xml_self_closing_null = b.addSystemCommand(&.{
"bash", "-c",
\\result=$(printf '<?xml version="1.0"?>\n<results>\n<row><name/><age>30</age></row>\n</results>\n' \
\\ | ./zig-out/bin/sql-pipe -I xml 'SELECT COALESCE(name, "NULL_VALUE") FROM t')
\\[ "$result" = "NULL_VALUE" ]
});
test_xml_self_closing_null.step.dependOn(b.getInstallStep());
test_step.dependOn(&test_xml_self_closing_null.step);

// Integration test 111: Columns in a different order between rows → correct bind-by-name
const test_xml_column_order = b.addSystemCommand(&.{
"bash", "-c",
\\result=$(printf '<?xml version="1.0"?>\n<results>\n<row><name>Alice</name><age>30</age></row>\n<row><age>25</age><name>Bob</name></row>\n</results>\n' \
\\ | ./zig-out/bin/sql-pipe -I xml 'SELECT name || ":" || age FROM t ORDER BY name')
\\[ "$result" = "$(printf 'Alice:30\nBob:25')" ]
});
test_xml_column_order.step.dependOn(b.getInstallStep());
test_step.dependOn(&test_xml_column_order.step);

// Integration test 112: Attributes on elements → ignored, content preserved
const test_xml_attrs_ignored = b.addSystemCommand(&.{
"bash", "-c",
\\result=$(printf '<?xml version="1.0"?>\n<results>\n<row id="1"><name class="primary">Alice</name></row>\n</results>\n' \
\\ | ./zig-out/bin/sql-pipe -I xml 'SELECT name FROM t')
\\[ "$result" = "Alice" ]
});
test_xml_attrs_ignored.step.dependOn(b.getInstallStep());
test_step.dependOn(&test_xml_attrs_ignored.step);

// Integration test 113: Float-as-integer → emitted as an integer in XML
const test_xml_float_as_int = b.addSystemCommand(&.{
"bash", "-c",
\\result=$(printf 'x\n1\n' | ./zig-out/bin/sql-pipe -O xml 'SELECT CAST(30.0 AS REAL) as val')
\\echo "$result" | grep -q '<val>30</val>'
});
test_xml_float_as_int.step.dependOn(b.getInstallStep());
test_step.dependOn(&test_xml_float_as_int.step);

// Unit tests for the RFC 4180 CSV parser (src/csv.zig)
const unit_tests = b.addTest(.{
.root_module = b.createModule(.{
@@ -1022,4 +1173,27 @@ pub fn build(b: *std.Build) void {
const run_unit_tests = b.addRunArtifact(unit_tests);
const unit_test_step = b.step("unit-test", "Run CSV unit tests");
unit_test_step.dependOn(&run_unit_tests.step);

// Unit tests for the XML parser (src/xml.zig)
const xml_unit_tests = b.addTest(.{
.root_module = b.createModule(.{
.root_source_file = b.path("src/xml.zig"),
.target = target,
.optimize = optimize,
.link_libc = true,
}),
});
xml_unit_tests.root_module.addImport("c", translate_c.createModule());
if (bundle_sqlite) {
xml_unit_tests.root_module.addIncludePath(b.path("lib"));
xml_unit_tests.root_module.addCSourceFile(.{
.file = b.path("lib/sqlite3.c"),
.flags = &.{"-DSQLITE_OMIT_LOAD_EXTENSION=1"},
});
} else {
xml_unit_tests.root_module.linkSystemLibrary("sqlite3", .{});
}
const run_xml_unit_tests = b.addRunArtifact(xml_unit_tests);
test_step.dependOn(&run_xml_unit_tests.step);
unit_test_step.dependOn(&run_xml_unit_tests.step);
}
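
Review note on the error-path tests in build.zig: they capture the binary's stderr together with an `EXIT:<status>` line via `msg=$(... 2>&1; echo "EXIT:$?")` and then grep the result. The sketch below shows how that capture behaves and how a line-based `grep -qv` check differs from a negated `grep -q` when asserting a non-zero exit. `failing_cli` and `warning_cli` are hypothetical stand-ins for the binary, and the message text is only assumed to end in a newline; neither is taken from sql-pipe itself.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for the real binary (not part of sql-pipe).
failing_cli() { echo "error: empty input" >&2; return 1; }
warning_cli() { echo "warning: empty input" >&2; return 0; }

# Same capture shape as the tests: stderr is kept, stdout is discarded,
# and the exit status is appended on its own line.
fail_msg=$(failing_cli 2>&1 >/dev/null; echo "EXIT:$?")
warn_msg=$(warning_cli 2>&1 >/dev/null; echo "EXIT:$?")

# Loose: `grep -v` succeeds if ANY line lacks the pattern, so the message
# line alone satisfies it, even when the status line says EXIT:0.
loose_check() { echo "$1" | grep -qv 'EXIT:0'; }

# Strict: succeeds only if NO line contains EXIT:0.
strict_check() { ! echo "$1" | grep -q 'EXIT:0'; }

# report <label> <check-fn> <captured-message>
report() {
  if "$2" "$3"; then echo "$1: pass"; else echo "$1: fail"; fi
}

report "loose,  exit 1" loose_check  "$fail_msg"
report "loose,  exit 0" loose_check  "$warn_msg"
report "strict, exit 1" strict_check "$fail_msg"
report "strict, exit 0" strict_check "$warn_msg"
```

Only the strict form reports a failure for the exit-0 case. The loose form passes in both cases because the message line never contains `EXIT:0`.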
40 changes: 33 additions & 7 deletions docs/sql-pipe.1.scd
@@ -72,22 +72,33 @@ OPTIONS
stderr is a TTY. Useful for producing clean stderr in interactive
terminals. Cannot be combined with *-v* / *--verbose*.

*--xml-root* <name>
Root element name used when reading or writing XML (default: *results*).
The output document is wrapped in *<name>...</name>*. Also used as the
expected root tag when parsing XML input.

*--xml-row* <name>
Row element name used when reading or writing XML (default: *row*).
Each result row is emitted as *<name><col>value</col>...</name>*.

*--validate*
Parse the entire input without executing a SQL query. On success,
prints a one-line summary to standard output:
*OK: <n> rows, <m> columns (<col> <TYPE>, ...)* and exits 0.
On parse error, prints the error message and exits 2. Compatible
with *--delimiter*, *--tsv*, *--no-type-inference*, and
*-I* / *--input-format* (csv, tsv, json, ndjson). JSON and NDJSON
columns are reported as TEXT. Mutually exclusive with a query
*-I* / *--input-format* (csv, tsv, json, ndjson, xml). JSON, NDJSON,
and XML columns are reported as TEXT. Mutually exclusive with a query
argument.

*--columns*
Read the CSV header row, print each column name on its own line to
standard output, and exit with code 0. When combined with *-v* /
*--verbose*, also shows the inferred type (INTEGER, REAL, or TEXT)
for each column, using the first 100 data rows for inference. Respects
*--delimiter* and *--tsv*. Mutually exclusive with a query argument.
Read the input header, print each column name on its own line to
standard output, and exit with code 0. Supported for CSV, TSV,
JSON, NDJSON, and XML input. When combined with *-v* / *--verbose*,
also shows the inferred type (INTEGER, REAL, or TEXT) for each column
(CSV/TSV only; other formats always show TEXT), using the first 100
data rows for inference. Respects *--delimiter* and *--tsv*.
Mutually exclusive with a query argument.

*--sample* [<n>]
Print a schema comment block to standard error and the first <n> data
@@ -157,6 +168,21 @@ EXAMPLES
Output:++
[{"name":"Alice","age":30},{"name":"Bob","age":25}]

Convert CSV to XML:

$ printf 'name,age\nAlice,30\nBob,25' | sql-pipe -O xml 'SELECT \* FROM t'

Output:++
<?xml version="1.0" encoding="UTF-8"?>++
<results>++
<row><name>Alice</name><age>30</age></row>++
<row><name>Bob</name><age>25</age></row>++
</results>

Query XML input:

$ cat data.xml | sql-pipe -I xml 'SELECT name FROM t WHERE age > 25'

Preview schema and first 3 rows of a CSV file:

$ cat sales.csv | sql-pipe --sample 3