fix(csv-parse): align trim with ECMAScript whitespace (fix #482) by vishwakt · Pull Request #483 · adaltas/node-csv

vishwakt · 2026-05-11T02:03:55Z

Closes #482.

Aligns trim / ltrim / rtrim with String.prototype.trim(). Previously only \r, \n, \f, \t and space were treated as trimmable, so anything else JS considers whitespace (the U+3000 ideographic space from the issue, plus NBSP, vertical tab, Ogham space mark, U+2000-U+200A, line/paragraph separators, ZWNBSP) passed through untrimmed.
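For reference, a small sketch of the set in question. The code points below are the ECMAScript WhiteSpace and LineTerminator productions as generally documented; the constant names are made up for this example and are not csv-parse identifiers. The check only confirms that each one is removed by String.prototype.trim().

```javascript
// Code points that String.prototype.trim() strips (WhiteSpace + LineTerminator).
// Names are illustrative only, not taken from csv-parse.
const ES_TRIM_CODE_POINTS = [
  0x0009, // character tabulation
  0x000a, // line feed
  0x000b, // line tabulation (vertical tab)
  0x000c, // form feed
  0x000d, // carriage return
  0x0020, // space
  0x00a0, // no-break space
  0x1680, // Ogham space mark
  0x2000, 0x2001, 0x2002, 0x2003, 0x2004, 0x2005,
  0x2006, 0x2007, 0x2008, 0x2009, 0x200a, // en quad .. hair space
  0x2028, // line separator
  0x2029, // paragraph separator
  0x202f, // narrow no-break space
  0x205f, // medium mathematical space
  0x3000, // ideographic space
  0xfeff, // zero width no-break space (BOM)
];

// The five characters the description says were handled before: \r \n \f \t and space.
const LEGACY_TRIM_CODE_POINTS = [0x000d, 0x000a, 0x000c, 0x0009, 0x0020];

// Every listed code point should trim to the empty string.
const allTrimmed = ES_TRIM_CODE_POINTS.every(
  (cp) => String.fromCodePoint(cp).trim() === "",
);

// Code points that only become trimmable with the wider set.
const newlyCovered = ES_TRIM_CODE_POINTS.filter(
  (cp) => !LEGACY_TRIM_CODE_POINTS.includes(cp),
);

console.log(ES_TRIM_CODE_POINTS.length, allTrimmed, newlyCovered.length);
```

The list has 25 entries, which lines up with the "5 to 25 trim chars" figure in the change notes.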

Changes

__isCharTrimable now matches the full ES2015+ whitespace + line terminator set.
Added a first-byte Uint8Array lookup so non-whitespace bytes bail out in O(1). Without this, going from 5 to 25 trim chars slowed the trim path by ~10% on clean ASCII data.
Code points that can't be represented in the parser's encoding (e.g. U+3000 under latin1, which Node encodes as ?) are filtered at init time so literal ? bytes aren't trimmed under non-Unicode encodings.
Bumped needMoreDataSize to cover the longest trim char (up to 3 bytes in UTF-8) so multi-byte whitespace split across stream writes still gets caught.
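
A minimal sketch of how a first-byte table and the init-time encoding filter could fit together. buildTrimTable and trimSequenceLengthAt are hypothetical names, and the round-trip comparison is an assumed way to detect code points the encoding cannot represent; the implementation in this PR may differ.

```javascript
// Hypothetical helper: encode each trim character once, drop the ones that
// do not survive a round trip in the target encoding, and record which
// first bytes can start a trim sequence.
function buildTrimTable(codePoints, encoding) {
  const sequences = [];
  const firstByte = new Uint8Array(256); // 1 = some trim sequence starts with this byte
  for (const cp of codePoints) {
    const ch = String.fromCodePoint(cp);
    const bytes = Buffer.from(ch, encoding);
    // Assumed filter: skip characters the encoding cannot represent.
    if (bytes.toString(encoding) !== ch) continue;
    sequences.push(bytes);
    firstByte[bytes[0]] = 1;
  }
  return { sequences, firstByte };
}

// Hypothetical helper: byte length of the trim sequence starting at pos, or 0.
function trimSequenceLengthAt(buf, pos, table) {
  // O(1) bail-out for the common case of a non-whitespace byte.
  if (table.firstByte[buf[pos]] !== 1) return 0;
  for (const seq of table.sequences) {
    const end = pos + seq.length;
    if (end <= buf.length && buf.compare(seq, 0, seq.length, pos, end) === 0) {
      return seq.length;
    }
  }
  return 0;
}

// Small sample set: space, tab, NBSP, ideographic space.
const SAMPLE_CODE_POINTS = [0x20, 0x09, 0xa0, 0x3000];
const utf8Table = buildTrimTable(SAMPLE_CODE_POINTS, "utf8");
const latin1Table = buildTrimTable(SAMPLE_CODE_POINTS, "latin1");

console.log(utf8Table.sequences.length, latin1Table.sequences.length);
console.log(trimSequenceLengthAt(Buffer.from("\u3000x", "utf8"), 0, utf8Table));
console.log(trimSequenceLengthAt(Buffer.from("?x", "latin1"), 0, latin1Table));
```

In this sketch U+3000 is kept under utf8 (as a 3-byte sequence) and dropped under latin1, so a literal ? in latin1 input is never treated as trimmable.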

Tests

New Unicode whitespace block under Option trim:

trim U+3000, U+000B, U+00A0 individually
mixed ES whitespace at field boundaries
? is not trimmed under latin1 (covers the encoding filter)
U+3000 split across parser.write() calls still gets trimmed

Full suite: 591 passing, 3 pending (same as master).
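
To illustrate the split-write case covered by the last test above: U+3000 is three bytes in UTF-8, so a chunk boundary can fall inside it. The sketch below uses a hypothetical isPartialTrimSequence helper to show the situation a sufficiently large needMoreDataSize has to cover; it is not the parser's actual logic.

```javascript
// U+3000 encodes to the three bytes e3 80 80 in UTF-8.
const IDEOGRAPHIC_SPACE = Buffer.from("\u3000", "utf8");

// Simulate two stream writes with the boundary inside the multi-byte character.
const chunk1 = Buffer.concat([Buffer.from("a,", "utf8"), IDEOGRAPHIC_SPACE.subarray(0, 1)]);
const chunk2 = Buffer.concat([IDEOGRAPHIC_SPACE.subarray(1), Buffer.from("b\n", "utf8")]);

// Hypothetical check: true when the bytes from pos to the end of buf are a
// proper prefix of some trim sequence, i.e. more data is needed to decide.
function isPartialTrimSequence(buf, pos, sequences) {
  const remaining = buf.length - pos;
  return sequences.some(
    (seq) =>
      remaining > 0 &&
      remaining < seq.length &&
      buf.compare(seq, 0, remaining, pos, buf.length) === 0,
  );
}

const trimSequences = [IDEOGRAPHIC_SPACE];

// At offset 2 of the first chunk only the lead byte e3 is available.
const needsMoreData = isPartialTrimSequence(chunk1, 2, trimSequences);

// Once both chunks are joined the full character is visible again.
const joined = Buffer.concat([chunk1, chunk2]).toString("utf8");

console.log(needsMoreData, JSON.stringify(joined));
```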

Perf

Quick local bench (200k rows, 10 cols), median of 5 runs:

scenario	master	this branch
no trim, clean	416k rows/s	425k rows/s
no trim, padded	311k rows/s	320k rows/s
trim, clean	352k rows/s	356k rows/s
trim, padded	214k rows/s	255k rows/s

trim, padded benefits the most because the first-byte table fires constantly when scanning past trim chars.
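
For a quick read of the table, the relative change per scenario can be computed from the reported figures. This snippet only does arithmetic on the numbers above (in thousands of rows per second); it does not re-run the benchmark.

```javascript
// Throughput figures copied from the table, in thousands of rows per second.
const benchResults = [
  { scenario: "no trim, clean", master: 416, branch: 425 },
  { scenario: "no trim, padded", master: 311, branch: 320 },
  { scenario: "trim, clean", master: 352, branch: 356 },
  { scenario: "trim, padded", master: 214, branch: 255 },
];

// Percentage change of the branch relative to master, rounded to one decimal.
const withDelta = benchResults.map((r) => ({
  ...r,
  deltaPct: Math.round(((r.branch - r.master) / r.master) * 1000) / 10,
}));

for (const r of withDelta) {
  console.log(`${r.scenario}: +${r.deltaPct}%`);
}
```

This works out to roughly +2.2%, +2.9%, +1.1% and +19.2% respectively, with trim, padded showing the largest gain.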

wdavidw · 2026-05-11T21:51:50Z

Thank you @vishwakt for your contribution, well done

fix(csv-parse): align trim with ECMAScript whitespace (fix adaltas#482)

e683b04

wdavidw force-pushed the fix/csv-parse-trim-ecmascript-whitespace branch from 15c4120 to e683b04 on May 11, 2026 21:27

docs(csv-parse): trim chars description

aa2fcc5

wdavidw merged commit d9f724c into adaltas:master May 11, 2026

wdavidw merged 2 commits into adaltas:master from vishwakt:fix/csv-parse-trim-ecmascript-whitespace
