Commit 00c8afa

perf(memoization,cacache): O(n) → O(1) LRU / regex compile in hot loops
Two independent hot-path wins:

**memoization.ts**: `memoize()` and `memoizeAsync()` previously maintained a parallel `accessOrder: string[]` alongside the primary `cache: Map`, and bumped recency on every cache hit via `indexOf()` (O(n)) + `splice()` (O(n)). For a fully-populated cache of size 1000, every hit walked up to 1000 entries and shifted the tail. LRU eviction also did `accessOrder.shift()` which is O(n). Replace with a Map-insertion-order idiom: `cache.delete(key)` + `cache.set(key, entry)` moves an entry to the tail in O(1), and `cache.keys().next().value` returns the oldest in O(1). The parallel array is gone; every `cache` Map operation stays O(1). Same pattern already used in `dlx/detect.ts` and `dlx/package.ts`. `memoizeAsync()` gains a small `bumpRecency()` helper to centralize the delete-then-set idiom across the hit and stale-dedup branches.

**cacache.ts**: `clear()`'s wildcard branch streamed every cache entry through `matchesPattern(key, pattern)`, which re-compiled the regex on every key. For a wildcard clear across N entries that's N redundant regex compiles. Refactor to `createPatternMatcher(pattern)` which compiles once and closes over the regex; call sites hoist the matcher out of the stream loop, dropping regex allocations from O(N) to O(1). Non-wildcard patterns stay on the fast prefix path.

All 41 memoization tests + 29 cacache tests pass unchanged; full suite 6327/6327 passing. No API surface changes.
1 parent 1a882a5 commit 00c8afa

3 files changed

Lines changed: 106 additions & 79 deletions

CHANGELOG.md

Lines changed: 52 additions & 0 deletions
@@ -5,6 +5,58 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [Unreleased]
+
+### Added — schema (new module, replaces `validation/validate-schema`)
+
+- `@socketsecurity/lib/schema/validate` — non-throwing validator accepting any Zod-shaped schema (duck-typed on `.safeParse`). Returns a tagged `{ ok: true, value } | { ok: false, errors }` with normalized `{ path, message }` issues. socket-lib additionally recognizes TypeBox for its own internal use (e.g. `src/ipc.ts` stub validation); external callers should pass Zod schemas. No runtime dependency on `zod` — consumers bring their own
+- `@socketsecurity/lib/schema/parse` — throwing twin for fail-fast trust boundaries (app startup, config files). Summarizes all issues into a single `Error` message
+- `@socketsecurity/lib/schema/types` — shared types: `Schema<T>`, `ParseResult<T>`, `ValidateResult<T>`, `ValidationIssue`, `AnySchema`, and `Infer<S>` (which unwraps Zod v3/v4 and TypeBox output shapes)
+
+### Added — native feature-detect helpers
+
+- `@socketsecurity/lib/promises` `withResolvers()` — exposes the TC39 [`Promise.withResolvers`](https://tc39.es/ecma262/#sec-promise.withResolvers) API as a first-class export. Bound to the native method when available (Node 20.12+ / 21+ / 22+; V8 ≥ 12.0); otherwise falls back to a spec-equivalent `new Promise(executor)` implementation. Returns `{ promise, resolve, reject }` with `Object.prototype`-prototyped own enumerable properties per §27.2.4.9. Retires the `let resolve; const p = new Promise(r => { resolve = r })` dance for deferred-resolution patterns. Public `PromiseWithResolvers<T>` interface exported alongside
+
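
The feature-detect-then-fall-back shape described in the `withResolvers()` entry can be sketched roughly as follows. This is an illustration only, not the library's implementation: `Deferred` and `withResolversSketch` are hypothetical names, and the sketch makes no claim about which runtimes ship the native method.

```typescript
// Hypothetical result shape (the changelog's public name is `PromiseWithResolvers<T>`).
interface Deferred<T> {
  promise: Promise<T>
  resolve: (value: T | PromiseLike<T>) => void
  reject: (reason?: unknown) => void
}

// Hypothetical helper: prefer the native method, else use the executor dance.
function withResolversSketch<T>(): Deferred<T> {
  const native = (
    Promise as unknown as { withResolvers?: <U>() => Deferred<U> }
  ).withResolvers
  if (typeof native === 'function') {
    // The native method needs `Promise` as its receiver.
    return native.call(Promise) as unknown as Deferred<T>
  }
  let resolve!: Deferred<T>['resolve']
  let reject!: Deferred<T>['reject']
  // The executor runs synchronously, so both callbacks are assigned
  // before the constructor returns.
  const promise = new Promise<T>((res, rej) => {
    resolve = res
    reject = rej
  })
  return { promise, resolve, reject }
}
```

Either branch returns a plain object whose three own properties can be destructured, which is what removes the need for an outer `let resolve` variable at call sites.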
+### Changed
+
+- `@socketsecurity/lib/regexps` `escapeRegExp()` — bound to native [`RegExp.escape`](https://tc39.es/ecma262/#sec-regexp.escape) when available (Node 24+ / V8 13.7); otherwise uses a spec-compliant fallback. Previous implementation escaped only the SyntaxCharacter set, leaving output unsafe against leading-identifier merging (e.g. a trailing `\0..\9` in a surrounding pattern could absorb a leading digit) and unsafe inside `/.../` literals (no `/` escape). New implementation encodes leading `[0-9A-Za-z]` as `\xHH`, backslash-prefixes `SyntaxCharacter + /`, emits `ControlEscape` letter forms, and `\xHH`-escapes the `otherPunctuators` / whitespace / line-terminator / lone-surrogate set per §22.2.5.1. Fallback output byte-equivalent to native across ASCII 0-127 plus non-ASCII NBSP / ZWNBSP / LS / PS / surrogates (zero diffs). **Caller-visible shape change**: escaped output now uses `\xHH` for many characters that previously passed through literally (e.g. `escapeRegExp('a')` is now `'\\x61'`, not `'a'`); callers that string-match on the escape output rather than compiling it into a `RegExp` may need updates. Functional equivalence (round-trip match against the original input) is preserved
+
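
To make the leading-digit hazard from the `escapeRegExp()` entry concrete, here is a deliberately partial sketch. It covers only two of the steps listed above (backslash-prefixing `SyntaxCharacter + /` and encoding a leading `[0-9A-Za-z]` as `\xHH`); the `ControlEscape` and punctuator handling is omitted. `escapeSketch` is a hypothetical name, not the library's fallback.

```typescript
// Hypothetical, partial escape: NOT a full RegExp.escape equivalent.
function escapeSketch(input: string): string {
  return (
    input
      // Backslash-prefix the syntax characters plus `/`.
      .replace(/[\^$\\.*+?()[\]{}|\/]/g, '\\$&')
      // Encode a leading ASCII letter or digit as \xHH so it cannot merge
      // with an escape that ends the surrounding pattern.
      .replace(/^[0-9A-Za-z]/, ch => '\\x' + ch.charCodeAt(0).toString(16))
  )
}

// Without leading-character encoding, `\1` followed by a literal `0`
// is read as the single escape `\10` instead of a backreference plus "0".
const mergedPattern = new RegExp('(a)\\1' + '0')
const safePattern = new RegExp('(a)\\1' + escapeSketch('0'))
```

Under these assumptions `safePattern` matches `'aa0'` while `mergedPattern` does not, and `escapeSketch('a')` yields the `'\\x61'` shape mentioned in the entry.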
+### Removed
+
+- `@socketsecurity/lib/validation/*` subpath retired entirely — its two exports are re-homed under modules that match their purpose. Migrate:
+  - `import { validateSchema } from '@socketsecurity/lib/validation/validate-schema'` → `import { validateSchema } from '@socketsecurity/lib/schema/validate'`
+  - `import { parseSchema } from '@socketsecurity/lib/validation/validate-schema'` → `import { parseSchema } from '@socketsecurity/lib/schema/parse'`
+  - `import { safeJsonParse } from '@socketsecurity/lib/validation/json-parser'` → `import { safeJsonParse } from '@socketsecurity/lib/json/parse'`
+  - Types (`Infer`, `ValidateResult`, `ValidationIssue`, `AnySchema`, `Schema`, `ParseResult`) → `@socketsecurity/lib/schema/types`
+  - Types (`SafeJsonParseOptions`) → `@socketsecurity/lib/json/types`
+- `memoizeDebounced` export from `@socketsecurity/lib/memoization` — the helper was misnamed (the "debounce" callback only invoked the already-memoized fn for cache-population side effects, never deferred the caller's return value) and had no internal consumers. Use `memoize` or `memoizeAsync` with a `ttl` instead
+
+### Fixed — caching
+
+- `src/dlx/detect.ts` — bound `packageJsonPathCache` with an LRU cap (200) and give negative entries a 10s TTL so a directory that later gains a `package.json` (e.g. `npm install` in a sibling workspace) is re-probed instead of permanently stuck on the cached "not found"
+- `src/dlx/package.ts` — bound `binaryPathCache` with an LRU cap (200) so a long-running process that resolves many distinct binary paths no longer accumulates entries forever after `cleanDlxCache` reclaims the files on disk
+- `src/cacache.ts` / `src/cache-with-ttl.ts` — wildcard deletion (`deleteAll('foo*bar')`) now anchors both ends of the pattern. The missing `$` anchor silently over-deleted keys like `foo123bar-extra`
+- `src/globs.ts` — cache key for array-valued options (e.g. `ignore`) is now order-insensitive, so the same logical set doesn't produce multiple cache entries depending on call-site ordering
+
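
One way to get an order-insensitive cache key of the kind the `src/globs.ts` entry describes is to sort array-valued options before serializing. The sketch below assumes options are a flat record of JSON-serializable values; `globCacheKey` is a hypothetical name and the real key format in `src/globs.ts` is not shown in this commit.

```typescript
// Hypothetical key builder: stable across option order and array order.
function globCacheKey(options: Record<string, unknown>): string {
  const entries = Object.keys(options)
    .sort()
    .map(name => {
      const value = options[name]
      // Copy before sorting so the caller's array is not mutated.
      return [name, Array.isArray(value) ? value.slice().sort() : value]
    })
  return JSON.stringify(entries)
}
```

With this shape, `{ ignore: ['b', 'a'] }` and `{ ignore: ['a', 'b'] }` serialize to the same key, so both call sites share one cache entry.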
+### Fixed — promises & concurrency
+
+- `src/promise-queue.ts` — when a bounded queue hits its limit, reject the **newest** submission (current call) rather than the oldest already-enqueued task. Prior behavior let a flood of new submissions cancel in-flight work the caller had already committed to awaiting
+- `src/suppress-warnings.ts` `withSuppressedWarnings()` — do not reassign `process.emitWarning` in `finally`. The snapshot-and-restore pattern at the top of the function either captured the native function after a wrapper was already installed (then restoring native on exit wiped every other active suppression) or captured the wrapper itself (making the restore a no-op). Suppression is driven by the `suppressedWarnings` set; membership changes are enough
+- `src/process-lock.ts` — drop the `existsSync(lockPath)` pre-check before `mkdirSync(lockPath)`. `mkdirSync` without `recursive` is already atomic and throws `EEXIST` when another process owns the directory; the pre-check only opened a TOCTOU window without adding safety. Stale-lock detection now compares full-precision milliseconds (`Date.now() - mtime.getTime() > staleMs`) instead of second-level truncation — sub-second `staleMs` values (e.g. 500) were previously being rounded up to a 1s minimum
+
+### Fixed — URL / version / spec parsing
+
+- `src/packages/specs.ts` `getRepoUrlDetails()` — tighten the GitHub URL matcher to anchor on `github.com` specifically (escaped `.`, full-label match) and accept npm's canonical `git+https://` / `git+ssh://` repository URL forms. Previously `/^.+github.com\//` matched lookalike hosts like `githubXcom` or `fake-github.com.attacker.tld`, and returned garbage (e.g. `user: 'git@github.com:npm'`) for scp-style URLs. scp-style `git@github.com:…` (no `://`) is now rejected and returns `{ user: '', project: '' }` — callers must normalize to https/ssh upstream
+- `src/url.ts` `urlSearchParamAsBoolean()` — accept the same truthy vocabulary as `envAsBoolean` (`1` / `true` / `yes` / `on`, case-insensitive). Empty-string input now falls through to `defaultValue` instead of silently returning `false`
+- `src/versions.ts` `maxVersion()` / `minVersion()` — pass `includePrerelease: true` to semver so an all-prerelease input like `['1.0.0-alpha', '1.0.0-beta']` resolves to the latest prerelease instead of returning `undefined` under semver's default behavior against `'*'`
+- `src/external/semver.d.ts` — add `RangeOptions` and type the `options` parameter on `maxSatisfying` / `minSatisfying`
+
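
The host-anchoring issue in the `getRepoUrlDetails()` entry can be illustrated with the old pattern quoted above next to a stricter one. The strict regex here is an assumption written for this sketch, not the matcher from `src/packages/specs.ts`; `RepoUrlDetails`, `strictGithubUrl`, and `repoUrlDetailsSketch` are hypothetical names.

```typescript
interface RepoUrlDetails {
  user: string
  project: string
}

// The loose pattern quoted in the changelog: the unescaped `.` matches any
// character and nothing anchors `github` to the start of the host.
const looseGithubUrl = /^.+github.com\//

// Hypothetical stricter pattern: optional `git+` prefix, an explicit
// https/ssh scheme, optional userinfo, then exactly `github.com`.
const strictGithubUrl =
  /^(?:git\+)?(?:https|ssh):\/\/(?:[^@\/]+@)?github\.com\/([^\/]+)\/([^\/#?]+?)(?:\.git)?(?:[#?].*)?$/

function repoUrlDetailsSketch(url: string): RepoUrlDetails {
  const match = strictGithubUrl.exec(url)
  // Anything that fails the strict match, including scp-style
  // `git@github.com:...`, yields empty fields.
  return { user: match?.[1] ?? '', project: match?.[2] ?? '' }
}
```

For example, `repoUrlDetailsSketch('git+https://github.com/npm/cli.git')` yields `{ user: 'npm', project: 'cli' }`, while `https://githubXcom/u/p` passes the loose pattern but produces empty fields here.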
+### Fixed — misc
+
+- `src/fs.ts` `findUp()` / `findUpSync()` — traverse up to and **including** the filesystem root (and `stopAt`). The old `while (dir && dir !== root)` loop exited before visiting `root` itself, so a match at `/.foo` was never found
+- `src/words.ts` `capitalize()` — iterate by code point so non-BMP characters (emoji, astral-plane scripts) aren't split between their UTF-16 surrogate pair halves. Previously `'𐐀foo'` produced a broken leading surrogate
+- `src/words.ts` `determineArticle()` — match leading vowels case-insensitively (`Apple` → `an Apple`, not `a Apple`). Silent-h / y-sound exceptions (hour, user) remain a documented limitation rather than a built-in exception list
+
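
The code-point point in the `capitalize()` entry can be sketched as below. This is an assumption-laden illustration: `capitalizeSketch` and `naiveCapitalize` are hypothetical names, and the sketch only uppercases the first code point (whether the library also lowercases the remainder is not shown in this commit).

```typescript
// Hypothetical code-point-aware capitalize: takes the whole first code
// point (one or two UTF-16 units) before uppercasing.
function capitalizeSketch(word: string): string {
  const first = word.codePointAt(0)
  if (first === undefined) {
    return word
  }
  const head = String.fromCodePoint(first)
  // `head.length` is 2 for astral-plane characters, so the slice starts
  // after the full surrogate pair.
  return head.toUpperCase() + word.slice(head.length)
}

// UTF-16-unit version for comparison: `charAt(0)` returns only the high
// surrogate of an astral character, which has no case mapping.
function naiveCapitalize(word: string): string {
  return word.charAt(0).toUpperCase() + word.slice(1)
}
```

With a lowercase Deseret letter such as U+10428, the code-point version produces the uppercase U+10400 form, while the unit-based version leaves the input uncapitalized.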
 ## [5.20.1](https://github.com/SocketDev/socket-lib/releases/tag/v5.20.1) - 2026-04-19
 
 ### Fixed

src/cacache.ts

Lines changed: 17 additions & 20 deletions
@@ -40,29 +40,23 @@ export interface RemoveOptions {
 }
 
 /**
- * Check if a key matches a pattern (with wildcard support).
+ * Build a key→boolean matcher for `pattern`. For non-wildcard patterns
+ * this returns a prefix-startsWith predicate (no regex allocation); for
+ * wildcard patterns it compiles the regex *once* and closes over it so
+ * the caller can apply the same matcher across N keys in O(1)-per-key.
+ *
+ * Anchors both ends — `foo*bar` matches exactly `foo<anything>bar`,
+ * not `foo<anything>bar<more>`.
  */
-function matchesPattern(key: string, pattern: string): boolean {
-  // If no wildcards, use simple prefix matching (faster)
+function createPatternMatcher(pattern: string): (key: string) => boolean {
   if (!pattern.includes('*')) {
-    return key.startsWith(pattern)
+    return (key: string) => key.startsWith(pattern)
   }
-  // Use regex for wildcard patterns
-  const regex = patternToRegex(pattern)
-  return regex.test(key)
-}
-
-/**
- * Convert wildcard pattern to regex for matching.
- * Supports * as wildcard (matches any characters). Anchors both ends —
- * `foo*bar` matches exactly `foo<anything>bar`, not `foo<anything>bar<more>`.
- */
-function patternToRegex(pattern: string): RegExp {
-  // Escape regex special characters except *
+  // Escape regex special characters except `*`, then convert `*` to `.*`.
   const escaped = pattern.replaceAll(/[.+?^${}()|[\]\\]/g, '\\$&')
-  // Convert * to .* (match any characters)
   const regexPattern = escaped.replaceAll('*', '.*')
-  return new RegExp(`^${regexPattern}$`)
+  const regex = new RegExp(`^${regexPattern}$`)
+  return (key: string) => regex.test(key)
 }
 
 /**
@@ -136,13 +130,16 @@ export async function clear(
     return removed
   }
 
-  // For wildcard patterns, need to match each entry.
+  // For wildcard patterns, need to match each entry. Compile the
+  // matcher once outside the stream loop so wildcard scans are
+  // O(1)-per-key instead of re-compiling the regex on every entry.
   let removed = 0
+  const matches = createPatternMatcher(opts.prefix)
   /* c8 ignore next - External cacache call */
   const stream = cacache.ls.stream(cacheDir)
 
   for await (const entry of stream) {
-    if (matchesPattern(entry.key, opts.prefix)) {
+    if (matches(entry.key)) {
       try {
         /* c8 ignore next - External cacache call */
         await cacache.rm.entry(cacheDir, entry.key)
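
The compile-once idea in this diff can be tried in isolation with a small standalone sketch. It mirrors the structure of `createPatternMatcher` but is not the file's code: `createMatcherSketch` and the sample keys are hypothetical, and an in-memory array stands in for the `cacache.ls.stream` loop.

```typescript
// Hypothetical standalone version of the matcher-factory idea.
function createMatcherSketch(pattern: string): (key: string) => boolean {
  if (!pattern.includes('*')) {
    // No wildcard: plain prefix check, no regex is ever allocated.
    return key => key.startsWith(pattern)
  }
  // Escape regex metacharacters except `*`, then turn `*` into `.*`.
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, '\\$&')
  // Compiled once here; the returned closure reuses it for every key.
  const regex = new RegExp('^' + escaped.replace(/\*/g, '.*') + '$')
  return key => regex.test(key)
}

// The matcher is built once, outside the loop over keys.
const sampleKeys = ['foo123bar', 'foo123bar-extra', 'other']
const matchesFooBar = createMatcherSketch('foo*bar')
const wildcardHits = sampleKeys.filter(key => matchesFooBar(key))
```

Because both ends are anchored, `wildcardHits` contains only `'foo123bar'`; the `-extra` key is left alone, which is the over-deletion case called out in the changelog.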

src/memoization.ts

Lines changed: 37 additions & 59 deletions
@@ -150,19 +150,21 @@ export function memoize<Args extends unknown[], Result>(
     throw new TypeError('TTL must be non-negative')
   }
 
+  // LRU via Map insertion-order: delete+re-insert moves a key to the
+  // end in O(1). The oldest key is `cache.keys().next().value`. This
+  // replaces the prior parallel `accessOrder: string[]` which cost
+  // O(n) per hit (indexOf + splice) and scaled poorly for large caches.
   const cache = new Map<string, CacheEntry<Result>>()
-  const accessOrder: string[] = []
 
   // Register for global clearing.
   cacheRegistry.push(() => {
     cache.clear()
-    accessOrder.length = 0
   })
 
   function evictLRU(): void {
-    if (cache.size >= maxSize && accessOrder.length > 0) {
-      const oldest = accessOrder.shift()
-      if (oldest) {
+    if (cache.size >= maxSize) {
+      const oldest = cache.keys().next().value
+      if (oldest !== undefined) {
         cache.delete(oldest)
         debugLog(`[memoize:${name}] clear`, {
           key: oldest,
@@ -182,41 +184,31 @@ export function memoize<Args extends unknown[], Result>(
   return function memoized(...args: Args): Result {
     const key = keyGen(...args)
 
-    // Check cache
     const cached = cache.get(key)
     if (cached) {
       if (!isExpired(cached)) {
         cached.hits++
-        // Move to end of access order (LRU)
-        const index = accessOrder.indexOf(key)
-        if (index !== -1) {
-          accessOrder.splice(index, 1)
-        }
-        accessOrder.push(key)
+        // Bump recency: delete + re-insert moves the entry to Map's
+        // insertion-order tail in O(1).
+        cache.delete(key)
+        cache.set(key, cached)
 
         debugLog(`[memoize:${name}] hit`, { key, hits: cached.hits })
         return cached.value
       }
-      // Clean up expired entry before re-caching.
+      // Expired — drop it before recomputing.
       cache.delete(key)
-      const index = accessOrder.indexOf(key)
-      if (index !== -1) {
-        accessOrder.splice(index, 1)
-      }
     }
 
-    // Cache miss - compute value
     debugLog(`[memoize:${name}] miss`, { key })
     const value = fn(...args)
 
-    // Store in cache
     evictLRU()
     cache.set(key, {
       value,
       timestamp: Date.now(),
       hits: 0,
     })
-    accessOrder.push(key)
 
     debugLog(`[memoize:${name}] set`, { key, cacheSize: cache.size })
     return value
@@ -253,19 +245,20 @@ export function memoizeAsync<Args extends unknown[], Result>(
     ttl = Number.POSITIVE_INFINITY,
   } = options
 
+  // LRU via Map insertion-order: see `memoize()` above for the full
+  // rationale. Key lifecycle on bump: `cache.delete(key)` +
+  // `cache.set(key, entry)` moves the entry to the tail in O(1).
   const cache = new Map<string, CacheEntry<Promise<Result>>>()
-  const accessOrder: string[] = []
 
   // Register for global clearing.
   cacheRegistry.push(() => {
     cache.clear()
-    accessOrder.length = 0
   })
 
   function evictLRU(): void {
-    if (cache.size >= maxSize && accessOrder.length > 0) {
-      const oldest = accessOrder.shift()
-      if (oldest) {
+    if (cache.size >= maxSize) {
+      const oldest = cache.keys().next().value
+      if (oldest !== undefined) {
         cache.delete(oldest)
         debugLog(`[memoizeAsync:${name}] clear`, {
           key: oldest,
@@ -282,24 +275,24 @@ export function memoizeAsync<Args extends unknown[], Result>(
     return Date.now() - entry.timestamp > ttl
   }
 
+  // Bump an existing cache entry to the tail (most-recently-used) in
+  // O(1). Caller must have already verified `cache.has(key)`.
+  function bumpRecency(key: string, entry: CacheEntry<Promise<Result>>): void {
+    cache.delete(key)
+    cache.set(key, entry)
+  }
+
   // Track in-flight refreshes to prevent thundering herd on TTL expiry.
   const refreshing = new Map<string, Promise<Result>>()
 
   return async function memoized(...args: Args): Promise<Result> {
     const key = keyGen(...args)
 
-    // Check cache
     const cached = cache.get(key)
     if (cached) {
       if (!isExpired(cached)) {
         cached.hits++
-        // Move to end of access order (LRU)
-        const index = accessOrder.indexOf(key)
-        if (index !== -1) {
-          accessOrder.splice(index, 1)
-        }
-        accessOrder.push(key)
-
+        bumpRecency(key, cached)
         debugLog(`[memoizeAsync:${name}] hit`, { key, hits: cached.hits })
         return await cached.value
       }
@@ -310,35 +303,26 @@ export function memoizeAsync<Args extends unknown[], Result>(
         debugLog(`[memoizeAsync:${name}] stale-dedup`, { key })
         // Bump recency so the entry we're refreshing isn't evicted
         // under LRU pressure while a peer is computing on our behalf.
-        const inflightIndex = accessOrder.indexOf(key)
-        if (inflightIndex !== -1) {
-          accessOrder.splice(inflightIndex, 1)
-        }
-        accessOrder.push(key)
+        bumpRecency(key, cached)
         return await inflight
       }
-      // Clean up expired entry before re-caching.
+      // Expired and no in-flight refresh — drop it before recomputing.
       cache.delete(key)
-      const index = accessOrder.indexOf(key)
-      if (index !== -1) {
-        accessOrder.splice(index, 1)
-      }
     }
 
-    // Cache miss - compute value
     debugLog(`[memoizeAsync:${name}] miss`, { key })
 
-    // Create promise and cache it immediately (for deduplication)
+    // Create promise and cache it immediately (for deduplication).
     const promise = fn(...args).then(
       result => {
         refreshing.delete(key)
-        // Success — update cache entry with resolved promise AND refresh
-        // the timestamp so the freshly-computed value isn't immediately
-        // classified as expired. The timestamp was previously set when
-        // the fetch *started*; under a slow fn this meant `isExpired`
-        // could fire right as the value landed, and every subsequent
-        // call past TTL recomputed because the stale-dedup branch had
-        // nothing to join (`refreshing` was emptied here first).
+        // Success — refresh the timestamp so the freshly-computed value
+        // isn't immediately classified as expired. The timestamp was
+        // previously set when the fetch *started*; under a slow fn this
+        // meant `isExpired` could fire right as the value landed, and
+        // every subsequent call past TTL recomputed because the
+        // stale-dedup branch had nothing to join (`refreshing` was
+        // emptied here first).
         const entry = cache.get(key)
         if (entry) {
           entry.value = Promise.resolve(result)
@@ -348,26 +332,20 @@ export function memoizeAsync<Args extends unknown[], Result>(
       },
       error => {
         refreshing.delete(key)
-        // Failure - remove from cache to allow retry
+        // Failure  remove from cache to allow retry.
         cache.delete(key)
-        const index = accessOrder.indexOf(key)
-        if (index !== -1) {
-          accessOrder.splice(index, 1)
-        }
         debugLog(`[memoizeAsync:${name}] error`, { key, error })
         throw error
       },
     )
     refreshing.set(key, promise)
 
-    // Store promise in cache
     evictLRU()
     cache.set(key, {
       value: promise,
       timestamp: Date.now(),
       hits: 0,
     })
-    accessOrder.push(key)
 
     debugLog(`[memoize:${name}] set`.replace('memoize:', 'memoizeAsync:'), { key, cacheSize: cache.size })
     return await promise
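
The Map-insertion-order LRU idiom used throughout this diff can be reduced to a small standalone sketch. It is not the library's `memoize()`: `createLruCache` is a hypothetical helper with no TTL, hit counters, or debug logging, written only to show the delete-then-set bump and the oldest-key eviction.

```typescript
// Hypothetical minimal LRU built on Map insertion order.
function createLruCache<V>(maxSize: number) {
  const cache = new Map<string, V>()
  return {
    get(key: string): V | undefined {
      if (!cache.has(key)) {
        return undefined
      }
      const value = cache.get(key) as V
      // Bump recency: delete + re-insert moves the key to the tail in O(1).
      cache.delete(key)
      cache.set(key, value)
      return value
    },
    set(key: string, value: V): void {
      if (cache.has(key)) {
        // Overwrite: drop the old slot so the key lands at the tail.
        cache.delete(key)
      } else if (cache.size >= maxSize) {
        // Maps iterate in insertion order, so the first key is the
        // least recently used one.
        const oldest = cache.keys().next().value
        if (oldest !== undefined) {
          cache.delete(oldest)
        }
      }
      cache.set(key, value)
    },
    // Keys from least to most recently used, for inspection.
    keys(): string[] {
      return Array.from(cache.keys())
    },
  }
}
```

With a capacity of 2, setting `a` and `b`, reading `a`, then setting `c` evicts `b`, since the read moved `a` behind it in insertion order. No parallel array is needed because the Map itself carries the ordering.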
