@@ -49,7 +49,7 @@ Need to find trait impls? → get_implementations
49 49 Searching for text/patterns? → Grep/text search
50 50 ```
51 51
52- ### Deep vs Shallow Modules
52+ ## Deep vs Shallow Modules
53 53
54 54 **Deep Modules** = Powerful functionality + Simple interface
55 55 - **Best modules** hide significant complexity behind clean APIs
@@ -73,6 +73,64 @@ pub trait Catalog: Send + Sync + Debug {
73 73 - Interfaces that expose internal complexity
74 74 - Documentation longer than implementation
75 75
76+ ## Functional Programming Patterns
77+
78+ ### Prefer Iterators Over Loops
79+
80+ **Pattern 1: Iterator Chains** (`iceberg-rust/src/table/transaction/operation.rs:188-196`):
81+ ```rust
82+ let new_datafile_iter = data_files.into_iter().map(|data_file| {
83+     ManifestEntry::builder()
84+         .with_format_version(table_metadata.format_version)
85+         .with_status(Status::Added)
86+         .with_data_file(data_file)
87+         .with_sequence_number(table_metadata.last_sequence_number + dsn_offset)
88+         .build()
89+         .map_err(Error::from)
90+ });
91+ ```
92+
93+ **Benefits:**
94+ - Lazy evaluation (only creates when consumed)
95+ - Clear transformation pipeline
96+ - Error handling inline with `map_err`
97+ - No intermediate allocations until `collect()`
98+
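The laziness claimed above can be checked with a small standard-library sketch. The function name `lazy_map_demo` and the call counter are illustrative assumptions, not code from the repository:

```rust
use std::cell::Cell;

/// Hypothetical demo: returns (closure calls before consumption,
/// collected output, closure calls after consumption).
fn lazy_map_demo() -> (u32, Vec<i32>, u32) {
    // Counts how many times the closure passed to `map` actually runs.
    let calls = Cell::new(0u32);

    // Building the chain only describes the transformation.
    let doubled = vec![1, 2, 3].into_iter().map(|x| {
        calls.set(calls.get() + 1);
        x * 2
    });
    let before = calls.get();

    // `collect` drives the iterator, running the closure once per element.
    let out: Vec<i32> = doubled.collect();
    let after = calls.get();

    (before, out, after)
}

fn main() {
    let (before, out, after) = lazy_map_demo();
    println!("calls before: {before}, output: {out:?}, calls after: {after}");
}
```
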
99+ **Pattern 2: flat_map for Flattening** (`iceberg-rust/src/table/transaction/operation.rs:149-153`):
100+ ```rust
101+ let all_files: Vec<DataFile> = sequence_groups
102+     .iter()
103+     .flat_map(|d| d.delete_files.iter().chain(d.data_files.iter()))
104+     .cloned()
105+     .collect();
106+ ```
107+
108+ **Pattern 3: Option/Result Combinators** (`iceberg-rust/src/catalog/create.rs:131-132`):
109+ ```rust
110+ // Prefer this:
111+ self.location.ok_or(Error::NotFound(format!("Location for table {}", self.name)))?
112+
113+ // Over this:
114+ let location = match self.location {
115+     Some(loc) => loc,
116+     None => return Err(Error::NotFound(...)),
117+ };
118+ ```
119+
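One caveat on Pattern 3: `Option::ok_or` evaluates its argument eagerly, so the `format!` call runs even when the value is present. A possible variant uses `ok_or_else`, which defers building the error until it is needed. The sketch below assumes a simplified `Error` enum and a hypothetical `CreateTable` struct; neither is the crate's real type:

```rust
/// Simplified stand-in for the crate's error type (assumption).
#[derive(Debug, PartialEq)]
enum Error {
    NotFound(String),
}

/// Hypothetical builder-like struct with an optional location.
struct CreateTable {
    name: String,
    location: Option<String>,
}

impl CreateTable {
    /// Returns the location, or `Error::NotFound` if none was set.
    /// The closure (and its `format!`) only runs on the `None` path.
    fn location(&self) -> Result<String, Error> {
        self.location
            .clone()
            .ok_or_else(|| Error::NotFound(format!("Location for table {}", self.name)))
    }
}

fn main() {
    let request = CreateTable {
        name: "orders".to_string(),
        location: None,
    };
    println!("{:?}", request.location());
}
```
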
120+ ### Guidelines
121+
122+ 1. **Use Iterator Methods:** `map`, `filter`, `flat_map`, `fold` over `for` loops
123+ 2. **Lazy When Possible:** Return `impl Iterator` for large transformations
124+ 3. **Combinators:** `ok_or`, `and_then`, `unwrap_or_default` for `Option`/`Result`
125+ 4. **Strategic collect():** Only use `.collect::<Vec<_>>()` when needed
126+ 5. **Chain Iterators:** Use `.chain()` instead of extending vecs
127+
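As a rough illustration of guidelines 1, 2, and 5 together, the following sketch returns a lazy `impl Iterator`, chains two sources instead of extending a temporary `Vec`, and aggregates with `fold`. `FileGroup`, `all_paths`, and `total_len` are hypothetical names chosen for this example:

```rust
/// Hypothetical stand-in for a group of files; not the crate's real type.
struct FileGroup {
    data_files: Vec<String>,
    delete_files: Vec<String>,
}

/// Lazily yields every path, delete files first, without allocating a Vec.
fn all_paths(groups: &[FileGroup]) -> impl Iterator<Item = &str> + '_ {
    groups
        .iter()
        .flat_map(|g| g.delete_files.iter().chain(g.data_files.iter()))
        .map(String::as_str)
}

/// Sums the path lengths with `fold` instead of a mutable counter in a loop.
fn total_len(groups: &[FileGroup]) -> usize {
    all_paths(groups).fold(0, |acc, path| acc + path.len())
}

fn main() {
    let groups = vec![FileGroup {
        data_files: vec!["a.parquet".to_string()],
        delete_files: vec!["d.parquet".to_string()],
    }];
    // The caller decides whether and when to collect.
    let paths: Vec<&str> = all_paths(&groups).collect();
    println!("{paths:?}, total length {}", total_len(&groups));
}
```
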
128+ ### When NOT to Use Iterators
129+
130+ - Complex state machines (use explicit loops)
131+ - Performance-critical hot paths needing specific optimizations
132+ - When mutation in place is clearer
133+
76 134 ## Trait Design Patterns
77 135
78 136 ### When to Create Traits
@@ -226,99 +284,8 @@ Domain layer adds context
226 284 Infrastructure errors wrapped transparently
227 285 ```
228 286
229- ## Functional Programming Patterns
230-
231- ### Prefer Iterators Over Loops
232-
233- **Pattern 1: Iterator Chains** (`iceberg-rust/src/table/transaction/operation.rs:188-196`):
234- ```rust
235- let new_datafile_iter = data_files.into_iter().map(|data_file| {
236-     ManifestEntry::builder()
237-         .with_format_version(table_metadata.format_version)
238-         .with_status(Status::Added)
239-         .with_data_file(data_file)
240-         .with_sequence_number(table_metadata.last_sequence_number + dsn_offset)
241-         .build()
242-         .map_err(Error::from)
243- });
244- ```
245-
246- **Benefits:**
247- - Lazy evaluation (only creates when consumed)
248- - Clear transformation pipeline
249- - Error handling inline with `map_err`
250- - No intermediate allocations until `collect()`
251-
252- **Pattern 2: flat_map for Flattening** (`iceberg-rust/src/table/transaction/operation.rs:149-153`):
253- ```rust
254- let all_files: Vec<DataFile> = sequence_groups
255-     .iter()
256-     .flat_map(|d| d.delete_files.iter().chain(d.data_files.iter()))
257-     .cloned()
258-     .collect();
259- ```
260-
261- **Pattern 3: Option/Result Combinators** (`iceberg-rust/src/catalog/create.rs:131-132`):
262- ```rust
263- // Prefer this:
264- self.location.ok_or(Error::NotFound(format!("Location for table {}", self.name)))?
265-
266- // Over this:
267- let location = match self.location {
268-     Some(loc) => loc,
269-     None => return Err(Error::NotFound(...)),
270- };
271- ```
272-
273- ### Guidelines
274-
275- 1. **Use Iterator Methods:** `map`, `filter`, `flat_map`, `fold` over `for` loops
276- 2. **Lazy When Possible:** Return `impl Iterator` for large transformations
277- 3. **Combinators:** `ok_or`, `and_then`, `unwrap_or_default` for `Option`/`Result`
278- 4. **Strategic collect():** Only use `.collect::<Vec<_>>()` when needed
279- 5. **Chain Iterators:** Use `.chain()` instead of extending vecs
280-
281- ### When NOT to Use Iterators
282-
283- - Complex state machines (use explicit loops)
284- - Performance-critical hot paths needing specific optimizations
285- - When mutation in place is clearer
286-
287 287 ## Async Patterns
288 288
289- ### Pattern: async_trait for I/O
290-
291- All catalog and I/O operations are async (`iceberg-rust/src/catalog/mod.rs:56-57`):
292- ```rust
293- #[async_trait::async_trait]
294- pub trait Catalog: Send + Sync + Debug {
295-     async fn create_table(self: Arc<Self>, ...) -> Result<Table, Error>;
296- }
297- ```
298-
299- **Why `Arc<Self>`:**
300- - Catalog is shared across connections
301- - Methods need owned ` self ` for async execution
302- - Prevents lifetime issues in async contexts
303-
304- ### Pattern: Controlled Parallelism
305-
306- **Parallel Async Operations:**
307- ```rust
308- let manifests = stream::iter(manifest_entries)
309-     .map(|manifest| async move {
310-         object_store.get(&path).await
311-     })
312-     .buffer_unordered(10) // 10 concurrent fetches
313-     .try_collect::<Vec<_>>()
314-     .await?;
315- ```
316-
317- **Techniques:**
318- - `stream::iter` for converting to async stream
319- - `buffer_unordered(N)` for limiting parallelism
320- - `try_collect` for error-aware aggregation
321-
322 289 ### Pattern: Instrumentation
323 290
324 291 **All performance-critical paths** (`iceberg-rust/src/table/transaction/mod.rs`):
@@ -338,7 +305,6 @@ pub async fn commit(self) -> Result<(), Error> { ... }
338 305 2. **Send + Sync Bounds:** All async types crossing await points
339 306 3. **Arc for Shared State:** Prefer `Arc` over lifetimes in async
340 307 4. **Instrument Hot Paths:** Use `#[instrument]` on catalog/I/O operations
341- 5. **Limit Concurrency:** `buffer_unordered(N)` to prevent resource exhaustion
342 308 6. **Tokio Runtime:** All integration tests use `#[tokio::main]`
343 309
344 310 ## Module Organization
@@ -413,14 +379,7 @@ pub mod identifier {
413 379 4. **Examples in Docs:** For builder patterns and complex APIs
414 380 5. **Errors Are Contract:** Always document failure modes
415 381
416- ## Performance & Complexity Trade-offs
417-
418- ### Known Optimizations
419-
420- 1. **Manifest List Caching:** Prefetch with `buffer_unordered(10)`
421- 2. **Lazy Iteration:** Return iterators, not Vecs when possible
422- 3. **Arc Instead of Clone:** For metadata, catalogs, object stores
423- 4. **Controlled Concurrency:** Limit parallel I/O to prevent resource exhaustion
382+ ## Complexity Trade-offs
424 383
425 384 ### Complexity Management
426 385