Commit a04a9eb

improve claudemd
1 parent 17373f0 commit a04a9eb

1 file changed

Lines changed: 60 additions & 101 deletions

CLAUDE.md

@@ -49,7 +49,7 @@ Need to find trait impls? → get_implementations
 Searching for text/patterns? → Grep/text search
 ```
 
-### Deep vs Shallow Modules
+## Deep vs Shallow Modules
 
 **Deep Modules** = Powerful functionality + Simple interface
 - **Best modules** hide significant complexity behind clean APIs
@@ -73,6 +73,64 @@ pub trait Catalog: Send + Sync + Debug {
 - Interfaces that expose internal complexity
 - Documentation longer than implementation
 
+## Functional Programming Patterns
+
+### Prefer Iterators Over Loops
+
+**Pattern 1: Iterator Chains** (`iceberg-rust/src/table/transaction/operation.rs:188-196`):
+```rust
+let new_datafile_iter = data_files.into_iter().map(|data_file| {
+    ManifestEntry::builder()
+        .with_format_version(table_metadata.format_version)
+        .with_status(Status::Added)
+        .with_data_file(data_file)
+        .with_sequence_number(table_metadata.last_sequence_number + dsn_offset)
+        .build()
+        .map_err(Error::from)
+});
+```
+
+**Benefits:**
+- Lazy evaluation (only creates when consumed)
+- Clear transformation pipeline
+- Error handling inline with `map_err`
+- No intermediate allocations until `collect()`
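The laziness and inline error handling listed above can be sketched with the standard library alone. This is an illustrative example, not code from the repository: `parse_size`, `sizes`, and `total` are hypothetical names, and a `Cell` counter is used only to make visible how many items were actually processed.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for a fallible builder step such as `.build()`.
fn parse_size(raw: &str) -> Result<u64, String> {
    raw.parse::<u64>()
        .map_err(|e| format!("bad size {raw:?}: {e}"))
}

/// Builds a lazy pipeline. Nothing is parsed until the caller consumes it.
fn sizes<'a>(
    raw: &'a [&'a str],
    calls: &'a Cell<usize>,
) -> impl Iterator<Item = Result<u64, String>> + 'a {
    raw.iter().map(move |s| {
        calls.set(calls.get() + 1); // count the items that were really processed
        parse_size(s)
    })
}

/// Collecting into `Result<Vec<_>, _>` stops at the first `Err`.
fn total(raw: &[&str], calls: &Cell<usize>) -> Result<u64, String> {
    let parsed: Vec<u64> = sizes(raw, calls).collect::<Result<_, _>>()?;
    Ok(parsed.iter().sum())
}
```

Creating the iterator with `sizes` does no work; the closure only runs once `total` drives it through `collect`, and collecting into a `Result` short-circuits at the first failing item.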
+
+**Pattern 2: flat_map for Flattening** (`iceberg-rust/src/table/transaction/operation.rs:149-153`):
+```rust
+let all_files: Vec<DataFile> = sequence_groups
+    .iter()
+    .flat_map(|d| d.delete_files.iter().chain(d.data_files.iter()))
+    .cloned()
+    .collect();
+```
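A self-contained version of the same shape may make the resulting order easier to see. `FileRef` and `Group` below are hypothetical stand-ins for `DataFile` and the elements of `sequence_groups`; the real types are not shown in this diff.

```rust
/// Hypothetical stand-in for `DataFile`.
#[derive(Clone, Debug, PartialEq)]
struct FileRef(String);

/// Hypothetical stand-in for one element of `sequence_groups`.
struct Group {
    delete_files: Vec<FileRef>,
    data_files: Vec<FileRef>,
}

/// Flattens every group into one list: for each group, its delete files
/// first, then its data files, preserving group order.
fn all_files(groups: &[Group]) -> Vec<FileRef> {
    groups
        .iter()
        .flat_map(|g| g.delete_files.iter().chain(g.data_files.iter()))
        .cloned() // the iterator yields references, so clone once at the end
        .collect()
}
```

Calling `.cloned()` after `flat_map` keeps the pipeline borrowing until the last step, so each element is cloned exactly once.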
+
+**Pattern 3: Option/Result Combinators** (`iceberg-rust/src/catalog/create.rs:131-132`):
+```rust
+// Prefer this:
+self.location.ok_or(Error::NotFound(format!("Location for table {}", self.name)))?
+
+// Over this:
+let location = match self.location {
+    Some(loc) => loc,
+    None => return Err(Error::NotFound(...)),
+};
+```
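One detail worth noting: `ok_or` evaluates its argument eagerly, so the `format!` call runs even when the value is `Some`. The standard library's `Option::ok_or_else` defers that work to the `None` case. The sketch below uses a hypothetical `CreateTable` struct and `Error` enum shaped like the snippet above; they are assumptions for illustration, not the crate's definitions.

```rust
/// Hypothetical error type, modelled on the `Error::NotFound` variant above.
#[derive(Debug, PartialEq)]
enum Error {
    NotFound(String),
}

/// Hypothetical builder-like struct with an optional location.
struct CreateTable {
    name: String,
    location: Option<String>,
}

impl CreateTable {
    /// Returns the location, or a `NotFound` error that names the table.
    fn into_location(self) -> Result<String, Error> {
        let Self { name, location } = self;
        // The closure (and its `format!`) only runs when `location` is `None`.
        location.ok_or_else(|| Error::NotFound(format!("Location for table {name}")))
    }
}
```

For cheap error values `ok_or` reads fine; `ok_or_else` is the usual choice once building the error allocates.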
+
+### Guidelines
+
+1. **Use Iterator Methods:** `map`, `filter`, `flat_map`, `fold` over `for` loops
+2. **Lazy When Possible:** Return `impl Iterator` for large transformations
+3. **Combinators:** `ok_or`, `and_then`, `unwrap_or_default` for `Option`/`Result`
+4. **Strategic collect():** Only use `.collect::<Vec<_>>()` when needed
+5. **Chain Iterators:** Use `.chain()` instead of extending vecs
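Guidelines 2, 4, and 5 can be combined in one small function. This is a sketch under assumed types: `large_paths` and the `(path, size)` tuples are hypothetical and only illustrate the return-an-iterator shape.

```rust
/// Yields the paths of all entries at least `min_size` bytes large, drawn
/// from two slices, without building an intermediate `Vec`.
fn large_paths<'a>(
    manifest: &'a [(String, u64)],
    extra: &'a [(String, u64)],
    min_size: u64,
) -> impl Iterator<Item = &'a str> + 'a {
    manifest
        .iter()
        .chain(extra.iter()) // instead of cloning both into one Vec and extending
        .filter(move |(_, size)| *size >= min_size)
        .map(|(path, _)| path.as_str())
}
```

The caller decides whether to `collect()`, take only the first match with `.next()`, or keep chaining, so allocation happens only where it is actually needed.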
+
+### When NOT to Use Iterators
+
+- Complex state machines (use explicit loops)
+- Performance-critical hot paths needing specific optimizations
+- When mutation in place is clearer
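As one illustration of the last point, merging sorted ranges in place is arguably clearer as an explicit loop with a read and a write index than as an iterator chain. The function below is a hypothetical example, not taken from the repository, and it assumes the input is already sorted by start.

```rust
/// Merges overlapping or touching ranges in place.
/// Assumes `ranges` is sorted by start; each range is `(start, end)`.
fn merge_sorted_ranges(ranges: &mut Vec<(u64, u64)>) {
    let mut write = 0; // index of the last merged range
    for read in 1..ranges.len() {
        if ranges[read].0 <= ranges[write].1 {
            // Overlaps or touches the current merged range: extend it.
            ranges[write].1 = ranges[write].1.max(ranges[read].1);
        } else {
            // Gap found: start a new merged range in the next slot.
            write += 1;
            ranges[write] = ranges[read];
        }
    }
    // Drop the leftover tail; the `min` keeps an empty input empty.
    ranges.truncate((write + 1).min(ranges.len()));
}
```

The loop reuses the existing allocation and keeps the two cursors explicit, which an iterator version would have to hide inside a `fold` or a custom adapter.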
+
 ## Trait Design Patterns
 
 ### When to Create Traits
@@ -226,99 +284,8 @@ Domain layer adds context
 Infrastructure errors wrapped transparently
 ```
 
-## Functional Programming Patterns
-
-### Prefer Iterators Over Loops
-
-**Pattern 1: Iterator Chains** (`iceberg-rust/src/table/transaction/operation.rs:188-196`):
-```rust
-let new_datafile_iter = data_files.into_iter().map(|data_file| {
-    ManifestEntry::builder()
-        .with_format_version(table_metadata.format_version)
-        .with_status(Status::Added)
-        .with_data_file(data_file)
-        .with_sequence_number(table_metadata.last_sequence_number + dsn_offset)
-        .build()
-        .map_err(Error::from)
-});
-```
-
-**Benefits:**
-- Lazy evaluation (only creates when consumed)
-- Clear transformation pipeline
-- Error handling inline with `map_err`
-- No intermediate allocations until `collect()`
-
-**Pattern 2: flat_map for Flattening** (`iceberg-rust/src/table/transaction/operation.rs:149-153`):
-```rust
-let all_files: Vec<DataFile> = sequence_groups
-    .iter()
-    .flat_map(|d| d.delete_files.iter().chain(d.data_files.iter()))
-    .cloned()
-    .collect();
-```
-
-**Pattern 3: Option/Result Combinators** (`iceberg-rust/src/catalog/create.rs:131-132`):
-```rust
-// Prefer this:
-self.location.ok_or(Error::NotFound(format!("Location for table {}", self.name)))?
-
-// Over this:
-let location = match self.location {
-    Some(loc) => loc,
-    None => return Err(Error::NotFound(...)),
-};
-```
-
-### Guidelines
-
-1. **Use Iterator Methods:** `map`, `filter`, `flat_map`, `fold` over `for` loops
-2. **Lazy When Possible:** Return `impl Iterator` for large transformations
-3. **Combinators:** `ok_or`, `and_then`, `unwrap_or_default` for `Option`/`Result`
-4. **Strategic collect():** Only use `.collect::<Vec<_>>()` when needed
-5. **Chain Iterators:** Use `.chain()` instead of extending vecs
-
-### When NOT to Use Iterators
-
-- Complex state machines (use explicit loops)
-- Performance-critical hot paths needing specific optimizations
-- When mutation in place is clearer
-
 ## Async Patterns
 
-### Pattern: async_trait for I/O
-
-All catalog and I/O operations are async (`iceberg-rust/src/catalog/mod.rs:56-57`):
-```rust
-#[async_trait::async_trait]
-pub trait Catalog: Send + Sync + Debug {
-    async fn create_table(self: Arc<Self>, ...) -> Result<Table, Error>;
-}
-```
-
-**Why `Arc<Self>`:**
-- Catalog is shared across connections
-- Methods need owned `self` for async execution
-- Prevents lifetime issues in async contexts
-
-### Pattern: Controlled Parallelism
-
-**Parallel Async Operations:**
-```rust
-let manifests = stream::iter(manifest_entries)
-    .map(|manifest| async move {
-        object_store.get(&path).await
-    })
-    .buffer_unordered(10) // 10 concurrent fetches
-    .try_collect::<Vec<_>>()
-    .await?;
-```
-
-**Techniques:**
-- `stream::iter` for converting to async stream
-- `buffer_unordered(N)` for limiting parallelism
-- `try_collect` for error-aware aggregation
-
 ### Pattern: Instrumentation
 
 **All performance-critical paths** (`iceberg-rust/src/table/transaction/mod.rs`):
@@ -338,7 +305,6 @@ pub async fn commit(self) -> Result<(), Error> { ... }
 2. **Send + Sync Bounds:** All async types crossing await points
 3. **Arc for Shared State:** Prefer `Arc` over lifetimes in async
 4. **Instrument Hot Paths:** Use `#[instrument]` on catalog/I/O operations
-5. **Limit Concurrency:** `buffer_unordered(N)` to prevent resource exhaustion
 6. **Tokio Runtime:** All integration tests use `#[tokio::main]`
 
 ## Module Organization
@@ -413,14 +379,7 @@ pub mod identifier {
 4. **Examples in Docs:** For builder patterns and complex APIs
 5. **Errors Are Contract:** Always document failure modes
 
-## Performance & Complexity Trade-offs
-
-### Known Optimizations
-
-1. **Manifest List Caching:** Prefetch with `buffer_unordered(10)`
-2. **Lazy Iteration:** Return iterators, not Vecs when possible
-3. **Arc Instead of Clone:** For metadata, catalogs, object stores
-4. **Controlled Concurrency:** Limit parallel I/O to prevent resource exhaustion
+## Complexity Trade-offs
 
 ### Complexity Management
 