Objective: Enable fluid data access regardless of infrastructure constraints (e.g., storage types, runtime environments) without developing controller code.
- Pluggable Architecture: Standardized Cache Runtime Interface for rapid integration of new engines (CubeFS, Dragonfly, Vineyard) with minimal boilerplate.
- Orchestration Based on AdvancedStatefulSet: Migrate from StatefulSet to AdvancedStatefulSet for fine-grained Pod lifecycle management, ordered rollout, and enhanced failover capabilities.
- Zero-Downtime Tuning: Adjust cache replicas, storage media tiers (SSD/HDD/RAM), and eviction policies without Dataset reconstruction or workload restart.
- Hot Parameter Swapping: Runtime modification of cache engine configurations (e.g., Alluxio thread pool, Jindo worker threads) for traffic spike handling.
- Standardized Conditions,
ObservedGeneration, and phase transition semantics for improved GitOps and tooling compatibility. - Conversion webhook support for seamless
v1alpha1→v1alpha2migration.
- Admission-time CRD validation with auto-correction suggestions to prevent misconfigurations.
- Policy enforcement for resource quotas and security constraints.
- Production-ready stability for large-scale deployments with minimum container privileges (eliminate privileged FUSE Pod requirements).
Objective: Achieve cross-region, cross-cluster, and cross-platform data mobility and accessibility.
- Disaggregated KV Cache: Externalize vLLM/SGLang KV Cache to Fluid-managed distributed storage, enabling 10x+ throughput improvement for long-context inference.
- Cross-Pod Cache Sharing: Live migration of KV Cache between inference instances for preemptive scheduling and spot instance tolerance.
- Mooncake Integration: Official partnership for high-performance KV Cache backend with RDMA acceleration.
- Distributed Prewarming: Maximize bandwidth utilization for fast data loading.
- Throttling Control: Limit bandwidth usage during prewarming to avoid saturation.
- Rsync Optimization: Improve cross-region sync efficiency.
- Master Pod Crash Recovery: Automatic re-setup and state reconstruction after cache master failure without data loss.
- Metadata Persistence: WAL-based metadata recovery for rapid failover.
- Access Pattern Recognition: ML-based analysis to auto-inject acceleration strategies (prefetching, block size optimization).
- Dataset Garbage Collection: Idle dataset detection via reference counting and access history analysis.
Objective: Ensure real-time, adaptive, and intelligent data availability for workloads.
- Kueue-Driven Pipelines: Trigger training/inference jobs automatically upon DataLoad completion; automate post-job cache eviction and data migration.
- Event-Driven Policies: Flexible metadata synchronization triggered by workload lifecycle events.
- Fluid kubectl Plugin: Native CLI extension (
kubectl fluid) for:- Dataset status inspection and health diagnostics
- On-demand prewarming triggering (
kubectl fluid warmup) - Cache performance profiling and bottleneck analysis
- Runtime configuration hot-updates