Commit da71fec

docs: Add comprehensive documentation, examples, and project roadmap (#40)
* docs: Add comprehensive architecture and multi-pool documentation

Added three key documentation files to guide operator development and usage:

- architecture-decisions.md: Documents critical architectural understanding of RustFS unified cluster model, erasure coding behavior, and valid/invalid multi-pool use cases. Includes warnings about storage class mixing pitfalls.
- multi-pool-use-cases.md: Comprehensive guide covering valid multi-pool scenarios (capacity expansion, geographic distribution, spot instances, hardware migration) with concrete examples and anti-patterns to avoid.
- DEVELOPMENT-NOTES.md: Development workflow documentation including build commands, testing procedures, and contribution guidelines.

These docs prevent common misconfigurations and establish architectural understanding for contributors.

* examples: Add comprehensive Tenant CRD examples and reorganize structure

Moved examples from deploy/rustfs-operator/examples/ to project root examples/ directory for better visibility and accessibility.

New examples added:
- cluster-expansion-tenant.yaml: Demonstrates capacity expansion and gradual hardware migration using multiple pools
- geographic-pools-tenant.yaml: Multi-region deployment with topology constraints for compliance and disaster recovery
- hardware-pools-tenant.yaml: Heterogeneous disk sizes within same storage class for efficient capacity utilization
- spot-instance-tenant.yaml: Cost optimization using spot instances with appropriate tolerations and affinity rules
- production-ha-tenant.yaml: Production-ready HA setup with topology spread constraints and resource limits
- README.md: Comprehensive guide with usage instructions, architectural warnings, and kubectl verification commands

Enhanced existing examples:
- simple-tenant.yaml: Added documentation for all scheduling fields
- minimal-dev-tenant.yaml: Corrected port references
- custom-rbac-tenant.yaml: Clarified RBAC patterns
- multi-pool-tenant.yaml: Fixed syntax and structure

All examples include:
- Inline documentation explaining configuration choices
- Architectural warnings about RustFS unified cluster behavior
- kubectl verification commands for testing
- Best practices for production deployments

Removed:
- deploy/rustfs-operator/examples/multi-pool-tenant.yaml (moved to examples/)
- deploy/rustfs-operator/examples/simple-tenant.yaml (moved to examples/)

* docs: Add comprehensive CHANGELOG documenting all changes

Added CHANGELOG.md following Keep a Changelog format to track all notable changes to the RustFS Kubernetes Operator.

Documented changes include:
- Multi-pool scheduling enhancements (2025-11-08)
- Required environment variables additions (2025-11-05)
- Critical port corrections (console: 9090→9001, IO: 90→9000)
- Volume path standardization (/data/{N} → /data/rustfs{N})
- Architecture corrections and clarifications
- Example improvements and bug fixes
- Documentation of valid vs invalid multi-pool use cases

Key architectural facts documented:
- Unified cluster architecture (all pools form ONE erasure-coded cluster)
- Uniform data distribution across ALL volumes
- No storage class awareness or intelligent placement
- Performance limited by slowest storage class
- External tiering via lifecycle policies

Verification against RustFS source code, Helm charts, and official documentation ensures accuracy. Test status: 25 tests passing, backward compatibility maintained.

* docs: Add comprehensive project ROADMAP for discussion

Added ROADMAP.md outlining development plans from v0.2.0 through v1.0.0 and beyond. This document serves as a foundation for community discussion and priority alignment.

Key sections:
- Current status (v0.1.0) with completed features and known issues
- v0.2.0 (Q1 2026): Core stability with Secret management, status conditions, improved error handling, and integration tests
- v0.3.0 (Q2 2026): Advanced lifecycle management, pool operations, TLS automation, and monitoring integration
- v0.4.0 (Q3 2026): Enterprise features including multi-tenancy, security hardening, compliance, and advanced networking
- v1.0.0 (Q4 2026): Production ready with stability guarantees, complete documentation, and ecosystem integration
- Post-1.0: Future considerations (GitOps, multi-cluster, AI/ML optimization)

Also includes:
- Technical debt tracking
- Community and contribution goals
- Release schedule (quarterly pre-1.0, monthly post-1.0)
- Success metrics and contribution guidelines

Target 1.0 release: Q4 2026

This roadmap is a living document open to community input and feedback.

* remove timelines

* remove target release timeline

* remove release schedule

* docs: Update CLAUDE.md with comprehensive project context

Enhanced CLAUDE.md with critical information from recent documentation:

Critical Architectural Understanding:
- Added prominent warning about RustFS unified cluster architecture
- Clarified that all pools form ONE cluster, not separate clusters
- Documented valid vs invalid multi-pool use cases
- Reference to architecture-decisions.md for detailed ADRs

RustFS-Specific Standards:
- Service ports verified against source (IO: 9000, Console: 9001)
- Volume path patterns (/data/rustfs{N})
- Required environment variables
- Credential requirements

Enhanced Documentation:
- Updated CRD validation rules (2-server, 3-server requirements)
- SchedulingConfig with flatten pattern
- Persistence config details
- New spec fields: image_pull_policy, pod_management_policy

Development Context:
- Known issues and TODOs with specific line numbers
- Documentation structure (CHANGELOG, ROADMAP, docs/, examples/)
- All 10 examples organized by category
- Development priorities from ROADMAP (without timelines)
- Test coverage: 25 tests passing

Verification Standards:
- Sources for verifying RustFS behavior
- Warning against inventing features

This provides comprehensive guidance for future development sessions.

* docs: Update ROADMAP with correct discussion and issue tracker links

* docs: Remove community chat placeholder from ROADMAP
1 parent c02cf39 commit da71fec

17 files changed: +4041 -41 lines changed

CHANGELOG.md

Lines changed: 175 additions & 0 deletions
@@ -0,0 +1,175 @@
# Changelog

All notable changes to the RustFS Kubernetes Operator will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

### Added

#### Multi-Pool Scheduling Enhancements (2025-11-08)

- **Per-Pool Kubernetes Scheduling**: Added comprehensive scheduling configuration to Pool struct
  - `nodeSelector` - Target specific nodes by labels
  - `affinity` - Complex node/pod affinity rules
  - `tolerations` - Schedule on tainted nodes (e.g., spot instances)
  - `topologySpreadConstraints` - Distribute pods across failure domains
  - `resources` - CPU/memory requests and limits per pool
  - `priorityClassName` - Override tenant-level priority per pool

- **SchedulingConfig Struct**: Grouped scheduling fields for better code organization
  - Uses `#[serde(flatten)]` to maintain flat YAML structure
  - Follows industry-standard pattern (MongoDB, PostgreSQL operators)
  - 100% backward compatible

- **New Examples**:
  - `cluster-expansion-tenant.yaml` - Demonstrates capacity expansion and pool migration
  - `hardware-pools-tenant.yaml` - Shows heterogeneous disk sizes (same storage class)
  - `geographic-pools-tenant.yaml` - Multi-region deployment for compliance and DR
  - `spot-instance-tenant.yaml` - Cost optimization using spot instances

- **Documentation**:
  - `docs/multi-pool-use-cases.md` - Comprehensive multi-pool use case guide
  - `docs/architecture-decisions.md` - Critical architecture understanding
  - Updated `examples/README.md` with architecture warnings

- **Tests**: Added 5 new tests for scheduling field propagation (20 → 25 tests)

#### Required Environment Variables (2025-11-05)

- Operator now automatically sets required RustFS environment variables:
  - `RUSTFS_VOLUMES` - Multi-node volume configuration (already existed)
  - `RUSTFS_ADDRESS` - Server binding address (0.0.0.0:9000)
  - `RUSTFS_CONSOLE_ADDRESS` - Console binding address (0.0.0.0:9001)
  - `RUSTFS_CONSOLE_ENABLE` - Enable console UI (true)
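
As a rough illustration of the variables listed above, the following Rust sketch builds the name/value pairs. The function name `required_env`, the two port constants, and the `volumes` parameter are hypothetical and are not taken from the operator's source; only the variable names and their values come from this changelog.

```rust
/// Ports as stated in this changelog (IO: 9000, console: 9001).
/// The constant names are illustrative, not the operator's own.
const IO_PORT: u16 = 9000;
const CONSOLE_PORT: u16 = 9001;

/// Hypothetical helper: returns the environment variables that the
/// changelog says are set on each RustFS container.
/// `volumes` is the already-computed RUSTFS_VOLUMES value.
fn required_env(volumes: &str) -> Vec<(&'static str, String)> {
    vec![
        ("RUSTFS_VOLUMES", volumes.to_string()),
        ("RUSTFS_ADDRESS", format!("0.0.0.0:{}", IO_PORT)),
        ("RUSTFS_CONSOLE_ADDRESS", format!("0.0.0.0:{}", CONSOLE_PORT)),
        ("RUSTFS_CONSOLE_ENABLE", "true".to_string()),
    ]
}
```

Deriving both bind addresses from the same constants that the Services would use is one way to keep container and Service ports from drifting apart, which is the class of bug recorded under the port corrections in this changelog.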

### Fixed

#### Critical Port Corrections (2025-11-05)

- **Console Port**: Changed from 9090 to 9001 (correct RustFS default)
  - Fixed in `services.rs` and `workloads.rs`
  - Verified against RustFS source code constants

- **IO Service Port**: Changed from 90 to 9000 (S3 API standard)
  - Fixed in `services.rs`
  - Now matches S3-compatible service expectations

#### Volume Path Standardization (2025-11-05)

- **Volume Mount Paths**: Changed from `/data/{N}` to `/data/rustfs{N}`
  - Matches RustFS official Helm chart convention
  - Aligns with RustFS docker-compose examples
  - Verified against RustFS MNMD deployment guide

- **RUSTFS_VOLUMES Format**: Updated path from `/data/{0...N}` to `/data/rustfs{0...N}`
  - Consistent with RustFS ecosystem standards
  - Uses 3-dot ellipsis notation for RustFS expansion
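
The path convention above can be sketched in Rust as follows. This is a minimal illustration, not the operator's code: the function names and the `volumes_per_server` parameter are hypothetical, and it assumes that `N` is the last zero-based volume index (so four volumes expand to `{0...3}`). Whether RustFS accepts a single-volume range such as `{0...0}` is not stated in this changelog.

```rust
/// Hypothetical helper: builds the path part of RUSTFS_VOLUMES using the
/// 3-dot ellipsis notation. Returns None when there are no volumes.
/// Assumes N is the last zero-based index, i.e. `volumes_per_server - 1`.
fn volume_path_pattern(volumes_per_server: u32) -> Option<String> {
    if volumes_per_server == 0 {
        return None;
    }
    let last = volumes_per_server - 1;
    // `{{` and `}}` emit literal braces; `{}` is replaced by `last`.
    Some(format!("/data/rustfs{{0...{}}}", last))
}

/// Hypothetical helper: mount path for one volume index, e.g. `/data/rustfs2`.
fn volume_mount_path(index: u32) -> String {
    format!("/data/rustfs{}", index)
}
```

Generating the mount paths and the expansion pattern from the same prefix keeps the two in agreement, so every path the pattern expands to corresponds to a mounted volume.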

#### Architecture Corrections (2025-11-08)

- **Storage Class Mixing**: Corrected examples that incorrectly mixed storage classes
  - Updated `hardware-pools-tenant.yaml` to use same storage class with different sizes
  - Fixed `spot-instance-tenant.yaml` to use uniform storage class
  - Added warnings to `geographic-pools-tenant.yaml` about unified cluster behavior

- **Architectural Clarifications**:
  - All pools form ONE unified RustFS erasure-coded cluster
  - Data is striped uniformly across ALL volumes regardless of storage class
  - Mixing NVMe/SSD/HDD results in HDD-level performance for entire cluster
  - RustFS has no intelligent storage class-based data placement
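
One way to catch the storage class mixing described above is a simple check over the pools of a Tenant. The sketch below is only an illustration under stated assumptions: `PoolSketch`, its fields, and both functions are hypothetical and do not mirror the operator's actual `Pool` type, and this changelog does not say that the operator performs such a check.

```rust
use std::collections::BTreeSet;

/// Hypothetical, simplified view of a pool with only the fields used here.
struct PoolSketch<'a> {
    name: &'a str,
    /// None stands for the cluster's default storage class.
    storage_class: Option<&'a str>,
}

/// Returns the distinct storage classes across all pools, sorted by name.
fn distinct_storage_classes<'a>(pools: &[PoolSketch<'a>]) -> Vec<&'a str> {
    let set: BTreeSet<&'a str> = pools
        .iter()
        .map(|p| p.storage_class.unwrap_or("(default)"))
        .collect();
    set.into_iter().collect()
}

/// Returns a warning when pools use more than one storage class, since all
/// pools form one cluster and data is striped across every volume.
fn mixing_warning(pools: &[PoolSketch<'_>]) -> Option<String> {
    let classes = distinct_storage_classes(pools);
    if classes.len() <= 1 {
        return None;
    }
    let names: Vec<&str> = pools.iter().map(|p| p.name).collect();
    Some(format!(
        "pools {:?} use different storage classes {:?}; expect the slowest class to bound cluster performance",
        names, classes
    ))
}
```

A warning rather than a hard rejection seems the safer choice for a sketch like this, because two differently named storage classes may still be backed by equivalent hardware.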

#### Examples Bug Fixes (2025-11-05)

- Fixed `multi-pool-tenant.yaml` syntax error (missing `persistence:` nesting)
- Moved examples from `deploy/rustfs-operator/examples/` to `examples/` at project root
- Created comprehensive `examples/README.md` with usage guide

### Changed

#### Example Improvements (2025-11-05 to 2025-11-08)

- **simple-tenant.yaml**: Added documentation for all scheduling fields
- **production-ha-tenant.yaml**: Added topology spread constraints and resource requirements
- **minimal-dev-tenant.yaml**: Corrected port references and added verification commands
- **custom-rbac-tenant.yaml**: Clarified RBAC patterns

### Removed

- **tiered-storage-tenant.yaml** (2025-11-05): Removed example with fabricated RustFS features
  - Contained non-existent environment variables
  - Made false claims about automatic storage tiering
  - Replaced with architecturally sound examples

### Documentation

#### Architecture Understanding (2025-11-08)

Key architectural facts now documented:

1. **Unified Cluster Architecture**: All pools in a Tenant form ONE erasure-coded cluster
2. **Uniform Data Distribution**: Erasure coding stripes data across ALL volumes equally
3. **No Storage Class Awareness**: RustFS does not prefer fast disks over slow disks
4. **Performance Limitation**: Cluster performs at speed of SLOWEST storage class
5. **External Tiering**: RustFS tiering uses lifecycle policies to external cloud storage (S3, Azure, GCS)

#### Valid Multi-Pool Use Cases

Documented valid uses:
- ✅ Cluster capacity expansion and hardware migration
- ✅ Geographic distribution for compliance and disaster recovery
- ✅ Spot vs on-demand instance optimization (compute cost savings)
- ✅ Same storage class with different disk sizes
- ✅ Resource differentiation (CPU/memory) per pool
- ✅ Topology-aware distribution across failure domains

Invalid uses clarified:
- ❌ Storage class mixing for performance tiering (NVMe for hot, HDD for cold)
- ❌ Automatic intelligent data placement based on access patterns

---

## [0.1.0] - 2025-11-05

### Initial State

- Basic Tenant CRD with pool support
- RBAC resource creation (Role, ServiceAccount, RoleBinding)
- Service creation (IO, Console, Headless)
- StatefulSet creation per pool
- Volume claim template generation
- RUSTFS_VOLUMES automatic configuration

### Known Issues in 0.1.0 (Before Fixes)

- Incorrect console port (9090 instead of 9001)
- Incorrect IO service port (90 instead of 9000)
- Missing required RustFS environment variables
- Non-standard volume mount paths
- Limited multi-pool scheduling capabilities
- Misleading examples with fabricated features

---

## Verification

All changes verified against:
- RustFS source code (`~/git/rustfs`)
- RustFS Helm chart (`helm/rustfs/`)
- RustFS docker-compose examples
- RustFS MNMD deployment guide
- RustFS configuration constants

## Testing

- **Test Count**: 25 tests
- **Status**: All passing ✅
- **Build**: Successful ✅
- **Backward Compatibility**: 100% maintained ✅

---

**Branch**: `feature/pool-scheduling-enhancements`
**Status**: Ready for merge
