fix: coordinator auto-scaling, cache eviction, rebalancing, and grpc timeouts
Build & Test (NowChessSystems) TeamCity build finished
Critical fixes:
- Enable auto-scaling (was disabled in config)
- Add periodic cache eviction (5m interval); CacheEvictionManager never ran
- Add periodic rebalance check (30s) for proactive load balancing
- Add 5s timeout to all gRPC calls (batchResubscribe, unsubscribe, evict)
- Use Option instead of null checks (scalafix compliance)

These gaps left the coordinator unable to:
1. Scale up when instances were overloaded (scaling was disabled)
2. Clean up idle games from memory (no scheduled eviction)
3. Rebalance load proactively (rebalancing ran only on scale-up)
4. Handle hung instances (no RPC timeouts, so operations could hang forever)

Combined with prior fixes for instance metadata parsing and heartbeat TTL, the coordinator now handles overload scenarios correctly.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
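
The timeout and Option points above could look roughly like the sketch below. It is not the code from this commit: `RpcTimeoutSketch`, `callWithTimeout`, and `lookupInstance` are hypothetical names, and the blocking call is a stand-in for a real gRPC stub, using only the Scala standard library. It shows one way to bound a call at 5 seconds and to wrap nullable results in `Option` instead of checking for null.

```scala
import java.util.concurrent.TimeoutException
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical helper; names are illustrative and not taken from the coordinator module.
object RpcTimeoutSketch {
  implicit val ec: ExecutionContext = ExecutionContext.global

  // The 5s value comes from the commit message; how it is wired in is assumed.
  val RpcTimeout: FiniteDuration = 5.seconds

  // Runs `call` on another thread and waits at most `timeout` for its result.
  // Returns None on timeout or when the call yields null (Option(null) is None).
  // Other exceptions thrown by `call` propagate to the caller.
  def callWithTimeout[A](timeout: FiniteDuration)(call: => A): Option[A] =
    try Option(Await.result(Future(call), timeout))
    catch { case _: TimeoutException => None }

  // Option(...) replaces an explicit null check on a Java-style lookup.
  def lookupInstance(registry: java.util.Map[String, String], id: String): Option[String] =
    Option(registry.get(id))
}
```

Note that `Await.result` only stops waiting; the underlying work keeps running on its thread. A deadline set on the gRPC call itself would also cancel the call, which is likely the better fit for hung instances.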
@@ -256,6 +256,7 @@
- `modules/coordinator/src/main/scala/de/nowchess/coordinator/service/AutoScaler.scala`
- class AutoScaler
- function initMetrics
- function periodicScaleCheck
- function checkAndScale
- function scaleUp
- function scaleDown
@@ -200,6 +200,7 @@
- `modules/coordinator/src/main/scala/de/nowchess/coordinator/service/AutoScaler.scala`
- class AutoScaler
- function initMetrics
- function periodicScaleCheck
- function checkAndScale
- function scaleUp
- function scaleDown