Compare commits

..

9 Commits

Author SHA1 Message Date
TeamCity 56f0030a83 ci: bump version with Build-85 2026-05-13 20:48:35 +00:00
Janis d0c71693bb fix: coordinator auto-scaling, cache eviction, rebalancing, and grpc timeouts
Build & Test (NowChessSystems) TeamCity build finished
Critical fixes:
- Enable auto-scaling (was disabled in config)
- Add periodic cache eviction (5m interval) — CacheEvictionManager never ran
- Add periodic rebalance check (30s) — proactive load balancing
- Add 5s timeout to all gRPC calls (batchResubscribe, unsubscribe, evict)
- Use Option instead of null checks (scalafix compliance)

These gaps left the coordinator unable to:
1. Scale up when instances overloaded (scaling was disabled)
2. Clean up idle games from memory (no scheduled eviction)
3. Rebalance load proactively (only on scale-up)
4. Handle hung instances (no RPC timeouts, operations could hang forever)

Combined with prior fixes for instance metadata parsing and heartbeat TTL,
the coordinator now handles overload scenarios correctly.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-05-13 22:20:25 +02:00
Janis 3f12f695f1 feat: implement periodic scaling checks and enhance instance management in AutoScaler
Build & Test (NowChessSystems) TeamCity build failed
2026-05-13 22:08:22 +02:00
TeamCity 0a3c494fa8 ci: bump version with Build-84 2026-05-13 19:32:12 +00:00
Janis f7ce4df595 fix: update documentation to reflect new functions in CoordinatorGrpcServer and InstanceRegistry
Build & Test (NowChessSystems) TeamCity build finished
2026-05-13 21:07:44 +02:00
TeamCity d41c03700c ci: bump version with Build-83 2026-05-13 17:06:04 +00:00
Janis 10937e756a fix: streamline logging for evicted instances in InstanceRegistry
Build & Test (NowChessSystems) TeamCity build finished
2026-05-13 18:39:32 +02:00
Janis 380a2cceeb feat: add periodic health check to evict dead instances
Build & Test (NowChessSystems) TeamCity build failed
Add quarkus-scheduler dependency and schedule health check every 10 seconds.
Dead instances (marked with state="DEAD") now automatically evicted instead of
accumulating indefinitely.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-05-13 18:25:12 +02:00
Janis 43184d296d fix: remove corrupted instances immediately and evict dead instances
Problem: Dead instances pile up indefinitely. Failed metadata parsing leaves stale data in registry. No cleanup mechanism exists.

Changes:
1. Remove instance from registry on parse failure (corrupted metadata = unrecoverable)
2. Evict instances with state="DEAD" on next health check (was only evicting by heartbeat age)

This prevents:
- Memory leak from accumulating dead/corrupted instances
- Stale data persisting after parse failures
- Dead instances blocking resources indefinitely

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-05-13 18:25:12 +02:00
17 changed files with 340 additions and 84 deletions
+7 -5
View File
@@ -235,10 +235,10 @@
- function rebalanceInterval
- function rebalanceMinInterval
- function heartbeatTtl
- _...12 more_
- _...14 more_
- `modules/coordinator/src/main/scala/de/nowchess/coordinator/config/JacksonConfig.scala` — class JacksonConfig, function customize
- `modules/coordinator/src/main/scala/de/nowchess/coordinator/config/NativeReflectionConfig.scala` — class NativeReflectionConfig
- `modules/coordinator/src/main/scala/de/nowchess/coordinator/grpc/CoordinatorGrpcServer.scala` — class CoordinatorGrpcServer
- `modules/coordinator/src/main/scala/de/nowchess/coordinator/grpc/CoordinatorGrpcServer.scala` — class CoordinatorGrpcServer, function hasActiveStream
- `modules/coordinator/src/main/scala/de/nowchess/coordinator/grpc/CoreGrpcClient.scala`
- class CoreGrpcClient
- function shutdown
@@ -256,6 +256,7 @@
- `modules/coordinator/src/main/scala/de/nowchess/coordinator/service/AutoScaler.scala`
- class AutoScaler
- function initMetrics
- function periodicScaleCheck
- function checkAndScale
- function scaleUp
- function scaleDown
@@ -272,16 +273,17 @@
- class HealthMonitor
- function setRedisPrefix
- function initializeMetrics
- function onStartup
- function periodicHealthCheck
- function checkInstanceHealth
- function watchK8sPods
- `modules/coordinator/src/main/scala/de/nowchess/coordinator/service/InstanceRegistry.scala`
- class InstanceRegistry
- function initMetrics
- function setRedisPrefix
- function loadAllFromRedis
- function getInstance
- function getAllInstances
- function updateInstanceFromRedis
- _...3 more_
- _...4 more_
- `modules/coordinator/src/main/scala/de/nowchess/coordinator/service/LoadBalancer.scala`
- class LoadBalancer
- function setRedisPrefix
+7 -5
View File
@@ -179,10 +179,10 @@
- function rebalanceInterval
- function rebalanceMinInterval
- function heartbeatTtl
- _...12 more_
- _...14 more_
- `modules/coordinator/src/main/scala/de/nowchess/coordinator/config/JacksonConfig.scala` — class JacksonConfig, function customize
- `modules/coordinator/src/main/scala/de/nowchess/coordinator/config/NativeReflectionConfig.scala` — class NativeReflectionConfig
- `modules/coordinator/src/main/scala/de/nowchess/coordinator/grpc/CoordinatorGrpcServer.scala` — class CoordinatorGrpcServer
- `modules/coordinator/src/main/scala/de/nowchess/coordinator/grpc/CoordinatorGrpcServer.scala` — class CoordinatorGrpcServer, function hasActiveStream
- `modules/coordinator/src/main/scala/de/nowchess/coordinator/grpc/CoreGrpcClient.scala`
- class CoreGrpcClient
- function shutdown
@@ -200,6 +200,7 @@
- `modules/coordinator/src/main/scala/de/nowchess/coordinator/service/AutoScaler.scala`
- class AutoScaler
- function initMetrics
- function periodicScaleCheck
- function checkAndScale
- function scaleUp
- function scaleDown
@@ -216,16 +217,17 @@
- class HealthMonitor
- function setRedisPrefix
- function initializeMetrics
- function onStartup
- function periodicHealthCheck
- function checkInstanceHealth
- function watchK8sPods
- `modules/coordinator/src/main/scala/de/nowchess/coordinator/service/InstanceRegistry.scala`
- class InstanceRegistry
- function initMetrics
- function setRedisPrefix
- function loadAllFromRedis
- function getInstance
- function getAllInstances
- function updateInstanceFromRedis
- _...3 more_
- _...4 more_
- `modules/coordinator/src/main/scala/de/nowchess/coordinator/service/LoadBalancer.scala`
- class LoadBalancer
- function setRedisPrefix
+72
View File
@@ -366,3 +366,75 @@
* replace null checks with Option in coordinator ([2b04d7f](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2b04d7fa713e06662bff5afe3fb3f9d04541ce51))
* update grpcServer variable to use Instance wrapper and add optional access method ([d5c8da2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d5c8da20f8805199e920ea5afbd9cdb39a078e40))
* update HealthMonitor to evict instances without associated pods ([0f41f13](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/0f41f13ce68b76846684bab67241a122250dfaf9))
## (2026-05-13)
### Features
* add coordinator startup validation and K8s pod watch ([81b045d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/81b045d01bb054a4bc9dc9e02fc30f814e756205))
* add initialization metrics for various services ([d438e97](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d438e97f32bdde0bfc63c1b4a8cc810cdd093166))
* add OpenTelemetry trace configuration with parentbased sampler ([3904d5a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3904d5ad8ad4930ddee65287a7bfab785a6148f5))
* add periodic health check to evict dead instances ([380a2cc](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/380a2cceeb5873bf93ff17a1e87d62408ef8e178))
* **config:** update application.yml for PostgreSQL and remove staging/production configurations ([2404e61](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2404e6164c3b50ffccbea5238d636060d6abe4d6))
* **config:** update application.yml for staging and production environments ([6113432](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/6113432a14c476a3a0dfc0d449e17d023697f2ba))
* configure logging and add OpenTelemetry support ([#49](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/49)) ([d57c488](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d57c4886612d1d92da0e1b79209fc83e6ef537a1))
* **docker:** add .dockerignore and .gitignore files for build exclusions ([c987d8e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c987d8e258c0e6c4cfbdaa8381c64c410d7a2b83))
* **docker:** add Dockerfiles for building Quarkus application in native and JVM modes ([3f2d2bb](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f2d2bb4c97fa8cddba66e1da4427c54236dfeed))
* **docker:** add Dockerfiles for Quarkus application in JVM and native modes ([34b9933](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/34b993304670cf2aa62cd2f6460cee7b9864b08e))
* **logging:** add DEBUG/INFO/WARN logging across services (NCS-72) ([#41](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/41)) ([804a4bf](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/804a4bf179e3dfb19e2be4390e7e543caf5237c6))
* NCS-78 Add Traceability to the Applications ([#46](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/46)) ([649566e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/649566eb3fcf38f91c8896a739f74ea318af312d))
* NCS-78 Add Traceability to the Applications ([#47](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/47)) ([87dfc6c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/87dfc6c2bcce7f7d58fc641bd8d468a2e584c108))
* true-microservices ([#40](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/40)) ([5909242](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/590924254e8a2754de661a57a03e43f89ceb6299))
### Bug Fixes
* add instance-dead-timeout configuration and update HealthMonitor to use it for stale instance eviction ([be0b710](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/be0b710543b542da5c301efef7d2d587d0ba758a))
* clean up code formatting and improve error handling in gRPC server and failover service ([ad9495a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/ad9495afa3e93593b57154a187346c9b01393911))
* **coordinator:** refine type casting in rolloutSpec method ([#45](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/45)) ([d522f7f](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d522f7f6edf9c985f03dd16816439d4184f1a589))
* **coordinator:** use genericKubernetesResources API for Argo Rollout scaling ([#43](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/43)) ([fa3c6b2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/fa3c6b2886dc59c14c5dad834acc9b41e42023bb))
* **coordinator:** use genericKubernetesResources API for Argo Rollout scaling ([#44](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/44)) ([82d0b75](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/82d0b754be1075084944b466858672d944f9f7d8))
* **dependencies:** replace Fabric8 Kubernetes client with Quarkus Kubernetes client ([5f44570](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5f44570b357277d09f33b7296860c421e2e70ce0))
* enhance AutoScaler and InstanceRegistry for replica management and stale instance eviction ([b4920d3](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b4920d3817e58bda94d7764e608b856ce9a909f7))
* **middleware:** update paths for bot generation and stockfish configuration ([2dd0501](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2dd0501687db08dcd242359f6837125baf8a2fdc))
* **redis:** update Redis configuration with max pool size and waiting parameters ([5baf6a7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5baf6a7cdbea484fc49c02e2b5a1c3919b7fa2c4))
* remove corrupted instances immediately and evict dead instances ([43184d2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/43184d296da5a6a7b760ac90c2b739220d86bce3))
* replace null checks with Option in coordinator ([2b04d7f](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2b04d7fa713e06662bff5afe3fb3f9d04541ce51))
* streamline logging for evicted instances in InstanceRegistry ([10937e7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/10937e756a56e0e8fcf939decfdcaa4394506cc0))
* update grpcServer variable to use Instance wrapper and add optional access method ([d5c8da2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d5c8da20f8805199e920ea5afbd9cdb39a078e40))
* update HealthMonitor to evict instances without associated pods ([0f41f13](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/0f41f13ce68b76846684bab67241a122250dfaf9))
## (2026-05-13)
### Features
* add coordinator startup validation and K8s pod watch ([81b045d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/81b045d01bb054a4bc9dc9e02fc30f814e756205))
* add initialization metrics for various services ([d438e97](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d438e97f32bdde0bfc63c1b4a8cc810cdd093166))
* add OpenTelemetry trace configuration with parentbased sampler ([3904d5a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3904d5ad8ad4930ddee65287a7bfab785a6148f5))
* add periodic health check to evict dead instances ([380a2cc](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/380a2cceeb5873bf93ff17a1e87d62408ef8e178))
* **config:** update application.yml for PostgreSQL and remove staging/production configurations ([2404e61](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2404e6164c3b50ffccbea5238d636060d6abe4d6))
* **config:** update application.yml for staging and production environments ([6113432](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/6113432a14c476a3a0dfc0d449e17d023697f2ba))
* configure logging and add OpenTelemetry support ([#49](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/49)) ([d57c488](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d57c4886612d1d92da0e1b79209fc83e6ef537a1))
* **docker:** add .dockerignore and .gitignore files for build exclusions ([c987d8e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c987d8e258c0e6c4cfbdaa8381c64c410d7a2b83))
* **docker:** add Dockerfiles for building Quarkus application in native and JVM modes ([3f2d2bb](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f2d2bb4c97fa8cddba66e1da4427c54236dfeed))
* **docker:** add Dockerfiles for Quarkus application in JVM and native modes ([34b9933](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/34b993304670cf2aa62cd2f6460cee7b9864b08e))
* implement periodic scaling checks and enhance instance management in AutoScaler ([3f12f69](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f12f695f132b92f634d98df2c037292498b6e86))
* **logging:** add DEBUG/INFO/WARN logging across services (NCS-72) ([#41](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/41)) ([804a4bf](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/804a4bf179e3dfb19e2be4390e7e543caf5237c6))
* NCS-78 Add Traceability to the Applications ([#46](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/46)) ([649566e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/649566eb3fcf38f91c8896a739f74ea318af312d))
* NCS-78 Add Traceability to the Applications ([#47](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/47)) ([87dfc6c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/87dfc6c2bcce7f7d58fc641bd8d468a2e584c108))
* true-microservices ([#40](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/40)) ([5909242](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/590924254e8a2754de661a57a03e43f89ceb6299))
### Bug Fixes
* add instance-dead-timeout configuration and update HealthMonitor to use it for stale instance eviction ([be0b710](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/be0b710543b542da5c301efef7d2d587d0ba758a))
* clean up code formatting and improve error handling in gRPC server and failover service ([ad9495a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/ad9495afa3e93593b57154a187346c9b01393911))
* coordinator auto-scaling, cache eviction, rebalancing, and grpc timeouts ([d0c7169](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d0c71693bb6f55fafdce5bcea0d5f38b9bb505ef))
* **coordinator:** refine type casting in rolloutSpec method ([#45](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/45)) ([d522f7f](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d522f7f6edf9c985f03dd16816439d4184f1a589))
* **coordinator:** use genericKubernetesResources API for Argo Rollout scaling ([#43](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/43)) ([fa3c6b2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/fa3c6b2886dc59c14c5dad834acc9b41e42023bb))
* **coordinator:** use genericKubernetesResources API for Argo Rollout scaling ([#44](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/44)) ([82d0b75](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/82d0b754be1075084944b466858672d944f9f7d8))
* **dependencies:** replace Fabric8 Kubernetes client with Quarkus Kubernetes client ([5f44570](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5f44570b357277d09f33b7296860c421e2e70ce0))
* enhance AutoScaler and InstanceRegistry for replica management and stale instance eviction ([b4920d3](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b4920d3817e58bda94d7764e608b856ce9a909f7))
* **middleware:** update paths for bot generation and stockfish configuration ([2dd0501](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2dd0501687db08dcd242359f6837125baf8a2fdc))
* **redis:** update Redis configuration with max pool size and waiting parameters ([5baf6a7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5baf6a7cdbea484fc49c02e2b5a1c3919b7fa2c4))
* remove corrupted instances immediately and evict dead instances ([43184d2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/43184d296da5a6a7b760ac90c2b739220d86bce3))
* replace null checks with Option in coordinator ([2b04d7f](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2b04d7fa713e06662bff5afe3fb3f9d04541ce51))
* streamline logging for evicted instances in InstanceRegistry ([10937e7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/10937e756a56e0e8fcf939decfdcaa4394506cc0))
* update grpcServer variable to use Instance wrapper and add optional access method ([d5c8da2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d5c8da20f8805199e920ea5afbd9cdb39a078e40))
* update HealthMonitor to evict instances without associated pods ([0f41f13](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/0f41f13ce68b76846684bab67241a122250dfaf9))
+1
View File
@@ -78,6 +78,7 @@ dependencies {
implementation("com.fasterxml.jackson.module:jackson-module-scala_3:${versions["JACKSON_SCALA"]!!}")
implementation("io.quarkus:quarkus-redis-client")
implementation("io.quarkus:quarkus-kubernetes-client")
implementation("io.quarkus:quarkus-scheduler")
testImplementation(platform("org.junit:junit-bom:${versions["JUNIT_BOM"]!!}"))
testImplementation("org.junit.jupiter:junit-jupiter")
@@ -37,7 +37,7 @@ nowchess:
stream-heartbeat-interval: PT0.2S
cache-eviction-interval: 10m
game-idle-threshold: 45m
auto-scale-enabled: false
auto-scale-enabled: true
scale-up-threshold: 0.8
scale-down-threshold: 0.3
scale-min-replicas: 2
@@ -39,7 +39,7 @@ class CoreGrpcClient:
def batchResubscribeGames(host: String, port: Int, gameIds: List[String]): Int =
try
val stub = CoordinatorServiceGrpc.newBlockingStub(getChannel(host, port))
val stub = CoordinatorServiceGrpc.newBlockingStub(getChannel(host, port)).withDeadlineAfter(5, TimeUnit.SECONDS)
val request = BatchResubscribeRequest.newBuilder().addAllGameIds(gameIds.asJava).build()
val count = stub.batchResubscribeGames(request).getSubscribedCount
log.debugf("batchResubscribeGames %s:%d — subscribed %d games", host, port, count)
@@ -52,7 +52,7 @@ class CoreGrpcClient:
def unsubscribeGames(host: String, port: Int, gameIds: List[String]): Int =
try
val stub = CoordinatorServiceGrpc.newBlockingStub(getChannel(host, port))
val stub = CoordinatorServiceGrpc.newBlockingStub(getChannel(host, port)).withDeadlineAfter(5, TimeUnit.SECONDS)
val request = UnsubscribeGamesRequest.newBuilder().addAllGameIds(gameIds.asJava).build()
val count = stub.unsubscribeGames(request).getUnsubscribedCount
log.debugf("unsubscribeGames %s:%d — unsubscribed %d games", host, port, count)
@@ -65,7 +65,7 @@ class CoreGrpcClient:
def evictGames(host: String, port: Int, gameIds: List[String]): Int =
try
val stub = CoordinatorServiceGrpc.newBlockingStub(getChannel(host, port))
val stub = CoordinatorServiceGrpc.newBlockingStub(getChannel(host, port)).withDeadlineAfter(5, TimeUnit.SECONDS)
val request = EvictGamesRequest.newBuilder().addAllGameIds(gameIds.asJava).build()
val count = stub.evictGames(request).getEvictedCount
log.debugf("evictGames %s:%d — evicted %d games", host, port, count)
@@ -8,6 +8,7 @@ import de.nowchess.coordinator.config.CoordinatorConfig
import io.fabric8.kubernetes.api.model.GenericKubernetesResource
import io.fabric8.kubernetes.client.KubernetesClient
import io.micrometer.core.instrument.{Gauge, MeterRegistry}
import io.quarkus.scheduler.Scheduled
import org.jboss.logging.Logger
import java.util.concurrent.atomic.AtomicReference
@@ -25,6 +26,12 @@ class AutoScaler:
@Inject
private var instanceRegistry: InstanceRegistry = uninitialized
@Inject
private var loadBalancer: LoadBalancer = uninitialized
@Inject
private var failoverService: FailoverService = uninitialized
@Inject
private var meterRegistry: MeterRegistry = uninitialized
// scalafix:on DisableSyntax.var
@@ -51,6 +58,11 @@ class AutoScaler:
meterRegistry.counter("nowchess.coordinator.scale.failures", "direction", "down").increment(0)
()
@Scheduled(every = "10s")
def periodicScaleCheck(): Unit =
try checkAndScale
catch case ex: Exception => log.warnf(ex, "Auto-scale check failed")
// scalafix:off DisableSyntax.asInstanceOf
private def rolloutSpec(rollout: GenericKubernetesResource): Option[java.util.Map[String, AnyRef]] =
Option(rollout.get[AnyRef]("spec")).collect { case m: java.util.Map[?, ?] =>
@@ -105,6 +117,7 @@ class AutoScaler:
currentReplicas,
currentReplicas + 1,
)
loadBalancer.rebalance
else log.infof("Already at max replicas %d for %s", maxReplicas, config.k8sRolloutName)
case _ => ()
}
@@ -116,43 +129,61 @@ class AutoScaler:
def scaleDown(): Unit =
log.info("Scaling down Argo Rollout")
kubeClientOpt match
case None =>
log.warn("Kubernetes client not available, cannot scale")
case Some(kube) =>
try
Option(
kube
.genericKubernetesResources(argoApiVersion, argoKind)
.inNamespace(config.k8sNamespace)
.withName(config.k8sRolloutName)
.get(),
).foreach { rollout =>
rolloutSpec(rollout).foreach { spec =>
spec.get("replicas") match
case replicas: Integer =>
val currentReplicas = replicas.intValue()
val minReplicas = config.scaleMinReplicas
val underloadedInstance = instanceRegistry.getAllInstances
.filter(_.state == "HEALTHY")
.minByOption(_.subscriptionCount)
if currentReplicas > minReplicas then
spec.put("replicas", Integer.valueOf(currentReplicas - 1))
underloadedInstance.foreach { inst =>
log.infof("Draining instance %s before scale-down", inst.instanceId)
failoverService
.onInstanceStreamDropped(inst.instanceId)
.subscribe()
.`with`(
_ =>
kubeClientOpt match
case None =>
log.warn("Kubernetes client not available, cannot scale")
case Some(kube) =>
try
Option(
kube
.genericKubernetesResources(argoApiVersion, argoKind)
.inNamespace(config.k8sNamespace)
.resource(rollout)
.update()
meterRegistry.counter("nowchess.coordinator.scale.events", "direction", "down").increment()
log.infof(
"Scaled down %s from %d to %d replicas",
config.k8sRolloutName,
currentReplicas,
currentReplicas - 1,
)
else log.infof("Already at min replicas %d for %s", minReplicas, config.k8sRolloutName)
case _ => ()
}
}
catch
case ex: Exception =>
.withName(config.k8sRolloutName)
.get(),
).foreach { rollout =>
rolloutSpec(rollout).foreach { spec =>
spec.get("replicas") match
case replicas: Integer =>
val currentReplicas = replicas.intValue()
val minReplicas = config.scaleMinReplicas
if currentReplicas > minReplicas then
spec.put("replicas", Integer.valueOf(currentReplicas - 1))
kube
.genericKubernetesResources(argoApiVersion, argoKind)
.inNamespace(config.k8sNamespace)
.resource(rollout)
.update()
meterRegistry.counter("nowchess.coordinator.scale.events", "direction", "down").increment()
log.infof(
"Scaled down %s from %d to %d replicas",
config.k8sRolloutName,
currentReplicas,
currentReplicas - 1,
)
else log.infof("Already at min replicas %d for %s", minReplicas, config.k8sRolloutName)
case _ => ()
}
}
catch
case ex: Exception =>
meterRegistry.counter("nowchess.coordinator.scale.failures", "direction", "down").increment()
log.warnf(ex, "Failed to scale down %s", config.k8sRolloutName),
ex =>
meterRegistry.counter("nowchess.coordinator.scale.failures", "direction", "down").increment()
log.warnf(ex, "Failed to scale down %s", config.k8sRolloutName)
log.warnf(ex, "Failed to drain instance %s before scale-down", inst.instanceId),
)
}
if underloadedInstance.isEmpty then log.warn("No healthy instances found for scale-down")
@@ -7,6 +7,7 @@ import io.quarkus.redis.datasource.RedisDataSource
import de.nowchess.coordinator.config.CoordinatorConfig
import com.fasterxml.jackson.databind.ObjectMapper
import io.micrometer.core.instrument.MeterRegistry
import io.quarkus.scheduler.Scheduled
import scala.jdk.CollectionConverters.*
import org.jboss.logging.Logger
import scala.compiletime.uninitialized
@@ -48,6 +49,11 @@ class CacheEvictionManager:
meterRegistry.timer("nowchess.coordinator.cache.eviction.duration").record(0L, TimeUnit.MILLISECONDS)
meterRegistry.counter("nowchess.coordinator.cache.evictions").increment(0)
@Scheduled(every = "5m")
def periodicCacheEviction(): Unit =
try evictStaleGames
catch case ex: Exception => log.warnf(ex, "Periodic cache eviction failed")
def evictStaleGames: Unit =
meterRegistry.timer("nowchess.coordinator.cache.eviction.duration").record((() => runEviction()): Runnable)
@@ -5,6 +5,7 @@ import jakarta.enterprise.context.ApplicationScoped
import jakarta.enterprise.event.Observes
import jakarta.enterprise.inject.Instance
import jakarta.inject.Inject
import io.quarkus.scheduler.Scheduled
import de.nowchess.coordinator.config.CoordinatorConfig
import io.fabric8.kubernetes.client.KubernetesClient
import io.fabric8.kubernetes.api.model.Pod
@@ -73,7 +74,12 @@ class HealthMonitor:
Thread.ofVirtual().start(() => validateStartupInstances(timeoutMs))
startPodWatch()
def checkInstanceHealth: Unit =
@Scheduled(every = "10s")
def periodicHealthCheck(): Unit =
try checkInstanceHealth()
catch case ex: Exception => log.warnf(ex, "Health check failed")
def checkInstanceHealth(): Unit =
meterRegistry.counter("nowchess.coordinator.health.checks").increment()
val evicted = instanceRegistry.evictStaleInstances(config.instanceDeadTimeout)
if evicted.nonEmpty then
@@ -76,23 +76,32 @@ class InstanceRegistry:
.onItem()
.transformToUni { value =>
try
val metadata = mapper.readValue(value, classOf[InstanceMetadata])
val isNew = !instances.containsKey(instanceId)
instances.put(instanceId, metadata)
if isNew then
meterRegistry.counter("nowchess.coordinator.instances.joined").increment()
log.infof("Instance %s joined registry (subscriptions=%d)", instanceId, metadata.subscriptionCount)
else
log.debugf(
"Instance %s updated (subscriptions=%d state=%s)",
instanceId,
metadata.subscriptionCount,
metadata.state,
)
Uni.createFrom().item(())
Option(value).fold(
{
log.debugf("Instance %s metadata missing from Redis (may have expired)", instanceId)
Uni.createFrom().item(())
},
) { json =>
val metadata = mapper.readValue(json, classOf[InstanceMetadata])
val isNew = !instances.containsKey(instanceId)
instances.put(instanceId, metadata)
if isNew then
meterRegistry.counter("nowchess.coordinator.instances.joined").increment()
log.infof("Instance %s joined registry (subscriptions=%d)", instanceId, metadata.subscriptionCount)
else
log.debugf(
"Instance %s updated (subscriptions=%d state=%s)",
instanceId,
metadata.subscriptionCount,
metadata.state,
)
Uni.createFrom().item(())
}
catch
case ex: Exception =>
log.warnf(ex, "Failed to parse instance metadata for %s", instanceId)
log.warnf(ex, "Failed to parse instance metadata for %s — removing from registry", instanceId)
instances.remove(instanceId)
meterRegistry.counter("nowchess.coordinator.instances.removed").increment()
Uni.createFrom().item(())
}
.onFailure()
@@ -112,15 +121,19 @@ class InstanceRegistry:
val stale = instances.asScala
.collect { case (id, inst) =>
try
if Instant.parse(inst.lastHeartbeat).isBefore(cutoff) then Some(id)
else None
val isHeartbeatStale = Instant.parse(inst.lastHeartbeat).isBefore(cutoff)
val isDead = inst.state == "DEAD"
if isHeartbeatStale || isDead then Some(id) else None
catch case _: Exception => None
}
.flatten
.toList
stale.foreach { id =>
instances.remove(id)
val inst = Option(instances.remove(id))
meterRegistry.counter("nowchess.coordinator.instances.evicted").increment()
log.warnf("Evicted stale instance %s (heartbeat older than %s)", id, maxAge)
inst.foreach { i =>
if i.state == "DEAD" then log.warnf("Evicted dead instance %s", id)
else log.warnf("Evicted stale instance %s (heartbeat older than %s)", id, maxAge)
}
}
stale
@@ -4,6 +4,7 @@ import jakarta.enterprise.context.ApplicationScoped
import jakarta.inject.Inject
import de.nowchess.coordinator.config.CoordinatorConfig
import io.quarkus.redis.datasource.RedisDataSource
import io.quarkus.scheduler.Scheduled
import org.jboss.logging.Logger
import scala.compiletime.uninitialized
import scala.concurrent.duration.*
@@ -125,3 +126,8 @@ class LoadBalancer:
catch
case ex: Exception =>
log.warnf(ex, "Failed to update Redis game sets")
@Scheduled(every = "30s")
def periodicRebalanceCheck(): Unit =
try if shouldRebalance then rebalance
catch case ex: Exception => log.warnf(ex, "Periodic rebalance check failed")
+1 -1
View File
@@ -1,3 +1,3 @@
MAJOR=0
MINOR=19
MINOR=21
PATCH=0
+119
View File
@@ -1253,3 +1253,122 @@
* Revert "feat: add authentication permissions for metrics endpoints in application.yml" ([a298417](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a298417b9e4d68dc73bbf40be63d9484536e9f83))
* Revert "refactor: update metrics paths formatting in application.yml for clarity" ([3870566](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/38705663498d5f47c40dafe2f26198589ede8656))
## (2026-05-13)
### Features
* add authentication permissions for metrics endpoints in application.yml ([04edd4d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/04edd4d6fd8a63196c36f6d67992832febc9bebb))
* add CORS configuration and reorder JWT settings in application.yml ([a49f9be](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a49f9be146f04c14561c305d980846a92f8c12b2))
* add GameRules stub with PositionStatus enum ([76d4168](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/76d4168038de23e5d6083d4e8f0504fbf31d15a3))
* add initialization metrics for various services ([d438e97](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d438e97f32bdde0bfc63c1b4a8cc810cdd093166))
* add MovedInCheck/Checkmate/Stalemate MoveResult variants (stub dispatch) ([8b7ec57](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/8b7ec57e5ea6ee1615a1883848a426dc07d26364))
* add OpenTelemetry trace configuration with parentbased sampler ([3904d5a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3904d5ad8ad4930ddee65287a7bfab785a6148f5))
* **config:** update application.yml for PostgreSQL and remove staging/production configurations ([2404e61](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2404e6164c3b50ffccbea5238d636060d6abe4d6))
* **config:** update application.yml for staging and production environments ([6113432](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/6113432a14c476a3a0dfc0d449e17d023697f2ba))
* configure logging and add OpenTelemetry support ([#49](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/49)) ([d57c488](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d57c4886612d1d92da0e1b79209fc83e6ef537a1))
* implement GameRules with isInCheck, legalMoves, gameStatus ([94a02ff](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/94a02ff6849436d9496c70a0f16c21666dae8e4e))
* implement legal castling ([#1](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/1)) ([00d326c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/00d326c1ba67711fbe180f04e1100c3f01dd0254))
* **logging:** add DEBUG/INFO/WARN logging across services (NCS-72) ([#41](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/41)) ([804a4bf](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/804a4bf179e3dfb19e2be4390e7e543caf5237c6))
* NCS-10 Implement Pawn Promotion ([#12](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/12)) ([13bfc16](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/13bfc16cfe25db78ec607db523ca6d993c13430c))
* NCS-11 50-move rule ([#9](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/9)) ([412ed98](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/412ed986a95703a3b282276540153480ceed229d))
* NCS-13 Implement Threefold Repetition ([#31](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/31)) ([767d305](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/767d3051a76c266050b6335774d66e2db2273c16))
* NCS-14 implemented insufficient moves rule ([#30](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/30)) ([b0399a4](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b0399a4e489950083066c9538df9a84dcc7a4613))
* NCS-16 Core Separation via Patterns ([#10](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/10)) ([1361dfc](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1361dfc89553b146864fb8ff3526cf12cf3f293a))
* NCS-17 Implement basic ScalaFX UI ([#14](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/14)) ([3ff8031](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3ff80318b4f16c59733a46498581a5c27f048287))
* NCS-21 Write Scripts to automate certain tasks ([#15](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/15)) ([8051871](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/80518719d536a087d339fe02530825dc07f8b388))
* NCS-25 Add linters to keep quality up ([#27](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/27)) ([fd4e67d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/fd4e67d4f782a7e955822d90cb909d0a81676fb2))
* NCS-37 Quarkus integration ([#35](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/35)) ([f088c4e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/f088c4e9ffcc498d3d1b6f01e8f50042d5830d55))
* NCS-40 Rework Draw System ([#34](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/34)) ([33e785d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/33e785d22af87724839b62ae91dfe74a05b398c3))
* NCS-41 Bot Platform ([#33](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/33)) ([8744bee](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/8744bee2dd20966dae90a09c21a43d5b06f59e00))
* NCS-53 changed IO to MicroService for easier scaling ([#37](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/37)) ([b5a2966](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b5a2966adafa9650f0f7d601bdeb8fdd13710327))
* NCS-6 Implementing FEN & PGN ([#7](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/7)) ([f28e69d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/f28e69dc181416aa2f221fdc4b45c2cda5efbf07))
* NCS-78 Add Traceability to the Applications ([#48](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/48)) ([c96a09b](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c96a09bb5cee59fc23205bb63baa8b217a7e1b00))
* NCS-9 En passant implementation ([#8](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/8)) ([919beb3](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/919beb3b4bfa8caf2f90976a415fe9b19b7e9747))
* **rule:** Rules as a microservice ([#39](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/39)) ([093134d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/093134d36c6844ba02a36a28d5d044f09291cd1d))
* true-microservices ([#40](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/40)) ([5909242](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/590924254e8a2754de661a57a03e43f89ceb6299))
* update application.yml with new API root paths and add Micrometer and OpenTelemetry dependencies ([72ce262](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/72ce262bc491f94297700e6002fb5d0812e2cc2a))
* wire check/checkmate/stalemate into processMove and gameLoop ([5264a22](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5264a225418b885c5e6ea6411b96f85e38837f6c))
### Bug Fixes
* add missing kings to gameLoop capture test board ([aedd787](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/aedd787b77203c2af934751dba7b784eaf165032))
* **auth:** change InternalAuthFilter to use @Singleton and add HTTP tests for secret validation ([c08d530](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c08d5303eb9e70d36c8eebf6a061ccb71e118fe5))
* **auth:** update InternalAuthFilter to use @ApplicationScoped and add index-dependency configuration ([6e0fd95](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/6e0fd9523e001756ce7109e639ebb54be4fcdabf))
* correct test board positions and captureOutput/withInput interaction ([f0481e2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/f0481e2561b779df00925b46ee281dc36a795150))
* **heartbeat:** inject ObjectMapper into InstanceHeartbeatService ([#42](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/42)) ([0c98151](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/0c981517da1f94cd10ae396e47bde2b35d0b3ba0))
* IO microservice ([#38](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/38)) ([a386f57](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a386f57c21d34ead6cc6f92836c52b714597e289))
* Lints ([dc224ab](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/dc224abe26acf5361c56956006e1cc51b75b0b7e))
* **redis:** add max pool wait time and switch to ReactiveRedisDataSource for heartbeat updates ([33e5017](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/33e5017f51a998327b180f778f73964cc10c05d3))
* **redis:** enhance GameRedisSubscriberManager to use ReactiveRedisDataSource and improve subscription handling ([0eb752d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/0eb752d4935377f75aab710b7f4eda4b29098e6a))
* **redis:** prevent concurrent Redis heartbeat refreshes using AtomicBoolean ([847b132](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/847b13202cb909d18ca3304c27ebe17ce2312b8e))
* **redis:** simplify refreshRedisHeartbeat logic and ensure proper error handling ([1813ea1](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1813ea1d2d5d093f7925f87371b5e29820bf1136))
* **redis:** update Redis configuration with max pool size and waiting parameters ([5baf6a7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5baf6a7cdbea484fc49c02e2b5a1c3919b7fa2c4))
* remove unused HTTP root-path configurations from application.yml ([3ed3e59](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3ed3e59ee456d54cd3d65ece4f36623e256b9736))
* update documentation to reflect new functions in CoordinatorGrpcServer and InstanceRegistry ([f7ce4df](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/f7ce4df595cbdc2ef84122781f4851ff140c0f44))
* update main class path in build configuration and adjust VCS directory mapping ([7b1f8b1](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/7b1f8b117623d327232a1a92a8a44d18582e0189))
* update move validation to check for king safety ([#13](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/13)) ([e5e20c5](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/e5e20c566e368b12ca1dc59680c34e9112bf6762))
### Reverts
* Revert "feat: add authentication permissions for metrics endpoints in application.yml" ([a298417](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a298417b9e4d68dc73bbf40be63d9484536e9f83))
* Revert "refactor: update metrics paths formatting in application.yml for clarity" ([3870566](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/38705663498d5f47c40dafe2f26198589ede8656))
## (2026-05-13)
### Features
* add authentication permissions for metrics endpoints in application.yml ([04edd4d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/04edd4d6fd8a63196c36f6d67992832febc9bebb))
* add CORS configuration and reorder JWT settings in application.yml ([a49f9be](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a49f9be146f04c14561c305d980846a92f8c12b2))
* add GameRules stub with PositionStatus enum ([76d4168](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/76d4168038de23e5d6083d4e8f0504fbf31d15a3))
* add initialization metrics for various services ([d438e97](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d438e97f32bdde0bfc63c1b4a8cc810cdd093166))
* add MovedInCheck/Checkmate/Stalemate MoveResult variants (stub dispatch) ([8b7ec57](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/8b7ec57e5ea6ee1615a1883848a426dc07d26364))
* add OpenTelemetry trace configuration with parentbased sampler ([3904d5a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3904d5ad8ad4930ddee65287a7bfab785a6148f5))
* **config:** update application.yml for PostgreSQL and remove staging/production configurations ([2404e61](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2404e6164c3b50ffccbea5238d636060d6abe4d6))
* **config:** update application.yml for staging and production environments ([6113432](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/6113432a14c476a3a0dfc0d449e17d023697f2ba))
* configure logging and add OpenTelemetry support ([#49](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/49)) ([d57c488](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d57c4886612d1d92da0e1b79209fc83e6ef537a1))
* implement GameRules with isInCheck, legalMoves, gameStatus ([94a02ff](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/94a02ff6849436d9496c70a0f16c21666dae8e4e))
* implement legal castling ([#1](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/1)) ([00d326c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/00d326c1ba67711fbe180f04e1100c3f01dd0254))
* implement periodic scaling checks and enhance instance management in AutoScaler ([3f12f69](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f12f695f132b92f634d98df2c037292498b6e86))
* **logging:** add DEBUG/INFO/WARN logging across services (NCS-72) ([#41](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/41)) ([804a4bf](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/804a4bf179e3dfb19e2be4390e7e543caf5237c6))
* NCS-10 Implement Pawn Promotion ([#12](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/12)) ([13bfc16](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/13bfc16cfe25db78ec607db523ca6d993c13430c))
* NCS-11 50-move rule ([#9](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/9)) ([412ed98](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/412ed986a95703a3b282276540153480ceed229d))
* NCS-13 Implement Threefold Repetition ([#31](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/31)) ([767d305](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/767d3051a76c266050b6335774d66e2db2273c16))
* NCS-14 implemented insufficient moves rule ([#30](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/30)) ([b0399a4](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b0399a4e489950083066c9538df9a84dcc7a4613))
* NCS-16 Core Separation via Patterns ([#10](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/10)) ([1361dfc](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1361dfc89553b146864fb8ff3526cf12cf3f293a))
* NCS-17 Implement basic ScalaFX UI ([#14](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/14)) ([3ff8031](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3ff80318b4f16c59733a46498581a5c27f048287))
* NCS-21 Write Scripts to automate certain tasks ([#15](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/15)) ([8051871](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/80518719d536a087d339fe02530825dc07f8b388))
* NCS-25 Add linters to keep quality up ([#27](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/27)) ([fd4e67d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/fd4e67d4f782a7e955822d90cb909d0a81676fb2))
* NCS-37 Quarkus integration ([#35](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/35)) ([f088c4e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/f088c4e9ffcc498d3d1b6f01e8f50042d5830d55))
* NCS-40 Rework Draw System ([#34](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/34)) ([33e785d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/33e785d22af87724839b62ae91dfe74a05b398c3))
* NCS-41 Bot Platform ([#33](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/33)) ([8744bee](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/8744bee2dd20966dae90a09c21a43d5b06f59e00))
* NCS-53 changed IO to MicroService for easier scaling ([#37](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/37)) ([b5a2966](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b5a2966adafa9650f0f7d601bdeb8fdd13710327))
* NCS-6 Implementing FEN & PGN ([#7](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/7)) ([f28e69d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/f28e69dc181416aa2f221fdc4b45c2cda5efbf07))
* NCS-78 Add Traceability to the Applications ([#48](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/48)) ([c96a09b](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c96a09bb5cee59fc23205bb63baa8b217a7e1b00))
* NCS-9 En passant implementation ([#8](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/8)) ([919beb3](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/919beb3b4bfa8caf2f90976a415fe9b19b7e9747))
* **rule:** Rules as a microservice ([#39](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/39)) ([093134d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/093134d36c6844ba02a36a28d5d044f09291cd1d))
* true-microservices ([#40](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/40)) ([5909242](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/590924254e8a2754de661a57a03e43f89ceb6299))
* update application.yml with new API root paths and add Micrometer and OpenTelemetry dependencies ([72ce262](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/72ce262bc491f94297700e6002fb5d0812e2cc2a))
* wire check/checkmate/stalemate into processMove and gameLoop ([5264a22](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5264a225418b885c5e6ea6411b96f85e38837f6c))
### Bug Fixes
* add missing kings to gameLoop capture test board ([aedd787](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/aedd787b77203c2af934751dba7b784eaf165032))
* **auth:** change InternalAuthFilter to use @Singleton and add HTTP tests for secret validation ([c08d530](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c08d5303eb9e70d36c8eebf6a061ccb71e118fe5))
* **auth:** update InternalAuthFilter to use @ApplicationScoped and add index-dependency configuration ([6e0fd95](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/6e0fd9523e001756ce7109e639ebb54be4fcdabf))
* correct test board positions and captureOutput/withInput interaction ([f0481e2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/f0481e2561b779df00925b46ee281dc36a795150))
* **heartbeat:** inject ObjectMapper into InstanceHeartbeatService ([#42](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/42)) ([0c98151](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/0c981517da1f94cd10ae396e47bde2b35d0b3ba0))
* IO microservice ([#38](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/38)) ([a386f57](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a386f57c21d34ead6cc6f92836c52b714597e289))
* Lints ([dc224ab](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/dc224abe26acf5361c56956006e1cc51b75b0b7e))
* **redis:** add max pool wait time and switch to ReactiveRedisDataSource for heartbeat updates ([33e5017](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/33e5017f51a998327b180f778f73964cc10c05d3))
* **redis:** enhance GameRedisSubscriberManager to use ReactiveRedisDataSource and improve subscription handling ([0eb752d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/0eb752d4935377f75aab710b7f4eda4b29098e6a))
* **redis:** prevent concurrent Redis heartbeat refreshes using AtomicBoolean ([847b132](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/847b13202cb909d18ca3304c27ebe17ce2312b8e))
* **redis:** simplify refreshRedisHeartbeat logic and ensure proper error handling ([1813ea1](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1813ea1d2d5d093f7925f87371b5e29820bf1136))
* **redis:** update Redis configuration with max pool size and waiting parameters ([5baf6a7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5baf6a7cdbea484fc49c02e2b5a1c3919b7fa2c4))
* remove unused HTTP root-path configurations from application.yml ([3ed3e59](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3ed3e59ee456d54cd3d65ece4f36623e256b9736))
* update documentation to reflect new functions in CoordinatorGrpcServer and InstanceRegistry ([f7ce4df](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/f7ce4df595cbdc2ef84122781f4851ff140c0f44))
* update main class path in build configuration and adjust VCS directory mapping ([7b1f8b1](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/7b1f8b117623d327232a1a92a8a44d18582e0189))
* update move validation to check for king safety ([#13](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/13)) ([e5e20c5](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/e5e20c566e368b12ca1dc59680c34e9112bf6762))
### Reverts
* Revert "feat: add authentication permissions for metrics endpoints in application.yml" ([a298417](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a298417b9e4d68dc73bbf40be63d9484536e9f83))
* Revert "refactor: update metrics paths formatting in application.yml for clarity" ([3870566](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/38705663498d5f47c40dafe2f26198589ede8656))
@@ -60,8 +60,8 @@ class GameEngine(
@SuppressWarnings(Array("DisableSyntax.var"))
private var pendingTakebackRequest: Option[Color] = initialTakebackRequest
GameEngine.activeGamesCount.incrementAndGet()
meterRegistry.foreach { reg =>
GameEngine.activeGamesCount.incrementAndGet()
reg.counter("nowchess.games.started").increment()
}
private def gamesCompletedCounter(result: String): Counter =
@@ -68,17 +68,15 @@ class GameRedisSubscriberManager:
heartbeatServiceOpt.foreach(_.addGameSubscription(gameId))
val handler: Consumer[String] = msg => handleC2sMessage(gameId, msg)
reactiveRedis
.pubsub(classOf[String])
.subscribe(c2sTopic(gameId), handler)
.subscribe()
.`with`(
subscriber => {
c2sListeners.put(gameId, subscriber)
log.debugf("Subscribed to game %s", gameId)
},
failure => log.warnf(failure, "Redis subscription failed for game %s", gameId),
)
try
val subscriber = reactiveRedis
.pubsub(classOf[String])
.subscribe(c2sTopic(gameId), handler)
.await()
.atMost(java.time.Duration.ofSeconds(5))
c2sListeners.put(gameId, subscriber)
log.debugf("Subscribed to game %s", gameId)
catch case ex: Exception => log.warnf(ex, "Redis subscription failed for game %s", gameId)
def unsubscribeGame(gameId: String): Unit =
Option(c2sListeners.remove(gameId)).foreach { subscriber =>
@@ -199,7 +199,7 @@ class InstanceHeartbeatService:
val json = mapper.writeValueAsString(metadata)
reactiveRedis
.value(classOf[String])
.setex(key, 5L, json)
.setex(key, 15L, json)
.subscribe()
.`with`(
_ => redisHeartbeatPending.set(false),
+1 -1
View File
@@ -1,3 +1,3 @@
MAJOR=0
MINOR=38
MINOR=40
PATCH=0