Compare commits

..

19 Commits

Author SHA1 Message Date
TeamCity b4c75e2a0f ci: bump version with Build-96 2026-05-17 20:01:45 +00:00
Janis 9bf995f47d fix: revert pod matching to original logic instanceId.contains(podName)
Build & Test (NowChessSystems) TeamCity build finished
Accidentally flipped pod matching direction in previous commits. Changed from correct instanceId.contains(podName) to incorrect podName.contains(instanceId), causing all health checks to fail.

Reverted all 3 locations to original working logic.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-05-17 21:35:16 +02:00
TeamCity 8df418627c ci: bump version with Build-95 2026-05-17 18:53:07 +00:00
Janis 6dbe1e62ac fix: correct pod matching logic from endsWith to contains
Build & Test (NowChessSystems) TeamCity build finished
Pod matching used endsWith(instanceId) which failed to match any pods because instanceId is randomly generated 8-char string, not pod name suffix. All instances marked dead causing cascading health check failures.

Changed to podName.contains(instanceId) to match instanceId embedded anywhere in pod name. Reverting incomplete fix from previous commit.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-05-17 20:26:08 +02:00
TeamCity 6311d8fd00 ci: bump version with Build-94 2026-05-17 17:07:55 +00:00
Janis 5205468534 fix: remove redundant line break in LoadBalancer.scala for improved readability
Build & Test (NowChessSystems) TeamCity build finished
2026-05-17 18:41:35 +02:00
Janis 4ec5b931de feat: integrate auto-scaler to handle instance health checks and scaling
Build & Test (NowChessSystems) TeamCity build failed
2026-05-17 18:01:55 +02:00
Janis ebba729af3 fix: ensure full hierarchy registration for reflection in NativeReflectionConfig 2026-05-17 17:59:54 +02:00
Janis 5619c8223a fix: resolve 6 coordinator bugs (cache eviction, rebalance race, pod matching, lookup inefficiency)
- Add lastUpdatedMs timestamp to GameCacheDto to track actual game updates instead of heartbeat time. Fix cache eviction incorrectly marking correspondence games as idle.
- Use atomic SPOP in LoadBalancer.getGamesToMove() to prevent concurrent rebalance calls from selecting same games for migration.
- Add game→instance reverse mapping (nowchess:game:$gameId:instance) to eliminate O(instances) linear scan during cache eviction.
- Fix HealthMonitor pod matching from loose contains() to reliable endsWith() to prevent matching unintended pods with similar names.
- Update FailoverService to maintain game→instance mappings when migrating games during failover.
- Update CacheEvictionManager to use game→instance mapping for O(1) lookup instead of O(n) instance scan.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-05-17 17:07:29 +02:00
Janis 2d76c001fe fix: refresh Redis TTL on instance heartbeat to prevent false DEAD marking
Instances were being incorrectly marked DEAD because their Redis key TTL was
not being refreshed on heartbeat. HealthMonitor.checkRedisHeartbeat() checks
pttl > 0, which fails when the TTL expires even if the instance is alive and
sending regular heartbeats.

Now pexpire(key, heartbeatTtl) is called on each heartbeat to keep the key
alive. Prevents scaling messages from undercounting healthy instances.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-05-17 14:50:34 +02:00
TeamCity b58bbbc782 ci: bump version with Build-93 2026-05-17 12:13:55 +00:00
Janis 1a02f9e186 fix: remove unused clearDrainingByPodName method and update HealthMonitor to clear draining instances
Build & Test (NowChessSystems) TeamCity build finished
2026-05-17 13:49:15 +02:00
TeamCity 255f43ddda ci: bump version with Build-92 2026-05-16 13:59:43 +00:00
Janis f109fe3860 fix: improve pod instance ID matching logic in AutoScaler and HealthMonitor
Build & Test (NowChessSystems) TeamCity build finished
2026-05-16 15:45:41 +02:00
Janis a07bf89fae feat: add configurable CPU and memory scaling thresholds for auto-scaling
Build & Test (NowChessSystems) TeamCity build finished
2026-05-16 15:22:56 +02:00
TeamCity b47ad7ef89 ci: bump version with Build-90 2026-05-16 13:15:48 +00:00
Janis 2e4ba43597 feat: implement clock expiry scanning and handling for game records (#54)
Build & Test (NowChessSystems) TeamCity build finished
Reviewed-on: #54
2026-05-16 15:09:04 +02:00
TeamCity b0d27d2de2 ci: bump version with Build-89 2026-05-16 11:31:37 +00:00
Janis 8f9eb12f66 feat: implement clock expiry scanning and handling for game records (#53)
Build & Test (NowChessSystems) TeamCity build finished
Reviewed-on: #53
2026-05-16 13:24:48 +02:00
22 changed files with 918 additions and 186 deletions
+354
View File
@@ -569,3 +569,357 @@
* streamline logging for evicted instances in InstanceRegistry ([10937e7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/10937e756a56e0e8fcf939decfdcaa4394506cc0)) * streamline logging for evicted instances in InstanceRegistry ([10937e7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/10937e756a56e0e8fcf939decfdcaa4394506cc0))
* update grpcServer variable to use Instance wrapper and add optional access method ([d5c8da2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d5c8da20f8805199e920ea5afbd9cdb39a078e40)) * update grpcServer variable to use Instance wrapper and add optional access method ([d5c8da2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d5c8da20f8805199e920ea5afbd9cdb39a078e40))
* update HealthMonitor to evict instances without associated pods ([0f41f13](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/0f41f13ce68b76846684bab67241a122250dfaf9)) * update HealthMonitor to evict instances without associated pods ([0f41f13](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/0f41f13ce68b76846684bab67241a122250dfaf9))
## (2026-05-16)
### Features
* add coordinator startup validation and K8s pod watch ([81b045d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/81b045d01bb054a4bc9dc9e02fc30f814e756205))
* add initialization metrics for various services ([d438e97](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d438e97f32bdde0bfc63c1b4a8cc810cdd093166))
* add OpenTelemetry trace configuration with parentbased sampler ([3904d5a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3904d5ad8ad4930ddee65287a7bfab785a6148f5))
* add periodic health check to evict dead instances ([380a2cc](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/380a2cceeb5873bf93ff17a1e87d62408ef8e178))
* **config:** update application.yml for PostgreSQL and remove staging/production configurations ([2404e61](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2404e6164c3b50ffccbea5238d636060d6abe4d6))
* **config:** update application.yml for staging and production environments ([6113432](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/6113432a14c476a3a0dfc0d449e17d023697f2ba))
* configure logging and add OpenTelemetry support ([#49](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/49)) ([d57c488](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d57c4886612d1d92da0e1b79209fc83e6ef537a1))
* **docker:** add .dockerignore and .gitignore files for build exclusions ([c987d8e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c987d8e258c0e6c4cfbdaa8381c64c410d7a2b83))
* **docker:** add Dockerfiles for building Quarkus application in native and JVM modes ([3f2d2bb](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f2d2bb4c97fa8cddba66e1da4427c54236dfeed))
* **docker:** add Dockerfiles for Quarkus application in JVM and native modes ([34b9933](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/34b993304670cf2aa62cd2f6460cee7b9864b08e))
* implement clock expiry scanning and handling for game records ([#53](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/53)) ([8f9eb12](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/8f9eb12f663efabe4dc72b94394438652ad0ef02))
* implement periodic scaling checks and enhance instance management in AutoScaler ([3f12f69](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f12f695f132b92f634d98df2c037292498b6e86))
* **logging:** add DEBUG/INFO/WARN logging across services (NCS-72) ([#41](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/41)) ([804a4bf](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/804a4bf179e3dfb19e2be4390e7e543caf5237c6))
* NCS-78 Add Traceability to the Applications ([#46](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/46)) ([649566e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/649566eb3fcf38f91c8896a739f74ea318af312d))
* NCS-78 Add Traceability to the Applications ([#47](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/47)) ([87dfc6c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/87dfc6c2bcce7f7d58fc641bd8d468a2e584c108))
* scale up on high CPU load, not just subscription count ([255e2da](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/255e2da33c37e186ed14f2862f2d2e1b4adc59bf))
* true-microservices ([#40](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/40)) ([5909242](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/590924254e8a2754de661a57a03e43f89ceb6299))
### Bug Fixes
* add instance-dead-timeout configuration and update HealthMonitor to use it for stale instance eviction ([be0b710](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/be0b710543b542da5c301efef7d2d587d0ba758a))
* clean up code formatting and improve error handling in gRPC server and failover service ([ad9495a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/ad9495afa3e93593b57154a187346c9b01393911))
* coordinator auto-scaling, cache eviction, rebalancing, and grpc timeouts ([d0c7169](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d0c71693bb6f55fafdce5bcea0d5f38b9bb505ef))
* **coordinator:** refine type casting in rolloutSpec method ([#45](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/45)) ([d522f7f](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d522f7f6edf9c985f03dd16816439d4184f1a589))
* **coordinator:** use genericKubernetesResources API for Argo Rollout scaling ([#43](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/43)) ([fa3c6b2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/fa3c6b2886dc59c14c5dad834acc9b41e42023bb))
* **coordinator:** use genericKubernetesResources API for Argo Rollout scaling ([#44](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/44)) ([82d0b75](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/82d0b754be1075084944b466858672d944f9f7d8))
* **dependencies:** replace Fabric8 Kubernetes client with Quarkus Kubernetes client ([5f44570](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5f44570b357277d09f33b7296860c421e2e70ce0))
* don't block event loop during scale-down drain ([1d121c7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1d121c727cbd4df477827cf64d065b7356a56e59))
* don't trigger scale-down if already at min replicas ([4b3b5e7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4b3b5e7c4ed9b3cd4fe2490e9f268f2e3a0d9e85))
* enhance AutoScaler and InstanceRegistry for replica management and stale instance eviction ([b4920d3](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b4920d3817e58bda94d7764e608b856ce9a909f7))
* force-delete hanging pods and remove failed instances from registry ([960a419](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/960a419792e1161fb7241e465b7349efe4a10137))
* linter formatting and improve code readability ([4a36096](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4a36096a5586e3f321d9c34c53e60d02bcc02c55))
* **middleware:** update paths for bot generation and stockfish configuration ([2dd0501](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2dd0501687db08dcd242359f6837125baf8a2fdc))
* NCS-84 More Verbose Logging ([#51](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/51)) ([4ad92ab](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4ad92ab23698267f8faa59c4e18388d4a0042cca))
* **redis:** update Redis configuration with max pool size and waiting parameters ([5baf6a7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5baf6a7cdbea484fc49c02e2b5a1c3919b7fa2c4))
* remove corrupted instances immediately and evict dead instances ([43184d2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/43184d296da5a6a7b760ac90c2b739220d86bce3))
* replace null checks with Option in coordinator ([2b04d7f](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2b04d7fa713e06662bff5afe3fb3f9d04541ce51))
* scalafix violations in metrics check and health monitor ([b991878](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b99187821489b296d66da5a5a13f5d545b6045c6))
* scale up immediately when instance is lost ([43525d4](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/43525d41a3884c00f1db26bf3c8c4cd9a607c260))
* streamline logging for evicted instances in InstanceRegistry ([10937e7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/10937e756a56e0e8fcf939decfdcaa4394506cc0))
* update grpcServer variable to use Instance wrapper and add optional access method ([d5c8da2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d5c8da20f8805199e920ea5afbd9cdb39a078e40))
* update HealthMonitor to evict instances without associated pods ([0f41f13](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/0f41f13ce68b76846684bab67241a122250dfaf9))
## (2026-05-16)
### Features
* add coordinator startup validation and K8s pod watch ([81b045d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/81b045d01bb054a4bc9dc9e02fc30f814e756205))
* add initialization metrics for various services ([d438e97](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d438e97f32bdde0bfc63c1b4a8cc810cdd093166))
* add OpenTelemetry trace configuration with parentbased sampler ([3904d5a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3904d5ad8ad4930ddee65287a7bfab785a6148f5))
* add periodic health check to evict dead instances ([380a2cc](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/380a2cceeb5873bf93ff17a1e87d62408ef8e178))
* **config:** update application.yml for PostgreSQL and remove staging/production configurations ([2404e61](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2404e6164c3b50ffccbea5238d636060d6abe4d6))
* **config:** update application.yml for staging and production environments ([6113432](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/6113432a14c476a3a0dfc0d449e17d023697f2ba))
* configure logging and add OpenTelemetry support ([#49](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/49)) ([d57c488](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d57c4886612d1d92da0e1b79209fc83e6ef537a1))
* **docker:** add .dockerignore and .gitignore files for build exclusions ([c987d8e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c987d8e258c0e6c4cfbdaa8381c64c410d7a2b83))
* **docker:** add Dockerfiles for building Quarkus application in native and JVM modes ([3f2d2bb](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f2d2bb4c97fa8cddba66e1da4427c54236dfeed))
* **docker:** add Dockerfiles for Quarkus application in JVM and native modes ([34b9933](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/34b993304670cf2aa62cd2f6460cee7b9864b08e))
* implement clock expiry scanning and handling for game records ([#53](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/53)) ([8f9eb12](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/8f9eb12f663efabe4dc72b94394438652ad0ef02))
* implement clock expiry scanning and handling for game records ([#54](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/54)) ([2e4ba43](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2e4ba435978ef415b4ee2d7d2fc4af3b4e834b3d))
* implement periodic scaling checks and enhance instance management in AutoScaler ([3f12f69](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f12f695f132b92f634d98df2c037292498b6e86))
* **logging:** add DEBUG/INFO/WARN logging across services (NCS-72) ([#41](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/41)) ([804a4bf](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/804a4bf179e3dfb19e2be4390e7e543caf5237c6))
* NCS-78 Add Traceability to the Applications ([#46](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/46)) ([649566e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/649566eb3fcf38f91c8896a739f74ea318af312d))
* NCS-78 Add Traceability to the Applications ([#47](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/47)) ([87dfc6c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/87dfc6c2bcce7f7d58fc641bd8d468a2e584c108))
* scale up on high CPU load, not just subscription count ([255e2da](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/255e2da33c37e186ed14f2862f2d2e1b4adc59bf))
* true-microservices ([#40](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/40)) ([5909242](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/590924254e8a2754de661a57a03e43f89ceb6299))
### Bug Fixes
* add instance-dead-timeout configuration and update HealthMonitor to use it for stale instance eviction ([be0b710](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/be0b710543b542da5c301efef7d2d587d0ba758a))
* clean up code formatting and improve error handling in gRPC server and failover service ([ad9495a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/ad9495afa3e93593b57154a187346c9b01393911))
* coordinator auto-scaling, cache eviction, rebalancing, and grpc timeouts ([d0c7169](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d0c71693bb6f55fafdce5bcea0d5f38b9bb505ef))
* **coordinator:** refine type casting in rolloutSpec method ([#45](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/45)) ([d522f7f](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d522f7f6edf9c985f03dd16816439d4184f1a589))
* **coordinator:** use genericKubernetesResources API for Argo Rollout scaling ([#43](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/43)) ([fa3c6b2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/fa3c6b2886dc59c14c5dad834acc9b41e42023bb))
* **coordinator:** use genericKubernetesResources API for Argo Rollout scaling ([#44](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/44)) ([82d0b75](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/82d0b754be1075084944b466858672d944f9f7d8))
* **dependencies:** replace Fabric8 Kubernetes client with Quarkus Kubernetes client ([5f44570](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5f44570b357277d09f33b7296860c421e2e70ce0))
* don't block event loop during scale-down drain ([1d121c7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1d121c727cbd4df477827cf64d065b7356a56e59))
* don't trigger scale-down if already at min replicas ([4b3b5e7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4b3b5e7c4ed9b3cd4fe2490e9f268f2e3a0d9e85))
* enhance AutoScaler and InstanceRegistry for replica management and stale instance eviction ([b4920d3](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b4920d3817e58bda94d7764e608b856ce9a909f7))
* force-delete hanging pods and remove failed instances from registry ([960a419](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/960a419792e1161fb7241e465b7349efe4a10137))
* linter formatting and improve code readability ([4a36096](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4a36096a5586e3f321d9c34c53e60d02bcc02c55))
* **middleware:** update paths for bot generation and stockfish configuration ([2dd0501](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2dd0501687db08dcd242359f6837125baf8a2fdc))
* NCS-84 More Verbose Logging ([#51](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/51)) ([4ad92ab](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4ad92ab23698267f8faa59c4e18388d4a0042cca))
* **redis:** update Redis configuration with max pool size and waiting parameters ([5baf6a7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5baf6a7cdbea484fc49c02e2b5a1c3919b7fa2c4))
* remove corrupted instances immediately and evict dead instances ([43184d2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/43184d296da5a6a7b760ac90c2b739220d86bce3))
* replace null checks with Option in coordinator ([2b04d7f](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2b04d7fa713e06662bff5afe3fb3f9d04541ce51))
* scalafix violations in metrics check and health monitor ([b991878](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b99187821489b296d66da5a5a13f5d545b6045c6))
* scale up immediately when instance is lost ([43525d4](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/43525d41a3884c00f1db26bf3c8c4cd9a607c260))
* streamline logging for evicted instances in InstanceRegistry ([10937e7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/10937e756a56e0e8fcf939decfdcaa4394506cc0))
* update grpcServer variable to use Instance wrapper and add optional access method ([d5c8da2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d5c8da20f8805199e920ea5afbd9cdb39a078e40))
* update HealthMonitor to evict instances without associated pods ([0f41f13](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/0f41f13ce68b76846684bab67241a122250dfaf9))
## (2026-05-16)
### Features
* add configurable CPU and memory scaling thresholds for auto-scaling ([a07bf89](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a07bf89fae43e3ef5fdd30aed0429742a95f8bbe))
* add coordinator startup validation and K8s pod watch ([81b045d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/81b045d01bb054a4bc9dc9e02fc30f814e756205))
* add initialization metrics for various services ([d438e97](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d438e97f32bdde0bfc63c1b4a8cc810cdd093166))
* add OpenTelemetry trace configuration with parentbased sampler ([3904d5a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3904d5ad8ad4930ddee65287a7bfab785a6148f5))
* add periodic health check to evict dead instances ([380a2cc](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/380a2cceeb5873bf93ff17a1e87d62408ef8e178))
* **config:** update application.yml for PostgreSQL and remove staging/production configurations ([2404e61](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2404e6164c3b50ffccbea5238d636060d6abe4d6))
* **config:** update application.yml for staging and production environments ([6113432](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/6113432a14c476a3a0dfc0d449e17d023697f2ba))
* configure logging and add OpenTelemetry support ([#49](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/49)) ([d57c488](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d57c4886612d1d92da0e1b79209fc83e6ef537a1))
* **docker:** add .dockerignore and .gitignore files for build exclusions ([c987d8e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c987d8e258c0e6c4cfbdaa8381c64c410d7a2b83))
* **docker:** add Dockerfiles for building Quarkus application in native and JVM modes ([3f2d2bb](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f2d2bb4c97fa8cddba66e1da4427c54236dfeed))
* **docker:** add Dockerfiles for Quarkus application in JVM and native modes ([34b9933](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/34b993304670cf2aa62cd2f6460cee7b9864b08e))
* implement clock expiry scanning and handling for game records ([#53](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/53)) ([8f9eb12](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/8f9eb12f663efabe4dc72b94394438652ad0ef02))
* implement clock expiry scanning and handling for game records ([#54](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/54)) ([2e4ba43](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2e4ba435978ef415b4ee2d7d2fc4af3b4e834b3d))
* implement periodic scaling checks and enhance instance management in AutoScaler ([3f12f69](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f12f695f132b92f634d98df2c037292498b6e86))
* **logging:** add DEBUG/INFO/WARN logging across services (NCS-72) ([#41](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/41)) ([804a4bf](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/804a4bf179e3dfb19e2be4390e7e543caf5237c6))
* NCS-78 Add Traceability to the Applications ([#46](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/46)) ([649566e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/649566eb3fcf38f91c8896a739f74ea318af312d))
* NCS-78 Add Traceability to the Applications ([#47](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/47)) ([87dfc6c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/87dfc6c2bcce7f7d58fc641bd8d468a2e584c108))
* scale up on high CPU load, not just subscription count ([255e2da](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/255e2da33c37e186ed14f2862f2d2e1b4adc59bf))
* true-microservices ([#40](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/40)) ([5909242](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/590924254e8a2754de661a57a03e43f89ceb6299))
### Bug Fixes
* add instance-dead-timeout configuration and update HealthMonitor to use it for stale instance eviction ([be0b710](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/be0b710543b542da5c301efef7d2d587d0ba758a))
* clean up code formatting and improve error handling in gRPC server and failover service ([ad9495a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/ad9495afa3e93593b57154a187346c9b01393911))
* coordinator auto-scaling, cache eviction, rebalancing, and grpc timeouts ([d0c7169](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d0c71693bb6f55fafdce5bcea0d5f38b9bb505ef))
* **coordinator:** refine type casting in rolloutSpec method ([#45](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/45)) ([d522f7f](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d522f7f6edf9c985f03dd16816439d4184f1a589))
* **coordinator:** use genericKubernetesResources API for Argo Rollout scaling ([#43](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/43)) ([fa3c6b2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/fa3c6b2886dc59c14c5dad834acc9b41e42023bb))
* **coordinator:** use genericKubernetesResources API for Argo Rollout scaling ([#44](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/44)) ([82d0b75](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/82d0b754be1075084944b466858672d944f9f7d8))
* **dependencies:** replace Fabric8 Kubernetes client with Quarkus Kubernetes client ([5f44570](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5f44570b357277d09f33b7296860c421e2e70ce0))
* don't block event loop during scale-down drain ([1d121c7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1d121c727cbd4df477827cf64d065b7356a56e59))
* don't trigger scale-down if already at min replicas ([4b3b5e7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4b3b5e7c4ed9b3cd4fe2490e9f268f2e3a0d9e85))
* enhance AutoScaler and InstanceRegistry for replica management and stale instance eviction ([b4920d3](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b4920d3817e58bda94d7764e608b856ce9a909f7))
* force-delete hanging pods and remove failed instances from registry ([960a419](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/960a419792e1161fb7241e465b7349efe4a10137))
* improve pod instance ID matching logic in AutoScaler and HealthMonitor ([f109fe3](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/f109fe3860371dd0be3434b10a721936dc258b16))
* linter formatting and improve code readability ([4a36096](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4a36096a5586e3f321d9c34c53e60d02bcc02c55))
* **middleware:** update paths for bot generation and stockfish configuration ([2dd0501](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2dd0501687db08dcd242359f6837125baf8a2fdc))
* NCS-84 More Verbose Logging ([#51](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/51)) ([4ad92ab](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4ad92ab23698267f8faa59c4e18388d4a0042cca))
* **redis:** update Redis configuration with max pool size and waiting parameters ([5baf6a7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5baf6a7cdbea484fc49c02e2b5a1c3919b7fa2c4))
* remove corrupted instances immediately and evict dead instances ([43184d2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/43184d296da5a6a7b760ac90c2b739220d86bce3))
* replace null checks with Option in coordinator ([2b04d7f](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2b04d7fa713e06662bff5afe3fb3f9d04541ce51))
* scalafix violations in metrics check and health monitor ([b991878](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b99187821489b296d66da5a5a13f5d545b6045c6))
* scale up immediately when instance is lost ([43525d4](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/43525d41a3884c00f1db26bf3c8c4cd9a607c260))
* streamline logging for evicted instances in InstanceRegistry ([10937e7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/10937e756a56e0e8fcf939decfdcaa4394506cc0))
* update grpcServer variable to use Instance wrapper and add optional access method ([d5c8da2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d5c8da20f8805199e920ea5afbd9cdb39a078e40))
* update HealthMonitor to evict instances without associated pods ([0f41f13](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/0f41f13ce68b76846684bab67241a122250dfaf9))
## (2026-05-17)
### Features
* add configurable CPU and memory scaling thresholds for auto-scaling ([a07bf89](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a07bf89fae43e3ef5fdd30aed0429742a95f8bbe))
* add coordinator startup validation and K8s pod watch ([81b045d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/81b045d01bb054a4bc9dc9e02fc30f814e756205))
* add initialization metrics for various services ([d438e97](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d438e97f32bdde0bfc63c1b4a8cc810cdd093166))
* add OpenTelemetry trace configuration with parentbased sampler ([3904d5a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3904d5ad8ad4930ddee65287a7bfab785a6148f5))
* add periodic health check to evict dead instances ([380a2cc](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/380a2cceeb5873bf93ff17a1e87d62408ef8e178))
* **config:** update application.yml for PostgreSQL and remove staging/production configurations ([2404e61](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2404e6164c3b50ffccbea5238d636060d6abe4d6))
* **config:** update application.yml for staging and production environments ([6113432](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/6113432a14c476a3a0dfc0d449e17d023697f2ba))
* configure logging and add OpenTelemetry support ([#49](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/49)) ([d57c488](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d57c4886612d1d92da0e1b79209fc83e6ef537a1))
* **docker:** add .dockerignore and .gitignore files for build exclusions ([c987d8e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c987d8e258c0e6c4cfbdaa8381c64c410d7a2b83))
* **docker:** add Dockerfiles for building Quarkus application in native and JVM modes ([3f2d2bb](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f2d2bb4c97fa8cddba66e1da4427c54236dfeed))
* **docker:** add Dockerfiles for Quarkus application in JVM and native modes ([34b9933](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/34b993304670cf2aa62cd2f6460cee7b9864b08e))
* implement clock expiry scanning and handling for game records ([#53](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/53)) ([8f9eb12](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/8f9eb12f663efabe4dc72b94394438652ad0ef02))
* implement clock expiry scanning and handling for game records ([#54](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/54)) ([2e4ba43](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2e4ba435978ef415b4ee2d7d2fc4af3b4e834b3d))
* implement periodic scaling checks and enhance instance management in AutoScaler ([3f12f69](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f12f695f132b92f634d98df2c037292498b6e86))
* **logging:** add DEBUG/INFO/WARN logging across services (NCS-72) ([#41](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/41)) ([804a4bf](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/804a4bf179e3dfb19e2be4390e7e543caf5237c6))
* NCS-78 Add Traceability to the Applications ([#46](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/46)) ([649566e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/649566eb3fcf38f91c8896a739f74ea318af312d))
* NCS-78 Add Traceability to the Applications ([#47](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/47)) ([87dfc6c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/87dfc6c2bcce7f7d58fc641bd8d468a2e584c108))
* scale up on high CPU load, not just subscription count ([255e2da](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/255e2da33c37e186ed14f2862f2d2e1b4adc59bf))
* true-microservices ([#40](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/40)) ([5909242](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/590924254e8a2754de661a57a03e43f89ceb6299))
### Bug Fixes
* add instance-dead-timeout configuration and update HealthMonitor to use it for stale instance eviction ([be0b710](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/be0b710543b542da5c301efef7d2d587d0ba758a))
* clean up code formatting and improve error handling in gRPC server and failover service ([ad9495a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/ad9495afa3e93593b57154a187346c9b01393911))
* coordinator auto-scaling, cache eviction, rebalancing, and grpc timeouts ([d0c7169](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d0c71693bb6f55fafdce5bcea0d5f38b9bb505ef))
* **coordinator:** refine type casting in rolloutSpec method ([#45](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/45)) ([d522f7f](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d522f7f6edf9c985f03dd16816439d4184f1a589))
* **coordinator:** use genericKubernetesResources API for Argo Rollout scaling ([#43](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/43)) ([fa3c6b2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/fa3c6b2886dc59c14c5dad834acc9b41e42023bb))
* **coordinator:** use genericKubernetesResources API for Argo Rollout scaling ([#44](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/44)) ([82d0b75](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/82d0b754be1075084944b466858672d944f9f7d8))
* **dependencies:** replace Fabric8 Kubernetes client with Quarkus Kubernetes client ([5f44570](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5f44570b357277d09f33b7296860c421e2e70ce0))
* don't block event loop during scale-down drain ([1d121c7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1d121c727cbd4df477827cf64d065b7356a56e59))
* don't trigger scale-down if already at min replicas ([4b3b5e7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4b3b5e7c4ed9b3cd4fe2490e9f268f2e3a0d9e85))
* enhance AutoScaler and InstanceRegistry for replica management and stale instance eviction ([b4920d3](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b4920d3817e58bda94d7764e608b856ce9a909f7))
* force-delete hanging pods and remove failed instances from registry ([960a419](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/960a419792e1161fb7241e465b7349efe4a10137))
* improve pod instance ID matching logic in AutoScaler and HealthMonitor ([f109fe3](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/f109fe3860371dd0be3434b10a721936dc258b16))
* linter formatting and improve code readability ([4a36096](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4a36096a5586e3f321d9c34c53e60d02bcc02c55))
* **middleware:** update paths for bot generation and stockfish configuration ([2dd0501](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2dd0501687db08dcd242359f6837125baf8a2fdc))
* NCS-84 More Verbose Logging ([#51](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/51)) ([4ad92ab](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4ad92ab23698267f8faa59c4e18388d4a0042cca))
* **redis:** update Redis configuration with max pool size and waiting parameters ([5baf6a7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5baf6a7cdbea484fc49c02e2b5a1c3919b7fa2c4))
* remove corrupted instances immediately and evict dead instances ([43184d2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/43184d296da5a6a7b760ac90c2b739220d86bce3))
* remove unused clearDrainingByPodName method and update HealthMonitor to clear draining instances ([1a02f9e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1a02f9e18673d0038e9a307fee5ea5219dc76af8))
* replace null checks with Option in coordinator ([2b04d7f](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2b04d7fa713e06662bff5afe3fb3f9d04541ce51))
* scalafix violations in metrics check and health monitor ([b991878](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b99187821489b296d66da5a5a13f5d545b6045c6))
* scale up immediately when instance is lost ([43525d4](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/43525d41a3884c00f1db26bf3c8c4cd9a607c260))
* streamline logging for evicted instances in InstanceRegistry ([10937e7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/10937e756a56e0e8fcf939decfdcaa4394506cc0))
* update grpcServer variable to use Instance wrapper and add optional access method ([d5c8da2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d5c8da20f8805199e920ea5afbd9cdb39a078e40))
* update HealthMonitor to evict instances without associated pods ([0f41f13](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/0f41f13ce68b76846684bab67241a122250dfaf9))
## (2026-05-17)
### Features
* add configurable CPU and memory scaling thresholds for auto-scaling ([a07bf89](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a07bf89fae43e3ef5fdd30aed0429742a95f8bbe))
* add coordinator startup validation and K8s pod watch ([81b045d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/81b045d01bb054a4bc9dc9e02fc30f814e756205))
* add initialization metrics for various services ([d438e97](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d438e97f32bdde0bfc63c1b4a8cc810cdd093166))
* add OpenTelemetry trace configuration with parentbased sampler ([3904d5a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3904d5ad8ad4930ddee65287a7bfab785a6148f5))
* add periodic health check to evict dead instances ([380a2cc](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/380a2cceeb5873bf93ff17a1e87d62408ef8e178))
* **config:** update application.yml for PostgreSQL and remove staging/production configurations ([2404e61](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2404e6164c3b50ffccbea5238d636060d6abe4d6))
* **config:** update application.yml for staging and production environments ([6113432](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/6113432a14c476a3a0dfc0d449e17d023697f2ba))
* configure logging and add OpenTelemetry support ([#49](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/49)) ([d57c488](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d57c4886612d1d92da0e1b79209fc83e6ef537a1))
* **docker:** add .dockerignore and .gitignore files for build exclusions ([c987d8e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c987d8e258c0e6c4cfbdaa8381c64c410d7a2b83))
* **docker:** add Dockerfiles for building Quarkus application in native and JVM modes ([3f2d2bb](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f2d2bb4c97fa8cddba66e1da4427c54236dfeed))
* **docker:** add Dockerfiles for Quarkus application in JVM and native modes ([34b9933](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/34b993304670cf2aa62cd2f6460cee7b9864b08e))
* implement clock expiry scanning and handling for game records ([#53](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/53)) ([8f9eb12](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/8f9eb12f663efabe4dc72b94394438652ad0ef02))
* implement clock expiry scanning and handling for game records ([#54](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/54)) ([2e4ba43](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2e4ba435978ef415b4ee2d7d2fc4af3b4e834b3d))
* implement periodic scaling checks and enhance instance management in AutoScaler ([3f12f69](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f12f695f132b92f634d98df2c037292498b6e86))
* **logging:** add DEBUG/INFO/WARN logging across services (NCS-72) ([#41](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/41)) ([804a4bf](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/804a4bf179e3dfb19e2be4390e7e543caf5237c6))
* NCS-78 Add Traceability to the Applications ([#46](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/46)) ([649566e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/649566eb3fcf38f91c8896a739f74ea318af312d))
* NCS-78 Add Traceability to the Applications ([#47](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/47)) ([87dfc6c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/87dfc6c2bcce7f7d58fc641bd8d468a2e584c108))
* scale up on high CPU load, not just subscription count ([255e2da](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/255e2da33c37e186ed14f2862f2d2e1b4adc59bf))
* true-microservices ([#40](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/40)) ([5909242](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/590924254e8a2754de661a57a03e43f89ceb6299))
### Bug Fixes
* add instance-dead-timeout configuration and update HealthMonitor to use it for stale instance eviction ([be0b710](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/be0b710543b542da5c301efef7d2d587d0ba758a))
* clean up code formatting and improve error handling in gRPC server and failover service ([ad9495a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/ad9495afa3e93593b57154a187346c9b01393911))
* coordinator auto-scaling, cache eviction, rebalancing, and grpc timeouts ([d0c7169](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d0c71693bb6f55fafdce5bcea0d5f38b9bb505ef))
* **coordinator:** refine type casting in rolloutSpec method ([#45](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/45)) ([d522f7f](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d522f7f6edf9c985f03dd16816439d4184f1a589))
* **coordinator:** use genericKubernetesResources API for Argo Rollout scaling ([#43](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/43)) ([fa3c6b2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/fa3c6b2886dc59c14c5dad834acc9b41e42023bb))
* **coordinator:** use genericKubernetesResources API for Argo Rollout scaling ([#44](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/44)) ([82d0b75](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/82d0b754be1075084944b466858672d944f9f7d8))
* **dependencies:** replace Fabric8 Kubernetes client with Quarkus Kubernetes client ([5f44570](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5f44570b357277d09f33b7296860c421e2e70ce0))
* don't block event loop during scale-down drain ([1d121c7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1d121c727cbd4df477827cf64d065b7356a56e59))
* don't trigger scale-down if already at min replicas ([4b3b5e7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4b3b5e7c4ed9b3cd4fe2490e9f268f2e3a0d9e85))
* enhance AutoScaler and InstanceRegistry for replica management and stale instance eviction ([b4920d3](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b4920d3817e58bda94d7764e608b856ce9a909f7))
* force-delete hanging pods and remove failed instances from registry ([960a419](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/960a419792e1161fb7241e465b7349efe4a10137))
* improve pod instance ID matching logic in AutoScaler and HealthMonitor ([f109fe3](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/f109fe3860371dd0be3434b10a721936dc258b16))
* linter formatting and improve code readability ([4a36096](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4a36096a5586e3f321d9c34c53e60d02bcc02c55))
* **middleware:** update paths for bot generation and stockfish configuration ([2dd0501](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2dd0501687db08dcd242359f6837125baf8a2fdc))
* NCS-84 More Verbose Logging ([#51](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/51)) ([4ad92ab](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4ad92ab23698267f8faa59c4e18388d4a0042cca))
* **redis:** update Redis configuration with max pool size and waiting parameters ([5baf6a7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5baf6a7cdbea484fc49c02e2b5a1c3919b7fa2c4))
* refresh Redis TTL on instance heartbeat to prevent false DEAD marking ([2d76c00](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2d76c001fe22868190a546f1794cf0ade36bb9a9))
* remove corrupted instances immediately and evict dead instances ([43184d2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/43184d296da5a6a7b760ac90c2b739220d86bce3))
* remove redundant line break in LoadBalancer.scala for improved readability ([5205468](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/520546853447e0e0992b258c264272f8f7b8b438))
* remove unused clearDrainingByPodName method and update HealthMonitor to clear draining instances ([1a02f9e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1a02f9e18673d0038e9a307fee5ea5219dc76af8))
* replace null checks with Option in coordinator ([2b04d7f](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2b04d7fa713e06662bff5afe3fb3f9d04541ce51))
* resolve 6 coordinator bugs (cache eviction, rebalance race, pod matching, lookup inefficiency) ([5619c82](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5619c8223ad7091706909eda8c907a29d215fd30))
* scalafix violations in metrics check and health monitor ([b991878](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b99187821489b296d66da5a5a13f5d545b6045c6))
* scale up immediately when instance is lost ([43525d4](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/43525d41a3884c00f1db26bf3c8c4cd9a607c260))
* streamline logging for evicted instances in InstanceRegistry ([10937e7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/10937e756a56e0e8fcf939decfdcaa4394506cc0))
* update grpcServer variable to use Instance wrapper and add optional access method ([d5c8da2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d5c8da20f8805199e920ea5afbd9cdb39a078e40))
* update HealthMonitor to evict instances without associated pods ([0f41f13](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/0f41f13ce68b76846684bab67241a122250dfaf9))
## (2026-05-17)
### Features
* add configurable CPU and memory scaling thresholds for auto-scaling ([a07bf89](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a07bf89fae43e3ef5fdd30aed0429742a95f8bbe))
* add coordinator startup validation and K8s pod watch ([81b045d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/81b045d01bb054a4bc9dc9e02fc30f814e756205))
* add initialization metrics for various services ([d438e97](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d438e97f32bdde0bfc63c1b4a8cc810cdd093166))
* add OpenTelemetry trace configuration with parentbased sampler ([3904d5a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3904d5ad8ad4930ddee65287a7bfab785a6148f5))
* add periodic health check to evict dead instances ([380a2cc](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/380a2cceeb5873bf93ff17a1e87d62408ef8e178))
* **config:** update application.yml for PostgreSQL and remove staging/production configurations ([2404e61](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2404e6164c3b50ffccbea5238d636060d6abe4d6))
* **config:** update application.yml for staging and production environments ([6113432](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/6113432a14c476a3a0dfc0d449e17d023697f2ba))
* configure logging and add OpenTelemetry support ([#49](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/49)) ([d57c488](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d57c4886612d1d92da0e1b79209fc83e6ef537a1))
* **docker:** add .dockerignore and .gitignore files for build exclusions ([c987d8e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c987d8e258c0e6c4cfbdaa8381c64c410d7a2b83))
* **docker:** add Dockerfiles for building Quarkus application in native and JVM modes ([3f2d2bb](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f2d2bb4c97fa8cddba66e1da4427c54236dfeed))
* **docker:** add Dockerfiles for Quarkus application in JVM and native modes ([34b9933](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/34b993304670cf2aa62cd2f6460cee7b9864b08e))
* implement clock expiry scanning and handling for game records ([#53](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/53)) ([8f9eb12](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/8f9eb12f663efabe4dc72b94394438652ad0ef02))
* implement clock expiry scanning and handling for game records ([#54](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/54)) ([2e4ba43](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2e4ba435978ef415b4ee2d7d2fc4af3b4e834b3d))
* implement periodic scaling checks and enhance instance management in AutoScaler ([3f12f69](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f12f695f132b92f634d98df2c037292498b6e86))
* **logging:** add DEBUG/INFO/WARN logging across services (NCS-72) ([#41](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/41)) ([804a4bf](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/804a4bf179e3dfb19e2be4390e7e543caf5237c6))
* NCS-78 Add Traceability to the Applications ([#46](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/46)) ([649566e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/649566eb3fcf38f91c8896a739f74ea318af312d))
* NCS-78 Add Traceability to the Applications ([#47](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/47)) ([87dfc6c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/87dfc6c2bcce7f7d58fc641bd8d468a2e584c108))
* scale up on high CPU load, not just subscription count ([255e2da](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/255e2da33c37e186ed14f2862f2d2e1b4adc59bf))
* true-microservices ([#40](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/40)) ([5909242](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/590924254e8a2754de661a57a03e43f89ceb6299))
### Bug Fixes
* add instance-dead-timeout configuration and update HealthMonitor to use it for stale instance eviction ([be0b710](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/be0b710543b542da5c301efef7d2d587d0ba758a))
* clean up code formatting and improve error handling in gRPC server and failover service ([ad9495a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/ad9495afa3e93593b57154a187346c9b01393911))
* coordinator auto-scaling, cache eviction, rebalancing, and grpc timeouts ([d0c7169](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d0c71693bb6f55fafdce5bcea0d5f38b9bb505ef))
* **coordinator:** refine type casting in rolloutSpec method ([#45](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/45)) ([d522f7f](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d522f7f6edf9c985f03dd16816439d4184f1a589))
* **coordinator:** use genericKubernetesResources API for Argo Rollout scaling ([#43](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/43)) ([fa3c6b2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/fa3c6b2886dc59c14c5dad834acc9b41e42023bb))
* **coordinator:** use genericKubernetesResources API for Argo Rollout scaling ([#44](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/44)) ([82d0b75](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/82d0b754be1075084944b466858672d944f9f7d8))
* correct pod matching logic from endsWith to contains ([6dbe1e6](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/6dbe1e62ac74f4ddbf03049b20184f7fac793f81))
* **dependencies:** replace Fabric8 Kubernetes client with Quarkus Kubernetes client ([5f44570](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5f44570b357277d09f33b7296860c421e2e70ce0))
* don't block event loop during scale-down drain ([1d121c7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1d121c727cbd4df477827cf64d065b7356a56e59))
* don't trigger scale-down if already at min replicas ([4b3b5e7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4b3b5e7c4ed9b3cd4fe2490e9f268f2e3a0d9e85))
* enhance AutoScaler and InstanceRegistry for replica management and stale instance eviction ([b4920d3](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b4920d3817e58bda94d7764e608b856ce9a909f7))
* force-delete hanging pods and remove failed instances from registry ([960a419](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/960a419792e1161fb7241e465b7349efe4a10137))
* improve pod instance ID matching logic in AutoScaler and HealthMonitor ([f109fe3](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/f109fe3860371dd0be3434b10a721936dc258b16))
* linter formatting and improve code readability ([4a36096](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4a36096a5586e3f321d9c34c53e60d02bcc02c55))
* **middleware:** update paths for bot generation and stockfish configuration ([2dd0501](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2dd0501687db08dcd242359f6837125baf8a2fdc))
* NCS-84 More Verbose Logging ([#51](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/51)) ([4ad92ab](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4ad92ab23698267f8faa59c4e18388d4a0042cca))
* **redis:** update Redis configuration with max pool size and waiting parameters ([5baf6a7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5baf6a7cdbea484fc49c02e2b5a1c3919b7fa2c4))
* refresh Redis TTL on instance heartbeat to prevent false DEAD marking ([2d76c00](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2d76c001fe22868190a546f1794cf0ade36bb9a9))
* remove corrupted instances immediately and evict dead instances ([43184d2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/43184d296da5a6a7b760ac90c2b739220d86bce3))
* remove redundant line break in LoadBalancer.scala for improved readability ([5205468](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/520546853447e0e0992b258c264272f8f7b8b438))
* remove unused clearDrainingByPodName method and update HealthMonitor to clear draining instances ([1a02f9e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1a02f9e18673d0038e9a307fee5ea5219dc76af8))
* replace null checks with Option in coordinator ([2b04d7f](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2b04d7fa713e06662bff5afe3fb3f9d04541ce51))
* resolve 6 coordinator bugs (cache eviction, rebalance race, pod matching, lookup inefficiency) ([5619c82](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5619c8223ad7091706909eda8c907a29d215fd30))
* scalafix violations in metrics check and health monitor ([b991878](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b99187821489b296d66da5a5a13f5d545b6045c6))
* scale up immediately when instance is lost ([43525d4](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/43525d41a3884c00f1db26bf3c8c4cd9a607c260))
* streamline logging for evicted instances in InstanceRegistry ([10937e7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/10937e756a56e0e8fcf939decfdcaa4394506cc0))
* update grpcServer variable to use Instance wrapper and add optional access method ([d5c8da2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d5c8da20f8805199e920ea5afbd9cdb39a078e40))
* update HealthMonitor to evict instances without associated pods ([0f41f13](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/0f41f13ce68b76846684bab67241a122250dfaf9))
## (2026-05-17)
### Features
* add configurable CPU and memory scaling thresholds for auto-scaling ([a07bf89](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a07bf89fae43e3ef5fdd30aed0429742a95f8bbe))
* add coordinator startup validation and K8s pod watch ([81b045d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/81b045d01bb054a4bc9dc9e02fc30f814e756205))
* add initialization metrics for various services ([d438e97](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d438e97f32bdde0bfc63c1b4a8cc810cdd093166))
* add OpenTelemetry trace configuration with parentbased sampler ([3904d5a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3904d5ad8ad4930ddee65287a7bfab785a6148f5))
* add periodic health check to evict dead instances ([380a2cc](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/380a2cceeb5873bf93ff17a1e87d62408ef8e178))
* **config:** update application.yml for PostgreSQL and remove staging/production configurations ([2404e61](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2404e6164c3b50ffccbea5238d636060d6abe4d6))
* **config:** update application.yml for staging and production environments ([6113432](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/6113432a14c476a3a0dfc0d449e17d023697f2ba))
* configure logging and add OpenTelemetry support ([#49](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/49)) ([d57c488](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d57c4886612d1d92da0e1b79209fc83e6ef537a1))
* **docker:** add .dockerignore and .gitignore files for build exclusions ([c987d8e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c987d8e258c0e6c4cfbdaa8381c64c410d7a2b83))
* **docker:** add Dockerfiles for building Quarkus application in native and JVM modes ([3f2d2bb](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f2d2bb4c97fa8cddba66e1da4427c54236dfeed))
* **docker:** add Dockerfiles for Quarkus application in JVM and native modes ([34b9933](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/34b993304670cf2aa62cd2f6460cee7b9864b08e))
* implement clock expiry scanning and handling for game records ([#53](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/53)) ([8f9eb12](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/8f9eb12f663efabe4dc72b94394438652ad0ef02))
* implement clock expiry scanning and handling for game records ([#54](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/54)) ([2e4ba43](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2e4ba435978ef415b4ee2d7d2fc4af3b4e834b3d))
* implement periodic scaling checks and enhance instance management in AutoScaler ([3f12f69](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f12f695f132b92f634d98df2c037292498b6e86))
* **logging:** add DEBUG/INFO/WARN logging across services (NCS-72) ([#41](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/41)) ([804a4bf](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/804a4bf179e3dfb19e2be4390e7e543caf5237c6))
* NCS-78 Add Traceability to the Applications ([#46](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/46)) ([649566e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/649566eb3fcf38f91c8896a739f74ea318af312d))
* NCS-78 Add Traceability to the Applications ([#47](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/47)) ([87dfc6c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/87dfc6c2bcce7f7d58fc641bd8d468a2e584c108))
* scale up on high CPU load, not just subscription count ([255e2da](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/255e2da33c37e186ed14f2862f2d2e1b4adc59bf))
* true-microservices ([#40](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/40)) ([5909242](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/590924254e8a2754de661a57a03e43f89ceb6299))
### Bug Fixes
* add instance-dead-timeout configuration and update HealthMonitor to use it for stale instance eviction ([be0b710](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/be0b710543b542da5c301efef7d2d587d0ba758a))
* clean up code formatting and improve error handling in gRPC server and failover service ([ad9495a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/ad9495afa3e93593b57154a187346c9b01393911))
* coordinator auto-scaling, cache eviction, rebalancing, and grpc timeouts ([d0c7169](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d0c71693bb6f55fafdce5bcea0d5f38b9bb505ef))
* **coordinator:** refine type casting in rolloutSpec method ([#45](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/45)) ([d522f7f](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d522f7f6edf9c985f03dd16816439d4184f1a589))
* **coordinator:** use genericKubernetesResources API for Argo Rollout scaling ([#43](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/43)) ([fa3c6b2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/fa3c6b2886dc59c14c5dad834acc9b41e42023bb))
* **coordinator:** use genericKubernetesResources API for Argo Rollout scaling ([#44](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/44)) ([82d0b75](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/82d0b754be1075084944b466858672d944f9f7d8))
* correct pod matching logic from endsWith to contains ([6dbe1e6](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/6dbe1e62ac74f4ddbf03049b20184f7fac793f81))
* **dependencies:** replace Fabric8 Kubernetes client with Quarkus Kubernetes client ([5f44570](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5f44570b357277d09f33b7296860c421e2e70ce0))
* don't block event loop during scale-down drain ([1d121c7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1d121c727cbd4df477827cf64d065b7356a56e59))
* don't trigger scale-down if already at min replicas ([4b3b5e7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4b3b5e7c4ed9b3cd4fe2490e9f268f2e3a0d9e85))
* enhance AutoScaler and InstanceRegistry for replica management and stale instance eviction ([b4920d3](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b4920d3817e58bda94d7764e608b856ce9a909f7))
* force-delete hanging pods and remove failed instances from registry ([960a419](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/960a419792e1161fb7241e465b7349efe4a10137))
* improve pod instance ID matching logic in AutoScaler and HealthMonitor ([f109fe3](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/f109fe3860371dd0be3434b10a721936dc258b16))
* linter formatting and improve code readability ([4a36096](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4a36096a5586e3f321d9c34c53e60d02bcc02c55))
* **middleware:** update paths for bot generation and stockfish configuration ([2dd0501](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2dd0501687db08dcd242359f6837125baf8a2fdc))
* NCS-84 More Verbose Logging ([#51](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/51)) ([4ad92ab](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4ad92ab23698267f8faa59c4e18388d4a0042cca))
* **redis:** update Redis configuration with max pool size and waiting parameters ([5baf6a7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5baf6a7cdbea484fc49c02e2b5a1c3919b7fa2c4))
* refresh Redis TTL on instance heartbeat to prevent false DEAD marking ([2d76c00](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2d76c001fe22868190a546f1794cf0ade36bb9a9))
* remove corrupted instances immediately and evict dead instances ([43184d2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/43184d296da5a6a7b760ac90c2b739220d86bce3))
* remove redundant line break in LoadBalancer.scala for improved readability ([5205468](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/520546853447e0e0992b258c264272f8f7b8b438))
* remove unused clearDrainingByPodName method and update HealthMonitor to clear draining instances ([1a02f9e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1a02f9e18673d0038e9a307fee5ea5219dc76af8))
* replace null checks with Option in coordinator ([2b04d7f](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2b04d7fa713e06662bff5afe3fb3f9d04541ce51))
* resolve 6 coordinator bugs (cache eviction, rebalance race, pod matching, lookup inefficiency) ([5619c82](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5619c8223ad7091706909eda8c907a29d215fd30))
* revert pod matching to original logic instanceId.contains(podName) ([9bf995f](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/9bf995f47d4dec19dd62ce2f7d52286ec2aa575f))
* scalafix violations in metrics check and health monitor ([b991878](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b99187821489b296d66da5a5a13f5d545b6045c6))
* scale up immediately when instance is lost ([43525d4](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/43525d41a3884c00f1db26bf3c8c4cd9a607c260))
* streamline logging for evicted instances in InstanceRegistry ([10937e7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/10937e756a56e0e8fcf939decfdcaa4394506cc0))
* update grpcServer variable to use Instance wrapper and add optional access method ([d5c8da2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d5c8da20f8805199e920ea5afbd9cdb39a078e40))
* update HealthMonitor to evict instances without associated pods ([0f41f13](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/0f41f13ce68b76846684bab67241a122250dfaf9))
@@ -47,6 +47,8 @@ nowchess:
k8s-rollout-label-selector: "app=nowchess-core" k8s-rollout-label-selector: "app=nowchess-core"
startup-validation-timeout: 15s startup-validation-timeout: 15s
failover-wait-timeout: 30s failover-wait-timeout: 30s
scale-cpu-threshold-percent: 0.8
scale-memory-threshold-percent: 0.8
--- ---
# dev profile # dev profile
@@ -62,3 +62,9 @@ trait CoordinatorConfig:
@WithName("failover-wait-timeout") @WithName("failover-wait-timeout")
def failoverWaitTimeout: Duration def failoverWaitTimeout: Duration
@WithName("scale-cpu-threshold-percent")
def scaleCpuThresholdPercent: Double
@WithName("scale-memory-threshold-percent")
def scaleMemoryThresholdPercent: Double
@@ -10,9 +10,11 @@ import io.fabric8.kubernetes.client.KubernetesClient
import io.micrometer.core.instrument.{Gauge, MeterRegistry} import io.micrometer.core.instrument.{Gauge, MeterRegistry}
import io.quarkus.scheduler.Scheduled import io.quarkus.scheduler.Scheduled
import org.jboss.logging.Logger import org.jboss.logging.Logger
import io.fabric8.kubernetes.client.KubernetesClientException
import scala.jdk.CollectionConverters.* import scala.jdk.CollectionConverters.*
import java.util.concurrent.atomic.AtomicReference import java.util.concurrent.atomic.AtomicReference
import java.util.concurrent.ConcurrentHashMap
import scala.compiletime.uninitialized import scala.compiletime.uninitialized
@ApplicationScoped @ApplicationScoped
@@ -37,9 +39,16 @@ class AutoScaler:
private var meterRegistry: MeterRegistry = uninitialized private var meterRegistry: MeterRegistry = uninitialized
// scalafix:on DisableSyntax.var // scalafix:on DisableSyntax.var
private val log = Logger.getLogger(classOf[AutoScaler]) private val log = Logger.getLogger(classOf[AutoScaler])
private val lastScaleTime = new java.util.concurrent.atomic.AtomicLong(0L) private val lastScaleTime = new java.util.concurrent.atomic.AtomicLong(0L)
private val avgLoadRef = new AtomicReference[Double](0.0) private val avgLoadRef = new AtomicReference[Double](0.0)
private val drainingForScaleDown = ConcurrentHashMap.newKeySet[String]()
def isDrainingForScaleDown(instanceId: String): Boolean =
drainingForScaleDown.contains(instanceId)
def clearDraining(instanceId: String): Unit =
drainingForScaleDown.remove(instanceId)
private def kubeClientOpt: Option[KubernetesClient] = private def kubeClientOpt: Option[KubernetesClient] =
if kubeClientInstance.isUnsatisfied then None if kubeClientInstance.isUnsatisfied then None
@@ -72,33 +81,81 @@ class AutoScaler:
} }
// scalafix:on DisableSyntax.asInstanceOf // scalafix:on DisableSyntax.asInstanceOf
private def parseMillicores(s: String): Long =
if s.endsWith("n") then s.dropRight(1).toLongOption.map(_ / 1000000).getOrElse(0L)
else if s.endsWith("m") then s.dropRight(1).toLongOption.getOrElse(0L)
else s.toLongOption.map(_ * 1000).getOrElse(0L)
private def parseBytes(s: String): Long =
if s.endsWith("Ki") then s.dropRight(2).toLongOption.map(_ * 1024L).getOrElse(0L)
else if s.endsWith("Mi") then s.dropRight(2).toLongOption.map(_ * 1024L * 1024L).getOrElse(0L)
else if s.endsWith("Gi") then s.dropRight(2).toLongOption.map(_ * 1024L * 1024L * 1024L).getOrElse(0L)
else if s.endsWith("K") then s.dropRight(1).toLongOption.map(_ * 1000L).getOrElse(0L)
else if s.endsWith("M") then s.dropRight(1).toLongOption.map(_ * 1000L * 1000L).getOrElse(0L)
else if s.endsWith("G") then s.dropRight(1).toLongOption.map(_ * 1000L * 1000L * 1000L).getOrElse(0L)
else s.toLongOption.getOrElse(0L)
private def exceedsRatio(
used: Long,
request: Long,
threshold: Double,
resource: String,
instanceId: String,
): Boolean =
if request <= 0 then false
else
val ratio = used.toDouble / request.toDouble
log.debugf(
"Instance %s %s: %d used / %d requested = %.0f%%",
instanceId,
resource,
used,
request,
ratio * 100,
)
ratio > threshold
// scalafix:off DisableSyntax.asInstanceOf // scalafix:off DisableSyntax.asInstanceOf
private def isResourceConstrained(instanceId: String): Boolean = private def isResourceConstrained(instanceId: String): Boolean =
kubeClientOpt.fold(false) { kube => kubeClientOpt.fold(false) { kube =>
try try
val pods = val pods =
kube.pods().inNamespace(config.k8sNamespace).withLabel(config.k8sRolloutLabelSelector).list().getItems.asScala kube.pods().inNamespace(config.k8sNamespace).withLabel(config.k8sRolloutLabelSelector).list().getItems.asScala
pods.find(_.getMetadata.getName.contains(instanceId)).exists { pod => pods.find(pod => instanceId.contains(pod.getMetadata.getName)).exists { pod =>
try try
val metricsRes = kube val requests = Option(pod.getSpec)
.genericKubernetesResources(metricsApiVersion, "PodMetrics") .flatMap(s => Option(s.getContainers))
.inNamespace(config.k8sNamespace) .flatMap(cs => if cs.isEmpty then None else Option(cs.get(0)))
.withName(pod.getMetadata.getName) .flatMap(c => Option(c.getResources))
.get() .flatMap(r => Option(r.getRequests))
val metricsMap = metricsRes.asInstanceOf[java.util.Map[String, AnyRef]]
Option(metricsMap.get("metrics")) val cpuRequestMillis =
.map(_.asInstanceOf[java.util.Map[String, AnyRef]]) requests.flatMap(m => Option(m.get("cpu"))).map(q => parseMillicores(q.toString)).getOrElse(0L)
.flatMap(m => Option(m.get("containers")).map(_.asInstanceOf[java.util.List[AnyRef]])) val memRequestBytes =
.filter(!_.isEmpty) requests.flatMap(m => Option(m.get("memory"))).map(q => parseBytes(q.toString)).getOrElse(0L)
.map(_.get(0).asInstanceOf[java.util.Map[String, AnyRef]])
.flatMap(c => Option(c.get("usage")).map(_.asInstanceOf[java.util.Map[String, AnyRef]])) if cpuRequestMillis <= 0 && memRequestBytes <= 0 then
.flatMap(u => Option(u.get("cpu"))) log.debugf("No resource requests found for instance %s, skipping resource check", instanceId)
.map(_.toString) false
.exists { cpuStr => else
val cpuMillis = val metricsRes = kube
if cpuStr.endsWith("m") then cpuStr.dropRight(1).toLongOption.getOrElse(0L) .genericKubernetesResources(metricsApiVersion, "PodMetrics")
else cpuStr.toLongOption.map(_ * 1000).getOrElse(0L) .inNamespace(config.k8sNamespace)
cpuMillis > 800 .withName(pod.getMetadata.getName)
.get()
val metricsMap = metricsRes.asInstanceOf[java.util.Map[String, AnyRef]]
val usageOpt = Option(metricsMap.get("metrics"))
.map(_.asInstanceOf[java.util.Map[String, AnyRef]])
.flatMap(m => Option(m.get("containers")).map(_.asInstanceOf[java.util.List[AnyRef]]))
.filter(!_.isEmpty)
.map(_.get(0).asInstanceOf[java.util.Map[String, AnyRef]])
.flatMap(c => Option(c.get("usage")).map(_.asInstanceOf[java.util.Map[String, AnyRef]]))
usageOpt.exists { usage =>
val cpuUsed = Option(usage.get("cpu")).map(v => parseMillicores(v.toString)).getOrElse(0L)
val memUsed = Option(usage.get("memory")).map(v => parseBytes(v.toString)).getOrElse(0L)
exceedsRatio(cpuUsed, cpuRequestMillis, config.scaleCpuThresholdPercent, "CPU", instanceId) ||
exceedsRatio(memUsed, memRequestBytes, config.scaleMemoryThresholdPercent, "memory", instanceId)
} }
catch case _: Exception => false catch case _: Exception => false
} }
@@ -116,58 +173,90 @@ class AutoScaler:
if now - last >= 120000 && lastScaleTime.compareAndSet(last, now) then if now - last >= 120000 && lastScaleTime.compareAndSet(last, now) then
val instances = instanceRegistry.getAllInstances.filter(_.state == "HEALTHY") val instances = instanceRegistry.getAllInstances.filter(_.state == "HEALTHY")
if instances.nonEmpty then if instances.nonEmpty then
val avgLoad = instances.map(_.subscriptionCount).sum.toDouble / instances.size val avgLoad = instances.map(_.subscriptionCount).sum.toDouble / instances.size
val scaleUpLoad = config.scaleUpThreshold * config.maxGamesPerCore
val scaleDownLoad = config.scaleDownThreshold * config.maxGamesPerCore
avgLoadRef.set(avgLoad) avgLoadRef.set(avgLoad)
val hasHighCpuOrMemory = instances.exists(inst => isResourceConstrained(inst.instanceId)) val constrainedInstance = instances.find(inst => isResourceConstrained(inst.instanceId))
val hasHighCpuOrMemory = constrainedInstance.isDefined
if avgLoad > config.scaleUpThreshold * config.maxGamesPerCore || hasHighCpuOrMemory then scaleUp() log.infof(
else if avgLoad < config.scaleDownThreshold * config.maxGamesPerCore && instances.size > config.scaleMinReplicas "Scale check: instances=%d avgLoad=%.1f scaleUpAt=%.1f scaleDownAt=%.1f resourceConstrained=%s",
instances.size,
avgLoad,
scaleUpLoad,
scaleDownLoad,
constrainedInstance.map(_.instanceId).getOrElse("none"),
)
if avgLoad > scaleUpLoad || hasHighCpuOrMemory then scaleUp()
else if avgLoad < scaleDownLoad && instances.size > config.scaleMinReplicas
then scaleDown() then scaleDown()
private def patchRolloutReplicas(
kube: KubernetesClient,
direction: String,
delta: Int,
canScale: Int => Boolean,
atLimit: Int => Unit,
onSuccess: (Int, Int) => Unit,
maxRetries: Int = 3,
): Unit =
def attempt(retries: Int): Unit =
try
Option(
kube
.genericKubernetesResources(argoApiVersion, argoKind)
.inNamespace(config.k8sNamespace)
.withName(config.k8sRolloutName)
.get(),
).foreach { rollout =>
rolloutSpec(rollout).foreach { spec =>
spec.get("replicas") match
case current: Integer =>
val n = current.intValue()
if !canScale(n) then atLimit(n)
else
spec.put("replicas", Integer.valueOf(n + delta))
kube
.genericKubernetesResources(argoApiVersion, argoKind)
.inNamespace(config.k8sNamespace)
.resource(rollout)
.update()
meterRegistry.counter("nowchess.coordinator.scale.events", "direction", direction).increment()
onSuccess(n, n + delta)
case _ => ()
}
}
catch
case ex: KubernetesClientException if ex.getCode == 409 =>
if retries > 0 then
log.debugf("Conflict scaling %s %s, retrying (%d left)", direction, config.k8sRolloutName, retries - 1)
attempt(retries - 1)
else
meterRegistry.counter("nowchess.coordinator.scale.failures", "direction", direction).increment()
log.errorf(ex, "Failed to scale %s %s after conflict retries", direction, config.k8sRolloutName)
case ex: Exception =>
meterRegistry.counter("nowchess.coordinator.scale.failures", "direction", direction).increment()
log.errorf(ex, "Failed to scale %s %s", direction, config.k8sRolloutName)
attempt(maxRetries)
def scaleUp(): Unit = def scaleUp(): Unit =
log.info("Scaling up Argo Rollout") log.info("Scaling up Argo Rollout")
kubeClientOpt match kubeClientOpt match
case None => case None => log.warn("Kubernetes client not available, cannot scale")
log.warn("Kubernetes client not available, cannot scale")
case Some(kube) => case Some(kube) =>
try patchRolloutReplicas(
Option( kube,
kube direction = "up",
.genericKubernetesResources(argoApiVersion, argoKind) delta = 1,
.inNamespace(config.k8sNamespace) canScale = _ < config.scaleMaxReplicas,
.withName(config.k8sRolloutName) atLimit = n => log.infof("Already at max replicas %d for %s", n, config.k8sRolloutName),
.get(), onSuccess = (from, to) =>
).foreach { rollout => log.infof("Scaled up %s from %d to %d replicas", config.k8sRolloutName, from, to)
rolloutSpec(rollout).foreach { spec => loadBalancer.rebalance,
spec.get("replicas") match )
case replicas: Integer =>
val currentReplicas = replicas.intValue()
val maxReplicas = config.scaleMaxReplicas
if currentReplicas < maxReplicas then
spec.put("replicas", Integer.valueOf(currentReplicas + 1))
kube
.genericKubernetesResources(argoApiVersion, argoKind)
.inNamespace(config.k8sNamespace)
.resource(rollout)
.update()
meterRegistry.counter("nowchess.coordinator.scale.events", "direction", "up").increment()
log.infof(
"Scaled up %s from %d to %d replicas",
config.k8sRolloutName,
currentReplicas,
currentReplicas + 1,
)
loadBalancer.rebalance
else log.infof("Already at max replicas %d for %s", maxReplicas, config.k8sRolloutName)
case _ => ()
}
}
catch
case ex: Exception =>
meterRegistry.counter("nowchess.coordinator.scale.failures", "direction", "up").increment()
log.errorf(ex, "Failed to scale up %s", config.k8sRolloutName)
def scaleDown(): Unit = def scaleDown(): Unit =
log.info("Scaling down Argo Rollout") log.info("Scaling down Argo Rollout")
@@ -177,6 +266,7 @@ class AutoScaler:
underloadedInstance.foreach { inst => underloadedInstance.foreach { inst =>
log.infof("Marking instance %s for drain before scale-down", inst.instanceId) log.infof("Marking instance %s for drain before scale-down", inst.instanceId)
drainingForScaleDown.add(inst.instanceId)
failoverService failoverService
.onInstanceStreamDropped(inst.instanceId) .onInstanceStreamDropped(inst.instanceId)
.subscribe() .subscribe()
@@ -187,42 +277,34 @@ class AutoScaler:
} }
kubeClientOpt match kubeClientOpt match
case None => case None => log.warn("Kubernetes client not available, cannot scale")
log.warn("Kubernetes client not available, cannot scale")
case Some(kube) => case Some(kube) =>
try patchRolloutReplicas(
Option( kube,
kube direction = "down",
.genericKubernetesResources(argoApiVersion, argoKind) delta = -1,
.inNamespace(config.k8sNamespace) canScale = _ > config.scaleMinReplicas,
.withName(config.k8sRolloutName) atLimit = n => log.infof("Already at min replicas %d for %s", n, config.k8sRolloutName),
.get(), onSuccess = (from, to) =>
).foreach { rollout => log.infof("Scaled down %s from %d to %d replicas", config.k8sRolloutName, from, to)
rolloutSpec(rollout).foreach { spec => underloadedInstance.foreach(inst => forceDeletePod(inst.instanceId, kube)),
spec.get("replicas") match )
case replicas: Integer =>
val currentReplicas = replicas.intValue()
val minReplicas = config.scaleMinReplicas
if currentReplicas > minReplicas then private def forceDeletePod(instanceId: String, kube: KubernetesClient): Unit =
spec.put("replicas", Integer.valueOf(currentReplicas - 1)) try
kube val pods = kube
.genericKubernetesResources(argoApiVersion, argoKind) .pods()
.inNamespace(config.k8sNamespace) .inNamespace(config.k8sNamespace)
.resource(rollout) .withLabel(config.k8sRolloutLabelSelector)
.update() .list()
meterRegistry.counter("nowchess.coordinator.scale.events", "direction", "down").increment() .getItems
log.infof( .asScala
"Scaled down %s from %d to %d replicas", pods.find(pod => instanceId.contains(pod.getMetadata.getName)) match
config.k8sRolloutName, case Some(pod) =>
currentReplicas, kube.pods().inNamespace(config.k8sNamespace).withName(pod.getMetadata.getName).withGracePeriod(0L).delete()
currentReplicas - 1, log.infof("Force-deleted pod for drained instance %s", instanceId)
) case None =>
else log.infof("Already at min replicas %d for %s", minReplicas, config.k8sRolloutName) log.debugf("No pod found for drained instance %s, skipping deletion", instanceId)
case _ => () catch
} case ex: Exception =>
} log.warnf(ex, "Failed to force-delete pod for drained instance %s", instanceId)
catch
case ex: Exception =>
meterRegistry.counter("nowchess.coordinator.scale.failures", "direction", "down").increment()
log.errorf(ex, "Failed to scale down %s", config.k8sRolloutName)
@@ -75,6 +75,7 @@ class CacheEvictionManager:
try try
coreGrpcClient.evictGames(instance.hostname, instance.grpcPort, List(gameId)) coreGrpcClient.evictGames(instance.hostname, instance.grpcPort, List(gameId))
redis.key(classOf[String]).del(key) redis.key(classOf[String]).del(key)
redis.key(classOf[String]).del(s"$redisPrefix:game:$gameId:instance")
meterRegistry.counter("nowchess.coordinator.cache.evictions").increment() meterRegistry.counter("nowchess.coordinator.cache.evictions").increment()
log.infof("Evicted idle game %s from %s", gameId, instance.instanceId) log.infof("Evicted idle game %s from %s", gameId, instance.instanceId)
count + 1 count + 1
@@ -96,17 +97,18 @@ class CacheEvictionManager:
private def extractLastUpdatedTimestamp(json: String): Long = private def extractLastUpdatedTimestamp(json: String): Long =
Try { Try {
val parsed = objectMapper.readTree(json) val parsed = objectMapper.readTree(json)
Option(parsed.get("lastHeartbeat")) Option(parsed.get("lastUpdatedMs"))
.filter(_.isTextual) .filter(_.isNumber)
.fold(0L)(lh => Instant.parse(lh.asText()).toEpochMilli) .fold(0L)(_.asLong())
}.getOrElse(0L) }.getOrElse(0L)
private def findInstanceWithGame(gameId: String): Option[de.nowchess.coordinator.dto.InstanceMetadata] = private def findInstanceWithGame(gameId: String): Option[de.nowchess.coordinator.dto.InstanceMetadata] =
try try
instanceRegistry.getAllInstances.find { instance => val mapKey = s"$redisPrefix:game:$gameId:instance"
val setKey = s"$redisPrefix:instance:${instance.instanceId}:games" Option(redis.value(classOf[String]).get(mapKey))
redis.set(classOf[String]).sismember(setKey, gameId) .flatMap { instanceId =>
} instanceRegistry.getInstance(instanceId)
}
catch catch
case ex: Exception => case ex: Exception =>
log.debugf(ex, "Failed to find instance for game %s", gameId) log.debugf(ex, "Failed to find instance for game %s", gameId)
@@ -107,6 +107,7 @@ class FailoverService:
try try
val subscribed = coreGrpcClient.batchResubscribeGames(target.hostname, target.grpcPort, batch) val subscribed = coreGrpcClient.batchResubscribeGames(target.hostname, target.grpcPort, batch)
if subscribed > 0 then if subscribed > 0 then
updateGameInstanceMappings(batch, deadId, target.instanceId)
log.infof("Migrated %d games from %s to %s", subscribed, deadId, target.instanceId) log.infof("Migrated %d games from %s to %s", subscribed, deadId, target.instanceId)
true true
else false else false
@@ -116,6 +117,18 @@ class FailoverService:
false false
if success then true else tryMigrateBatch(batch, batchIdx, instances, deadId, attempt + 1) if success then true else tryMigrateBatch(batch, batchIdx, instances, deadId, attempt + 1)
private def updateGameInstanceMappings(gameIds: List[String], deadId: String, targetId: String): Unit =
try
val fromKey = s"$redisPrefix:instance:$deadId:games"
val toKey = s"$redisPrefix:instance:$targetId:games"
gameIds.foreach { gameId =>
redis.set(classOf[String]).sadd(toKey, gameId)
redis.value(classOf[String]).set(s"$redisPrefix:game:$gameId:instance", targetId)
}
catch
case ex: Exception =>
log.errorf(ex, "Failed to update game instance mappings")
private def cleanupDeadInstance(instanceId: String): Unit = private def cleanupDeadInstance(instanceId: String): Unit =
val setKey = s"$redisPrefix:instance:$instanceId:games" val setKey = s"$redisPrefix:instance:$instanceId:games"
redis.key(classOf[String]).del(setKey) redis.key(classOf[String]).del(setKey)
@@ -88,7 +88,9 @@ class HealthMonitor:
if evicted.nonEmpty then if evicted.nonEmpty then
log.warnf("Evicted %d stale instances: %s", evicted.size, evicted.mkString(", ")) log.warnf("Evicted %d stale instances: %s", evicted.size, evicted.mkString(", "))
evicted.foreach(deleteK8sPod) evicted.foreach(deleteK8sPod)
autoScaler.scaleUp() val unexpectedEvictions = evicted.filterNot(autoScaler.isDrainingForScaleDown)
evicted.foreach(autoScaler.clearDraining)
if unexpectedEvictions.nonEmpty then autoScaler.scaleUp()
val instances = instanceRegistry.getAllInstances val instances = instanceRegistry.getAllInstances
val failed = instances.collect { inst => val failed = instances.collect { inst =>
val isHealthy = checkHealth(inst.instanceId) val isHealthy = checkHealth(inst.instanceId)
@@ -99,7 +101,8 @@ class HealthMonitor:
Some(inst.instanceId) Some(inst.instanceId)
else None else None
}.flatten }.flatten
if failed.nonEmpty then autoScaler.scaleUp() val unexpectedFailures = failed.filterNot(autoScaler.isDrainingForScaleDown)
if unexpectedFailures.nonEmpty then autoScaler.scaleUp()
private def checkHealth(instanceId: String): Boolean = private def checkHealth(instanceId: String): Boolean =
val redisHealthy = checkRedisHeartbeat(instanceId) val redisHealthy = checkRedisHeartbeat(instanceId)
@@ -128,7 +131,7 @@ class HealthMonitor:
pods.exists { pod => pods.exists { pod =>
val podName = pod.getMetadata.getName val podName = pod.getMetadata.getName
podName.contains(instanceId) && isPodReady(pod) instanceId.contains(podName) && isPodReady(pod)
} }
catch catch
case ex: Exception => case ex: Exception =>
@@ -182,7 +185,7 @@ class HealthMonitor:
.getItems .getItems
.asScala .asScala
pods.find(pod => pod.getMetadata.getName.contains(instanceId)) match pods.find(pod => instanceId.contains(pod.getMetadata.getName)) match
case Some(pod) => case Some(pod) =>
val podName = pod.getMetadata.getName val podName = pod.getMetadata.getName
kube.pods().inNamespace(config.k8sNamespace).withName(podName).withGracePeriod(0L).delete() kube.pods().inNamespace(config.k8sNamespace).withName(podName).withGracePeriod(0L).delete()
@@ -227,12 +230,9 @@ class HealthMonitor:
} }
private def handlePodGone(pod: Pod): Unit = private def handlePodGone(pod: Pod): Unit =
val podName = pod.getMetadata.getName
findRegisteredInstance(pod).foreach { inst => findRegisteredInstance(pod).foreach { inst =>
log.warnf( log.warnf("Pod %s deleted — triggering failover for %s", podName, inst.instanceId)
"Pod %s deleted — triggering failover for %s",
pod.getMetadata.getName,
inst.instanceId,
)
failoverService failoverService
.onInstanceStreamDropped(inst.instanceId) .onInstanceStreamDropped(inst.instanceId)
.subscribe() .subscribe()
@@ -244,4 +244,4 @@ class HealthMonitor:
private def findRegisteredInstance(pod: Pod): Option[InstanceMetadata] = private def findRegisteredInstance(pod: Pod): Option[InstanceMetadata] =
val podName = pod.getMetadata.getName val podName = pod.getMetadata.getName
instanceRegistry.getAllInstances.find(inst => podName.contains(inst.instanceId)) instanceRegistry.getAllInstances.find(inst => inst.instanceId.contains(podName))
@@ -9,6 +9,7 @@ import scala.jdk.CollectionConverters.*
import scala.compiletime.uninitialized import scala.compiletime.uninitialized
import com.fasterxml.jackson.databind.ObjectMapper import com.fasterxml.jackson.databind.ObjectMapper
import de.nowchess.coordinator.dto.InstanceMetadata import de.nowchess.coordinator.dto.InstanceMetadata
import de.nowchess.coordinator.config.CoordinatorConfig
import java.util.concurrent.ConcurrentHashMap import java.util.concurrent.ConcurrentHashMap
import java.time.{Duration, Instant} import java.time.{Duration, Instant}
import io.micrometer.core.instrument.{Gauge, MeterRegistry} import io.micrometer.core.instrument.{Gauge, MeterRegistry}
@@ -27,6 +28,9 @@ class InstanceRegistry:
@Inject @Inject
private var meterRegistry: MeterRegistry = uninitialized private var meterRegistry: MeterRegistry = uninitialized
@Inject
private var config: CoordinatorConfig = uninitialized
// scalafix:on DisableSyntax.var // scalafix:on DisableSyntax.var
private val log = Logger.getLogger(classOf[InstanceRegistry]) private val log = Logger.getLogger(classOf[InstanceRegistry])
@@ -95,7 +99,13 @@ class InstanceRegistry:
metadata.subscriptionCount, metadata.subscriptionCount,
metadata.state, metadata.state,
) )
Uni.createFrom().item(()) val ttlMs = config.heartbeatTtl.toMillis
redis
.key(classOf[String])
.pexpire(key, ttlMs)
.map(_ => ())
.onFailure()
.recoverWithItem(())
} }
catch catch
case ex: Exception => case ex: Exception =>
@@ -69,7 +69,7 @@ class LoadBalancer:
val overloaded = instances val overloaded = instances
.filter(_.subscriptionCount > config.maxGamesPerCore) .filter(_.subscriptionCount > config.maxGamesPerCore)
.sortBy[Int](_.subscriptionCount) .sortBy(_.subscriptionCount)
.reverse .reverse
val underloaded = instances val underloaded = instances
.filter(_.subscriptionCount < avgLoad * 0.8) .filter(_.subscriptionCount < avgLoad * 0.8)
@@ -108,7 +108,9 @@ class LoadBalancer:
private def getGamesToMove(instanceId: String, count: Int): List[String] = private def getGamesToMove(instanceId: String, count: Int): List[String] =
try try
val setKey = s"$redisPrefix:instance:$instanceId:games" val setKey = s"$redisPrefix:instance:$instanceId:games"
redis.set(classOf[String]).smembers(setKey).asScala.toList.take(count) val result = scala.collection.mutable.ListBuffer[String]()
for _ <- 0 until count do Option(redis.set(classOf[String]).spop(setKey)).foreach(result += _)
result.toList
catch catch
case ex: Exception => case ex: Exception =>
log.debugf(ex, "Failed to get games for %s", instanceId) log.debugf(ex, "Failed to get games for %s", instanceId)
@@ -116,12 +118,10 @@ class LoadBalancer:
private def updateRedisGameSets(fromInstanceId: String, toInstanceId: String, gameIds: List[String]): Unit = private def updateRedisGameSets(fromInstanceId: String, toInstanceId: String, gameIds: List[String]): Unit =
try try
val fromKey = s"$redisPrefix:instance:$fromInstanceId:games" val toKey = s"$redisPrefix:instance:$toInstanceId:games"
val toKey = s"$redisPrefix:instance:$toInstanceId:games"
gameIds.foreach { gameId => gameIds.foreach { gameId =>
redis.set(classOf[String]).srem(fromKey, gameId)
redis.set(classOf[String]).sadd(toKey, gameId) redis.set(classOf[String]).sadd(toKey, gameId)
redis.value(classOf[String]).set(s"$redisPrefix:game:$gameId:instance", toInstanceId)
} }
catch catch
case ex: Exception => case ex: Exception =>
+1 -1
View File
@@ -1,3 +1,3 @@
MAJOR=0 MAJOR=0
MINOR=24 MINOR=31
PATCH=0 PATCH=0
+127
View File
@@ -1434,3 +1434,130 @@
* Revert "feat: add authentication permissions for metrics endpoints in application.yml" ([a298417](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a298417b9e4d68dc73bbf40be63d9484536e9f83)) * Revert "feat: add authentication permissions for metrics endpoints in application.yml" ([a298417](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a298417b9e4d68dc73bbf40be63d9484536e9f83))
* Revert "refactor: update metrics paths formatting in application.yml for clarity" ([3870566](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/38705663498d5f47c40dafe2f26198589ede8656)) * Revert "refactor: update metrics paths formatting in application.yml for clarity" ([3870566](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/38705663498d5f47c40dafe2f26198589ede8656))
## (2026-05-16)
### Features
* add authentication permissions for metrics endpoints in application.yml ([04edd4d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/04edd4d6fd8a63196c36f6d67992832febc9bebb))
* add CORS configuration and reorder JWT settings in application.yml ([a49f9be](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a49f9be146f04c14561c305d980846a92f8c12b2))
* add GameRules stub with PositionStatus enum ([76d4168](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/76d4168038de23e5d6083d4e8f0504fbf31d15a3))
* add initialization metrics for various services ([d438e97](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d438e97f32bdde0bfc63c1b4a8cc810cdd093166))
* add MovedInCheck/Checkmate/Stalemate MoveResult variants (stub dispatch) ([8b7ec57](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/8b7ec57e5ea6ee1615a1883848a426dc07d26364))
* add OpenTelemetry trace configuration with parentbased sampler ([3904d5a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3904d5ad8ad4930ddee65287a7bfab785a6148f5))
* **config:** update application.yml for PostgreSQL and remove staging/production configurations ([2404e61](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2404e6164c3b50ffccbea5238d636060d6abe4d6))
* **config:** update application.yml for staging and production environments ([6113432](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/6113432a14c476a3a0dfc0d449e17d023697f2ba))
* configure logging and add OpenTelemetry support ([#49](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/49)) ([d57c488](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d57c4886612d1d92da0e1b79209fc83e6ef537a1))
* implement clock expiry scanning and handling for game records ([#53](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/53)) ([8f9eb12](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/8f9eb12f663efabe4dc72b94394438652ad0ef02))
* implement GameRules with isInCheck, legalMoves, gameStatus ([94a02ff](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/94a02ff6849436d9496c70a0f16c21666dae8e4e))
* implement legal castling ([#1](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/1)) ([00d326c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/00d326c1ba67711fbe180f04e1100c3f01dd0254))
* implement periodic scaling checks and enhance instance management in AutoScaler ([3f12f69](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f12f695f132b92f634d98df2c037292498b6e86))
* **logging:** add DEBUG/INFO/WARN logging across services (NCS-72) ([#41](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/41)) ([804a4bf](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/804a4bf179e3dfb19e2be4390e7e543caf5237c6))
* NCS-10 Implement Pawn Promotion ([#12](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/12)) ([13bfc16](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/13bfc16cfe25db78ec607db523ca6d993c13430c))
* NCS-11 50-move rule ([#9](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/9)) ([412ed98](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/412ed986a95703a3b282276540153480ceed229d))
* NCS-13 Implement Threefold Repetition ([#31](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/31)) ([767d305](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/767d3051a76c266050b6335774d66e2db2273c16))
* NCS-14 implemented insufficient moves rule ([#30](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/30)) ([b0399a4](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b0399a4e489950083066c9538df9a84dcc7a4613))
* NCS-16 Core Separation via Patterns ([#10](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/10)) ([1361dfc](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1361dfc89553b146864fb8ff3526cf12cf3f293a))
* NCS-17 Implement basic ScalaFX UI ([#14](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/14)) ([3ff8031](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3ff80318b4f16c59733a46498581a5c27f048287))
* NCS-21 Write Scripts to automate certain tasks ([#15](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/15)) ([8051871](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/80518719d536a087d339fe02530825dc07f8b388))
* NCS-25 Add linters to keep quality up ([#27](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/27)) ([fd4e67d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/fd4e67d4f782a7e955822d90cb909d0a81676fb2))
* NCS-37 Quarkus integration ([#35](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/35)) ([f088c4e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/f088c4e9ffcc498d3d1b6f01e8f50042d5830d55))
* NCS-40 Rework Draw System ([#34](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/34)) ([33e785d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/33e785d22af87724839b62ae91dfe74a05b398c3))
* NCS-41 Bot Platform ([#33](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/33)) ([8744bee](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/8744bee2dd20966dae90a09c21a43d5b06f59e00))
* NCS-53 changed IO to MicroService for easier scaling ([#37](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/37)) ([b5a2966](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b5a2966adafa9650f0f7d601bdeb8fdd13710327))
* NCS-6 Implementing FEN & PGN ([#7](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/7)) ([f28e69d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/f28e69dc181416aa2f221fdc4b45c2cda5efbf07))
* NCS-78 Add Traceability to the Applications ([#48](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/48)) ([c96a09b](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c96a09bb5cee59fc23205bb63baa8b217a7e1b00))
* NCS-9 En passant implementation ([#8](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/8)) ([919beb3](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/919beb3b4bfa8caf2f90976a415fe9b19b7e9747))
* **rule:** Rules as a microservice ([#39](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/39)) ([093134d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/093134d36c6844ba02a36a28d5d044f09291cd1d))
* true-microservices ([#40](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/40)) ([5909242](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/590924254e8a2754de661a57a03e43f89ceb6299))
* update application.yml with new API root paths and add Micrometer and OpenTelemetry dependencies ([72ce262](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/72ce262bc491f94297700e6002fb5d0812e2cc2a))
* wire check/checkmate/stalemate into processMove and gameLoop ([5264a22](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5264a225418b885c5e6ea6411b96f85e38837f6c))
### Bug Fixes
* add missing kings to gameLoop capture test board ([aedd787](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/aedd787b77203c2af934751dba7b784eaf165032))
* **auth:** change InternalAuthFilter to use @Singleton and add HTTP tests for secret validation ([c08d530](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c08d5303eb9e70d36c8eebf6a061ccb71e118fe5))
* **auth:** update InternalAuthFilter to use @ApplicationScoped and add index-dependency configuration ([6e0fd95](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/6e0fd9523e001756ce7109e639ebb54be4fcdabf))
* correct test board positions and captureOutput/withInput interaction ([f0481e2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/f0481e2561b779df00925b46ee281dc36a795150))
* **heartbeat:** inject ObjectMapper into InstanceHeartbeatService ([#42](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/42)) ([0c98151](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/0c981517da1f94cd10ae396e47bde2b35d0b3ba0))
* IO microservice ([#38](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/38)) ([a386f57](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a386f57c21d34ead6cc6f92836c52b714597e289))
* Lints ([dc224ab](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/dc224abe26acf5361c56956006e1cc51b75b0b7e))
* NCS-84 More Verbose Logging ([#51](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/51)) ([4ad92ab](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4ad92ab23698267f8faa59c4e18388d4a0042cca))
* NCS-85 Database Writeback fails without Logs ([#52](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/52)) ([7323908](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/73239088d985f01aa6b1067ed9097a845e471d4f))
* **redis:** add max pool wait time and switch to ReactiveRedisDataSource for heartbeat updates ([33e5017](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/33e5017f51a998327b180f778f73964cc10c05d3))
* **redis:** enhance GameRedisSubscriberManager to use ReactiveRedisDataSource and improve subscription handling ([0eb752d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/0eb752d4935377f75aab710b7f4eda4b29098e6a))
* **redis:** prevent concurrent Redis heartbeat refreshes using AtomicBoolean ([847b132](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/847b13202cb909d18ca3304c27ebe17ce2312b8e))
* **redis:** simplify refreshRedisHeartbeat logic and ensure proper error handling ([1813ea1](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1813ea1d2d5d093f7925f87371b5e29820bf1136))
* **redis:** update Redis configuration with max pool size and waiting parameters ([5baf6a7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5baf6a7cdbea484fc49c02e2b5a1c3919b7fa2c4))
* remove unused HTTP root-path configurations from application.yml ([3ed3e59](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3ed3e59ee456d54cd3d65ece4f36623e256b9736))
* update documentation to reflect new functions in CoordinatorGrpcServer and InstanceRegistry ([f7ce4df](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/f7ce4df595cbdc2ef84122781f4851ff140c0f44))
* update main class path in build configuration and adjust VCS directory mapping ([7b1f8b1](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/7b1f8b117623d327232a1a92a8a44d18582e0189))
* update move validation to check for king safety ([#13](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/13)) ([e5e20c5](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/e5e20c566e368b12ca1dc59680c34e9112bf6762))
### Reverts
* Revert "feat: add authentication permissions for metrics endpoints in application.yml" ([a298417](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a298417b9e4d68dc73bbf40be63d9484536e9f83))
* Revert "refactor: update metrics paths formatting in application.yml for clarity" ([3870566](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/38705663498d5f47c40dafe2f26198589ede8656))
## (2026-05-17)
### Features
* add authentication permissions for metrics endpoints in application.yml ([04edd4d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/04edd4d6fd8a63196c36f6d67992832febc9bebb))
* add CORS configuration and reorder JWT settings in application.yml ([a49f9be](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a49f9be146f04c14561c305d980846a92f8c12b2))
* add GameRules stub with PositionStatus enum ([76d4168](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/76d4168038de23e5d6083d4e8f0504fbf31d15a3))
* add initialization metrics for various services ([d438e97](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d438e97f32bdde0bfc63c1b4a8cc810cdd093166))
* add MovedInCheck/Checkmate/Stalemate MoveResult variants (stub dispatch) ([8b7ec57](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/8b7ec57e5ea6ee1615a1883848a426dc07d26364))
* add OpenTelemetry trace configuration with parentbased sampler ([3904d5a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3904d5ad8ad4930ddee65287a7bfab785a6148f5))
* **config:** update application.yml for PostgreSQL and remove staging/production configurations ([2404e61](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2404e6164c3b50ffccbea5238d636060d6abe4d6))
* **config:** update application.yml for staging and production environments ([6113432](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/6113432a14c476a3a0dfc0d449e17d023697f2ba))
* configure logging and add OpenTelemetry support ([#49](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/49)) ([d57c488](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d57c4886612d1d92da0e1b79209fc83e6ef537a1))
* implement clock expiry scanning and handling for game records ([#53](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/53)) ([8f9eb12](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/8f9eb12f663efabe4dc72b94394438652ad0ef02))
* implement GameRules with isInCheck, legalMoves, gameStatus ([94a02ff](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/94a02ff6849436d9496c70a0f16c21666dae8e4e))
* implement legal castling ([#1](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/1)) ([00d326c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/00d326c1ba67711fbe180f04e1100c3f01dd0254))
* implement periodic scaling checks and enhance instance management in AutoScaler ([3f12f69](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f12f695f132b92f634d98df2c037292498b6e86))
* **logging:** add DEBUG/INFO/WARN logging across services (NCS-72) ([#41](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/41)) ([804a4bf](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/804a4bf179e3dfb19e2be4390e7e543caf5237c6))
* NCS-10 Implement Pawn Promotion ([#12](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/12)) ([13bfc16](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/13bfc16cfe25db78ec607db523ca6d993c13430c))
* NCS-11 50-move rule ([#9](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/9)) ([412ed98](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/412ed986a95703a3b282276540153480ceed229d))
* NCS-13 Implement Threefold Repetition ([#31](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/31)) ([767d305](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/767d3051a76c266050b6335774d66e2db2273c16))
* NCS-14 implemented insufficient moves rule ([#30](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/30)) ([b0399a4](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b0399a4e489950083066c9538df9a84dcc7a4613))
* NCS-16 Core Separation via Patterns ([#10](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/10)) ([1361dfc](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1361dfc89553b146864fb8ff3526cf12cf3f293a))
* NCS-17 Implement basic ScalaFX UI ([#14](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/14)) ([3ff8031](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3ff80318b4f16c59733a46498581a5c27f048287))
* NCS-21 Write Scripts to automate certain tasks ([#15](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/15)) ([8051871](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/80518719d536a087d339fe02530825dc07f8b388))
* NCS-25 Add linters to keep quality up ([#27](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/27)) ([fd4e67d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/fd4e67d4f782a7e955822d90cb909d0a81676fb2))
* NCS-37 Quarkus integration ([#35](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/35)) ([f088c4e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/f088c4e9ffcc498d3d1b6f01e8f50042d5830d55))
* NCS-40 Rework Draw System ([#34](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/34)) ([33e785d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/33e785d22af87724839b62ae91dfe74a05b398c3))
* NCS-41 Bot Platform ([#33](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/33)) ([8744bee](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/8744bee2dd20966dae90a09c21a43d5b06f59e00))
* NCS-53 changed IO to MicroService for easier scaling ([#37](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/37)) ([b5a2966](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b5a2966adafa9650f0f7d601bdeb8fdd13710327))
* NCS-6 Implementing FEN & PGN ([#7](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/7)) ([f28e69d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/f28e69dc181416aa2f221fdc4b45c2cda5efbf07))
* NCS-78 Add Traceability to the Applications ([#48](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/48)) ([c96a09b](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c96a09bb5cee59fc23205bb63baa8b217a7e1b00))
* NCS-9 En passant implementation ([#8](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/8)) ([919beb3](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/919beb3b4bfa8caf2f90976a415fe9b19b7e9747))
* **rule:** Rules as a microservice ([#39](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/39)) ([093134d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/093134d36c6844ba02a36a28d5d044f09291cd1d))
* true-microservices ([#40](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/40)) ([5909242](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/590924254e8a2754de661a57a03e43f89ceb6299))
* update application.yml with new API root paths and add Micrometer and OpenTelemetry dependencies ([72ce262](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/72ce262bc491f94297700e6002fb5d0812e2cc2a))
* wire check/checkmate/stalemate into processMove and gameLoop ([5264a22](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5264a225418b885c5e6ea6411b96f85e38837f6c))
### Bug Fixes
* add missing kings to gameLoop capture test board ([aedd787](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/aedd787b77203c2af934751dba7b784eaf165032))
* **auth:** change InternalAuthFilter to use @Singleton and add HTTP tests for secret validation ([c08d530](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c08d5303eb9e70d36c8eebf6a061ccb71e118fe5))
* **auth:** update InternalAuthFilter to use @ApplicationScoped and add index-dependency configuration ([6e0fd95](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/6e0fd9523e001756ce7109e639ebb54be4fcdabf))
* correct test board positions and captureOutput/withInput interaction ([f0481e2](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/f0481e2561b779df00925b46ee281dc36a795150))
* **heartbeat:** inject ObjectMapper into InstanceHeartbeatService ([#42](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/42)) ([0c98151](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/0c981517da1f94cd10ae396e47bde2b35d0b3ba0))
* IO microservice ([#38](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/38)) ([a386f57](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a386f57c21d34ead6cc6f92836c52b714597e289))
* Lints ([dc224ab](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/dc224abe26acf5361c56956006e1cc51b75b0b7e))
* NCS-84 More Verbose Logging ([#51](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/51)) ([4ad92ab](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4ad92ab23698267f8faa59c4e18388d4a0042cca))
* NCS-85 Database Writeback fails without Logs ([#52](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/52)) ([7323908](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/73239088d985f01aa6b1067ed9097a845e471d4f))
* **redis:** add max pool wait time and switch to ReactiveRedisDataSource for heartbeat updates ([33e5017](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/33e5017f51a998327b180f778f73964cc10c05d3))
* **redis:** enhance GameRedisSubscriberManager to use ReactiveRedisDataSource and improve subscription handling ([0eb752d](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/0eb752d4935377f75aab710b7f4eda4b29098e6a))
* **redis:** prevent concurrent Redis heartbeat refreshes using AtomicBoolean ([847b132](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/847b13202cb909d18ca3304c27ebe17ce2312b8e))
* **redis:** simplify refreshRedisHeartbeat logic and ensure proper error handling ([1813ea1](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1813ea1d2d5d093f7925f87371b5e29820bf1136))
* **redis:** update Redis configuration with max pool size and waiting parameters ([5baf6a7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5baf6a7cdbea484fc49c02e2b5a1c3919b7fa2c4))
* remove unused HTTP root-path configurations from application.yml ([3ed3e59](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3ed3e59ee456d54cd3d65ece4f36623e256b9736))
* resolve 6 coordinator bugs (cache eviction, rebalance race, pod matching, lookup inefficiency) ([5619c82](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5619c8223ad7091706909eda8c907a29d215fd30))
* update documentation to reflect new functions in CoordinatorGrpcServer and InstanceRegistry ([f7ce4df](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/f7ce4df595cbdc2ef84122781f4851ff140c0f44))
* update main class path in build configuration and adjust VCS directory mapping ([7b1f8b1](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/7b1f8b117623d327232a1a92a8a44d18582e0189))
* update move validation to check for king safety ([#13](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/13)) ([e5e20c5](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/e5e20c566e368b12ca1dc59680c34e9112bf6762))
### Reverts
* Revert "feat: add authentication permissions for metrics endpoints in application.yml" ([a298417](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a298417b9e4d68dc73bbf40be63d9484536e9f83))
* Revert "refactor: update metrics paths formatting in application.yml for clarity" ([3870566](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/38705663498d5f47c40dafe2f26198589ede8656))
@@ -1,13 +1,12 @@
package de.nowchess.chess.redis package de.nowchess.chess.redis
import com.fasterxml.jackson.databind.ObjectMapper import com.fasterxml.jackson.databind.ObjectMapper
import de.nowchess.api.dto.{GameStateEventDto, GameWritebackEventDto} import de.nowchess.api.dto.{GameStateDto, GameStateEventDto, GameWritebackEventDto}
import de.nowchess.api.game.{CorrespondenceClockState, LiveClockState} import de.nowchess.api.game.{CorrespondenceClockState, DrawReason, GameResult, LiveClockState, TimeControl, WinReason}
import de.nowchess.chess.grpc.IoGrpcClientWrapper
import de.nowchess.api.game.{DrawReason, GameResult, WinReason}
import de.nowchess.api.board.Color import de.nowchess.api.board.Color
import de.nowchess.chess.grpc.IoGrpcClientWrapper
import de.nowchess.chess.observer.{GameEvent, Observer} import de.nowchess.chess.observer.{GameEvent, Observer}
import de.nowchess.chess.registry.GameRegistry import de.nowchess.chess.registry.{GameEntry, GameRegistry}
import de.nowchess.chess.resource.GameDtoMapper import de.nowchess.chess.resource.GameDtoMapper
import io.quarkus.redis.datasource.RedisDataSource import io.quarkus.redis.datasource.RedisDataSource
import org.jboss.logging.Logger import org.jboss.logging.Logger
@@ -26,61 +25,69 @@ class GameRedisPublisher(
onGameOver: String => Unit, onGameOver: String => Unit,
) extends Observer: ) extends Observer:
def emitInitialWriteback(): Unit =
try
registry.get(gameId).foreach { entry =>
val dto = GameDtoMapper.toGameStateDto(entry, ioClient)
writebackEmit(objectMapper.writeValueAsString(buildWriteback(entry, dto)))
}
catch case ex: Exception => GameRedisPublisher.log.warnf(ex, "Failed to emit initial writeback for game %s", gameId)
def onGameEvent(event: GameEvent): Unit = def onGameEvent(event: GameEvent): Unit =
try try
GameRedisPublisher.log.debugf("Publishing game event for game %s", gameId) GameRedisPublisher.log.debugf("Publishing game event for game %s", gameId)
registry.get(gameId).foreach { entry => registry.get(gameId).foreach { entry =>
val dto = GameDtoMapper.toGameStateDto(entry, ioClient) val dto = GameDtoMapper.toGameStateDto(entry, ioClient)
val json = objectMapper.writeValueAsString(GameStateEventDto(dto)) redis.pubsub(classOf[String]).publish(s2cTopicName, objectMapper.writeValueAsString(GameStateEventDto(dto)))
redis.pubsub(classOf[String]).publish(s2cTopicName, json) writebackEmit(objectMapper.writeValueAsString(buildWriteback(entry, dto)))
val clock = entry.engine.currentClockState
val wb = GameWritebackEventDto(
gameId = gameId,
fen = dto.fen,
pgn = dto.pgn,
moveCount = entry.engine.context.moves.size,
whiteId = entry.white.id.value,
whiteName = entry.white.displayName,
blackId = entry.black.id.value,
blackName = entry.black.displayName,
mode = entry.mode.toString,
resigned = entry.resigned,
limitSeconds = entry.engine.timeControl match {
case de.nowchess.api.game.TimeControl.Clock(l, _) => Some(l); case _ => None
},
incrementSeconds = entry.engine.timeControl match {
case de.nowchess.api.game.TimeControl.Clock(_, i) => Some(i); case _ => None
},
daysPerMove = entry.engine.timeControl match {
case de.nowchess.api.game.TimeControl.Correspondence(d) => Some(d); case _ => None
},
whiteRemainingMs = clock.collect { case c: LiveClockState => c.whiteRemainingMs },
blackRemainingMs = clock.collect { case c: LiveClockState => c.blackRemainingMs },
incrementMs = clock.collect { case c: LiveClockState => c.incrementMs },
clockLastTickAt = clock.collect { case c: LiveClockState => c.lastTickAt.toEpochMilli },
clockMoveDeadline = clock.collect { case c: CorrespondenceClockState => c.moveDeadline.toEpochMilli },
clockActiveColor = clock.map(_.activeColor.label.toLowerCase),
pendingDrawOffer = entry.engine.pendingDrawOfferBy.map(_.label.toLowerCase),
result = entry.engine.context.result.map {
case GameResult.Win(Color.White, _) => "white"
case GameResult.Win(Color.Black, _) => "black"
case GameResult.Draw(_) => "draw"
},
terminationReason = entry.engine.context.result.map {
case GameResult.Win(_, WinReason.Checkmate) => "checkmate"
case GameResult.Win(_, WinReason.Resignation) => "resignation"
case GameResult.Win(_, WinReason.TimeControl) => "timeout"
case GameResult.Draw(DrawReason.Stalemate) => "stalemate"
case GameResult.Draw(DrawReason.InsufficientMaterial) => "insufficient_material"
case GameResult.Draw(DrawReason.FiftyMoveRule) => "fifty_move"
case GameResult.Draw(DrawReason.ThreefoldRepetition) => "repetition"
case GameResult.Draw(DrawReason.Agreement) => "agreement"
},
redoStack = entry.engine.redoStackMoves.map(GameDtoMapper.moveToUci),
pendingTakebackRequest = entry.engine.pendingTakebackRequestBy.map(_.label.toLowerCase),
)
writebackEmit(objectMapper.writeValueAsString(wb))
if entry.engine.context.result.isDefined then onGameOver(gameId) if entry.engine.context.result.isDefined then onGameOver(gameId)
} }
catch case ex: Exception => GameRedisPublisher.log.warnf(ex, "Failed to publish game event for game %s", gameId) catch case ex: Exception => GameRedisPublisher.log.warnf(ex, "Failed to publish game event for game %s", gameId)
private def buildWriteback(entry: GameEntry, dto: GameStateDto): GameWritebackEventDto =
val clock = entry.engine.currentClockState
GameWritebackEventDto(
gameId = gameId,
fen = dto.fen,
pgn = dto.pgn,
moveCount = entry.engine.context.moves.size,
whiteId = entry.white.id.value,
whiteName = entry.white.displayName,
blackId = entry.black.id.value,
blackName = entry.black.displayName,
mode = entry.mode.toString,
resigned = entry.resigned,
limitSeconds = entry.engine.timeControl match {
case TimeControl.Clock(l, _) => Some(l); case _ => None
},
incrementSeconds = entry.engine.timeControl match {
case TimeControl.Clock(_, i) => Some(i); case _ => None
},
daysPerMove = entry.engine.timeControl match {
case TimeControl.Correspondence(d) => Some(d); case _ => None
},
whiteRemainingMs = clock.collect { case c: LiveClockState => c.whiteRemainingMs },
blackRemainingMs = clock.collect { case c: LiveClockState => c.blackRemainingMs },
incrementMs = clock.collect { case c: LiveClockState => c.incrementMs },
clockLastTickAt = clock.collect { case c: LiveClockState => c.lastTickAt.toEpochMilli },
clockMoveDeadline = clock.collect { case c: CorrespondenceClockState => c.moveDeadline.toEpochMilli },
clockActiveColor = clock.map(_.activeColor.label.toLowerCase),
pendingDrawOffer = entry.engine.pendingDrawOfferBy.map(_.label.toLowerCase),
result = entry.engine.context.result.map {
case GameResult.Win(Color.White, _) => "white"
case GameResult.Win(Color.Black, _) => "black"
case GameResult.Draw(_) => "draw"
},
terminationReason = entry.engine.context.result.map {
case GameResult.Win(_, WinReason.Checkmate) => "checkmate"
case GameResult.Win(_, WinReason.Resignation) => "resignation"
case GameResult.Win(_, WinReason.TimeControl) => "timeout"
case GameResult.Draw(DrawReason.Stalemate) => "stalemate"
case GameResult.Draw(DrawReason.InsufficientMaterial) => "insufficient_material"
case GameResult.Draw(DrawReason.FiftyMoveRule) => "fifty_move"
case GameResult.Draw(DrawReason.ThreefoldRepetition) => "repetition"
case GameResult.Draw(DrawReason.Agreement) => "agreement"
},
redoStack = entry.engine.redoStackMoves.map(GameDtoMapper.moveToUci),
pendingTakebackRequest = entry.engine.pendingTakebackRequestBy.map(_.label.toLowerCase),
)
@@ -13,6 +13,7 @@ import de.nowchess.chess.service.InstanceHeartbeatService
import io.quarkus.redis.datasource.ReactiveRedisDataSource import io.quarkus.redis.datasource.ReactiveRedisDataSource
import io.quarkus.redis.datasource.RedisDataSource import io.quarkus.redis.datasource.RedisDataSource
import io.quarkus.redis.datasource.pubsub.ReactivePubSubCommands import io.quarkus.redis.datasource.pubsub.ReactivePubSubCommands
import jakarta.annotation.PostConstruct
import jakarta.annotation.PreDestroy import jakarta.annotation.PreDestroy
import jakarta.enterprise.context.ApplicationScoped import jakarta.enterprise.context.ApplicationScoped
import jakarta.enterprise.inject.Instance import jakarta.enterprise.inject.Instance
@@ -45,6 +46,30 @@ class GameRedisSubscriberManager:
private val c2sListeners = new ConcurrentHashMap[String, ReactivePubSubCommands.ReactiveRedisSubscriber]() private val c2sListeners = new ConcurrentHashMap[String, ReactivePubSubCommands.ReactiveRedisSubscriber]()
private val s2cObservers = new ConcurrentHashMap[String, Observer]() private val s2cObservers = new ConcurrentHashMap[String, Observer]()
// scalafix:off DisableSyntax.var
private var clockExpireSubscriber: Option[ReactivePubSubCommands.ReactiveRedisSubscriber] = None
// scalafix:on DisableSyntax.var
private def clockExpireChannel: String = s"${redisConfig.prefix}:game:clock:expire"
@PostConstruct
def subscribeClockExpiry(): Unit =
val handler: Consumer[String] = gameId => handleClockExpiry(gameId)
try
val subscriber = reactiveRedis
.pubsub(classOf[String])
.subscribe(clockExpireChannel, handler)
.await()
.atMost(java.time.Duration.ofSeconds(5))
clockExpireSubscriber = Some(subscriber)
log.infof("Subscribed to clock expiry channel %s", clockExpireChannel)
catch case ex: Exception => log.warnf(ex, "Failed to subscribe to clock expiry channel")
private def handleClockExpiry(gameId: String): Unit =
if !s2cObservers.containsKey(gameId) then
log.infof("Clock expired for game %s — loading engine to enforce timeout", gameId)
subscribeGame(gameId)
private def c2sTopic(gameId: String): String = private def c2sTopic(gameId: String): String =
s"${redisConfig.prefix}:game:$gameId:c2s" s"${redisConfig.prefix}:game:$gameId:c2s"
@@ -65,6 +90,7 @@ class GameRedisSubscriberManager:
) )
s2cObservers.put(gameId, obs) s2cObservers.put(gameId, obs)
registry.get(gameId).foreach(_.engine.subscribe(obs)) registry.get(gameId).foreach(_.engine.subscribe(obs))
obs.emitInitialWriteback()
heartbeatServiceOpt.foreach(_.addGameSubscription(gameId)) heartbeatServiceOpt.foreach(_.addGameSubscription(gameId))
val handler: Consumer[String] = msg => handleC2sMessage(gameId, msg) val handler: Consumer[String] = msg => handleC2sMessage(gameId, msg)
@@ -156,5 +182,6 @@ class GameRedisSubscriberManager:
@PreDestroy @PreDestroy
def cleanup(): Unit = def cleanup(): Unit =
clockExpireSubscriber.foreach(_.unsubscribe(clockExpireChannel).await().indefinitely())
c2sListeners.forEach((gameId, subscriber) => subscriber.unsubscribe(c2sTopic(gameId)).await().indefinitely()) c2sListeners.forEach((gameId, subscriber) => subscriber.unsubscribe(c2sTopic(gameId)).await().indefinitely())
s2cObservers.forEach((gameId, obs) => registry.get(gameId).foreach(_.engine.unsubscribe(obs))) s2cObservers.forEach((gameId, obs) => registry.get(gameId).foreach(_.engine.unsubscribe(obs)))
@@ -22,4 +22,5 @@ case class GameCacheDto(
pendingDrawOffer: Option[String], pendingDrawOffer: Option[String],
redoStack: List[String] = Nil, redoStack: List[String] = Nil,
pendingTakebackRequest: Option[String] = None, pendingTakebackRequest: Option[String] = None,
lastUpdatedMs: Long = System.currentTimeMillis(),
) )
@@ -143,6 +143,7 @@ class RedisGameRegistry extends GameRegistry:
clockMoveDeadline = Option(record.clockMoveDeadline).map(_.longValue), clockMoveDeadline = Option(record.clockMoveDeadline).map(_.longValue),
clockActiveColor = Option(record.clockActiveColor), clockActiveColor = Option(record.clockActiveColor),
pendingDrawOffer = Option(record.pendingDrawOffer), pendingDrawOffer = Option(record.pendingDrawOffer),
lastUpdatedMs = System.currentTimeMillis(),
) )
(dto, reconstruct(dto)) (dto, reconstruct(dto))
} match } match
+1 -1
View File
@@ -1,3 +1,3 @@
MAJOR=0 MAJOR=0
MINOR=41 MINOR=43
PATCH=0 PATCH=0
+49
View File
@@ -232,3 +232,52 @@
* NCS-85 Database Writeback fails without Logs ([#52](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/52)) ([7323908](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/73239088d985f01aa6b1067ed9097a845e471d4f)) * NCS-85 Database Writeback fails without Logs ([#52](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/52)) ([7323908](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/73239088d985f01aa6b1067ed9097a845e471d4f))
* **redis:** update Redis configuration with max pool size and waiting parameters ([5baf6a7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5baf6a7cdbea484fc49c02e2b5a1c3919b7fa2c4)) * **redis:** update Redis configuration with max pool size and waiting parameters ([5baf6a7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5baf6a7cdbea484fc49c02e2b5a1c3919b7fa2c4))
* remove unused HTTP root-path configurations from application.yml ([3ed3e59](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3ed3e59ee456d54cd3d65ece4f36623e256b9736)) * remove unused HTTP root-path configurations from application.yml ([3ed3e59](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3ed3e59ee456d54cd3d65ece4f36623e256b9736))
## (2026-05-16)
### Features
* add initialization metrics for various services ([d438e97](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d438e97f32bdde0bfc63c1b4a8cc810cdd093166))
* add OpenTelemetry trace configuration with parentbased sampler ([3904d5a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3904d5ad8ad4930ddee65287a7bfab785a6148f5))
* **config:** update application.yml for PostgreSQL and remove staging/production configurations ([2404e61](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2404e6164c3b50ffccbea5238d636060d6abe4d6))
* **config:** update application.yml for staging and production environments ([6113432](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/6113432a14c476a3a0dfc0d449e17d023697f2ba))
* **config:** update application.yml to nest HTTP port configuration ([3efebd5](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3efebd5ed0493159c51f7246d18d59bac58cf875))
* configure logging and add OpenTelemetry support ([#49](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/49)) ([d57c488](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d57c4886612d1d92da0e1b79209fc83e6ef537a1))
* **docker:** add .dockerignore and .gitignore files for build exclusions ([c987d8e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c987d8e258c0e6c4cfbdaa8381c64c410d7a2b83))
* **docker:** add Dockerfiles for building Quarkus application in native and JVM modes ([3f2d2bb](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f2d2bb4c97fa8cddba66e1da4427c54236dfeed))
* **docker:** add Dockerfiles for Quarkus application in JVM and native modes ([34b9933](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/34b993304670cf2aa62cd2f6460cee7b9864b08e))
* implement clock expiry scanning and handling for game records ([#53](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/53)) ([8f9eb12](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/8f9eb12f663efabe4dc72b94394438652ad0ef02))
* NCS-78 Add Traceability to the Applications ([#46](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/46)) ([649566e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/649566eb3fcf38f91c8896a739f74ea318af312d))
* NCS-78 Add Traceability to the Applications ([#47](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/47)) ([87dfc6c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/87dfc6c2bcce7f7d58fc641bd8d468a2e584c108))
* true-microservices ([#40](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/40)) ([5909242](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/590924254e8a2754de661a57a03e43f89ceb6299))
* update application.yml with new API root paths and add Micrometer and OpenTelemetry dependencies ([72ce262](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/72ce262bc491f94297700e6002fb5d0812e2cc2a))
### Bug Fixes
* NCS-85 Database Writeback fails without Logs ([#52](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/52)) ([7323908](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/73239088d985f01aa6b1067ed9097a845e471d4f))
* **redis:** update Redis configuration with max pool size and waiting parameters ([5baf6a7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5baf6a7cdbea484fc49c02e2b5a1c3919b7fa2c4))
* remove unused HTTP root-path configurations from application.yml ([3ed3e59](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3ed3e59ee456d54cd3d65ece4f36623e256b9736))
## (2026-05-17)
### Features
* add initialization metrics for various services ([d438e97](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d438e97f32bdde0bfc63c1b4a8cc810cdd093166))
* add OpenTelemetry trace configuration with parentbased sampler ([3904d5a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3904d5ad8ad4930ddee65287a7bfab785a6148f5))
* **config:** update application.yml for PostgreSQL and remove staging/production configurations ([2404e61](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2404e6164c3b50ffccbea5238d636060d6abe4d6))
* **config:** update application.yml for staging and production environments ([6113432](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/6113432a14c476a3a0dfc0d449e17d023697f2ba))
* **config:** update application.yml to nest HTTP port configuration ([3efebd5](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3efebd5ed0493159c51f7246d18d59bac58cf875))
* configure logging and add OpenTelemetry support ([#49](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/49)) ([d57c488](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d57c4886612d1d92da0e1b79209fc83e6ef537a1))
* **docker:** add .dockerignore and .gitignore files for build exclusions ([c987d8e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c987d8e258c0e6c4cfbdaa8381c64c410d7a2b83))
* **docker:** add Dockerfiles for building Quarkus application in native and JVM modes ([3f2d2bb](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f2d2bb4c97fa8cddba66e1da4427c54236dfeed))
* **docker:** add Dockerfiles for Quarkus application in JVM and native modes ([34b9933](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/34b993304670cf2aa62cd2f6460cee7b9864b08e))
* implement clock expiry scanning and handling for game records ([#53](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/53)) ([8f9eb12](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/8f9eb12f663efabe4dc72b94394438652ad0ef02))
* NCS-78 Add Traceability to the Applications ([#46](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/46)) ([649566e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/649566eb3fcf38f91c8896a739f74ea318af312d))
* NCS-78 Add Traceability to the Applications ([#47](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/47)) ([87dfc6c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/87dfc6c2bcce7f7d58fc641bd8d468a2e584c108))
* true-microservices ([#40](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/40)) ([5909242](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/590924254e8a2754de661a57a03e43f89ceb6299))
* update application.yml with new API root paths and add Micrometer and OpenTelemetry dependencies ([72ce262](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/72ce262bc491f94297700e6002fb5d0812e2cc2a))
### Bug Fixes
* ensure full hierarchy registration for reflection in NativeReflectionConfig ([ebba729](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/ebba729af3265df1619dfbe46fd1945b2a7e30b7))
* NCS-85 Database Writeback fails without Logs ([#52](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/52)) ([7323908](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/73239088d985f01aa6b1067ed9097a845e471d4f))
* **redis:** update Redis configuration with max pool size and waiting parameters ([5baf6a7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5baf6a7cdbea484fc49c02e2b5a1c3919b7fa2c4))
* remove unused HTTP root-path configurations from application.yml ([3ed3e59](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3ed3e59ee456d54cd3d65ece4f36623e256b9736))
+1
View File
@@ -60,6 +60,7 @@ dependencies {
implementation("io.quarkus:quarkus-opentelemetry") implementation("io.quarkus:quarkus-opentelemetry")
implementation("com.fasterxml.jackson.module:jackson-module-scala_3:${versions["JACKSON_SCALA"]!!}") implementation("com.fasterxml.jackson.module:jackson-module-scala_3:${versions["JACKSON_SCALA"]!!}")
implementation("io.quarkus:quarkus-redis-client") implementation("io.quarkus:quarkus-redis-client")
implementation("io.quarkus:quarkus-scheduler")
testImplementation(platform("org.junit:junit-bom:5.13.4")) testImplementation(platform("org.junit:junit-bom:5.13.4"))
testImplementation("org.junit.jupiter:junit-jupiter") testImplementation("org.junit.jupiter:junit-jupiter")
@@ -9,5 +9,6 @@ import io.quarkus.runtime.annotations.RegisterForReflection
classOf[GameRecord], classOf[GameRecord],
classOf[GameWritebackEventDto], classOf[GameWritebackEventDto],
), ),
registerFullHierarchy = true,
) )
class NativeReflectionConfig class NativeReflectionConfig
@@ -34,6 +34,19 @@ class GameRecordRepository:
.asScala .asScala
.toList .toList
def findExpiredLiveClockGames(nowMs: Long): List[GameRecord] =
em.createQuery(
"SELECT g FROM GameRecord g WHERE g.result IS NULL AND g.clockLastTickAt IS NOT NULL AND g.whiteRemainingMs IS NOT NULL",
classOf[GameRecord],
).getResultList
.asScala
.toList
.filter { g =>
val remaining =
if g.clockActiveColor == "white" then g.whiteRemainingMs.longValue else g.blackRemainingMs.longValue
g.clockLastTickAt.longValue + remaining < nowMs
}
def findByPlayerIdRunning(playerId: String, offset: Int, limit: Int): List[GameRecord] = def findByPlayerIdRunning(playerId: String, offset: Int, limit: Int): List[GameRecord] =
em.createQuery( em.createQuery(
"SELECT g FROM GameRecord g WHERE g.whiteId = :id OR g.blackId = :id AND g.result = null ORDER BY g.updatedAt DESC", "SELECT g FROM GameRecord g WHERE g.whiteId = :id OR g.blackId = :id AND g.result = null ORDER BY g.updatedAt DESC",
@@ -0,0 +1,36 @@
package de.nowchess.store.service
import de.nowchess.store.config.RedisConfig
import de.nowchess.store.repository.GameRecordRepository
import io.quarkus.redis.datasource.RedisDataSource
import io.quarkus.scheduler.Scheduled
import jakarta.enterprise.context.ApplicationScoped
import jakarta.inject.Inject
import org.jboss.logging.Logger
import scala.compiletime.uninitialized
@ApplicationScoped
class ClockExpiryScanner:
@Inject
// scalafix:off DisableSyntax.var
var repository: GameRecordRepository = uninitialized
@Inject var redis: RedisDataSource = uninitialized
@Inject var redisConfig: RedisConfig = uninitialized
// scalafix:on
private val log = Logger.getLogger(classOf[ClockExpiryScanner])
private def clockExpireChannel: String = s"${redisConfig.prefix}:game:clock:expire"
@Scheduled(every = "30s")
def scan(): Unit =
try
val nowMs = System.currentTimeMillis()
val expired = repository.findExpiredLiveClockGames(nowMs)
if expired.nonEmpty then
log.infof("Found %d games with expired clocks", expired.size)
expired.foreach { record =>
log.infof("Publishing clock expiry for game %s", record.gameId)
redis.pubsub(classOf[String]).publish(clockExpireChannel, record.gameId)
}
catch case ex: Exception => log.warnf(ex, "Clock expiry scan failed")
+1 -1
View File
@@ -1,3 +1,3 @@
MAJOR=0 MAJOR=0
MINOR=15 MINOR=17
PATCH=0 PATCH=0