ci: bump version with Build-132

fix(official-bots): make botToken optional, fall back to env, fix 502 status
botToken in JoinTournamentRequest is now Option[String]. When it is absent, the service resolves it from the TOURNAMENT_BOT_TOKEN env var, so official-bot join requests no longer need a token in the body. The response status on join failure changed from BAD_GATEWAY (502) to BAD_REQUEST (400).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
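
The token fallback described in this commit message could look roughly like the sketch below. It is an illustration under stated assumptions, not the service's actual code: only `JoinTournamentRequest`, `botToken`, and `TOURNAMENT_BOT_TOKEN` come from the message, while the `tournamentId` field, the `BotTokenResolver` object, `resolveBotToken`, and the injectable `env` lookup are hypothetical names introduced here.

```scala
// Hypothetical sketch: only botToken and TOURNAMENT_BOT_TOKEN are taken from the commit message.
final case class JoinTournamentRequest(tournamentId: String, botToken: Option[String] = None)

object BotTokenResolver:
  val EnvVar = "TOURNAMENT_BOT_TOKEN"

  /** Prefers the token sent in the request body and falls back to the environment otherwise.
    * Blank values are treated as absent (an assumption of this sketch).
    * `env` defaults to the process environment and can be replaced in tests.
    */
  def resolveBotToken(
      request: JoinTournamentRequest,
      env: String => Option[String] = sys.env.get,
  ): Option[String] =
    request.botToken
      .filter(_.trim.nonEmpty)
      .orElse(env(EnvVar).filter(_.trim.nonEmpty))
```

Passing the environment lookup as a parameter keeps the fallback testable without mutating the process environment. A `None` result is the case in which the join cannot proceed.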
2026-06-21 14:10:10 +00:00 · 2026-06-21 15:40:09 +02:00 · 2026-06-21 15:36:07 +02:00 · 2026-06-21 13:28:40 +00:00 · 2026-06-21 15:03:07 +02:00
19 changed files with 731 additions and 62 deletions
@@ -36,3 +36,32 @@
 ### Bug Fixes

 * **analytics:** upgrade Spark to 4.0.3 — 3.5.x has no official Docker image ([46af115](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/46af1154de34a8596cb6cb28c6fad7aba90f597c))
+##  (2026-06-21)
+
+### Features
+
+* **analytics:** add 7 new Spark analytics jobs and extend GameSource ([8e17c14](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/8e17c14dff740cd115011dfbf17de35083b8fe46))
+* **analytics:** add Dockerfile, CI workflow, and stable jar name for K8s deployment ([95215b6](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/95215b6a420fd526df1aa395f9b087556c8ad03b))
+* **analytics:** add PostgreSQL JDBC write-back to all four batch jobs ([0e0ea4c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/0e0ea4c9893c6efed52e633e55d05ab3ed004502))
+* **analytics:** add Spark batch analytics module ([259b3bb](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/259b3bbb24c0f23326269b93f4b3c84012f727cd))
+* **analytics:** add Structured Streaming, MLlib clustering, GraphX jobs ([e1d80b9](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/e1d80b9331666feea191b1fd08aa762f3581c918))
+* **official-bots:** park expert bot on tournament server at startup ([#76](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/76)) ([751a58b](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/751a58b6061f7434115e229a7661894c76768bc2))
+
+### Bug Fixes
+
+* **analytics:** upgrade Spark to 4.0.3 — 3.5.x has no official Docker image ([46af115](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/46af1154de34a8596cb6cb28c6fad7aba90f597c))
+##  (2026-06-21)
+
+### Features
+
+* **analytics:** add 7 new Spark analytics jobs and extend GameSource ([8e17c14](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/8e17c14dff740cd115011dfbf17de35083b8fe46))
+* **analytics:** add Dockerfile, CI workflow, and stable jar name for K8s deployment ([95215b6](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/95215b6a420fd526df1aa395f9b087556c8ad03b))
+* **analytics:** add PostgreSQL JDBC write-back to all four batch jobs ([0e0ea4c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/0e0ea4c9893c6efed52e633e55d05ab3ed004502))
+* **analytics:** add Spark batch analytics module ([259b3bb](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/259b3bbb24c0f23326269b93f4b3c84012f727cd))
+* **analytics:** add Structured Streaming, MLlib clustering, GraphX jobs ([e1d80b9](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/e1d80b9331666feea191b1fd08aa762f3581c918))
+* **analytics:** always write results to PostgreSQL regardless of input source ([da0e6d1](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/da0e6d1ee2d391ecb6291396f82471eb51b1b25e))
+* **official-bots:** park expert bot on tournament server at startup ([#76](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/76)) ([751a58b](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/751a58b6061f7434115e229a7661894c76768bc2))
+
+### Bug Fixes
+
+* **analytics:** upgrade Spark to 4.0.3 — 3.5.x has no official Docker image ([46af115](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/46af1154de34a8596cb6cb28c6fad7aba90f597c))
@@ -0,0 +1,72 @@
+package de.nowchess.analytics
+
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.SparkSession
+import org.apache.spark.sql.functions as F
+import org.apache.spark.sql.types.DataTypes
+import org.apache.spark.sql.types.StructField
+import org.apache.spark.sql.types.StructType
+
+import scala.jdk.CollectionConverters.*
+
+object ColorAdvantageJob:
+
+  def main(args: Array[String]): Unit =
+    val jdbcUrl   = sys.env.getOrElse("NOWCHESS_JDBC_URL", "jdbc:postgresql://localhost:5432/nowchess")
+    val dbUser    = sys.env.getOrElse("NOWCHESS_DB_USER", "nowchess")
+    val dbPass    = sys.env.getOrElse("NOWCHESS_DB_PASS", "nowchess")
+    val outputDir = if args.length > 0 then args(0) else "/tmp/nowchess-color-advantage"
+
+    val spark = SparkSession
+      .builder()
+      .appName("NowChess Color Advantage")
+      .getOrCreate()
+
+    run(spark, jdbcUrl, dbUser, dbPass, outputDir)
+    spark.stop()
+
+  def run(spark: SparkSession, jdbcUrl: String, dbUser: String, dbPass: String, outputDir: String): Unit =
+    val games = GameSource
+      .load(spark, jdbcUrl, dbUser, dbPass)
+      .select("result")
+      .filter(F.col("result").isNotNull)
+
+    val totalGames = games.count()
+    val whiteWins  = games.filter(F.col("result") === "white").count()
+    val blackWins  = games.filter(F.col("result") === "black").count()
+    val draws      = games.filter(F.col("result") === "draw").count()
+
+    val schema = StructType(
+      Seq(
+        StructField("color", DataTypes.StringType, false),
+        StructField("total_games", DataTypes.LongType, false),
+        StructField("wins", DataTypes.LongType, false),
+        StructField("losses", DataTypes.LongType, false),
+        StructField("draws", DataTypes.LongType, false),
+      ),
+    )
+
+    val rows = List(
+      Row("white", totalGames, whiteWins, blackWins, draws),
+      Row("black", totalGames, blackWins, whiteWins, draws),
+    )
+
+    val stats = spark
+      .createDataFrame(rows.asJava, schema)
+      .withColumn("win_rate", F.round(F.col("wins") / F.col("total_games").cast("double"), 3))
+      .orderBy(F.asc("color"))
+
+    stats.write
+      .mode("overwrite")
+      .option("header", "true")
+      .csv(s"$outputDir/color_advantage")
+
+    stats.write
+      .mode("overwrite")
+      .format("jdbc")
+      .option("url", jdbcUrl)
+      .option("dbtable", "analytics_color_advantage")
+      .option("user", dbUser)
+      .option("password", dbPass)
+      .option("driver", "org.postgresql.Driver")
+      .save()
@@ -0,0 +1,99 @@
+package de.nowchess.analytics
+
+import org.apache.spark.sql.SparkSession
+import org.apache.spark.sql.functions as F
+
+object DailyActivityJob:
+
+  def main(args: Array[String]): Unit =
+    val jdbcUrl   = sys.env.getOrElse("NOWCHESS_JDBC_URL", "jdbc:postgresql://localhost:5432/nowchess")
+    val dbUser    = sys.env.getOrElse("NOWCHESS_DB_USER", "nowchess")
+    val dbPass    = sys.env.getOrElse("NOWCHESS_DB_PASS", "nowchess")
+    val outputDir = if args.length > 0 then args(0) else "/tmp/nowchess-daily-activity"
+
+    val spark = SparkSession
+      .builder()
+      .appName("NowChess Daily Activity")
+      .getOrCreate()
+
+    run(spark, jdbcUrl, dbUser, dbPass, outputDir)
+    spark.stop()
+
+  def run(spark: SparkSession, jdbcUrl: String, dbUser: String, dbPass: String, outputDir: String): Unit =
+    val games = GameSource
+      .loadExtended(spark, jdbcUrl, dbUser, dbPass)
+      .select("result", "utc_date", "utc_time")
+      .filter(F.col("utc_time").isNotNull.and(F.col("utc_date").isNotNull))
+
+    val hourOfDay = F.regexp_extract(F.col("utc_time"), "^(\\d{2})", 1).cast("int")
+    val dow       = F.dayofweek(F.to_date(F.col("utc_date"), "yyyy.MM.dd"))
+
+    val tagged = games
+      .withColumn("hour_of_day", hourOfDay)
+      .withColumn("dow", dow)
+
+    val hourly = tagged
+      .groupBy("hour_of_day")
+      .agg(
+        F.count("*").as("total_games"),
+        F.sum(F.when(F.col("result") === "white", 1).otherwise(0)).as("white_wins"),
+        F.sum(F.when(F.col("result") === "black", 1).otherwise(0)).as("black_wins"),
+        F.sum(F.when(F.col("result") === "draw", 1).otherwise(0)).as("draws"),
+      )
+      .withColumn("white_win_rate", F.round(F.col("white_wins") / F.col("total_games").cast("double"), 3))
+      .orderBy(F.asc("hour_of_day"))
+      .select("hour_of_day", "total_games", "white_wins", "black_wins", "draws", "white_win_rate")
+
+    hourly.write
+      .mode("overwrite")
+      .option("header", "true")
+      .csv(s"$outputDir/hourly_activity")
+
+    hourly.write
+      .mode("overwrite")
+      .format("jdbc")
+      .option("url", jdbcUrl)
+      .option("dbtable", "analytics_hourly_activity")
+      .option("user", dbUser)
+      .option("password", dbPass)
+      .option("driver", "org.postgresql.Driver")
+      .save()
+
+    val dayName = F
+      .when(F.col("dow") === 1, "Sunday")
+      .when(F.col("dow") === 2, "Monday")
+      .when(F.col("dow") === 3, "Tuesday")
+      .when(F.col("dow") === 4, "Wednesday")
+      .when(F.col("dow") === 5, "Thursday")
+      .when(F.col("dow") === 6, "Friday")
+      .otherwise("Saturday")
+
+    val weekly = tagged
+      .withColumn("day_of_week", dayName)
+      .withColumn("day_order", F.col("dow"))
+      .groupBy("day_of_week", "day_order")
+      .agg(
+        F.count("*").as("total_games"),
+        F.sum(F.when(F.col("result") === "white", 1).otherwise(0)).as("white_wins"),
+        F.sum(F.when(F.col("result") === "black", 1).otherwise(0)).as("black_wins"),
+        F.sum(F.when(F.col("result") === "draw", 1).otherwise(0)).as("draws"),
+      )
+      .withColumn("white_win_rate", F.round(F.col("white_wins") / F.col("total_games").cast("double"), 3))
+      .orderBy(F.asc("day_order"))
+      .drop("day_order")
+      .select("day_of_week", "total_games", "white_wins", "black_wins", "draws", "white_win_rate")
+
+    weekly.write
+      .mode("overwrite")
+      .option("header", "true")
+      .csv(s"$outputDir/weekly_activity")
+
+    weekly.write
+      .mode("overwrite")
+      .format("jdbc")
+      .option("url", jdbcUrl)
+      .option("dbtable", "analytics_weekly_activity")
+      .option("user", dbUser)
+      .option("password", dbPass)
+      .option("driver", "org.postgresql.Driver")
+      .save()
@@ -0,0 +1,58 @@
+package de.nowchess.analytics
+
+import org.apache.spark.sql.SparkSession
+import org.apache.spark.sql.functions as F
+
+object EloDistributionJob:
+
+  def main(args: Array[String]): Unit =
+    val jdbcUrl   = sys.env.getOrElse("NOWCHESS_JDBC_URL", "jdbc:postgresql://localhost:5432/nowchess")
+    val dbUser    = sys.env.getOrElse("NOWCHESS_DB_USER", "nowchess")
+    val dbPass    = sys.env.getOrElse("NOWCHESS_DB_PASS", "nowchess")
+    val outputDir = if args.length > 0 then args(0) else "/tmp/nowchess-elo-distribution"
+
+    val spark = SparkSession
+      .builder()
+      .appName("NowChess Elo Distribution")
+      .getOrCreate()
+
+    run(spark, jdbcUrl, dbUser, dbPass, outputDir)
+    spark.stop()
+
+  def run(spark: SparkSession, jdbcUrl: String, dbUser: String, dbPass: String, outputDir: String): Unit =
+    val games = GameSource
+      .loadExtended(spark, jdbcUrl, dbUser, dbPass)
+      .filter(F.col("white_elo").isNotNull.or(F.col("black_elo").isNotNull))
+
+    val whiteElo = games.select(F.col("white_elo").as("elo"))
+    val blackElo = games.select(F.col("black_elo").as("elo"))
+    val allElo   = whiteElo.union(blackElo).filter(F.col("elo").isNotNull)
+
+    val bucketMin = (F.floor(F.col("elo") / 200) * 200).cast("int")
+    val bucketLabel = F.when(
+      F.col("elo") >= 2800,
+      F.lit("2800+"),
+    ).otherwise(F.concat(bucketMin.cast("string"), F.lit("-"), (bucketMin + 199).cast("string")))
+
+    val distribution = allElo
+      .withColumn("elo_bucket", bucketLabel)
+      .withColumn("bucket_order", F.when(F.col("elo") >= 2800, 2800).otherwise(bucketMin))
+      .groupBy("elo_bucket", "bucket_order")
+      .agg(F.count("*").as("player_count"))
+      .orderBy(F.asc("bucket_order"))
+      .select("elo_bucket", "player_count")
+
+    distribution.write
+      .mode("overwrite")
+      .option("header", "true")
+      .csv(s"$outputDir/elo_distribution")
+
+    distribution.write
+      .mode("overwrite")
+      .format("jdbc")
+      .option("url", jdbcUrl)
+      .option("dbtable", "analytics_elo_distribution")
+      .option("user", dbUser)
+      .option("password", dbPass)
+      .option("driver", "org.postgresql.Driver")
+      .save()
@@ -0,0 +1,111 @@
+package de.nowchess.analytics
+
+import org.apache.spark.sql.SparkSession
+import org.apache.spark.sql.functions as F
+
+object GameLengthJob:
+
+  def main(args: Array[String]): Unit =
+    val jdbcUrl   = sys.env.getOrElse("NOWCHESS_JDBC_URL", "jdbc:postgresql://localhost:5432/nowchess")
+    val dbUser    = sys.env.getOrElse("NOWCHESS_DB_USER", "nowchess")
+    val dbPass    = sys.env.getOrElse("NOWCHESS_DB_PASS", "nowchess")
+    val outputDir = if args.length > 0 then args(0) else "/tmp/nowchess-game-length"
+
+    val spark = SparkSession
+      .builder()
+      .appName("NowChess Game Length")
+      .getOrCreate()
+
+    run(spark, jdbcUrl, dbUser, dbPass, outputDir)
+    spark.stop()
+
+  def run(spark: SparkSession, jdbcUrl: String, dbUser: String, dbPass: String, outputDir: String): Unit =
+    val games = GameSource
+      .load(spark, jdbcUrl, dbUser, dbPass)
+      .select("result", "move_count")
+      .filter(F.col("result").isNotNull.and(F.col("move_count").isNotNull))
+
+    val moves = F.col("move_count")
+    val bucket = F
+      .when(moves <= 10, "1-10")
+      .when(moves <= 20, "11-20")
+      .when(moves <= 30, "21-30")
+      .when(moves <= 40, "31-40")
+      .when(moves <= 60, "41-60")
+      .when(moves <= 100, "61-100")
+      .otherwise("101+")
+    val bucketOrder = F
+      .when(moves <= 10, 1)
+      .when(moves <= 20, 2)
+      .when(moves <= 30, 3)
+      .when(moves <= 40, 4)
+      .when(moves <= 60, 5)
+      .when(moves <= 100, 6)
+      .otherwise(7)
+
+    val tagged = games
+      .withColumn("move_bucket", bucket)
+      .withColumn("bucket_order", bucketOrder)
+
+    val distribution = tagged
+      .groupBy("move_bucket", "bucket_order")
+      .agg(
+        F.count("*").as("total_games"),
+        F.sum(F.when(F.col("result") === "white", 1).otherwise(0)).as("white_wins"),
+        F.sum(F.when(F.col("result") === "black", 1).otherwise(0)).as("black_wins"),
+        F.sum(F.when(F.col("result") === "draw", 1).otherwise(0)).as("draws"),
+      )
+      .withColumn("white_win_rate", F.round(F.col("white_wins") / F.col("total_games").cast("double"), 3))
+      .withColumn("black_win_rate", F.round(F.col("black_wins") / F.col("total_games").cast("double"), 3))
+      .withColumn("draw_rate", F.round(F.col("draws") / F.col("total_games").cast("double"), 3))
+      .orderBy(F.asc("bucket_order"))
+      .drop("bucket_order")
+      .select(
+        "move_bucket",
+        "total_games",
+        "white_wins",
+        "black_wins",
+        "draws",
+        "white_win_rate",
+        "black_win_rate",
+        "draw_rate",
+      )
+
+    distribution.write
+      .mode("overwrite")
+      .option("header", "true")
+      .csv(s"$outputDir/game_length_distribution")
+
+    distribution.write
+      .mode("overwrite")
+      .format("jdbc")
+      .option("url", jdbcUrl)
+      .option("dbtable", "analytics_game_length_distribution")
+      .option("user", dbUser)
+      .option("password", dbPass)
+      .option("driver", "org.postgresql.Driver")
+      .save()
+
+    val byResult = games
+      .groupBy("result")
+      .agg(
+        F.round(F.avg("move_count"), 1).as("avg_move_count"),
+        F.min("move_count").as("min_moves"),
+        F.max("move_count").as("max_moves"),
+      )
+      .orderBy(F.asc("result"))
+
+    byResult.write
+      .mode("overwrite")
+      .option("header", "true")
+      .csv(s"$outputDir/game_length_by_result")
+
+    byResult.write
+      .mode("overwrite")
+      .format("jdbc")
+      .option("url", jdbcUrl)
+      .option("dbtable", "analytics_game_length_by_result")
+      .option("user", dbUser)
+      .option("password", dbPass)
+      .option("driver", "org.postgresql.Driver")
+      .save()
@@ -33,6 +33,19 @@ object GameSource:
      case Some(path) => fromLichessPgn(spark, path)
      case None       => fromJdbc(spark, jdbcUrl, dbUser, dbPass)

+  def loadExtended(spark: SparkSession, jdbcUrl: String, dbUser: String, dbPass: String): DataFrame =
+    sys.env.get(PgnPathEnv) match
+      case Some(path) => fromLichessPgnExtended(spark, path)
+      case None =>
+        fromJdbc(spark, jdbcUrl, dbUser, dbPass)
+          .withColumn("white_elo", F.lit(null).cast("int"))
+          .withColumn("black_elo", F.lit(null).cast("int"))
+          .withColumn("time_control", F.lit(null).cast("string"))
+          .withColumn("utc_date", F.lit(null).cast("string"))
+          .withColumn("utc_time", F.lit(null).cast("string"))
+          .withColumn("termination", F.lit(null).cast("string"))
+          .withColumn("eco", F.lit(null).cast("string"))
+
  def fromJdbc(spark: SparkSession, jdbcUrl: String, dbUser: String, dbPass: String): DataFrame =
    spark.read
      .format("jdbc")
@@ -89,6 +102,49 @@ object GameSource:
      )
      .filter((F.col("white_id") =!= "").and(F.col("black_id") =!= ""))

+  private def fromLichessPgnExtended(spark: SparkSession, path: String): DataFrame =
+    val resolved = resolvePath(spark, path)
+    val record   = F.col("value")
+
+    val resultTag = F.regexp_extract(record, "Result \"([^\"]*)\"", 1)
+    val result = F
+      .when(resultTag === "1-0", "white")
+      .when(resultTag === "0-1", "black")
+      .when(resultTag === "1/2-1/2", "draw")
+      .otherwise(F.lit(null).cast("string"))
+
+    val moveText  = F.coalesce(F.split(record, "\n\n").getItem(1), F.lit(""))
+    val noComment = F.regexp_replace(moveText, "\\{[^}]*\\}", "")
+    val noResult  = F.regexp_replace(noComment, "(1-0|0-1|1/2-1/2|\\*)", "")
+    val noNumbers = F.regexp_replace(noResult, "\\d+\\.+", " ")
+    val plies     = F.size(F.filter(F.split(F.trim(noNumbers), "\\s+"), tok => F.length(tok) > 0))
+
+    def nullable(extracted: org.apache.spark.sql.Column): org.apache.spark.sql.Column =
+      F.when(F.length(extracted) > 0, extracted).otherwise(F.lit(null).cast("string"))
+
+    val whiteElo = nullable(F.regexp_extract(record, "WhiteElo \"([^\"]*)\"", 1)).cast("int")
+    val blackElo = nullable(F.regexp_extract(record, "BlackElo \"([^\"]*)\"", 1)).cast("int")
+
+    spark.read
+      .option("lineSep", "[Event ")
+      .text(resolved)
+      .filter(F.length(F.trim(record)) > 0)
+      .select(
+        F.regexp_extract(record, "White \"([^\"]*)\"", 1).as("white_id"),
+        F.regexp_extract(record, "Black \"([^\"]*)\"", 1).as("black_id"),
+        result.as("result"),
+        plies.as("move_count"),
+        F.concat(F.lit("[Event "), record).as("pgn"),
+        whiteElo.as("white_elo"),
+        blackElo.as("black_elo"),
+        nullable(F.regexp_extract(record, "TimeControl \"([^\"]*)\"", 1)).as("time_control"),
+        nullable(F.regexp_extract(record, "UTCDate \"([^\"]*)\"", 1)).as("utc_date"),
+        nullable(F.regexp_extract(record, "UTCTime \"([^\"]*)\"", 1)).as("utc_time"),
+        nullable(F.regexp_extract(record, "Termination \"([^\"]*)\"", 1)).as("termination"),
+        nullable(F.regexp_extract(record, "ECO \"([^\"]*)\"", 1)).as("eco"),
+      )
+      .filter((F.col("white_id") =!= "").and(F.col("black_id") =!= ""))
+
  /** Turns an http(s)/ftp URL into a cluster-local path by fetching it once with SparkContext.addFile, which
    * distributes the file to every executor. `.zst` is decompressed in-process and the plain `.pgn` is redistributed.
    * Non-URL paths are returned unchanged.
@@ -72,16 +72,15 @@ object OpeningBookJob:
      .option("header", "true")
      .csv(s"$outputDir/opening_book_top1000")

-    if !GameSource.isPgnMode then
-      top1000.write
-        .mode("overwrite")
-        .format("jdbc")
-        .option("url", jdbcUrl)
-        .option("dbtable", "analytics_opening_stats")
-        .option("user", dbUser)
-        .option("password", dbPass)
-        .option("driver", "org.postgresql.Driver")
-        .save()
+    top1000.write
+      .mode("overwrite")
+      .format("jdbc")
+      .option("url", jdbcUrl)
+      .option("dbtable", "analytics_opening_stats")
+      .option("user", dbUser)
+      .option("password", dbPass)
+      .option("driver", "org.postgresql.Driver")
+      .save()

  /** Extracts the first `maxPlies` moves from a PGN column as a space-separated string.
    *
@@ -119,26 +119,25 @@ object PlayerClusteringJob:
      .option("header", "true")
      .csv(s"$outputDir/cluster_archetypes")

-    if !GameSource.isPgnMode then
-      clustersDf.write
-        .mode("overwrite")
-        .format("jdbc")
-        .option("url", jdbcUrl)
-        .option("dbtable", "analytics_player_clusters")
-        .option("user", dbUser)
-        .option("password", dbPass)
-        .option("driver", "org.postgresql.Driver")
-        .save()
+    clustersDf.write
+      .mode("overwrite")
+      .format("jdbc")
+      .option("url", jdbcUrl)
+      .option("dbtable", "analytics_player_clusters")
+      .option("user", dbUser)
+      .option("password", dbPass)
+      .option("driver", "org.postgresql.Driver")
+      .save()

-      archetypes.write
-        .mode("overwrite")
-        .format("jdbc")
-        .option("url", jdbcUrl)
-        .option("dbtable", "analytics_cluster_archetypes")
-        .option("user", dbUser)
-        .option("password", dbPass)
-        .option("driver", "org.postgresql.Driver")
-        .save()
+    archetypes.write
+      .mode("overwrite")
+      .format("jdbc")
+      .option("url", jdbcUrl)
+      .option("dbtable", "analytics_cluster_archetypes")
+      .option("user", dbUser)
+      .option("password", dbPass)
+      .option("driver", "org.postgresql.Driver")
+      .save()

  private def buildPlayerStats(games: org.apache.spark.sql.DataFrame): org.apache.spark.sql.DataFrame =
    val asWhite = games.select(
@@ -109,16 +109,15 @@ object PlayerGraphJob:
      .mode("overwrite")
      .parquet(s"$outputDir/player_graph")

-    if !GameSource.isPgnMode then
-      result.write
-        .mode("overwrite")
-        .format("jdbc")
-        .option("url", jdbcUrl)
-        .option("dbtable", "analytics_player_graph")
-        .option("user", dbUser)
-        .option("password", dbPass)
-        .option("driver", "org.postgresql.Driver")
-        .save()
+    result.write
+      .mode("overwrite")
+      .format("jdbc")
+      .option("url", jdbcUrl)
+      .option("dbtable", "analytics_player_graph")
+      .option("user", dbUser)
+      .option("password", dbPass)
+      .option("driver", "org.postgresql.Driver")
+      .save()

    // How many players belong to each connected component?
    // A large dominant component + many singletons is the expected shape.
@@ -135,6 +134,16 @@ object PlayerGraphJob:
      .option("header", "true")
      .csv(s"$outputDir/component_sizes")

+    componentSizes.write
+      .mode("overwrite")
+      .format("jdbc")
+      .option("url", jdbcUrl)
+      .option("dbtable", "analytics_component_sizes")
+      .option("user", dbUser)
+      .option("password", dbPass)
+      .option("driver", "org.postgresql.Driver")
+      .save()
+
  // Build a two-column DataFrame (vertex_id: Long, valueCol: valueType) from an RDD.
  // Used to bridge GraphX RDD results into the DataFrame API without implicits.
  private def rddToFrame[T](
@@ -77,13 +77,17 @@ object PlayerStatsJob:
      .mode("overwrite")
      .parquet(s"$outputDir/player_stats")

-    if !GameSource.isPgnMode then
-      stats.write
-        .mode("overwrite")
-        .format("jdbc")
-        .option("url", jdbcUrl)
-        .option("dbtable", "analytics_player_stats")
-        .option("user", dbUser)
-        .option("password", dbPass)
-        .option("driver", "org.postgresql.Driver")
-        .save()
+    stats.write
+      .mode("overwrite")
+      .option("header", "true")
+      .csv(s"$outputDir/player_stats_csv")
+
+    stats.write
+      .mode("overwrite")
+      .format("jdbc")
+      .option("url", jdbcUrl)
+      .option("dbtable", "analytics_player_stats")
+      .option("user", dbUser)
+      .option("password", dbPass)
+      .option("driver", "org.postgresql.Driver")
+      .save()
@@ -0,0 +1,75 @@
+package de.nowchess.analytics
+
+import org.apache.spark.sql.SparkSession
+import org.apache.spark.sql.functions as F
+
+object RatingMismatchJob:
+
+  def main(args: Array[String]): Unit =
+    val jdbcUrl   = sys.env.getOrElse("NOWCHESS_JDBC_URL", "jdbc:postgresql://localhost:5432/nowchess")
+    val dbUser    = sys.env.getOrElse("NOWCHESS_DB_USER", "nowchess")
+    val dbPass    = sys.env.getOrElse("NOWCHESS_DB_PASS", "nowchess")
+    val outputDir = if args.length > 0 then args(0) else "/tmp/nowchess-rating-mismatch"
+
+    val spark = SparkSession
+      .builder()
+      .appName("NowChess Rating Mismatch")
+      .getOrCreate()
+
+    run(spark, jdbcUrl, dbUser, dbPass, outputDir)
+    spark.stop()
+
+  def run(spark: SparkSession, jdbcUrl: String, dbUser: String, dbPass: String, outputDir: String): Unit =
+    val games = GameSource
+      .loadExtended(spark, jdbcUrl, dbUser, dbPass)
+      .select("result", "white_elo", "black_elo")
+      .filter(F.col("white_elo").isNotNull.and(F.col("black_elo").isNotNull))
+
+    val eloDiff = F.col("white_elo") - F.col("black_elo")
+    val bracket = F
+      .when(eloDiff < -200, "Black +200")
+      .when(eloDiff < -100, "Black +100–200")
+      .when(eloDiff < -50, "Black +50–100")
+      .when(eloDiff <= 50, "Even (±50)")
+      .when(eloDiff <= 100, "White +50–100")
+      .when(eloDiff <= 200, "White +100–200")
+      .otherwise("White +200")
+    val bracketOrder = F
+      .when(eloDiff < -200, 1)
+      .when(eloDiff < -100, 2)
+      .when(eloDiff < -50, 3)
+      .when(eloDiff <= 50, 4)
+      .when(eloDiff <= 100, 5)
+      .when(eloDiff <= 200, 6)
+      .otherwise(7)
+
+    val stats = games
+      .withColumn("elo_diff", eloDiff)
+      .withColumn("bracket", bracket)
+      .withColumn("bracket_order", bracketOrder)
+      .groupBy("bracket", "bracket_order")
+      .agg(
+        F.count("*").as("total_games"),
+        F.sum(F.when(F.col("result") === "white", 1).otherwise(0)).as("white_wins"),
+        F.sum(F.when(F.col("result") === "black", 1).otherwise(0)).as("black_wins"),
+        F.sum(F.when(F.col("result") === "draw", 1).otherwise(0)).as("draws"),
+      )
+      .withColumn("white_win_rate", F.round(F.col("white_wins") / F.col("total_games").cast("double"), 3))
+      .orderBy(F.asc("bracket_order"))
+      .drop("bracket_order")
+      .select("bracket", "total_games", "white_wins", "black_wins", "draws", "white_win_rate")
+
+    stats.write
+      .mode("overwrite")
+      .option("header", "true")
+      .csv(s"$outputDir/rating_mismatch")
+
+    stats.write
+      .mode("overwrite")
+      .format("jdbc")
+      .option("url", jdbcUrl)
+      .option("dbtable", "analytics_rating_mismatch")
+      .option("user", dbUser)
+      .option("password", dbPass)
+      .option("driver", "org.postgresql.Driver")
+      .save()
@@ -0,0 +1,54 @@
+package de.nowchess.analytics
+
+import org.apache.spark.sql.SparkSession
+import org.apache.spark.sql.functions as F
+
+object TerminationStatsJob:
+
+  def main(args: Array[String]): Unit =
+    val jdbcUrl   = sys.env.getOrElse("NOWCHESS_JDBC_URL", "jdbc:postgresql://localhost:5432/nowchess")
+    val dbUser    = sys.env.getOrElse("NOWCHESS_DB_USER", "nowchess")
+    val dbPass    = sys.env.getOrElse("NOWCHESS_DB_PASS", "nowchess")
+    val outputDir = if args.length > 0 then args(0) else "/tmp/nowchess-termination-stats"
+
+    val spark = SparkSession
+      .builder()
+      .appName("NowChess Termination Stats")
+      .getOrCreate()
+
+    run(spark, jdbcUrl, dbUser, dbPass, outputDir)
+    spark.stop()
+
+  def run(spark: SparkSession, jdbcUrl: String, dbUser: String, dbPass: String, outputDir: String): Unit =
+    val games = GameSource
+      .loadExtended(spark, jdbcUrl, dbUser, dbPass)
+      .select("result", "termination")
+      .filter(F.col("termination").isNotNull.and(F.col("termination") =!= ""))
+
+    val stats = games
+      .groupBy("termination")
+      .agg(
+        F.count("*").as("total_games"),
+        F.sum(F.when(F.col("result") === "white", 1).otherwise(0)).as("white_wins"),
+        F.sum(F.when(F.col("result") === "black", 1).otherwise(0)).as("black_wins"),
+        F.sum(F.when(F.col("result") === "draw", 1).otherwise(0)).as("draws"),
+      )
+      .withColumn("draw_rate", F.round(F.col("draws") / F.col("total_games").cast("double"), 3))
+      .withColumnRenamed("termination", "termination_type")
+      .orderBy(F.desc("total_games"))
+      .select("termination_type", "total_games", "white_wins", "black_wins", "draws", "draw_rate")
+
+    stats.write
+      .mode("overwrite")
+      .option("header", "true")
+      .csv(s"$outputDir/termination_stats")
+
+    stats.write
+      .mode("overwrite")
+      .format("jdbc")
+      .option("url", jdbcUrl)
+      .option("dbtable", "analytics_termination_stats")
+      .option("user", dbUser)
+      .option("password", dbPass)
+      .option("driver", "org.postgresql.Driver")
+      .save()
@@ -0,0 +1,68 @@
+package de.nowchess.analytics
+
+import org.apache.spark.sql.SparkSession
+import org.apache.spark.sql.functions as F
+
+object TimeControlJob:
+
+  def main(args: Array[String]): Unit =
+    val jdbcUrl   = sys.env.getOrElse("NOWCHESS_JDBC_URL", "jdbc:postgresql://localhost:5432/nowchess")
+    val dbUser    = sys.env.getOrElse("NOWCHESS_DB_USER", "nowchess")
+    val dbPass    = sys.env.getOrElse("NOWCHESS_DB_PASS", "nowchess")
+    val outputDir = if args.length > 0 then args(0) else "/tmp/nowchess-time-control"
+
+    val spark = SparkSession
+      .builder()
+      .appName("NowChess Time Control")
+      .getOrCreate()
+
+    run(spark, jdbcUrl, dbUser, dbPass, outputDir)
+    spark.stop()
+
+  def run(spark: SparkSession, jdbcUrl: String, dbUser: String, dbPass: String, outputDir: String): Unit =
+    val games = GameSource
+      .loadExtended(spark, jdbcUrl, dbUser, dbPass)
+      .select("result", "time_control")
+      .filter(
+        F.col("time_control").isNotNull
+          .and(F.col("time_control") =!= "")
+          .and(F.col("time_control") =!= "-"),
+      )
+
+    val baseSeconds = F.regexp_extract(F.col("time_control"), "^(?:\\d+/)?(\\d+)", 1).cast("int")
+    val category = F
+      .when(baseSeconds < 30, "UltraBullet")
+      .when(baseSeconds < 180, "Bullet")
+      .when(baseSeconds < 480, "Blitz")
+      .when(baseSeconds < 1500, "Rapid")
+      .when(baseSeconds < 86400, "Classical")
+      .otherwise("Correspondence")
+
+    val stats = games
+      .withColumn("category", category)
+      .groupBy("category")
+      .agg(
+        F.count("*").as("total_games"),
+        F.sum(F.when(F.col("result") === "white", 1).otherwise(0)).as("white_wins"),
+        F.sum(F.when(F.col("result") === "black", 1).otherwise(0)).as("black_wins"),
+        F.sum(F.when(F.col("result") === "draw", 1).otherwise(0)).as("draws"),
+      )
+      .withColumn("white_win_rate", F.round(F.col("white_wins") / F.col("total_games").cast("double"), 3))
+      .withColumn("draw_rate", F.round(F.col("draws") / F.col("total_games").cast("double"), 3))
+      .orderBy(F.desc("total_games"))
+      .select("category", "total_games", "white_wins", "black_wins", "draws", "white_win_rate", "draw_rate")
+
+    stats.write
+      .mode("overwrite")
+      .option("header", "true")
+      .csv(s"$outputDir/time_control_stats")
+
+    stats.write
+      .mode("overwrite")
+      .format("jdbc")
+      .option("url", jdbcUrl)
+      .option("dbtable", "analytics_time_control_stats")
+      .option("user", dbUser)
+      .option("password", dbPass)
+      .option("driver", "org.postgresql.Driver")
+      .save()
@@ -1,3 +1,3 @@
 MAJOR=0
-MINOR=4
+MINOR=6
 PATCH=0
@@ -370,3 +370,34 @@
 ### Reverts

 * Revert "refactor: update metrics paths formatting in application.yml for clarity" ([3870566](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/38705663498d5f47c40dafe2f26198589ede8656))
+##  (2026-06-21)
+
+### Features
+
+* add initialization metrics for various services ([d438e97](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d438e97f32bdde0bfc63c1b4a8cc810cdd093166))
+* add OpenTelemetry trace configuration with parentbased sampler ([3904d5a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3904d5ad8ad4930ddee65287a7bfab785a6148f5))
+* **analytics:** add Spark batch analytics module ([#70](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/70)) ([39f1657](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/39f1657e1db6e84889af338c43be8cb5c03c3ec3))
+* **config:** update application.yml for PostgreSQL and remove staging/production configurations ([2404e61](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2404e6164c3b50ffccbea5238d636060d6abe4d6))
+* **config:** update application.yml for staging and production environments ([6113432](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/6113432a14c476a3a0dfc0d449e17d023697f2ba))
+* configure logging and add OpenTelemetry support ([#49](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/49)) ([d57c488](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d57c4886612d1d92da0e1b79209fc83e6ef537a1))
+* **docker:** add .dockerignore and .gitignore files for build exclusions ([c987d8e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c987d8e258c0e6c4cfbdaa8381c64c410d7a2b83))
+* **docker:** add Dockerfiles for building Quarkus application in native and JVM modes ([3f2d2bb](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f2d2bb4c97fa8cddba66e1da4427c54236dfeed))
+* **docker:** add Dockerfiles for Quarkus application in JVM and native modes ([34b9933](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/34b993304670cf2aa62cd2f6460cee7b9864b08e))
+* **events:** migrate game-creation and bot flows to Redis Streams NCS-89 ([#62](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/62)) ([a24924c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a24924c23057db3d700a75dbc4333557789cd991))
+* NCS-78 Add Traceability to the Applications ([#46](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/46)) ([649566e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/649566eb3fcf38f91c8896a739f74ea318af312d))
+* NCS-78 Add Traceability to the Applications ([#47](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/47)) ([87dfc6c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/87dfc6c2bcce7f7d58fc641bd8d468a2e584c108))
+* NCS-82 add Swiss-system tournament module ([#55](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/55)) ([c5661de](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c5661de4a0ebf4b33211f5a391840dcf744656b7))
+* **official-bots:** consume GameOver stream for bot cleanup ([#67](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/67)) ([db9d153](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/db9d1533912f4b41c4d1ca80ccffdde5d23d6ff6))
+* **official-bots:** park expert bot on tournament server at startup ([#75](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/75)) ([30295a4](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/30295a4bb95855ee8261c92278bb9ebc80ee12ee))
+* true-microservices ([#40](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/40)) ([5909242](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/590924254e8a2754de661a57a03e43f89ceb6299))
+
+### Bug Fixes
+
+* enable official bots to connect to external tournament server ([#71](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/71)) ([688d30e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/688d30e2b10026923372be5fca3c63eaaee2de2a))
+* **official-bots:** configure JWT verification ([#72](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/72)) ([98c64fc](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/98c64fc0d56dc542beb31c75f4b9056d91de03cd))
+* **official-bots:** make botToken optional, fall back to env, fix 502 status ([f43d193](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/f43d1930d80670d810c57b54eaa3789854fa082c))
+* **official-bots:** NCS-70-auto-register official bots with account service ([#59](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/59)) ([7117a93](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/7117a93376272094d0b1a6abf2121254ce396684))
+
+### Reverts
+
+* Revert "refactor: update metrics paths formatting in application.yml for clarity" ([3870566](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/38705663498d5f47c40dafe2f26198589ede8656))
@@ -2,7 +2,7 @@ package de.nowchess.bot.resource

 case class JoinTournamentRequest(
    tournamentId: String,
-    botToken: String,
+    botToken: Option[String],
    difficulty: String,
    serverUrl: Option[String],
 )
@@ -39,6 +39,6 @@ class TournamentJoinResource:
        Response.ok(resp).build()
      case Left(err) =>
        Response
-          .status(Response.Status.BAD_GATEWAY)
+          .status(Response.Status.BAD_REQUEST)
          .entity(s"""{"error":"$err"}""")
          .build()
@@ -82,18 +82,23 @@ class TournamentBotGamePlayer:

  def joinTournament(
      tournamentId: String,
-      botToken: String,
+      botToken: Option[String],
      difficulty: String,
      serverUrl: String,
  ): Either[String, String] =
-    TournamentBotConfig.jwtSubject(botToken) match
-      case None => Left("Invalid bot token — could not extract subject")
-      case Some(botId) =>
-        val cfg = TournamentBotConfig(serverUrl, tournamentId, botToken, botId, difficulty)
-        if join(cfg) then
-          startAsync(cfg)
-          Right(botId)
-        else Left("Failed to join tournament")
+    val resolvedToken = botToken.filter(_.nonEmpty)
+      .orElse(System.getenv().asScala.get("TOURNAMENT_BOT_TOKEN").filter(_.nonEmpty))
+    resolvedToken match
+      case None => Left("No bot token provided and TOURNAMENT_BOT_TOKEN not configured")
+      case Some(token) =>
+        TournamentBotConfig.jwtSubject(token) match
+          case None => Left("Invalid bot token — could not extract subject")
+          case Some(botId) =>
+            val cfg = TournamentBotConfig(serverUrl, tournamentId, token, botId, difficulty)
+            if join(cfg) then
+              startAsync(cfg)
+              Right(botId)
+            else Left("Failed to join tournament")

  private def startAsync(cfg: TournamentBotConfig): Unit =
    val thread = new Thread(() => streamLoop(cfg), s"TournamentBot-${cfg.tournamentId}")
@@ -1,3 +1,3 @@
 MAJOR=0
-MINOR=21
+MINOR=22
 PATCH=0
Author	SHA1	Message	Date
TeamCity	71cb2cc56c	ci: bump version with Build-132	2026-06-21 14:10:10 +00:00
Janis Eccarius	f43d1930d8	fix(official-bots): make botToken optional, fall back to env, fix 502 status [TeamCity build finished] botToken in JoinTournamentRequest is now Option[String]. When absent, the service resolves it from the TOURNAMENT_BOT_TOKEN env var, so official-bot join requests no longer need a token in the body. Response status on join failure changed from BAD_GATEWAY (502) to BAD_REQUEST (400). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-21 15:40:09 +02:00
Janis Eccarius	da0e6d1ee2	feat(analytics): always write results to PostgreSQL regardless of input source [TeamCity build failed] Remove isPgnMode JDBC guard from all 4 original jobs so staging (Lichess PGN mode) and production (game_records JDBC mode) both persist analytics results to the DB. Add JDBC write-back to all 7 new jobs: GameLengthJob → analytics_game_length_distribution + analytics_game_length_by_result; ColorAdvantageJob → analytics_color_advantage; EloDistributionJob → analytics_elo_distribution; TimeControlJob → analytics_time_control_stats; DailyActivityJob → analytics_hourly_activity + analytics_weekly_activity; RatingMismatchJob → analytics_rating_mismatch; TerminationStatsJob → analytics_termination_stats. Add analytics_component_sizes JDBC write to PlayerGraphJob. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-21 15:36:07 +02:00
TeamCity	a6c600d6ce	ci: bump version with Build-131	2026-06-21 13:28:40 +00:00
Janis Eccarius	8e17c14dff	feat(analytics): add 7 new Spark analytics jobs and extend GameSource [TeamCity build finished] Adds GameLengthJob, ColorAdvantageJob, EloDistributionJob, TimeControlJob, DailyActivityJob, RatingMismatchJob, and TerminationStatsJob, bringing total batch pipelines to 11 (+ 1 streaming). Extends GameSource with loadExtended() / fromLichessPgnExtended() extracting WhiteElo, BlackElo, TimeControl, UTCDate, UTCTime, Termination, ECO from PGN headers; JDBC path returns nulls for extended columns, keeping all existing jobs unaffected. PlayerStatsJob gains a CSV output alongside the existing Parquet write so the analytics webview can display player statistics without pyarrow. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-21 15:03:07 +02:00
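
The token fallback described in f43d1930d8 can be exercised in isolation. The following is a minimal sketch, not code from the commit: `TokenResolution` and `resolveToken` are hypothetical names, and the environment is passed in as a plain `Map` (an assumption made here so the rule can be checked without setting real environment variables). It follows the same precedence as the `joinTournament` change: a non-empty request token wins, otherwise a non-empty `TOURNAMENT_BOT_TOKEN` is used, otherwise the call fails.

```scala
// Hypothetical helper illustrating the precedence rule from the diff.
// Not part of the commit; names and the injected env Map are assumptions.
object TokenResolution:
  val EnvKey = "TOURNAMENT_BOT_TOKEN"

  def resolveToken(
      requestToken: Option[String],
      env: Map[String, String],
  ): Either[String, String] =
    requestToken
      .filter(_.nonEmpty)                         // an empty string counts as absent
      .orElse(env.get(EnvKey).filter(_.nonEmpty)) // fall back to the environment
      .toRight("No bot token provided and TOURNAMENT_BOT_TOKEN not configured")
```

In a running service the second argument would presumably be `sys.env`, for example `TokenResolution.resolveToken(request.botToken, sys.env)`. Keeping the lookup injectable is a design choice of this sketch only; the commit itself reads `System.getenv()` directly.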