fix(official-bots): prevent Colab OOM in NNUE training

Dense 98304-dim HalfKP features at batch_size=16384 cost ~6.4 GB/batch on the host; with 8 hardcoded DataLoader workers and prefetch, this OOM-killed the Colab runtime.

- train.py: adaptive DataLoader workers (min(4, cpu_count), Colab free tier = 2), overridable via NNUE_LOADER_WORKERS; persistent_workers only when > 0.
- NNUETraining.ipynb: lower BATCH_SIZE 16384 -> 4096 with a memory-cost note.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
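The memory figure and the worker selection in the commit message above can be checked with a rough sketch. It assumes the features are materialised as dense float32 values, which is consistent with the quoted ~6.4 GB but is not stated in the commit. The function names are hypothetical and are not taken from train.py; only `NNUE_LOADER_WORKERS`, `num_workers` and `persistent_workers` come from the message or from PyTorch's `DataLoader`.

```python
import os

FLOAT32_BYTES = 4  # assumption: dense float32 features


def dense_batch_bytes(batch_size: int, feature_dim: int = 98304,
                      bytes_per_value: int = FLOAT32_BYTES) -> int:
    """Host memory needed to hold one dense feature batch, in bytes."""
    return batch_size * feature_dim * bytes_per_value


def resolve_loader_workers(env=None, cpu_count=None) -> int:
    """Worker count: NNUE_LOADER_WORKERS override first, else min(4, cpu_count)."""
    env = os.environ if env is None else env
    override = env.get("NNUE_LOADER_WORKERS")
    if override is not None:
        return max(0, int(override))
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return min(4, cpus)


def loader_kwargs(workers: int) -> dict:
    """DataLoader keyword arguments; persistent_workers is only valid with workers > 0."""
    return {"num_workers": workers, "persistent_workers": workers > 0}


if __name__ == "__main__":
    for bs in (16384, 4096):
        print(f"batch_size={bs}: {dense_batch_bytes(bs) / 1e9:.2f} GB per batch")
    print(loader_kwargs(resolve_loader_workers()))
```

Under the float32 assumption, one batch of 16384 takes 6,442,450,944 bytes (about 6.44 GB) and one batch of 4096 takes 1,610,612,736 bytes (about 1.61 GB). Every worker and prefetched batch holds its own copy, so the total scales with the worker count.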
ci: bump version with Build-156
2026-06-24 22:18:18 +02:00 · 2026-06-24 20:17:17 +00:00 · 2026-06-24 22:04:22 +02:00 · 2026-06-24 18:21:11 +00:00 · 2026-06-24 20:09:28 +02:00 · 2026-06-24 17:55:44 +00:00
31 changed files with 2321 additions and 177 deletions
@@ -81,3 +81,20 @@
 * **analytics:** upgrade Spark to 4.0.3 — 3.5.x has no official Docker image ([46af115](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/46af1154de34a8596cb6cb28c6fad7aba90f597c))
 * **analytics:** write decompressed PGN to shared PVC path for executor access ([a268a9a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a268a9acb7ba190c76e996ccf3ea3bd00e5cec92))
 ##  (2026-06-23)
 ### Features
 * **analytics:** add 7 new Spark analytics jobs and extend GameSource ([8e17c14](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/8e17c14dff740cd115011dfbf17de35083b8fe46))
 * **analytics:** add accuracy and blunder analysis job for Lichess data ([c3e7b82](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c3e7b82ae806adf5713ce4d267c1155e73a40ff5))
 * **analytics:** add Dockerfile, CI workflow, and stable jar name for K8s deployment ([95215b6](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/95215b6a420fd526df1aa395f9b087556c8ad03b))
 * **analytics:** add PostgreSQL JDBC write-back to all four batch jobs ([0e0ea4c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/0e0ea4c9893c6efed52e633e55d05ab3ed004502))
 * **analytics:** add Spark batch analytics module ([259b3bb](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/259b3bbb24c0f23326269b93f4b3c84012f727cd))
 * **analytics:** add Structured Streaming, MLlib clustering, GraphX jobs ([e1d80b9](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/e1d80b9331666feea191b1fd08aa762f3581c918))
 * **analytics:** always write results to PostgreSQL regardless of input source ([da0e6d1](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/da0e6d1ee2d391ecb6291396f82471eb51b1b25e))
 * **official-bots:** park expert bot on tournament server at startup ([#76](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/76)) ([751a58b](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/751a58b6061f7434115e229a7661894c76768bc2))
 ### Bug Fixes
 * **analytics:** upgrade Spark to 4.0.3 — 3.5.x has no official Docker image ([46af115](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/46af1154de34a8596cb6cb28c6fad7aba90f597c))
 * **analytics:** write decompressed PGN to shared PVC path for executor access ([a268a9a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a268a9acb7ba190c76e996ccf3ea3bd00e5cec92))
@@ -0,0 +1,191 @@
 package de.nowchess.analytics
 import org.apache.spark.sql.SparkSession
 import org.apache.spark.sql.expressions.Window
 import org.apache.spark.sql.functions as F
 /** Per-move accuracy & blunder analysis mined from Lichess `[%eval ...]` move annotations.
  *
  * Unlike the flat single-`groupBy` summaries (opening rates, colour advantage), this job reconstructs the *quality of
  * every move* from the engine evaluations Lichess embeds in the movetext (`{ [%eval 0.24] }`, mate scores `[%eval
  * #-3]`) and turns them into the same accuracy signals lichess.com surfaces: average centipawn loss (ACPL), and counts
  * of inaccuracies / mistakes / blunders.
  *
  * Pipeline (all Spark SQL string/array functions + window funcs — no UDFs, Catalyst-friendly):
  *   1. Keep only games carrying `[%eval` comments.
  *   2. `regexp_extract_all` pulls every eval in ply order; mate scores collapse to ±10 pawns, normal evals are clamped
  *      to ±10 so a single huge swing cannot dominate the mean. All evals are White-POV pawns.
  *   3. `posexplode` → one row per ply; a per-game window `lag` gives the eval *before* the move.
  *   4. Centipawn loss for the side that moved = how much the eval moved against them (white wants it up, black down),
  *      floored at 0 and scaled to centipawns.
  *   5. Roll up to (game, side): ACPL + inaccuracy(≥50cp) / mistake(≥100cp) / blunder(≥200cp) counts, tagged with that
  *      side's Elo and whether they won.
  *
  * Outputs (Parquet + CSV + JDBC):
  *   - `accuracy_by_rating` — ACPL, avg blunders/mistakes/inaccuracies per game and win-rate, per Elo band. Shows how
  *     move quality scales with rating.
  *   - `blunder_outcome` — win-rate bucketed by number of blunders in the game. Quantifies "one blunder costs you the
  *     game".
  *
  * Requires the eval-annotated Lichess dump (`NOWCHESS_PGN_PATH` → an evals dump); JDBC games carry no per-move evals.
  */
 object AccuracyBlunderJob:
  def main(args: Array[String]): Unit =
    val jdbcUrl   = sys.env.getOrElse("NOWCHESS_JDBC_URL", "jdbc:postgresql://localhost:5432/nowchess")
    val dbUser    = sys.env.getOrElse("NOWCHESS_DB_USER", "nowchess")
    val dbPass    = sys.env.getOrElse("NOWCHESS_DB_PASS", "nowchess")
    val outputDir = if args.length > 0 then args(0) else "/tmp/nowchess-accuracy"
    val spark = SparkSession
      .builder()
      .appName("NowChess Accuracy & Blunders")
      .getOrCreate()
    run(spark, jdbcUrl, dbUser, dbPass, outputDir)
    spark.stop()
  def run(spark: SparkSession, jdbcUrl: String, dbUser: String, dbPass: String, outputDir: String): Unit =
    val games = GameSource
      .loadExtended(spark, jdbcUrl, dbUser, dbPass)
      .select("pgn", "result", "white_elo", "black_elo")
      .filter(F.col("result").isNotNull.and(F.col("pgn").contains("[%eval")))
      .withColumn("game_id", F.monotonically_increasing_id())
    // White-POV pawn evals in ply order; mate → ±10, normal evals clamped to ±10.
    val evalStrs = F.expr("""regexp_extract_all(pgn, '\\[%eval ([^\\]]+)\\]', 1)""")
    val evalCps = F.expr(
      "transform(eval_strs, x -> CASE " +
        "WHEN x LIKE '#-%' THEN -10.0 " +
        "WHEN x LIKE '#%' THEN 10.0 " +
        "ELSE greatest(-10.0, least(10.0, cast(x as double))) END)",
    )
    val withEvals = games
      .withColumn("eval_strs", evalStrs)
      .withColumn("eval_cp", evalCps)
      .filter(F.size(F.col("eval_cp")) >= 2)
    val plies = withEvals.select(
      F.col("game_id"),
      F.col("result"),
      F.col("white_elo"),
      F.col("black_elo"),
      F.posexplode(F.col("eval_cp")).as(Seq("ply", "eval_after")),
    )
    val byGame     = Window.partitionBy("game_id").orderBy("ply")
    val mover      = F.when(F.col("ply") % 2 === 0, "white").otherwise("black")
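    // The first ply has no previous eval, so lag is null; fall back to 0.15 pawns
    // (presumably a nominal starting-position eval from White's point of view).
    val evalBefore = F.coalesce(F.lag("eval_after", 1).over(byGame), F.lit(0.15))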
    val cpl = F.greatest(
      F.lit(0.0),
      F.when(F.col("mover") === "white", evalBefore - F.col("eval_after"))
        .otherwise(F.col("eval_after") - evalBefore),
    ) * 100
    val moves = plies
      .withColumn("mover", mover)
      .withColumn("cpl", cpl)
    val perSide = moves
      .groupBy("game_id", "mover", "result", "white_elo", "black_elo")
      .agg(
        F.round(F.avg("cpl"), 1).as("acpl"),
        F.sum(F.when(F.col("cpl") >= 200, 1).otherwise(0)).as("blunders"),
        F.sum(F.when(F.col("cpl") >= 100 && F.col("cpl") < 200, 1).otherwise(0)).as("mistakes"),
        F.sum(F.when(F.col("cpl") >= 50 && F.col("cpl") < 100, 1).otherwise(0)).as("inaccuracies"),
      )
      .withColumn(
        "self_elo",
        F.when(F.col("mover") === "white", F.col("white_elo")).otherwise(F.col("black_elo")),
      )
      .withColumn("won", F.when(F.col("mover") === F.col("result"), 1).otherwise(0))
    writeAccuracyByRating(perSide, jdbcUrl, dbUser, dbPass, outputDir)
    writeBlunderOutcome(perSide, jdbcUrl, dbUser, dbPass, outputDir)
  private def writeAccuracyByRating(
      perSide: org.apache.spark.sql.DataFrame,
      jdbcUrl: String,
      dbUser: String,
      dbPass: String,
      outputDir: String,
  ): Unit =
    val elo = F.col("self_elo")
    val band = F
      .when(elo < 1200, "<1200")
      .when(elo < 1500, "1200–1499")
      .when(elo < 1800, "1500–1799")
      .when(elo < 2100, "1800–2099")
      .otherwise("2100+")
    val bandOrder = F
      .when(elo < 1200, 1)
      .when(elo < 1500, 2)
      .when(elo < 1800, 3)
      .when(elo < 2100, 4)
      .otherwise(5)
    val stats = perSide
      .filter(elo.isNotNull)
      .withColumn("band", band)
      .withColumn("band_order", bandOrder)
      .groupBy("band", "band_order")
      .agg(
        F.count("*").as("player_games"),
        F.round(F.avg("acpl"), 1).as("avg_acpl"),
        F.round(F.avg("blunders"), 2).as("avg_blunders"),
        F.round(F.avg("mistakes"), 2).as("avg_mistakes"),
        F.round(F.avg("inaccuracies"), 2).as("avg_inaccuracies"),
        F.round(F.avg("won"), 3).as("win_rate"),
      )
      .orderBy(F.asc("band_order"))
      .drop("band_order")
    write(stats, outputDir, "accuracy_by_rating", jdbcUrl, dbUser, dbPass, "analytics_accuracy_by_rating")
  private def writeBlunderOutcome(
      perSide: org.apache.spark.sql.DataFrame,
      jdbcUrl: String,
      dbUser: String,
      dbPass: String,
      outputDir: String,
  ): Unit =
    val b      = F.col("blunders")
    val bucket = F.when(b === 0, "0").when(b === 1, "1").when(b === 2, "2").otherwise("3+")
    val order  = F.when(b === 0, 0).when(b === 1, 1).when(b === 2, 2).otherwise(3)
    val stats = perSide
      .withColumn("blunder_bucket", bucket)
      .withColumn("bucket_order", order)
      .groupBy("blunder_bucket", "bucket_order")
      .agg(
        F.count("*").as("player_games"),
        F.round(F.avg("won"), 3).as("win_rate"),
        F.round(F.avg("acpl"), 1).as("avg_acpl"),
      )
      .orderBy(F.asc("bucket_order"))
      .drop("bucket_order")
    write(stats, outputDir, "blunder_outcome", jdbcUrl, dbUser, dbPass, "analytics_blunder_outcome")
  private def write(
      df: org.apache.spark.sql.DataFrame,
      outputDir: String,
      name: String,
      jdbcUrl: String,
      dbUser: String,
      dbPass: String,
      table: String,
  ): Unit =
    df.write.mode("overwrite").parquet(s"$outputDir/$name")
    df.write.mode("overwrite").option("header", "true").csv(s"$outputDir/${name}_csv")
    if !GameSource.isPgnMode then
      df.write
        .mode("overwrite")
        .format("jdbc")
        .option("url", jdbcUrl)
        .option("dbtable", table)
        .option("user", dbUser)
        .option("password", dbPass)
        .option("driver", "org.postgresql.Driver")
        .save()
@@ -0,0 +1,199 @@
 package de.nowchess.analytics
 import org.apache.spark.sql.SparkSession
 import org.apache.spark.sql.expressions.Window
 import org.apache.spark.sql.functions as F
 /** Time-management & clock-pressure analysis mined from Lichess `[%clk ...]` move annotations.
  *
  * Lichess records each player's remaining clock after every move (`{ [%clk 0:02:31] }`). This job reconstructs
  * per-move thinking time and remaining-time from those stamps to answer questions the existing time-control summary
  * cannot: how long do players actually think, how often do they fall into time scrambles (<10 s left), how often do
  * they flag (lose on time), and does burning the clock correlate with winning?
  *
  * Pipeline (Spark SQL string/array funcs + window funcs — no UDFs):
  *   1. `regexp_extract_all` pulls every `h:mm:ss` clock in ply order, converted to seconds.
  *   2. `posexplode` → one row per ply; even plies are White's clock, odd plies Black's.
  *   3. A per-(game,side) window `lag` gives the same side's previous clock; the difference is that move's thinking time.
  *      Remaining clock <10 s marks a time-scramble move.
  *   4. Roll up to (game, side): avg move time, scramble fraction, min clock, Elo, win flag, and whether the side lost on
  *      time (`Termination "Time forfeit"`).
  *
  * Outputs (Parquet + CSV + JDBC):
  *   - `clock_by_rating` — avg move time, scramble fraction, flag-loss rate and win-rate per Elo band.
  *   - `scramble_outcome` — win-rate bucketed by how much of the game was played in time-scramble. Quantifies the cost of
  *     time trouble.
  *
  * Requires a clock-annotated Lichess dump (`NOWCHESS_PGN_PATH`).
  */
 object ClockPressureJob:
  def main(args: Array[String]): Unit =
    val jdbcUrl   = sys.env.getOrElse("NOWCHESS_JDBC_URL", "jdbc:postgresql://localhost:5432/nowchess")
    val dbUser    = sys.env.getOrElse("NOWCHESS_DB_USER", "nowchess")
    val dbPass    = sys.env.getOrElse("NOWCHESS_DB_PASS", "nowchess")
    val outputDir = if args.length > 0 then args(0) else "/tmp/nowchess-clock-pressure"
    val spark = SparkSession
      .builder()
      .appName("NowChess Clock Pressure")
      .getOrCreate()
    run(spark, jdbcUrl, dbUser, dbPass, outputDir)
    spark.stop()
  def run(spark: SparkSession, jdbcUrl: String, dbUser: String, dbPass: String, outputDir: String): Unit =
    val games = GameSource
      .loadExtended(spark, jdbcUrl, dbUser, dbPass)
      .select("pgn", "result", "white_elo", "black_elo", "termination")
      .filter(F.col("result").isNotNull.and(F.col("pgn").contains("[%clk")))
      .withColumn("game_id", F.monotonically_increasing_id())
    val clkStrs = F.expr("""regexp_extract_all(pgn, '\\[%clk ([^\\]]+)\\]', 1)""")
    // "h:mm:ss" → seconds.
    val clkSecs = F.expr(
      "transform(clk_strs, x -> " +
        "cast(split(x, ':')[0] as double) * 3600 + " +
        "cast(split(x, ':')[1] as double) * 60 + " +
        "cast(split(x, ':')[2] as double))",
    )
    val withClk = games
      .withColumn("clk_strs", clkStrs)
      .withColumn("clk_sec", clkSecs)
      .filter(F.size(F.col("clk_sec")) >= 4)
    val plies = withClk.select(
      F.col("game_id"),
      F.col("result"),
      F.col("white_elo"),
      F.col("black_elo"),
      F.col("termination"),
      F.posexplode(F.col("clk_sec")).as(Seq("ply", "clk_after")),
    )
    val mover    = F.when(F.col("ply") % 2 === 0, "white").otherwise("black")
    val bySide   = Window.partitionBy("game_id", "mover").orderBy("ply")
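    // Same side's previous clock minus its current clock. Null for each side's first move
    // (avg skips nulls); can be negative when an increment is added to the clock.
    val moveTime = F.lag("clk_after", 1).over(bySide) - F.col("clk_after")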
    val moves = plies
      .withColumn("mover", mover)
      .withColumn("move_time", moveTime)
    val perSide = moves
      .groupBy("game_id", "mover", "result", "white_elo", "black_elo", "termination")
      .agg(
        F.round(F.avg("move_time"), 1).as("avg_move_time"),
        F.count("*").as("moves"),
        F.round(F.min("clk_after"), 1).as("min_clk"),
        F.sum(F.when(F.col("clk_after") < 10, 1).otherwise(0)).as("scramble_moves"),
      )
      .withColumn("scramble_fraction", F.round(F.col("scramble_moves") / F.col("moves"), 3))
      .withColumn(
        "self_elo",
        F.when(F.col("mover") === "white", F.col("white_elo")).otherwise(F.col("black_elo")),
      )
      .withColumn("won", F.when(F.col("mover") === F.col("result"), 1).otherwise(0))
      .withColumn(
        "flag_loss",
        F.when(
          F.coalesce(F.col("termination"), F.lit("")).contains("Time forfeit") && F.col("won") === 0,
          1,
        ).otherwise(0),
      )
    writeClockByRating(perSide, jdbcUrl, dbUser, dbPass, outputDir)
    writeScrambleOutcome(perSide, jdbcUrl, dbUser, dbPass, outputDir)
  private def writeClockByRating(
      perSide: org.apache.spark.sql.DataFrame,
      jdbcUrl: String,
      dbUser: String,
      dbPass: String,
      outputDir: String,
  ): Unit =
    val elo = F.col("self_elo")
    val band = F
      .when(elo < 1200, "<1200")
      .when(elo < 1500, "1200–1499")
      .when(elo < 1800, "1500–1799")
      .when(elo < 2100, "1800–2099")
      .otherwise("2100+")
    val bandOrder = F
      .when(elo < 1200, 1)
      .when(elo < 1500, 2)
      .when(elo < 1800, 3)
      .when(elo < 2100, 4)
      .otherwise(5)
    val stats = perSide
      .filter(elo.isNotNull)
      .withColumn("band", band)
      .withColumn("band_order", bandOrder)
      .groupBy("band", "band_order")
      .agg(
        F.count("*").as("player_games"),
        F.round(F.avg("avg_move_time"), 1).as("avg_move_time_s"),
        F.round(F.avg("scramble_fraction"), 3).as("avg_scramble_fraction"),
        F.round(F.avg("flag_loss"), 3).as("flag_loss_rate"),
        F.round(F.avg("won"), 3).as("win_rate"),
      )
      .orderBy(F.asc("band_order"))
      .drop("band_order")
    write(stats, outputDir, "clock_by_rating", jdbcUrl, dbUser, dbPass, "analytics_clock_by_rating")
  private def writeScrambleOutcome(
      perSide: org.apache.spark.sql.DataFrame,
      jdbcUrl: String,
      dbUser: String,
      dbPass: String,
      outputDir: String,
  ): Unit =
    val sf = F.col("scramble_fraction")
    val bucket = F
      .when(sf === 0, "none")
      .when(sf < 0.05, "<5%")
      .when(sf < 0.20, "5–20%")
      .otherwise(">20%")
    val order = F
      .when(sf === 0, 0)
      .when(sf < 0.05, 1)
      .when(sf < 0.20, 2)
      .otherwise(3)
    val stats = perSide
      .withColumn("scramble_bucket", bucket)
      .withColumn("bucket_order", order)
      .groupBy("scramble_bucket", "bucket_order")
      .agg(
        F.count("*").as("player_games"),
        F.round(F.avg("won"), 3).as("win_rate"),
        F.round(F.avg("flag_loss"), 3).as("flag_loss_rate"),
      )
      .orderBy(F.asc("bucket_order"))
      .drop("bucket_order")
    write(stats, outputDir, "scramble_outcome", jdbcUrl, dbUser, dbPass, "analytics_scramble_outcome")
  private def write(
      df: org.apache.spark.sql.DataFrame,
      outputDir: String,
      name: String,
      jdbcUrl: String,
      dbUser: String,
      dbPass: String,
      table: String,
  ): Unit =
    df.write.mode("overwrite").parquet(s"$outputDir/$name")
    df.write.mode("overwrite").option("header", "true").csv(s"$outputDir/${name}_csv")
    if !GameSource.isPgnMode then
      df.write
        .mode("overwrite")
        .format("jdbc")
        .option("url", jdbcUrl)
        .option("dbtable", table)
        .option("user", dbUser)
        .option("password", dbPass)
        .option("driver", "org.postgresql.Driver")
        .save()
@@ -0,0 +1,154 @@
 package de.nowchess.analytics
 import org.apache.spark.sql.SparkSession
 import org.apache.spark.sql.expressions.Window
 import org.apache.spark.sql.functions as F
 /** Smurf / sandbagging anomaly detection via population z-scores.
  *
  * Smurfs (strong players on fresh accounts) and sandbaggers leave a statistical signature: a win-rate, an upset-rate
  * (beating higher-rated opponents) and a self-Elo climb that sit far above the population norm. This job builds those
  * three features per player, standardises each against the whole player base, and flags the players whose combined
  * deviation is extreme.
  *
  * Features per player (from each game's own/opponent Elo):
  *   - win_rate — fraction of decisive results won
  *   - upset_rate — wins vs higher-rated opponents / games vs higher-rated opponents
  *   - elo_climb — max self-Elo − min self-Elo across their games (rapid rating gain)
  *
  * Standardisation uses a single unbounded window (`Window.partitionBy()`), i.e. mean/stddev over every qualifying
  * player, so z = (x − μ) / σ. The composite anomaly score sums the three z-scores. No UDFs — pure SQL aggregates +
  * window functions, so Catalyst plans the whole job.
  *
  * Outputs (Parquet + CSV + JDBC):
  *   - `anomaly_scores` — every qualifying player with features, z-scores and composite, ranked most-anomalous first.
  *   - `flagged_smurfs` — the suspicious subset (high composite, or the classic high-winrate / few-games / steep-climb
  *     profile).
  *
  * Meaningful only when Elo is present (Lichess dump); requires `minGames` (arg 1, default 15) to avoid small-sample
  * noise.
  */
 object SmurfAnomalyJob:
  def main(args: Array[String]): Unit =
    val jdbcUrl   = sys.env.getOrElse("NOWCHESS_JDBC_URL", "jdbc:postgresql://localhost:5432/nowchess")
    val dbUser    = sys.env.getOrElse("NOWCHESS_DB_USER", "nowchess")
    val dbPass    = sys.env.getOrElse("NOWCHESS_DB_PASS", "nowchess")
    val outputDir = if args.length > 0 then args(0) else "/tmp/nowchess-smurf-anomaly"
    val minGames  = if args.length > 1 then args(1).toInt else 15
    val spark = SparkSession
      .builder()
      .appName("NowChess Smurf Anomaly Detection")
      .getOrCreate()
    run(spark, jdbcUrl, dbUser, dbPass, outputDir, minGames)
    spark.stop()
  def run(
      spark: SparkSession,
      jdbcUrl: String,
      dbUser: String,
      dbPass: String,
      outputDir: String,
      minGames: Int,
  ): Unit =
    val games = GameSource
      .loadExtended(spark, jdbcUrl, dbUser, dbPass)
      .select("white_id", "black_id", "result", "white_elo", "black_elo")
      .filter(F.col("result").isNotNull)
    val asWhite = games.select(
      F.col("white_id").as("player_id"),
      F.col("white_elo").as("self_elo"),
      F.col("black_elo").as("opp_elo"),
      F.when(F.col("result") === "white", 1).otherwise(0).as("won"),
    )
    val asBlack = games.select(
      F.col("black_id").as("player_id"),
      F.col("black_elo").as("self_elo"),
      F.col("white_elo").as("opp_elo"),
      F.when(F.col("result") === "black", 1).otherwise(0).as("won"),
    )
    val playerGames = asWhite
      .union(asBlack)
      .filter(F.col("self_elo").isNotNull.and(F.col("opp_elo").isNotNull))
    val higher = F.col("opp_elo") > F.col("self_elo")
    val features = playerGames
      .groupBy("player_id")
      .agg(
        F.count("*").as("total_games"),
        F.round(F.avg("won"), 3).as("win_rate"),
        F.round(F.avg("self_elo"), 0).as("avg_self_elo"),
        (F.max("self_elo") - F.min("self_elo")).as("elo_climb"),
        F.sum(F.when(higher, 1).otherwise(0)).as("vs_higher"),
        F.sum(F.when(higher && F.col("won") === 1, 1).otherwise(0)).as("upsets"),
      )
      .filter(F.col("total_games") >= minGames)
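      // greatest(vs_higher, 1) avoids division by zero for players who never faced a higher-rated opponent.
      .withColumn("upset_rate", F.round(F.col("upsets") / F.greatest(F.col("vs_higher"), F.lit(1)), 3))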
    val all = Window.partitionBy()
    def z(col: String): org.apache.spark.sql.Column =
      val mean = F.avg(col).over(all)
      val std  = F.stddev(col).over(all)
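      // Divide by 1.0 when the stddev is 0 or null (e.g. fewer than two qualifying players) so z stays finite.
      F.round((F.col(col) - mean) / F.when(std === 0 || std.isNull, F.lit(1.0)).otherwise(std), 2)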
    val scored = features
      .withColumn("z_win_rate", z("win_rate"))
      .withColumn("z_upset_rate", z("upset_rate"))
      .withColumn("z_elo_climb", z("elo_climb"))
      .withColumn(
        "anomaly_score",
        F.round(F.col("z_win_rate") + F.col("z_upset_rate") + F.col("z_elo_climb"), 2),
      )
      .withColumn(
        "flagged",
        (F.col("anomaly_score") >= 4.0)
          .or(F.col("win_rate") >= 0.8 && F.col("total_games") < 50 && F.col("elo_climb") >= 300),
      )
    val ordered = scored
      .select(
        "player_id",
        "total_games",
        "win_rate",
        "avg_self_elo",
        "elo_climb",
        "upset_rate",
        "z_win_rate",
        "z_upset_rate",
        "z_elo_climb",
        "anomaly_score",
        "flagged",
      )
      .orderBy(F.desc("anomaly_score"))
    write(ordered, outputDir, "anomaly_scores", jdbcUrl, dbUser, dbPass, "analytics_smurf_anomaly")
    val flagged = ordered.filter(F.col("flagged") === true)
    write(flagged, outputDir, "flagged_smurfs", jdbcUrl, dbUser, dbPass, "analytics_flagged_smurfs")
  private def write(
      df: org.apache.spark.sql.DataFrame,
      outputDir: String,
      name: String,
      jdbcUrl: String,
      dbUser: String,
      dbPass: String,
      table: String,
  ): Unit =
    df.write.mode("overwrite").parquet(s"$outputDir/$name")
    df.write.mode("overwrite").option("header", "true").csv(s"$outputDir/${name}_csv")
    if !GameSource.isPgnMode then
      df.write
        .mode("overwrite")
        .format("jdbc")
        .option("url", jdbcUrl)
        .option("dbtable", table)
        .option("user", dbUser)
        .option("password", dbPass)
        .option("driver", "org.postgresql.Driver")
        .save()
@@ -1,3 +1,3 @@
 MAJOR=0
-MINOR=7
+MINOR=8
 PATCH=0
@@ -843,3 +843,306 @@
 ### Reverts
 * Revert "refactor: update metrics paths formatting in application.yml for clarity" ([3870566](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/38705663498d5f47c40dafe2f26198589ede8656))
 ##  (2026-06-23)
 ### Features
 * add initialization metrics for various services ([d438e97](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d438e97f32bdde0bfc63c1b4a8cc810cdd093166))
 * add OpenTelemetry trace configuration with parentbased sampler ([3904d5a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3904d5ad8ad4930ddee65287a7bfab785a6148f5))
 * **analytics:** add Spark batch analytics module ([#70](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/70)) ([39f1657](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/39f1657e1db6e84889af338c43be8cb5c03c3ec3))
 * **config:** update application.yml for PostgreSQL and remove staging/production configurations ([2404e61](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2404e6164c3b50ffccbea5238d636060d6abe4d6))
 * **config:** update application.yml for staging and production environments ([6113432](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/6113432a14c476a3a0dfc0d449e17d023697f2ba))
 * configure logging and add OpenTelemetry support ([#49](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/49)) ([d57c488](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d57c4886612d1d92da0e1b79209fc83e6ef537a1))
 * **docker:** add .dockerignore and .gitignore files for build exclusions ([c987d8e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c987d8e258c0e6c4cfbdaa8381c64c410d7a2b83))
 * **docker:** add Dockerfiles for building Quarkus application in native and JVM modes ([3f2d2bb](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f2d2bb4c97fa8cddba66e1da4427c54236dfeed))
 * **docker:** add Dockerfiles for Quarkus application in JVM and native modes ([34b9933](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/34b993304670cf2aa62cd2f6460cee7b9864b08e))
 * **events:** migrate game-creation and bot flows to Redis Streams NCS-89 ([#62](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/62)) ([a24924c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a24924c23057db3d700a75dbc4333557789cd991))
 * NCS-78 Add Traceability to the Applications ([#46](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/46)) ([649566e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/649566eb3fcf38f91c8896a739f74ea318af312d))
 * NCS-78 Add Traceability to the Applications ([#47](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/47)) ([87dfc6c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/87dfc6c2bcce7f7d58fc641bd8d468a2e584c108))
 * NCS-82 add Swiss-system tournament module ([#55](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/55)) ([c5661de](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c5661de4a0ebf4b33211f5a391840dcf744656b7))
 * **official-bots:** consume GameOver stream for bot cleanup ([#67](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/67)) ([db9d153](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/db9d1533912f4b41c4d1ca80ccffdde5d23d6ff6))
 * **official-bots:** make HybridBot veto actionable and use it for expert ([1df29cf](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1df29cf3a6e21af3f396b2b7a6da67d978f941ae))
 * **official-bots:** park expert bot on tournament server at startup ([#75](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/75)) ([30295a4](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/30295a4bb95855ee8261c92278bb9ebc80ee12ee))
 * **official-bots:** resolve tournament bot token from Redis and account service ([386ddc5](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/386ddc5c19f8f893b16c6422aa5393b54c872e45))
 * **tournament:** auto-join external tournaments and publish created ones ([#77](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/77)) ([9978b7e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/9978b7ea78eb658a225a461b9cd339386c0c14f3))
 * **tournament:** federate tournaments across clusters with DB replication ([5b000a6](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5b000a6e5f04ea6770d1c7ab6bfdaded77a99172))
 * **tournament:** seed external server registry from env var on startup ([845dc9c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/845dc9c2935c8bc1be42541dfaf31c9a861d3272))
 * true-microservices ([#40](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/40)) ([5909242](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/590924254e8a2754de661a57a03e43f89ceb6299))
 ### Bug Fixes
 * enable official bots to connect to external tournament server ([#71](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/71)) ([688d30e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/688d30e2b10026923372be5fca3c63eaaee2de2a))
 * **official-bots:** configure JWT verification ([#72](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/72)) ([98c64fc](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/98c64fc0d56dc542beb31c75f4b9056d91de03cd))
 * **official-bots:** correct parkOn path from /api/bots to /api/account/bots ([1be9949](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1be9949c0b5c6a1db535696620d77735050d6c93))
 * **official-bots:** derive tournament game color from game endpoint ([#79](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/79)) ([bfc4672](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/bfc46723e615bb9b65f7f9bba5f53877c4f079a7))
 * **official-bots:** discover tournament games by polling, not just the stream ([10113fd](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/10113fd0579b614d15870798d933bc9c495d2049))
 * **official-bots:** make botToken optional, fall back to env, fix 502 status ([f43d193](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/f43d1930d80670d810c57b54eaa3789854fa082c))
 * **official-bots:** NCS-70-auto-register official bots with account service ([#59](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/59)) ([7117a93](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/7117a93376272094d0b1a6abf2121254ce396684))
 * **official-bots:** park on external tournament servers using correct endpoint and token ([3188241](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/31882417377468b41bbe3ff94506aa4928024450))
 * **official-bots:** play games by polling state instead of NDJSON stream ([bfb15c7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/bfb15c7299bd471d5e064a577ed10af98e2ea90a))
 * **official-bots:** play only own tournament games with correct color ([4651bb7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4651bb796f07a21bd013d9521b2dfe2e1078cebb))
 * **official-bots:** prioritize Redis token over stale env var in joinTournament ([83dd2d4](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/83dd2d4335ca48eb3e5aa234a75367574276ba63))
 * **official-bots:** register with tournament server directly to get correct token ([64b5d55](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/64b5d5567f110c2fe152558c7de275a1e0b30e21))
 * **official-bots:** resolve per-difficulty bot token on tournament join ([fdf4c94](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/fdf4c94811d086996447bb4657fac1d9bd6e5a93))
 * **official-bots:** resume tournaments already joined after restart ([285b73e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/285b73efbd6dd98cec410ade9eead9881d693a8f))
 * **official-bots:** sync bots before token fetch on first startup after DB wipe ([b0ddb27](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b0ddb274d23bca8b1b3f691ce0d643f33e0b54cd))
 ### Reverts
 * Revert "refactor: update metrics paths formatting in application.yml for clarity" ([3870566](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/38705663498d5f47c40dafe2f26198589ede8656))
 ##  (2026-06-23)
 ### Features
 * add initialization metrics for various services ([d438e97](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d438e97f32bdde0bfc63c1b4a8cc810cdd093166))
 * add OpenTelemetry trace configuration with parentbased sampler ([3904d5a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3904d5ad8ad4930ddee65287a7bfab785a6148f5))
 * **analytics:** add Spark batch analytics module ([#70](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/70)) ([39f1657](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/39f1657e1db6e84889af338c43be8cb5c03c3ec3))
 * **config:** update application.yml for PostgreSQL and remove staging/production configurations ([2404e61](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2404e6164c3b50ffccbea5238d636060d6abe4d6))
 * **config:** update application.yml for staging and production environments ([6113432](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/6113432a14c476a3a0dfc0d449e17d023697f2ba))
 * configure logging and add OpenTelemetry support ([#49](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/49)) ([d57c488](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d57c4886612d1d92da0e1b79209fc83e6ef537a1))
 * **docker:** add .dockerignore and .gitignore files for build exclusions ([c987d8e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c987d8e258c0e6c4cfbdaa8381c64c410d7a2b83))
 * **docker:** add Dockerfiles for building Quarkus application in native and JVM modes ([3f2d2bb](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f2d2bb4c97fa8cddba66e1da4427c54236dfeed))
 * **docker:** add Dockerfiles for Quarkus application in JVM and native modes ([34b9933](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/34b993304670cf2aa62cd2f6460cee7b9864b08e))
 * **events:** migrate game-creation and bot flows to Redis Streams NCS-89 ([#62](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/62)) ([a24924c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a24924c23057db3d700a75dbc4333557789cd991))
 * NCS-78 Add Traceability to the Applications ([#46](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/46)) ([649566e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/649566eb3fcf38f91c8896a739f74ea318af312d))
 * NCS-78 Add Traceability to the Applications ([#47](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/47)) ([87dfc6c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/87dfc6c2bcce7f7d58fc641bd8d468a2e584c108))
 * NCS-82 add Swiss-system tournament module ([#55](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/55)) ([c5661de](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c5661de4a0ebf4b33211f5a391840dcf744656b7))
 * **official-bots:** activate opening book in expert bot (native-safe) ([260db25](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/260db25803ec55ce99e55782791eabdc190dfed4))
 * **official-bots:** consume GameOver stream for bot cleanup ([#67](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/67)) ([db9d153](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/db9d1533912f4b41c4d1ca80ccffdde5d23d6ff6))
 * **official-bots:** make HybridBot veto actionable and use it for expert ([1df29cf](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1df29cf3a6e21af3f396b2b7a6da67d978f941ae))
 * **official-bots:** park expert bot on tournament server at startup ([#75](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/75)) ([30295a4](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/30295a4bb95855ee8261c92278bb9ebc80ee12ee))
 * **official-bots:** resolve tournament bot token from Redis and account service ([386ddc5](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/386ddc5c19f8f893b16c6422aa5393b54c872e45))
 * **tournament:** auto-join external tournaments and publish created ones ([#77](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/77)) ([9978b7e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/9978b7ea78eb658a225a461b9cd339386c0c14f3))
 * **tournament:** federate tournaments across clusters with DB replication ([5b000a6](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5b000a6e5f04ea6770d1c7ab6bfdaded77a99172))
 * **tournament:** seed external server registry from env var on startup ([845dc9c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/845dc9c2935c8bc1be42541dfaf31c9a861d3272))
 * true-microservices ([#40](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/40)) ([5909242](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/590924254e8a2754de661a57a03e43f89ceb6299))
 ### Bug Fixes
 * enable official bots to connect to external tournament server ([#71](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/71)) ([688d30e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/688d30e2b10026923372be5fca3c63eaaee2de2a))
 * **official-bots:** configure JWT verification ([#72](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/72)) ([98c64fc](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/98c64fc0d56dc542beb31c75f4b9056d91de03cd))
 * **official-bots:** correct parkOn path from /api/bots to /api/account/bots ([1be9949](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1be9949c0b5c6a1db535696620d77735050d6c93))
 * **official-bots:** derive tournament game color from game endpoint ([#79](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/79)) ([bfc4672](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/bfc46723e615bb9b65f7f9bba5f53877c4f079a7))
 * **official-bots:** discover tournament games by polling, not just the stream ([10113fd](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/10113fd0579b614d15870798d933bc9c495d2049))
 * **official-bots:** make botToken optional, fall back to env, fix 502 status ([f43d193](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/f43d1930d80670d810c57b54eaa3789854fa082c))
 * **official-bots:** NCS-70-auto-register official bots with account service ([#59](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/59)) ([7117a93](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/7117a93376272094d0b1a6abf2121254ce396684))
 * **official-bots:** park on external tournament servers using correct endpoint and token ([3188241](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/31882417377468b41bbe3ff94506aa4928024450))
 * **official-bots:** play games by polling state instead of NDJSON stream ([bfb15c7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/bfb15c7299bd471d5e064a577ed10af98e2ea90a))
 * **official-bots:** play only own tournament games with correct color ([4651bb7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4651bb796f07a21bd013d9521b2dfe2e1078cebb))
 * **official-bots:** prioritize Redis token over stale env var in joinTournament ([83dd2d4](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/83dd2d4335ca48eb3e5aa234a75367574276ba63))
 * **official-bots:** register with tournament server directly to get correct token ([64b5d55](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/64b5d5567f110c2fe152558c7de275a1e0b30e21))
 * **official-bots:** resolve per-difficulty bot token on tournament join ([fdf4c94](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/fdf4c94811d086996447bb4657fac1d9bd6e5a93))
 * **official-bots:** resume tournaments already joined after restart ([285b73e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/285b73efbd6dd98cec410ade9eead9881d693a8f))
 * **official-bots:** sync bots before token fetch on first startup after DB wipe ([b0ddb27](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b0ddb274d23bca8b1b3f691ce0d643f33e0b54cd))
 ### Reverts
 * Revert "refactor: update metrics paths formatting in application.yml for clarity" ([3870566](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/38705663498d5f47c40dafe2f26198589ede8656))
 ##  (2026-06-23)
 ### Features
 * add initialization metrics for various services ([d438e97](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d438e97f32bdde0bfc63c1b4a8cc810cdd093166))
 * add OpenTelemetry trace configuration with parentbased sampler ([3904d5a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3904d5ad8ad4930ddee65287a7bfab785a6148f5))
 * **analytics:** add Spark batch analytics module ([#70](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/70)) ([39f1657](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/39f1657e1db6e84889af338c43be8cb5c03c3ec3))
 * **config:** update application.yml for PostgreSQL and remove staging/production configurations ([2404e61](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2404e6164c3b50ffccbea5238d636060d6abe4d6))
 * **config:** update application.yml for staging and production environments ([6113432](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/6113432a14c476a3a0dfc0d449e17d023697f2ba))
 * configure logging and add OpenTelemetry support ([#49](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/49)) ([d57c488](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d57c4886612d1d92da0e1b79209fc83e6ef537a1))
 * **docker:** add .dockerignore and .gitignore files for build exclusions ([c987d8e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c987d8e258c0e6c4cfbdaa8381c64c410d7a2b83))
 * **docker:** add Dockerfiles for building Quarkus application in native and JVM modes ([3f2d2bb](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f2d2bb4c97fa8cddba66e1da4427c54236dfeed))
 * **docker:** add Dockerfiles for Quarkus application in JVM and native modes ([34b9933](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/34b993304670cf2aa62cd2f6460cee7b9864b08e))
 * **events:** migrate game-creation and bot flows to Redis Streams NCS-89 ([#62](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/62)) ([a24924c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a24924c23057db3d700a75dbc4333557789cd991))
 * NCS-78 Add Traceability to the Applications ([#46](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/46)) ([649566e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/649566eb3fcf38f91c8896a739f74ea318af312d))
 * NCS-78 Add Traceability to the Applications ([#47](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/47)) ([87dfc6c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/87dfc6c2bcce7f7d58fc641bd8d468a2e584c108))
 * NCS-82 add Swiss-system tournament module ([#55](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/55)) ([c5661de](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c5661de4a0ebf4b33211f5a391840dcf744656b7))
 * **official-bots:** activate opening book in expert bot (native-safe) ([260db25](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/260db25803ec55ce99e55782791eabdc190dfed4))
 * **official-bots:** consume GameOver stream for bot cleanup ([#67](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/67)) ([db9d153](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/db9d1533912f4b41c4d1ca80ccffdde5d23d6ff6))
 * **official-bots:** make HybridBot veto actionable and use it for expert ([1df29cf](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1df29cf3a6e21af3f396b2b7a6da67d978f941ae))
 * **official-bots:** park expert bot on tournament server at startup ([#75](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/75)) ([30295a4](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/30295a4bb95855ee8261c92278bb9ebc80ee12ee))
 * **official-bots:** resolve tournament bot token from Redis and account service ([386ddc5](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/386ddc5c19f8f893b16c6422aa5393b54c872e45))
 * **tournament:** auto-join external tournaments and publish created ones ([#77](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/77)) ([9978b7e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/9978b7ea78eb658a225a461b9cd339386c0c14f3))
 * **tournament:** federate tournaments across clusters with DB replication ([5b000a6](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5b000a6e5f04ea6770d1c7ab6bfdaded77a99172))
 * **tournament:** seed external server registry from env var on startup ([845dc9c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/845dc9c2935c8bc1be42541dfaf31c9a861d3272))
 * true-microservices ([#40](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/40)) ([5909242](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/590924254e8a2754de661a57a03e43f89ceb6299))
 ### Bug Fixes
 * enable official bots to connect to external tournament server ([#71](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/71)) ([688d30e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/688d30e2b10026923372be5fca3c63eaaee2de2a))
 * **official-bots:** configure JWT verification ([#72](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/72)) ([98c64fc](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/98c64fc0d56dc542beb31c75f4b9056d91de03cd))
 * **official-bots:** correct parkOn path from /api/bots to /api/account/bots ([1be9949](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1be9949c0b5c6a1db535696620d77735050d6c93))
 * **official-bots:** derive tournament game color from game endpoint ([#79](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/79)) ([bfc4672](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/bfc46723e615bb9b65f7f9bba5f53877c4f079a7))
 * **official-bots:** discover tournament games by polling, not just the stream ([10113fd](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/10113fd0579b614d15870798d933bc9c495d2049))
 * **official-bots:** make botToken optional, fall back to env, fix 502 status ([f43d193](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/f43d1930d80670d810c57b54eaa3789854fa082c))
 * **official-bots:** NCS-70-auto-register official bots with account service ([#59](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/59)) ([7117a93](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/7117a93376272094d0b1a6abf2121254ce396684))
 * **official-bots:** park on external tournament servers using correct endpoint and token ([3188241](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/31882417377468b41bbe3ff94506aa4928024450))
 * **official-bots:** play games by polling state instead of NDJSON stream ([bfb15c7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/bfb15c7299bd471d5e064a577ed10af98e2ea90a))
 * **official-bots:** play only own tournament games with correct color ([4651bb7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4651bb796f07a21bd013d9521b2dfe2e1078cebb))
 * **official-bots:** prioritize Redis token over stale env var in joinTournament ([83dd2d4](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/83dd2d4335ca48eb3e5aa234a75367574276ba63))
 * **official-bots:** register with tournament server directly to get correct token ([64b5d55](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/64b5d5567f110c2fe152558c7de275a1e0b30e21))
 * **official-bots:** resolve per-difficulty bot token on tournament join ([fdf4c94](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/fdf4c94811d086996447bb4657fac1d9bd6e5a93))
 * **official-bots:** resume tournaments already joined after restart ([285b73e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/285b73efbd6dd98cec410ade9eead9881d693a8f))
 * **official-bots:** sync bots before token fetch on first startup after DB wipe ([b0ddb27](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b0ddb274d23bca8b1b3f691ce0d643f33e0b54cd))
 * **official-bots:** use ThreadLocalRandom in PolyglotBook for native image ([1b30c3b](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1b30c3be393d25712c8743d3d9057207f8bbb67c))
 ### Reverts
 * Revert "refactor: update metrics paths formatting in application.yml for clarity" ([3870566](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/38705663498d5f47c40dafe2f26198589ede8656))
 ##  (2026-06-24)
 ### Features
 * add initialization metrics for various services ([d438e97](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d438e97f32bdde0bfc63c1b4a8cc810cdd093166))
 * add OpenTelemetry trace configuration with parentbased sampler ([3904d5a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3904d5ad8ad4930ddee65287a7bfab785a6148f5))
 * **analytics:** add Spark batch analytics module ([#70](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/70)) ([39f1657](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/39f1657e1db6e84889af338c43be8cb5c03c3ec3))
 * **config:** update application.yml for PostgreSQL and remove staging/production configurations ([2404e61](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2404e6164c3b50ffccbea5238d636060d6abe4d6))
 * **config:** update application.yml for staging and production environments ([6113432](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/6113432a14c476a3a0dfc0d449e17d023697f2ba))
 * configure logging and add OpenTelemetry support ([#49](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/49)) ([d57c488](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d57c4886612d1d92da0e1b79209fc83e6ef537a1))
 * **docker:** add .dockerignore and .gitignore files for build exclusions ([c987d8e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c987d8e258c0e6c4cfbdaa8381c64c410d7a2b83))
 * **docker:** add Dockerfiles for building Quarkus application in native and JVM modes ([3f2d2bb](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f2d2bb4c97fa8cddba66e1da4427c54236dfeed))
 * **docker:** add Dockerfiles for Quarkus application in JVM and native modes ([34b9933](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/34b993304670cf2aa62cd2f6460cee7b9864b08e))
 * **events:** migrate game-creation and bot flows to Redis Streams NCS-89 ([#62](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/62)) ([a24924c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a24924c23057db3d700a75dbc4333557789cd991))
 * NCS-78 Add Traceability to the Applications ([#46](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/46)) ([649566e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/649566eb3fcf38f91c8896a739f74ea318af312d))
 * NCS-78 Add Traceability to the Applications ([#47](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/47)) ([87dfc6c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/87dfc6c2bcce7f7d58fc641bd8d468a2e584c108))
 * NCS-82 add Swiss-system tournament module ([#55](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/55)) ([c5661de](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c5661de4a0ebf4b33211f5a391840dcf744656b7))
 * **official-bots:** activate opening book in expert bot (native-safe) ([260db25](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/260db25803ec55ce99e55782791eabdc190dfed4))
 * **official-bots:** add Google Colab notebook for NNUE training (NCS-111) ([#81](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/81)) ([fa10852](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/fa10852bc98451d4068ec6fb9e7a486b5e53ef5c))
 * **official-bots:** consume GameOver stream for bot cleanup ([#67](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/67)) ([db9d153](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/db9d1533912f4b41c4d1ca80ccffdde5d23d6ff6))
 * **official-bots:** implement king-relative (HalfKP) encoding in NNUE (NCS-109) ([#80](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/80)) ([44f376f](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/44f376f03221f086b898741436e13c93fd314dd1))
 * **official-bots:** make HybridBot veto actionable and use it for expert ([1df29cf](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1df29cf3a6e21af3f396b2b7a6da67d978f941ae))
 * **official-bots:** park expert bot on tournament server at startup ([#75](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/75)) ([30295a4](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/30295a4bb95855ee8261c92278bb9ebc80ee12ee))
 * **official-bots:** resolve tournament bot token from Redis and account service ([386ddc5](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/386ddc5c19f8f893b16c6422aa5393b54c872e45))
 * **tournament:** auto-join external tournaments and publish created ones ([#77](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/77)) ([9978b7e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/9978b7ea78eb658a225a461b9cd339386c0c14f3))
 * **tournament:** federate tournaments across clusters with DB replication ([5b000a6](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5b000a6e5f04ea6770d1c7ab6bfdaded77a99172))
 * **tournament:** seed external server registry from env var on startup ([845dc9c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/845dc9c2935c8bc1be42541dfaf31c9a861d3272))
 * true-microservices ([#40](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/40)) ([5909242](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/590924254e8a2754de661a57a03e43f89ceb6299))
 ### Bug Fixes
 * enable official bots to connect to external tournament server ([#71](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/71)) ([688d30e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/688d30e2b10026923372be5fca3c63eaaee2de2a))
 * modified training pipeline ([9f9140c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/9f9140cb585345cd244a1dfee1a06e51a5f7f7a8))
 * **official-bots:** configure JWT verification ([#72](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/72)) ([98c64fc](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/98c64fc0d56dc542beb31c75f4b9056d91de03cd))
 * **official-bots:** correct parkOn path from /api/bots to /api/account/bots ([1be9949](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1be9949c0b5c6a1db535696620d77735050d6c93))
 * **official-bots:** derive tournament game color from game endpoint ([#79](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/79)) ([bfc4672](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/bfc46723e615bb9b65f7f9bba5f53877c4f079a7))
 * **official-bots:** discover tournament games by polling, not just the stream ([10113fd](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/10113fd0579b614d15870798d933bc9c495d2049))
 * **official-bots:** make botToken optional, fall back to env, fix 502 status ([f43d193](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/f43d1930d80670d810c57b54eaa3789854fa082c))
 * **official-bots:** NCS-70-auto-register official bots with account service ([#59](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/59)) ([7117a93](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/7117a93376272094d0b1a6abf2121254ce396684))
 * **official-bots:** park on external tournament servers using correct endpoint and token ([3188241](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/31882417377468b41bbe3ff94506aa4928024450))
 * **official-bots:** play games by polling state instead of NDJSON stream ([bfb15c7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/bfb15c7299bd471d5e064a577ed10af98e2ea90a))
 * **official-bots:** play only own tournament games with correct color ([4651bb7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4651bb796f07a21bd013d9521b2dfe2e1078cebb))
 * **official-bots:** prioritize Redis token over stale env var in joinTournament ([83dd2d4](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/83dd2d4335ca48eb3e5aa234a75367574276ba63))
 * **official-bots:** register with tournament server directly to get correct token ([64b5d55](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/64b5d5567f110c2fe152558c7de275a1e0b30e21))
 * **official-bots:** resolve per-difficulty bot token on tournament join ([fdf4c94](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/fdf4c94811d086996447bb4657fac1d9bd6e5a93))
 * **official-bots:** resume tournaments already joined after restart ([285b73e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/285b73efbd6dd98cec410ade9eead9881d693a8f))
 * **official-bots:** sync bots before token fetch on first startup after DB wipe ([b0ddb27](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b0ddb274d23bca8b1b3f691ce0d643f33e0b54cd))
 * **official-bots:** use ThreadLocalRandom in PolyglotBook for native image ([1b30c3b](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1b30c3be393d25712c8743d3d9057207f8bbb67c))
 ### Reverts
 * Revert "refactor: update metrics paths formatting in application.yml for clarity" ([3870566](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/38705663498d5f47c40dafe2f26198589ede8656))
 ##  (2026-06-24)
 ### Features
 * add initialization metrics for various services ([d438e97](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d438e97f32bdde0bfc63c1b4a8cc810cdd093166))
 * add OpenTelemetry trace configuration with parentbased sampler ([3904d5a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3904d5ad8ad4930ddee65287a7bfab785a6148f5))
 * **analytics:** add Spark batch analytics module ([#70](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/70)) ([39f1657](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/39f1657e1db6e84889af338c43be8cb5c03c3ec3))
 * **config:** update application.yml for PostgreSQL and remove staging/production configurations ([2404e61](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2404e6164c3b50ffccbea5238d636060d6abe4d6))
 * **config:** update application.yml for staging and production environments ([6113432](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/6113432a14c476a3a0dfc0d449e17d023697f2ba))
 * configure logging and add OpenTelemetry support ([#49](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/49)) ([d57c488](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d57c4886612d1d92da0e1b79209fc83e6ef537a1))
 * **docker:** add .dockerignore and .gitignore files for build exclusions ([c987d8e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c987d8e258c0e6c4cfbdaa8381c64c410d7a2b83))
 * **docker:** add Dockerfiles for building Quarkus application in native and JVM modes ([3f2d2bb](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f2d2bb4c97fa8cddba66e1da4427c54236dfeed))
 * **docker:** add Dockerfiles for Quarkus application in JVM and native modes ([34b9933](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/34b993304670cf2aa62cd2f6460cee7b9864b08e))
 * **events:** migrate game-creation and bot flows to Redis Streams NCS-89 ([#62](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/62)) ([a24924c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a24924c23057db3d700a75dbc4333557789cd991))
 * **ncs-110:** feed NNUE root-move scores into search move ordering ([#83](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/83)) ([e4fee85](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/e4fee8513430093d46957970618935e99591519f))
 * NCS-78 Add Traceability to the Applications ([#46](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/46)) ([649566e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/649566eb3fcf38f91c8896a739f74ea318af312d))
 * NCS-78 Add Traceability to the Applications ([#47](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/47)) ([87dfc6c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/87dfc6c2bcce7f7d58fc641bd8d468a2e584c108))
 * NCS-82 add Swiss-system tournament module ([#55](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/55)) ([c5661de](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c5661de4a0ebf4b33211f5a391840dcf744656b7))
 * **official-bots:** activate opening book in expert bot (native-safe) ([260db25](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/260db25803ec55ce99e55782791eabdc190dfed4))
 * **official-bots:** add Google Colab notebook for NNUE training (NCS-111) ([#81](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/81)) ([fa10852](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/fa10852bc98451d4068ec6fb9e7a486b5e53ef5c))
 * **official-bots:** consume GameOver stream for bot cleanup ([#67](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/67)) ([db9d153](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/db9d1533912f4b41c4d1ca80ccffdde5d23d6ff6))
 * **official-bots:** implement king-relative (HalfKP) encoding in NNUE (NCS-109) ([#80](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/80)) ([44f376f](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/44f376f03221f086b898741436e13c93fd314dd1))
 * **official-bots:** make HybridBot veto actionable and use it for expert ([1df29cf](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1df29cf3a6e21af3f396b2b7a6da67d978f941ae))
 * **official-bots:** park expert bot on tournament server at startup ([#75](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/75)) ([30295a4](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/30295a4bb95855ee8261c92278bb9ebc80ee12ee))
 * **official-bots:** resolve tournament bot token from Redis and account service ([386ddc5](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/386ddc5c19f8f893b16c6422aa5393b54c872e45))
 * **tournament:** auto-join external tournaments and publish created ones ([#77](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/77)) ([9978b7e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/9978b7ea78eb658a225a461b9cd339386c0c14f3))
 * **tournament:** federate tournaments across clusters with DB replication ([5b000a6](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5b000a6e5f04ea6770d1c7ab6bfdaded77a99172))
 * **tournament:** seed external server registry from env var on startup ([845dc9c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/845dc9c2935c8bc1be42541dfaf31c9a861d3272))
 * true-microservices ([#40](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/40)) ([5909242](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/590924254e8a2754de661a57a03e43f89ceb6299))
 ### Bug Fixes
 * enable official bots to connect to external tournament server ([#71](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/71)) ([688d30e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/688d30e2b10026923372be5fca3c63eaaee2de2a))
 * modified training pipeline ([9f9140c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/9f9140cb585345cd244a1dfee1a06e51a5f7f7a8))
 * **official-bots:** configure JWT verification ([#72](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/72)) ([98c64fc](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/98c64fc0d56dc542beb31c75f4b9056d91de03cd))
 * **official-bots:** correct parkOn path from /api/bots to /api/account/bots ([1be9949](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1be9949c0b5c6a1db535696620d77735050d6c93))
 * **official-bots:** derive tournament game color from game endpoint ([#79](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/79)) ([bfc4672](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/bfc46723e615bb9b65f7f9bba5f53877c4f079a7))
 * **official-bots:** discover tournament games by polling, not just the stream ([10113fd](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/10113fd0579b614d15870798d933bc9c495d2049))
 * **official-bots:** make botToken optional, fall back to env, fix 502 status ([f43d193](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/f43d1930d80670d810c57b54eaa3789854fa082c))
 * **official-bots:** NCS-70-auto-register official bots with account service ([#59](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/59)) ([7117a93](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/7117a93376272094d0b1a6abf2121254ce396684))
 * **official-bots:** park on external tournament servers using correct endpoint and token ([3188241](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/31882417377468b41bbe3ff94506aa4928024450))
 * **official-bots:** play games by polling state instead of NDJSON stream ([bfb15c7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/bfb15c7299bd471d5e064a577ed10af98e2ea90a))
 * **official-bots:** play only own tournament games with correct color ([4651bb7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4651bb796f07a21bd013d9521b2dfe2e1078cebb))
 * **official-bots:** prioritize Redis token over stale env var in joinTournament ([83dd2d4](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/83dd2d4335ca48eb3e5aa234a75367574276ba63))
 * **official-bots:** register with tournament server directly to get correct token ([64b5d55](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/64b5d5567f110c2fe152558c7de275a1e0b30e21))
 * **official-bots:** resolve per-difficulty bot token on tournament join ([fdf4c94](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/fdf4c94811d086996447bb4657fac1d9bd6e5a93))
 * **official-bots:** resume tournaments already joined after restart ([285b73e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/285b73efbd6dd98cec410ade9eead9881d693a8f))
 * **official-bots:** sync bots before token fetch on first startup after DB wipe ([b0ddb27](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b0ddb274d23bca8b1b3f691ce0d643f33e0b54cd))
 * **official-bots:** use ThreadLocalRandom in PolyglotBook for native image ([1b30c3b](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1b30c3be393d25712c8743d3d9057207f8bbb67c))
 ### Reverts
 * Revert "refactor: update metrics paths formatting in application.yml for clarity" ([3870566](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/38705663498d5f47c40dafe2f26198589ede8656))
 ##  (2026-06-24)
 ### Features
 * add initialization metrics for various services ([d438e97](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d438e97f32bdde0bfc63c1b4a8cc810cdd093166))
 * add OpenTelemetry trace configuration with parentbased sampler ([3904d5a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3904d5ad8ad4930ddee65287a7bfab785a6148f5))
 * **analytics:** add Spark batch analytics module ([#70](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/70)) ([39f1657](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/39f1657e1db6e84889af338c43be8cb5c03c3ec3))
 * **config:** update application.yml for PostgreSQL and remove staging/production configurations ([2404e61](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2404e6164c3b50ffccbea5238d636060d6abe4d6))
 * **config:** update application.yml for staging and production environments ([6113432](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/6113432a14c476a3a0dfc0d449e17d023697f2ba))
 * configure logging and add OpenTelemetry support ([#49](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/49)) ([d57c488](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d57c4886612d1d92da0e1b79209fc83e6ef537a1))
 * **docker:** add .dockerignore and .gitignore files for build exclusions ([c987d8e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c987d8e258c0e6c4cfbdaa8381c64c410d7a2b83))
 * **docker:** add Dockerfiles for building Quarkus application in native and JVM modes ([3f2d2bb](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f2d2bb4c97fa8cddba66e1da4427c54236dfeed))
 * **docker:** add Dockerfiles for Quarkus application in JVM and native modes ([34b9933](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/34b993304670cf2aa62cd2f6460cee7b9864b08e))
 * **events:** migrate game-creation and bot flows to Redis Streams NCS-89 ([#62](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/62)) ([a24924c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a24924c23057db3d700a75dbc4333557789cd991))
 * **ncs-110:** feed NNUE root-move scores into search move ordering ([#83](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/83)) ([e4fee85](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/e4fee8513430093d46957970618935e99591519f))
 * NCS-78 Add Traceability to the Applications ([#46](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/46)) ([649566e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/649566eb3fcf38f91c8896a739f74ea318af312d))
 * NCS-78 Add Traceability to the Applications ([#47](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/47)) ([87dfc6c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/87dfc6c2bcce7f7d58fc641bd8d468a2e584c108))
 * NCS-82 add Swiss-system tournament module ([#55](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/55)) ([c5661de](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c5661de4a0ebf4b33211f5a391840dcf744656b7))
 * **official-bots:** activate opening book in expert bot (native-safe) ([260db25](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/260db25803ec55ce99e55782791eabdc190dfed4))
 * **official-bots:** add Google Colab notebook for NNUE training (NCS-111) ([#81](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/81)) ([fa10852](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/fa10852bc98451d4068ec6fb9e7a486b5e53ef5c))
 * **official-bots:** consume GameOver stream for bot cleanup ([#67](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/67)) ([db9d153](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/db9d1533912f4b41c4d1ca80ccffdde5d23d6ff6))
 * **official-bots:** implement king-relative (HalfKP) encoding in NNUE (NCS-109) ([#80](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/80)) ([44f376f](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/44f376f03221f086b898741436e13c93fd314dd1))
 * **official-bots:** make HybridBot veto actionable and use it for expert ([1df29cf](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1df29cf3a6e21af3f396b2b7a6da67d978f941ae))
 * **official-bots:** park expert bot on tournament server at startup ([#75](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/75)) ([30295a4](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/30295a4bb95855ee8261c92278bb9ebc80ee12ee))
 * **official-bots:** resolve tournament bot token from Redis and account service ([386ddc5](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/386ddc5c19f8f893b16c6422aa5393b54c872e45))
 * **official-bots:** standalone self-play + one-shot dataset builder for NNUE training ([1c80abd](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1c80abdb8a45814d642d43c633cde81ce7374c4f))
 * **tournament:** auto-join external tournaments and publish created ones ([#77](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/77)) ([9978b7e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/9978b7ea78eb658a225a461b9cd339386c0c14f3))
 * **tournament:** federate tournaments across clusters with DB replication ([5b000a6](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5b000a6e5f04ea6770d1c7ab6bfdaded77a99172))
 * **tournament:** seed external server registry from env var on startup ([845dc9c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/845dc9c2935c8bc1be42541dfaf31c9a861d3272))
 * true-microservices ([#40](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/40)) ([5909242](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/590924254e8a2754de661a57a03e43f89ceb6299))
 ### Bug Fixes
 * enable official bots to connect to external tournament server ([#71](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/71)) ([688d30e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/688d30e2b10026923372be5fca3c63eaaee2de2a))
 * modified training pipeline ([9f9140c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/9f9140cb585345cd244a1dfee1a06e51a5f7f7a8))
 * **official-bots:** configure JWT verification ([#72](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/72)) ([98c64fc](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/98c64fc0d56dc542beb31c75f4b9056d91de03cd))
 * **official-bots:** correct parkOn path from /api/bots to /api/account/bots ([1be9949](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1be9949c0b5c6a1db535696620d77735050d6c93))
 * **official-bots:** derive tournament game color from game endpoint ([#79](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/79)) ([bfc4672](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/bfc46723e615bb9b65f7f9bba5f53877c4f079a7))
 * **official-bots:** discover tournament games by polling, not just the stream ([10113fd](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/10113fd0579b614d15870798d933bc9c495d2049))
 * **official-bots:** make botToken optional, fall back to env, fix 502 status ([f43d193](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/f43d1930d80670d810c57b54eaa3789854fa082c))
 * **official-bots:** NCS-70-auto-register official bots with account service ([#59](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/59)) ([7117a93](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/7117a93376272094d0b1a6abf2121254ce396684))
 * **official-bots:** park on external tournament servers using correct endpoint and token ([3188241](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/31882417377468b41bbe3ff94506aa4928024450))
 * **official-bots:** play games by polling state instead of NDJSON stream ([bfb15c7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/bfb15c7299bd471d5e064a577ed10af98e2ea90a))
 * **official-bots:** play only own tournament games with correct color ([4651bb7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4651bb796f07a21bd013d9521b2dfe2e1078cebb))
 * **official-bots:** prioritize Redis token over stale env var in joinTournament ([83dd2d4](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/83dd2d4335ca48eb3e5aa234a75367574276ba63))
 * **official-bots:** register with tournament server directly to get correct token ([64b5d55](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/64b5d5567f110c2fe152558c7de275a1e0b30e21))
 * **official-bots:** resolve per-difficulty bot token on tournament join ([fdf4c94](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/fdf4c94811d086996447bb4657fac1d9bd6e5a93))
 * **official-bots:** resume tournaments already joined after restart ([285b73e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/285b73efbd6dd98cec410ade9eead9881d693a8f))
 * **official-bots:** sync bots before token fetch on first startup after DB wipe ([b0ddb27](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b0ddb274d23bca8b1b3f691ce0d643f33e0b54cd))
 * **official-bots:** use ThreadLocalRandom in PolyglotBook for native image ([1b30c3b](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1b30c3be393d25712c8743d3d9057207f8bbb67c))
 ### Reverts
 * Revert "refactor: update metrics paths formatting in application.yml for clarity" ([3870566](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/38705663498d5f47c40dafe2f26198589ede8656))
@@ -47,6 +47,14 @@ tasks.withType<JavaCompile> {
    options.compilerArgs.add("-parameters")
 }
 tasks.register<JavaExec>("selfPlay") {
    group = "nnue"
    description = "Run standalone NNUEBot self-play and write FENs for labeling."
    mainClass.set("de.nowchess.bot.selfplay.SelfPlayMain")
    classpath = sourceSets["main"].runtimeClasspath
    args((project.findProperty("spArgs")?.toString() ?: "").split(" ").filter { it.isNotBlank() })
 }
 dependencies {
    compileOnly("org.scala-lang:scala3-compiler_3") {
@@ -0,0 +1,212 @@
 # Concept: NNUE Training Data — Quality, Scale, and Transfer to Colab
 Local generation + labeling is **not** a constraint (Ryzen 9800X3D / RTX 5070 / 32 GB).
 So the design splits cleanly:
 - **Data plane = local box.** Generate, label, shard, publish. Cheap, fast, no limits.
 - **Train plane = Colab.** Pull a dataset version, GPU-train, export `.nbai`.
 Colab never runs Stockfish and never sees a browser upload. Three problems below:
 **(1) good data, (2) growing it over time, (3) getting it there easily** — (3) is the priority.
 ---
 ## 1. Generating *good* training sets
 ### The current weak spot
 `generate.py` plays **fully random games** (`random.choice(legal_moves)`). Random play
 produces positions that never occur in real games — material chaos, nonsense pawn
 structures. An NNUE trained on that learns to evaluate a distribution it will never
 face. Fine as filler, wrong as the backbone.
 ### What a good NNUE dataset needs
 1. **Realistic position distribution.** Positions should resemble what the bot actually
   reaches in search — from real games and engine play, not coin-flip moves.
 2. **Phase coverage.** Openings, middlegames, endgames all represented. Endgames are
   under-sampled by random play and matter most for precise eval.
 3. **Eval balance.** Real game data is dominated by near-equal positions. If 80% of
   labels sit in `[-0.5, +0.5]`, the net learns "everything is roughly equal." Resample
   to flatten the eval histogram (cap per-bucket counts).
 4. **Accurate labels.** Deeper Stockfish = better target. Locally you can afford
   depth 16–20. Or skip labeling entirely with the Lichess eval DB (below).
 5. **Clean positions.** Dedup by FEN; drop terminal/checkmate/stalemate; the side to
   move should not already be in check unless intended; tag the game phase.
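Points 3 and 5 can be sketched together. The snippet below is a minimal illustration, not the pipeline's actual code: it assumes each record is a dict with hypothetical `fen` and `eval` keys (eval in pawns), and the bucket width, clamp, and cap are placeholder values.

```python
import random
from collections import defaultdict


def dedup_by_fen(records):
    """Keep the first record seen for each FEN (assumed key: 'fen')."""
    seen = set()
    unique = []
    for rec in records:
        if rec["fen"] not in seen:
            seen.add(rec["fen"])
            unique.append(rec)
    return unique


def balance_by_eval(records, bucket_width=0.5, clamp=10.0, cap=1000, seed=0):
    """Cap the number of records per eval bucket to flatten the histogram."""
    rng = random.Random(seed)
    buckets = defaultdict(list)
    for rec in records:
        ev = max(-clamp, min(clamp, rec["eval"]))  # clamp extreme or mate scores
        buckets[int(ev // bucket_width)].append(rec)
    balanced = []
    for key in sorted(buckets):
        bucket = buckets[key]
        if len(bucket) > cap:
            bucket = rng.sample(bucket, cap)       # downsample crowded buckets
        balanced.extend(bucket)
    return balanced
```

Capping per bucket only removes positions from crowded bins; sparse bins (large advantages) are kept whole, which is what flattens the histogram.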
 ### Recommended source mix (per dataset version)
 | Source | Role | How | Weight |
 |---|---|---|---|
 | **Lichess eval DB** | Backbone | `lichess_importer.py` — millions of FENs **pre-labeled** by deep Stockfish, real human positions, correct sign convention | 50–70% |
 | **Engine self-play** | Bot's own distribution | NNUEBot (or vs Stockfish) plays games; sample positions; label with local Stockfish | 20–40% |
 | **Tactical puzzles** | Sharp/critical positions | `tactical_positions_extractor.py` (Lichess puzzle DB) | 5–15% |
 | **Random play** | Cheap diversity filler | existing `generate.py`, capped low | ≤10% |
 The backbone is real, pre-labeled data — so labeling cost is near zero and quality is
 high. Self-play is the part that adapts data to *your* bot. Random play stays only as
 a thin diversity sprinkle.
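To turn the weight column into concrete per-source position counts, something like the following could be used. This is a sketch under assumptions: the function name and the example weights (picked from inside the ranges in the table) are illustrative, not values fixed by this design.

```python
def source_quotas(total, weights):
    """Split `total` positions across sources in proportion to `weights`."""
    norm = sum(weights.values())
    quotas = {src: int(total * w / norm) for src, w in weights.items()}
    # Hand any rounding remainder to the largest-weight source.
    largest = max(weights, key=weights.get)
    quotas[largest] += total - sum(quotas.values())
    return quotas


# Example mix (assumed values within the table's ranges).
mix = {"lichess_eval": 0.60, "selfplay": 0.25, "tactical": 0.10, "random": 0.05}
print(source_quotas(1_000_000, mix))
```

The weights do not need to sum to 1; they are normalized first, so `{"lichess_eval": 6, "selfplay": 3, "tactical": 1}` works equally well.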
 ### Self-play flywheel (the quality engine over time)
 The strongest lever: **net N generates the games that train net N+1.**
 ```
 net_vN  ──play self-play games──►  sample positions  ──label (Stockfish)──►
   ▲                                                                        │
   └──────────────── train on (backbone + new self-play) ◄─────────────────┘
                                  net_v(N+1)
 ```
 Each generation, the bot reaches positions closer to its real playing distribution,
 labels them with a stronger-than-bot oracle (Stockfish), and learns the gap. Standard
 modern NNUE practice. Keep the Lichess backbone mixed in every round so the net does
 not overfit to its own blind spots.
 ---
 ## 2. Scaling datasets over time — append-only shards
 Do **not** maintain one growing `labeled.jsonl` and re-copy it. Make a dataset an
 **immutable set of shards plus a manifest**:
 ```
 datasets/
  shards/
    lichess_000001.jsonl.zst      # ~50–100k positions each, ~5–10 MB compressed
    lichess_000002.jsonl.zst
    selfplay_v7_000001.jsonl.zst
    tactical_000001.jsonl.zst
    ...
  manifest.json
 ```
 `manifest.json`:
 ```json
 {
  "dataset_version": 7,
  "created": "2026-06-24T...",
  "total_positions": 4200000,
  "scale": 300.0,
  "shards": [
    {"file": "lichess_000001.jsonl.zst", "positions": 100000,
     "sha256": "...", "source": "lichess_eval", "stockfish_depth": 0},
    {"file": "selfplay_v7_000001.jsonl.zst", "positions": 80000,
     "sha256": "...", "source": "selfplay", "net": "v7", "stockfish_depth": 18}
  ]
 }
 ```
 Properties this buys:
 - **Growth = add shards.** Generate a new batch, label it, write one new shard, append
  one manifest entry. Never touch existing shards. O(new data), not O(total).
 - **Provenance.** Each shard records source + net + depth. You can later down-weight or
  drop a bad batch by editing the manifest, no relabeling.
 - **Dedup across shards** by FEN hash at build time; record dropped counts in metadata.
 - **Reproducible mixes.** A "dataset version" is just a manifest selecting shards +
  per-source sampling weights. Cheap to define many mixes over the same shard pool.
 - **Resumable, cache-friendly transfer** (next section) — the whole reason for shards.
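The "growth = add shards" property could look roughly like this in code. It is a minimal sketch that follows the manifest layout shown above; the helper names `sha256_of` and `append_shard` are hypothetical, and the shard file is assumed to be already written under `shards/`.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large shards never load fully into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def append_shard(dataset_dir, shard_name, positions, source, **provenance):
    """Register an already-written shard in manifest.json without touching old entries."""
    dataset_dir = Path(dataset_dir)
    manifest_path = dataset_dir / "manifest.json"
    if manifest_path.exists():
        manifest = json.loads(manifest_path.read_text())
    else:
        manifest = {"total_positions": 0, "shards": []}
    if any(s["file"] == shard_name for s in manifest["shards"]):
        raise ValueError(f"shard already registered: {shard_name}")
    entry = {
        "file": shard_name,
        "positions": positions,
        "sha256": sha256_of(dataset_dir / "shards" / shard_name),
        "source": source,
        **provenance,  # e.g. net="v7", stockfish_depth=18
    }
    manifest["shards"].append(entry)
    manifest["total_positions"] = manifest.get("total_positions", 0) + positions
    # Write to a temp file, then rename, so a crash cannot leave a half-written manifest.
    tmp = manifest_path.with_suffix(".json.tmp")
    tmp.write_text(json.dumps(manifest, indent=2))
    tmp.replace(manifest_path)
    return entry
```

Refusing to register a shard name twice is one way to enforce the append-only rule at the manifest level.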
 `dataset.py`'s existing `ds_vN` + `metadata.json` scheme generalizes to this directly:
 the dataset dir holds `shards/` + `manifest.json` instead of one `labeled.jsonl`.
 ---
 ## 3. Getting data to Colab easily  ← top priority
 Shards make this trivial: **incremental sync, never a full re-upload.**
 ### Recommended: rclone → Google Drive, read from mounted Drive
 Colab mounts Drive natively, so the cheapest path is to make Drive the shard store and
 sync into it with `rclone` (only uploads new/changed shards):
 ```bash
 # Local, after building shards:
 rclone copy datasets/ gdrive:NowChess/datasets --progress
 #   ^ uploads only shards Drive doesn't have yet. Adding 80k positions = one small file.
 ```
 Colab side, one cell:
 ```python
 import json
 import shutil
 from pathlib import Path
 SRC = Path('/content/drive/MyDrive/NowChess/datasets')   # mounted Drive, no download
 LOCAL = Path('/content/datasets')                        # fast ephemeral disk
 LOCAL.mkdir(parents=True, exist_ok=True)
 with open(SRC / 'manifest.json') as fh:                  # close the handle after reading
    manifest = json.load(fh)
 for sh in manifest['shards']:                            # copy Drive -> local SSD (fast sequential read)
    dst = LOCAL / sh['file']
    if not dst.exists():                                 # cache: only copy missing shards
        shutil.copy(SRC / 'shards' / sh['file'], dst)
 ```
 Why this wins on "easy":
 - **No browser upload, ever.** One `rclone copy` from your PC.
 - **Incremental both directions.** Add a shard locally → next `rclone copy` ships only
  that shard. Colab copies only shards it doesn't already have on `/content`.
 - **Zero new infra.** Drive is already mounted in the notebook.
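Because the manifest already records a `sha256` per shard, the sync step could also detect truncated or corrupt copies. The following is a sketch only; `sync_shards` and `file_sha256` are hypothetical helpers, and `src_dir` is assumed to be the `shards/` folder on the mounted Drive.

```python
import hashlib
import shutil
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sync_shards(manifest, src_dir, dst_dir):
    """Copy missing or corrupt shards from src_dir to dst_dir; return the names copied."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for shard in manifest["shards"]:
        dst = dst_dir / shard["file"]
        if dst.exists() and file_sha256(dst) == shard["sha256"]:
            continue  # cached copy is intact
        shutil.copy(src_dir / shard["file"], dst)
        if file_sha256(dst) != shard["sha256"]:
            raise IOError(f"checksum mismatch after copy: {shard['file']}")
        copied.append(shard["file"])
    return copied
```

Hashing every cached shard on each sync costs read time; if that becomes noticeable, checking only existence (as in the cell above) and verifying once after copy is a reasonable trade-off.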
 ### Alternative: Gitea release per dataset version (if Drive quota hurts)
 You self-host `git.janis-eccarius.de`. Tag `ds_v7`, attach shards + `manifest.json` as
 release assets. Colab reads the manifest, then parallel-`wget` only the shards it lacks
 (checksum-verified). Versioned, immutable, no Drive quota, token-gated. Slightly more
 wiring than rclone→Drive.
 Pick rclone→Drive for minimum friction; Gitea releases if you want hard versioning and
 to keep Drive small.
 ### Notebook changes either way
 - Clone repo to **ephemeral `/content`** (fast), not Drive. Persist only datasets +
  checkpoints.
 - Drop Option A (Colab-side generation) and Option B (browser upload). Use one "sync
  dataset version" cell instead.
 - Train reads shards via a streaming `.jsonl.zst` loader (apply per-source sampling
  weights + eval-bucket balancing here). Keep burst-train + Drive checkpoints + `.nbai`
  export.
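The per-source sampling part of the streaming loader might be sketched as below. It assumes each source has already been turned into an iterator of records (decompressing `.jsonl.zst`, for example with the `zstandard` package, is not shown), and `weighted_stream` is a hypothetical name.

```python
import random


def weighted_stream(sources, weights, seed=0):
    """Interleave per-source iterators, choosing each sample's source by weight.

    A source is dropped once it is exhausted; the stream ends when all are.
    Weights of the remaining sources must be positive.
    """
    rng = random.Random(seed)
    active = {name: iter(it) for name, it in sources.items()}
    while active:
        names = list(active)
        name = rng.choices(names, weights=[weights[n] for n in names], k=1)[0]
        try:
            yield name, next(active[name])
        except StopIteration:
            del active[name]  # exhausted: stop drawing from this source


# Usage with toy iterators standing in for decoded shards:
for src, rec in weighted_stream({"lichess": range(3), "selfplay": range(2)},
                                {"lichess": 0.7, "selfplay": 0.3}):
    print(src, rec)
```

Order within each source is preserved; only the interleaving is random, so the loader never needs to hold more than one record per source in memory.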
 ---
 ## Resulting workflow
 ```
 LOCAL (9800X3D / RTX5070)                         COLAB (GPU)
 ─────────────────────────                         ───────────
 import Lichess eval DB ─┐
 self-play with net_vN  ─┼─► label ─► dedup ─► write new shard(s) ─► manifest++
 tactical / random      ─┘                                  │
                                       rclone copy ────────┘
                                       datasets/ → Drive
                                                              │  (only new shards move)
                                                              ▼
                                    sync version → copy missing shards → train (GPU)
                                                              │
                                                       export .nbai
                                                              ▼
                              place in src/main/resources/, rebuild native image
 ```
 ## Build order
 1. **Shard format + manifest** in `dataset.py`: write/read `shards/*.jsonl.zst` +
   `manifest.json`; dedup-across-shards on build; provenance per shard.
 2. **Streaming `.zst` dataloader** in `train.py`: read shards, apply per-source weights
   and eval-bucket balancing.
 3. **Self-play generator** in `src/`: NNUEBot/Stockfish self-play → positions → local
   Stockfish label → new shard. This is the scaling engine.
 4. **`dataset_sync.py`**: `push` (rclone→Drive or Gitea upload) / `pull` (cache-aware).
 5. **Notebook rewrite**: ephemeral clone, single sync cell, weighted streaming loader.
 6. Wire `lichess_importer.py` as the backbone shard source.
 ## Open decisions
 - **Transfer backend** — rclone→Drive (easiest, recommended) vs Gitea releases (hard
  versioning).
 - **Self-play opponent** — NNUEBot against itself (own distribution) or against Stockfish
  (stronger opponent, more decisive games). Likely a mix.
 - **Backbone/self-play ratio** — start ~60/30/10 (lichess/selfplay/tactical), tune by
  measured strength.
 - **Shard size** — 50k vs 100k positions/shard (transfer granularity vs file count).
@@ -0,0 +1,180 @@
 # Implementation Plan: Two One-Liner Tools (self-play + dataset)
 Goal: **two tools, two start scripts, minimal params.**
 ```
 ./selfplay.sh      # bot plays games against itself, writes selfplay FENs       (Scala, standalone)
 ./dataset.sh       # builds the ENTIRE training dataset + rclone push to Drive   (Python, one script)
 ```
 Both default-everything. Optional first positional arg only when you want to override
 the one number that matters.
 ---
 ## Tool 1 — `selfplay.sh` (standalone bot, no microservices)
 ### Why it can be standalone
 `Bot` is just `GameContext => Option[Move]` (`Bot.scala`). `NNUEBot.apply` needs only
 `DefaultRules` (rule module) + `EvaluationNNUE` (loads the bundled `.nbai`). No Quarkus,
 no coordinator/account/ws. The bot module already depends on `api, rule, io`, and `io`
 has `FenExporter` + `GameContext.initial` exists. So a plain JVM `main` can run games
 with zero service wiring.
 ### New file: `SelfPlayMain.scala`
 `modules/official-bots/src/main/scala/de/nowchess/bot/selfplay/SelfPlayMain.scala`
 Loop per game:
 1. Start from `GameContext.initial`.
 2. **Opening diversity** — play `R` random legal plies (default 8). Without this,
   NNUEBot vs itself is deterministic → the *same game every time*. Random openings are
   what make the games diverse. (Optional later: seed from polyglot book instead.)
 3. Then both sides = `NNUEBot(difficulty)`. Apply moves via `DefaultRules.applyMove`.
 4. Stop on `isCheckmate / isStalemate / isInsufficientMaterial / isFiftyMoveRule /
   isThreefoldRepetition`, or ply cap (default 200).
 5. Emit one **FEN per ply** (via `FenExporter`), skipping positions where side-to-move
   is in check and terminal positions — same filter philosophy the labeler wants.
 6. Append FENs to the output file (one per line) — exactly the format `label.py` reads.
 Config = a small `case class` with defaults; read from env/args. Defaults:
 `games=2000`, `randomOpeningPlies=8`, `maxPlies=200`, `out=python/data/selfplay.txt`,
 `threads = availableProcessors`. Parallelize games across threads (each game is
 independent; bot is pure).
 Output is **FENs only** — labeling happens in Tool 2 with Stockfish. Keeps the bot tool
 single-responsibility and fast.
 ### Gradle: a plain run task (not Quarkus)
 Add to `modules/official-bots/build.gradle.kts`:
 ```kotlin
 tasks.register<JavaExec>("selfPlay") {
    group = "nnue"
    mainClass.set("de.nowchess.bot.selfplay.SelfPlayMain")
    classpath = sourceSets["main"].runtimeClasspath
    args((project.findProperty("spArgs")?.toString() ?: "").split(" ").filter { it.isNotBlank() })
 }
 ```
 ### `selfplay.sh` (repo `python/` dir)
 ```bash
 #!/usr/bin/env bash
 set -euo pipefail
 GAMES="${1:-2000}"
 cd "$(dirname "$0")/../../.."        # repo root
 ./gradlew -q :official-bots:selfPlay -PspArgs="--games $GAMES --out modules/official-bots/python/data/selfplay.txt"
 echo "Self-play FENs -> modules/official-bots/python/data/selfplay.txt"
 ```
 Usage:
 ```bash
 ./selfplay.sh          # 2000 games, bundled net
 ./selfplay.sh 8000     # more games
 ```
 ---
 ## Tool 2 — `dataset.sh` → `build_dataset.py` (builds EVERYTHING)
 One Python script that produces a complete, sharded, pushed dataset. No TUI, no
 multi-step menus. It runs the whole data plane end to end:
 ```
 lichess eval DB ─┐
 selfplay.txt    ─┼─► label (local Stockfish, skip already-labeled) ─► dedup ─►
 tactical        ─┤                                                   eval-bucket
 random filler   ─┘                                                   balance ─►
                                              write shards/*.jsonl.zst + manifest.json ─► rclone push
 ```
 ### New file: `build_dataset.py` (top-level `python/`)
 Reuses existing modules — orchestrates, doesn't reinvent:
 - **Backbone:** `lichess_importer.py` — download + sample N pre-labeled positions from
  the Lichess eval DB (no Stockfish cost).
 - **Self-play:** read `data/selfplay.txt` FENs → `label.py` with local Stockfish
  (depth 18, all cores — your box eats this).
 - **Tactical:** `tactical_positions_extractor.py` → `label.py`.
 - **Random filler:** `generate.py` (small cap) → `label.py`.
 - **Merge:** dedup by FEN across all sources; **eval-bucket balancing** (cap positions
  per eval bin so near-equal positions don't dominate).
 - **Shard + manifest:** split into `shards/*.jsonl.zst` (~100k positions each) + write
  `manifest.json` (positions, sha256, source, net, depth per shard). Append-only:
  existing shards untouched, new run adds shards + entries (the scaling story from the
  concept).
 - **Push:** `rclone copy datasets/ gdrive:NowChess/datasets` — ships only new shards.
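The "shard + manifest" step splits the merged records into fixed-size files. A minimal sketch of the chunking, assuming records are JSON-serializable dicts; `iter_shards` is a hypothetical helper, and compression is deliberately left out (the caller would compress each payload and add the `.zst` suffix).

```python
import json
from itertools import islice


def iter_shards(records, source, shard_size=100_000, start_index=1):
    """Yield (file_name, jsonl_bytes) pairs holding at most shard_size records each."""
    it = iter(records)
    index = start_index
    while True:
        chunk = list(islice(it, shard_size))
        if not chunk:
            return
        payload = "".join(json.dumps(rec) + "\n" for rec in chunk).encode("utf-8")
        yield f"{source}_{index:06d}.jsonl", payload  # uncompressed JSON Lines
        index += 1
```

For append-only growth, `start_index` would be set to one past the highest existing shard number for that source, so a new run never reuses an old file name.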
 ### One config block, sane defaults
 Top of the script — the *only* thing you ever touch:
 ```python
 LICHESS_POSITIONS = 2_000_000   # backbone
 USE_SELFPLAY      = True        # reads data/selfplay.txt if present
 TACTICAL_PUZZLES  = 200_000
 RANDOM_FILLER     = 100_000
 STOCKFISH_DEPTH   = 18
 RCLONE_REMOTE     = "gdrive:NowChess/datasets"
 ```
 Everything else (paths, workers=all cores, shard size, balancing bins) is internal.
 ### `dataset.sh`
 ```bash
 #!/usr/bin/env bash
 set -euo pipefail
 cd "$(dirname "$0")"
 python build_dataset.py "$@"
 ```
 Usage:
 ```bash
 ./dataset.sh          # full dataset (lichess + selfplay + tactical + filler) -> Drive
 ```
 That single command: downloads backbone, labels self-play/tactical/filler, dedups,
 balances, shards, and rclone-pushes to Drive. Colab then syncs (concept doc §3).
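The final push step inside `build_dataset.py` could be a thin wrapper around the `rclone copy` command shown earlier. This is a sketch: the function names and the `dry_run` switch are assumptions, and the remote name `gdrive` still has to match the locally configured rclone remote.

```python
import shlex
import subprocess


def rclone_push_cmd(local_dir="datasets/", remote="gdrive:NowChess/datasets"):
    """Build the rclone invocation; `copy` skips files the remote already has."""
    return ["rclone", "copy", local_dir, remote, "--progress"]


def push(local_dir="datasets/", remote="gdrive:NowChess/datasets", dry_run=False):
    """Run the push, or only print the command when dry_run is set."""
    cmd = rclone_push_cmd(local_dir, remote)
    if dry_run:
        print(shlex.join(cmd))
        return 0
    # check=True raises CalledProcessError if rclone exits non-zero.
    return subprocess.run(cmd, check=True).returncode
```

Passing the command as a list (rather than a shell string) avoids quoting problems if the dataset path ever contains spaces.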
 ---
 ## End-to-end loop (the flywheel)
 ```
 ./selfplay.sh        # bot generates games with the current net
 ./dataset.sh         # fold them into a new dataset version, push to Drive
 # (Colab) sync + train -> export nnue_weights.nbai
 # drop .nbai into modules/official-bots/src/main/resources/, rebuild
 ./selfplay.sh        # next net plays stronger, better games... repeat
 ```
 ---
 ## Build order
 1. `SelfPlayMain.scala` — standalone game loop, random openings, parallel games, FEN out.
 2. `selfPlay` Gradle `JavaExec` task + `selfplay.sh`.
 3. `build_dataset.py` — orchestrate existing importer/label/tactical/generate into
   shards + manifest; rclone push.
 4. `dataset.sh`.
 5. Shard/manifest read support in `dataset.py` + zstd streaming loader in `train.py`
   (consumed on Colab).
 6. Notebook: single "sync dataset version" cell, ephemeral `/content` clone.
 ## Decisions to confirm
 - **Self-play opponent:** NNUEBot vs itself + random openings (planned). Add vs-Stockfish
  later if more decisive games wanted.
 - **Self-play net source:** use the `.nbai` bundled in `resources` (simplest), or accept
  a `--weights path`? Plan = bundled by default.
 - **rclone remote name:** confirm `gdrive` is your configured rclone remote, and the
  target folder `NowChess/datasets`.
 - **Stockfish path on your box:** `$STOCKFISH_PATH` or `/usr/games/stockfish`?
@@ -0,0 +1,190 @@
 {
 "nbformat": 4,
 "nbformat_minor": 5,
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.10.0"
  },
  "colab": {
   "provenance": [],
   "gpuType": "T4"
  },
  "accelerator": "GPU"
 },
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": "# NNUE Training Pipeline\n\nGPU training on Colab. Data is built **locally** (`./dataset.sh` → sharded, pushed to\nDrive via rclone); this notebook only **syncs shards → trains → exports `.nbai`**.\nNo generation, no Stockfish labeling, no browser uploads here.\n\n**Runtime:** GPU (T4 or better). Runtime → Change runtime type → T4 GPU.\n\n**Persistence:** Datasets and checkpoints live on Google Drive, so training resumes\nafter a session timeout. The repo is cloned to ephemeral `/content` for speed.",
   "id": "intro-md"
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "## ⚙️ 1 — Setup"
   ],
   "id": "setup-md"
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Mount Google Drive for checkpoint persistence\n",
    "from google.colab import drive\n",
    "drive.mount('/content/drive')"
   ],
   "id": "mount-drive"
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": "import os\n\n# ── Configure these paths once ───────────────────────────────────────────────\nREPO_URL   = 'https://git.janis-eccarius.de/NowChess/NowChessSystems.git'\nDRIVE_ROOT = '/content/drive/MyDrive/NowChess'   # datasets + weights persist here\nREPO_DIR   = '/content/NowChessSystems'          # ephemeral, fast local clone\nPYTHON_DIR = f'{REPO_DIR}/modules/official-bots/python'\n# ─────────────────────────────────────────────────────────────────────────────\n\nos.makedirs(DRIVE_ROOT, exist_ok=True)\n\n# Clone to ephemeral /content (NOT Drive) — fast checkout, no Drive bloat.\nif not os.path.isdir(REPO_DIR):\n    !git clone --depth=1 \"{REPO_URL}\" \"{REPO_DIR}\"\n    print('Repo cloned to /content.')\nelse:\n    !git -C \"{REPO_DIR}\" pull --ff-only\n    print('Repo updated.')",
   "id": "clone-repo"
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": "# Install Python dependencies. No Stockfish — labeling happens on the local box,\n# this notebook only trains on already-labeled shards.\n!pip install -q chess tqdm rich zstandard\n\nimport sys\nsys.path.insert(0, f'{PYTHON_DIR}/src')\nsys.path.insert(0, PYTHON_DIR)\nprint('Python path configured.')",
   "id": "install-deps"
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": "---\n## 🗄️ 2 — Data\n\nDatasets are built **locally** (`./dataset.sh`) and pushed to Drive with rclone as\ncompressed shards under `MyDrive/NowChess/datasets/`. Here we just sync those shards\nto the fast local disk — no generation, no labeling, no browser uploads.\n\nThe cell reads `manifest.json` and copies only shards not already cached on `/content`.",
   "id": "data-md"
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": "import json, shutil\nfrom pathlib import Path\n\n# Source: shards synced from the local box via `rclone copy datasets/ gdrive:NowChess/datasets`\nDRIVE_DATASETS = Path(DRIVE_ROOT) / 'datasets'\nLOCAL_DATASETS = Path('/content/datasets')\n(LOCAL_DATASETS / 'shards').mkdir(parents=True, exist_ok=True)\n\nmanifest = json.load(open(DRIVE_DATASETS / 'manifest.json'))\nprint(f\"Dataset v{manifest['dataset_version']}: \"\n      f\"{manifest['total_positions']:,} positions across {len(manifest['shards'])} shards\")\n\ncopied = 0\nfor sh in manifest['shards']:\n    dst = LOCAL_DATASETS / 'shards' / sh['file']\n    if not dst.exists():                      # cache: only copy shards we don't already have\n        shutil.copy(DRIVE_DATASETS / 'shards' / sh['file'], dst)\n        copied += 1\nshutil.copy(DRIVE_DATASETS / 'manifest.json', LOCAL_DATASETS / 'manifest.json')\n\nDATA_PATH = str(LOCAL_DATASETS)               # train_nnue / burst_train read this dir of shards directly\nprint(f\"Synced {copied} new shard(s). Dataset ready at {DATA_PATH}\")",
   "id": "data-paths"
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "## 🏋️ 3 — Train\n",
    "\n",
    "Standard training runs a fixed number of epochs.  \n",
    "**Burst mode** is better for Colab: it repeatedly restarts from the best checkpoint within a time budget, surviving session disconnects gracefully."
   ],
   "id": "train-md"
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": "from train import train_nnue, burst_train, DEFAULT_HIDDEN_SIZES\n\nWEIGHTS_DIR = Path(DRIVE_ROOT) / 'weights'\nWEIGHTS_DIR.mkdir(parents=True, exist_ok=True)\nOUTPUT_FILE = str(WEIGHTS_DIR / 'nnue_weights.pt')\n\n# ── Training hyperparameters ──────────────────────────────────────────────────\nHIDDEN_SIZES      = DEFAULT_HIDDEN_SIZES\n# fen_to_features builds a DENSE 98304-dim input, so a batch costs\n# batch_size * 98304 * 4 bytes on the host (× DataLoader prefetch). On Colab's\n# ~12 GB RAM keep this small; raise it only if you have headroom.\nBATCH_SIZE        = 4096\nEPOCHS            = 100\nEARLY_STOPPING    = 10                     # None to disable\nSUBSAMPLE_RATIO   = 1.0\n\n# Resume from latest checkpoint if one exists\ncheckpoints = sorted(WEIGHTS_DIR.glob('nnue_weights_v*.pt'))\nCHECKPOINT = str(checkpoints[-1]) if checkpoints else None\nif CHECKPOINT:\n    print(f'Resuming from checkpoint: {CHECKPOINT}')\nelse:\n    print('Starting training from scratch.')",
   "id": "train-config"
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": "# ── Standard training ─────────────────────────────────────────────────────────\n# Use this when you have a reliable long-running session.\n\ntrain_nnue(\n    data_file=DATA_PATH,\n    output_file=OUTPUT_FILE,\n    epochs=EPOCHS,\n    batch_size=BATCH_SIZE,\n    checkpoint=CHECKPOINT,\n    use_versioning=True,\n    early_stopping_patience=EARLY_STOPPING,\n    subsample_ratio=SUBSAMPLE_RATIO,\n    hidden_sizes=HIDDEN_SIZES,\n)",
   "id": "standard-train"
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": "# ── Burst training (recommended for Colab free tier) ─────────────────────────\n# Restarts from the global best each time early stopping fires.\n# Set BURST_MINUTES to slightly less than the Colab session limit (~70 min).\n\nBURST_MINUTES      = 70\nEPOCHS_PER_SEASON  = 30\nBURST_PATIENCE     = 8\n\nburst_train(\n    data_file=DATA_PATH,\n    output_file=OUTPUT_FILE,\n    duration_minutes=BURST_MINUTES,\n    epochs_per_season=EPOCHS_PER_SEASON,\n    early_stopping_patience=BURST_PATIENCE,\n    batch_size=BATCH_SIZE,\n    initial_checkpoint=CHECKPOINT,\n    use_versioning=True,\n    subsample_ratio=SUBSAMPLE_RATIO,\n    hidden_sizes=HIDDEN_SIZES,\n)",
   "id": "burst-train"
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "## 📦 4 — Export\n",
    "\n",
    "Convert the latest versioned `.pt` checkpoint to the `.nbai` binary format read by `NbaiLoader` in Scala."
   ],
   "id": "export-md"
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from export import export_to_nbai\n",
    "\n",
    "NBAI_FILE = Path(DRIVE_ROOT) / 'nnue_weights.nbai'\n",
    "\n",
    "# Pick the latest versioned checkpoint\n",
    "checkpoints = sorted(WEIGHTS_DIR.glob('nnue_weights_v*.pt'))\n",
    "if not checkpoints:\n",
    "    raise FileNotFoundError('No checkpoints found in ' + str(WEIGHTS_DIR))\n",
    "\n",
    "latest = checkpoints[-1]\n",
    "print(f'Exporting {latest.name} → {NBAI_FILE.name}')\n",
    "\n",
    "export_to_nbai(\n",
    "    weights_file=str(latest),\n",
    "    output_file=str(NBAI_FILE),\n",
    "    trained_by='colab',\n",
    ")\n",
    "print('Export complete.')"
   ],
   "id": "export-cell"
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "## ⬇️ 5 — Download\n",
    "\n",
    "Download the `.nbai` weights file and the latest `.pt` checkpoint to your local machine.\n",
    "\n",
    "Place `nnue_weights.nbai` in `modules/official-bots/src/main/resources/` and rebuild the native image."
   ],
   "id": "download-md"
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from google.colab import files\n",
    "\n",
    "if NBAI_FILE.exists():\n",
    "    files.download(str(NBAI_FILE))\n",
    "    print(f'Downloading {NBAI_FILE.name}')\n",
    "else:\n",
    "    print('No .nbai file found — run the Export cell first.')\n",
    "\n",
    "checkpoints = sorted(WEIGHTS_DIR.glob('nnue_weights_v*.pt'))\n",
    "if checkpoints:\n",
    "    latest = checkpoints[-1]\n",
    "    files.download(str(latest))\n",
    "    print(f'Downloading checkpoint {latest.name}')\n",
    "else:\n",
    "    print('No .pt checkpoint found.')"
   ],
   "id": "download-cell"
  }
 ]
 }
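The memory note in the notebook's training-config cell can be checked with a little arithmetic. The sketch below is illustrative only: `dense_batch_bytes` and `in_flight_bytes` are hypothetical helpers that do not exist in the repo, and the queued estimate assumes PyTorch's default `prefetch_factor` of 2 per worker, which this change does not set explicitly.

```python
INPUT_SIZE = 98304          # dense HalfKP input width, as defined in train.py
BYTES_PER_FLOAT32 = 4


def dense_batch_bytes(batch_size: int) -> int:
    """Host bytes needed for one dense float32 feature batch."""
    return batch_size * INPUT_SIZE * BYTES_PER_FLOAT32


def in_flight_bytes(batch_size: int, workers: int, prefetch_factor: int = 2) -> int:
    """Rough upper bound on batches queued by DataLoader workers.

    Assumes every worker holds a full prefetch queue (an assumption, not a
    measurement). With workers == 0 only the batch being built is counted.
    """
    queued = max(1, workers * prefetch_factor)
    return queued * dense_batch_bytes(batch_size)


if __name__ == "__main__":
    # Old settings (16384, 8 workers) versus the new Colab settings (4096, 2 workers).
    for bs, w in [(16384, 8), (4096, 2)]:
        print(f"batch_size={bs:>5} workers={w}: "
              f"{dense_batch_bytes(bs) / 1e9:.2f} GB per batch, "
              f"up to {in_flight_bytes(bs, w) / 1e9:.1f} GB queued")
```

One batch at 16384 comes to about 6.4 GB, matching the figure in the commit message, and about 1.6 GB at 4096. The queued figure is a worst case and ignores pinned-memory copies, so treat it as a guide rather than a measurement.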
@@ -0,0 +1,281 @@
 #!/usr/bin/env python3
 """Build the ENTIRE NNUE training dataset with one command.
 Orchestrates the existing source modules (Lichess eval DB, self-play, tactical puzzles,
 random filler), labels what needs labeling with local Stockfish, deduplicates, balances
 the eval distribution, writes append-only compressed shards + a manifest, and pushes to
 Google Drive with rclone.
    ./dataset.sh                 # build everything + push
    ./dataset.sh --no-push       # build only
    ./dataset.sh --no-lichess    # skip the (large) Lichess backbone
 Tune the CONFIG block below — that is the only thing you normally touch.
 """
 import argparse
 import hashlib
 import json
 import os
 import random
 import subprocess
 import sys
 import urllib.request
 from datetime import datetime, timezone
 from pathlib import Path
 import zstandard as zstd
 HERE = Path(__file__).resolve().parent
 sys.path.insert(0, str(HERE / "src"))
 from generate import play_random_game_and_collect_positions
 from label import label_positions_with_stockfish
 from lichess_importer import import_lichess_evals
 from tactical_positions_extractor import download_and_extract_puzzle_db, extract_tactical_only
 # ── CONFIG — the only knobs you normally touch ───────────────────────────────
 LICHESS_POSITIONS = 2_000_000      # backbone positions from the Lichess eval DB
 USE_SELFPLAY      = True           # label data/selfplay.txt if present
 TACTICAL_PUZZLES  = 200_000        # tactical positions from the Lichess puzzle DB
 RANDOM_FILLER     = 100_000        # cheap random-play positions
 STOCKFISH_DEPTH   = 14             # local labeling depth (selfplay/tactical/random)
 RCLONE_REMOTE     = "gdrive:NowChess/datasets"
 # ─────────────────────────────────────────────────────────────────────────────
 LABEL_BATCH       = 64             # positions per Stockfish task (small = smooth progress + load balance)
 SHARD_SIZE        = 100_000        # positions per shard
 BALANCE_BINS      = 64             # eval histogram bins over [-1, 1]
 BALANCE_FACTOR    = 2.0            # cap each bin at FACTOR x the uniform bin size
 LICHESS_EVAL_URL  = "https://database.lichess.org/lichess_db_eval.jsonl.zst"
 STOCKFISH_PATH = os.environ.get("STOCKFISH_PATH", "/usr/games/stockfish")
 WORKERS        = os.cpu_count() or 4
 DATA_DIR     = HERE / "data"
 WORK_DIR     = HERE / "data" / "_build"
 DATASETS_DIR = HERE / "datasets"
 SHARDS_DIR   = DATASETS_DIR / "shards"
 MANIFEST     = DATASETS_DIR / "manifest.json"
 LICHESS_DB   = HERE / "trainingdata" / "lichess_db_eval.jsonl.zst"
 def label(fens_file: Path, out: Path) -> int:
    """Label a FEN file with local Stockfish. Returns positions written."""
    if not fens_file.exists():
        return 0
    label_positions_with_stockfish(
        str(fens_file), str(out), STOCKFISH_PATH,
        batch_size=LABEL_BATCH, depth=STOCKFISH_DEPTH, num_workers=WORKERS,
    )
    return count_lines(out)
 def count_lines(path: Path) -> int:
    if not path.exists():
        return 0
    with open(path) as f:
        return sum(1 for _ in f)
 def source_lichess(out: Path) -> int:
    if not LICHESS_DB.exists():
        print(f"Downloading Lichess eval DB → {LICHESS_DB} (large, one-time)...")
        LICHESS_DB.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(LICHESS_EVAL_URL, LICHESS_DB)
    return import_lichess_evals(str(LICHESS_DB), str(out), max_positions=LICHESS_POSITIONS)
 def source_selfplay(out: Path) -> int:
    return label(DATA_DIR / "selfplay.txt", out)
 def source_tactical(out: Path) -> int:
    puzzle_csv = download_and_extract_puzzle_db(output_dir=str(HERE / "tactical_data"))
    if puzzle_csv is None:
        return 0
    fens = WORK_DIR / "tactical_fens.txt"
    extract_tactical_only(str(puzzle_csv), str(fens), max_puzzles=TACTICAL_PUZZLES)
    return label(fens, out)
 def source_random(out: Path) -> int:
    fens = WORK_DIR / "random_fens.txt"
    play_random_game_and_collect_positions(
        str(fens), total_positions=RANDOM_FILLER, num_workers=WORKERS,
    )
    return label(fens, out)
 def build_sources(args) -> dict[str, Path]:
    """Run each enabled source into its own labeled jsonl. Returns {name: path}."""
    WORK_DIR.mkdir(parents=True, exist_ok=True)
    plan = [
        ("lichess", args.lichess, source_lichess),
        ("selfplay", args.selfplay, source_selfplay),
        ("tactical", args.tactical, source_tactical),
        ("random", args.random, source_random),
    ]
    outputs: dict[str, Path] = {}
    for name, enabled, fn in plan:
        if not enabled:
            continue
        out = WORK_DIR / f"{name}_labeled.jsonl"
        out.unlink(missing_ok=True)
        print(f"\n=== Source: {name} ===")
        written = fn(out)
        print(f"{name}: {written:,} labeled positions")
        if written:
            outputs[name] = out
    return outputs
 def existing_fens() -> set[str]:
    """FENs already present in the dataset, so growth stays deduplicated."""
    seen: set[str] = set()
    if not MANIFEST.exists():
        return seen
    manifest = json.loads(MANIFEST.read_text())
    for shard in manifest.get("shards", []):
        for rec in read_shard(SHARDS_DIR / shard["file"]):
            seen.add(rec["fen"])
    return seen
 def read_shard(path: Path):
    dctx = zstd.ZstdDecompressor()
    with open(path, "rb") as fh, dctx.stream_reader(fh) as reader:
        for line in iter_text(reader):
            yield json.loads(line)
 def iter_text(reader):
    import io
    yield from io.TextIOWrapper(reader, encoding="utf-8")
 def merge_dedup(outputs: dict[str, Path], skip: set[str]):
    """Merge all source jsonl, drop dupes (within batch + vs existing dataset)."""
    seen = set(skip)
    records, per_source = [], {}
    for name, path in outputs.items():
        kept = 0
        with open(path) as f:
            for line in f:
                rec = json.loads(line)
                fen = rec["fen"]
                if fen in seen:
                    continue
                seen.add(fen)
                rec["source"] = name
                records.append(rec)
                kept += 1
        per_source[name] = kept
    return records, per_source
 def balance(records: list) -> list:
    """Flatten the eval histogram: cap each bin at FACTOR x the uniform bin size."""
    if not records:
        return records
    cap = max(1, int(BALANCE_FACTOR * len(records) / BALANCE_BINS))
    bins: dict[int, int] = {}
    kept = []
    random.shuffle(records)
    for rec in records:
        b = min(BALANCE_BINS - 1, int((rec["eval"] + 1.0) / 2.0 * BALANCE_BINS))
        if bins.get(b, 0) < cap:
            bins[b] = bins.get(b, 0) + 1
            kept.append(rec)
    return kept
 def sha256(path: Path) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()
 def write_shards(records: list, build_id: str) -> list[dict]:
    SHARDS_DIR.mkdir(parents=True, exist_ok=True)
    cctx = zstd.ZstdCompressor(level=10)
    entries = []
    for i in range(0, len(records), SHARD_SIZE):
        chunk = records[i : i + SHARD_SIZE]
        name = f"{build_id}_{i // SHARD_SIZE:05d}.jsonl.zst"
        path = SHARDS_DIR / name
        with open(path, "wb") as fh, cctx.stream_writer(fh) as w:
            for rec in chunk:
                w.write((json.dumps(rec) + "\n").encode("utf-8"))
        entries.append({"file": name, "positions": len(chunk),
                        "sha256": sha256(path), "build_id": build_id})
        print(f"  wrote {name} ({len(chunk):,} positions)")
    return entries
 def update_manifest(new_shards: list[dict], build: dict) -> None:
    manifest = json.loads(MANIFEST.read_text()) if MANIFEST.exists() else {
        "dataset_version": 0, "scale": 300.0, "builds": [], "shards": [],
    }
    manifest["dataset_version"] += 1
    manifest["created"] = build["created"]
    manifest["builds"].append(build)
    manifest["shards"].extend(new_shards)
    manifest["total_positions"] = sum(s["positions"] for s in manifest["shards"])
    MANIFEST.write_text(json.dumps(manifest, indent=2))
    print(f"\nDataset version {manifest['dataset_version']}: "
          f"{manifest['total_positions']:,} total positions across {len(manifest['shards'])} shards")
 def push() -> None:
    if not subprocess.run(["which", "rclone"], capture_output=True).stdout:
        print("rclone not found — skipping push.")
        return
    print(f"\nPushing {DATASETS_DIR} → {RCLONE_REMOTE} ...")
    subprocess.run(["rclone", "copy", str(DATASETS_DIR), RCLONE_REMOTE, "--progress"], check=True)
 def parse_args():
    p = argparse.ArgumentParser(description="Build the entire NNUE dataset.")
    for name in ("lichess", "selfplay", "tactical", "random", "push"):
        p.add_argument(f"--no-{name}", dest=name, action="store_false")
    p.add_argument("--push-only", action="store_true", help="Push the existing dataset, build nothing.")
    return p.parse_args()
 def main() -> None:
    args = parse_args()
    if args.push_only:
        push()
        return
    build_id = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    outputs = build_sources(args)
    if not outputs:
        print("No sources produced data — nothing to build.")
        return
    print("\n=== Merge / dedup / balance ===")
    records, per_source = merge_dedup(outputs, existing_fens())
    print(f"merged unique (new): {len(records):,}")
    records = balance(records)
    print(f"after balancing: {len(records):,}")
    new_shards = write_shards(records, build_id)
    update_manifest(new_shards, {
        "build_id": build_id,
        "created": datetime.now(timezone.utc).isoformat(),
        "stockfish_depth": STOCKFISH_DEPTH,
        "sources": per_source,
        "kept_after_balance": len(records),
    })
    if args.push:
        push()
    print("\nDone.")
 if __name__ == "__main__":
    main()
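The binning rule inside `balance()` is compact, so a standalone sketch of it may help. It mirrors the bin-index and cap expressions from `build_dataset.py` with the same `BALANCE_BINS` and `BALANCE_FACTOR` values; the function names `bin_index` and `bin_cap` are invented for illustration and are not part of the script.

```python
BALANCE_BINS = 64
BALANCE_FACTOR = 2.0


def bin_index(eval_norm: float, bins: int = BALANCE_BINS) -> int:
    """Map a normalized eval in [-1, 1] to a histogram bin, as balance() does.

    The upper edge (eval == 1.0) is clamped into the last bin. Values below
    -1 are not clamped by this expression and would yield negative indices.
    """
    return min(bins - 1, int((eval_norm + 1.0) / 2.0 * bins))


def bin_cap(n_records: int, bins: int = BALANCE_BINS,
            factor: float = BALANCE_FACTOR) -> int:
    """Maximum records kept per bin: factor x the uniform share, at least 1."""
    return max(1, int(factor * n_records / bins))


if __name__ == "__main__":
    for e in (-1.0, 0.0, 0.5, 1.0):
        print(f"eval {e:+.1f} -> bin {bin_index(e)}")
    print("cap for 1,000,000 records:", bin_cap(1_000_000))
```

With 1,000,000 merged records the uniform share is 15,625 per bin and the cap is twice that, so over-represented bins are trimmed while sparse bins are kept whole.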
@@ -0,0 +1,17 @@
 #!/usr/bin/env bash
 # Build the ENTIRE NNUE training dataset + push to Drive. One command.
 #
 #   ./dataset.sh                 # build everything + rclone push
 #   ./dataset.sh --no-push       # build only
 #   ./dataset.sh --no-lichess    # skip the large Lichess backbone
 set -euo pipefail
 SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
 cd "$SCRIPT_DIR"
 PY="python3"
 if [[ -x "$SCRIPT_DIR/.venv/bin/python" ]]; then
  PY="$SCRIPT_DIR/.venv/bin/python"
 fi
 exec "$PY" build_dataset.py "$@"
@@ -0,0 +1,23 @@
 #!/usr/bin/env bash
 # Standalone bot self-play -> FENs for labeling. No microservices.
 #
 #   ./selfplay.sh                  # 500 games with the bundled net
 #   ./selfplay.sh 2000             # more games
 #   ./selfplay.sh 2000 path.nbai   # play with a specific net
 set -euo pipefail
 GAMES="${1:-500}"
 WEIGHTS="${2:-}"
 SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
 REPO_ROOT="$(cd "$SCRIPT_DIR/../../.." && pwd)"
 OUT="$SCRIPT_DIR/data/selfplay.txt"
 cd "$REPO_ROOT"
 SP_ARGS="--games $GAMES --out $OUT"
 if [[ -n "$WEIGHTS" ]]; then
  SP_ARGS="$SP_ARGS --weights $WEIGHTS"
 fi
 ./gradlew -q :modules:official-bots:selfPlay -PspArgs="$SP_ARGS"
 echo "Self-play FENs -> $OUT"
@@ -13,6 +13,38 @@ import chess
 from datetime import datetime, timedelta
 import re
 import numpy as np
 import os
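 # DataLoader workers: at most 4 and never more than the machine's CPU count (Colab
 # free tier = 2). Each worker forks the dataset and prefetches dense batches, so too
 # many of them OOM-kill the runtime. Override with NNUE_LOADER_WORKERS.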
 LOADER_WORKERS = int(os.environ.get("NNUE_LOADER_WORKERS", min(4, os.cpu_count() or 2)))
 def _shard_files(data_file):
    """Resolve a data path to a list of shard files. Accepts a single .jsonl/.jsonl.zst
    file, or a directory (searched recursively for shards, e.g. a synced datasets/ dir)."""
    p = Path(data_file)
    if p.is_dir():
        shards = sorted(p.rglob("*.jsonl.zst")) or sorted(p.rglob("*.jsonl"))
        if not shards:
            raise FileNotFoundError(f"No .jsonl/.jsonl.zst shards found under {p}")
        print(f"Loading {len(shards)} shard(s) from {p}")
        return shards
    return [p]
 def _iter_dataset_lines(data_file):
    """Yield text lines from every shard, transparently decompressing .zst shards."""
    import io
    for shard in _shard_files(data_file):
        if str(shard).endswith(".zst"):
            import zstandard as zstd
            with open(shard, "rb") as fh, zstd.ZstdDecompressor().stream_reader(fh) as reader:
                yield from io.TextIOWrapper(reader, encoding="utf-8")
        else:
            with open(shard, "r") as fh:
                yield from fh
 class NNUEDataset(Dataset):
    """Dataset of chess positions with evaluations."""
@@ -23,27 +55,26 @@ class NNUEDataset(Dataset):
        self.evals_raw = []
        self.is_normalized = None
-        with open(data_file, 'r') as f:
+        for line in _iter_dataset_lines(data_file):
-            for line in f:
+            try:
-                try:
+                data = json.loads(line)
-                    data = json.loads(line)
+                fen = data['fen']
-                    fen = data['fen']
+                eval_val = data['eval']
-                    eval_val = data['eval']
+                self.positions.append(fen)
-                    self.positions.append(fen)
+                self.evals.append(eval_val)
-                    self.evals.append(eval_val)
-                    # Check if normalized or raw
+                # Check if normalized or raw
-                    if self.is_normalized is None:
+                if self.is_normalized is None:
-                        # If eval is in range [-1, 1], assume normalized
+                    # If eval is in range [-1, 1], assume normalized
-                        self.is_normalized = abs(eval_val) <= 1.0
+                    self.is_normalized = abs(eval_val) <= 1.0
-                    # Store raw if available
+                # Store raw if available
-                    if 'eval_raw' in data:
+                if 'eval_raw' in data:
-                        self.evals_raw.append(data['eval_raw'])
+                    self.evals_raw.append(data['eval_raw'])
-                    else:
+                else:
-                        self.evals_raw.append(eval_val)
+                    self.evals_raw.append(eval_val)
-                except (json.JSONDecodeError, KeyError):
+            except (json.JSONDecodeError, KeyError):
-                    pass
+                pass
    def __len__(self):
        return len(self.positions)
@@ -53,6 +84,11 @@ class NNUEDataset(Dataset):
        eval_val = self.evals[idx]
        features = fen_to_features(fen)
        # Board is flipped for Black-to-move in fen_to_features; negate eval
        # so the label still means "good for the side shown as White after flip"
        if ' b ' in fen:
            eval_val = -eval_val
        # Use evaluation as-is if normalized, otherwise apply sigmoid scaling
        if self.is_normalized:
            target = torch.tensor(eval_val, dtype=torch.float32)
@@ -61,38 +97,59 @@ class NNUEDataset(Dataset):
        return features, target
 # King-relative (HalfKP) encoding: two perspectives, one per side's king.
 # Each piece is encoded as:  kingSq * 768 + pieceIdx * 64 + sq
 # White perspective uses white king square; black perspective uses black king square.
 # Total input dimension = 2 × 64 × 12 × 64 = 98304.
 _HALF_SIZE = 64 * 12 * 64   # 49152 features per perspective
 INPUT_SIZE = _HALF_SIZE * 2  # 98304
 _PIECE_TO_IDX = {
    'p': 0, 'n': 1, 'b': 2, 'r': 3, 'q': 4, 'k': 5,
    'P': 6, 'N': 7, 'B': 8, 'R': 9, 'Q': 10, 'K': 11,
 }
 def fen_to_features(fen):
-    """Convert FEN to 768-dimensional binary feature vector."""
-    # Piece type to index: pawn=0, knight=1, bishop=2, rook=3, queen=4, king=5
-    piece_to_idx = {'p': 0, 'n': 1, 'b': 2, 'r': 3, 'q': 4, 'k': 5,
-                    'P': 6, 'N': 7, 'B': 8, 'R': 9, 'Q': 10, 'K': 11}
-    features = torch.zeros(768, dtype=torch.float32)
+    """Convert FEN to 98304-dim king-relative (HalfKP) feature vector.
+    For Black-to-move positions the board is mirrored (ranks flipped, colours
+    swapped) so the network always sees the position from the side-to-move's
+    perspective.  The caller is responsible for negating the eval label to match.
+    """
+    features = torch.zeros(INPUT_SIZE, dtype=torch.float32)
     try:
         board = chess.Board(fen)
-        # 12 piece types × 64 squares = 768
-        for square in chess.SQUARES:
-            piece = board.piece_at(square)
-            if piece is not None:
-                piece_char = piece.symbol()
-                if piece_char in piece_to_idx:
-                    piece_idx = piece_to_idx[piece_char]
-                    feature_idx = piece_idx * 64 + square
-                    features[feature_idx] = 1.0
-    except:
+        # Perspective flip: present all positions as if White is to move
+        if board.turn == chess.BLACK:
+            board = board.mirror()
+        wk = board.king(chess.WHITE)
+        bk = board.king(chess.BLACK)
+        if wk is None or bk is None:
+            return features
+        for sq in chess.SQUARES:
+            piece = board.piece_at(sq)
+            if piece is None:
+                continue
+            pidx = _PIECE_TO_IDX[piece.symbol()]
+            # White-king perspective (indices 0 .. _HALF_SIZE-1)
+            features[wk * 768 + pidx * 64 + sq] = 1.0
+            # Black-king perspective (indices _HALF_SIZE .. INPUT_SIZE-1)
+            features[_HALF_SIZE + bk * 768 + pidx * 64 + sq] = 1.0
+    except Exception:
         pass
     return features
-DEFAULT_HIDDEN_SIZES = [1536, 1024, 512, 256]
+# Smaller hidden layers are appropriate: the L1 input is very sparse (~64 active
+# features out of 98304) so the L1 itself is cheap to update incrementally; the
+# larger capacity comes from the wider perspective encoding, not deeper layers.
+DEFAULT_HIDDEN_SIZES = [512, 256, 128]
 class NNUE(nn.Module):
    """NNUE neural network with configurable hidden layers.
-    Architecture: 768 → hidden_sizes[0] → ... → hidden_sizes[-1] → 1
+    Architecture: INPUT_SIZE → hidden_sizes[0] → ... → hidden_sizes[-1] → 1
    Layer attributes follow the naming l1, l2, ..., lN so export.py can
    infer the architecture directly from the state_dict.
    """
@@ -102,7 +159,7 @@ class NNUE(nn.Module):
        if hidden_sizes is None:
            hidden_sizes = DEFAULT_HIDDEN_SIZES
        self.hidden_sizes = list(hidden_sizes)
-        sizes = [768] + self.hidden_sizes + [1]
+        sizes = [INPUT_SIZE] + self.hidden_sizes + [1]
        num_hidden = len(self.hidden_sizes)
        for i in range(num_hidden):
@@ -204,17 +261,17 @@ def _setup_training(data_file, batch_size, subsample_ratio):
        train_dataset,
        batch_size=batch_size,
        sampler=train_sampler,
-        num_workers=8,
+        num_workers=LOADER_WORKERS,
        pin_memory=True,
-        persistent_workers=True
+        persistent_workers=LOADER_WORKERS > 0
    )
    val_loader = DataLoader(
        val_dataset,
        batch_size=batch_size,
        shuffle=False,
-        num_workers=8,
+        num_workers=LOADER_WORKERS,
        pin_memory=True,
-        persistent_workers=True
+        persistent_workers=LOADER_WORKERS > 0
    )
    return device, dataset, train_dataset, val_dataset, train_loader, val_loader, num_positions
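For the king-relative encoding introduced earlier in this file, a worked index computation may help. This is a sketch using plain integers rather than `chess` or `torch`; `halfkp_indices` is a hypothetical helper, and squares are assumed to follow python-chess numbering (a1 = 0 up to h8 = 63), which is what `chess.SQUARES` yields.

```python
HALF_SIZE = 64 * 12 * 64      # 49152 features per king perspective
INPUT_SIZE = HALF_SIZE * 2    # 98304

PIECE_TO_IDX = {
    'p': 0, 'n': 1, 'b': 2, 'r': 3, 'q': 4, 'k': 5,
    'P': 6, 'N': 7, 'B': 8, 'R': 9, 'Q': 10, 'K': 11,
}


def halfkp_indices(white_king_sq: int, black_king_sq: int,
                   piece: str, sq: int) -> tuple[int, int]:
    """Return the two active feature indices for one piece on one square.

    The first index lies in [0, HALF_SIZE) and is keyed by the white king's
    square; the second lies in [HALF_SIZE, INPUT_SIZE) and is keyed by the
    black king's square.
    """
    pidx = PIECE_TO_IDX[piece]
    white_view = white_king_sq * 768 + pidx * 64 + sq
    black_view = HALF_SIZE + black_king_sq * 768 + pidx * 64 + sq
    return white_view, black_view


if __name__ == "__main__":
    # White king on e1 (4), black king on e8 (60), white pawn on e2 (12).
    print(halfkp_indices(4, 60, 'P', 12))
```

Each piece sets exactly two features, so a full 32-piece position activates 64 of the 98304 inputs, which is the sparsity the `DEFAULT_HIDDEN_SIZES` comment refers to.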
@@ -1,17 +1,20 @@
 package de.nowchess.bot
 import de.nowchess.bot.bots.{ClassicalBot, HybridBot}
+import de.nowchess.bot.util.PolyglotBook
 import jakarta.enterprise.context.ApplicationScoped
 import org.jboss.logging.Logger
 object BotController:
  private val log = Logger.getLogger(classOf[BotController])
+  private val openingBook = PolyglotBook.fromResource("/opening_book.bin")
  private val bots: Map[String, Bot] = Map(
    "easy"   -> ClassicalBot(BotDifficulty.Easy),
    "medium" -> ClassicalBot(BotDifficulty.Medium),
    "hard"   -> ClassicalBot(BotDifficulty.Hard),
-    "expert" -> HybridBot(BotDifficulty.Expert, vetoReporter = log.debug(_)),
+    "expert" -> HybridBot(BotDifficulty.Expert, vetoReporter = log.debug(_), book = Some(openingBook)),
  )
  def getBot(name: String): Option[Bot] = bots.get(name.toLowerCase)
@@ -15,6 +15,7 @@ object NNUEBot:
      difficulty: BotDifficulty,
      rules: RuleSet = DefaultRules,
      book: Option[PolyglotBook] = None,
+      fixedMoveTimeMs: Option[Long] = None,
  ): Bot =
    val search = AlphaBetaSearch(rules, weights = EvaluationNNUE)
    context =>
@@ -28,7 +29,8 @@ object NNUEBot:
          else
            val scored   = batchEvaluateRoot(rules, context, moves)
            val bestMove = scored.maxBy(_._2)._1
-            search.bestMoveWithTime(context, allocateTime(scored), blockedMoves).orElse(Some(bestMove))
+            val budget   = fixedMoveTimeMs.getOrElse(allocateTime(scored))
+            search.bestMoveWithTime(context, budget, blockedMoves, scored.toMap).orElse(Some(bestMove))
        }
  private def batchEvaluateRoot(rules: RuleSet, context: GameContext, moves: List[Move]): List[(Move, Int)] =
@@ -23,9 +23,9 @@ object EvaluationNNUE extends Evaluation:
    nnue.copyAccumulator(parentPly, childPly)
  override def pushAccumulator(childPly: Int, move: Move, parent: GameContext, child: GameContext): Unit =
-    // Use incremental updates, but recompute from scratch every 10 plies to prevent accumulation errors
+    // Recompute every 10 plies to prevent floating-point drift; king moves always recompute internally
    if childPly % 10 == 0 then nnue.recomputeAccumulator(childPly, child.board)
-    else nnue.pushAccumulator(childPly, move, parent.board)
+    else nnue.pushAccumulator(childPly, move, parent.board, child.board)
  override def evaluateAccumulator(ply: Int, context: GameContext, hash: Long): Int =
    nnue.evaluateAtPlyWithValidation(ply, context.turn, hash, context.board)
@@ -1,17 +1,17 @@
 package de.nowchess.bot.bots.nnue
-import de.nowchess.api.board.{Board, Color, File, Piece, PieceType, Square}
+import de.nowchess.api.board.{Board, Color, Piece, PieceType, Square}
 import de.nowchess.api.game.GameContext
 import de.nowchess.api.move.{Move, MoveType, PromotionPiece}
 class NNUE(model: NbaiModel):
-  private val featureSize   = model.layers(0).inputSize
+  private val HALF_SIZE     = 49152                     // 64 king-squares × 12 piece-types × 64 piece-squares
+  private val featureSize   = model.layers(0).inputSize // 98304 (= HALF_SIZE * 2) for king-relative
  private val accSize       = model.layers(0).outputSize
-  private val validateAccum = sys.env.contains("NNUE_VALIDATE") // Enable with NNUE_VALIDATE=1
+  private val validateAccum = sys.env.contains("NNUE_VALIDATE")
-  // Column-major L1 weights for cache-friendly sparse & incremental updates.
+  // Column-major L1 weights: l1WeightsT(featureIdx * accSize + outputIdx)
  // l1WeightsT(featureIdx * accSize + outputIdx) = l1Weights(outputIdx * featureSize + featureIdx)
  private val l1WeightsT: Array[Float] =
    val w = model.weights(0).weights
    val t = new Array[Float](featureSize * accSize)
@@ -23,7 +23,6 @@ class NNUE(model: NbaiModel):
  private val MAX_PLY                      = 128
  private val l1Stack: Array[Array[Float]] = Array.fill(MAX_PLY + 1)(new Array[Float](accSize))
  // Shared evaluation buffers: index i holds the output of layers(i) (all except the scalar output layer).
  private val evalBuffers: Array[Array[Float]] = model.layers.init.map(l => new Array[Float](l.outputSize))
  // ── Eval cache ───────────────────────────────────────────────────────────
@@ -36,9 +35,29 @@ class NNUE(model: NbaiModel):
  private def squareNum(sq: Square): Int = sq.rank.ordinal * 8 + sq.file.ordinal
-  private def featureIndex(piece: Piece, sqNum: Int): Int =
+  // Mirror square vertically (rank 0 ↔ rank 7) for the perspective flip
-    val colorOffset = if piece.color == Color.White then 6 else 0
+  private def flipSqNum(sqNum: Int): Int = (7 - sqNum / 8) * 8 + sqNum % 8
-    (colorOffset + piece.pieceType.ordinal) * 64 + sqNum
+
  private def pieceIdx(piece: Piece): Int =
    if piece.color == Color.White then 6 + piece.pieceType.ordinal else piece.pieceType.ordinal
  // White-king perspective: index in [0, HALF_SIZE)
  private def featureIdxWhite(piece: Piece, sqNum: Int, wkSq: Int): Int =
    wkSq * 768 + pieceIdx(piece) * 64 + sqNum
  // Black-king perspective: index in [HALF_SIZE, featureSize)
  private def featureIdxBlack(piece: Piece, sqNum: Int, bkSq: Int): Int =
    HALF_SIZE + bkSq * 768 + pieceIdx(piece) * 64 + sqNum
  private def wkSqOf(board: Board): Int =
    board.pieces
      .collectFirst { case (sq, p) if p.pieceType == PieceType.King && p.color == Color.White => squareNum(sq) }
      .getOrElse(0)
  private def bkSqOf(board: Board): Int =
    board.pieces
      .collectFirst { case (sq, p) if p.pieceType == PieceType.King && p.color == Color.Black => squareNum(sq) }
      .getOrElse(0)
  private def addColumn(l1Pre: Array[Float], featureIdx: Int): Unit =
    val offset = featureIdx * accSize
@@ -48,92 +67,96 @@ class NNUE(model: NbaiModel):
    val offset = featureIdx * accSize
    for i <- 0 until accSize do l1Pre(i) -= l1WeightsT(offset + i)
+  private def addPiece(l1: Array[Float], piece: Piece, sqNum: Int, wkSq: Int, bkSq: Int): Unit =
+    addColumn(l1, featureIdxWhite(piece, sqNum, wkSq))
+    addColumn(l1, featureIdxBlack(piece, sqNum, bkSq))
+  private def removePiece(l1: Array[Float], piece: Piece, sqNum: Int, wkSq: Int, bkSq: Int): Unit =
+    subtractColumn(l1, featureIdxWhite(piece, sqNum, wkSq))
+    subtractColumn(l1, featureIdxBlack(piece, sqNum, bkSq))
  // ── Accumulator init ─────────────────────────────────────────────────────
  def initAccumulator(board: Board): Unit =
+    val wkSq = wkSqOf(board)
+    val bkSq = bkSqOf(board)
    System.arraycopy(model.weights(0).bias, 0, l1Stack(0), 0, accSize)
-    for (sq, piece) <- board.pieces do addColumn(l1Stack(0), featureIndex(piece, squareNum(sq)))
+    for (sq, piece) <- board.pieces do addPiece(l1Stack(0), piece, squareNum(sq), wkSq, bkSq)
  // ── Accumulator push (incremental updates) ───────────────────────────────
-  def pushAccumulator(childPly: Int, move: Move, board: Board): Unit =
+  def pushAccumulator(childPly: Int, move: Move, parentBoard: Board, childBoard: Board): Unit =
    System.arraycopy(l1Stack(childPly - 1), 0, l1Stack(childPly), 0, accSize)
-    val l1 = l1Stack(childPly)
-    move.moveType match
-      case MoveType.Normal(_)                                 => applyNormalDelta(l1, move, board)
-      case MoveType.EnPassant                                 => applyEnPassantDelta(l1, move, board)
-      case MoveType.CastleKingside | MoveType.CastleQueenside => applyCastleDelta(l1, move, board)
-      case MoveType.Promotion(p)                              => applyPromotionDelta(l1, move, p, board)
+    if isKingMove(move, parentBoard) then recomputeAccumulatorInto(l1Stack(childPly), childBoard)
+    else applyNonKingDelta(l1Stack(childPly), move, parentBoard)
+
+  private def isKingMove(move: Move, board: Board): Boolean =
+    move.moveType == MoveType.CastleKingside ||
+      move.moveType == MoveType.CastleQueenside ||
+      board.pieceAt(move.from).exists(_.pieceType == PieceType.King)
  def copyAccumulator(parentPly: Int, childPly: Int): Unit =
    System.arraycopy(l1Stack(parentPly), 0, l1Stack(childPly), 0, accSize)
  def recomputeAccumulator(ply: Int, board: Board): Unit =
-    System.arraycopy(model.weights(0).bias, 0, l1Stack(ply), 0, accSize)
-    for (sq, piece) <- board.pieces do addColumn(l1Stack(ply), featureIndex(piece, squareNum(sq)))
+    recomputeAccumulatorInto(l1Stack(ply), board)
+
+  private def recomputeAccumulatorInto(l1: Array[Float], board: Board): Unit =
+    val wkSq = wkSqOf(board)
+    val bkSq = bkSqOf(board)
+    System.arraycopy(model.weights(0).bias, 0, l1, 0, accSize)
+    for (sq, piece) <- board.pieces do addPiece(l1, piece, squareNum(sq), wkSq, bkSq)
  def validateAccumulator(ply: Int, board: Board): Boolean =
-    // Compute what L1 should be from scratch
-    val expectedL1 = new Array[Float](accSize)
-    System.arraycopy(model.weights(0).bias, 0, expectedL1, 0, accSize)
-    for (sq, piece) <- board.pieces do addColumn(expectedL1, featureIndex(piece, squareNum(sq)))
-
+    val expected = new Array[Float](accSize)
+    val wkSq     = wkSqOf(board)
+    val bkSq     = bkSqOf(board)
+    System.arraycopy(model.weights(0).bias, 0, expected, 0, accSize)
+    for (sq, piece) <- board.pieces do addPiece(expected, piece, squareNum(sq), wkSq, bkSq)
    // Compare with actual L1
    val actual = l1Stack(ply)
-    val maxError =
-      (0 until accSize).foldLeft(0f) { (currentMax, i) =>
-        val error = math.abs(actual(i) - expectedL1(i))
-        math.max(currentMax, error)
-      }
-    maxError < 0.001f // Allow small floating-point errors
+    (0 until accSize).forall(i => math.abs(actual(i) - expected(i)) < 0.001f)
+  // ── Non-king incremental deltas ──────────────────────────────────────────
-  private def applyNormalDelta(l1: Array[Float], move: Move, board: Board): Unit =
-    // Extract source and destination square indices early
-    val fromNum = squareNum(move.from)
-    val toNum   = squareNum(move.to)
-    // Get the moving piece
+  private def applyNonKingDelta(l1: Array[Float], move: Move, board: Board): Unit =
+    val wkSq = wkSqOf(board)
+    val bkSq = bkSqOf(board)
+    move.moveType match
+      case MoveType.Normal(_)    => applyNormalDelta(l1, move, board, wkSq, bkSq)
+      case MoveType.EnPassant    => applyEnPassantDelta(l1, move, board, wkSq, bkSq)
+      case MoveType.Promotion(p) => applyPromotionDelta(l1, move, p, board, wkSq, bkSq)
+      case _                     => () // king moves handled before this point
+  private def applyNormalDelta(l1: Array[Float], move: Move, board: Board, wkSq: Int, bkSq: Int): Unit =
    board.pieceAt(move.from).foreach { mover =>
-      subtractColumn(l1, featureIndex(mover, fromNum))
-
-      // If there's a capture, subtract the captured piece
-      board.pieceAt(move.to).foreach { cap =>
-        subtractColumn(l1, featureIndex(cap, toNum))
-      }
-      // Add the piece to its new location
-      addColumn(l1, featureIndex(mover, toNum))
+      val fromNum = squareNum(move.from)
+      val toNum   = squareNum(move.to)
+      removePiece(l1, mover, fromNum, wkSq, bkSq)
+      board.pieceAt(move.to).foreach(cap => removePiece(l1, cap, toNum, wkSq, bkSq))
+      addPiece(l1, mover, toNum, wkSq, bkSq)
    }
-  private def applyEnPassantDelta(l1: Array[Float], move: Move, board: Board): Unit =
+  private def applyEnPassantDelta(l1: Array[Float], move: Move, board: Board, wkSq: Int, bkSq: Int): Unit =
    board.pieceAt(move.from).foreach { pawn =>
      val capturedSq = Square(move.to.file, move.from.rank)
-      subtractColumn(l1, featureIndex(pawn, squareNum(move.from)))
-      board.pieceAt(capturedSq).foreach(cap => subtractColumn(l1, featureIndex(cap, squareNum(capturedSq))))
-      addColumn(l1, featureIndex(pawn, squareNum(move.to)))
+      removePiece(l1, pawn, squareNum(move.from), wkSq, bkSq)
+      board.pieceAt(capturedSq).foreach(cap => removePiece(l1, cap, squareNum(capturedSq), wkSq, bkSq))
+      addPiece(l1, pawn, squareNum(move.to), wkSq, bkSq)
    }
-  private def applyCastleDelta(l1: Array[Float], move: Move, board: Board): Unit =
-    board.pieceAt(move.from).foreach { king =>
-      val rank     = move.from.rank
-      val kingside = move.moveType == MoveType.CastleKingside
-      val (rookFrom, rookTo) =
-        if kingside then (Square(File.H, rank), Square(File.F, rank))
-        else (Square(File.A, rank), Square(File.D, rank))
-      val rook = Piece(king.color, PieceType.Rook)
-      subtractColumn(l1, featureIndex(king, squareNum(move.from)))
-      addColumn(l1, featureIndex(king, squareNum(move.to)))
-      subtractColumn(l1, featureIndex(rook, squareNum(rookFrom)))
-      addColumn(l1, featureIndex(rook, squareNum(rookTo)))
-    }
-  private def applyPromotionDelta(l1: Array[Float], move: Move, promo: PromotionPiece, board: Board): Unit =
+  private def applyPromotionDelta(
+      l1: Array[Float],
+      move: Move,
+      promo: PromotionPiece,
+      board: Board,
+      wkSq: Int,
+      bkSq: Int,
+  ): Unit =
    board.pieceAt(move.from).foreach { pawn =>
      val toNum = squareNum(move.to)
-      subtractColumn(l1, featureIndex(pawn, squareNum(move.from)))
-      board.pieceAt(move.to).foreach(cap => subtractColumn(l1, featureIndex(cap, toNum)))
-      addColumn(l1, featureIndex(Piece(pawn.color, promotedType(promo)), toNum))
+      removePiece(l1, pawn, squareNum(move.from), wkSq, bkSq)
+      board.pieceAt(move.to).foreach(cap => removePiece(l1, cap, toNum, wkSq, bkSq))
+      addPiece(l1, Piece(pawn.color, promotedType(promo)), toNum, wkSq, bkSq)
    }
  private def promotedType(promo: PromotionPiece): PieceType = promo match
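
The incremental updates in this hunk rely on one invariant: removing and adding weight columns must leave the accumulator equal to a from-scratch recomputation, which is what `validateAccumulator` checks. A minimal sketch of that invariant follows. The tiny `accSize`, the `weightsT` contents, and the helper names `fromScratch`, `closeTo`, and `incrementalExample` are illustrative assumptions, not code from the repository.

```scala
// Toy dimensions; the real accumulator and feature space are far larger.
val accSize     = 4
val numFeatures = 3
val bias        = Array(0.5f, -0.25f, 0f, 1f)

// Hypothetical transposed weights: accSize contiguous floats per feature.
val weightsT: Array[Float] = Array.tabulate(numFeatures * accSize)(i => (i % 7) * 0.125f)

def addColumn(acc: Array[Float], featureIdx: Int): Unit =
  val offset = featureIdx * accSize
  for i <- 0 until accSize do acc(i) += weightsT(offset + i)

def subtractColumn(acc: Array[Float], featureIdx: Int): Unit =
  val offset = featureIdx * accSize
  for i <- 0 until accSize do acc(i) -= weightsT(offset + i)

// Reference path: start from the bias and add every active feature.
def fromScratch(active: Seq[Int]): Array[Float] =
  val acc = bias.clone()
  active.foreach(f => addColumn(acc, f))
  acc

// Same tolerance as the validation in the diff.
def closeTo(a: Array[Float], b: Array[Float]): Boolean =
  a.length == b.length && a.indices.forall(i => math.abs(a(i) - b(i)) < 0.001f)

// Incremental path: feature 1 turns off, feature 2 turns on.
def incrementalExample(): Array[Float] =
  val acc = fromScratch(Seq(0, 1))
  subtractColumn(acc, 1)
  addColumn(acc, 2)
  acc
```

Under these assumptions `incrementalExample()` and `fromScratch(Seq(0, 2))` agree within tolerance. A king move changes the king-square term of every index in that king's half, so no cheap delta of this kind applies there, which is presumably why the diff recomputes in that case.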
@@ -154,7 +177,6 @@ class NNUE(model: NbaiModel):
      score
  def evaluateAtPlyWithValidation(ply: Int, turn: Color, hash: Long, board: Board): Int =
    // For debugging: validate that incremental accumulator matches recomputation
    if validateAccum && ply > 0 && ply % 10 != 0 then
      val isValid = validateAccumulator(ply, board)
      if !isValid then System.err.println(s"WARNING: NNUE accumulator diverged at ply $ply")
@@ -206,9 +228,23 @@ class NNUE(model: NbaiModel):
  private val legacyL1 = new Array[Float](accSize)
  def evaluate(context: GameContext): Int =
+    // Match training: for Black-to-move positions, mirror the board (ranks flipped,
+    // colours swapped) so the model always sees from the side-to-move's perspective.
+    // The scoreFromOutput negation then converts back to White's absolute perspective.
+    val (wkSq, bkSq, pieces, turn) =
+      if context.turn == Color.Black then
+        val wk = flipSqNum(bkSqOf(context.board)) // flipped Black king → new "White" king
+        val bk = flipSqNum(wkSqOf(context.board)) // flipped White king → new "Black" king
+        val flipped = context.board.pieces.map { case (sq, p) =>
+          (sq, Piece(p.color.opposite, p.pieceType))
+        }
+        (wk, bk, flipped, Color.Black) // pass Black so scoreFromOutput negates the result
+      else (wkSqOf(context.board), bkSqOf(context.board), context.board.pieces, context.turn)
    System.arraycopy(model.weights(0).bias, 0, legacyL1, 0, accSize)
-    for (sq, piece) <- context.board.pieces do addColumn(legacyL1, featureIndex(piece, squareNum(sq)))
-    runL2toOutput(legacyL1, context.turn)
+    for (sq, piece) <- pieces do
+      val sqNum = if turn == Color.Black then flipSqNum(squareNum(sq)) else squareNum(sq)
+      addPiece(legacyL1, piece, sqNum, wkSq, bkSq)
+    runL2toOutput(legacyL1, turn)
  def benchmark(): Unit =
    val context    = GameContext.initial
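
The HalfKP indexing introduced in NNUE.scala above can be checked in isolation. The sketch below restates the index formulas on plain integers. `HalfSize = 64 * 768` is an assumption inferred from the 98304-dim feature count in the commit message, since the definition of `HALF_SIZE` is not visible in this diff, and `pieceIdx` is taken as an already computed value in 0..11.

```scala
// Assumed layout: 64 king squares * 12 piece kinds * 64 squares per half.
val HalfSize: Int    = 64 * 768     // 49152 (assumption, see lead-in)
val FeatureSize: Int = 2 * HalfSize // 98304

// Mirror a 0..63 square index vertically: rank r becomes 7 - r, file unchanged.
def flipSqNum(sqNum: Int): Int = (7 - sqNum / 8) * 8 + sqNum % 8

// White-king half: indices in [0, HalfSize).
def featureIdxWhite(pieceIdx: Int, sqNum: Int, wkSq: Int): Int =
  wkSq * 768 + pieceIdx * 64 + sqNum

// Black-king half: indices in [HalfSize, FeatureSize).
def featureIdxBlack(pieceIdx: Int, sqNum: Int, bkSq: Int): Int =
  HalfSize + bkSq * 768 + pieceIdx * 64 + sqNum
```

With these definitions the largest white-half index is `63 * 768 + 11 * 64 + 63`, which equals `HalfSize - 1`, so the two halves do not overlap, and `flipSqNum` is its own inverse.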
@@ -1,6 +1,7 @@
 package de.nowchess.bot.bots.nnue
 import java.io.InputStream
+import java.nio.file.{Files, Path}
 import java.nio.{ByteBuffer, ByteOrder}
 import java.nio.charset.StandardCharsets
@@ -17,13 +18,28 @@ object NbaiLoader:
    val weights  = descs.map(_ => readLayerWeights(buf))
    NbaiModel(metadata, descs, weights)
-  /** Tries /nnue_weights.nbai on the classpath; falls back to migrating /nnue_weights.bin. */
+  /** Loads weights from the `nnue.weights` system property if it points at a readable file; otherwise tries
+    * /nnue_weights.nbai on the classpath, falling back to migrating /nnue_weights.bin.
+    */
  def loadDefault(): NbaiModel =
-    Option(getClass.getResourceAsStream("/nnue_weights.nbai")) match
-      case Some(s) =>
+    overrideModel().getOrElse {
+      Option(getClass.getResourceAsStream("/nnue_weights.nbai")) match
+        case Some(s) =>
+          try load(s)
+          finally s.close()
+        case None => NbaiMigrator.migrateFromBin()
+    }
+  private def overrideModel(): Option[NbaiModel] =
+    sys.props
+      .get("nnue.weights")
+      .map(Path.of(_))
+      .filter(Files.isRegularFile(_))
+      .map { path =>
+        val s = Files.newInputStream(path)
        try load(s)
        finally s.close()
-      case None => NbaiMigrator.migrateFromBin()
+      }
  private def checkHeader(buf: ByteBuffer): Unit =
    val magic = buf.getInt()
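
The override lookup in `loadDefault` above follows a small reusable pattern: read a system property, turn it into a path, and keep it only if it names a regular file. A minimal sketch of that step on its own is below. The function name `overridePath` and the idea of passing the property key as a parameter are illustrative assumptions.

```scala
import java.nio.file.{Files, Path}

// Resolve an optional override path from a JVM system property.
// Returns None when the property is unset or does not name a regular file,
// so the caller can fall through to its bundled default.
def overridePath(property: String): Option[Path] =
  sys.props
    .get(property)
    .map(Path.of(_))
    .filter(Files.isRegularFile(_))
```

A caller would then write something like `overridePath("nnue.weights").map(loadFrom).getOrElse(loadBundled())`, where `loadFrom` and `loadBundled` stand in for the real loaders. Note that a misspelled path silently falls back to the default, which is convenient for tooling but easy to miss when debugging.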
@@ -32,6 +32,8 @@ final class AlphaBetaSearch(
  private val nodeCount   = AtomicInteger(0)
  private val ordering    = MoveOrdering.OrderingContext()
+  def lastNodeCount: Int = nodeCount.get()
  private final case class QuiescenceNode(
      context: GameContext,
      ply: Int,
@@ -47,6 +49,17 @@ final class AlphaBetaSearch(
    bestMove(context, maxDepth, Set.empty)
  def bestMove(context: GameContext, maxDepth: Int, excludedRootMoves: Set[Move]): Option[Move] =
+    doDepthSearch(context, maxDepth, excludedRootMoves, Map.empty)
+  def bestMove(context: GameContext, maxDepth: Int, excludedRootMoves: Set[Move], hints: Map[Move, Int]): Option[Move] =
+    doDepthSearch(context, maxDepth, excludedRootMoves, hints)
+  private def doDepthSearch(
+      context: GameContext,
+      maxDepth: Int,
+      excludedRootMoves: Set[Move],
+      hints: Map[Move, Int],
+  ): Option[Move] =
    tt.clear()
    ordering.clear()
    weights.initAccumulator(context)
@@ -66,6 +79,7 @@ final class AlphaBetaSearch(
          ASPIRATION_DELTA,
          rootHash,
          excludedRootMoves,
+          hints,
        )
        (move.orElse(bestSoFar), score)
      }
@@ -78,6 +92,22 @@ final class AlphaBetaSearch(
    bestMoveWithTime(context, timeBudgetMs, Set.empty)
  def bestMoveWithTime(context: GameContext, timeBudgetMs: Long, excludedRootMoves: Set[Move]): Option[Move] =
+    doTimedSearch(context, timeBudgetMs, excludedRootMoves, Map.empty)
+  def bestMoveWithTime(
+      context: GameContext,
+      timeBudgetMs: Long,
+      excludedRootMoves: Set[Move],
+      hints: Map[Move, Int],
+  ): Option[Move] =
+    doTimedSearch(context, timeBudgetMs, excludedRootMoves, hints)
+  private def doTimedSearch(
+      context: GameContext,
+      timeBudgetMs: Long,
+      excludedRootMoves: Set[Move],
+      hints: Map[Move, Int],
+  ): Option[Move] =
    tt.clear()
    ordering.clear()
    weights.initAccumulator(context)
@@ -100,6 +130,7 @@ final class AlphaBetaSearch(
          ASPIRATION_DELTA,
          rootHash,
          excludedRootMoves,
+          hints,
        )
        loop(move.orElse(bestSoFar), score, depth + 1, depth)
@@ -124,14 +155,17 @@ final class AlphaBetaSearch(
      initialWindow: Int,
      rootHash: Long,
      excludedRootMoves: Set[Move],
+      hints: Map[Move, Int],
  ): (Int, Option[Move]) =
    val state = SearchState(rootHash, Map(rootHash -> 1))
    @scala.annotation.tailrec
    def loop(currentAlpha: Int, currentBeta: Int, delta: Int, attempt: Int): (Int, Option[Move]) =
-      if attempt >= 3 || attempt >= depth then search(context, depth, 0, Window(-INF, INF), state, excludedRootMoves)
+      if attempt >= 3 || attempt >= depth then
+        search(context, depth, 0, Window(-INF, INF), state, excludedRootMoves, hints)
      else
-        val (score, move) = search(context, depth, 0, Window(currentAlpha, currentBeta), state, excludedRootMoves)
+        val (score, move) =
+          search(context, depth, 0, Window(currentAlpha, currentBeta), state, excludedRootMoves, hints)
        if score > currentAlpha && score < currentBeta then (score, move)
        else if score <= currentAlpha then
          loop(score - delta, currentBeta, math.min(delta * 2, ASPIRATION_DELTA_MAX), attempt + 1)
@@ -156,12 +190,14 @@ final class AlphaBetaSearch(
      beta: Int,
      state: SearchState,
      excludedRootMoves: Set[Move],
+      hints: Map[Move, Int],
  ): Option[Int] =
    val nullCtx        = nullMoveContext(context)
    val nullState      = state.advance(ZobristHash.hash(nullCtx))
    val reductionDepth = math.max(0, depth - 1 - NULL_MOVE_R)
    weights.copyAccumulator(ply, ply + 1)
-    val (score, _) = search(nullCtx, reductionDepth, ply + 1, Window(-beta, -beta + 1), nullState, excludedRootMoves)
+    val (score, _) =
+      search(nullCtx, reductionDepth, ply + 1, Window(-beta, -beta + 1), nullState, excludedRootMoves, hints)
    if -score >= beta then Some(beta) else None
  /** Negamax alpha-beta search returning (score, best move). */
@@ -172,8 +208,9 @@ final class AlphaBetaSearch(
      window: Window,
      state: SearchState,
      excludedRootMoves: Set[Move],
+      hints: Map[Move, Int],
  ): (Int, Option[Move]) =
-    val params = SearchParams(context, depth, ply, window, state, excludedRootMoves)
+    val params = SearchParams(context, depth, ply, window, state, excludedRootMoves, hints)
    searchNode(params)
  private def searchNode(params: SearchParams): (Int, Option[Move]) =
@@ -235,13 +272,14 @@ final class AlphaBetaSearch(
            params.window.beta,
            params.state,
            params.excludedRootMoves,
+            params.rootHints,
          ),
        )
        .flatten
    nullResult.map((_, None)).getOrElse {
      val ttBest  = tt.probe(params.state.hash).flatMap(_.bestMove)
-      val ordered = MoveOrdering.sort(params.context, legalMoves, ttBest, params.ply, ordering)
+      val ordered = MoveOrdering.sort(params.context, legalMoves, ttBest, params.ply, ordering, params.rootHints)
      searchSequential(
        params.context,
        params.depth,
@@ -250,6 +288,7 @@ final class AlphaBetaSearch(
        ordered,
        params.state,
        params.excludedRootMoves,
+        params.rootHints,
      )
    }
@@ -280,6 +319,7 @@ final class AlphaBetaSearch(
        Window(-a - 1, -a),
        childState,
        params.excludedRootMoves,
+        params.rootHints,
      )
      val s = -rs
      if s > a then
@@ -290,6 +330,7 @@ final class AlphaBetaSearch(
          Window(betaNeg, -a),
          childState,
          params.excludedRootMoves,
+          params.rootHints,
        )
        -fs
      else s
@@ -301,6 +342,7 @@ final class AlphaBetaSearch(
        Window(betaNeg, -a),
        childState,
        params.excludedRootMoves,
+        params.rootHints,
      )
      -rs
@@ -364,8 +406,9 @@ final class AlphaBetaSearch(
      ordered: List[Move],
      state: SearchState,
      excludedRootMoves: Set[Move],
+      rootHints: Map[Move, Int] = Map.empty,
  ): (Int, Option[Move]) =
-    val params                        = SearchParams(context, depth, ply, window, state, excludedRootMoves)
+    val params                        = SearchParams(context, depth, ply, window, state, excludedRootMoves, rootHints)
    val (bestMove, bestScore, cutoff) = searchLoop(0, 0, LoopAcc(None, -INF, window.alpha), params, ordered)
    val flag =
      if cutoff then TTFlag.Lower
@@ -38,8 +38,10 @@ object MoveOrdering:
      ttBestMove: Option[Move],
      ply: Int = 0,
      ordering: OrderingContext = new OrderingContext(),
+      rootHints: Map[Move, Int] = Map.empty,
  ): Int =
    if ttBestMove.exists(m => m.from == move.from && m.to == move.to) then Int.MaxValue
+    else if ply == 0 && rootHints.nonEmpty then rootHints.getOrElse(move, Int.MinValue / 2)
    else
      move.moveType match
        case MoveType.Promotion(PromotionPiece.Queen) =>
@@ -56,8 +58,9 @@ object MoveOrdering:
      ttBestMove: Option[Move],
      ply: Int = 0,
      ordering: OrderingContext = new OrderingContext(),
+      rootHints: Map[Move, Int] = Map.empty,
  ): List[Move] =
-    moves.sortBy(m => -score(context, m, ttBestMove, ply, ordering))
+    moves.sortBy(m => -score(context, m, ttBestMove, ply, ordering, rootHints))
  private def scoreQuietMove(move: Move, ply: Int, ordering: OrderingContext): Int =
    val isKiller = ordering.getKillerMoves(ply).exists(k => k.from == move.from && k.to == move.to)
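
The root-hint rule added to `MoveOrdering.score` above has three tiers: the transposition-table move always comes first, hinted moves at ply 0 are ranked by their hint value, and any root move missing from a non-empty hint map drops to `Int.MinValue / 2`. A minimal sketch of that ordering follows, with moves modelled as plain strings and the regular heuristics replaced by a `fallback` function. Both simplifications are assumptions for illustration only.

```scala
// Score one move. `fallback` stands in for the capture/killer/history heuristics.
def score(
    move: String,
    ttBest: Option[String],
    ply: Int,
    rootHints: Map[String, Int],
    fallback: String => Int,
): Int =
  if ttBest.contains(move) then Int.MaxValue
  else if ply == 0 && rootHints.nonEmpty then rootHints.getOrElse(move, Int.MinValue / 2)
  else fallback(move)

// Highest score first. sortBy is stable, so ties keep their input order.
def sort(
    moves: List[String],
    ttBest: Option[String],
    ply: Int,
    rootHints: Map[String, Int],
    fallback: String => Int,
): List[String] =
  moves.sortBy(m => -score(m, ttBest, ply, rootHints, fallback))
```

Negating the scores is safe here because neither `Int.MaxValue` nor `Int.MinValue / 2` overflows under negation. A hint value at or below `Int.MinValue / 2` would tie with or sort behind unhinted moves, so callers presumably keep hints well inside that range.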
@@ -14,6 +14,7 @@ final case class SearchParams(
    window: Window,
    state: SearchState,
    excludedRootMoves: Set[Move],
+    rootHints: Map[Move, Int] = Map.empty,
 )
 final case class SearchState(hash: Long, repetitions: Map[Long, Int]):
@@ -0,0 +1,112 @@
 package de.nowchess.bot.selfplay
 import de.nowchess.api.game.GameContext
 import de.nowchess.api.move.Move
 import de.nowchess.api.rules.RuleSet
 import de.nowchess.bot.BotDifficulty
 import de.nowchess.bot.bots.NNUEBot
 import de.nowchess.io.fen.FenExporter
 import de.nowchess.rules.sets.DefaultRules
 import java.io.{BufferedWriter, FileWriter}
 import java.nio.file.{Files, Path}
 import scala.collection.mutable
 import scala.util.Random
 /** Standalone self-play harness. Runs NNUEBot against itself from randomised openings and writes the visited positions
  * as one FEN per line — the input format expected by the Python labeler. No microservices.
  *
  * Games run sequentially because EvaluationNNUE holds a shared accumulator; the small per-move time budget keeps
  * throughput high. Stockfish relabels every position later, so shallow self-play search is sufficient.
  */
 object SelfPlayMain:
  private case class Config(
      games: Int = 500,
      out: String = "modules/official-bots/python/data/selfplay.txt",
      weights: Option[String] = None,
      moveTimeMs: Long = 50L,
      randomPlies: Int = 8,
      maxPlies: Int = 200,
      seed: Long = System.nanoTime(),
  )
  def main(args: Array[String]): Unit =
    val config = parse(args.toList, Config())
    config.weights.foreach(System.setProperty("nnue.weights", _))
    val rules = DefaultRules
    val bot   = NNUEBot(BotDifficulty.Hard, rules, fixedMoveTimeMs = Some(config.moveTimeMs))
    val rng   = new Random(config.seed)
    val seen  = mutable.HashSet.empty[String]
    Files.createDirectories(Path.of(config.out).toAbsolutePath.getParent)
    val writer = new BufferedWriter(new FileWriter(config.out))
    try
      var game = 0
      while game < config.games do
        playGame(rules, bot, rng, config, seen, writer)
        game += 1
        if game % 25 == 0 then
          writer.flush()
          println(s"games=$game/${config.games} positions=${seen.size}")
    finally writer.close()
    println(s"Done. ${seen.size} unique positions -> ${config.out}")
  private def playGame(
      rules: RuleSet,
      bot: GameContext => Option[Move],
      rng: Random,
      config: Config,
      seen: mutable.HashSet[String],
      writer: BufferedWriter,
  ): Unit =
    randomOpening(rules, rng, config.randomPlies, GameContext.initial) match
      case None => ()
      case Some(start) =>
        var ctx   = start
        var plies = config.randomPlies
        var live  = true
        while live && plies < config.maxPlies do
          if isTerminal(rules, ctx) then live = false
          else
            bot(ctx) match
              case None => live = false
              case Some(move) =>
                ctx = rules.applyMove(ctx)(move)
                plies += 1
                record(rules, ctx, seen, writer)
  private def randomOpening(rules: RuleSet, rng: Random, plies: Int, start: GameContext): Option[GameContext] =
    var ctx = start
    var i   = 0
    while i < plies do
      val legal = rules.allLegalMoves(ctx)
      if legal.isEmpty then return None
      ctx = rules.applyMove(ctx)(legal(rng.nextInt(legal.size)))
      i += 1
    Some(ctx)
  private def record(rules: RuleSet, ctx: GameContext, seen: mutable.HashSet[String], writer: BufferedWriter): Unit =
    if !rules.isCheck(ctx) && !isTerminal(rules, ctx) then
      val fen = FenExporter.gameContextToFen(ctx)
      if seen.add(fen) then
        writer.write(fen)
        writer.newLine()
  private def isTerminal(rules: RuleSet, ctx: GameContext): Boolean =
    rules.allLegalMoves(ctx).isEmpty ||
      rules.isInsufficientMaterial(ctx) ||
      rules.isFiftyMoveRule(ctx) ||
      rules.isThreefoldRepetition(ctx)
  private def parse(args: List[String], acc: Config): Config = args match
    case "--games" :: v :: rest        => parse(rest, acc.copy(games = v.toInt))
    case "--out" :: v :: rest          => parse(rest, acc.copy(out = v))
    case "--weights" :: v :: rest      => parse(rest, acc.copy(weights = Some(v)))
    case "--move-ms" :: v :: rest      => parse(rest, acc.copy(moveTimeMs = v.toLong))
    case "--random-plies" :: v :: rest => parse(rest, acc.copy(randomPlies = v.toInt))
    case "--max-plies" :: v :: rest    => parse(rest, acc.copy(maxPlies = v.toInt))
    case "--seed" :: v :: rest         => parse(rest, acc.copy(seed = v.toLong))
    case Nil                           => acc
    case unknown :: rest               => println(s"Ignoring unknown arg: $unknown"); parse(rest, acc)
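
The `parse` function above calls `v.toInt` and `v.toLong` directly, so a malformed value such as `--games ten` would throw a `NumberFormatException`. One possible hardening, sketched here on a reduced two-field `Config`, keeps the current value when a number does not parse. The reduced case class and the choice to silently ignore bad values are assumptions, not behaviour of the harness in this commit.

```scala
// Reduced stand-in for the harness Config; only two of its fields are modelled.
final case class Config(games: Int = 500, moveTimeMs: Long = 50L)

@scala.annotation.tailrec
def parse(args: List[String], acc: Config): Config = args match
  // toIntOption / toLongOption return None instead of throwing on bad input.
  case "--games" :: v :: rest   => parse(rest, v.toIntOption.fold(acc)(n => acc.copy(games = n)))
  case "--move-ms" :: v :: rest => parse(rest, v.toLongOption.fold(acc)(n => acc.copy(moveTimeMs = n)))
  case Nil                      => acc
  case _ :: rest                => parse(rest, acc) // unknown flag or dangling value: skip it
```

Whether to ignore, warn, or abort on a bad value is a design choice. For a long unattended self-play run, failing fast may be preferable to quietly using a default.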
@@ -414,9 +414,17 @@ class TournamentBotGamePlayer:
          if gameTerminalStatuses.contains(status) then
            log.infof("Game %s ended — status=%s", gameId, status); done = true
          else
+            // TEMP: tournament-server reports wrong color in pairings (everyone white).
+            // The game endpoint white/black ids are correct, so derive our color from it.
+            val whiteId = node.path("white").path("id").asText()
+            val blackId = node.path("black").path("id").asText()
+            val myColor =
+              if whiteId == cfg.botId then "white"
+              else if blackId == cfg.botId then "black"
+              else color
            val turn = node.path("turn").asText()
            val fen  = node.path("fen").asText()
-            if turn == color && status == "ongoing" && fen.nonEmpty && fen != lastFen then
+            if turn == myColor && status == "ongoing" && fen.nonEmpty && fen != lastFen then
              lastFen = fen
              log.infof("Our turn in game %s — computing move (fen=%s)", gameId, fen)
              computeUci(cfg, fen) match
@@ -4,9 +4,9 @@ import de.nowchess.api.board.*
 import de.nowchess.api.game.GameContext
 import de.nowchess.api.move.{Move, MoveType, PromotionPiece}
-import java.io.{DataInputStream, FileInputStream}
+import java.io.{DataInputStream, FileInputStream, InputStream}
 import java.util.concurrent.ThreadLocalRandom
 import scala.collection.mutable
 import scala.util.Random
 /** Reads a Polyglot opening book (.bin file) and probes it for moves.
  *
@@ -16,24 +16,11 @@ import scala.util.Random
  *   - weight: 2 bytes (Short) — move weight (higher = preferred)
  *   - learn: 4 bytes (Int) — learning data (unused)
  */
-final class PolyglotBook(path: String):
-  private val entries: Map[Long, Vector[BookEntry]] =
-    try {
-      val r = loadBookFile(path)
-      println(s"Book loaded successfully. ${r.size} entries found.")
-      r
-    } catch
-      case e: Exception =>
-        println(s"Error loading book: $e")
-        // Gracefully fail: return empty map if book cannot be loaded
-        // This allows the bot to work even if the book file is missing
-        scala.collection.immutable.Map.empty
+final class PolyglotBook private (entries: Map[Long, Vector[BookEntry]]):
  /** Probe the book for a move in the given position. Returns a weighted random move, or None if not in book. */
  def probe(context: GameContext): Option[Move] =
    val hash = PolyglotHash.hash(context)
    println(f"0x$hash%016X")
    entries.get(hash).flatMap { bookEntries =>
      if bookEntries.isEmpty then None
      else
@@ -41,24 +28,6 @@ final class PolyglotBook(path: String):
        decodeMove(entry.move, context)
    }
-  private def loadBookFile(path: String): Map[Long, Vector[BookEntry]] =
-    val input = DataInputStream(FileInputStream(path))
-    try
-      val result = mutable.Map[Long, Vector[BookEntry]]()
-      while input.available() > 0 do
-        val key    = input.readLong()
-        val move   = input.readShort()
-        val weight = input.readShort()
-        input.readInt() // learning data (unused)
-        val entry = BookEntry(key, move, weight)
-        result.updateWith(key) {
-          case Some(entries) => Some(entries :+ entry)
-          case None          => Some(Vector(entry))
-        }
-      result.toMap
-    finally input.close()
  /** Decode a packed Polyglot move short into an Option[Move].
    *
    * Bit layout of the move Short:
@@ -124,7 +93,7 @@ final class PolyglotBook(path: String):
    if entries.length == 1 then entries.head
    else
      val totalWeight = entries.map(_.weight).sum
-      val pick        = Random.nextInt(totalWeight.max(1)) // NOSONAR
+      val pick        = ThreadLocalRandom.current().nextInt(totalWeight.max(1)) // NOSONAR
      @scala.annotation.tailrec
      def select(remaining: Int, idx: Int): BookEntry =
@@ -134,4 +103,48 @@ final class PolyglotBook(path: String):
      select(pick, 0)
+object PolyglotBook:
+  /** Load a book from a filesystem path. Fails gracefully to an empty book. */
+  def apply(path: String): PolyglotBook =
+    safeLoad(s"file $path")(FileInputStream(path))
+  /** Load a book from a classpath resource (native-image safe: the resource is embedded in the binary, so no file must
+    * be mounted into the pod).
+    */
+  def fromResource(name: String): PolyglotBook =
+    Option(getClass.getResourceAsStream(name)) match
+      case Some(stream) => safeLoad(s"resource $name")(stream)
+      case None =>
+        println(s"Error loading book: resource $name not found on classpath")
+        new PolyglotBook(Map.empty)
+  private def safeLoad(source: String)(stream: => InputStream): PolyglotBook =
+    try
+      val entries = parse(stream)
+      println(s"Book loaded successfully from $source. ${entries.size} entries found.")
+      new PolyglotBook(entries)
+    catch
+      case e: Exception =>
+        println(s"Error loading book from $source: $e")
+        new PolyglotBook(Map.empty)
+  private def parse(stream: InputStream): Map[Long, Vector[BookEntry]] =
+    val input = DataInputStream(stream)
+    try
+      val result = mutable.Map[Long, Vector[BookEntry]]()
+      while input.available() > 0 do
+        val key    = input.readLong()
+        val move   = input.readShort()
+        val weight = input.readShort()
+        input.readInt() // learning data (unused)
+        val entry = BookEntry(key, move, weight)
+        result.updateWith(key) {
+          case Some(entries) => Some(entries :+ entry)
+          case None          => Some(Vector(entry))
+        }
+      result.toMap
+    finally input.close()
 private case class BookEntry(key: Long, move: Short, weight: Int)
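
The `parse` method above walks the book as fixed 16-byte records: an 8-byte key, a 2-byte move, a 2-byte weight, and 4 bytes of learn data. A minimal alternative sketch is shown below. It parses from an in-memory byte array, so it does not depend on `available()`, which on some stream types can report 0 before the end of the data, and it reads the weight as an unsigned 16-bit value, which matches my understanding of the Polyglot format but differs from the signed `readShort()` in the diff. The names `Entry` and `parseEntries` are hypothetical.

```scala
import java.nio.ByteBuffer

// Hypothetical stand-in for BookEntry.
final case class Entry(key: Long, move: Short, weight: Int)

// Parse a whole book held in memory. ByteBuffer is big-endian by default,
// the same byte order DataInputStream uses.
def parseEntries(bytes: Array[Byte]): Map[Long, Vector[Entry]] =
  val buf     = ByteBuffer.wrap(bytes)
  val entries = Vector.newBuilder[Entry]
  while buf.remaining() >= 16 do
    val key    = buf.getLong()
    val move   = buf.getShort()
    val weight = buf.getShort() & 0xffff // assumption: weight is unsigned 16-bit
    buf.getInt()                         // learn data, unused
    entries += Entry(key, move, weight)
  // groupBy keeps the file order of entries within each key.
  entries.result().groupBy(_.key)
```

A caller could feed it `stream.readAllBytes()`. Trailing bytes that do not form a full record are ignored rather than raising an error, which is a deliberate simplification in this sketch.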
@@ -312,6 +312,24 @@ class AlphaBetaSearchTest extends AnyFunSuite with Matchers:
    val search = AlphaBetaSearch(qRules, weights = ZeroEval)
    search.bestMove(GameContext.initial, maxDepth = 1) should be(Some(rootMove))
+  test("bestMove with root hints returns a valid move without regression"):
+    val context    = GameContext.initial
+    val legalMoves = DefaultRules.allLegalMoves(context)
+    val hints      = legalMoves.zipWithIndex.map { case (m, i) => m -> (legalMoves.length - i) }.toMap
+    val withHints = AlphaBetaSearch(DefaultRules, weights = EvaluationClassic)
+      .bestMove(context, maxDepth = 2, Set.empty, hints)
+    withHints should not be None
+    legalMoves should contain(withHints.get)
+  test("bestMoveWithTime with root hints returns a valid move without regression"):
+    val context    = GameContext.initial
+    val legalMoves = DefaultRules.allLegalMoves(context)
+    val hints      = legalMoves.zipWithIndex.map { case (m, i) => m -> (legalMoves.length - i) }.toMap
+    val withHints = AlphaBetaSearch(DefaultRules, weights = EvaluationClassic)
+      .bestMoveWithTime(context, 500L, Set.empty, hints)
+    withHints should not be None
+    legalMoves should contain(withHints.get)
  test("quiescence depth-limit in-check branch is exercised"):
    val rootMove            = Move(Square(File.E, Rank.R2), Square(File.E, Rank.R3), MoveType.Normal())
    val capMove             = Move(Square(File.D, Rank.R2), Square(File.D, Rank.R3), MoveType.Normal(true))
@@ -85,17 +85,17 @@ class HybridBotTest extends AnyFunSuite with Matchers:
  private val altMove  = Move(Square(File.E, Rank.R2), Square(File.E, Rank.R3), MoveType.Normal())
  private def vetoRules: RuleSet = new RuleSet:
-    private def fresh(ctx: GameContext): Boolean                          = ctx.moves.isEmpty
-    def candidateMoves(context: GameContext)(square: Square): List[Move]  = Nil
-    def legalMoves(context: GameContext)(square: Square): List[Move]      = Nil
-    def allLegalMoves(context: GameContext): List[Move]                   =
+    private def fresh(ctx: GameContext): Boolean                         = ctx.moves.isEmpty
+    def candidateMoves(context: GameContext)(square: Square): List[Move] = Nil
+    def legalMoves(context: GameContext)(square: Square): List[Move]     = Nil
+    def allLegalMoves(context: GameContext): List[Move] =
      if fresh(context) then List(mateMove, altMove) else Nil
-    def isCheck(context: GameContext): Boolean                            = false
-    def isCheckmate(context: GameContext): Boolean                        = context.moves.lastOption.contains(mateMove)
-    def isStalemate(context: GameContext): Boolean                        = context.moves.lastOption.contains(altMove)
-    def isInsufficientMaterial(context: GameContext): Boolean             = false
-    def isFiftyMoveRule(context: GameContext): Boolean                    = false
-    def isThreefoldRepetition(context: GameContext): Boolean              = false
+    def isCheck(context: GameContext): Boolean                = false
+    def isCheckmate(context: GameContext): Boolean            = context.moves.lastOption.contains(mateMove)
+    def isStalemate(context: GameContext): Boolean            = context.moves.lastOption.contains(altMove)
+    def isInsufficientMaterial(context: GameContext): Boolean = false
+    def isFiftyMoveRule(context: GameContext): Boolean        = false
+    def isThreefoldRepetition(context: GameContext): Boolean  = false
    def applyMove(context: GameContext)(move: Move): GameContext =
      context.copy(turn = context.turn.opposite, moves = context.moves :+ move)
@@ -217,3 +217,60 @@ class MoveOrderingTest extends AnyFunSuite with Matchers:
    val castle  = Move(Square(File.E, Rank.R1), Square(File.G, Rank.R1), MoveType.CastleKingside)
    MoveOrdering.score(context, castle, None) should be(0)
  test("root hints override capture heuristics at ply 0"):
    val board = Board(
      Map(
        Square(File.E, Rank.R4) -> Piece.WhiteQueen,
        Square(File.E, Rank.R5) -> Piece.BlackPawn,
        Square(File.D, Rank.R5) -> Piece.BlackRook,
      ),
    )
    val context     = GameContext.initial.withBoard(board).withTurn(Color.White)
    val quietMove   = Move(Square(File.E, Rank.R4), Square(File.E, Rank.R6))
    val rookCapture = Move(Square(File.E, Rank.R4), Square(File.D, Rank.R5), MoveType.Normal(true))
    val hints       = Map(quietMove -> 500, rookCapture -> 100)
    MoveOrdering.score(context, quietMove, None, ply = 0, rootHints = hints) should equal(500)
    MoveOrdering.score(context, rookCapture, None, ply = 0, rootHints = hints) should equal(100)
    MoveOrdering.score(context, rookCapture, None, ply = 0, rootHints = hints) should be <
      MoveOrdering.score(context, quietMove, None, ply = 0, rootHints = hints)
  test("root hints ignored at ply > 0"):
    val board   = Board(Map(Square(File.E, Rank.R4) -> Piece.WhiteQueen, Square(File.E, Rank.R5) -> Piece.BlackPawn))
    val context = GameContext.initial.withBoard(board).withTurn(Color.White)
    val capture = Move(Square(File.E, Rank.R4), Square(File.E, Rank.R5), MoveType.Normal(true))
    val quiet   = Move(Square(File.E, Rank.R4), Square(File.D, Rank.R4))
    val hints   = Map(quiet -> 99999, capture -> -99999)
    val captureScore = MoveOrdering.score(context, capture, None, ply = 1, rootHints = hints)
    val quietScore   = MoveOrdering.score(context, quiet, None, ply = 1, rootHints = hints)
    captureScore should be > quietScore
  test("move absent from root hints gets Int.MinValue / 2 fallback"):
    val board   = Board(Map(Square(File.E, Rank.R4) -> Piece.WhiteQueen))
    val context = GameContext.initial.withBoard(board).withTurn(Color.White)
    val move1   = Move(Square(File.E, Rank.R4), Square(File.E, Rank.R6))
    val move2   = Move(Square(File.E, Rank.R4), Square(File.E, Rank.R5))
    val hints   = Map(move1 -> 0)
    MoveOrdering.score(context, move2, None, ply = 0, rootHints = hints) should equal(Int.MinValue / 2)
  test("sort uses root hints at ply 0 to reorder moves"):
    val board = Board(
      Map(
        Square(File.E, Rank.R4) -> Piece.WhiteQueen,
        Square(File.E, Rank.R5) -> Piece.BlackPawn,
        Square(File.D, Rank.R5) -> Piece.BlackRook,
      ),
    )
    val context     = GameContext.initial.withBoard(board).withTurn(Color.White)
    val rookCapture = Move(Square(File.E, Rank.R4), Square(File.D, Rank.R5), MoveType.Normal(true))
    val pawnCapture = Move(Square(File.E, Rank.R4), Square(File.E, Rank.R5), MoveType.Normal(true))
    val quiet       = Move(Square(File.E, Rank.R4), Square(File.E, Rank.R6))
    val hints       = Map(quiet -> 9999, pawnCapture -> 500, rookCapture -> 100)
    val sorted = MoveOrdering.sort(context, List(rookCapture, pawnCapture, quiet), None, ply = 0, rootHints = hints)
    sorted.head should equal(quiet)
    sorted(1) should equal(pawnCapture)
    sorted(2) should equal(rookCapture)
@@ -1,3 +1,3 @@
 MAJOR=0
-MINOR=33
+MINOR=39
 PATCH=0
Author	SHA1	Message	Date
Janis Eccarius	e2b4342f60	fix(official-bots): prevent Colab OOM in NNUE training Dense 98304-dim HalfKP features at batch_size=16384 cost ~6.4 GB/batch on the host; with 8 hardcoded DataLoader workers and prefetch this OOM-killed the Colab runtime. - train.py: adaptive DataLoader workers (min(4, cpu_count), Colab free tier = 2), overridable via NNUE_LOADER_WORKERS; persistent_workers only when > 0. - NNUETraining.ipynb: lower BATCH_SIZE 16384 -> 4096 with a memory-cost note. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-24 22:18:18 +02:00
TeamCity	9d56446c65	ci: bump version with Build-156	2026-06-24 20:17:17 +00:00
Janis Eccarius	1c80abdb8a	feat(official-bots): standalone self-play + one-shot dataset builder for NNUE training Add an easy local data pipeline feeding GPU training on Colab. - SelfPlayMain: standalone NNUEBot self-play (no microservices) writing FENs for labeling; randomised openings for game diversity, sequential due to the shared EvaluationNNUE accumulator. Exposed via the `selfPlay` Gradle task and selfplay.sh. - NNUEBot: optional fixedMoveTimeMs so self-play runs fast (default unchanged). - NbaiLoader: honor `-Dnnue.weights=<path>` to load weights from a file before falling back to the bundled resource. - build_dataset.py / dataset.sh: one command builds the entire dataset (Lichess eval-DB backbone + self-play + tactical + random filler), dedups, balances the eval histogram, writes append-only zstd shards + manifest, and rclone-pushes to Drive. - train.py: NNUEDataset reads a directory of .jsonl.zst shards (streaming) in addition to a single file. - NNUETraining.ipynb: clone to ephemeral /content, sync shards from Drive (cache-aware), train on the shards dir; removed Colab generation/upload steps. - Concept + implementation plan docs. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-24 22:04:22 +02:00
TeamCity	c8cbcdca3b	ci: bump version with Build-155	2026-06-24 18:21:11 +00:00
Janis	e4fee85134	feat(ncs-110): feed NNUE root-move scores into search move ordering (#83) Pre-evaluated NNUE scores from NNUEBot.batchEvaluateRoot are now passed as root hints into AlphaBetaSearch, improving move ordering at ply 0 before the TT is populated. Hints are threaded immutably through SearchParams to satisfy the no-var constraint. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Janis Eccarius <eccariusjanis@gmail.com> Reviewed-on: #83	2026-06-24 20:09:28 +02:00
TeamCity	b4709b4a33	ci: bump version with Build-154	2026-06-24 17:55:44 +00:00
Janis Eccarius	9f9140cb58	fix: modified training pipeline	2026-06-24 19:37:26 +02:00
Janis	fa10852bc9	feat(official-bots): add Google Colab notebook for NNUE training (NCS-111) (#81) Adds python/NNUETraining.ipynb with five sections: - Setup: mount Drive, clone/update repo, install deps + Stockfish - Data: Option A (generate + label) or Option B (upload existing labeled.jsonl) - Train: standard epoch loop or burst mode (recommended for Colab free tier) - Export: convert best .pt checkpoint to .nbai via export.py - Download: pull .nbai and .pt to local machine via files.download Checkpoints and datasets are persisted to Google Drive so training survives session disconnects and can be resumed automatically. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Janis Eccarius <eccariusjanis@gmail.com> Reviewed-on: #81	2026-06-24 19:33:24 +02:00
Janis	44f376f032	feat(official-bots): implement king-relative (HalfKP) encoding in NNUE (NCS-109) (#80) Co-authored-by: Janis Eccarius <eccariusjanis@gmail.com> Reviewed-on: #80	2026-06-24 19:33:12 +02:00
TeamCity	7372867a82	ci: bump version with Build-152	2026-06-23 22:30:53 +00:00
Janis Eccarius	c3e7b82ae8	feat(analytics): add accuracy and blunder analysis job for Lichess data	2026-06-24 00:21:40 +02:00
TeamCity	e88b081947	ci: bump version with Build-151	2026-06-23 21:54:06 +00:00
Janis Eccarius	1b30c3be39	fix(official-bots): use ThreadLocalRandom in PolyglotBook for native image A stored java.util.Random field is reachable from BotController's static openingBook, so GraalVM baked it into the image heap and aborted the native build (Random in image heap has a cached seed). Use ThreadLocalRandom.current() at call time instead: no stored instance, nothing in the image heap, still thread-safe. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-23 23:42:15 +02:00
Janis Eccarius	f8ca95af3c	refactor(official-bots): use java.util.Random in PolyglotBook scala.util.Random delegates to a shared global java.util.Random, a contention point across concurrent bot games. Use a per-book java.util.Random instance instead. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-23 23:34:38 +02:00
TeamCity	4a50db0721	ci: bump version with Build-150	2026-06-23 21:27:19 +00:00
Janis Eccarius	260db25803	feat(official-bots): activate opening book in expert bot (native-safe) Load the Polyglot opening book as a classpath resource and wire it into the expert HybridBot. Previously the bot supported Option[PolyglotBook] but BotController passed None, so the book was never used. PolyglotBook.fromResource reads via getResourceAsStream so the book is embedded in the GraalVM native image instead of read from the filesystem (FileInputStream), so no file needs mounting into the pod. The filesystem apply(path) factory is kept for tests. Moved codekiddy.bin into resources as opening_book.bin. Dropped the per-probe debug println. NCS-43 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-23 23:17:52 +02:00
TeamCity	80e1cc258b	ci: bump version with Build-149	2026-06-23 21:08:35 +00:00
Janis	bfc46723e6	fix(official-bots): derive tournament game color from game endpoint (#79) Tournament-server reports wrong color in pairings (everyone white), so auto-joined games could play with an inverted color and never move on their real turn. The game endpoint white/black ids are correct, so the poll loop now derives our color from it, falling back to the passed-in color. Both stream and auto-join entry paths are now immune to the bug. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Janis Eccarius <eccariusjanis@gmail.com> Reviewed-on: #79	2026-06-23 22:58:09 +02:00
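
The memory figure and worker rule in commit e2b4342f60 can be checked with a short sketch. This is not the code from train.py: the function names `dense_batch_bytes`, `loader_workers`, and `loader_kwargs` are hypothetical, and the 4-byte (float32) element size is an assumption chosen because it reproduces the ~6.4 GB/batch quoted in the commit message. Only the feature width, the batch sizes, the `min(4, cpu_count)` rule, and the `NNUE_LOADER_WORKERS` override come from that message.

```python
import os

FEATURE_DIM = 98304     # dense HalfKP input width, from the commit message
BYTES_PER_VALUE = 4     # assumption: features are materialised as float32


def dense_batch_bytes(batch_size, feature_dim=FEATURE_DIM, bytes_per_value=BYTES_PER_VALUE):
    """Host memory for one dense feature batch, in bytes."""
    return batch_size * feature_dim * bytes_per_value


def loader_workers(env=None, cpu_count=None):
    """Pick a DataLoader worker count: env override first, else min(4, CPUs)."""
    env = os.environ if env is None else env
    override = env.get("NNUE_LOADER_WORKERS")
    if override is not None:
        return max(0, int(override))  # clamp negatives to 0 (hypothetical choice)
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return min(4, cpus)


def loader_kwargs(num_workers):
    """Keyword arguments for a DataLoader; persistent workers need num_workers > 0."""
    return {"num_workers": num_workers, "persistent_workers": num_workers > 0}


if __name__ == "__main__":
    for batch in (16384, 4096):
        print(batch, round(dense_batch_bytes(batch) / 1e9, 2), "GB")
    print(loader_kwargs(loader_workers()))
```

Under these assumptions a batch of 16384 costs about 6.44 GB and a batch of 4096 about 1.61 GB. Each worker process can hold its own prefetched batches, so lowering both the worker count and the batch size reduces peak host memory.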