Compare commits

...

7 Commits

Author SHA1 Message Date
TeamCity 7372867a82 ci: bump version with Build-152 2026-06-23 22:30:53 +00:00
Janis Eccarius c3e7b82ae8 feat(analytics): add accuracy and blunder analysis job for Lichess data
Build & Test (NowChessSystems) TeamCity build finished
2026-06-24 00:21:40 +02:00
TeamCity e88b081947 ci: bump version with Build-151 2026-06-23 21:54:06 +00:00
Janis Eccarius 1b30c3be39 fix(official-bots): use ThreadLocalRandom in PolyglotBook for native image
A stored java.util.Random field is reachable from BotController's static
openingBook, so GraalVM baked it into the image heap and aborted the
native build (Random in image heap has a cached seed). Use
ThreadLocalRandom.current() at call time instead — no stored instance,
nothing in the image heap, still thread-safe.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 23:42:15 +02:00
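
The fix described in this commit can be sketched roughly as follows. This is a minimal illustration, not the actual `PolyglotBook` code: `WeightedMove` and `BookPick.pick` are hypothetical names, and the weighted pick is only an assumed shape for the book lookup. What matters is that `ThreadLocalRandom.current()` is resolved inside the method, so no `Random` instance is held in a field that a static initializer could pull into the image heap.

```scala
import java.util.concurrent.ThreadLocalRandom

// Hypothetical stand-in for a book entry: a move plus its weight.
final case class WeightedMove(move: String, weight: Int)

object BookPick:
  /** Weighted random choice. The RNG is fetched per call, so nothing random is stored in a field. */
  def pick(entries: Seq[WeightedMove]): Option[String] =
    val total = entries.map(_.weight).sum
    if total <= 0 then None
    else
      // Resolved at call time on the calling thread; no shared instance, no stored seed.
      val roll = ThreadLocalRandom.current().nextInt(total)
      // Upper bound (exclusive) of each entry's slice of the range [0, total).
      val upperBounds = entries.scanLeft(0)(_ + _.weight).tail
      entries.zip(upperBounds).collectFirst { case (e, upper) if roll < upper => e.move }
```

Because the generator is never stored, a `static`-reachable book object carries no RNG state at build time.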
Janis Eccarius f8ca95af3c refactor(official-bots): use java.util.Random in PolyglotBook
scala.util.Random delegates to a shared global java.util.Random, a
contention point across concurrent bot games. Use a per-book
java.util.Random instance instead.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 23:34:38 +02:00
TeamCity 4a50db0721 ci: bump version with Build-150 2026-06-23 21:27:19 +00:00
Janis Eccarius 260db25803 feat(official-bots): activate opening book in expert bot (native-safe)
Load the Polyglot opening book as a classpath resource and wire it into
the expert HybridBot. Previously the bot supported Option[PolyglotBook]
but BotController passed None, so the book was never used.

PolyglotBook.fromResource reads via getResourceAsStream so the book is
embedded in the GraalVM native image instead of read from the filesystem
(FileInputStream) — no file needs mounting into the pod. The filesystem
apply(path) factory is kept for tests. Moved codekiddy.bin into
resources as opening_book.bin. Dropped the per-probe debug println.

NCS-43

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 23:17:52 +02:00
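
As a rough sketch of the resource-based loading this commit describes: the book is read through `getResourceAsStream` rather than from a file path, then decoded as fixed-size records. `BookResource`, `BookEntry`, `read` and `parse` are hypothetical names, not the real `PolyglotBook.fromResource`. The 16-byte big-endian record layout (8-byte key, 2-byte move, 2-byte weight, 4-byte learn) is the standard Polyglot book format.

```scala
import java.nio.ByteBuffer

// Hypothetical decoded record; the "learn" field is skipped.
final case class BookEntry(key: Long, move: Int, weight: Int)

object BookResource:
  /** Reads a classpath resource fully; None when the resource is not on the classpath. */
  def read(name: String): Option[Array[Byte]] =
    Option(getClass.getResourceAsStream(name)).map { in =>
      try in.readAllBytes()
      finally in.close()
    }

  /** Decodes 16-byte Polyglot records; trailing partial records are ignored. */
  def parse(bytes: Array[Byte]): Vector[BookEntry] =
    val buf = ByteBuffer.wrap(bytes) // ByteBuffer is big-endian by default
    Vector.fill(bytes.length / 16) {
      val key = buf.getLong()
      val move = buf.getShort() & 0xffff   // treat as unsigned 16-bit
      val weight = buf.getShort() & 0xffff // treat as unsigned 16-bit
      buf.getInt()                         // learn field, unused here
      BookEntry(key, move, weight)
    }
```

A caller would combine the two, for example `BookResource.read("/opening_book.bin").map(BookResource.parse)`, where the resource name follows the file name mentioned in the commit message.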
10 changed files with 712 additions and 38 deletions
+17
@@ -81,3 +81,20 @@
* **analytics:** upgrade Spark to 4.0.3 — 3.5.x has no official Docker image ([46af115](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/46af1154de34a8596cb6cb28c6fad7aba90f597c))
* **analytics:** write decompressed PGN to shared PVC path for executor access ([a268a9a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a268a9acb7ba190c76e996ccf3ea3bd00e5cec92))
## (2026-06-23)
### Features
* **analytics:** add 7 new Spark analytics jobs and extend GameSource ([8e17c14](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/8e17c14dff740cd115011dfbf17de35083b8fe46))
* **analytics:** add accuracy and blunder analysis job for Lichess data ([c3e7b82](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c3e7b82ae806adf5713ce4d267c1155e73a40ff5))
* **analytics:** add Dockerfile, CI workflow, and stable jar name for K8s deployment ([95215b6](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/95215b6a420fd526df1aa395f9b087556c8ad03b))
* **analytics:** add PostgreSQL JDBC write-back to all four batch jobs ([0e0ea4c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/0e0ea4c9893c6efed52e633e55d05ab3ed004502))
* **analytics:** add Spark batch analytics module ([259b3bb](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/259b3bbb24c0f23326269b93f4b3c84012f727cd))
* **analytics:** add Structured Streaming, MLlib clustering, GraphX jobs ([e1d80b9](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/e1d80b9331666feea191b1fd08aa762f3581c918))
* **analytics:** always write results to PostgreSQL regardless of input source ([da0e6d1](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/da0e6d1ee2d391ecb6291396f82471eb51b1b25e))
* **official-bots:** park expert bot on tournament server at startup ([#76](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/76)) ([751a58b](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/751a58b6061f7434115e229a7661894c76768bc2))
### Bug Fixes
* **analytics:** upgrade Spark to 4.0.3 — 3.5.x has no official Docker image ([46af115](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/46af1154de34a8596cb6cb28c6fad7aba90f597c))
* **analytics:** write decompressed PGN to shared PVC path for executor access ([a268a9a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a268a9acb7ba190c76e996ccf3ea3bd00e5cec92))
@@ -0,0 +1,191 @@
package de.nowchess.analytics

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions as F

/** Per-move accuracy & blunder analysis mined from Lichess `[%eval ...]` move annotations.
  *
  * Unlike the flat single-`groupBy` summaries (opening rates, colour advantage), this job reconstructs the *quality of
  * every move* from the engine evaluations Lichess embeds in the movetext (`{ [%eval 0.24] }`, mate scores `[%eval
  * #-3]`) and turns them into the same accuracy signals lichess.org surfaces: average centipawn loss (ACPL), and counts
  * of inaccuracies / mistakes / blunders.
  *
  * Pipeline (all Spark SQL string/array functions + window funcs — no UDFs, Catalyst-friendly):
  *   1. Keep only games carrying `[%eval` comments.
  *   2. `regexp_extract_all` pulls every eval in ply order; mate scores collapse to ±10 pawns, normal evals are clamped
  *      to ±10 so a single huge swing cannot dominate the mean. All evals are White-POV pawns.
  *   3. `posexplode` → one row per ply; a per-game window `lag` gives the eval *before* the move.
  *   4. Centipawn loss for the side that moved = how much the eval moved against them (white wants it up, black down),
  *      floored at 0 and scaled to centipawns.
  *   5. Roll up to (game, side): ACPL + inaccuracy(≥50cp) / mistake(≥100cp) / blunder(≥200cp) counts, tagged with that
  *      side's Elo and whether they won.
  *
  * Outputs (Parquet + CSV + JDBC):
  *   - `accuracy_by_rating` — ACPL, avg blunders/mistakes/inaccuracies per game and win-rate, per Elo band. Shows how
  *     move quality scales with rating.
  *   - `blunder_outcome` — win-rate bucketed by number of blunders in the game. Quantifies "one blunder costs you the
  *     game".
  *
  * Requires the eval-annotated Lichess dump (`NOWCHESS_PGN_PATH` → an evals dump); JDBC games carry no per-move evals.
  */
object AccuracyBlunderJob:

  def main(args: Array[String]): Unit =
    val jdbcUrl = sys.env.getOrElse("NOWCHESS_JDBC_URL", "jdbc:postgresql://localhost:5432/nowchess")
    val dbUser = sys.env.getOrElse("NOWCHESS_DB_USER", "nowchess")
    val dbPass = sys.env.getOrElse("NOWCHESS_DB_PASS", "nowchess")
    val outputDir = if args.length > 0 then args(0) else "/tmp/nowchess-accuracy"

    val spark = SparkSession
      .builder()
      .appName("NowChess Accuracy & Blunders")
      .getOrCreate()

    run(spark, jdbcUrl, dbUser, dbPass, outputDir)
    spark.stop()

  def run(spark: SparkSession, jdbcUrl: String, dbUser: String, dbPass: String, outputDir: String): Unit =
    val games = GameSource
      .loadExtended(spark, jdbcUrl, dbUser, dbPass)
      .select("pgn", "result", "white_elo", "black_elo")
      .filter(F.col("result").isNotNull.and(F.col("pgn").contains("[%eval")))
      .withColumn("game_id", F.monotonically_increasing_id())

    // White-POV pawn evals in ply order; mate → ±10, normal evals clamped to ±10.
    val evalStrs = F.expr("""regexp_extract_all(pgn, '\\[%eval ([^\\]]+)\\]', 1)""")
    val evalCps = F.expr(
      "transform(eval_strs, x -> CASE " +
        "WHEN x LIKE '#-%' THEN -10.0 " +
        "WHEN x LIKE '#%' THEN 10.0 " +
        "ELSE greatest(-10.0, least(10.0, cast(x as double))) END)",
    )

    val withEvals = games
      .withColumn("eval_strs", evalStrs)
      .withColumn("eval_cp", evalCps)
      .filter(F.size(F.col("eval_cp")) >= 2)

    val plies = withEvals.select(
      F.col("game_id"),
      F.col("result"),
      F.col("white_elo"),
      F.col("black_elo"),
      F.posexplode(F.col("eval_cp")).as(Seq("ply", "eval_after")),
    )

    val byGame = Window.partitionBy("game_id").orderBy("ply")
    val mover = F.when(F.col("ply") % 2 === 0, "white").otherwise("black")
    val evalBefore = F.coalesce(F.lag("eval_after", 1).over(byGame), F.lit(0.15))
    val cpl = F.greatest(
      F.lit(0.0),
      F.when(F.col("mover") === "white", evalBefore - F.col("eval_after"))
        .otherwise(F.col("eval_after") - evalBefore),
    ) * 100

    val moves = plies
      .withColumn("mover", mover)
      .withColumn("cpl", cpl)

    val perSide = moves
      .groupBy("game_id", "mover", "result", "white_elo", "black_elo")
      .agg(
        F.round(F.avg("cpl"), 1).as("acpl"),
        F.sum(F.when(F.col("cpl") >= 200, 1).otherwise(0)).as("blunders"),
        F.sum(F.when(F.col("cpl") >= 100 && F.col("cpl") < 200, 1).otherwise(0)).as("mistakes"),
        F.sum(F.when(F.col("cpl") >= 50 && F.col("cpl") < 100, 1).otherwise(0)).as("inaccuracies"),
      )
      .withColumn(
        "self_elo",
        F.when(F.col("mover") === "white", F.col("white_elo")).otherwise(F.col("black_elo")),
      )
      .withColumn("won", F.when(F.col("mover") === F.col("result"), 1).otherwise(0))

    writeAccuracyByRating(perSide, jdbcUrl, dbUser, dbPass, outputDir)
    writeBlunderOutcome(perSide, jdbcUrl, dbUser, dbPass, outputDir)

  private def writeAccuracyByRating(
      perSide: org.apache.spark.sql.DataFrame,
      jdbcUrl: String,
      dbUser: String,
      dbPass: String,
      outputDir: String,
  ): Unit =
    val elo = F.col("self_elo")
    val band = F
      .when(elo < 1200, "<1200")
      .when(elo < 1500, "1200–1499")
      .when(elo < 1800, "1500–1799")
      .when(elo < 2100, "1800–2099")
      .otherwise("2100+")
    val bandOrder = F
      .when(elo < 1200, 1)
      .when(elo < 1500, 2)
      .when(elo < 1800, 3)
      .when(elo < 2100, 4)
      .otherwise(5)

    val stats = perSide
      .filter(elo.isNotNull)
      .withColumn("band", band)
      .withColumn("band_order", bandOrder)
      .groupBy("band", "band_order")
      .agg(
        F.count("*").as("player_games"),
        F.round(F.avg("acpl"), 1).as("avg_acpl"),
        F.round(F.avg("blunders"), 2).as("avg_blunders"),
        F.round(F.avg("mistakes"), 2).as("avg_mistakes"),
        F.round(F.avg("inaccuracies"), 2).as("avg_inaccuracies"),
        F.round(F.avg("won"), 3).as("win_rate"),
      )
      .orderBy(F.asc("band_order"))
      .drop("band_order")

    write(stats, outputDir, "accuracy_by_rating", jdbcUrl, dbUser, dbPass, "analytics_accuracy_by_rating")

  private def writeBlunderOutcome(
      perSide: org.apache.spark.sql.DataFrame,
      jdbcUrl: String,
      dbUser: String,
      dbPass: String,
      outputDir: String,
  ): Unit =
    val b = F.col("blunders")
    val bucket = F.when(b === 0, "0").when(b === 1, "1").when(b === 2, "2").otherwise("3+")
    val order = F.when(b === 0, 0).when(b === 1, 1).when(b === 2, 2).otherwise(3)

    val stats = perSide
      .withColumn("blunder_bucket", bucket)
      .withColumn("bucket_order", order)
      .groupBy("blunder_bucket", "bucket_order")
      .agg(
        F.count("*").as("player_games"),
        F.round(F.avg("won"), 3).as("win_rate"),
        F.round(F.avg("acpl"), 1).as("avg_acpl"),
      )
      .orderBy(F.asc("bucket_order"))
      .drop("bucket_order")

    write(stats, outputDir, "blunder_outcome", jdbcUrl, dbUser, dbPass, "analytics_blunder_outcome")

  private def write(
      df: org.apache.spark.sql.DataFrame,
      outputDir: String,
      name: String,
      jdbcUrl: String,
      dbUser: String,
      dbPass: String,
      table: String,
  ): Unit =
    df.write.mode("overwrite").parquet(s"$outputDir/$name")
    df.write.mode("overwrite").option("header", "true").csv(s"$outputDir/${name}_csv")
    if !GameSource.isPgnMode then
      df.write
        .mode("overwrite")
        .format("jdbc")
        .option("url", jdbcUrl)
        .option("dbtable", table)
        .option("user", dbUser)
        .option("password", dbPass)
        .option("driver", "org.postgresql.Driver")
        .save()
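
To make the centipawn-loss arithmetic of `AccuracyBlunderJob` easier to follow, here is a small Spark-free sketch of the same per-move logic on a single PGN string. `CplSketch` and its methods are hypothetical helpers written for illustration; they mirror the clamping, the 0.15 starting eval and the 50/100/200 cp thresholds from the job above, but are not part of the PR.

```scala
object CplSketch:
  private val EvalTag = """\[%eval ([^\]]+)\]""".r

  /** White-POV pawn evals in ply order; mate scores collapse to ±10, other evals are clamped to ±10. */
  def evals(pgn: String): Vector[Double] =
    EvalTag.findAllMatchIn(pgn).map(_.group(1)).map { s =>
      if s.startsWith("#-") then -10.0
      else if s.startsWith("#") then 10.0
      else math.max(-10.0, math.min(10.0, s.toDouble))
    }.toVector

  /** Centipawn loss per ply. Even plies are White's moves, odd plies Black's. */
  def cpl(evals: Vector[Double], startEval: Double = 0.15): Vector[Double] =
    // Pair each eval with the eval before it; the first ply uses the assumed starting eval.
    (startEval +: evals).zip(evals).zipWithIndex.map { case ((before, after), ply) =>
      val against = if ply % 2 == 0 then before - after else after - before
      math.max(0.0, against) * 100
    }

  /** Same bands the job aggregates: blunder ≥ 200, mistake ≥ 100, inaccuracy ≥ 50. */
  def classify(loss: Double): String =
    if loss >= 200 then "blunder"
    else if loss >= 100 then "mistake"
    else if loss >= 50 then "inaccuracy"
    else "ok"
```

For a game whose evals read 0.3, 2.5, `#-3`, White's first move loses nothing (the eval rose), Black's reply loses about 220 cp, and White's second move loses 1250 cp because the mate score is treated as −10 pawns.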
@@ -0,0 +1,199 @@
package de.nowchess.analytics

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions as F

/** Time-management & clock-pressure analysis mined from Lichess `[%clk ...]` move annotations.
  *
  * Lichess records each player's remaining clock after every move (`{ [%clk 0:02:31] }`). This job reconstructs
  * per-move thinking time and remaining-time from those stamps to answer questions the existing time-control summary
  * cannot: how long do players actually think, how often do they fall into time scrambles (<10 s left), how often do
  * they flag (lose on time), and does burning the clock correlate with winning?
  *
  * Pipeline (Spark SQL string/array funcs + window funcs — no UDFs):
  *   1. `regexp_extract_all` pulls every `h:mm:ss` clock in ply order, converted to seconds.
  *   2. `posexplode` → one row per ply; even plies are White's clock, odd plies Black's.
  *   3. A per-(game,side) window `lag` gives the same side's previous clock; the difference is that move's thinking time.
  *      Remaining clock <10 s marks a time-scramble move.
  *   4. Roll up to (game, side): avg move time, scramble fraction, min clock, Elo, win flag, and whether the side lost on
  *      time (`Termination "Time forfeit"`).
  *
  * Outputs (Parquet + CSV + JDBC):
  *   - `clock_by_rating` — avg move time, scramble fraction, flag-loss rate and win-rate per Elo band.
  *   - `scramble_outcome` — win-rate bucketed by how much of the game was played in time-scramble. Quantifies the cost of
  *     time trouble.
  *
  * Requires a clock-annotated Lichess dump (`NOWCHESS_PGN_PATH`).
  */
object ClockPressureJob:

  def main(args: Array[String]): Unit =
    val jdbcUrl = sys.env.getOrElse("NOWCHESS_JDBC_URL", "jdbc:postgresql://localhost:5432/nowchess")
    val dbUser = sys.env.getOrElse("NOWCHESS_DB_USER", "nowchess")
    val dbPass = sys.env.getOrElse("NOWCHESS_DB_PASS", "nowchess")
    val outputDir = if args.length > 0 then args(0) else "/tmp/nowchess-clock-pressure"

    val spark = SparkSession
      .builder()
      .appName("NowChess Clock Pressure")
      .getOrCreate()

    run(spark, jdbcUrl, dbUser, dbPass, outputDir)
    spark.stop()

  def run(spark: SparkSession, jdbcUrl: String, dbUser: String, dbPass: String, outputDir: String): Unit =
    val games = GameSource
      .loadExtended(spark, jdbcUrl, dbUser, dbPass)
      .select("pgn", "result", "white_elo", "black_elo", "termination")
      .filter(F.col("result").isNotNull.and(F.col("pgn").contains("[%clk")))
      .withColumn("game_id", F.monotonically_increasing_id())

    val clkStrs = F.expr("""regexp_extract_all(pgn, '\\[%clk ([^\\]]+)\\]', 1)""")
    // "h:mm:ss" → seconds.
    val clkSecs = F.expr(
      "transform(clk_strs, x -> " +
        "cast(split(x, ':')[0] as double) * 3600 + " +
        "cast(split(x, ':')[1] as double) * 60 + " +
        "cast(split(x, ':')[2] as double))",
    )

    val withClk = games
      .withColumn("clk_strs", clkStrs)
      .withColumn("clk_sec", clkSecs)
      .filter(F.size(F.col("clk_sec")) >= 4)

    val plies = withClk.select(
      F.col("game_id"),
      F.col("result"),
      F.col("white_elo"),
      F.col("black_elo"),
      F.col("termination"),
      F.posexplode(F.col("clk_sec")).as(Seq("ply", "clk_after")),
    )

    val mover = F.when(F.col("ply") % 2 === 0, "white").otherwise("black")
    val bySide = Window.partitionBy("game_id", "mover").orderBy("ply")
    val moveTime = F.lag("clk_after", 1).over(bySide) - F.col("clk_after")

    val moves = plies
      .withColumn("mover", mover)
      .withColumn("move_time", moveTime)

    val perSide = moves
      .groupBy("game_id", "mover", "result", "white_elo", "black_elo", "termination")
      .agg(
        F.round(F.avg("move_time"), 1).as("avg_move_time"),
        F.count("*").as("moves"),
        F.round(F.min("clk_after"), 1).as("min_clk"),
        F.sum(F.when(F.col("clk_after") < 10, 1).otherwise(0)).as("scramble_moves"),
      )
      .withColumn("scramble_fraction", F.round(F.col("scramble_moves") / F.col("moves"), 3))
      .withColumn(
        "self_elo",
        F.when(F.col("mover") === "white", F.col("white_elo")).otherwise(F.col("black_elo")),
      )
      .withColumn("won", F.when(F.col("mover") === F.col("result"), 1).otherwise(0))
      .withColumn(
        "flag_loss",
        F.when(
          F.coalesce(F.col("termination"), F.lit("")).contains("Time forfeit") && F.col("won") === 0,
          1,
        ).otherwise(0),
      )

    writeClockByRating(perSide, jdbcUrl, dbUser, dbPass, outputDir)
    writeScrambleOutcome(perSide, jdbcUrl, dbUser, dbPass, outputDir)

  private def writeClockByRating(
      perSide: org.apache.spark.sql.DataFrame,
      jdbcUrl: String,
      dbUser: String,
      dbPass: String,
      outputDir: String,
  ): Unit =
    val elo = F.col("self_elo")
    val band = F
      .when(elo < 1200, "<1200")
      .when(elo < 1500, "1200–1499")
      .when(elo < 1800, "1500–1799")
      .when(elo < 2100, "1800–2099")
      .otherwise("2100+")
    val bandOrder = F
      .when(elo < 1200, 1)
      .when(elo < 1500, 2)
      .when(elo < 1800, 3)
      .when(elo < 2100, 4)
      .otherwise(5)

    val stats = perSide
      .filter(elo.isNotNull)
      .withColumn("band", band)
      .withColumn("band_order", bandOrder)
      .groupBy("band", "band_order")
      .agg(
        F.count("*").as("player_games"),
        F.round(F.avg("avg_move_time"), 1).as("avg_move_time_s"),
        F.round(F.avg("scramble_fraction"), 3).as("avg_scramble_fraction"),
        F.round(F.avg("flag_loss"), 3).as("flag_loss_rate"),
        F.round(F.avg("won"), 3).as("win_rate"),
      )
      .orderBy(F.asc("band_order"))
      .drop("band_order")

    write(stats, outputDir, "clock_by_rating", jdbcUrl, dbUser, dbPass, "analytics_clock_by_rating")

  private def writeScrambleOutcome(
      perSide: org.apache.spark.sql.DataFrame,
      jdbcUrl: String,
      dbUser: String,
      dbPass: String,
      outputDir: String,
  ): Unit =
    val sf = F.col("scramble_fraction")
    val bucket = F
      .when(sf === 0, "none")
      .when(sf < 0.05, "<5%")
      .when(sf < 0.20, "5–20%")
      .otherwise(">20%")
    val order = F
      .when(sf === 0, 0)
      .when(sf < 0.05, 1)
      .when(sf < 0.20, 2)
      .otherwise(3)

    val stats = perSide
      .withColumn("scramble_bucket", bucket)
      .withColumn("bucket_order", order)
      .groupBy("scramble_bucket", "bucket_order")
      .agg(
        F.count("*").as("player_games"),
        F.round(F.avg("won"), 3).as("win_rate"),
        F.round(F.avg("flag_loss"), 3).as("flag_loss_rate"),
      )
      .orderBy(F.asc("bucket_order"))
      .drop("bucket_order")

    write(stats, outputDir, "scramble_outcome", jdbcUrl, dbUser, dbPass, "analytics_scramble_outcome")

  private def write(
      df: org.apache.spark.sql.DataFrame,
      outputDir: String,
      name: String,
      jdbcUrl: String,
      dbUser: String,
      dbPass: String,
      table: String,
  ): Unit =
    df.write.mode("overwrite").parquet(s"$outputDir/$name")
    df.write.mode("overwrite").option("header", "true").csv(s"$outputDir/${name}_csv")
    if !GameSource.isPgnMode then
      df.write
        .mode("overwrite")
        .format("jdbc")
        .option("url", jdbcUrl)
        .option("dbtable", table)
        .option("user", dbUser)
        .option("password", dbPass)
        .option("driver", "org.postgresql.Driver")
        .save()
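
The clock arithmetic in `ClockPressureJob` can be illustrated without Spark on a single movetext string. The sketch below uses hypothetical helpers (`ClockSketch` and its methods) that follow the same rules as the job: `h:mm:ss` stamps become seconds, even plies belong to White, thinking time is the same side's previous clock minus its current clock, and a remaining clock under 10 s counts as a scramble move.

```scala
object ClockSketch:
  private val ClkTag = """\[%clk ([^\]]+)\]""".r

  /** "h:mm:ss" → seconds. */
  def toSeconds(clk: String): Double =
    clk.split(':') match
      case Array(h, m, s) => h.toDouble * 3600 + m.toDouble * 60 + s.toDouble
      case _              => throw new IllegalArgumentException(s"unexpected clock format: $clk")

  /** Remaining clocks in ply order. */
  def clocks(pgn: String): Vector[Double] =
    ClkTag.findAllMatchIn(pgn).map(m => toSeconds(m.group(1))).toVector

  /** One side's clocks only: side 0 = White (even plies), side 1 = Black (odd plies). */
  private def own(clocks: Vector[Double], side: Int): Vector[Double] =
    clocks.zipWithIndex.collect { case (c, ply) if ply % 2 == side => c }

  /** Thinking time per move; the side's first move has no predecessor and yields no value. */
  def moveTimes(clocks: Vector[Double], side: Int): Vector[Double] =
    val mine = own(clocks, side)
    mine.zip(mine.drop(1)).map { case (prev, cur) => prev - cur }

  /** Share of the side's moves made with under 10 s left (0.0 when the side has no moves). */
  def scrambleFraction(clocks: Vector[Double], side: Int): Double =
    val mine = own(clocks, side)
    if mine.isEmpty then 0.0 else mine.count(_ < 10).toDouble / mine.size
```

Note that with an increment a move can end with more time than it started, so a thinking time computed this way can be negative; the job above does not floor it either.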
@@ -0,0 +1,154 @@
package de.nowchess.analytics

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions as F

/** Smurf / sandbagging anomaly detection via population z-scores.
  *
  * Smurfs (strong players on fresh accounts) and sandbaggers leave a statistical signature: a win-rate, an upset-rate
  * (beating higher-rated opponents) and a self-Elo climb that sit far above the population norm. This job builds those
  * three features per player, standardises each against the whole player base, and flags the players whose combined
  * deviation is extreme.
  *
  * Features per player (from each game's own/opponent Elo):
  *   - win_rate — fraction of decisive results won
  *   - upset_rate — wins vs higher-rated opponents / games vs higher-rated opponents
  *   - elo_climb — max self-Elo − min self-Elo across their games (rapid rating gain)
  *
  * Standardisation uses a single unbounded window (`Window.partitionBy()`), i.e. mean/stddev over every qualifying
  * player, so z = (x − μ) / σ. The composite anomaly score sums the three z-scores. No UDFs — pure SQL aggregates +
  * window functions, so Catalyst plans the whole job.
  *
  * Outputs (Parquet + CSV + JDBC):
  *   - `anomaly_scores` — every qualifying player with features, z-scores and composite, ranked most-anomalous first.
  *   - `flagged_smurfs` — the suspicious subset (high composite, or the classic high-winrate / few-games / steep-climb
  *     profile).
  *
  * Meaningful only when Elo is present (Lichess dump); requires `minGames` (arg 1, default 15) to avoid small-sample
  * noise.
  */
object SmurfAnomalyJob:

  def main(args: Array[String]): Unit =
    val jdbcUrl = sys.env.getOrElse("NOWCHESS_JDBC_URL", "jdbc:postgresql://localhost:5432/nowchess")
    val dbUser = sys.env.getOrElse("NOWCHESS_DB_USER", "nowchess")
    val dbPass = sys.env.getOrElse("NOWCHESS_DB_PASS", "nowchess")
    val outputDir = if args.length > 0 then args(0) else "/tmp/nowchess-smurf-anomaly"
    val minGames = if args.length > 1 then args(1).toInt else 15

    val spark = SparkSession
      .builder()
      .appName("NowChess Smurf Anomaly Detection")
      .getOrCreate()

    run(spark, jdbcUrl, dbUser, dbPass, outputDir, minGames)
    spark.stop()

  def run(
      spark: SparkSession,
      jdbcUrl: String,
      dbUser: String,
      dbPass: String,
      outputDir: String,
      minGames: Int,
  ): Unit =
    val games = GameSource
      .loadExtended(spark, jdbcUrl, dbUser, dbPass)
      .select("white_id", "black_id", "result", "white_elo", "black_elo")
      .filter(F.col("result").isNotNull)

    val asWhite = games.select(
      F.col("white_id").as("player_id"),
      F.col("white_elo").as("self_elo"),
      F.col("black_elo").as("opp_elo"),
      F.when(F.col("result") === "white", 1).otherwise(0).as("won"),
    )
    val asBlack = games.select(
      F.col("black_id").as("player_id"),
      F.col("black_elo").as("self_elo"),
      F.col("white_elo").as("opp_elo"),
      F.when(F.col("result") === "black", 1).otherwise(0).as("won"),
    )

    val playerGames = asWhite
      .union(asBlack)
      .filter(F.col("self_elo").isNotNull.and(F.col("opp_elo").isNotNull))

    val higher = F.col("opp_elo") > F.col("self_elo")

    val features = playerGames
      .groupBy("player_id")
      .agg(
        F.count("*").as("total_games"),
        F.round(F.avg("won"), 3).as("win_rate"),
        F.round(F.avg("self_elo"), 0).as("avg_self_elo"),
        (F.max("self_elo") - F.min("self_elo")).as("elo_climb"),
        F.sum(F.when(higher, 1).otherwise(0)).as("vs_higher"),
        F.sum(F.when(higher && F.col("won") === 1, 1).otherwise(0)).as("upsets"),
      )
      .filter(F.col("total_games") >= minGames)
      .withColumn("upset_rate", F.round(F.col("upsets") / F.greatest(F.col("vs_higher"), F.lit(1)), 3))

    val all = Window.partitionBy()

    def z(col: String): org.apache.spark.sql.Column =
      val mean = F.avg(col).over(all)
      val std = F.stddev(col).over(all)
      F.round((F.col(col) - mean) / F.when(std === 0 || std.isNull, F.lit(1.0)).otherwise(std), 2)

    val scored = features
      .withColumn("z_win_rate", z("win_rate"))
      .withColumn("z_upset_rate", z("upset_rate"))
      .withColumn("z_elo_climb", z("elo_climb"))
      .withColumn(
        "anomaly_score",
        F.round(F.col("z_win_rate") + F.col("z_upset_rate") + F.col("z_elo_climb"), 2),
      )
      .withColumn(
        "flagged",
        (F.col("anomaly_score") >= 4.0)
          .or(F.col("win_rate") >= 0.8 && F.col("total_games") < 50 && F.col("elo_climb") >= 300),
      )

    val ordered = scored
      .select(
        "player_id",
        "total_games",
        "win_rate",
        "avg_self_elo",
        "elo_climb",
        "upset_rate",
        "z_win_rate",
        "z_upset_rate",
        "z_elo_climb",
        "anomaly_score",
        "flagged",
      )
      .orderBy(F.desc("anomaly_score"))

    write(ordered, outputDir, "anomaly_scores", jdbcUrl, dbUser, dbPass, "analytics_smurf_anomaly")

    val flagged = ordered.filter(F.col("flagged") === true)
    write(flagged, outputDir, "flagged_smurfs", jdbcUrl, dbUser, dbPass, "analytics_flagged_smurfs")

  private def write(
      df: org.apache.spark.sql.DataFrame,
      outputDir: String,
      name: String,
      jdbcUrl: String,
      dbUser: String,
      dbPass: String,
      table: String,
  ): Unit =
    df.write.mode("overwrite").parquet(s"$outputDir/$name")
    df.write.mode("overwrite").option("header", "true").csv(s"$outputDir/${name}_csv")
    if !GameSource.isPgnMode then
      df.write
        .mode("overwrite")
        .format("jdbc")
        .option("url", jdbcUrl)
        .option("dbtable", table)
        .option("user", dbUser)
        .option("password", dbPass)
        .option("driver", "org.postgresql.Driver")
        .save()
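
The standardisation step in `SmurfAnomalyJob` can be checked by hand with a small Spark-free sketch. `ZScoreSketch` is a hypothetical helper, not part of the PR. It assumes the sample standard deviation (n − 1 denominator), which is what Spark's `stddev` computes, and it reuses the job's fallback of dividing by 1.0 when the deviation is zero or undefined. Unlike the job, it does not round intermediate values.

```scala
object ZScoreSketch:
  /** z = (x − mean) / stddev over the whole population; divisor falls back to 1.0 when stddev is 0 or undefined. */
  def zScores(xs: Vector[Double]): Vector[Double] =
    if xs.isEmpty then Vector.empty
    else
      val mean = xs.sum / xs.size
      // Sample variance; with a single value it is undefined, treated here like a zero deviation.
      val variance =
        if xs.size > 1 then xs.map(x => (x - mean) * (x - mean)).sum / (xs.size - 1) else 0.0
      val std = math.sqrt(variance)
      val divisor = if std == 0.0 then 1.0 else std
      xs.map(x => (x - mean) / divisor)

  /** Composite score per player: the sum of that player's z-scores across all feature columns. */
  def composite(features: Vector[Vector[Double]]): Vector[Double] =
    features.map(zScores).transpose.map(_.sum)
```

Each inner vector passed to `composite` is one feature column (for example win rate, upset rate, Elo climb) with one value per player, so the result has one score per player.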
+1 -1
@@ -1,3 +1,3 @@
 MAJOR=0
-MINOR=7
+MINOR=8
 PATCH=0
+97
@@ -890,3 +890,100 @@
### Reverts
* Revert "refactor: update metrics paths formatting in application.yml for clarity" ([3870566](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/38705663498d5f47c40dafe2f26198589ede8656))
## (2026-06-23)
### Features
* add initialization metrics for various services ([d438e97](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d438e97f32bdde0bfc63c1b4a8cc810cdd093166))
* add OpenTelemetry trace configuration with parentbased sampler ([3904d5a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3904d5ad8ad4930ddee65287a7bfab785a6148f5))
* **analytics:** add Spark batch analytics module ([#70](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/70)) ([39f1657](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/39f1657e1db6e84889af338c43be8cb5c03c3ec3))
* **config:** update application.yml for PostgreSQL and remove staging/production configurations ([2404e61](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2404e6164c3b50ffccbea5238d636060d6abe4d6))
* **config:** update application.yml for staging and production environments ([6113432](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/6113432a14c476a3a0dfc0d449e17d023697f2ba))
* configure logging and add OpenTelemetry support ([#49](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/49)) ([d57c488](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d57c4886612d1d92da0e1b79209fc83e6ef537a1))
* **docker:** add .dockerignore and .gitignore files for build exclusions ([c987d8e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c987d8e258c0e6c4cfbdaa8381c64c410d7a2b83))
* **docker:** add Dockerfiles for building Quarkus application in native and JVM modes ([3f2d2bb](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f2d2bb4c97fa8cddba66e1da4427c54236dfeed))
* **docker:** add Dockerfiles for Quarkus application in JVM and native modes ([34b9933](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/34b993304670cf2aa62cd2f6460cee7b9864b08e))
* **events:** migrate game-creation and bot flows to Redis Streams NCS-89 ([#62](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/62)) ([a24924c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a24924c23057db3d700a75dbc4333557789cd991))
* NCS-78 Add Traceability to the Applications ([#46](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/46)) ([649566e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/649566eb3fcf38f91c8896a739f74ea318af312d))
* NCS-78 Add Traceability to the Applications ([#47](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/47)) ([87dfc6c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/87dfc6c2bcce7f7d58fc641bd8d468a2e584c108))
* NCS-82 add Swiss-system tournament module ([#55](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/55)) ([c5661de](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c5661de4a0ebf4b33211f5a391840dcf744656b7))
* **official-bots:** activate opening book in expert bot (native-safe) ([260db25](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/260db25803ec55ce99e55782791eabdc190dfed4))
* **official-bots:** consume GameOver stream for bot cleanup ([#67](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/67)) ([db9d153](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/db9d1533912f4b41c4d1ca80ccffdde5d23d6ff6))
* **official-bots:** make HybridBot veto actionable and use it for expert ([1df29cf](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1df29cf3a6e21af3f396b2b7a6da67d978f941ae))
* **official-bots:** park expert bot on tournament server at startup ([#75](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/75)) ([30295a4](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/30295a4bb95855ee8261c92278bb9ebc80ee12ee))
* **official-bots:** resolve tournament bot token from Redis and account service ([386ddc5](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/386ddc5c19f8f893b16c6422aa5393b54c872e45))
* **tournament:** auto-join external tournaments and publish created ones ([#77](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/77)) ([9978b7e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/9978b7ea78eb658a225a461b9cd339386c0c14f3))
* **tournament:** federate tournaments across clusters with DB replication ([5b000a6](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5b000a6e5f04ea6770d1c7ab6bfdaded77a99172))
* **tournament:** seed external server registry from env var on startup ([845dc9c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/845dc9c2935c8bc1be42541dfaf31c9a861d3272))
* true-microservices ([#40](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/40)) ([5909242](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/590924254e8a2754de661a57a03e43f89ceb6299))
### Bug Fixes
* enable official bots to connect to external tournament server ([#71](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/71)) ([688d30e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/688d30e2b10026923372be5fca3c63eaaee2de2a))
* **official-bots:** configure JWT verification ([#72](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/72)) ([98c64fc](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/98c64fc0d56dc542beb31c75f4b9056d91de03cd))
* **official-bots:** correct parkOn path from /api/bots to /api/account/bots ([1be9949](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1be9949c0b5c6a1db535696620d77735050d6c93))
* **official-bots:** derive tournament game color from game endpoint ([#79](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/79)) ([bfc4672](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/bfc46723e615bb9b65f7f9bba5f53877c4f079a7))
* **official-bots:** discover tournament games by polling, not just the stream ([10113fd](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/10113fd0579b614d15870798d933bc9c495d2049))
* **official-bots:** make botToken optional, fall back to env, fix 502 status ([f43d193](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/f43d1930d80670d810c57b54eaa3789854fa082c))
* **official-bots:** NCS-70-auto-register official bots with account service ([#59](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/59)) ([7117a93](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/7117a93376272094d0b1a6abf2121254ce396684))
* **official-bots:** park on external tournament servers using correct endpoint and token ([3188241](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/31882417377468b41bbe3ff94506aa4928024450))
* **official-bots:** play games by polling state instead of NDJSON stream ([bfb15c7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/bfb15c7299bd471d5e064a577ed10af98e2ea90a))
* **official-bots:** play only own tournament games with correct color ([4651bb7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4651bb796f07a21bd013d9521b2dfe2e1078cebb))
* **official-bots:** prioritize Redis token over stale env var in joinTournament ([83dd2d4](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/83dd2d4335ca48eb3e5aa234a75367574276ba63))
* **official-bots:** register with tournament server directly to get correct token ([64b5d55](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/64b5d5567f110c2fe152558c7de275a1e0b30e21))
* **official-bots:** resolve per-difficulty bot token on tournament join ([fdf4c94](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/fdf4c94811d086996447bb4657fac1d9bd6e5a93))
* **official-bots:** resume tournaments already joined after restart ([285b73e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/285b73efbd6dd98cec410ade9eead9881d693a8f))
* **official-bots:** sync bots before token fetch on first startup after DB wipe ([b0ddb27](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b0ddb274d23bca8b1b3f691ce0d643f33e0b54cd))
### Reverts
* Revert "refactor: update metrics paths formatting in application.yml for clarity" ([3870566](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/38705663498d5f47c40dafe2f26198589ede8656))
## (2026-06-23)
### Features
* add initialization metrics for various services ([d438e97](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d438e97f32bdde0bfc63c1b4a8cc810cdd093166))
* add OpenTelemetry trace configuration with parentbased sampler ([3904d5a](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3904d5ad8ad4930ddee65287a7bfab785a6148f5))
* **analytics:** add Spark batch analytics module ([#70](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/70)) ([39f1657](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/39f1657e1db6e84889af338c43be8cb5c03c3ec3))
* **config:** update application.yml for PostgreSQL and remove staging/production configurations ([2404e61](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/2404e6164c3b50ffccbea5238d636060d6abe4d6))
* **config:** update application.yml for staging and production environments ([6113432](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/6113432a14c476a3a0dfc0d449e17d023697f2ba))
* configure logging and add OpenTelemetry support ([#49](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/49)) ([d57c488](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/d57c4886612d1d92da0e1b79209fc83e6ef537a1))
* **docker:** add .dockerignore and .gitignore files for build exclusions ([c987d8e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c987d8e258c0e6c4cfbdaa8381c64c410d7a2b83))
* **docker:** add Dockerfiles for building Quarkus application in native and JVM modes ([3f2d2bb](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/3f2d2bb4c97fa8cddba66e1da4427c54236dfeed))
* **docker:** add Dockerfiles for Quarkus application in JVM and native modes ([34b9933](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/34b993304670cf2aa62cd2f6460cee7b9864b08e))
* **events:** migrate game-creation and bot flows to Redis Streams NCS-89 ([#62](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/62)) ([a24924c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/a24924c23057db3d700a75dbc4333557789cd991))
* NCS-78 Add Traceability to the Applications ([#46](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/46)) ([649566e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/649566eb3fcf38f91c8896a739f74ea318af312d))
* NCS-78 Add Traceability to the Applications ([#47](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/47)) ([87dfc6c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/87dfc6c2bcce7f7d58fc641bd8d468a2e584c108))
* NCS-82 add Swiss-system tournament module ([#55](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/55)) ([c5661de](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/c5661de4a0ebf4b33211f5a391840dcf744656b7))
* **official-bots:** activate opening book in expert bot (native-safe) ([260db25](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/260db25803ec55ce99e55782791eabdc190dfed4))
* **official-bots:** consume GameOver stream for bot cleanup ([#67](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/67)) ([db9d153](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/db9d1533912f4b41c4d1ca80ccffdde5d23d6ff6))
* **official-bots:** make HybridBot veto actionable and use it for expert ([1df29cf](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1df29cf3a6e21af3f396b2b7a6da67d978f941ae))
* **official-bots:** park expert bot on tournament server at startup ([#75](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/75)) ([30295a4](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/30295a4bb95855ee8261c92278bb9ebc80ee12ee))
* **official-bots:** resolve tournament bot token from Redis and account service ([386ddc5](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/386ddc5c19f8f893b16c6422aa5393b54c872e45))
* **tournament:** auto-join external tournaments and publish created ones ([#77](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/77)) ([9978b7e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/9978b7ea78eb658a225a461b9cd339386c0c14f3))
* **tournament:** federate tournaments across clusters with DB replication ([5b000a6](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/5b000a6e5f04ea6770d1c7ab6bfdaded77a99172))
* **tournament:** seed external server registry from env var on startup ([845dc9c](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/845dc9c2935c8bc1be42541dfaf31c9a861d3272))
* true-microservices ([#40](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/40)) ([5909242](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/590924254e8a2754de661a57a03e43f89ceb6299))
### Bug Fixes
* enable official bots to connect to external tournament server ([#71](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/71)) ([688d30e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/688d30e2b10026923372be5fca3c63eaaee2de2a))
* **official-bots:** configure JWT verification ([#72](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/72)) ([98c64fc](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/98c64fc0d56dc542beb31c75f4b9056d91de03cd))
* **official-bots:** correct parkOn path from /api/bots to /api/account/bots ([1be9949](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1be9949c0b5c6a1db535696620d77735050d6c93))
* **official-bots:** derive tournament game color from game endpoint ([#79](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/79)) ([bfc4672](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/bfc46723e615bb9b65f7f9bba5f53877c4f079a7))
* **official-bots:** discover tournament games by polling, not just the stream ([10113fd](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/10113fd0579b614d15870798d933bc9c495d2049))
* **official-bots:** make botToken optional, fall back to env, fix 502 status ([f43d193](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/f43d1930d80670d810c57b54eaa3789854fa082c))
* **official-bots:** NCS-70-auto-register official bots with account service ([#59](https://git.janis-eccarius.de/NowChess/NowChessSystems/issues/59)) ([7117a93](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/7117a93376272094d0b1a6abf2121254ce396684))
* **official-bots:** park on external tournament servers using correct endpoint and token ([3188241](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/31882417377468b41bbe3ff94506aa4928024450))
* **official-bots:** play games by polling state instead of NDJSON stream ([bfb15c7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/bfb15c7299bd471d5e064a577ed10af98e2ea90a))
* **official-bots:** play only own tournament games with correct color ([4651bb7](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/4651bb796f07a21bd013d9521b2dfe2e1078cebb))
* **official-bots:** prioritize Redis token over stale env var in joinTournament ([83dd2d4](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/83dd2d4335ca48eb3e5aa234a75367574276ba63))
* **official-bots:** register with tournament server directly to get correct token ([64b5d55](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/64b5d5567f110c2fe152558c7de275a1e0b30e21))
* **official-bots:** resolve per-difficulty bot token on tournament join ([fdf4c94](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/fdf4c94811d086996447bb4657fac1d9bd6e5a93))
* **official-bots:** resume tournaments already joined after restart ([285b73e](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/285b73efbd6dd98cec410ade9eead9881d693a8f))
* **official-bots:** sync bots before token fetch on first startup after DB wipe ([b0ddb27](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/b0ddb274d23bca8b1b3f691ce0d643f33e0b54cd))
* **official-bots:** use ThreadLocalRandom in PolyglotBook for native image ([1b30c3b](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/1b30c3be393d25712c8743d3d9057207f8bbb67c))
### Reverts
* Revert "refactor: update metrics paths formatting in application.yml for clarity" ([3870566](https://git.janis-eccarius.de/NowChess/NowChessSystems/commit/38705663498d5f47c40dafe2f26198589ede8656))
@@ -1,17 +1,20 @@
 package de.nowchess.bot
 import de.nowchess.bot.bots.{ClassicalBot, HybridBot}
+import de.nowchess.bot.util.PolyglotBook
 import jakarta.enterprise.context.ApplicationScoped
 import org.jboss.logging.Logger
 object BotController:
   private val log = Logger.getLogger(classOf[BotController])
+  private val openingBook = PolyglotBook.fromResource("/opening_book.bin")
   private val bots: Map[String, Bot] = Map(
     "easy" -> ClassicalBot(BotDifficulty.Easy),
     "medium" -> ClassicalBot(BotDifficulty.Medium),
     "hard" -> ClassicalBot(BotDifficulty.Hard),
-    "expert" -> HybridBot(BotDifficulty.Expert, vetoReporter = log.debug(_)),
+    "expert" -> HybridBot(BotDifficulty.Expert, vetoReporter = log.debug(_), book = Some(openingBook)),
   )
   def getBot(name: String): Option[Bot] = bots.get(name.toLowerCase)
@@ -4,9 +4,9 @@ import de.nowchess.api.board.*
 import de.nowchess.api.game.GameContext
 import de.nowchess.api.move.{Move, MoveType, PromotionPiece}
-import java.io.{DataInputStream, FileInputStream}
+import java.io.{DataInputStream, FileInputStream, InputStream}
+import java.util.concurrent.ThreadLocalRandom
 import scala.collection.mutable
-import scala.util.Random
 /** Reads a Polyglot opening book (.bin file) and probes it for moves.
  *
@@ -16,24 +16,11 @@ import scala.util.Random
  * - weight: 2 bytes (Short) — move weight (higher = preferred)
  * - learn: 4 bytes (Int) — learning data (unused)
  */
-final class PolyglotBook(path: String):
+final class PolyglotBook private (entries: Map[Long, Vector[BookEntry]]):
-  private val entries: Map[Long, Vector[BookEntry]] =
-    try {
-      val r = loadBookFile(path)
-      println(s"Book loaded successfully. ${r.size} entries found.")
-      r
-    } catch
-      case e: Exception =>
-        println(s"Error loading book: $e")
-        // Gracefully fail: return empty map if book cannot be loaded
-        // This allows the bot to work even if the book file is missing
-        scala.collection.immutable.Map.empty
   /** Probe the book for a move in the given position. Returns a weighted random move, or None if not in book. */
   def probe(context: GameContext): Option[Move] =
     val hash = PolyglotHash.hash(context)
-    println(f"0x$hash%016X")
     entries.get(hash).flatMap { bookEntries =>
       if bookEntries.isEmpty then None
       else
@@ -41,24 +28,6 @@ final class PolyglotBook(path: String):
         decodeMove(entry.move, context)
     }
-  private def loadBookFile(path: String): Map[Long, Vector[BookEntry]] =
-    val input = DataInputStream(FileInputStream(path))
-    try
-      val result = mutable.Map[Long, Vector[BookEntry]]()
-      while input.available() > 0 do
-        val key = input.readLong()
-        val move = input.readShort()
-        val weight = input.readShort()
-        input.readInt() // learning data (unused)
-        val entry = BookEntry(key, move, weight)
-        result.updateWith(key) {
-          case Some(entries) => Some(entries :+ entry)
-          case None => Some(Vector(entry))
-        }
-      result.toMap
-    finally input.close()
   /** Decode a packed Polyglot move short into an Option[Move].
    *
    * Bit layout of the move Short:
@@ -124,7 +93,7 @@ final class PolyglotBook(path: String):
     if entries.length == 1 then entries.head
     else
       val totalWeight = entries.map(_.weight).sum
-      val pick = Random.nextInt(totalWeight.max(1)) // NOSONAR
+      val pick = ThreadLocalRandom.current().nextInt(totalWeight.max(1)) // NOSONAR
       @scala.annotation.tailrec
       def select(remaining: Int, idx: Int): BookEntry =
@@ -134,4 +103,48 @@ final class PolyglotBook(path: String):
       select(pick, 0)
+object PolyglotBook:
+  /** Load a book from a filesystem path. Fails gracefully to an empty book. */
+  def apply(path: String): PolyglotBook =
+    safeLoad(s"file $path")(FileInputStream(path))
+  /** Load a book from a classpath resource (native-image safe: the resource is embedded in the binary, so no file must
+    * be mounted into the pod).
+    */
+  def fromResource(name: String): PolyglotBook =
+    Option(getClass.getResourceAsStream(name)) match
+      case Some(stream) => safeLoad(s"resource $name")(stream)
+      case None =>
+        println(s"Error loading book: resource $name not found on classpath")
+        new PolyglotBook(Map.empty)
+  private def safeLoad(source: String)(stream: => InputStream): PolyglotBook =
+    try
+      val entries = parse(stream)
+      println(s"Book loaded successfully from $source. ${entries.size} entries found.")
+      new PolyglotBook(entries)
+    catch
+      case e: Exception =>
+        println(s"Error loading book from $source: $e")
+        new PolyglotBook(Map.empty)
+  private def parse(stream: InputStream): Map[Long, Vector[BookEntry]] =
+    val input = DataInputStream(stream)
+    try
+      val result = mutable.Map[Long, Vector[BookEntry]]()
+      while input.available() > 0 do
+        val key = input.readLong()
+        val move = input.readShort()
+        val weight = input.readShort()
+        input.readInt() // learning data (unused)
+        val entry = BookEntry(key, move, weight)
+        result.updateWith(key) {
+          case Some(entries) => Some(entries :+ entry)
+          case None => Some(Vector(entry))
+        }
+      result.toMap
+    finally input.close()
 private case class BookEntry(key: Long, move: Short, weight: Int)
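Review note: the body of `select` is elided between the hunks above, so the following is only a minimal sketch of how a weighted pick with `ThreadLocalRandom` could work, not the author's implementation. `Entry`, `WeightedPick`, and `weightedRandom` are hypothetical names introduced for illustration; the sketch assumes non-negative weights and a non-empty entry list.

```scala
import java.util.concurrent.ThreadLocalRandom

// Hypothetical stand-in for the book's entry type; only the weight matters here.
final case class Entry(move: String, weight: Int)

object WeightedPick:
  /** Walk the entries, subtracting weights until `pick` falls inside one entry's range.
    * The last entry is returned as a fallback so the walk always terminates.
    */
  def select(entries: Vector[Entry], pick: Int): Entry =
    require(entries.nonEmpty, "entries must not be empty")
    @scala.annotation.tailrec
    def loop(remaining: Int, idx: Int): Entry =
      if idx >= entries.length - 1 then entries(idx)
      else if remaining < entries(idx).weight then entries(idx)
      else loop(remaining - entries(idx).weight, idx + 1)
    loop(pick, 0)

  /** Draw the random number at call time; no Random instance is stored in a field. */
  def weightedRandom(entries: Vector[Entry]): Entry =
    val total = entries.map(_.weight).sum
    select(entries, ThreadLocalRandom.current().nextInt(total.max(1)))
```

Because `ThreadLocalRandom.current()` is resolved on each call, an object holding the entries keeps no generator state of its own, which is the property the commit message for 1b30c3b relies on for the native build.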
+1 -1
@@ -1,3 +1,3 @@
 MAJOR=0
-MINOR=34
+MINOR=36
 PATCH=0