NowChessSystems

Author	SHA1	Message	Date
TeamCity	97015cb95e	ci: bump version with Build-133	2026-06-21 14:51:19 +00:00
Janis Eccarius	a268a9acb7	fix(analytics): write decompressed PGN to shared PVC path for executor access SparkFiles.get() on the driver returns a driver-local path. When this was passed to spark.read.text() the executor tried to open that path on its own filesystem (separate pod), silently reading 0 rows. Fix: download and decompress the Lichess PGN to NOWCHESS_PGN_CACHE_DIR (default /tmp) which must be a filesystem shared between driver and executor pods. In the k8s deployment this is the spark-analytics-output PVC mounted at /spark-output, so set NOWCHESS_PGN_CACHE_DIR=/spark-output/.pgn-cache. Also caches the decompressed file across runs and skips the download if already present. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-21 16:31:05 +02:00
TeamCity	71cb2cc56c	ci: bump version with Build-132	2026-06-21 14:10:10 +00:00
Janis Eccarius	da0e6d1ee2	feat(analytics): always write results to PostgreSQL regardless of input source Remove isPgnMode JDBC guard from all 4 original jobs so staging (Lichess PGN mode) and production (game_records JDBC mode) both persist analytics results to the DB. Add JDBC write-back to all 7 new jobs: - GameLengthJob → analytics_game_length_distribution + analytics_game_length_by_result - ColorAdvantageJob → analytics_color_advantage - EloDistributionJob → analytics_elo_distribution - TimeControlJob → analytics_time_control_stats - DailyActivityJob → analytics_hourly_activity + analytics_weekly_activity - RatingMismatchJob → analytics_rating_mismatch - TerminationStatsJob → analytics_termination_stats Add analytics_component_sizes JDBC write to PlayerGraphJob. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-21 15:36:07 +02:00
TeamCity	a6c600d6ce	ci: bump version with Build-131	2026-06-21 13:28:40 +00:00
Janis Eccarius	8e17c14dff	feat(analytics): add 7 new Spark analytics jobs and extend GameSource Adds GameLengthJob, ColorAdvantageJob, EloDistributionJob, TimeControlJob, DailyActivityJob, RatingMismatchJob, and TerminationStatsJob bringing total batch pipelines to 11 (+ 1 streaming). Extends GameSource with loadExtended() / fromLichessPgnExtended() extracting WhiteElo, BlackElo, TimeControl, UTCDate, UTCTime, Termination, ECO from PGN headers; JDBC path returns nulls for extended columns, keeping all existing jobs unaffected. PlayerStatsJob gains a CSV output alongside the existing Parquet write so the analytics webview can display player statistics without pyarrow. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-21 15:03:07 +02:00
TeamCity	a91ba5da9a	ci: bump version with Build-130	2026-06-21 11:34:38 +00:00
Janis Eccarius	be941ff414	style: apply spotless formatting Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-21 12:39:44 +02:00
TeamCity	7bf91b2280	ci: bump version with Build-126	2026-06-19 10:28:49 +00:00
Janis	751a58b606	feat(official-bots): park expert bot on tournament server at startup (#76) Reviewed-on: #76	2026-06-17 10:42:42 +02:00
TeamCity	9e800ecb59	ci: bump version with Build-124	2026-06-16 19:41:52 +00:00
Janis Eccarius	46af1154de	fix(analytics): upgrade Spark to 4.0.3 — 3.5.x has no official Docker image apache/spark:3.5.4-scala2.13-java17-ubuntu does not exist on Docker Hub. Oldest available scala2.13 image is 4.0.3. Bump compileOnly deps and Dockerfile base image to match. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-16 20:08:29 +02:00
Janis Eccarius	0e0ea4c989	feat(analytics): add PostgreSQL JDBC write-back to all four batch jobs Each batch job now writes its results to a Postgres table in addition to the existing Parquet/CSV output. OpeningBookJob → analytics_opening_stats, PlayerStatsJob → analytics_player_stats, PlayerClusteringJob → analytics_player_clusters + analytics_cluster_archetypes, PlayerGraphJob → analytics_player_graph. MLlib Vector columns are excluded from the JDBC write by reusing the already-selected scalar DataFrame in PlayerClusteringJob. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-15 22:35:30 +02:00
Janis Eccarius	95215b6a42	feat(analytics): add Dockerfile, CI workflow, and stable jar name for K8s deployment - Pin jar output to analytics.jar (no version suffix) so Dockerfile COPY is stable - Add Dockerfile based on apache/spark:3.5.4-scala2.13-java17-ubuntu - Add versions.env (0.1.0) matching GitOps overlay image tag - Add analytics-image.yml CI workflow following native-image.yml conventions Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-15 22:30:31 +02:00
Janis Eccarius	e1d80b9331	feat(analytics): add Structured Streaming, MLlib clustering, GraphX jobs Three new Spark jobs demonstrating complementary Spark pillars: LiveDashboardJob (Structured Streaming): - Simulates NowChess game-over event stream via rate source - Watermarking (45 s late-data tolerance) - Tumbling 1-min windows → append-mode Parquet output - Sliding 5-min/1-min windows → update-mode console output - Checkpointing for exactly-once fault tolerance - Production wiring comments show Kafka / spark-redis swap-in PlayerClusteringJob (MLlib): - Derives 4 player features from game_records via JDBC - VectorAssembler + StandardScaler + KMeans inside a Pipeline - ClusteringEvaluator (silhouette score) to measure quality - Per-cluster archetype averages show what each tier represents PlayerGraphJob (GraphX): - Builds directed player graph (vertices=players, edges=games) - PageRank — identifies most influential/active players - ConnectedComponents — finds isolated player communities - Bridges GraphX RDD results back to DataFrames via explicit schema (avoids spark.implicits._ which breaks Scala 3 → Spark 2.13 interop) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-15 22:15:24 +02:00
Janis Eccarius	259b3bbb24	feat(analytics): add Spark batch analytics module New standalone modules:analytics submodule with two Spark jobs: - OpeningBookJob: reads game_records.pgn, extracts first N plies using pure Catalyst SQL expressions (no UDFs), aggregates win/draw/loss rates per opening sequence, writes Parquet + CSV top-1000 summary. - PlayerStatsJob: unions each game into a player-centric view, aggregates total_games/wins/losses/draws/avg_move_count/win_rate per player_id, writes Parquet. Module uses Scala 3 calling spark-sql_2.13 via JVM binary compatibility (DataFrame API only; no spark.implicits._ / typed Datasets). Spark is compileOnly; the fat jar bundles only scala3-library + postgresql driver. Submit via spark-submit; see build.gradle.kts header for invocation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-15 21:58:05 +02:00

16 Commits