apache/spark:3.5.4-scala2.13-java17-ubuntu does not exist on Docker Hub.
The oldest available scala2.13 image is 4.0.3, so bump the compileOnly
deps and the Dockerfile base image to match.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Pin jar output to analytics.jar (no version suffix) so Dockerfile COPY is stable
- Add Dockerfile based on apache/spark:3.5.4-scala2.13-java17-ubuntu
- Add versions.env (0.1.0) matching GitOps overlay image tag
- Add analytics-image.yml CI workflow following native-image.yml conventions
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add a standalone modules:analytics submodule with two Spark jobs:
- OpeningBookJob: reads game_records.pgn, extracts first N plies using
pure Catalyst SQL expressions (no UDFs), aggregates win/draw/loss rates
per opening sequence, writes Parquet + CSV top-1000 summary.
- PlayerStatsJob: unions each game into a player-centric view, aggregates
total_games/wins/losses/draws/avg_move_count/win_rate per player_id,
writes Parquet.
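A minimal sketch of the player-centric union described for PlayerStatsJob,
not the actual job code. The input column names (white_id, black_id, result,
move_count) and the result encoding are assumptions made for illustration;
only the output column names come from the description above.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{avg, col, count, lit, sum, when}

// Hypothetical input columns: white_id, black_id, move_count and
// result ("1-0", "0-1" or "1/2-1/2"). Adjust to the real schema.
def playerStats(games: DataFrame): DataFrame = {
  // Emit one row per (game, player): once from White's side, once from Black's.
  def side(idCol: String, winResult: String, lossResult: String): DataFrame =
    games.select(
      col(idCol).as("player_id"),
      when(col("result") === winResult, 1).otherwise(0).as("win"),
      when(col("result") === lossResult, 1).otherwise(0).as("loss"),
      when(col("result") === "1/2-1/2", 1).otherwise(0).as("draw"),
      col("move_count")
    )

  side("white_id", "1-0", "0-1")
    .unionByName(side("black_id", "0-1", "1-0"))
    .groupBy("player_id")
    .agg(
      count(lit(1)).as("total_games"),
      sum("win").as("wins"),
      sum("loss").as("losses"),
      sum("draw").as("draws"),
      avg("move_count").as("avg_move_count"),
      // The mean of a 0/1 flag is the share of games won.
      avg("win").as("win_rate")
    )
}
```

Building both sides with the same helper keeps the two halves of the union
schema-identical, which is what unionByName relies on.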
The module is written in Scala 3 and calls spark-sql_2.13 through Scala 3's
binary compatibility with Scala 2.13 artifacts (DataFrame API only; no
spark.implicits._ or typed Datasets, whose encoders rely on Scala 2
TypeTags). Spark is compileOnly; the fat jar bundles only scala3-library
and the postgresql driver.
Submit via spark-submit; see build.gradle.kts header for invocation.
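To illustrate the DataFrame-only, UDF-free style described above, here is a
rough sketch of an opening aggregation that compiles without
spark.implicits._ (columns are referenced with col(...) rather than $"...").
The input columns (moves as space-separated SAN without move numbers, result)
and the output names are hypothetical, and the rates are assumed to be taken
from White's perspective; the real OpeningBookJob may differ.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{array_join, avg, col, count, lit, slice, split, when}

// Hypothetical input columns: moves (e.g. "e4 e5 Nf3 Nc6 ...") and
// result ("1-0", "0-1" or "1/2-1/2").
def openingBook(games: DataFrame, plies: Int): DataFrame = {
  // First `plies` half-moves, built from built-in functions only (no UDF).
  // slice is 1-based: start at element 1 and take `plies` elements.
  val opening = array_join(slice(split(col("moves"), " "), 1, plies), " ")

  // 0/1 flag for a given result; its mean per group is the result's rate.
  def flag(result: String) = when(col("result") === result, 1).otherwise(0)

  games
    .withColumn("opening", opening)
    .groupBy("opening")
    .agg(
      count(lit(1)).as("games"),
      avg(flag("1-0")).as("white_win_rate"),
      avg(flag("1/2-1/2")).as("draw_rate"),
      avg(flag("0-1")).as("white_loss_rate")
    )
}
```

Because every expression is a built-in function, Catalyst can optimize the
whole plan, and no Scala 2 encoder derivation is needed.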
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>