NowChessSystems/modules/analytics
Janis 6351a19b67
Build & Test (NowChessSystems) TeamCity build failed
feat(analytics): feed Lichess PGN dumps into Spark batch jobs
Add GameSource, which normalises game records into a shared schema and
selects the backend via NOWCHESS_PGN_PATH. When the variable is unset, it
reads PostgreSQL game_records (unchanged); when set, it reads a Lichess
PGN dump (local file or http(s) URL).
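
The selection rule above can be sketched roughly as follows. This is an illustration only, not the GameSource code from this commit: the class name, the enum, and the helper methods are hypothetical, and treating a blank value the same as an unset one is an assumption.

```java
import java.util.Locale;

public class GameSourceSketch {
    // Hypothetical names for the three cases the commit message describes.
    enum Backend { POSTGRES, PGN_FILE, PGN_URL }

    // Chooses a backend from the value of NOWCHESS_PGN_PATH.
    // Assumption: a blank value is treated like an unset variable.
    static Backend select(String pgnPath) {
        if (pgnPath == null || pgnPath.isBlank()) {
            return Backend.POSTGRES;
        }
        String p = pgnPath.trim().toLowerCase(Locale.ROOT);
        if (p.startsWith("http://") || p.startsWith("https://")) {
            return Backend.PGN_URL;
        }
        return Backend.PGN_FILE;
    }

    // True when the dump is zstd-compressed and would need decompressing first.
    static boolean isZstd(String pgnPath) {
        return pgnPath.trim().toLowerCase(Locale.ROOT).endsWith(".pgn.zst");
    }

    public static void main(String[] args) {
        String path = System.getenv("NOWCHESS_PGN_PATH");
        System.out.println("backend: " + select(path));
    }
}
```

Keeping the decision in one pure function means each batch job can ask the same question ("JDBC or PGN?") without re-reading the environment.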

- Parse Lichess PGN with Spark SQL string functions only (no UDFs).
- URLs are fetched once via SparkContext.addFile and distributed to executors.
- .pgn.zst is decompressed in-process via zstd-jni; the plain .pgn is then
  redistributed.
- All four batch jobs read through GameSource and skip JDBC write-back
  in PGN mode (Parquet/CSV output only).
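
To illustrate the kind of string matching that PGN header parsing involves, here is a minimal plain-Java sketch. It is not the Spark SQL implementation from this commit: it swaps in java.util.regex so that it runs without a Spark dependency. The class name, the regular expression, and the sample game text are all made up for illustration, and escaped quotes inside tag values are not handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PgnTagSketch {
    // Matches one PGN tag pair such as: [White "alice"]
    // Group 1 is the tag name, group 2 the quoted value (no escaped quotes).
    private static final Pattern TAG =
        Pattern.compile("\\[(\\w+)\\s+\"([^\"]*)\"\\]");

    // Collects every tag pair found in the text, in order of appearance.
    static Map<String, String> parseTags(String gameText) {
        Map<String, String> tags = new LinkedHashMap<>();
        Matcher m = TAG.matcher(gameText);
        while (m.find()) {
            tags.put(m.group(1), m.group(2));
        }
        return tags;
    }

    public static void main(String[] args) {
        // Invented sample record in the usual PGN layout: tags, blank line, moves.
        String game = String.join("\n",
            "[Event \"Rated Blitz game\"]",
            "[White \"alice\"]",
            "[Black \"bob\"]",
            "[Result \"1-0\"]",
            "",
            "1. e4 e5 2. Nf3 Nc6 1-0");
        System.out.println(parseTags(game));
    }
}
```

A pattern of this shape, one capture group per field, is also the form that regex-based column functions expect, which is one way a parser can avoid UDFs.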

Enables driving the analytics demo straight from
https://database.lichess.org standard dumps.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 10:40:29 +02:00