Janis 6351a19b67
Build & Test (NowChessSystems) TeamCity build failed
feat(analytics): feed Lichess PGN dumps into Spark batch jobs
Add GameSource, which normalises game records into a shared schema and
selects the backend via NOWCHESS_PGN_PATH. Unset: PostgreSQL game_records
(unchanged). Set: a Lichess PGN dump (local file or http(s) URL).
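
A minimal sketch of how the backend selection described above might look. Only NOWCHESS_PGN_PATH comes from this commit; the GameBackend type, its cases, and fromEnv are hypothetical names used for illustration, and treating a blank value as unset is an assumption.

```scala
// Hypothetical sketch: only the NOWCHESS_PGN_PATH variable name is taken
// from the commit message. All type and method names are illustrative.
sealed trait GameBackend
case object PostgresBackend extends GameBackend
final case class PgnFile(path: String) extends GameBackend
final case class PgnUrl(url: String) extends GameBackend

object GameBackend {
  // The environment is passed in as a Map so the choice can be tested
  // without touching the real process environment.
  def fromEnv(env: Map[String, String] = sys.env): GameBackend =
    env.get("NOWCHESS_PGN_PATH").map(_.trim).filter(_.nonEmpty) match {
      // Unset (or blank, by assumption): keep reading from PostgreSQL.
      case None => PostgresBackend
      // http(s) URL: a remote PGN dump.
      case Some(p) if p.startsWith("http://") || p.startsWith("https://") =>
        PgnUrl(p)
      // Anything else is treated as a local file path.
      case Some(p) => PgnFile(p)
    }
}
```

Taking the environment as a parameter keeps the default behaviour (reading sys.env) while letting a test supply its own map.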

- Parse Lichess PGN with Spark SQL string functions only (no UDFs).
- URLs are fetched once via SparkContext.addFile and distributed to executors.
- .pgn.zst is decompressed in-process via zstd-jni and redistributed as plain .pgn.
- All four batch jobs read through GameSource and skip JDBC write-back
  in PGN mode (Parquet/CSV output only).
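
The UDF-free parsing mentioned in the list above can be sketched with Spark's built-in regexp_extract. This is an illustration only, not the code in this commit: it assumes each row already holds the full text of one game in a column named raw, and the object name, helper names, and output column names are hypothetical.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, regexp_extract}

// Hypothetical sketch: pulls PGN header tags out of a text column using
// built-in Spark SQL string functions only (no UDFs).
object PgnHeaderSketch {
  // Builds a column that extracts the quoted value of one PGN tag,
  // e.g. [White "alice"] yields alice. Yields "" when the tag is absent.
  def header(tag: String): Column =
    regexp_extract(col("raw"), "\\[" + tag + " \"([^\"]*)\"\\]", 1)

  // Assumes `games` has a string column "raw" with one game per row.
  def parseHeaders(games: DataFrame): DataFrame =
    games.select(
      header("White").as("white"),
      header("Black").as("black"),
      header("Result").as("result")
    )

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("pgn-header-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample game in the usual PGN header layout.
    val game =
      """[Event "Rated Blitz game"]
        |[White "alice"]
        |[Black "bob"]
        |[Result "1-0"]
        |
        |1. e4 e5 2. Nf3 Nc6 1-0""".stripMargin

    parseHeaders(Seq(game).toDF("raw")).show(truncate = false)
    spark.stop()
  }
}
```

Staying within built-in functions keeps the whole expression visible to Spark's optimizer, which is the usual reason to avoid UDFs for this kind of string extraction.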

Enables driving the analytics demo straight from
https://database.lichess.org standard dumps.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 10:40:29 +02:00
2026-03-21 14:40:00 +01:00
1.3 GiB
Languages
Scala 83.6%
Python 12.1%
Bru 3%
HTML 0.8%
Shell 0.2%
Other 0.1%