Janis Eccarius a268a9acb7
Build & Test (NowChessSystems) TeamCity build finished
fix(analytics): write decompressed PGN to shared PVC path for executor access
SparkFiles.get() on the driver returns a driver-local path. When this was
passed to spark.read.text() the executor tried to open that path on its own
filesystem (separate pod), silently reading 0 rows.

Fix: download and decompress the Lichess PGN to NOWCHESS_PGN_CACHE_DIR
(default /tmp) which must be a filesystem shared between driver and executor
pods. In the k8s deployment this is the spark-analytics-output PVC mounted
at /spark-output, so set NOWCHESS_PGN_CACHE_DIR=/spark-output/.pgn-cache.

Also caches the decompressed file across runs — skips download if already
present.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-21 16:31:05 +02:00
2026-03-21 14:40:00 +01:00
S
Description
No description provided
1.3 GiB
Languages
Scala 83.9%
Python 11.9%
Bru 2.9%
HTML 0.8%
Shell 0.2%
Other 0.1%