Add an easy local data pipeline feeding GPU training on Colab.

- SelfPlayMain: standalone NNUEBot self-play (no microservices) writing FENs for labeling; randomised openings for game diversity, sequential due to the shared EvaluationNNUE accumulator. Exposed via the `selfPlay` Gradle task and selfplay.sh.
- NNUEBot: optional fixedMoveTimeMs so self-play runs fast (default unchanged).
- NbaiLoader: honor `-Dnnue.weights=<path>` to load weights from a file before falling back to the bundled resource.
- build_dataset.py / dataset.sh: one command builds the entire dataset (Lichess eval-DB backbone + self-play + tactical + random filler), dedups, balances the eval histogram, writes append-only zstd shards + manifest, and rclone-pushes to Drive.
- train.py: NNUEDataset reads a directory of .jsonl.zst shards (streaming) in addition to a single file.
- NNUETraining.ipynb: clone to ephemeral /content, sync shards from Drive (cache-aware), train on the shards dir; removed Colab generation/upload steps.
- Concept + implementation plan docs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Implementation Plan: Two One-Liner Tools (self-play + dataset)
Goal: two tools, two start scripts, minimal params.
./selfplay.sh # bot plays games against itself, writes selfplay FENs (Scala, standalone)
./dataset.sh # builds the ENTIRE training dataset + rclone push to Drive (Python, one script)
Both default-everything. Optional first positional arg only when you want to override the one number that matters.
Tool 1 — selfplay.sh (standalone bot, no microservices)
Why it can be standalone
A Bot is just a function GameContext => Option[Move] (Bot.scala). NNUEBot.apply needs only
DefaultRules (rule module) and EvaluationNNUE (which loads the bundled .nbai): no Quarkus,
no coordinator/account/ws. The bot module already depends on api, rule, and io; io provides
FenExporter, and GameContext.initial already exists. So a plain JVM main can run games
with zero service wiring.
New file: SelfPlayMain.scala
modules/official-bots/src/main/scala/de/nowchess/bot/selfplay/SelfPlayMain.scala
Loop per game:
- Start from GameContext.initial.
- Opening diversity: play R random legal plies (default 8). Without this, NNUEBot vs itself is deterministic, so it plays the same game every time. Random openings are what make the games diverse. (Optional later: seed from a polyglot book instead.)
- Then both sides = NNUEBot(difficulty). Apply moves via DefaultRules.applyMove.
- Stop on isCheckmate / isStalemate / isInsufficientMaterial / isFiftyMoveRule / isThreefoldRepetition, or at the ply cap (default 200).
- Emit one FEN per ply (via FenExporter), skipping terminal positions and positions where the side to move is in check: the same filter philosophy the labeler wants.
- Append FENs to the output file (one per line), exactly the format label.py reads.
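The control flow of the loop above can be sketched in Python with a toy game standing in for chess. Everything here is hypothetical and is not the Scala implementation: `play_game`, the Nim-like pile game, and the `greedy` policy only illustrate why a deterministic policy needs random opening plies to produce different games. The in-check filter is omitted because the toy game has no notion of check.

```python
import random

def play_game(initial, legal_moves, apply_move, is_terminal, policy,
              rng, random_opening_plies=8, max_plies=200):
    """Return the non-terminal states visited, one per ply."""
    state, emitted = initial, []
    for ply in range(max_plies):          # ply cap
        if is_terminal(state):
            break                         # terminal positions are never emitted
        emitted.append(state)
        if ply < random_opening_plies:
            move = rng.choice(legal_moves(state))   # random opening ply
        else:
            move = policy(state)                    # deterministic "bot" move
        state = apply_move(state, move)
    return emitted

# Toy stand-in game: take 1-3 counters from a pile; the game ends at 0.
def nim_moves(pile):
    return [k for k in (1, 2, 3) if k <= pile]

def nim_apply(pile, take):
    return pile - take

def nim_over(pile):
    return pile == 0

def greedy(pile):
    """Deterministic policy: always take as many counters as allowed."""
    return max(nim_moves(pile))

def distinct_games(n_games, random_opening_plies, seed=0):
    """Count how many different games a batch of self-play produces."""
    rng = random.Random(seed)
    games = {
        tuple(play_game(21, nim_moves, nim_apply, nim_over, greedy,
                        rng, random_opening_plies))
        for _ in range(n_games)
    }
    return len(games)
```

With `random_opening_plies=0`, `distinct_games` returns 1 for any number of games, since the policy replays the same line every time. Any diversity has to come from the random prefix.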
Config = a small case class with defaults; read from env/args. Defaults:
games=2000, randomOpeningPlies=8, maxPlies=200, out=python/data/selfplay.txt,
threads = availableProcessors. Parallelize games across threads (each game is
independent; bot is pure).
Output is FENs only — labeling happens in Tool 2 with Stockfish. Keeps the bot tool single-responsibility and fast.
Gradle: a plain run task (not Quarkus)
Add to modules/official-bots/build.gradle.kts:
tasks.register<JavaExec>("selfPlay") {
group = "nnue"
mainClass.set("de.nowchess.bot.selfplay.SelfPlayMain")
classpath = sourceSets["main"].runtimeClasspath
args(project.findProperty("spArgs")?.toString()?.split(" ") ?: emptyList())
}
selfplay.sh (repo python/ dir)
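#!/usr/bin/env bash
set -euo pipefail
GAMES="${1:-2000}"
cd "$(dirname "$0")/../../.." # repo root
# Gradle's JavaExec defaults its working directory to the module directory,
# so pass an absolute output path (assumes the repo path contains no spaces,
# since spArgs is split on spaces).
OUT="$PWD/modules/official-bots/python/data/selfplay.txt"
./gradlew -q :official-bots:selfPlay -PspArgs="--games $GAMES --out $OUT"
echo "Self-play FENs -> $OUT"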
Usage:
./selfplay.sh # 2000 games, bundled net
./selfplay.sh 8000 # more games
Tool 2 — dataset.sh → build_dataset.py (builds EVERYTHING)
One Python script that produces a complete, sharded, pushed dataset. No TUI, no multi-step menus. It runs the whole data plane end to end:
lichess eval DB ─┐
selfplay.txt    ─┼─► label (local Stockfish, skip already-labeled) ─► dedup ─►
tactical        ─┤      eval-bucket balance ─►
random filler   ─┘      write shards/*.jsonl.zst + manifest.json ─► rclone push
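The "skip already-labeled" step in the diagram could look roughly like the sketch below. It assumes labeled output is JSON Lines with a `fen` field, which the plan does not specify; `load_labeled_fens` and `pending_fens` are hypothetical helper names, not functions from label.py.

```python
import json

def load_labeled_fens(jsonl_lines):
    """Collect the FENs that already have a label (assumed record shape: {"fen": ...})."""
    labeled = set()
    for line in jsonl_lines:
        line = line.strip()
        if line:
            labeled.add(json.loads(line)["fen"])
    return labeled

def pending_fens(fens, labeled):
    """Return FENs that still need Stockfish, in input order, without repeats."""
    seen = set(labeled)
    pending = []
    for fen in fens:
        fen = fen.strip()
        if fen and fen not in seen:
            seen.add(fen)          # also drops duplicates within this batch
            pending.append(fen)
    return pending
```

Filtering before labeling means a rerun only pays Stockfish time for positions added since the previous run.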
New file: build_dataset.py (top-level python/)
Reuses existing modules — orchestrates, doesn't reinvent:
- Backbone: lichess_importer.py downloads and samples N pre-labeled positions from the Lichess eval DB (no Stockfish cost).
- Self-play: read data/selfplay.txt FENs → label.py with local Stockfish (depth 18, all cores; your box eats this).
- Tactical: tactical_positions_extractor.py → label.py.
- Random filler: generate.py (small cap) → label.py.
- Merge: dedup by FEN across all sources; eval-bucket balancing (cap positions per eval bin so near-equal positions don't dominate).
- Shard + manifest: split into shards/*.jsonl.zst (~100k positions each) + write manifest.json (positions, sha256, source, net, depth per shard). Append-only: existing shards untouched, new run adds shards + entries (the scaling story from the concept).
- Push: rclone copy datasets/ gdrive:NowChess/datasets, which ships only new shards.
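A minimal sketch of the merge step (dedup by FEN, then cap each eval bin), assuming records are dicts with `fen` and an integer centipawn `eval`. The bin width, cap, and clamp values are illustrative placeholders, not the script's internal settings.

```python
import random
from collections import defaultdict

def dedup(records):
    """Keep the first record seen for each FEN."""
    seen, unique = set(), []
    for rec in records:
        if rec["fen"] not in seen:
            seen.add(rec["fen"])
            unique.append(rec)
    return unique

def balance(records, bin_width_cp=50, cap_per_bin=2, clamp_cp=1000, seed=0):
    """Cap the number of positions per eval bin so no bin dominates."""
    rng = random.Random(seed)
    bins = defaultdict(list)
    for rec in records:
        # Clamp extreme scores so they share the outermost bins.
        ev = max(-clamp_cp, min(clamp_cp, rec["eval"]))
        bins[ev // bin_width_cp].append(rec)
    balanced = []
    for key in sorted(bins):
        bucket = bins[key]
        if len(bucket) > cap_per_bin:
            bucket = rng.sample(bucket, cap_per_bin)  # random subset of a crowded bin
        balanced.extend(bucket)
    return balanced
```

Running `dedup` before `balance` matters: otherwise repeated positions would use up part of a bin's cap.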
One config block, sane defaults
Top of the script — the only thing you ever touch:
LICHESS_POSITIONS = 2_000_000 # backbone
USE_SELFPLAY = True # reads data/selfplay.txt if present
TACTICAL_PUZZLES = 200_000
RANDOM_FILLER = 100_000
STOCKFISH_DEPTH = 18
RCLONE_REMOTE = "gdrive:NowChess/datasets"
Everything else (paths, workers=all cores, shard size, balancing bins) is internal.
dataset.sh
#!/usr/bin/env bash
set -euo pipefail
cd "$(dirname "$0")"
python build_dataset.py "$@"
Usage:
./dataset.sh # full dataset (lichess + selfplay + tactical + filler) -> Drive
That single command: downloads backbone, labels self-play/tactical/filler, dedups, balances, shards, and rclone-pushes to Drive. Colab then syncs (concept doc §3).
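The append-only shard and manifest step might be sketched as follows. The manifest layout (`{"shards": [...]}` with `file`, `positions`, `sha256`) and the shard naming are assumptions for illustration. zstd compression is left out to keep the sketch standard-library only, so the shards here are plain .jsonl files.

```python
import hashlib
import json
from pathlib import Path

def append_shards(records, out_dir, shard_size=100_000, meta=None):
    """Write records as new shards and extend manifest.json; never touch old shards."""
    out_dir = Path(out_dir)
    shard_dir = out_dir / "shards"
    shard_dir.mkdir(parents=True, exist_ok=True)
    manifest_path = out_dir / "manifest.json"
    if manifest_path.exists():
        manifest = json.loads(manifest_path.read_text())
    else:
        manifest = {"shards": []}
    next_index = len(manifest["shards"])   # continue numbering after existing shards
    for start in range(0, len(records), shard_size):
        chunk = records[start:start + shard_size]
        payload = "".join(json.dumps(r) + "\n" for r in chunk).encode("utf-8")
        name = f"shard-{next_index:05d}.jsonl"
        # Mode "xb" raises FileExistsError instead of overwriting an old shard.
        with (shard_dir / name).open("xb") as fh:
            fh.write(payload)
        entry = {"file": name, "positions": len(chunk),
                 "sha256": hashlib.sha256(payload).hexdigest()}
        entry.update(meta or {})           # e.g. source, net, depth
        manifest["shards"].append(entry)
        next_index += 1
    manifest_path.write_text(json.dumps(manifest, indent=2))
    return manifest
```

Because old shard files are never rewritten, a later `rclone copy` has only the new shards and the updated manifest to transfer.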
End-to-end loop (the flywheel)
./selfplay.sh # bot generates games with the current net
./dataset.sh # fold them into a new dataset version, push to Drive
# (Colab) sync + train -> export nnue_weights.nbai
# drop .nbai into modules/official-bots/src/main/resources/, rebuild
./selfplay.sh # next net plays stronger, better games... repeat
Build order
1. SelfPlayMain.scala: standalone game loop, random openings, parallel games, FEN out.
2. selfPlay Gradle JavaExec task + selfplay.sh.
3. build_dataset.py: orchestrate existing importer/label/tactical/generate into shards + manifest; rclone push.
4. dataset.sh.
5. Shard/manifest read support in dataset.py + zstd streaming loader in train.py (consumed on Colab).
6. Notebook: single "sync dataset version" cell, ephemeral /content clone.
Decisions to confirm
- Self-play opponent: NNUEBot vs itself + random openings (planned). Add vs-Stockfish later if more decisive games wanted.
- Self-play net source: use the .nbai bundled in resources (simplest), or accept a --weights path? Plan = bundled by default.
- rclone remote name: confirm gdrive is your configured rclone remote, and the target folder NowChess/datasets.
- Stockfish path on your box: $STOCKFISH_PATH or /usr/games/stockfish?