NowChessSystems/modules/official-bots/python/IMPLEMENTATION_PLAN.md
Janis Eccarius 1c80abdb8a
feat(official-bots): standalone self-play + one-shot dataset builder for NNUE training
Add an easy local data pipeline feeding GPU training on Colab.

- SelfPlayMain: standalone NNUEBot self-play (no microservices) writing FENs
  for labeling; randomised openings for game diversity, sequential due to the
  shared EvaluationNNUE accumulator. Exposed via the `selfPlay` Gradle task and
  selfplay.sh.
- NNUEBot: optional fixedMoveTimeMs so self-play runs fast (default unchanged).
- NbaiLoader: honor `-Dnnue.weights=<path>` to load weights from a file before
  falling back to the bundled resource.
- build_dataset.py / dataset.sh: one command builds the entire dataset
  (Lichess eval-DB backbone + self-play + tactical + random filler), dedups,
  balances the eval histogram, writes append-only zstd shards + manifest, and
  rclone-pushes to Drive.
- train.py: NNUEDataset reads a directory of .jsonl.zst shards (streaming) in
  addition to a single file.
- NNUETraining.ipynb: clone to ephemeral /content, sync shards from Drive
  (cache-aware), train on the shards dir; removed Colab generation/upload steps.
- Concept + implementation plan docs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-24 22:04:22 +02:00


Implementation Plan: Two One-Liner Tools (self-play + dataset)

Goal: two tools, two start scripts, minimal params.

./selfplay.sh      # bot plays games against itself, writes selfplay FENs       (Scala, standalone)
./dataset.sh       # builds the ENTIRE training dataset + rclone push to Drive   (Python, one script)

Both tools default everything. An optional first positional argument is needed only when you want to override the one number that matters.


Tool 1 — selfplay.sh (standalone bot, no microservices)

Why it can be standalone

A Bot is just a function GameContext => Option[Move] (Bot.scala). NNUEBot.apply needs only DefaultRules (rule module) and EvaluationNNUE (which loads the bundled .nbai). No Quarkus, no coordinator/account/ws services. The bot module already depends on api, rule, and io; io provides FenExporter, and GameContext.initial exists. So a plain JVM main can run games with zero service wiring.

New file: SelfPlayMain.scala

modules/official-bots/src/main/scala/de/nowchess/bot/selfplay/SelfPlayMain.scala

Loop per game:

  1. Start from GameContext.initial.
  2. Opening diversity — play R random legal plies (default 8). Without this, NNUEBot vs itself is deterministic → the same game every time. Random openings are what make the games diverse. (Optional later: seed from polyglot book instead.)
  3. Then both sides = NNUEBot(difficulty). Apply moves via DefaultRules.applyMove.
  4. Stop on isCheckmate / isStalemate / isInsufficientMaterial / isFiftyMoveRule / isThreefoldRepetition, or ply cap (default 200).
  5. Emit one FEN per ply (via FenExporter), skipping positions where side-to-move is in check and terminal positions — same filter philosophy the labeler wants.
  6. Append FENs to the output file (one per line) — exactly the format label.py reads.
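The per-game loop above can be sketched in Python to make the control flow concrete. This is a language-neutral illustration under stated assumptions, not the Scala implementation: every callable passed in (`legal_moves`, `apply_move`, `bot`, `is_terminal`, `in_check`, `to_fen`) is a hypothetical stand-in for the corresponding DefaultRules, NNUEBot, or FenExporter call, and the sketch assumes positions reached during the random opening are emitted too.

```python
import random
from typing import Callable, List, Optional, TypeVar

P = TypeVar("P")  # position type (GameContext in the Scala code)
M = TypeVar("M")  # move type


def play_one_game(
    initial: P,
    legal_moves: Callable[[P], List[M]],
    apply_move: Callable[[P, M], P],
    bot: Callable[[P], Optional[M]],
    is_terminal: Callable[[P], bool],
    in_check: Callable[[P], bool],
    to_fen: Callable[[P], str],
    rng: random.Random,
    random_opening_plies: int = 8,
    max_plies: int = 200,
) -> List[str]:
    """Play one self-play game and return the FENs worth labeling."""
    pos = initial
    fens: List[str] = []
    for ply in range(max_plies):  # ply cap
        if is_terminal(pos):  # mate, stalemate, draw rules: never emitted
            break
        if not in_check(pos):  # skip positions where side-to-move is in check
            fens.append(to_fen(pos))
        if ply < random_opening_plies:
            moves = legal_moves(pos)
            if not moves:
                break
            move = rng.choice(moves)  # opening diversity
        else:
            move = bot(pos)  # both sides use the same bot
        if move is None:
            break
        pos = apply_move(pos, move)
    return fens
```

Passing a seeded `random.Random` keeps a run reproducible while still giving each game a different opening.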

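Config = a small case class with defaults; read from env/args. Defaults: games=2000, randomOpeningPlies=8, maxPlies=200, out=python/data/selfplay.txt, threads = availableProcessors. Parallelize games across threads only if each game is truly independent: the commit that landed this plan runs games sequentially because the EvaluationNNUE accumulator is shared, so the bot is not pure as written.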

Output is FENs only — labeling happens in Tool 2 with Stockfish. Keeps the bot tool single-responsibility and fast.

Gradle: a plain run task (not Quarkus)

Add to modules/official-bots/build.gradle.kts:

tasks.register<JavaExec>("selfPlay") {
    group = "nnue"
    mainClass.set("de.nowchess.bot.selfplay.SelfPlayMain")
    classpath = sourceSets["main"].runtimeClasspath
    args(project.findProperty("spArgs")?.toString()?.split(" ") ?: emptyList())
}

selfplay.sh (repo python/ dir)

#!/usr/bin/env bash
set -euo pipefail
GAMES="${1:-2000}"
cd "$(dirname "$0")/../../.."        # repo root
./gradlew -q :official-bots:selfPlay -PspArgs="--games $GAMES --out modules/official-bots/python/data/selfplay.txt"
echo "Self-play FENs -> modules/official-bots/python/data/selfplay.txt"

Usage:

./selfplay.sh          # 2000 games, bundled net
./selfplay.sh 8000     # more games

Tool 2 — dataset.sh → build_dataset.py (builds EVERYTHING)

One Python script that produces a complete, sharded, pushed dataset. No TUI, no multi-step menus. It runs the whole data plane end to end:

lichess eval DB ─┐
selfplay.txt    ─┼─► label (local Stockfish, skip already-labeled) ─► dedup ─►
tactical        ─┤                                                   eval-bucket
random filler   ─┘                                                   balance ─►
                                              write shards/*.jsonl.zst + manifest.json ─► rclone push
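The "skip already-labeled" step in the diagram could be as small as the following sketch. It assumes the labeled output is a JSONL file with a `fen` key per line; that schema and both function names are illustrative, not taken from label.py.

```python
import json
from pathlib import Path
from typing import Iterable, Iterator, Set


def load_labeled_fens(labeled_path: str) -> Set[str]:
    """Collect FENs that already have a label (assumed JSONL with a 'fen' key)."""
    path = Path(labeled_path)
    if not path.exists():
        return set()
    with path.open(encoding="utf-8") as fh:
        return {json.loads(line)["fen"] for line in fh if line.strip()}


def unlabeled(fens: Iterable[str], already: Set[str]) -> Iterator[str]:
    """Yield each FEN once, skipping blanks and anything already labeled."""
    seen = set(already)
    for fen in fens:
        fen = fen.strip()
        if fen and fen not in seen:
            seen.add(fen)
            yield fen
```

Only the FENs this yields need a Stockfish call, so re-running the pipeline after adding more self-play games costs labeling time for the new positions only.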

New file: build_dataset.py (top-level python/)

Reuses existing modules — orchestrates, doesn't reinvent:

  • Backbone: lichess_importer.py — download + sample N pre-labeled positions from the Lichess eval DB (no Stockfish cost).
  • Self-play: read data/selfplay.txt FENs → label.py with local Stockfish (depth 18, all cores — your box eats this).
  • Tactical: tactical_positions_extractor.py → label.py.
  • Random filler: generate.py (small cap) → label.py.
  • Merge: dedup by FEN across all sources; eval-bucket balancing (cap positions per eval bin so near-equal positions don't dominate).
  • Shard + manifest: split into shards/*.jsonl.zst (~100k positions each) + write manifest.json (positions, sha256, source, net, depth per shard). Append-only: existing shards untouched, new run adds shards + entries (the scaling story from the concept).
  • Push: rclone copy datasets/ gdrive:NowChess/datasets — ships only new shards.
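The merge step is the part of this list with real logic in it, so a minimal sketch may help. It assumes each record is a dict with `fen` and `eval_cp` keys, and the bin width, per-bin cap, and clamp are illustrative values rather than the script's internal defaults.

```python
import random
from collections import defaultdict
from typing import Dict, List


def dedup_by_fen(records: List[dict]) -> List[dict]:
    """Keep the first record per FEN, so earlier sources win on conflicts."""
    seen = set()
    out = []
    for rec in records:
        if rec["fen"] not in seen:
            seen.add(rec["fen"])
            out.append(rec)
    return out


def balance_by_eval(
    records: List[dict],
    bin_width_cp: int = 50,
    cap_per_bin: int = 20_000,
    clamp_cp: int = 1000,
    seed: int = 0,
) -> List[dict]:
    """Cap the number of positions per eval bin so near-equal ones don't dominate."""
    rng = random.Random(seed)
    bins: Dict[int, List[dict]] = defaultdict(list)
    for rec in records:
        # Clamp extreme evals so all decisive positions share the outer bins.
        cp = max(-clamp_cp, min(clamp_cp, rec["eval_cp"]))
        bins[cp // bin_width_cp].append(rec)
    kept: List[dict] = []
    for key in sorted(bins):
        bucket = bins[key]
        if len(bucket) > cap_per_bin:
            bucket = rng.sample(bucket, cap_per_bin)  # random subsample of the bin
        kept.extend(bucket)
    return kept
```

Concatenating sources in priority order before `dedup_by_fen` (for example the Lichess backbone first) decides which label survives when two sources contain the same FEN. This sketch compares full FEN strings, so positions that differ only in move counters count as distinct.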

One config block, sane defaults

Top of the script — the only thing you ever touch:

LICHESS_POSITIONS = 2_000_000   # backbone
USE_SELFPLAY      = True        # reads data/selfplay.txt if present
TACTICAL_PUZZLES  = 200_000
RANDOM_FILLER     = 100_000
STOCKFISH_DEPTH   = 18
RCLONE_REMOTE     = "gdrive:NowChess/datasets"

Everything else (paths, workers=all cores, shard size, balancing bins) is internal.

dataset.sh

#!/usr/bin/env bash
set -euo pipefail
cd "$(dirname "$0")"
python build_dataset.py "$@"

Usage:

./dataset.sh          # full dataset (lichess + selfplay + tactical + filler) -> Drive

That single command downloads the backbone, labels self-play/tactical/filler positions, dedups, balances, shards, and rclone-pushes to Drive. Colab then syncs (concept doc §3).
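The append-only shard and manifest step is what lets reruns ship only new files, so here is a hedged sketch of it. The manifest layout (`{"shards": [...]}`), the `shard-NNNNN` naming, and the `compress` parameter are assumptions for illustration; in the real script `compress` would be a zstd compressor such as `zstandard.ZstdCompressor().compress`, which is left out here to keep the example stdlib-only.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, List


def append_shards(
    records: List[dict],
    out_dir: str,
    compress: Callable[[bytes], bytes],
    source: str,
    net: str,
    depth: int,
    shard_size: int = 100_000,
) -> List[dict]:
    """Write new shards after the existing ones and extend manifest.json."""
    out = Path(out_dir)
    shards_dir = out / "shards"
    shards_dir.mkdir(parents=True, exist_ok=True)
    manifest_path = out / "manifest.json"
    if manifest_path.exists():
        manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    else:
        manifest = {"shards": []}

    next_index = len(manifest["shards"])  # existing shards are never rewritten
    new_entries: List[dict] = []
    for start in range(0, len(records), shard_size):
        chunk = records[start:start + shard_size]
        payload = "".join(json.dumps(r) + "\n" for r in chunk).encode("utf-8")
        blob = compress(payload)
        name = f"shard-{next_index:05d}.jsonl.zst"
        (shards_dir / name).write_bytes(blob)
        new_entries.append({
            "file": name,
            "positions": len(chunk),
            "sha256": hashlib.sha256(blob).hexdigest(),  # hash of the stored bytes
            "source": source,
            "net": net,
            "depth": depth,
        })
        next_index += 1

    manifest["shards"].extend(new_entries)
    manifest_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return new_entries
```

Because old shard files keep their names and bytes, a later `rclone copy` of the output directory has nothing to transfer for them and uploads only the new shards plus the updated manifest.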


End-to-end loop (the flywheel)

./selfplay.sh        # bot generates games with the current net
./dataset.sh         # fold them into a new dataset version, push to Drive
# (Colab) sync + train -> export nnue_weights.nbai
# drop .nbai into modules/official-bots/src/main/resources/, rebuild
./selfplay.sh        # next net plays stronger, better games... repeat

Build order

  1. SelfPlayMain.scala — standalone game loop, random openings, parallel games, FEN out.
  2. selfPlay Gradle JavaExec task + selfplay.sh.
  3. build_dataset.py — orchestrate existing importer/label/tactical/generate into shards + manifest; rclone push.
  4. dataset.sh.
  5. Shard/manifest read support in dataset.py + zstd streaming loader in train.py (consumed on Colab).
  6. Notebook: single "sync dataset version" cell, ephemeral /content clone.

Decisions to confirm

  • Self-play opponent: NNUEBot vs itself + random openings (planned). Add vs-Stockfish later if more decisive games wanted.
  • Self-play net source: use the .nbai bundled in resources (simplest), or accept a --weights path? Plan = bundled by default.
  • rclone remote name: confirm gdrive is your configured rclone remote, and the target folder NowChess/datasets.
  • Stockfish path on your box: $STOCKFISH_PATH or /usr/games/stockfish?
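Until the Stockfish question is settled, one option is to support both candidates and fail early if neither exists. The sketch below assumes that `$STOCKFISH_PATH` should win over the `/usr/games/stockfish` fallback; `resolve_stockfish` is a hypothetical helper, not an existing function in the repo.

```python
import os
from typing import Mapping, Optional


def resolve_stockfish(
    env: Optional[Mapping[str, str]] = None,
    fallback: str = "/usr/games/stockfish",
) -> Optional[str]:
    """Return the first existing, executable Stockfish binary, or None."""
    env = os.environ if env is None else env
    # Assumed priority: explicit env override first, then the distro path.
    for candidate in (env.get("STOCKFISH_PATH"), fallback):
        if candidate and os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None
```

Calling this once at the top of build_dataset.py and exiting with a clear message on `None` would surface a missing engine before any download or labeling work starts.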